Multithreading

Started by Wrecker013, August 24, 2018, 03:55:15 PM


Wrecker013

I'm well aware that there are innumerable other priorities currently present regarding the development and final release of Rimworld, and that the presence of a 64-bit executable in the experimental version is more than I could have hoped for, but I can't help but wonder if at this current moment in time multithreading support is within the realm of possibility, given that it's not something that modders could eventually add as far as I'm aware.

Rimworld's heavy reliance on individual path-finding and job selection would seem to make it an easy fit for delegating processing tasks to individual cores, at least from a layman's perspective.

Of course, I could be entirely wrong and such is a moot discussion, but I can't help but ask!

Nafensoriel

It already is to a degree.

You won't get pure 100% parallel processing, however. For one, it wouldn't really "do" anything other than increase complexity, and the rare cases where it would help fall pretty squarely outside "sane gameplay".

Multithreading is kinda like DX12. Most people really just don't understand it well enough to know why it won't work the way the media releases claim it does. For one, most parallelism has to be baked in from the start. Trying to crib it in after the fact is often a full rewrite for something large scale. In many cases, applying things like DX12 to overcome draw call limits (the GPU version of multithreading) comes at a huge man-hour cost, and unless you design exclusively around that one API/function you get next to zero benefit out of it.
If you want a weird mental image to describe how full parallelism works try to picture sending out 10 pizza deliveries to the same house but..
1] all of them must take different routes.
2] all of them must arrive exactly in the order they need to.
3] if one crashes all 10 cars crash.. No one gets pizza. Everyone's sad.

For practical purposes RimWorld is multithreaded. Most games actually are. You will very rarely find audio running on the same thread as pathfinding, for example.

/edit
Just thought of something.
Galactic Civilizations 3 is the type of full parallelism done right.. but it has its rather major drawbacks. One of them is resources. Doing everything "at once" snowballs really damn fast on larger maps, which people absolutely insist on playing because the option is available to them. It's improved slowly over time, but its initial memory usage for the largest map size with a moderate to small amount of AI enemies could easily consume 64 GB of memory. It also absolutely hammers the CPU when it does this. For GalCiv3 this isn't really a problem because your turn times (it's turn-based) cause natural breaks in processing. If you tried this on a real-time game of the same scale it would be punishing on your system's cooling and processor. Any tiny flaw would cause a bluescreen. This doesn't even go into the fact that the nightmare of NOT having PREDICTABLE times to do things would exponentially increase the complexity.

In a turn-based game setting, you know x-y-z 1-2-3 are going to happen x-y-z 1-2-3 every single time unless you, the coder, made a mistake.
This means you know you can play each AI on its own thread and when they all "stop" you can complete the turn thread. Doing this in real time requires you to make arbitrary moments for things to enter the main "things happening" thread or you risk different events happening simultaneously. This means latency.. and the practical effect of latency is why speeding in a city doesn't get you to your destination any faster. Every "stop" costs more "time" than you save. 
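
A minimal sketch of the turn-based pattern described above, assuming a made-up world and a made-up `plan_turn` function (neither comes from GalCiv3 or RimWorld). Each AI plans on its own worker thread against a frozen snapshot, and the single turn thread only applies the results once every worker has stopped. In CPython the GIL means this shows the structure rather than a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_turn(ai_name, snapshot):
    """Hypothetical per-AI planning step: reads the snapshot, writes nothing."""
    # Picking the richest tile stands in for real decision making.
    target = max(snapshot, key=snapshot.get)
    return ai_name, target


def run_turn(ai_names, world):
    snapshot = dict(world)  # every AI plans against the same frozen state
    with ThreadPoolExecutor(max_workers=len(ai_names)) as pool:
        # map() yields results in input order, whichever thread finishes first.
        plans = list(pool.map(lambda name: plan_turn(name, snapshot), ai_names))
    # All workers have stopped here; the turn thread applies plans in a fixed order.
    log = []
    for ai_name, target in plans:
        world[target] -= 1
        log.append(f"{ai_name} -> {target}")
    return log


world = {"plains": 3, "hills": 5, "swamp": 1}
log = run_turn(["AI-1", "AI-2", "AI-3"], world)
print(log)
```

The order of `log` depends only on the order of the AI list, not on thread timing, which is the predictability a turn boundary buys you. A real-time game has no such natural boundary and has to invent one.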

RawCode

Just imagine you have 10 fully functional hands, a number of objects and a single table.

Your first task is to place all the objects on the table. You can use all hands at the same time, but this will result in objects being put into each other.

Even if a hand checks that a space is free before putting an object down, in the moment after the check completes but before the object is actually placed, another object may be placed in that space by another hand that happened to be just a bit faster.

As a result, your multithreaded table will be a complete mess and you need some kind of synchronisation: you bring in an 11th hand that does all the checks but does not place any objects on its own.

As a result, the 11th hand will be a major bottleneck, and the whole thing may work slower than a single thread due to context switch overhead.
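
The table analogy can be sketched roughly like this, with a lock playing the part of the 11th hand. All names here (`table`, `place`, `TABLE_SLOTS`) are invented for illustration. The check and the placement happen under the same lock, so no other hand can slip in between them, but it also means only one hand works at a time.

```python
import threading

TABLE_SLOTS = 5
table = [None] * TABLE_SLOTS    # at most one object per slot
table_lock = threading.Lock()   # the "11th hand": every check goes through it
rejected = []                   # hands that found their slot already taken


def place(hand_id, slot):
    # Check-then-place is a single step under the lock. Without the lock,
    # two hands could both see an empty slot and both put an object there.
    with table_lock:
        if table[slot] is None:
            table[slot] = hand_id
        else:
            rejected.append(hand_id)


# Ten hands compete for five slots: hands h and h + 5 want the same slot.
hands = [threading.Thread(target=place, args=(h, h % TABLE_SLOTS)) for h in range(10)]
for t in hands:
    t.start()
for t in hands:
    t.join()

print("table:", table, "rejected:", rejected)
```

Which of the two competing hands wins each slot is not fixed, only that exactly one does. Every hand queues on `table_lock`, which is the bottleneck described above.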

more threads != faster
64 bits != faster

Snafu_RW

Quote from: Nafensoriel on August 24, 2018, 05:43:34 PM
Multithreading is kinda like DX12. Most people really just don't understand it well enough to know why it won't work the way the media releases claim it does. For one, most parallelism has to be baked in from the start. Trying to crib it in after the fact is often a full rewrite for something large scale. In many cases, applying things like DX12 to overcome draw call limits (the GPU version of multithreading) comes at a huge man-hour cost, and unless you design exclusively around that one API/function you get next to zero benefit out of it.
Ummm.. My 'informed layman' understanding of MT programming raises some objections. Yes, MT will help with GFX & sound, but RW isn't media-oriented (ie AV content) enough for that to be any problem IMO (interrupts notwithstanding). Again, yes: an MT programming architecture does generally require a complete rewrite of the core modules (hence Toady's reluctance to use it in DF), but it could be implemented as a post-release project, perhaps with limited contributors (outsourcing is /sometimes/ good practice!)

Quote
If you want a weird mental image to describe how full parallelism works try to picture sending out 10 pizza deliveries to the same house but..
1] all of them must take different routes.
2] all of them must arrive exactly in the order they need to.
3] if one crashes all 10 cars crash.. No one gets pizza. Everyone's sad.
Umm.. that's a little extreme I think. If an individual thread fails it shouldn't cause a cascade, even if its failure may ultimately result in a core crash: ie bad programming (altho this may be unavoidable in certain situations)

Quote
Galactic Civilizations 3 is the type of full parallelism done right.. but it has its rather major drawbacks. One of them is resources. [..] the largest map size with a moderate to small amount of AI enemies could easily consume 64 GB of memory. It also absolutely hammers the CPU when it does this.
GC3 is more AV-oriented, despite being turn-based strategy, so I'm not that surprised that it takes up loadsa RAM (whether CPU or GFX) for a few seconds while turns are processing. One of the advantages of RW for this kind of thinking is that sprites are extremely light on GFX use, & the audio, as mentioned, is easily farmed off to another thread
Dom 8-)

Nafensoriel

Don't read too much into my attempt to dumb down parallelism, Snafu. RawCode has a better analogy than I did but both work.

As for why I compared DX12 vs MT.. In essence, they do the same task breakup. DX12's main advantage is that you are not limited to one thread for draw calls, so you can scale as far as the CPU side can handle it. RimWorld itself doesn't require media, but it would probably require a single controlling thread thanks to so many things going on that need to happen together. To be fair, the coding I do involves instruments, not games, so someone else with more game experience will have to chime in if I'm totally off my rocker. The point I was attempting to make is that RimWorld's design isn't really built to handle parallelism, and even if you rebuilt it you wouldn't see some insane performance advantage for the average player.

Additionally, GalCiv3's memory issue has nothing to do with its GFX components during turn processing. The game gobbles memory for pathfinding and decision making: in excess of 60 GB for the initial release. I think they have drawn it down to around 16-32 GB at the moment though. Anything dealing with GFX typically gets loaded first and stays loaded, or is streamed as needed from a physical cache. GFX memory loads don't normally change unless you load a new area.

As far as RimWorld is concerned, things like audio are already farmed off to another thread. The game is Unity-based and simple things like that are baked into its toolkit. The areas where RimWorld would be... fun to design parallelism around are components like the social system, considering how much it interacts with.

RawCode

There is no reason to move things like the social tracker or health tracker off the main thread, because such things are lightweight.
It's just like calculating something simple like A+B off-thread: the context switch will take 1000+ times longer than the actual calculation.
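
A rough way to see this for yourself is to time the same trivial addition done inline and done on a freshly started thread. This is only a sketch: the helper names are invented, it measures thread start-up plus join rather than a bare context switch, and the actual ratio will vary by machine and runtime.

```python
import threading
import time


def add(a, b, out):
    # The "work": far too small to be worth handing to another thread.
    out.append(a + b)


def inline():
    out = []
    add(2, 3, out)
    return out[0]


def off_thread():
    out = []
    worker = threading.Thread(target=add, args=(2, 3, out))
    worker.start()
    worker.join()  # wait for the result, as the main thread would have to
    return out[0]


def timed(fn, repeats=200):
    # Average wall-clock seconds per call over a number of repeats.
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    return (time.perf_counter() - start) / repeats


inline_cost = timed(inline)
thread_cost = timed(off_thread)
print(f"inline: {inline_cost * 1e6:.2f} us, off-thread: {thread_cost * 1e6:.2f} us")
```

Both versions return the same answer. The only difference is how much bookkeeping surrounds the addition.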


zizard

It's likely that there is a lot of conventional optimisation available with better value for effort before multithreading.