Multi-core support

Started by Nickname34, January 19, 2016, 12:43:34 PM


Ectoplasm

Quote from: minami26 on February 12, 2016, 03:13:43 AM
all in all I think Tynan is doing superb on RimWorld and I hope this piece of software he made will be a legend once its brought high above the atmosphere for everyone to enjoy. Once RimWorld goes live we'll all just look back at how we all argued what RimWorld should be, rather than what RimWorld has become.

Well said.

Vas

Quote from: hoho on February 12, 2016, 02:08:29 AM
Quote from: Vas on February 04, 2016, 08:09:01 AM
Thread 1 handles temperature changes. Thread 2 handles pawn hauling jobs. Thread 3 handles Pawn Cleaning jobs including snow and calculates where they should go next each job. Thread 4 handles pawn hunger and tiredness levels, beauty calculations and other such. Thread 5 handles mental levels and thought defs. Thread 6 handles power grid states. Thread 7 handles random event calculations. Thread 8 handles Animals and Plants on the map. Game runs at normal/fast speeds because there is nothing waiting in line to be executed.
Nice hypothesis. Too bad all these things depend on each other, meaning there is tons of synchronization and waiting for intermediate results between threads, meaning that parallelization won't give anywhere near a linear increase in speed.
You do realize the game doesn't need to wait for any reason to get a temp right?  Temp can entirely be separated into a different thread, nothing needs to wait on it.  Heat/Cold pushers while powered, will be processed by the temperature thread.
Path finding can also be separated I believe, and path finding is quite the intensive event.  Ever had 100 tribals spawn on your map?
Sure, some things will need to wait on a thread, but it may not be as much work.  I'm not entirely sure but I'm guessing here:
Thread 1 (main thread) : Pawn starts cleaning? == True
Thread 1: Push to Thread 4
Thread 4: Path find to nearest dirt spot.
Thread 4: Clean spot.
Thread 4: Loop 10x times.
Thread 4: Check if pawn needs to change jobs.  If False, loop 10x times.  If True, push to Thread 1.

Quote from: hoho on February 12, 2016, 02:08:29 AM
Quote from: Vas on February 04, 2016, 08:09:01 AM
It really sucks everyone here has taken up to trolling the community for requesting multi-core support all the time.
Most people I've seen "trollin" here are people that actually know a bit or two about programming.
I was talking about the first few posts, not the discussions after I came in.  It doesn't matter if you know a thing or two about programming, it doesn't give you the right to troll someone who doesn't.  You explain why x feature can or can't be done, and you leave it at that.  You don't go "Hah, sure, when pigs fly." like some ass hat.

Quote from: hoho on February 12, 2016, 02:34:40 AM
Quote
By the way, as stated before some things in this game CAN be offloaded to other cores. Path finding, for example.
I'm almost certain that pathfinding in this game is one of the least expensive algorithms, unless awfully inefficient algorithms are being used right now. Maps are relatively small and there aren't many pawns running around. No, 100 isn't many for half-decent pathfinders on a 500k square area.
I've had large raider spawns land on my map, mostly Tribals, that do it, causing massive lag as they all path find to their idle spot, then the lag settles slightly, then starts up again when they start path finding to my elaborate base setup.  This even happens on the default sized map with little to no mods.

Quote from: hoho on February 12, 2016, 02:34:40 AM
Quote
Also, temperature calculations can be offloaded to another core in order to speed up temp checks.
As I said earlier, those checks aren't independent and depend on other things leading to having to synchronize things leading to nowhere near as fast speedups.

For example, if a temperature check has to alter parameters of something, it means nothing else can access that something during the operation. The data of the "something" has to be locked, modified and unlocked. Every single time anything else accesses that object, that lock has to be checked, no matter if it is locked or not in reality. It means the locked status often has to be transferred from one core to another, through RAM in worst-case scenarios. If something does have it locked, then everyone else is just dead in the water, busy-looping and checking for when the lock is released. In other words, everything else waiting behind the lock is wasting CPU time while doing nothing useful.
If an object in the game needs to check the temp that is using Thread 1, and Thread 6 is handling temperatures. Thread 6 does the temp calculations and sets the temp of a room into memory so that any Thread 1,2,3,4,5,7,8 object can check it any time. It keeps outputting the temp to those memory values as often as it is told to. Anything that pushes heat or pushes cooling, is processed by thread 6. If such object relies on power, the power thread, thread 4 (for example) can keep a list of powered objects in memory, and alter their state like that so thread 6 simply needs to check that Item624722 Power == 1 before processing its heat/cold pusher stats. Thread 6 does not need to alter the power net to check the state of an object. Thread 4 does not need to alter the temperature of anything to check the amount of heat in an area. They check a memory value before running their command.

Instead of Thread 1:
Check Power Net;
Power Net Positive == False;
Check power net storage;
Power net storage == Positive;
Set Item624722 power state to True;
Set Item11935 power state to True;
Set other items power state to True; (probably 40 more lines)
Check Item624722 power state;
Item624722 power state == True
Item624722 pushes 16 units of heat to Room662;
Check Room662 Size;
Room 662 size == 30; (cells)
Alter Room662 temperature + 0.53;
Is Room662 temperature dangerous;
Room662 temperature < 400;
Check Item11935 power state; (+7 more lines)
Check other heat/cold pusher power states; (In my case, normally 20 items here, +7 each one)
Check OutdoorsTemp;
Alter all temps inside walls touching Outdoors -0.7; (Probably 20 lines of stuff here)
pawn job7225 (and following 6 lines);
(+60 more lines of miscellaneous stuff);
Check Power Net; (looping back to beginning for the next tick)


You'll get Threads 1, 4, 6:

Simultaneously:
Thread 1: Pawn Jobs (+30 more lines)
Thread 1: Countdowns for story teller events (+5 lines)
Thread 1: Set memory values for various things so other threads can access quickly (+12 lines)

Simultaneously:
Thread 4: Check Power Net;
Thread 4: Power Net Positive == False;
Thread 4: Check power net storage12; (memory value for that particular networked storage)
Thread 4: Power net storage12 == Positive;
Thread 4: Set Item624722 power state to True;
Thread 4: Check Power Net; (loop back to beginning of this stage, for all other items)

Simultaneously:
Thread 6: Check Item624722 power state;
Thread 6: Item624722 power state == True;
Thread 6: Item624722 pushes 16 units of heat to Room662;
Thread 6: Check Room662 Size;
Thread 6: Room 662 size == 30; (cells)
Thread 6: Alter Room662 temperature + 0.53;
Thread 6: Is Room662 temperature dangerous;
Thread 6: Room662 temperature < 400;
Thread 6: Check Item11935 power state; (loop back to beginning for all other heat/cold pushers)
Thread 6: Check OutdoorsTemp;
Thread 6: Alter all temps inside walls touching Outdoors -0.7; (Probably 20 lines of stuff here)

I'm just making a rough guess here for this stuff. But Power Net (thread 4) uploads the state of the power grid for each item to a memory value, while temperature (thread 6) checks those states before executing the heat/cold pushing power of that thing, altering the temperature, and uploading the temperature of that room to memory. Both do this simultaneously, and synchronization doesn't matter here because the temperature from the previous check 1 tick ago is still there, and it is doubtful to have changed dramatically enough in a single tick to matter if it checked 1 nanosecond too soon. Meanwhile Thread 1 is free to carry out the other commands minus all temperature and power net calculations; it only needs to run a single "Check state" command to check a value from memory that the other thread had uploaded there.

Granted, I don't know how this stuff works.  This is just how I imagine it would work if it were optimized with multi thread.
Click to see my steam. I'm a lazy modder who takes long breaks and everyone seems to hate.

TheGentlmen

#62
Argumentum ad nauseam.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Quote from: hoho on February 12, 2016, 02:08:29 AM
Quote from: Vas on February 04, 2016, 08:09:01 AM
Thread 1 handles temperature changes. Thread 2 handles pawn hauling jobs. Thread 3 handles Pawn Cleaning jobs including snow and calculates where they should go next each job. Thread 4 handles pawn hunger and tiredness levels, beauty calculations and other such. Thread 5 handles mental levels and thought defs. Thread 6 handles power grid states. Thread 7 handles random event calculations. Thread 8 handles Animals and Plants on the map. Game runs at normal/fast speeds because there is nothing waiting in line to be executed.
Nice hypothesis. Too bad all these things depend on each other, meaning there is tons of synchronization and waiting for intermediate results between threads, meaning that parallelization won't give anywhere near a linear increase in speed.
You do realize the game doesn't need to wait for any reason to get a temp right?  Temp can entirely be separated into a different thread, nothing needs to wait on it.  Heat/Cold pushers while powered, will be processed by the temperature thread.
Path finding can also be separated I believe, and path finding is quite the intensive event.  Ever had 100 tribals spawn on your map?
Sure, some things will need to wait on a thread, but it may not be as much work.  I'm not entirely sure but I'm guessing here:
Thread 1 (main thread) : Pawn starts cleaning? == True
Thread 1: Push to Thread 4
Thread 4: Path find to nearest dirt spot.
Thread 4: Clean spot.
Thread 4: Loop 10x times.
Thread 4: Check if pawn needs to change jobs.  If False, loop 10x times.  If True, push to Thread 1.

Gawd no. Tynan will not rewrite the jobs system just for you.

Also, having 1 thread per pawn won't end well. The game *will not* be faster, if anything slower due to the extra overhead starting and calling another thread.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Quote from: hoho on February 12, 2016, 02:34:40 AM
Quote
By the way, as stated before some things in this game CAN be offloaded to other cores. Path finding, for example.
I'm almost certain that pathfinding in this game is one of the least expensive algorithms, unless awfully inefficient algorithms are being used right now. Maps are relatively small and there aren't many pawns running around. No, 100 isn't many for half-decent pathfinders on a 500k square area.
I've had large raider spawns land on my map, mostly Tribals, that do it, causing massive lag as they all path find to their idle spot, then the lag settles slightly, then starts up again when they start path finding to my elaborate base setup.  This even happens on the default sized map with little to no mods.
Tribals don't only do pathfinding... you know that right?
They need to decide what to do, etc.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Quote from: hoho on February 12, 2016, 02:34:40 AM
Quote
Also, temperature calculations can be offloaded to another core in order to speed up temp checks.
As I said earlier, those checks aren't independent and depend on other things leading to having to synchronize things leading to nowhere near as fast speedups.

For example, if a temperature check has to alter parameters of something, it means nothing else can access that something during the operation. The data of the "something" has to be locked, modified and unlocked. Every single time anything else accesses that object, that lock has to be checked, no matter if it is locked or not in reality. It means the locked status often has to be transferred from one core to another, through RAM in worst-case scenarios. If something does have it locked, then everyone else is just dead in the water, busy-looping and checking for when the lock is released. In other words, everything else waiting behind the lock is wasting CPU time while doing nothing useful.
If an object in the game needs to check the temp that is using Thread 1, and Thread 6 is handling temperatures. Thread 6 does the temp calculations and sets the temp of a room into memory so that any Thread 1,2,3,4,5,7,8 object can check it any time. It keeps outputting the temp to those memory values as often as it is told to. Anything that pushes heat or pushes cooling, is processed by thread 6. If such object relies on power, the power thread, thread 4 (for example) can keep a list of powered objects in memory, and alter their state like that so thread 6 simply needs to check that Item624722 Power == 1 before processing its heat/cold pusher stats. Thread 6 does not need to alter the power net to check the state of an object. Thread 4 does not need to alter the temperature of anything to check the amount of heat in an area. They check a memory value before running their command.

Instead of Thread 1:
Check Power Net;
Power Net Positive == False;
Check power net storage;
Power net storage == Positive;
Set Item624722 power state to True;
Set Item11935 power state to True;
Set other items power state to True; (probably 40 more lines)
Check Item624722 power state;
Item624722 power state == True
Item624722 pushes 16 units of heat to Room662;
Check Room662 Size;
Room 662 size == 30; (cells)
Alter Room662 temperature + 0.53;
Is Room662 temperature dangerous;
Room662 temperature < 400;
Check Item11935 power state; (+7 more lines)
Check other heat/cold pusher power states; (In my case, normally 20 items here, +7 each one)
Check OutdoorsTemp;
Alter all temps inside walls touching Outdoors -0.7; (Probably 20 lines of stuff here)
pawn job7225 (and following 6 lines);
(+60 more lines of miscellaneous stuff);
Check Power Net; (looping back to beginning for the next tick)


You'll get Threads 1, 4, 6:

Simultaneously:
Thread 1: Pawn Jobs (+30 more lines)
Thread 1: Countdowns for story teller events (+5 lines)
Thread 1: Set memory values for various things so other threads can access quickly (+12 lines)

Pawns need the power state to know which buildings to use. Thread 4 locks the power state variable, so thread 1 does nothing until thread 4 finishes.

Simultaneously:
Thread 4: Check Power Net;
Thread 4: Power Net Positive == False;
Thread 4: Check power net storage12; (memory value for that particular networked storage)
Thread 4: Power net storage12 == Positive;
Thread 4: *Set Item624722 power state to True;*
Thread 4: Check Power Net; (loop back to beginning of this stage, for all other items)

Simultaneously:
Thread 6: *Check Item624722 power state;*
Thread 6: Item624722 power state == True;
Thread 6: Item624722 pushes 16 units of heat to Room662;
Thread 6: Check Room662 Size;
Thread 6: Room 662 size == 30; (cells)
Thread 6: Alter Room662 temperature + 0.53;
Thread 6: Is Room662 temperature dangerous;
Thread 6: Room662 temperature < 400;
Thread 6: Check Item11935 power state; (loop back to beginning for all other heat/cold pushers)
Thread 6: Check OutdoorsTemp;
Thread 6: Alter all temps inside walls touching Outdoors -0.7; (Probably 20 lines of stuff here)

The temperature thread needs to access the power state of an object. The power state is locked the instant thread 4 starts and is released only when thread 4 ends. So thread 6 does nothing as well.

YOU have just described the EXACT scenario which hoho was talking about. Threads 6 and 1 do nothing until 4 finishes. Thread 1 does nothing until 6 finishes, as it needs to know the temperature... (to calculate if your pawns freeze, for example)

You have only added the overhead of creating these threads and having the scheduler handle them.


I'm just, making a rough guess here for this stuff. But Power Net (thread 4) uploads the state of the power grid for each item to a memory value (Not how computers work.), while (No. Thread 6 needs to know the new power state before it starts, so it just waits there doing nothing. It does it AFTER, not while.) temperature (thread 6) checks those states before executing the heat/cold pushing power of that thing and altering the temperature before uploading the temperature of that room to memory both doing this simultaneously and synchronization doesn't matter here because the temperature from the previous check 1 tick ago is still there and it is doubtful to have changed dramatically enough in a single tick to matter if it checked 1 nanosecond too soon. (Gawd no. WHILE thread 4 works on calculating the power state THE VARIABLES ARE LOCKED. LOCKED. THEY ARE LOCKED. How is that hard to understand? NOTHING other than thread 4 can use them WHILE THEY'RE LOCKED. It will not access them 1 ns too early, because they are LOCKED. Thread 6 will sit there doing nothing until they are UNLOCKED. If thread 6 tries to access the LOCKED variable you will receive a memory exception followed by RW promptly crashing.) While Thread 1 is free to carry out the other commands minus all temperature and power net calculations, (But the pawn calculations NEED to know the power state AND temperature. Those variables are LOCKED. This thread will just sit there and do nothing.) where it only needs to run a single "Check state" command (This is the third time I say this. THEY ARE LOCKED. YOU CANNOT USE THEM.) to check a value from memory that the other thread had uploaded there.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Granted, I don't know how this stuff works.  This is just how I imagine it would work if it were optimized with multi thread.

Welcome to Computing 101. Neither the compiler nor the computer gives a flying fuck about what you think.

EDIT: Please note: The red ain't for anger, it's just that I'm too lazy to split it into a thousand separate quotes and the color red makes my comment stand out. If you have another color you'd prefer, please tell me.

Fluffy (l2032)


hoho

#64
Quote from: Vas on February 14, 2016, 02:41:40 PM
You do realize the game doesn't need to wait for any reason to get a temp right? Temp can entirely be separated into a different thread, nothing needs to wait on it. Heat/Cold pushers while powered, will be processed by the temperature thread.
So, when will food on ground be updated depending on ... temperature? What about feelings of pawns and animals? What about plant growth? When will the rooms decide how to change their temperature depending on heat sources and heat "bleeding" through walls? All these things depend on each other at *some* point and *all* these points have to be synchronized between threads.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Path finding can also be separated I believe, and path finding is quite the intensive event
I have no idea how the pathfinding has been implemented in this game, but I do know that on a 2D grid there are some extremely efficient algorithms for doing it. Again, look up Factorio and how they optimized their pathfinder by several orders of magnitude. Having thousands of *actively moving* mobs on a map size of several million squares doesn't really make a dent in game performance there, and *everything* but rendering is running in a single thread in that game.
It is. One can do thousands of calculations in the time a thread spends waiting on a mutex lock, millions if that thread gets thrown off the core to pull in another, and that doesn't even take into account the time it takes to propagate mutex changes between cores.

Quote from: Vas on February 14, 2016, 02:41:40 PM
Thread 1 (main thread) : Pawn starts cleaning? == True
Thread 1: Push to Thread 4
Thread 4: Path find to nearest dirt spot.
Thread 4: Clean spot.
Thread 4: Loop 10x times.
Thread 4: Check if pawn needs to change jobs.  If False, loop 10x times.  If True, push to Thread 1.
"Push to thread 4" - what exactly does that mean? You mean main thread builds a "command" for the "clean dirt" handling thread, adds it to some global queue (that can be modified by a single thread at any time, anyone wanting to add/remove stuff from it *must* wait until previous actions are complete) and sends a message to the thread 4 that it has some work to do?

Sure, it can be done like that. The only problem is that the cleaning thread, as you described, does a ton of stuff that depends on all sorts of external variables that can be changed at any point, and the job it does is spread over several seconds and several dozen "ticks", during each of which everything can change (e.g. the pawn gets hit by a bullet). Also, that would mean you *can't* separate pathfinding to a new, separate thread.
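
A minimal sketch of the queue hand-off discussed above, written in Python purely for illustration. Nothing here comes from RimWorld: the `jobs` queue, the `cleaning_worker` function and the pawn and cell values are all hypothetical. It only shows the shape of the idea: the main thread enqueues a command, a worker thread dequeues it, and every access to the shared queue is serialized internally.

```python
import queue
import threading

# Hypothetical job queue: the main thread enqueues "clean" commands and a
# single worker thread processes them. queue.Queue does its own locking, so
# only one thread can add to or remove from it at any instant.
jobs = queue.Queue()
done = []                      # results written by the worker
done_lock = threading.Lock()   # protects `done`

def cleaning_worker():
    while True:
        job = jobs.get()       # blocks until a job is available
        if job is None:        # sentinel value: no more work
            jobs.task_done()
            break
        pawn, cell = job
        with done_lock:
            done.append(f"{pawn} cleaned {cell}")
        jobs.task_done()

worker = threading.Thread(target=cleaning_worker)
worker.start()

# "Push to thread 4": the main thread builds commands and queues them.
for cell in [(3, 4), (3, 5), (4, 5)]:
    jobs.put(("pawn_1", cell))
jobs.put(None)

jobs.join()      # the main thread waits here until every job is processed
worker.join()
print(done)
```

Note that the sketch sidesteps the hard part raised above: the worker never looks at game state that another thread might change while the job is running.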

Also, that "push to thread 1" at the end seems rather weird. Do you mean that it gives over the CPU for the main thread to run? I sure hope not, as that's nothing like how multithreading works in reality.

Quote from: Vas on February 14, 2016, 02:41:40 PM
You explain why x feature can or can't be done, and you leave it at that.
Sadly, you are trying to offer ideas that are, well, useless. Parallel programming doesn't seem to be a topic you know much about.

Quote from: Vas on February 14, 2016, 02:41:40 PM
I've had large raider spawns land on my map, mostly Tribals, that do it, causing massive lag as they all path find to their idle spot, then the lag settles slightly, then starts up again when they start path finding to my elaborate base setup
So, what makes you think it's pathfinding instead of the pawns updating their mental states to changing surroundings?

Quote from: Vas on February 14, 2016, 02:41:40 PM
If an object in the game needs to check the temp that is using Thread 1, and Thread 6 is handling temperatures. Thread 6 does the temp calculations and sets the temp of a room into memory so that any Thread 1,2,3,4,5,7,8 object can check it any time
... and every time something tries to read that temp value, they need to check if mutex protecting it is locked or not.


Quote from: Vas on February 14, 2016, 02:41:40 PM
If such object relies on power, the power thread, thread 4 (for example) can keep a list of powered objects in memory, and alter their state like that so thread 6 simply needs to check that Item624722 Power == 1 before processing its heat/cold pusher stats.
So, powered items have to additionally check if they are allowed to access the powered state before even beginning their heat calculations. That happens for *every single powered item* that produces heat.

I assume the "simultaneously" there meant that threads 1, 4 and 6 are run in parallel and not that each thread is somehow duplicated.
- Thread 1: Pawn Jobs (+30 more lines)

What does that "+30 more lines" mean? That thread 1 handles pawn jobs? What does "handles a job" mean, exactly? That it takes into account the pawn mental state, resources available and jobs available, pathfinds to gather stuff up and does whatever else is required?

- Thread 1: Countdowns for story teller events (+5 lines)

I'm almost certain that story teller stuff is triggered at best once a second and takes less than 1000 CPU cycles to run. I wouldn't be surprised if it's usually even much less than that. In other words, I believe storyteller takes almost no CPU power at all.

- Thread 1: Set memory values for various things so other threads can access quickly (+12 lines)

Will you lock each element individually or everything together during the time they get updated?
If you do it individually it would mean a ton of mutexes are required (yay increased cache load) and everything has to be checked every time things are accessed. Though at least you can have one thread handling object N in the array while another thread handles object N+1.

When locking everything you'll make everything stop while those updates are made, but at least you won't have to worry about checking each object's mutex and can just assume that if the object array is usable, all are. Fewer mutexes at the cost of less granularity. Though I'm extremely doubtful that object-level mutexes would make much sense in this game.
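
To make the two locking choices concrete, here is a small Python sketch. The `Room` class, the `rooms` list and both helper functions are made up for illustration and say nothing about how RimWorld stores its data. A real design would pick one scheme; the two are shown one after the other here, never at the same time, because a per-object lock and a global lock do not exclude each other.

```python
import threading

class Room:
    def __init__(self, temperature):
        self.temperature = temperature
        self.lock = threading.Lock()   # fine-grained: one mutex per object

rooms = [Room(20.0) for _ in range(4)]
world_lock = threading.Lock()          # coarse-grained: one mutex for all rooms

def heat_room_fine(index, delta):
    # Only this room is blocked; other threads can work on other rooms,
    # but every access pays for taking that room's own mutex.
    room = rooms[index]
    with room.lock:
        room.temperature += delta

def heat_all_coarse(delta):
    # One lock acquisition covers every room, but anything else that needs
    # `world_lock` has to wait until the whole update is finished.
    with world_lock:
        for room in rooms:
            room.temperature += delta

# Phase 1: per-object locking, one thread per room.
threads = [threading.Thread(target=heat_room_fine, args=(i, 0.5))
           for i in range(len(rooms))]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Phase 2: global locking, run only after phase 1 has completely finished.
heat_all_coarse(1.0)
print([room.temperature for room in rooms])
```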

Thread 4, as you described, seems to handle a single powered element. I would guess that would take just a few hundred CPU cycles without accounting for mutexes or locking. Throwing that sort of thing to a separate thread makes little to no sense.


I highly doubt separating electricity handling to a separate thread would make sense. Rimworld has *significantly* simpler power networks than Factorio and it's clearly possible to optimize them to take negligible CPU time in a single thread. I wouldn't be surprised if hitting just a single mutex in there would slow it down by 2x vs having it run in the main thread.


- Thread 6: Check Item624722 power state - assuming per-object locking it would mean checking for that lock before checking for it.
- Thread 6: Item624722 pushes 16 units of heat to Room662; - meaning it has to lock the data structure where it writes to


Quote from: Vas on February 14, 2016, 02:41:40 PM
I'm just, making a rough guess here for this stuff
That you do :)

Most games don't even attempt to separate threads by task-types all that much. Sure, input, rendering, sound and physics are generally each in a different thread, but each one of them has drastically different CPU requirements and that alone rarely has much impact on performance. I've yet to hear of any game getting anywhere near a 2x speedup from just that, going from a single core to any number of cores you throw at it: Amdahl's law.
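
For reference, Amdahl's law bounds the speedup on n cores by 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The sketch below just evaluates that formula. The value p = 0.5 is an assumption picked to match the "no more than 2x" figure in this post, not a measurement of any game.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Assumed for illustration: half of each tick can run in parallel.
for cores in (1, 2, 4, 8, 1_000_000):
    print(cores, round(amdahl_speedup(0.5, cores), 3))
```

With half of the work stuck in the serial part, the bound creeps toward 2x as cores are added but never reaches it.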

What most games do is to figure out what *actually* takes CPU time and parallelize that. Usually it's physics and often preparing data to be sent for rendering on GPU. In Rimworld, neither is anywhere near taxing on CPU (or GPU). Not knowing the details of the inner workings of Rimworld, it's really hard to make suggestions on how to parallelize it successfully, but I can say for a fact that the scenarios you proposed would be rather inefficient for any type of game if the aim is to gain an actual performance increase.

Quote from: Vas on February 14, 2016, 02:41:40 PM
But Power Net (thread 4) uploads the state of the power grid for each item to a memory value, while temperature (thread 6) checks those states before executing the heat/cold pushing power of that thing and altering the temperature before uploading the temperature of that room to memory both doing this simultaneously and synchronization doesn't matter here because the temperature from the previous check 1 tick ago is still there and it is doubtful to have changed dramatically enough in a single tick to matter if it checked 1 nanosecond too soon
Sure, double-buffering can work in some scenarios. Problems emerge if you have feedback loops and if several things can affect the same things: depending on the sequence in which threads get executed, you can get wildly varying results. Not to mention that you'll have to synchronize everything at every tick/frame anyway. Again, Amdahl's law hits hard here as everything will be running at the speed of the slowest part.
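
A rough Python sketch of double-buffering with a per-tick synchronization point. The four-cell "temperature" ring, the averaging rule and the worker split are invented for the example and are not RimWorld's simulation. Each worker reads only last tick's values (`current`), writes only its own cells of next tick's values (`pending`), and all workers must meet at a barrier before the buffers are swapped.

```python
import threading

NUM_WORKERS = 2
TICKS = 3
current = [20.0, 30.0, 40.0, 50.0]   # values every worker reads during a tick
pending = list(current)              # values being written for the next tick

def swap_buffers():
    # Runs once per tick, in exactly one thread, after all workers have
    # arrived at the barrier and before any of them is released.
    global current, pending
    current, pending = pending, current

barrier = threading.Barrier(NUM_WORKERS, action=swap_buffers)

def worker(cells):
    for _ in range(TICKS):
        for i in cells:
            left = current[i - 1]                     # index -1 wraps to the last cell
            right = current[(i + 1) % len(current)]
            pending[i] = (left + current[i] + right) / 3.0
        barrier.wait()   # the per-tick synchronization nobody can skip

threads = [threading.Thread(target=worker, args=(cells,))
           for cells in ([0, 1], [2, 3])]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(current)
```

The result is deterministic only because each cell has exactly one writer and nobody reads `pending`. The barrier is also where the cost shows up: the faster worker idles every tick until the slower one arrives.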


Adding multithreading to anything that was written without having it in mind from the early designs is pretty close to insanity. The ones that have done that have generally ended up rewriting almost the entirety of their codebase. Again, there is a reason why Kerbal Space Program took *years* to update to Unity 5.

In addition, it's always preferred to first optimize the algorithms as much as possible. It helps both single-threaded and multi-threaded cases, and while parallelizing over infinite threads usually doesn't give more than a 2x speedup in games, optimizing algorithms often gives far larger speedups, sometimes orders of magnitude.

For example, if one were to optimize whatever it is that makes huge swarms slow to run at (being optimistic here) 3x the speed over 8 cores, fine, you'll get 30FPS instead of 10 you had before. Now, if one manages to replace a quadratic algorithm with linear or even constant speed one, you can essentially increase the swarm size by several orders of magnitude long before you'll even feel any slowdowns.
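
As a toy illustration of swapping a quadratic algorithm for a linear one, here is a Python sketch. The task (finding pawns that share a map cell with another pawn) and both function names are hypothetical and are not taken from RimWorld's code. Both functions return the same answer; only the amount of work differs.

```python
from collections import Counter

def crowded_pawns_quadratic(positions):
    # Compares every pawn with every other pawn: O(n^2) comparisons.
    result = []
    for i, p in enumerate(positions):
        for j, q in enumerate(positions):
            if i != j and p == q:
                result.append(i)
                break
    return result

def crowded_pawns_linear(positions):
    # Counts pawns per cell once, then does one pass over the pawns: O(n).
    counts = Counter(positions)
    return [i for i, p in enumerate(positions) if counts[p] > 1]

positions = [(1, 1), (2, 3), (1, 1), (5, 5), (2, 3)]
print(crowded_pawns_quadratic(positions))
print(crowded_pawns_linear(positions))
```

For 100 pawns the quadratic version makes up to roughly 10,000 comparisons and the linear one a couple of hundred dictionary operations; at 10,000 pawns the gap is about 100 million versus a few tens of thousands, which no realistic number of cores would close.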

Sadly, without knowing where the bottlenecks are in Rimworld, it's impossible to even guess what might benefit from optimization or if parallelizing it would help at all.


Rant on Factorio:
Quote
Again, looking at Factorio, my current factory is just a beginning one with lousy 5000 solar panels, 2000 accumulators (think batteries) and ~100 coal fired power plants. They power ~500 machines + ~10000 inserters (things that move items from one place to another, I use bullet-based turrets and each requires one inserter, I essentially have a wall of them around my base and have killed over 300,000 enemy units).

The game has no problems running at 60 updates per second. According to the debug screen, the electric network takes about 1/10th of the entire per-tick update, at less than 1ms. I'm producing on average about 20,000 items per minute or ~180 per second. For almost every item, an inserter has to move it from one place to another, and the majority of items move on conveyor belts hundreds of tiles long. My pollution cloud covers around 5 million squares (and gets absorbed/moves/spreads according to forests/winds) and influences the actions of all the mobs in its area (though pollution gets calculated by chunks, each 256 squares). Overall I've explored an area of around a billion squares (equivalent to a Rimworld map size of ~32,000 times 32,000) = thousands of enemy spawners generating mobs to throw at me as pollution hits them, or just to wander around idling (but still requiring path finding).

In addition to all that, the game is *massively* modded with mods written in Lua - zero compiled code in mods as there is in Rimworld. The game does have 2 threads but one of them is purely for rendering while the other deals with all the heavy-lifting.

In other words, it's perfectly possible to optimize a game to be able to handle huge scale and loads of entities/pawns without too much problem. Obviously, it's not easy. Those guys have had several optimization passes, each bringing significant increase to the size of factory one can make before it slows down too much.


Also note that my factory is tiny compared to what some people have created after spending over a thousand hours building one. I know some require nearly an hour of real time to travel between the mining outposts at opposite sides of the map using an in-game train.


TL;DR

It is possible to multithread Rimworld. I don't think it's required nor do I think it'd give a measurable performance increase. I'm absolutely positive that there are MASSIVE gains to be had from improving the algorithms that are currently used.

Also, sorry if I sound harsh. It's just that your suggestions aren't really helping (you can be quite certain that Tynan knows *much* better than any of us what would make sense to parallelize and how, if anything) and make claims about parallelization that simply don't work in the real world.

TheGentlmen

You hit the nail on the head, hoho. Way better than I could have.

Fluffy (l2032)

Thanks hoho, I've been trying to say this stuff for weeks, but you've said it better and with much more detail than I even could.

Also, I feel like I may be partially responsible for the belief that pathfinding is such a bottleneck, I've repeated that several times but I'm actually not sure where I got the information from. Oh well, if only we could attach a debugger/profiler.

Vas

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Gawd no. Tynan will not rewrite the jobs system just for you.

Also, having 1 thread per pawn won't end well. The game *will not* be faster, if anything slower due to the extra overhead starting and calling another thread.
Wow, when did I say 1 thread per pawn?  I was making an example.
Also, when did I say "just for me"?  Quit exaggerating.  Everyone but you wants multicore to happen.  See, I can do it too.

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Bitchy red rant of bitchyness and being dickish
Perhaps no one told me in a nice way that threads lock a variable, when you are able to do this, I will respond to you.  As I had said at the end, which you also quoted below.  I don't know exactly how it works.  I assumed that if it can upload a thing to memory, the thread may lock something but the memory value can be read at any time.  If something needs to check the state, it looks for the memory value and reads it.  I assumed, you could READ locked things.

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Quote from: Vas on February 14, 2016, 02:41:40 PM
Granted, I don't know how this stuff works.  This is just how I imagine it would work if it were optimized with multi thread.
Welcome to Computing 101. The compiler nor the computer give a flying fuck about what you think.
Well if you're going to be a rude little dick, I don't give a flying fuck what you think either.



Quote from: hoho on February 15, 2016, 04:50:19 PM
So, when will food on ground be updated depending on ... temperature? What about feelings of pawns and animals? What about plant growth? When will the rooms decide how to change their temperature depending on heat sources and heat "bleeding" through walls? All these things depend on each other at *some* point and *all* these points have to be synchronized between threads.

---------------------------------------------------------------
TL;DR

It is possible to multithread Rimworld. I don't think it's required nor do I think it'd give a measurable performance increase. I'm absolutely positive that there are MASSIVE gains to be had from improving the algorithms that are currently used.

Also, sorry if I sound harsh. It's just that your suggestions aren't really helping (you can be quite certain that Tynan knows *much* better than any of us what and how would make sense to parallelize, if anything) and make claims about parallelization that simply don't work in real world.
Previously, I had assumed that a thread can look up read only memory values to perform its next calculation.  AKA, A plant that depends on the temp can look up the memory value for the room temp it is in, to decide if it should continue growing or not.  It doesn't need to write to the temp to be able to grow after all.  It only needs to know what it is.  The ticker event handles how often the plant will check the temperature.  It doesn't need to be perfectly synchronized after all.  That is what I assumed.  And after reading half of what Gent said and him saying "don't give a fuck what you think", I'm rather pissed and don't feel like reading much now.  I'll come back and read the rest of what you said a bit later but I give up.  No one here gives a shit about my original intentions of posting here.

I was merely posting before to get people to stop being dicks towards people who suggest multi-threading.  I should have left it at that and not said any more, but everyone here who programs likes to intervene and say "NO DON'T POST IT" and/or troll the person who posted it.  My original point, my original goal, was to get everyone to stop trolling someone over an idea.  Unless that idea is obviously outlandish and stupid, there should be no cause for trolling and multicore is not an outlandish or stupid idea.  It's difficult, but possible, and most games these days support it.  So there's no need to troll someone over it.

I'm sure Tynan does know more though.  I was just trying to provide my own perspective.  Also sometimes it takes a different perspective for someone to catch an idea of how to do something.  My perspective may not be possible but someone could see it and think "Holy crap, I got an idea" from it.  Until I read your post, I may not understand why a thread can't read a read only memory value to do its next thing, so I'll wait on anything further.  For now, I just give up.
Click to see my steam. I'm a lazy modder who takes long breaks and everyone seems to hate.

TheGentlmen

Quote from: Vas on February 18, 2016, 03:37:27 PM
Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Gawd no. Tynan will not rewrite the jobs system just for you.

Also, having 1 thread per pawn won't end well. The game *will not* be faster, if anything slower due to the extra overhead starting and calling another thread.
Wow, when did I say 1 thread per pawn?  I was making an example.
Also, when did I say "just for me"?  Quit exaggerating.  Everyone but you wants multicore to happen.  See, I can do it too.
I know for a fact not everyone wants a multi core system.
Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Bitchy red rant of bitchyness and being dickish
Perhaps no one told me in a nice way that threads lock a variable, when you are able to do this, I will respond to you.  As I had said at the end, which you also quoted below.  I don't know exactly how it works.  I assumed that if it can upload a thing to memory, the thread may lock something but the memory value can be read at any time.  If something needs to check the state, it looks for the memory value and reads it.  I assumed, you could READ locked things.

Hmm, is green a good color?
Have you read hoho's response? I think he said it nicely and clearly.
PS: Next time before spewing crap out of your ass, maybe you should consider reading about multi threading. You'll sound less like an ignorant little fuck.

Here is a great quote from StackOverflow (which literally took 13 seconds to find)

Quote from: StackOverflow
When I am having a big heated discussion at work, I use a rubber chicken which I keep in my desk for just such occasions. The person holding the chicken is the only person who is allowed to talk. If you don't hold the chicken you cannot speak. You can only indicate that you want the chicken and wait until you get it before you speak. Once you have finished speaking, you can hand the chicken back to the moderator who will hand it to the next person to speak. This ensures that people do not speak over each other, and also have their own space to talk.

Replace Chicken with Mutex and person with thread and you basically have the concept of a mutex.

Of course, there is no such thing as a rubber mutex. Only rubber chicken. My cats once had a rubber mouse, but they ate it.

Of course, before you use the rubber chicken, you need to ask yourself whether you actually need 5 people in one room and it would not just be easier with one person in the room on their own doing all the work. Actually, this is just extending the analogy, but you get the idea.

From: http://stackoverflow.com/questions/34524/what-is-a-mutex

Also:
http://www.albahari.com/threading/
http://www.albahari.com/threading/part2.aspx
http://www.albahari.com/threading/part3.aspx
http://www.albahari.com/threading/part4.aspx
http://www.albahari.com/threading/part5.aspx

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 14, 2016, 07:32:33 PM
Quote from: Vas on February 14, 2016, 02:41:40 PM
Granted, I don't know how this stuff works.  This is just how I imagine it would work if it were optimized with multi thread.
Welcome to Computing 101. The compiler nor the computer give a flying fuck about what you think.
Well if you're going to be a rude little dick, I don't give a flying fuck what you think either.

I never said that _I_ don't give a flying fuck what you think; I said the _compiler_ doesn't give a flying fuck. Then again, it goes without saying that I don't give a flying fuck either.  :P



Quote from: hoho on February 15, 2016, 04:50:19 PM
So, when will food on ground be updated depending on ... temperature? What about feelings of pawns and animals? What about plant growth? When will the rooms decide how to change their temperature depending on heat sources and heat "bleeding" through walls? All these things depend on each other at *some* point and *all* these points have to be synchronized between threads.

---------------------------------------------------------------
TL;DR

It is possible to multithread Rimworld. I don't think it's required nor do I think it'd give a measurable performance increase. I'm absolutely positive that there are MASSIVE gains to be had from improving the algorithms that are currently used.

Also, sorry if I sound harsh. It's just that your suggestions aren't really helping (you can be quite certain that Tynan knows *much* better than any of us what and how would make sense to parallelize, if anything) and make claims about parallelization that simply don't work in real world.
Previously, I had assumed that a thread can look up read only memory values to perform its next calculation.  AKA, A plant that depends on the temp can look up the memory value for the room temp it is in, to decide if it should continue growing or not.  It doesn't need to write to the temp to be able to grow after all.  It only needs to know what it is.  The ticker event handles how often the plant will check the temperature.  It doesn't need to be perfectly synchronized after all.  That is what I assumed.  And after reading half of what Gent said and him saying "don't give a fuck what you think", I'm rather pissed and don't feel like reading much now.  I'll come back and read the rest of what you said a bit later but I give up.  No one here gives a shit about my original intentions of posting here.

Such a shame, you will not read a perfectly good answer just because some asshole called 'Gentz' hurt your feelings. I'm sorry for your loss, but if you refuse to listen to the other side of the argument then why should I even bother responding?

I was merely posting before to get people to stop being dicks towards people who suggest multi-threading. 
No one was being dickish, they were being sarcastic, there's a difference.
I should have left it at that and not said anymore, but everyone here who programs likes to intervene and say "NO DON'T POST IT"
Well, maybe people should start using the search function. TBH this 'debate' has been settled a long time ago with the results being "Not happening".
and/or troll the person who posted it. 
If they won't bother searching it up then we (or at least I) will take the equivalent amount of effort and just troll them.
My original point, my original goal, was to get everyone to stop trolling someone over an idea. 
Yup, but you're far from that original goal, now you're pretending to have some knowledge about multithreading and spewing out pseudocode which has no basis in reality.
Unless that idea is obviously outlandish and stupid,
If you'd read anything we've said you'd realize the idea IS outlandish and stupid FOR this type of application. You don't even need to read what WE said, you can use the mystical search function and see what others have said, even Tynan himself.
there should be no cause for trolling and multicore is not an outlandish or stupid idea. 
Learn how multithreading works, then you will realize how stupid and outlandish it is for this type of application.
It's difficult, but possible, and most games these days support it. 
Argumentum ad populum. Logic is:
IF many things have done it THEN it is plausible AND beneficial.
Okay, many people spoke Latin in the past SO everyone speaking Latin NOW must be a GREAT idea, and surely everyone will benefit, right?
God no.
OTHER games have done it, it doesn't mean it will actually work with THIS game.
OTHER people have benefited a lot in the past from learning Latin, it doesn't mean that I will benefit from learning Latin to the same degree, if at all.

So there's no need to troll someone over it.
There is no reason for you to spew random crap out about how multithreading should be done when you know nothing about multithreading or computers.

I'm sure Tynan does know more though. 
Great. You can agree on ONE thing.
I was just trying to provide my own perspective. 
Great. I don't care about how you THINK computers should work, I only care about HOW THEY ACTUALLY WORK
Also sometimes it takes a different perspective for someone to catch an idea of how to do something. 
There is nothing we can learn from YOUR perspective other than that you lack basic knowledge in this field. Anything other than that that we can "learn" is false.
My perspective may not be possible but someone could see it and think "Holy crap, I got an idea" from it. 
Professionals in this field have worked for more than a decade to improve multithreading, and you just think you can come in with no experience whatsoever and revolutionize multi-core computing? Ego much?
MAYBE if you had SOME knowledge in multi threading and then CLEARLY defined a method that can THEORETICALLY increase speeds by let's say 0.01% then we could get an idea from it. But you just spewing crap gives no ideas.

Until I read your post, I may not understand why a thread can't read a read only memory value to do its next thing, so I'll wait on anything further.  For now, I just give up.

Vas

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 18, 2016, 08:07:08 PM
Ugly wall of text that I'm still not reading.
How about you just drop it, and leave the god damn thread alone.  Stop being such a rude little asshole.  I swear, is it really so hard for you to fucking drop it?  Maybe if you fucking open your god damn eyes, you'll see I said "I give up" and explained some other things that you're still fighting against because you wanna sound like a super genius and win the thread war.  I'd be more inclined to read what you say if you didn't come at me all angry and pissy and with the "I'm right, you're wrong, go back to school" attitude.  But nope.  You're still coming here coloring shit and not bothering to try and make your post look nice, instead you think color fixes it when you should be cutting out bits to make it less of a wall of text.  EG, the links on "threading" where you linked all the part pages, which was clearly useless.

It seems your only purpose here is to pick a fight you can win and drive in the nail as deep as you can get it simply so everyone can go "Oh wow, Gent knows everything, that guy's awesome.  He sure showed Vas!"

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 18, 2016, 08:07:08 PM
If they won't bother searching it up then we (or at least I) will take the equivalent amount of effort and just troll them.
Interesting.  Instead of being a "gentleman" and saying "This has been debated before, here's the thread link." and then reporting the post to have it locked by a moderator, you go into the post and troll them like a dick.  Got it.



P.S. I know I was wrong on my multi-core assumptions and theories (when I didn't know anything about how it worked) (still don't particularly know) but I'm not gonna sit here and let some guy treat me like I'm a dumbass and sit here publicly doing his best to make me out to be the bad guy and everything.  A moderator or admin needs to delete or lock this thread.  Whichever.  At this point it is a useless thread where people come here for a laugh or to troll someone.

TheGentlmen

Quote from: Vas on February 18, 2016, 10:39:34 PM
Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 18, 2016, 08:07:08 PM
Ugly wall of text that I'm still not reading.
The ignorance is strong with this one, if you won't bother reading it why bother replying?
How about you just drop it, and leave the god damn thread alone.
How about you just drop it, and leave the god damn thread alone.
Stop being such a rude little asshole. 
And you're not being rude either? ???
I swear, is it really so hard for you to fucking drop it? 
Yes, it actually is hard for me to drop it.
Maybe if you fucking open your god damn eyes, you'll see I said "I give up"
If you gave up then why reply?
and explained some other things that you're still fighting against
Well, your explanations are invalid, or I wouldn't have to reply to them.
because you wanna sound like a super genius and win the thread war. 
I don't need your confirmation to 'sound like a super genius'. TBH anyone who knows a thing about computers should know this stuff.
I'd be more inclined to read what you say if you didn't come at me all angry and pissy and with the "i'm right your wrong go back to school" attitude. 
You seem to have that same attitude, yet I still read what you write.
But nope.  You're still coming here coloring shit and not bothering to try and make your post look nice,
I can't be bothered to separate it into 50 different quotes. Just like you can't be bothered to learn how multithreading actually works.
instead you think color fixes it when you should be cutting out bits to make it less of a wall of text. 
But I responded to every line. TBH the only thing I should have cut out was the last line. Does that ONE line bother you?
EG, the links on "threading" where you linked all part pages which clearly was useless.
Or you didn't read them. I've been using those as a reference sheet (see I don't talk out of my ass, unlike you) the whole time so I find it hard to believe they are useless. They describe EXACTLY how to multithread in C# (what RW is made in). What more can you ask for?

It seems your only purpose here is to pick a fight you can win and drive in the nail as deep as you can get it simply so everyone can go "Oh wow, Gent knows everything, that guy's awesome.  He sure showed Vas!"
Nope. My goal is for you to learn ANYTHING about how threading works.

Quote from: TheGentlmen (GENTZ /'jen(t)z/) on February 18, 2016, 08:07:08 PM
If they won't bother searching it up then we (or at least I) will take the equivalent amount of effort and just troll them.
Interesting.  Instead of being a "gentleman"
Despite what my name may lead you to believe, I'm far from a "gentleman"
and say "This has been debated before, here's the thread link." then report the post to have it locked by a moderator. 
OR they can use the fucking search function.
You go into the post and troll them like a dick.  Got it.
TBH in this post I've yet to troll anyone because I got here late. The only thing I've done in this post is reply to the extreme concentration of bullshit you call a reply.


P.S. I know I was wrong on my multi-core assumptions and theories (when I didn't know anything about how it worked)
Well, why make assumptions when you can read up about how it ACTUALLY works?
(still don't particularly know) but I'm not gonna sit here and let some guy treat me like I'm a dumbass
So sorry I broke your heart.
and sit here publicly doing his best to make me out to be the bad guy and everything. 
Err, no. This ain't my 'best'... my 'best' would get me temp banned.
A moderator or admin needs to delete or lock this thread. 
Okay then.
Whichever.  At this point it is a useless thread where people come here for a laugh or to troll someone.
Laughing is good for your health you know.




I just realized you're not gonna read it at all. Just shows how ignorant you are.

Fluffy (l2032)

Wow. Can both of you cool it down a bit?

@Vas; I'm curious, do you actually want Tynan to implement multicore, or are you merely suggesting it would be a good thing?
Thing is, several people who know more about the subject have told you that it's basically not worth the effort, but you keep arguing that it is. Anyhow, this thread is turning nasty - I'm out.

hoho

Quote from: Vas on February 18, 2016, 03:37:27 PM
I assumed that if it can upload a thing to memory, the thread may lock something but the memory value can be read at any time.
Technically, you are correct - everything can be read at any time.

Problems emerge when a parameter is more complex than the simplest variables (a single integer or floating point value, a boolean). In those cases, it's likely that while one thread is changing that multi-number parameter, the thread reading it gets the first half from the old state and the second half from the new state. So all write *and* read accesses must check the locks.
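
To make the "half old, half new" problem a bit more concrete, here is a minimal sketch in Python (not the C# that Rimworld is written in). `Position`, `move_to`, `snapshot` and `run_demo` are made-up names for illustration only, not anything from the game. The point is just that the reader has to take the same lock as the writer, otherwise it could see x from one update and y from another.

```python
import threading


class Position:
    """A two-number value whose halves must always be seen together."""

    def __init__(self):
        self._lock = threading.Lock()
        self._x = 0
        self._y = 0

    def move_to(self, x, y):
        # Writer: both halves change while the lock is held.
        with self._lock:
            self._x = x
            self._y = y

    def snapshot(self):
        # Reader: takes the SAME lock, so it never sees a half-finished update.
        with self._lock:
            return (self._x, self._y)


def run_demo(steps=20000):
    """Run one writer and one reader; return (torn read count, final position)."""
    pos = Position()
    torn = []
    done = threading.Event()

    def writer():
        for n in range(steps):
            pos.move_to(n, n)  # x and y are always written as equal values
        done.set()

    def reader():
        while not done.is_set():
            x, y = pos.snapshot()
            if x != y:  # would mean half old state, half new state
                torn.append((x, y))

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(torn), pos.snapshot()
```

With the lock in both `move_to` and `snapshot`, the torn count stays at zero. The price is that the reader and writer now take turns, which is exactly the waiting discussed in this thread.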

Also, locking doesn't happen automatically. The programmer has to explicitly write something like:

check if lock is already locked by someone else
If yes, put the thread to sleep with wake condition of "this lock is released"*
If not, lock it yourself
change or read the variable
remove the lock
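
The steps above can be sketched roughly like this in Python. It's only an illustration under assumptions: `room_temperature`, `temperature_lock`, `add_heat` and `read_temperature` are hypothetical names, and Python's `threading.Lock` does the "check, sleep until released, then lock" part inside a single `acquire()` call.

```python
import threading

# Hypothetical shared value and the lock guarding it (names are made up).
temperature_lock = threading.Lock()
room_temperature = 21.0


def add_heat(delta):
    """Change the shared value following the lock / modify / unlock steps."""
    global room_temperature
    # If another thread holds the lock, acquire() puts this thread to sleep
    # until the lock is released; otherwise it locks it immediately.
    temperature_lock.acquire()
    try:
        room_temperature += delta  # change the variable
    finally:
        temperature_lock.release()  # remove the lock, even if an error occurred


def read_temperature():
    """Reads go through the same lock; 'with' is shorthand for acquire/release."""
    with temperature_lock:
        return room_temperature
```

Every place that touches `room_temperature` has to remember to do this by hand. Forget it in one spot and the lock protects nothing.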

* If the thread does happen to get thrown off the core while it's sleeping, that alone will take hundreds of CPU cycles; getting it back on costs hundreds more to move data, and thousands if some data was thrown out of the caches completely.

Quote from: Vas on February 18, 2016, 03:37:27 PM
No one here gives a shit about my original intentions of posting here.
"Problem" is, everyone with half a brain knows that hypothetically, multithreading can make things faster. There isn't much reason to point that out as I'm quite certain Tynan is fully aware of it already.

As has been said here, multithreading can at best increase speed linearly, by 2-3x at most. Optimizing algorithms will often give much more performance than that and will make the game scale vastly better. It makes MUCH more sense to optimize algorithms than to spend huge amounts of time making the game run in threads.
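
For a rough feel of why the ceiling is so low, the usual back-of-the-envelope tool is Amdahl's law (my choice of formula here, nobody in the thread named it): if only a fraction of the work can run in parallel, the serial remainder caps the speedup no matter how many cores you add. A small sketch, where `amdahl_speedup` is a made-up helper and the 60% figure in the note below is purely illustrative, not a measurement of Rimworld:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup when `parallel_fraction` of the work is spread
    over `cores` cores and the rest stays serial (Amdahl's law).
    Ignores locking and synchronization overhead, so it is optimistic."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

If, say, 60% of a tick could be parallelized, `amdahl_speedup(0.6, 4)` comes out at about 1.8x and `amdahl_speedup(0.6, 8)` at about 2.1x, and it can never reach 2.5x however many cores are added. An algorithmic fix that makes the slow part ten times cheaper has no such ceiling.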
Quote from: Vas on February 18, 2016, 03:37:27 PM
It's difficult, but possible, and most games these days support it.
I can assure you, almost every AAA game out there is *not* parallelizing their core logic to run in more than one thread.

What they probably are doing is separating things out by task, e.g. rendering in one thread, physics in another, preparing data for rendering in a third, AI in a fourth etc. They also usually use third-party libraries for some tasks, especially physics, and those libraries are often internally multithreaded. As most of those games are rather graphics-intensive, the threads dealing with rendering and preparing data for rendering take a LOT of computing time, so spreading them out to different threads makes sense.

Rimworld is essentially nothing like those games. Its rendering is trivial (I'd think it could run at several thousand FPS if we turned off all other processing) and it has no real physics. Almost all the computing time is taken by the "AI" and general game logic, and it's extremely hard to split that to run in several threads. In those AAA games, that part of the game takes barely a couple dozen percent of a single core to run.

hoho

"Multi number parameters" could be having to check for all body parts for injuries-upgrades to figure out how fast someone runs, having to check for positions of animals to figure out who to hunt, having to check objects in path of a bullet to figure out where it hits etc.

Another HUGE issue with multithreading is when objects can be removed or created pretty much at random. For example, if one character is applying medicine to another and the patient dies in the middle of it (and the data structures describing their health go with it), the one doing the healing *must* know that the patient died before it attempts to do anything with the data of the character they were healing.

Essentially, even if one thread wants to only read data that is managed by another thread it's not enough to just check if the data exists to start reading it. It must be enforced that the data doesn't get deleted before all the reading is complete. Trying to read data that has been deleted generally results in application crashing. Same with trying to write to structures that have been deleted. Only protection against such crashes is, again, locking them leading to bottlenecks.
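
A small sketch of the "check and use under the same lock" idea, again in Python with made-up names (`PawnRegistry`, `spawn`, `remove`, `heal`, `health_of` are all hypothetical). In Python a vanished entry would raise an exception rather than crash the whole process the way a dangling pointer can in native code, but the race is the same: checking that the data exists and then using it has to happen as one locked step.

```python
import threading


class PawnRegistry:
    """Health records that one thread may remove while another uses them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._health = {}

    def spawn(self, pawn_id, health):
        with self._lock:
            self._health[pawn_id] = health

    def remove(self, pawn_id):
        # Called when the pawn dies; its data goes away with it.
        with self._lock:
            self._health.pop(pawn_id, None)

    def heal(self, pawn_id, amount):
        """Return True if healed, False if the pawn no longer exists."""
        with self._lock:
            # The existence check and the write are covered by the SAME lock,
            # so the pawn cannot be removed in between them.
            if pawn_id not in self._health:
                return False
            self._health[pawn_id] += amount
            return True

    def health_of(self, pawn_id):
        with self._lock:
            return self._health.get(pawn_id)
```

Note that while `heal` holds the lock, `remove` has to wait, and vice versa. That waiting is the bottleneck described above.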


Yet another issue with threads is deadlocks. A deadlock is what happens when you have a circular dependency between locks and threads. Here is a simple example code demonstrating deadlocks: http://stackoverflow.com/a/1385876 That sync() call there is locking the variable. If a deadlock occurs, the application hangs - it can't proceed further as two (or more) threads are stuck waiting on each other and none can proceed.
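
For anyone who'd rather see it than read it, here is a minimal Python sketch of the same circular wait (this is my own illustration, not the code from the link; `try_opposite_order` and the lock names are made up). Two threads take two locks in opposite order. A real deadlock would hang forever, so the second `acquire` uses a timeout purely so that the sketch terminates and can report what happened.

```python
import threading


def try_opposite_order(timeout=0.2):
    """Two threads grab two locks in opposite order.
    Returns {thread name: whether it managed to get its second lock}."""
    lock_a = threading.Lock()
    lock_b = threading.Lock()
    both_hold_first = threading.Barrier(2)    # both threads hold one lock each
    both_tried_second = threading.Barrier(2)  # neither lets go until both tried
    got_second = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # Each thread now waits for the lock the OTHER thread is holding.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            both_tried_second.wait()
            if acquired:
                second.release()

    t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return got_second
```

Neither thread ever gets its second lock. Without the timeout both would sit there forever. The usual cure is to make every thread take the locks in the same fixed order, which is easy with two locks and very hard to guarantee across a whole game's worth of interacting systems.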

Avoiding deadlocks is by far the hardest thing in threading.

Ramsis

Just wanted to say that all the fighting stops now or I'm looking at about six or so people looking at a full week ban. The good news? One of you buggers is about to win the perma-ban lottery if you have to be warned/banned again!

I grow sick and tired of the childish bickering. There are better ways to have a jovial slap-fight and you all seem to be really bad at it.

Ugh... I have SO MANY MESSES TO CLEAN UP. Oh also I slap people around who work on mods <3

"Back off man, I'm a scientist."
- Egon Stetmann


Awoo~