Multi-threading discussion (split from A18 features thread)

Started by FrodoOf9Fingers, October 04, 2017, 04:42:21 PM

Previous topic - Next topic

Bozobub

Quote from: Kiame on October 06, 2017, 12:42:25 PMStuff.
I think you are confusing the (potential for) "atomization" of a given task with "multithreading".  It's not the same thing at all.

Really, the biggest problem is no one really knows how to code all that well for parallel processes, at least in a general sense.
Thanks, belgord!

FrodoOf9Fingers

Quote from: Kiame on October 06, 2017, 01:14:15 PM
Quote from: FrodoOf9Fingers on October 05, 2017, 07:35:20 PM
Heh, I guess we derailed too much :P

I should have known better, i just really like the idea of multithreading  :P

Just curious, since we have a thread for this now, where's the multi-threading happening? My initial go at this would be to have the rendering engine be one thread and the backend/updating part broken down into multiple consumer threads with some type of load balancing (round robin probably assuming everything takes about the same amount of time to update)

I don't have the ability to really profile the code and see where it's needed, so I'm just doing at the main game update loop. In particular, I'm splitting threads in the TickList.Tick method. In there, there's a list of every object that has some update to do during a tick. In single threaded mode, it loops through each one. Currently, I make 8 threads (subject to change) that take an object, do it's tick work, and then return and grab the next one. It's using a thread pool too, so no need to worry about constant instantiation of threads.

FrodoOf9Fingers

Quote from: Bozobub on October 06, 2017, 02:47:40 PM
Quote from: Kiame on October 06, 2017, 12:42:25 PMStuff.
I think you are confusing the (potential for) "atomization" of a given task with "multithreading".  It's not the same thing at all.

Really, the biggest problem is no one really knows how to code all that well for parallel processes, at least in a general sense.

Nah, there are people who know how to code for parallel processes. It's called writing functional code. I like to think of it in two ways: first is the reliance on using values. A value doesn't change, if you need a new value you create a new value. The second is that a function should always return the same thing given the same input, which is somewhat in opposition to Object Oriented programming, where a no-input getter returns different values depending on state.

Once you have those two, multi-threading is not far of a stretch, but I personally don't believe it is a "fix-all" solution. A good intro video on the subject can be found below: https://www.infoq.com/presentations/Value-Values

Kiame

Quote from: Bozobub on October 06, 2017, 02:47:40 PM
Quote from: Kiame on October 06, 2017, 12:42:25 PMStuff.
I think you are confusing the (potential for) "atomization" of a given task with "multithreading".  It's not the same thing at all.

Really, the biggest problem is no one really knows how to code all that well for parallel processes, at least in a general sense.

Sorry i just don't know what you're trying to say...

Are you talking about atomization as in atomic functions like test&set? Are you talking about some sort of synchronization (which is not atomic)? Are you talking about proper approaches/design of parrallel programming?

In either case there are people who do understand and can do it. I will admit I am no expert. I've worked with and designed prgrams that are multithreaded though so do have working experience in the field.

Quote from: FrodoI don't have the ability to really profile the code and see where it's needed, so I'm just doing at the main game update loop. In particular, I'm splitting threads in the TickList.Tick method. In there, there's a list of every object that has some update to do during a tick. In single threaded mode, it loops through each one. Currently, I make 8 threads (subject to change) that take an object, do it's tick work, and then return and grab the next one. It's using a thread pool too, so no need to worry about constant instantiation of threads.

I was afraid of that... You're able to get the backend with the consumers but the GUI is still tied to the updating - render thread handles the ticks, populates the consumer/update threads, once everything's updated the rendering takes over again. << This is mainly based off of my modding experience and my observations, not from actually looking at the code so please correct me if i'm mistaken here.

If this is correct, hint for Bozobub, this is linear execution with partial parallel threading where one thread still has to wait on other threads to complete before continuing.

Performance should increase certainly but there will still be times where the graphics freeze when one object takes a long time to update (though this should happen much less frequently if at all if the thread population is properly load balanced)


If we had free reign over the code/the code was refactored with multithreading in mind I'd recommend divorcing the renderer from the backend. The shared resources would be what the renderer needs to draw the objects and where to draw them (position/rotation/scale). In C# assigning memory locations is atomic so one way to share the items between the two is: A static list which contains the items to be updated/rendered. The renderer is read-only. The update creates a copy of the object when the object is fed to an update-consumer thread. When the update thread is done updating the object it can then set the object back in the list (using a static index for the item) assuming the object has not changed. If it has, redo the update from the beginning with the changed version of the object to update. -- This is a lock-less approach

Some drawbacks from initial thoughts:
The list will need to be resized which is costly with array lists and c# does not have an async implementation of that built in (this can be gotten around with 3rd party libs).
If objects are updated outside of the update thread while the object is being updated it will need to be re-updated from scratch. This should be rare but can still add some overhead. (or just re-write this in Haskell  :o )

In either case, this later approach would allow the game to render as fast as the computer can handle and would not slow down no matter how long an object take to update. What could happen is a pawn could stop moving if it's stuck updating or it's a case of a race condition failing (a bug)

FrodoOf9Fingers

>The list will need to be resized which is costly with array lists and c# does not have an async implementation of that built in (this can be gotten around with 3rd party libs).

I believe Lists maintain the memory they ask for after a resize, and since the list isn't reinstantiated, we should see no more than 10 or so resizes (have more than 1000 objects with tick operations on your map at once)? 10 resizes once every game load is very small in cost. Negligible compared to the cost of an update loop even if it was resized every update loop cycle.

>If objects are updated outside of the update thread while the object is being updated it will need to be re-updated from scratch. This should be rare but can still add some overhead.

Not necessarily true, if the only update for a pawn1 is the position, and if pawn 2's update tick affects pawn1's HP, then those two can happen concurrently.

Kiame

Quote from: FrodoOf9Fingers on October 06, 2017, 05:05:50 PM
>The list will need to be resized which is costly with array lists and c# does not have an async implementation of that built in (this can be gotten around with 3rd party libs).

I believe Lists maintain the memory they ask for after a resize, and since the list isn't reinstantiated, we should see no more than 10 or so resizes (have more than 1000 objects with tick operations on your map at once)? 10 resizes once every game load is very small in cost. Negligible compared to the cost of an update loop even if it was resized every update loop cycle.

Hmm, not sure how C#'s array list handles removes and if it can shrink. If it can shrink this is still an issue to consider. If not then it's nothing to worry about. I remember back when i was first playing with multithreading i had a storage container of an array list (which would not shrink) and a queue. If an object was removed the index would be populated with null. The index would then be put into the queue. As the caller iterated over the list it would skip null entries. If a new object needed to be inserted it would first remove from the queue for a place in the list. If the queue was empty then it would .add() to the list.

Quote>If objects are updated outside of the update thread while the object is being updated it will need to be re-updated from scratch. This should be rare but can still add some overhead.

Not necessarily true, if the only update for a pawn1 is the position, and if pawn 2's update tick affects pawn1's HP, then those two can happen concurrently.

I'm coming from a perspective that each update should keep the pawns as up to date as possible. In the case of a pawn being updated. If the pawn is being updated and another update - say a bullet - hits the pawn (while the pawn is being updated) causing the originally stored pawn's hp to decrease, when the pawn-update thread reaches the point to store the pawn back into the list it will detect a change of the original pawn in regards to it's copy of the pawn. It will then take the new copy and perform the update again. (when i say compared i'm thinking something like comparing the hash value of the two instances)

As i was typing i think i see where you're coming from though. If it's decided that persisting a difference in state and result for an update cycle then you are correct. Extra logic could be added that compares each value and as long as a similar value does not conflict a union of the changes could be compiled into the new version of the pawn until the next update which would work too.

A. Pawn gets shot and dies
B. Pawn gets shot. Pawn dies.

In both cases the result is the same and TBH the user would never see the difference. Less overhead too with your idea.

CannibarRechter

BTW, as a side note on this whole topic, the Zen of muiltithreading is a design where you globally minimize the need for thread control primitives at all. It requires an application that can support a pattern that buffers change.

Consider RimWorld. Suppose one of the many simulant constructs wanted to change any of the shared simulation state. Suppose further that there was an abstract class called "BufferedChange". Every time a simulant wanted to change the shared state, it would generate a buffered change.

Under this CONOPS, you can have as many threads as you wish. During execution of the threads, each individual thread keeps its own individual Change buffer. Since by definition the thread is the only one writing to its own buffer, no thread control primitives are needed here.

You do need a super thread. This super thread periodically temp locks all the other threads, takes a reference copy of all the change buffers, clears the thread bufffers, and releases the threads. This is very small time quantum, due to the reference quantity.

This super thread then joins all the thread buffers, and applies them to the simulation state.

This is a design pattern from very large scale computing, where the use of any thread controls will bring the whole simulation to its knees. Consider a simulation that uses hundreds or thousands of computers, each with 40+ cores, and now you're thinking about a scale where thread control just doesn't work very well.

Trick with this pattern, though, is that it really wants to be designed in from the start.
CR All Mods and Tools Download Link
CR Total Texture Overhaul : Gives RimWorld a Natural Feel
CR Moddable: make RimWorld more moddable.
CR CompFX: display dynamic effects over RimWorld objects

Snafu_RW

Quote from: CannibarRechter on October 06, 2017, 06:52:40 PMThis is a design pattern from very large scale computing, where the use of any thread controls will bring the whole simulation to its knees. Consider a simulation that uses hundreds or thousands of computers, each with 40+ cores, and now you're thinking about a scale where thread control just doesn't work very well.
Trick is tho: how would you scale up from multithreading to multi-core parallel processing, which STM to be the true bottleneck (as it is in DF)?

I admit I know next to nothing about programming on this level, but in general focussing on a goal is more useful rather than building a half-way-house.. & then tearing it down to rebuild it differently (except, natch, in Rimworld;))

Many tks for this thread BTW; it's helped me to understand more of the logic underlying multithreading programming
Dom 8-)

CannibarRechter

> how would you scale up from multithreading to multi-core parallel processing...

I can understand the words, but not the intent of this question. These are the same thing in my mind, except in the edge case where you are running multiple threads on a single core / single CPU computer, which I wasn't talking about in the first place. Now, granted, I also happened to be referencing multi-computer parallelism, but that is more an example of a use case where thread-locking and thread intercommunication become even more disastrous.
CR All Mods and Tools Download Link
CR Total Texture Overhaul : Gives RimWorld a Natural Feel
CR Moddable: make RimWorld more moddable.
CR CompFX: display dynamic effects over RimWorld objects

hoffmale

multithreading = multi core parallel processing (if on a multi core machine).

You split your program logic into multiple threads, which then get distributed on the available cores. If there's only a single core, one thread has to go after another (most likely in parts, i.e. the OS only runs them for X amount of time before another thread gets to run). If there are multiple cores, unless you specifically restrict the threads to one core (but then, why use multiple threads in the first place?), they run truly parallel.

That's why multithreading gets so complicated, as you now not only have to account for your own actions, but also for the actions of all other threads running in parallel.

CannibarRechter

Well, yes, but since all modern processors are multicore, when discussing multi-threaded computing these days, they are basically synonymous.
CR All Mods and Tools Download Link
CR Total Texture Overhaul : Gives RimWorld a Natural Feel
CR Moddable: make RimWorld more moddable.
CR CompFX: display dynamic effects over RimWorld objects

hoffmale

@CannibarRechter That was intended as an answer to Snafu_RW ;) Sorry for the confusion.