Multithreading

Started by Turbo, October 06, 2013, 08:25:25 PM

Ravine

#15
That's what I'm saying. A LOD can be a graphical one (a less detailed model), but it can also mean not computing the WindVector every frame and not updating _WindVector for every material of every plant on screen as you zoom out. I'd love to run the code in Unity's profiler and cut stuff here and there, but obviously that's not possible ^^. Merely giving hints here.
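
To make the "non-graphical LOD" idea concrete, here is a minimal sketch that throttles the wind update by zoom level. Everything in it is an assumption for illustration: the WindUpdater class, the zoom thresholds, and the intervals are hypothetical, and the actual game code is not shown anywhere in this thread.

```csharp
using System;

// Hypothetical helper: decides how often the wind vector gets recomputed.
public sealed class WindUpdater
{
    private int framesSinceUpdate;

    // Number of times the (expensive) wind update actually ran.
    public int UpdateCount { get; private set; }

    // Assumed mapping from zoom to update interval: the further out, the rarer the update.
    public static int IntervalForZoom(float zoom)
    {
        if (zoom < 20f) return 1;   // close up: every frame
        if (zoom < 50f) return 4;   // mid range: every 4th frame
        return 16;                  // zoomed out: every 16th frame
    }

    // Call once per frame; returns true when the wind vector was recomputed.
    public bool Tick(float zoom)
    {
        framesSinceUpdate++;
        if (framesSinceUpdate < IntervalForZoom(zoom)) return false;

        framesSinceUpdate = 0;
        UpdateCount++;
        // Real code would recompute the wind vector here and push it to the plant materials.
        return true;
    }
}
```

With these made-up thresholds, a fully zoomed-out camera does the material update once every 16 frames instead of every frame.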

Another easy one is to stay away from LINQ and all those nice and easy extension methods that come with IEnumerable and collections. It's so easy to shoot yourself in the foot performance-wise.
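
As a rough illustration of the LINQ point, the sketch below does the same count twice. The PlantCounter class and its method names are made up for the example. The LINQ version typically allocates a closure, a delegate, and a boxed enumerator on each call, while the plain loop allocates nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example type; not taken from the game's code.
public static class PlantCounter
{
    // Convenient, but the lambda captures minHeight (closure + delegate),
    // and Enumerable.Count walks the list through an IEnumerator.
    public static int CountTallLinq(List<float> heights, float minHeight)
    {
        return heights.Count(h => h >= minHeight);
    }

    // Same result with an indexed loop and no per-call allocations.
    public static int CountTallLoop(List<float> heights, float minHeight)
    {
        int count = 0;
        for (int i = 0; i < heights.Count; i++)
        {
            if (heights[i] >= minHeight) count++;
        }
        return count;
    }
}
```

Whether the difference matters depends on how often the call runs; in a per-frame hot path it adds up, which is what the profiler is for.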

Tynan

Regarding performance, I just don't worry about it until it becomes an issue. It hasn't so far, really. The real issues facing the game are lack of content, poor training and hard-to-learn systems, bugs, imbalances, lack of longevity and choice, and so on. Performance is a tiny concern.

Which is why the game is so un-optimized (as anyone who has crawled the code will notice). There are some complex systems in there for grouping lumps of geometry together to render all at once, because it would be utterly unplayable otherwise. But for the most part I haven't much worried about performance.

So the first optimizations I'll be doing are the insanely easy ones. Things I've already done:

-Discovered that the GUI allocated 2.4KB of memory every time I changed fonts, even if the font didn't actually change. This amounted to like 50KB per frame. Ten minutes work and I had reduced memory allocs by like 40%.
-Not emitting a tooltip for every single piece of mineral on the map. That saved a massive amount of performance by itself, in a couple lines.
-Caching a dictionary of thing definitions indexed by the entity type enum, instead of doing an O(n) search every time anything wanted a definition.
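
The font item in the list above boils down to skipping redundant work. The thread doesn't show the actual fix, so the following is only a guess at its shape: the GuiState class, the FontSize enum, and the counter are hypothetical stand-ins.

```csharp
using System;

// Hypothetical font identifiers for the example.
public enum FontSize { Small, Medium, Large }

public sealed class GuiState
{
    private FontSize current = FontSize.Small;

    // Counts how often the costly switch really happened.
    public int ExpensiveSwitches { get; private set; }

    public void SetFont(FontSize requested)
    {
        // Early out: the requested font is already active, so do nothing.
        if (requested == current) return;

        current = requested;
        ExpensiveSwitches++; // stands in for the real, allocating font change
    }
}
```

If the GUI code sets the font before drawing every label, a guard like this turns most of those calls into a single comparison.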

You may see significant performance improvements from these alone, and others I may do before next release.

Later, I may get into hard optimizations like multithreading, aggressive texture atlas generation, and so on. But for now, it would be a mistake. There's great satisfaction for a programmer watching performance numbers go up. But players don't care unless it affects their game. The time I really need to save right now is not the CPU's time, it's my time, so I can focus better on design, training, clarity, balance, and so on. Hence the lack of aggressive optimizations.
Tynan Sylvester - @TynanSylvester - Tynan's Blog

linkxsc

Quote from: Tynan on January 18, 2014, 08:55:09 PM
-Discovered that the GUI allocated 2.4KB of memory every time I changed fonts, even if the font didn't actually change. This amounted to like 50KB per frame. Ten minutes work and I had reduced memory allocs by like 40%.
-Not emitting a tooltip for every single piece of mineral on the map. That saved a massive amount of performance by itself, in a couple lines.
-Caching a dictionary of thing definitions indexed by the entity type enum, instead of doing an O(n) search every time anything wanted a definition.
That font change will probably help out a ton of people who have laptops and such with cheap RAM.

Tynan

Yeah, I want to do a bit more to capture the very low-hanging fruit like that. I think a day of work with the profiler could yield some really easy gains without making the codebase any harder to maintain.

Ravine

Quote from: Tynan on January 18, 2014, 08:55:09 PM
-Caching a dictionary of thing definitions indexed by the entity type enum, instead of doing an O(n) search every time anything wanted a definition.

Careful here: you traded O(n) for O(1), but in the meantime you added a garbage generator to your systems. Enum keys in a dictionary cause what is called "boxing". This has a slight performance hit, but worse, it allocates memory, which puts strain on the GC, and you really want to avoid triggering your GC (been there, done that).

The quick fix is migrating to a Dictionary<int, definition> and casting the enum to an int when you want to get a definition. That can be cumbersome to do all over the code, so you can wrap it in a GetDefinition( EntityType type ) that does the job.

(it's sunday, got plenty of time : http://pastebin.com/zfUSX8Hp )
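
This is not the pastebin code, just a minimal sketch of the wrapper idea described above. GetDefinition and EntityType come from the post; the ThingDefinition class, the DefinitionDatabase name, the Register method, and the enum members are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum members; the real EntityType is not shown in the thread.
public enum EntityType { Rock, Plant, Pawn }

// Hypothetical definition type.
public sealed class ThingDefinition
{
    public readonly string Label;
    public ThingDefinition(string label) { Label = label; }
}

public static class DefinitionDatabase
{
    // Keyed by int so lookups use the int comparer and never box the enum.
    private static readonly Dictionary<int, ThingDefinition> byType =
        new Dictionary<int, ThingDefinition>();

    public static void Register(EntityType type, ThingDefinition def)
    {
        byType[(int)type] = def;
    }

    // Callers keep using the enum; the cast lives in one place.
    public static ThingDefinition GetDefinition(EntityType type)
    {
        ThingDefinition def;
        return byType.TryGetValue((int)type, out def) ? def : null;
    }
}
```

Returning null for an unregistered type is a choice made for the sketch; throwing may suit a definitions table better.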

Quote
There's great satisfaction for a programmer watching performance numbers go up. But players don't care unless it affects their game. The time I really need to save right now is not the CPU's time, it's my time, so I can focus better on design, training, clarity, balance, and so on. Hence the lack of aggressive optimizations.

Words of wisdom

Tynan

Quote from: Ravine on January 19, 2014, 05:28:41 AM
Quote from: Tynan on January 18, 2014, 08:55:09 PM
-Caching a dictionary of thing definitions indexed by the entity type enum, instead of doing an O(n) search every time anything wanted a definition.

Care here, you traded O(n) for O(1), but in the meantime you just added a garbage generator to your systems. Enum keys in dictionary are doing what is called "boxing". This will have a slight performance hit, but worse, they will allocate memory that will put some strain on the GC, and you really want to avoid triggering your GC (been there, done that).

The quick way to fix that is migrating to a Dictionary<int, definition> and just cast the Enum to an int when you want to get a definition (that can be cumbersome to do that in code all the time, so you can wrap that in a GetDefinition( EntityType type ) which does the job.

(it's sunday, got plenty of time : http://pastebin.com/zfUSX8Hp )


This sounds kind of insane. I thought that, from the machine's point of view, enums and ints were basically the same thing. You're saying that using an enum as a dictionary key somehow causes a box/unbox every time the collection is accessed? I couldn't find any docs about this, if you have a web page explaining it I'd be appreciative. Not that it's likely to be instrumental on performance anywhere (yet), but it sounds good to know.

Ravine

#22
Aye, it definitely sounds counterintuitive. (Don't get me wrong, I was like "WTF" the first time I read about it, but it makes sense when you dig further.) This is explained in detail (especially how to speed it up) here: http://www.codeproject.com/Articles/33528/Accelerating-Enum-Based-Dictionaries-with-Generic

The whole point is: enums don't implement IEquatable, which is used by Dictionary<K,V>. What sounds silly at first glance actually makes sense when you realize that implementing IEquatable on Enum would mean we could potentially compare enums to other enums, which, from a programmer's standpoint, doesn't make sense. So in the end, you don't want that kind of comparison happening at all. That becomes an issue when you use enums as keys in a Dictionary, which relies on IEquatable for its comparison and GetHashCode stuff. And that leads to the boxing: when you call TryGetValue or ContainsKey with an enum, it falls back to (object).GetHashCode, which means a cast to object, and so a box. Funny one, right?

Bottom line: enums don't implement IEquatable, and Dictionary is built on it. If you are using a Dictionary<Enum, value>, you're facing garbage generation. Either write your own comparer for it, or go the encapsulate-and-cast-to-int way of doing it.

The quick way is the one I wrote in the pastebin. The "proper" way is to implement the comparer yourself and give it to your Dictionary when you create it. Note that the Dictionary constructor expects an IEqualityComparer<TKey> object rather than a lambda, so this means a small comparer class whose Equals and GetHashCode cast the keys to int. You'll have to remember to pass it every time you create a Dictionary keyed on an enum.
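
A minimal sketch of the comparer approach follows, assuming a hypothetical EntityType enum with made-up members. The comparer is written for one concrete enum type, because casting a generic type parameter to int is not allowed without going through object, which would bring the boxing back.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum members for the example.
public enum EntityType { Rock, Plant, Pawn }

// Compares and hashes the enum through its int value, so nothing is boxed.
public sealed class EntityTypeComparer : IEqualityComparer<EntityType>
{
    // One shared instance is enough; the comparer has no state.
    public static readonly EntityTypeComparer Instance = new EntityTypeComparer();

    public bool Equals(EntityType x, EntityType y)
    {
        return (int)x == (int)y;
    }

    public int GetHashCode(EntityType obj)
    {
        return (int)obj;
    }
}

public static class ComparerDemo
{
    public static Dictionary<EntityType, string> BuildLabels()
    {
        // The comparer is handed to the constructor; lookups then go through it.
        var labels = new Dictionary<EntityType, string>(EntityTypeComparer.Instance);
        labels[EntityType.Rock] = "rock";
        labels[EntityType.Plant] = "plant";
        return labels;
    }
}
```

Compared with the cast-to-int wrapper, this keeps the enum as the visible key type, at the cost of one comparer class per enum.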

In the end, you don't want to mess with Unity's GC (like I said earlier, been there, done that). Try to address that kind of issue from the beginning if you can. "Premature optimization is the root of all evil," as they say. Knowing what you're doing is not premature optimization. It's programming.


Tynan

Quote from: Ravine on January 20, 2014, 06:00:40 AM
Quick update with the links
http://blogs.msdn.com/b/shawnhar/archive/2007/07/02/twin-paths-to-garbage-collector-nirvana.aspx
and http://beardseye.blogspot.co.uk/2007/08/nuts-enum-conundrum.html

Cool, thanks. I've also been working through the links on your dropbox textfile. I'll probably do a few days of optimizations before release, just to get a handle on the worst of the worst.

Ravine

The video with the !!! mention is probably the most interesting of the links in that file. It gives a fair number of tips and tricks, and especially a good tutorial on how to use the profiler efficiently.

Haplo

On the Beard's Eye blog it is mentioned that at least the enum allocation problem seems to be fixed with .NET 4.0.
Is that true, or did I misread the meaning of the entry near the end?

But thanks for the links. It is interesting reading even for someone not building games :D
Until now, I never knew how the garbage collector works at all, only that it works.. ;D

Tynan

Quote from: Haplo on January 20, 2014, 04:05:10 PM
On the Beard's Eye blog it is mentioned, that at least the Enum allocation problem seems to be fixed with .Net 4.0.
Is that true, or did I misread the meaning of the entry near the end?

But thanks  for the links. It is an interesting reading even for someone not building games  :D
Until now, I never knew how the garbage collector works at all, only that he works..  ;D

It's not true; I just saved 5KB per frame by putting one IEqualityComparer in.

In other news, GC alloc has gone down by about half today, from ~60KB/frame to ~30. All savings were in thing processing. Next up I'll investigate the GUI.

I love low-hanging fruit :D

Ravine

Quote from: Haplo on January 20, 2014, 04:05:10 PM
On the Beard's Eye blog it is mentioned, that at least the Enum allocation problem seems to be fixed with .Net 4.0.
Is that true, or did I misread the meaning of the entry near the end?

But thanks  for the links. It is an interesting reading even for someone not building games  :D
Until now, I never knew how the garbage collector works at all, only that he works..  ;D
Be careful with all the different elements that make up the .NET world. You have the languages (C# and all its friends like VB.NET, the Iron-Stuff (Python, Ruby, etc.), F#), the CLR, which is the runtime (where the code is executed), and the Framework (where the base classes are). So a language version is not tied to the same version of the CLR it runs on (C# 3.0 runs on CLR 2.0, for example).

All in all, keep in mind that Unity uses Mono, which is an open source implementation of the CLR. So what you can test directly on Windows is not necessarily 1:1 with Mono (the implementation differs, as can performance or results). The best example that comes to my mind is the Random class in the .NET Framework, whose implementation differs in Mono. So, with the same seed and the same code, you end up with different results depending on where you execute it (.NET CLR or Mono).

That is the "problem" when it comes to Unity, since they have their own fork of Mono (made around v2.0 of Mono) with an outdated GC, which works, but is not the best out there, and that's why their Profiler is invaluable when it comes to debugging that kind of stuff.