[SOLVED Linux] Game consistently crashes after running fine many IG years

Started by tuk0z, October 18, 2021, 11:48:10 AM


tuk0z

1. Circumstances

  I started a new v1.3/Ideology game after a 2-week sandbox one to test the stability of some mods. The new campaign ran flawlessly for weeks, i.e. 3-4 IG years, before it crashed for the first time, without me having touched my modlist for at least a week. It then crashed about once per week for 3 weeks, with a log similar to the one below. Then last weekend the game just kept crashing and crashing, either a couple of minutes after loading my save or while fully idle/AFK and paused.

2. What happens

  The campaign starts fine, absolutely stable. Then after a few years in game (when things start to spice up quite a bit with Randy Losing is Fun) the game starts to crash whatever I do or don't do. This already happened with 1.2 [1] (I gave up my campaign after months of testing and trying to narrow down where the issue came from).

3. What I expect

  Some hints on what to look for / provide and where to ask for some sort of help.

4. Steps to follow to make the bug appear

  Load the savegame below on a computer with a Linux OS (I use Arch) with all (even only some) of the mods used and try playing for a few minutes.

5. Savegame file

  This file will stay up for the next 6 days.

6. Log file

  Common part in all of them is:
    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e900000799
    (stack frames follow)
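
Since `mmap` is the system call a process uses to ask the kernel for a new memory mapping, one thing that may be worth ruling out on Linux is the per-process mapping limit (`vm.max_map_count`). The sketch below is only a diagnostic under that assumption; nothing in the log lines above proves this limit is the cause. It counts the entries of `/proc/<pid>/maps` for a running process and compares them with the limit. The `pid` is whatever your RimWorld process currently has, and the function names are made up for this example.

```python
from pathlib import Path


def count_mappings(maps_text: str) -> int:
    """Each non-empty line of /proc/<pid>/maps describes one memory mapping."""
    return sum(1 for line in maps_text.splitlines() if line.strip())


def usage_ratio(mapping_count: int, limit: int) -> float:
    """Fraction of the per-process mapping limit currently in use."""
    return mapping_count / limit


def check_process(pid: int) -> float:
    """Read the live numbers for one process (Linux only)."""
    maps_text = Path(f"/proc/{pid}/maps").read_text()
    limit = int(Path("/proc/sys/vm/max_map_count").read_text())
    return usage_ratio(count_mappings(maps_text), limit)
```

Calling `check_process(pid)` every few minutes while the game runs would show whether the ratio creeps towards 1.0 before a crash. If it stays low, this assumption can be dropped.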


7. Some references just in case

- Other players also reported v1.3 instantly crashing when using Harmony and some mods (previously patched by this v1.2-only mod) on Steam > Harmony Fix For Penguins comments, e.g. 2021-07-22:
Quote from: Dorsai!
"Animal Controls", "Visual Exceptions" crash the game right on the start screen, after opening and closing a menu (like Options or Mods).
Also, as many others mentioned, VFE: Mechanoids crashes the game on loading or starting a game.
- A somewhat similar "Weird crash #24" (with native stacktrace) was reported on Giddy-Up issues.
- [solved][ il2cpp android] mmap(PROT_NONE) failed - 2021-05

And since my previous game on v1.2 also ended up unplayable because of a possibly Unity3D/Mono-caused crash, I'd like to share:
- CTD on Linux without useful error messages using only two mods: Same error I had hundreds of times during my last v1.2 campaign attempt. Even when using Charlotte's v1.2-only Harmony Fix For Penguins mod patches. Also the older fix found in RimWorld 1.1.2552 [+ DLC] fails to start - Arch Linux & Gentoo did not work.
- Upon looking at the provided logs, modder Mehni noted:

    > Seems like most of the crashes happen pretty early during the DoWindowContents method.
    > `Widgets.Label(thingCount, "koisama.Numbers.Count".Translate() + ": " + Pawns.Count());`
    > and nothing there really stands out at me. The Pawns.Count() does some things which might not be the most beautiful, but certainly nothing ugly enough to crash a game.
    > Googling points me at mono, which would make sense - that's the Linux implementation of the .NET framework. Which seems odd that it crashes on a really simple Count().
    > That's the garbage collection. Only relevant thing I found is this: https://forum.unity.com/threads/gc-crash-on-debian.961422/

Canute

Hi,
Does RimWorld crash, i.e. you get a popup window saying it crashed and a crash folder gets created with a special player.log and error.log?
Or does RimWorld just exit?

But anyway, first you should find the mods that cause errors at startup.
Quote
Error: Cannot create FMOD::Sound instance for clip "" (FMOD error: Unsupported file or audio format. )
That isn't really a mod error, more an issue with your machine: the mod (or RimWorld) can't play that kind of audio format.

Quote
Could not load reference to Verse.JobDef...
Not sure, maybe you are missing a required mod, have a wrong mod order, or a mod is just missing something.
Don't forget to check the mods for their suggested load order (Rocketman as the last one).

Btw. you use HugsLib, why didn't you use the Share logs feature (green button in the in-game log window)? :-)

tuk0z

RimWorld just exits, which makes it rather hard to use any RimWorld/HugsLib log tool after it has crashed.

> anyway, first you should find the mods that cause errors at startup.

Any suggestions on a clean way to do so quite a few years into a game? I removed / unsubscribed from most of the mods that do not create things, to no avail. Won't I just kill the game by removing the mods the colony is built from (e.g. Alpha Animals, KV Weapon Storage)? I know, that's why I test mods for weeks before launching a new campaign.
    Error: Cannot create FMOD::Sound instance for clip "" (FMOD error: Unsupported file or audio format. )
    Could not load reference to Verse.JobDef...

Those I've seen ever since I first looked at a Rimworld / Linux log years ago, and they seem harmless unlike these (from first logs after my current game crashed):

    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e900000799
    Obtained 16 stack frames.
    ....
    #14 0x00000040e6df07 in (wrapper managed-to-native) object:__icall_wrapper_ves_icall_object_new_specific (intptr)
    #15 0x007fb4343162db in (wrapper dynamic-method) Verse.AI.Pawn_MindState:Verse.AI.Pawn_MindState.MindStateTick_Patch0 (Verse.AI.Pawn_MindState)


    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e900008ca7
    Obtained 34 stack frames.
    ....
    #13 0x0000004136d27f in (wrapper managed-to-native) object:__icall_wrapper_ves_icall_array_new_specific (intptr,int)
    #14 0x00000041a5a7e4 in Verse.ModSummaryWindow:DrawWindow (UnityEngine.Vector2,bool)
    #15 0x007f06bc6c729b in (wrapper dynamic-method) Verse.LongEventHandler:Verse.LongEventHandler.LongEventsOnGUI_Patch1 ()
    #16 0x007f06f03df67b in (wrapper dynamic-method) Verse.Root:Verse.Root.OnGUI_Patch1 (Verse.Root)
    #17 0x000000413785aa in (wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)


    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e90000d238             
    Obtained 35 stack frames.
    ...
    #14 0x000000413d1f07 in (wrapper managed-to-native) object:__icall_wrapper_ves_icall_object_new_specific (intptr)
    #15 0x007fd7185ded13 in (wrapper dynamic-method) RimWorld.PawnWoundDrawer:RimWorld.PawnWoundDrawer.RenderOverBody_Patch1 (RimWorld.PawnWoundDrawer,UnityEngine.Vector3,UnityEngine.Mesh,UnityEngine.Quaternion,bool,RimWorld.BodyTypeDef/WoundLayer,Verse.Rot4,System.Nullable`1<bool>)
    #16 0x007fd71832ca0b in (wrapper dynamic-method) Verse.PawnRenderer:Verse.PawnRenderer.RenderPawnInternal_Patch1 (Verse.PawnRenderer,UnityEngine.Vector3,single,bool,Verse.Rot4,Verse.RotDrawMode,Verse.PawnRenderFlags)
    #17 0x007fd7183225f3 in (wrapper dynamic-method) Verse.PawnRenderer:Verse.PawnRenderer.RenderPawnAt_Patch2 (Verse.PawnRenderer,UnityEngine.Vector3,System.Nullable`1<Verse.Rot4>,bool)
    #18 0x000000427f7de0 in Verse.Pawn_DrawTracker:DrawAt (UnityEngine.Vector3) 
    #19 0x007fd860690923 in (wrapper dynamic-method) Verse.Map:Verse.Map.MapUpdate_Patch0 (Verse.Map)
    #20 0x007fd7a84b2a5b in (wrapper dynamic-method) Verse.Game:Verse.Game.UpdatePlay_Patch1 (Verse.Game)
    #21 0x000000413de5aa in (wrapper runtime-invoke) object:runtime_invoke_void__this__ (object,intptr,intptr,intptr)


    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e90000d238             
    Obtained 37 stack frames.
    ...
    #14 0x000000415ccf07 in (wrapper managed-to-native) object:__icall_wrapper_ves_icall_object_new_specific (intptr)
    #15 0x000000416dff94 in System.Linq.Enumerable/WhereEnumerableIterator`1<TSource_REF>:ToArray ()
    #16 0x00000041756b3c in System.Linq.Enumerable:ToArray<TSource_REF> (System.Collections.Generic.IEnumerable`1<TSource_REF>)
    #17 0x007efe40013b9d in (wrapper dynamic-method) Verse.InspectTabBase/<>c__DisplayClass14_0:Verse.InspectTabBase+c__DisplayClass14_0.<DoTabGUI>b__0_Patch0 (Verse.InspectTabBase/<>c__DisplayClass14_0)
    #18 0x000000405b711d in Verse.ImmediateWindow:DoWindowContents (UnityEngine.Rect)
    #19 0x0000004018fb91 in UnityEngine.GUI:CallWindowDelegate (UnityEngine.GUI/WindowFunction,int,int,UnityEngine.GUISkin,int,single,single,UnityEngine.GUIStyle)


    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e90000d238             
    Obtained 20 stack frames.
    ...
    #14 0x00000041a6c5a6 in (wrapper managed-to-native) string:FastAllocateString (int)
    #15 0x0000004195f07c in RimWorld.StatWorker:GetExplanationFull (RimWorld.StatRequest,Verse.ToStringNumberSense,single)
    #16 0x0000004195d344 in RimWorld.StatDrawEntry:GetExplanationText (RimWorld.StatRequest)
    #17 0x0000004195b440 in RimWorld.StatsReportUtility:DrawStatsWorker (UnityEngine.Rect,Verse.Thing,RimWorld.Planet.WorldObject)
    #18 0x007fe2b0089f2b in (wrapper dynamic-method) RimWorld.StatsReportUtility:RimWorld.StatsReportUtility.DrawStatsReport_Patch0 (UnityEngine.Rect,Verse.Thing)
    #19 0x007fe2702d295b in (wrapper dynamic-method) Verse.Dialog_InfoCard:Verse.Dialog_InfoCard.FillCard_Patch2 (Verse.Dialog_InfoCard,UnityEngine.Rect)

I'm not a dev, just a web dev and admin, but the only consistent lines I can see after each crash are the first 3, with "mmap(PROT_NONE) failed" being related to a memory allocation error according to the Unity / Linux thread quoted above.
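
To check whether the crashes share a pattern beyond those first three lines, the traces can be compared mechanically. The sketch below is a hypothetical helper, not a RimWorld or HugsLib tool: it splits a log on the `mmap(PROT_NONE) failed` marker and reports, for each crash, the innermost frame that carries a symbol name plus the frame that called it. It assumes the frame format shown above (`#N 0xADDR in NAME`).

```python
import re

MARKER = "mmap(PROT_NONE) failed"
FRAME = re.compile(r"^\s*#(\d+)\s+0x[0-9a-fA-F]+ in (.+?)\s*$")


def split_crashes(log_text):
    """Return one chunk of text per crash, starting after each marker line."""
    chunks = log_text.split(MARKER)[1:]
    return [c for c in chunks if c.strip()]


def named_frames(chunk):
    """List (index, symbol) for frames that are not '(Unknown)'."""
    frames = []
    for line in chunk.splitlines():
        m = FRAME.match(line)
        if m and m.group(2) != "(Unknown)":
            frames.append((int(m.group(1)), m.group(2)))
    return frames


def summarize(log_text):
    """For each crash: (innermost named frame, its caller or None)."""
    out = []
    for chunk in split_crashes(log_text):
        frames = named_frames(chunk)
        if frames:
            caller = frames[1][1] if len(frames) > 1 else None
            out.append((frames[0][1], caller))
    return out
```

In the traces posted above, the innermost named frame is always an allocation wrapper (`object_new_specific`, `array_new_specific` or `FastAllocateString`) while the callers vary. That hints that the crash lands wherever the next allocation happens to occur, rather than in one specific method.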

EDIT: exits, not exist lol

Canute

Never saw this mmap(PROT_NONE) failed error,
it must be a Linux-specific error or maybe caused by some mod I don't know (well).

OK, since you crash right from the start, you should remove all mods first.
Unsubscribe from all, or just delete ModsConfig.xml in the Config folder.
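
Rather than deleting the file outright, it can be moved aside so the mod list is easy to restore. A minimal sketch follows; the function names are made up for this example, and the Config folder location has to be looked up on your own system (it is passed in as `config_dir` rather than guessed).

```python
import shutil
import time
from pathlib import Path


def set_aside_mods_config(config_dir, stamp=None):
    """Rename ModsConfig.xml to a timestamped backup and return the new path.

    The game is expected to fall back to a default mod list on next launch
    (assumption based on the advice above, not verified here).
    """
    src = Path(config_dir) / "ModsConfig.xml"
    if not src.exists():
        raise FileNotFoundError(src)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"ModsConfig.{stamp}.xml.bak")
    shutil.move(str(src), str(dst))
    return dst


def restore_mods_config(backup_path):
    """Put a backup made by set_aside_mods_config() back in place."""
    backup = Path(backup_path)
    dst = backup.with_name("ModsConfig.xml")
    shutil.move(str(backup), str(dst))
    return dst
```

Run `set_aside_mods_config(...)` with the game closed, test vanilla, then call `restore_mods_config(...)` on the returned path to get the original mod list back.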

Simply check if the vanilla gameplay works without error.
Then add library mods and other QoL mods, and check if everything still works well.
Hint: with HugsLib you can use the HugsLib quickstart too. Icon at the top right.

Add more mods.
And at last add Rocketman if everything works fine so far. Same for other performance enhancers like RimThreaded.

tuk0z

This I'll do. Back in the B18 days I fully borked a super long & good game by removing mods, hence my initial reluctance to do so.
EDIT: Also, my game does crash minutes after loading the savefile and starting to play. And rather 30' after removing another batch of mods tonight (same error in the log).

tuk0z

So the whole map is mostly empty except for half my pawns, slaves and some animals (or their names at least, since all are invisible) "wandering" in the dark, upon renaming ModsConfig.xml and reloading the save. Not sure if that is a good testing ground? Really, if whatever causes this mmap crash could manifest before the fifth IG year it'd be so freaking useful.

Canute

I wasn't aware that you already had a colony.
Yes, removing mods from a colony is most of the time not a good idea, depending on the mod.
That's why it is important that you have no errors before you start a colony; that increases the chance of a successful playthrough.

When RimWorld just force quits you can check the log file.
When you get repeating messages at the end, there is a good chance that they cause a memory leak that makes RimWorld crash.
But most of the time RimWorld doesn't even log the reason that made it quit.
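
The "repeating messages at the end" check can be done without scrolling through the whole file. A small sketch, assuming a plain-text log such as Player.log: it counts how often each line occurs within the last `tail` lines and returns the most frequent ones. The function name and both threshold values are arbitrary choices for this example.

```python
from collections import Counter


def repeated_tail_lines(log_text, tail=500, min_count=10):
    """Return [(line, count)] for lines repeated at least min_count times
    within the last `tail` non-empty lines, most frequent first."""
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    counts = Counter(lines[-tail:])
    return [(ln, n) for ln, n in counts.most_common() if n >= min_count]
```

An empty result does not rule out a leak; it only means nothing was spamming the log right before the game quit.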

tuk0z

I understand Ludeon forum recommends we give hints on circumstances and what happens when reporting a bug?

Quote from: tuk0z on October 18, 2021, 11:48:10 AM
1. Circumstances
    new v1.3/Ideology game after a 2-week sandbox one to test the stability of some mods. The new campaign ran flawlessly for weeks, i.e. 3-4 IG years, before it crashed for the first time, without me having touched my modlist ... It then crashed about once per week for 3 weeks

2. What happens
    The campaign starts fine, absolutely stable. Then after a few years in game (when things start to spice up

Problem here:
1. Something in the game has prevented me from advancing any campaign since v1.2 came out, as it starts crashing constantly after running smoothly for years IG.
2. The logs give a strong hint about what's causing it, but my tech background is definitely insufficient to make use of it.

tuk0z

OK so I found a mod that's crashing my 5 IG years old game, and also maybe why the freak it happens only after IG years (it's NOT Harmony :D). I'll report this to its dev. With one hour of testing per mod (I've seen crashes after the 30' mark) testing is taking time. But I wanna be sure there aren't any others. Also, my second COVID-19 vaccine shot yesterday left me feeling almost like a scyther having received his EMP gift lol

tuk0z

A follow-up and a question about how to debug this game with mods.

  • note I tested this modlist for 2 weeks before starting a v1.3+DLC game and they worked fine (early August).
  • following the exchange with @Canute I removed most of the mods I could without breaking current game
  • reactivating one mod at a time, I found a mod that consistently causes the 'mmap(PROT_NONE)' crash when activated, the game running fine without this mod
  • reactivating another mod caused another crash, though with a completely different error, but restarting again without changing anything had the game consistently run smoothly with no error
  • reactivating yet another mod had the game crash after ~45', with the same 'mmap(PROT_NONE)' error
So which mod is causing the crash: the one that crashes when restarting once, twice, after many IG years, or something else?

Incidentally, is RimWorld with mods supposed to be (at least in my eyes) so inconsistent? Or is it rather RimWorld with mods played on Linux? Or is there a way to use the stack frames it gives when it crashes, and it's just me still being ignorant atm?

I ask because hundreds of hours of testing per game/campaign is a bit more than I can afford (~200h testing before starting the game + ~2h per mod 2 months after it started crashing later in the same game = 60h + the research and reports). Being a tech person I'm used to debugging, but I'd also like to enjoy my time playing the game and DLC I bought.

Canute

1. The game changes, and the mods change, with every new major version.
Sometimes the mod authors just need to recompile their .dll if they made one, but most of the time some XML changes are made too.
When a modlist works on 1.2 there is no guarantee that it will work fine on 1.3 too.

When you find such mods that cause problems, you should try to use them alone. And report any problems to the author. Maybe it is just a minor problem based on Linux/macOS.
But normally all mods should work overall.
And yes, it is a pain to find the troublemaker, especially on larger modlists.
And when the modlist gets larger and larger, it can happen that you get mod conflicts with other mods too.
A few troublemakers can ruin the whole modding experience.

tuk0z

Quote from: Canute on October 22, 2021, 08:28:40 AM
1. The game changes, and the mods change, with every new major version.
Sometimes the mod authors just need to recompile their .dll if they made one, but most of the time some XML changes are made too.
When a modlist works on 1.2 there is no guarantee that it will work fine on 1.3 too.
Yes, having played the game a lot since B18 I know that (otherwise I probably wouldn't take weeks to validate mods after any major release or DLC).

Quote from: Canute on October 22, 2021, 08:28:40 AM
When you find such mods that cause problems, you should try to use them alone. And report any problems to the author. Maybe it is just a minor problem based on Linux/macOS.
You mean using the mod alone? We've checked earlier that one can't remove all (other) mods from a long-running game without breaking it, so this advice is not applicable. Rather, if a new player were to read this I'd tell him or her: remove any mod that has not added any pawn/faction/item to your game yet. E.g. most UI, texture, QoL mods (and even others, if one deletes any stuff the mod added before removing it).
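
Whether a mod has "added any pawn/faction/item" to a save can be estimated rather than guessed. The sketch below rests on two assumptions that should be verified against your own files: that mods declare their content with `<defName>` elements in their XML, and that a save file mentions a def by name as the text of some element when it is in use. The function names and the def names in the usage line are invented for illustration. A hit means the mod is probably unsafe to remove; no hit is a hint, not a guarantee.

```python
import re

DEFNAME = re.compile(r"<defName>\s*([^<\s]+)\s*</defName>")


def def_names(mod_xml_texts):
    """Collect every defName declared in the given mod XML documents."""
    names = set()
    for text in mod_xml_texts:
        names.update(DEFNAME.findall(text))
    return names


def defs_used_in_save(names, save_text):
    """Return the defNames that appear as element text (>name<) in the save."""
    return {n for n in names if f">{n}<" in save_text}
```

For example, `defs_used_in_save(def_names(mod_xml_texts), save_text)` returning an empty set for a UI mod would fit the rule of thumb above, while a non-empty set for an animal or weapon mod would argue for keeping it enabled.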

Back to debugging this mmap error:
I have now found 2 mods that cause it by themselves (after having worked fine for IG years). Colonist History alone[1] crashes my game almost instantly upon loading my save. And Light Radius alone[1] crashes it too, but later after loading (about 30'). Info has been shared with their developers.

[1] "alone", that is among the mods I can deactivate.

Canute

Testing on an existing colony, yes, that doesn't work unless it was a vanilla colony.
Just create a new colony, just to see if everything starts fine (HugsLib offers a quickstart function).

But when this mmap thing only happens later during gameplay, it is hard to track down anyway.
You can never say if it is a single mod issue or a mod conflict of 2 or more mods that causes it.


But

tuk0z

Yeah, I didn't think enough about an active mod update going wrong. A quick and clean new base with the mod would help to pinpoint where the issue comes from in the first place, yeah. The fact that I never could finish a campaign (start no prob, crash only many years IG) since v1.2 came out might blur my vision.

Quote from: Canute on October 22, 2021, 01:03:21 PM
But when this mmap thing only happens later during gameplay, it is hard to track down anyway.
You can never say if it is a single mod issue or a mod conflict of 2 or more mods that causes it.
Which is why I'm here man. These MS/Unity/Mono error stack frames must have a meaning for someone in this world. Wait, no?

tuk0z

Quote from: tuk0z on October 22, 2021, 02:41:06 PM
These MS/Unity/Mono error stack frames must have a meaning for someone in this world. Wait, no?
I guess the answer is no.

Anyway, today with Light Radius on, the game ran fine for over 3 hours, then suddenly crashed with this (slightly different) stack:
    mmap(PROT_NONE) failed
    Caught fatal signal - signo:6 code:-6 errno:0 addr:0x3e900000c0f
    Obtained 15 stack frames.
    #0  0x007fcfb6482870 in (Unknown)
    #1  0x007fcfb62dfd22 in (Unknown)
    #2  0x007fcfb62c9862 in (Unknown)
    #3  0x007fcfb0ec1312 in (Unknown)
    #4  0x007fcfb0ec137f in (Unknown)
    #5  0x007fcfb0ec427f in (Unknown)
    #6  0x007fcfb0ec44e8 in (Unknown)
    #7  0x007fcfb0ec4956 in (Unknown)
    #8  0x007fcfb0ec5788 in (Unknown)
    #9  0x007fcfb0ec58ac in (Unknown)
    #10 0x007fcfb0e9b8fb in (Unknown)
    #11 0x007fcfb0e50de0 in (Unknown)
    #12 0x007fcfb0e50f57 in (Unknown)
    #13 0x007fcfb0e50f9f in (Unknown)
    #14 0x000000417c6f07 in (wrapper managed-to-native) object:__icall_wrapper_ves_icall_object_new_specific (intptr)

The good news is that it had been 8 days since my game last ran this long before crashing. The less good news is that I can't imagine making a new base and testing it that many hours for each mod. Unless there is a way to reduce the number of mods that may be linked to these crashes, and therefore the number to test for so long?
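
One standard way to cut the number of long test sessions is bisection: enable half of the removable mods, test, and keep whichever half still crashes. The sketch below works under strong assumptions: a single mod is responsible, and `crashes(mods)` (a hypothetical callback, in practice you playing for a fixed time) gives a reliable answer. With intermittent crashes like the ones described here, a "no crash" result may simply mean the session was too short.

```python
def bisect_mods(mods, crashes):
    """Find one culprit in `mods`, assuming exactly one mod causes the crash.

    `crashes(subset)` must return True if the game crashes with only
    `subset` (plus the mods that cannot be removed) enabled.
    Returns (culprit, number_of_test_rounds).
    """
    candidates = list(mods)
    rounds = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        rounds += 1
        # Keep the half that still reproduces the crash.
        candidates = half if crashes(half) else candidates[len(half):]
    return candidates[0], rounds
```

For a list of 30 removable mods this needs at most 5 test sessions instead of 30. It does not help when two mods only crash in combination, which has to be kept in mind when a round gives a confusing result.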