[1.2.3005] Game crashes with SIGSEGV when closing to OS with NULL ptr deref

Started by qwattash, July 16, 2021, 09:51:37 AM


qwattash

Hi All,

I have been experiencing crashes when closing RimWorld for some time. After some debugging I think I tracked down what is happening.

System: Arch Linux arch-dell 5.12.15-arch1-1 #1 SMP PREEMPT Wed, 07 Jul 2021 23:35:29 +0000 x86_64 GNU/Linux
RimWorld version: 1.2.3005 rev1191
MESA version: 21.1.4 (bug reproduced with a MESA debug build at commit fae28b0fce7)

Reproducing:
Just run the game and quit to OS from the main menu.

Symptom:
When quitting to the OS, the game crashes with SIGSEGV in `fclose()`, which is called from Mesa's disk_cache teardown (`disk_cache_destroy()` -> `foz_destroy()`) and dereferences a NULL pointer.
Thread 1 "RimWorldLinux" received signal SIGSEGV, Segmentation fault.
0x00007ffff7ca263b in fclose@@GLIBC_2.2.5 () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7ca263b in fclose@@GLIBC_2.2.5 () at /usr/lib/libc.so.6
#1  0x00007ffff3ab5eec in foz_destroy (foz_db=foz_db@entry=0x23f3e30)
    at ../src/util/fossilize_db.c:337
#2  0x00007ffff3ab4404 in disk_cache_destroy (cache=0x23f3d10)
    at ../src/util/disk_cache.c:238
#3  0x00007ffff3a45811 in brw_destroy_screen (sPriv=0x23e8360)
    at ../src/mesa/drivers/dri/i965/brw_screen.c:1747
#4  0x00007ffff3ab2a1f in driDestroyScreen (psp=0x23e8360)
    at ../src/mesa/drivers/dri/common/dri_util.c:238
#5  0x00007ffff4b2dd47 in dri3_destroy_screen (base=0x23b8620) at ../src/glx/dri3_glx.c:619
#6  0x00007ffff4b1f28a in FreeScreenConfigs (priv=0x23b5210) at ../src/glx/glxext.c:259
#7  glx_display_free (priv=0x23b5210) at ../src/glx/glxext.c:282
#8  0x00007ffff4b1f3dd in __glXCloseDisplay (dpy=0x2371480, codes=<optimized out>)
    at ../src/glx/glxext.c:331
#9  0x00007ffff4f0230b in XCloseDisplay (dpy=0x2371480) at ClDisplay.c:65
#10 0x000000000138deed in  ()
#11 0x000000000138be0e in  ()
#12 0x00000000013798fc in  ()


Cause:
The call to `fclose()` is reached through `disk_cache_destroy()`, which only takes that code path if the environment variable `MESA_DISK_CACHE_SINGLE_FILE` is set.
The destroy path therefore assumes that `disk_cache_create()` initialized the file pointer under the same condition, i.e. that the variable was already set when the cache was created.
During RimWorld startup the following happens:
- `disk_cache_create()` is called by the DRI layer during a call to `glXChooseVisual()`. This happens before `MESA_DISK_CACHE_SINGLE_FILE` is set, so the initialization of the file pointer is skipped.
#0  disk_cache_create (gpu_name=gpu_name@entry=0x7fffffffc225 "i965_0166",
    driver_id=driver_id@entry=0x7fffffffc230 "fece63fd0bf0705104a035fbcf0dce8de6d956e6",
    driver_flags=0) at ../src/util/disk_cache.c:74
#1  0x00007ffff3a14699 in brw_disk_cache_init (screen=screen@entry=0x23e9640)
    at ../src/mesa/drivers/dri/i965/brw_disk_cache.c:415
#2  0x00007ffff3a48595 in brw_init_screen (dri_screen=<optimized out>)
    at ../src/mesa/drivers/dri/i965/brw_screen.c:2836
#3  0x00007ffff3ab2c46 in driCreateNewScreen2 (scrn=0, fd=4, extensions=<optimized out>,
    driver_extensions=<optimized out>, driver_configs=0x7fffffffce78, data=0x23b8620)
    at ../src/mesa/drivers/dri/common/dri_util.c:160
#4  0x00007ffff4b2e33c in dri3_create_screen (screen=0, priv=<optimized out>)
    at ../src/glx/dri3_glx.c:930
#5  0x00007ffff4b1f689 in AllocAndFetchScreenConfigs (priv=0x23b5210, dpy=0x2371480)
    at ../src/glx/glxext.c:830
#6  __glXInitialize (dpy=dpy@entry=0x2371480) at ../src/glx/glxext.c:953
#7  0x00007ffff4b1ae04 in GetGLXPrivScreenConfig (ppsc=<synthetic pointer>,
    ppriv=<synthetic pointer>, scrn=0, dpy=0x2371480) at ../src/glx/glxcmds.c:173
#8  glXChooseVisual (dpy=0x2371480, screen=0, attribList=0x7fffffffd0c0)
    at ../src/glx/glxcmds.c:1259
#9  0x00000000013c2907 in ?? ()

- the `MESA_DISK_CACHE_SINGLE_FILE` environment variable is only set afterwards, by `SteamAPI_Init()`, as seen by breaking on `setenv`.
Thread 1 "RimWorldLinux" hit Breakpoint 4, 0x00007ffff7c6d0f0 in setenv ()
   from /usr/lib/libc.so.6
(gdb) printf "%s\n", $rdi         /* arg0 */
MESA_DISK_CACHE_SINGLE_FILE
(gdb) p/c *$rsi                       /* arg1 */
$6 = 49 '1'
(gdb) bt
#0  0x00007ffff7c6d0f0 in setenv () at /usr/lib/libc.so.6
#1  0x00007fff81ef4a04 in  () at /home/qwattash/.local/share/Steam/linux64/steamclient.so
#2  0x00007ffff180622b in SteamAPI_Init ()
    at /home/qwattash/.local/share/Steam/steamapps/common/RimWorld/RimWorldLinux_Data/Plugins/libsteam_api.so
#3  0x000000004002a077 in  ()
#4  0x0000000004f4c070 in  ()
#5  0x0000000000000000 in  ()
(gdb)


I do not know enough about RimWorld's internals or the Steam API to be sure, but it looks like the call ordering is wrong. If so, simply moving the call to `SteamAPI_Init()` ahead of the first GLX call (so that the variable is already set when the disk cache is created) might fix the issue.
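
To make the ordering problem concrete, here is a minimal C sketch of the sequence described above. It is a toy model, not Mesa or Steam code: `toy_cache`, `toy_cache_create()`, `toy_cache_destroy_would_close_null()`, `toy_steam_init()` and `shutdown_hits_null()` are hypothetical names, the rule for when the variable counts as "enabled" is an assumption, and a dummy pointer stands in for the real `FILE *`. It only shows how checking the environment variable at create time and again at destroy time can disagree.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv()/unsetenv() */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SINGLE_FILE_ENV "MESA_DISK_CACHE_SINGLE_FILE"

/* Hypothetical stand-in for the cache object; NOT the real Mesa struct.
 * db_file models the FILE * that the teardown path hands to fclose(). */
struct toy_cache {
    const void *db_file;
};

/* Dummy object standing in for an open FILE. */
static const int toy_open_file = 0;

/* Assumption: "enabled" means set to a non-empty value other than "0".
 * The real parsing may differ. */
static bool single_file_enabled(void)
{
    const char *v = getenv(SINGLE_FILE_ENV);
    return v != NULL && v[0] != '\0' && strcmp(v, "0") != 0;
}

/* Create: the file is only opened if the variable is set at this moment. */
static void toy_cache_create(struct toy_cache *cache)
{
    assert(cache != NULL);
    cache->db_file = single_file_enabled() ? &toy_open_file : NULL;
}

/* Destroy: re-reads the variable instead of checking the pointer, so it
 * would pass NULL to fclose() if the variable appeared after create. */
static bool toy_cache_destroy_would_close_null(const struct toy_cache *cache)
{
    assert(cache != NULL);
    return single_file_enabled() && cache->db_file == NULL;
}

/* Stand-in for the setenv() call observed under SteamAPI_Init(). */
static void toy_steam_init(void)
{
    setenv(SINGLE_FILE_ENV, "1", 1);
}

/* Runs one startup/shutdown sequence and reports whether teardown would
 * reach fclose(NULL). */
bool shutdown_hits_null(bool steam_init_first)
{
    struct toy_cache cache;
    bool hit;

    unsetenv(SINGLE_FILE_ENV);
    if (steam_init_first)
        toy_steam_init();
    toy_cache_create(&cache);      /* models the glXChooseVisual() path */
    if (!steam_init_first)
        toy_steam_init();
    hit = toy_cache_destroy_would_close_null(&cache);
    unsetenv(SINGLE_FILE_ENV);
    return hit;
}
```

Calling `shutdown_hits_null(false)` follows the order seen in the traces (cache created first, variable set later) and reports the bad teardown; `shutdown_hits_null(true)` models the reordered startup and does not. In this model, a NULL check on the pointer in the destroy path would also avoid the crash regardless of ordering.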

Hope this helps!

Pheanox

Thanks for the well-written and well-researched report. I have put this in front of the devs for them to review. I appreciate the time and effort.