#1
Support / Re: Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 22, 2021, 01:01:23 PM
Looks like this is on the way to being fixed: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4763
#2
Support / Re: Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 20, 2021, 04:03:10 PM
Thanks for the suggestion! I have Intel graphics, so AFAIK the open source drivers in MESA are directly supported by Intel. I'm not aware of proprietary Intel drivers, but I will look it up!
GDB is not that terrible, but I get that it's not the friendliest debugger out there...
I'd like to switch to radare2 at some point but I haven't gotten around to it yet. With plugins or a GUI for easier visualization GDB is more bearable, although without debugging symbols it's kind of a mess; that's why I ended up rebuilding libx11 and mesa locally.
Btw, if you are curious, I have debugged a bit more. I think the issue is triggered by a synchronization problem with libxcb. Essentially, Xlib uses xcb to exchange messages with the X server, and xcb manages the IPC channel for the messages that the client sends to the X server. At some point Xlib grabs the IPC channel to write to it directly, and to do so it needs to synchronize the message sequence numbers. I suspect this mess is triggered when the `XNoOp()` request happens to cross the 1-byte boundary of the message sequence number, from 0xff to 0x100. I have no idea why this happens, though.
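One way to check the boundary suspicion would be to log the request sequence number around each call and flag where the low byte wraps. The following is only a minimal sketch: `crosses_low_byte_boundary` and `first_crossing` are hypothetical helpers, and it assumes the sequence numbers have already been sampled somehow (for example with Xlib's `NextRequest()` macro) into an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when going from `before` to `after` changes
 * any bit above the low byte, e.g. 0xff -> 0x100. */
static bool crosses_low_byte_boundary(unsigned long before, unsigned long after)
{
    return (before >> 8) != (after >> 8);
}

/* Returns the index i of the first logged pair (samples[i], samples[i + 1])
 * that crosses the boundary, or -1 if no pair does. */
static long first_crossing(const unsigned long *samples, size_t count)
{
    assert(samples != NULL || count == 0);
    for (size_t i = 0; i + 1 < count; i++) {
        if (crosses_low_byte_boundary(samples[i], samples[i + 1]))
            return (long)i;
    }
    return -1;
}
```

If the abort only ever shows up right after a pair that `first_crossing` flags, that would support the theory; if it also shows up elsewhere, the boundary is probably a coincidence.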
#3
Bugs / [1.2.3005] Game crashes with SIGSEGV when closing to OS with NULL ptr deref
July 16, 2021, 09:51:37 AM
Hi All,
I have been experiencing crashes when closing RimWorld for some time. After some debugging I think I tracked down what is happening.
System: Arch Linux arch-dell 5.12.15-arch1-1 #1 SMP PREEMPT Wed, 07 Jul 2021 23:35:29 +0000 x86_64 GNU/Linux
RimWorld version: 1.2.3005 rev1191
MESA version: 21.1.4 (bug reproduced with a MESA debug build at commit fae28b0fce7)
Reproducing:
Just run the game and quit to OS from the main menu.
Symptom:
When closing to OS the game crashes with SIGSEGV in fclose() as called from the mesa disk_cache handling due to a NULL pointer dereference.
Code:
Thread 1 "RimWorldLinux" received signal SIGSEGV, Segmentation fault.
0x00007ffff7ca263b in fclose@@GLIBC_2.2.5 () from /usr/lib/libc.so.6
(gdb) bt
#0 0x00007ffff7ca263b in fclose@@GLIBC_2.2.5 () at /usr/lib/libc.so.6
#1 0x00007ffff3ab5eec in foz_destroy (foz_db=foz_db@entry=0x23f3e30)
at ../src/util/fossilize_db.c:337
#2 0x00007ffff3ab4404 in disk_cache_destroy (cache=0x23f3d10)
at ../src/util/disk_cache.c:238
#3 0x00007ffff3a45811 in brw_destroy_screen (sPriv=0x23e8360)
at ../src/mesa/drivers/dri/i965/brw_screen.c:1747
#4 0x00007ffff3ab2a1f in driDestroyScreen (psp=0x23e8360)
at ../src/mesa/drivers/dri/common/dri_util.c:238
#5 0x00007ffff4b2dd47 in dri3_destroy_screen (base=0x23b8620) at ../src/glx/dri3_glx.c:619
#6 0x00007ffff4b1f28a in FreeScreenConfigs (priv=0x23b5210) at ../src/glx/glxext.c:259
#7 glx_display_free (priv=0x23b5210) at ../src/glx/glxext.c:282
#8 0x00007ffff4b1f3dd in __glXCloseDisplay (dpy=0x2371480, codes=<optimized out>)
at ../src/glx/glxext.c:331
#9 0x00007ffff4f0230b in XCloseDisplay (dpy=0x2371480) at ClDisplay.c:65
#10 0x000000000138deed in ()
#11 0x000000000138be0e in ()
#12 0x00000000013798fc in ()
Cause:
The call to `fclose()` is made via `disk_cache_destroy()` which only triggers the code path if the environment variable `MESA_DISK_CACHE_SINGLE_FILE` is set.
This assumes that the `disk_cache_create` function initializes the file pointer under the same conditions.
During RimWorld startup the following happens:
- the `disk_cache_create()` function is called by the DRI layer during a call to `glXChooseVisual()`. This occurs before the MESA env var is set, causing the initialization to be skipped.
Code:
#0 disk_cache_create (gpu_name=gpu_name@entry=0x7fffffffc225 "i965_0166",
driver_id=driver_id@entry=0x7fffffffc230 "fece63fd0bf0705104a035fbcf0dce8de6d956e6",
driver_flags=0) at ../src/util/disk_cache.c:74
#1 0x00007ffff3a14699 in brw_disk_cache_init (screen=screen@entry=0x23e9640)
at ../src/mesa/drivers/dri/i965/brw_disk_cache.c:415
#2 0x00007ffff3a48595 in brw_init_screen (dri_screen=<optimized out>)
at ../src/mesa/drivers/dri/i965/brw_screen.c:2836
#3 0x00007ffff3ab2c46 in driCreateNewScreen2 (scrn=0, fd=4, extensions=<optimized out>,
driver_extensions=<optimized out>, driver_configs=0x7fffffffce78, data=0x23b8620)
at ../src/mesa/drivers/dri/common/dri_util.c:160
#4 0x00007ffff4b2e33c in dri3_create_screen (screen=0, priv=<optimized out>)
at ../src/glx/dri3_glx.c:930
#5 0x00007ffff4b1f689 in AllocAndFetchScreenConfigs (priv=0x23b5210, dpy=0x2371480)
at ../src/glx/glxext.c:830
#6 __glXInitialize (dpy=dpy@entry=0x2371480) at ../src/glx/glxext.c:953
#7 0x00007ffff4b1ae04 in GetGLXPrivScreenConfig (ppsc=<synthetic pointer>,
ppriv=<synthetic pointer>, scrn=0, dpy=0x2371480) at ../src/glx/glxcmds.c:173
#8 glXChooseVisual (dpy=0x2371480, screen=0, attribList=0x7fffffffd0c0)
at ../src/glx/glxcmds.c:1259
#9 0x00000000013c2907 in ?? ()
- the `MESA_DISK_CACHE_SINGLE_FILE` environment variable is set by `SteamAPI_Init()`, as detected by breaking on `setenv`.
Code:
Thread 1 "RimWorldLinux" hit Breakpoint 4, 0x00007ffff7c6d0f0 in setenv ()
from /usr/lib/libc.so.6
(gdb) printf "%s\n", $rdi /* arg0 */
MESA_DISK_CACHE_SINGLE_FILE
(gdb) p/c *$rsi /* arg1 */
$6 = 49 '1'
(gdb) bt
#0 0x00007ffff7c6d0f0 in setenv () at /usr/lib/libc.so.6
#1 0x00007fff81ef4a04 in () at /home/qwattash/.local/share/Steam/linux64/steamclient.so
#2 0x00007ffff180622b in SteamAPI_Init ()
at /home/qwattash/.local/share/Steam/steamapps/common/RimWorld/RimWorldLinux_Data/Plugins/libsteam_api.so
#3 0x000000004002a077 in ()
#4 0x0000000004f4c070 in ()
#5 0x0000000000000000 in ()
(gdb)
I do not know enough about how RimWorld works internally and about the Steam API, but it looks like the call ordering may be wrong. If so, a simple reordering of the call to `SteamAPI_Init()` might fix the issue.
Hope this helps!
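The create/destroy mismatch described in this report can be sketched with a toy model. This is not Mesa's code: `toy_cache` and every function below are hypothetical names, the value of `MESA_DISK_CACHE_SINGLE_FILE` is passed in explicitly (as `getenv()` would have returned it at that moment) so the ordering is easy to see, and treating a leading '1' as "enabled" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the cache object; not Mesa's real struct. */
struct toy_cache {
    FILE *db_file; /* stays NULL unless single-file mode was on at creation */
};

/* `env_value` is what getenv("MESA_DISK_CACHE_SINGLE_FILE") returned at the
 * time of the call. Assumption: only a leading '1' counts as enabled. */
static bool single_file_enabled(const char *env_value)
{
    return env_value != NULL && env_value[0] == '1';
}

static void toy_cache_create(struct toy_cache *cache, const char *env_value)
{
    assert(cache != NULL);
    cache->db_file = single_file_enabled(env_value) ? tmpfile() : NULL;
}

/* Mirrors the reported pattern: destroy re-checks the variable instead of
 * the pointer, so it would call fclose(NULL) if the variable appeared late. */
static bool toy_destroy_would_fclose_null(const struct toy_cache *cache,
                                          const char *env_value)
{
    assert(cache != NULL);
    return single_file_enabled(env_value) && cache->db_file == NULL;
}

/* Defensive variant: trust the pointer, not the environment. */
static void toy_cache_destroy(struct toy_cache *cache)
{
    assert(cache != NULL);
    if (cache->db_file != NULL) {
        fclose(cache->db_file);
        cache->db_file = NULL;
    }
}
```

Creating the toy cache with the variable unset and then asking `toy_destroy_would_fclose_null()` with the value "1" reproduces the ordering seen in the backtraces above. The defensive destroy shows one way a library could be robust to it, independently of any reordering on the game side.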
#4
Support / Re: Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 14, 2021, 08:23:10 PM
Update 3:
The bug appears to be racy. I now have debug builds for both mesa and libx11. Setting a breakpoint in `XNoOp` sometimes appears to skip past the issue. I'll debug this offline and consider it off-topic for this thread at this point.
#5
Support / Re: Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 14, 2021, 02:52:26 PM
Update 2:
So I got a debug build of mesa. I believe the addition of the call to `XNoOp()` to `glXCreateContextAttribsARB(..)` is the cause of the symptom. It seems to have been introduced here:
Code:
commit 960c86d6787437b643825baa230bc0cd7f9f7540
Author: Bastian Beranek <[email protected]>
Date: Sat May 1 09:52:01 2021 +0200
glx: Assign unique serial number to GLXBadFBConfig error
Since commit f39fd3dce72 a new GLX error is issued in case context creation
fails. This broke wine on certain hardware: While wine installs an error handler
to ignore this kind of error, it does not function because it expects the
dpy->request serial number of the error to be incremented since the installation
of the handler.
Workaround this by artificially increasing the request number. This also
guarantees a unique serial number for the error.
Fixes: f39fd3dce72eaef59ab39a23b75030ef9efc2a40
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3969
Signed-off-by: Bastian Beranek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10565>
diff --git a/src/glx/create_context.c b/src/glx/create_context.c
index e3a513f58f6..7e1cec98c64 100644
--- a/src/glx/create_context.c
+++ b/src/glx/create_context.c
@@ -146,6 +146,9 @@ glXCreateContextAttribsARB(Display *dpy, GLXFBConfig config,
* somehow on the client side. clean up the server resource and panic.
*/
xcb_glx_destroy_context(c, xid);
+ /* increment dpy->request in order to give a unique serial number to the
+ * error */
+ XNoOp(dpy);
__glXSendError(dpy, GLXBadFBConfig, xid, 0, False);
} else {
gc->xid = xid;
I am unsure whether the issue lies with the caller not expecting to enter the libx11 event polling from here, or whether something else is going on.
#6
Support / Re: Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 14, 2021, 01:50:15 PM
Update:
Looks like a mesa update is at fault here. I downgraded to mesa 20.3.4-3, which appears to fix the issue, although I have not yet been able to track the bug down in mesa GLX / libx11.
The breakage was likely introduced with mesa 21.x: I tested both 21.1.2 and 21.1.4, and both cause the crash.
With mesa 20.3.4 I'm currently getting a SIGSEGV when closing the game, from `XCloseDisplay()`, which ends up calling `fclose()` from the i965_dri.so Intel direct rendering library. But at least the game is runnable.
#7
Support / Arch linux xcb_xlib_threads_sequence_lost assertion failed.
July 13, 2021, 05:43:33 PM
Hi All,
Anybody else got the following crash at startup?
Code:
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
RimWorldLinux: xcb_io.c:269: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
/home/qwattash/.local/share/Steam/steamapps/common/RimWorld/start_RimWorld.sh: line 27: 21149 Aborted (core dumped) LC_ALL=C ./$GAMEFILE $LOG
I attempted a clean RimWorld install without anything subscribed in the workshop, but it does not appear to help.
Offending versions:
RimWorld 1.2.3005 rev1191
libxcb: 1.14-1
Update: I ran RimWorld under gdb and got a stacktrace for the SIGABRT. Will follow up if it brings me somewhere.
Code:
#0 0x00007ffff7c6ad22 in raise () at /usr/lib/libc.so.6
#1 0x00007ffff7c54862 in abort () at /usr/lib/libc.so.6
#2 0x00007ffff7c54747 in _nl_load_domain.cold () at /usr/lib/libc.so.6
#3 0x00007ffff7c63616 in () at /usr/lib/libc.so.6
#4 0x00007ffff4f1ad2d in () at /usr/lib/libX11.so.6
#5 0x00007ffff4f1adc8 in () at /usr/lib/libX11.so.6
#6 0x00007ffff4f1b182 in _XEventsQueued () at /usr/lib/libX11.so.6
#7 0x00007ffff4f1e176 in _XGetRequest () at /usr/lib/libX11.so.6
#8 0x00007ffff4f09395 in XNoOp () at /usr/lib/libX11.so.6
#9 0x00007ffff49036e3 in () at /usr/lib/libGLX_mesa.so.0
#10 0x00007ffff4ab5428 in () at /usr/lib/libGLX.so.0
#11 0x00000000013c3674 in ()
#12 0x000000000138b1f1 in ()
#13 0x0000000000dde8a8 in ()
#14 0x0000000000de0bee in ()
#15 0x0000000000de0c90 in ()
#16 0x0000000000dcb8da in ()
#17 0x0000000000436446 in ()
#18 0x00007ffff7c55b25 in __libc_start_main () at /usr/lib/libc.so.6
#19 0x0000000000445d93 in ()
#20 0x00007fffffffdb08 in ()
#21 0x000000000000001c in ()
#22 0x0000000000000003 in ()
#23 0x00007fffffffdeec in ()
#24 0x00007fffffffdf36 in ()
#25 0x00007fffffffdf40 in ()
#26 0x0000000000000000 in ()
The error message mentions XInitThreads not being called, but I can confirm that it is called before the crash, so the issue is different.