Mutter-x11-frames keeps crashing

On CL 40690, I have noticed that the /usr/libexec/mutter-x11-frames binary keeps crashing when using GDM or logging into GNOME.

Jan 20 08:05:53 clr-912e704f04584fb493b108f9f9e1d8be systemd[1]: Started systemd-coredump@332-18257-0.service.
Jan 20 08:05:53 clr-912e704f04584fb493b108f9f9e1d8be gnome-shell[18241]: ERROR:             ICD associated with VkPhysicalDevice does not support GetPhysicalDeviceCalibrateableTimeDomainsKHR
Jan 20 08:05:53 clr-912e704f04584fb493b108f9f9e1d8be gnome-shell[18241]: libEGL warning: egl: failed to create dri2 screen
Jan 20 08:05:53 clr-912e704f04584fb493b108f9f9e1d8be systemd[1]: systemd-coredump@331-18216-0.service: Deactivated successfully.
Jan 20 08:05:53 clr-912e704f04584fb493b108f9f9e1d8be systemd-coredump[18217]: [🡕] Process 18201 (mutter-x11-fram) of user 1000 dumped core.

     Stack trace of thread 18201:
#0  0x000055e170898f3b __pthread_kill_implementation (libc.so.6 + 0x98f3b)
#1  0x000055e170841682 __GI_raise (libc.so.6 + 0x41682)
#2  0x000055e17082649f __GI_abort (libc.so.6 + 0x2649f)
#3  0x000055e16c8ce1c3 terminator_GetPhysicalDeviceCalibrateableTimeDomainsKHR (libvulkan.so.1 + 0x2e1c3)
#4  0x000055e161fb37f2 check_have_device_time (zink_dri.so + 0xedb7f2)
#5  0x000055e161fb59ea zink_create_screen (zink_dri.so + 0xedd9ea)
#6  0x000055e161b52941 pipe_loader_sw_create_screen (zink_dri.so + 0xa7a941)
#7  0x000055e161b52829 pipe_loader_create_screen_vk (zink_dri.so + 0xa7a829)
#8  0x000055e16118ddda kopper_init_screen (zink_dri.so + 0xb5dda)
#9  0x000055e1611939d5 driCreateNewScreen2 (zink_dri.so + 0xbb9d5)
#10 0x000055e16cdcb6e6 dri2_create_screen (libEGL_mesa.so.0 + 0x236e6)
#11 0x000055e16cdd0e2e dri2_initialize_x11_swrast (libEGL_mesa.so.0 + 0x28e2e)
#12 0x000055e16cdc85b0 dri2_initialize (libEGL_mesa.so.0 + 0x205b0)
#13 0x000055e16cdb8a43 eglInitialize (libEGL_mesa.so.0 + 0x10a43)
#14 0x000055e1717f5a50 gdk_display_init_egl (libgtk-4.so.1 + 0x491a50)
#15 0x000055e1717c71db gdk_x11_display_init_gl_backend (libgtk-4.so.1 + 0x4631db)
#16 0x000055e1717f6064 gdk_display_init_gl (libgtk-4.so.1 + 0x492064)
#17 0x000055e1717c1f1b gdk_x11_display_open (libgtk-4.so.1 + 0x45df1b)
#18 0x000055e1717f0154 gdk_display_manager_open_display (libgtk-4.so.1 + 0x48c154)
#19 0x000055e1715128e7 gdk_display_open_default (libgtk-4.so.1 + 0x1ae8e7)
#20 0x000055e1715133d9 gtk_init (libgtk-4.so.1 + 0x1af3d9)
#21 0x000055e171c196f8 main (mutter-x11-frames + 0x36f8)
#22 0x000055e170827d77 __libc_start_call_main (libc.so.6 + 0x27d77)
#23 0x000055e170827e35 __libc_start_main_impl (libc.so.6 + 0x27e35)
#24 0x000055e171c19791 _start (mutter-x11-frames + 0x3791)
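
For anyone reading the trace, it can help to collapse it into the chain of libraries involved. Below is a minimal Python sketch that does this for a handful of frames copied from the trace above. The helper names (`parse_frames`, `module_chain`) and the regular expression are my own and only assume the frame layout shown in this post; they are not part of any systemd tooling.

```python
import re
from itertools import groupby

# A few frames copied from the trace above (abbreviated for the example).
TRACE = """\
#2  0x000055e17082649f __GI_abort (libc.so.6 + 0x2649f)
#3  0x000055e16c8ce1c3 terminator_GetPhysicalDeviceCalibrateableTimeDomainsKHR (libvulkan.so.1 + 0x2e1c3)
#4  0x000055e161fb37f2 check_have_device_time (zink_dri.so + 0xedb7f2)
#5  0x000055e161fb59ea zink_create_screen (zink_dri.so + 0xedd9ea)
#10 0x000055e16cdcb6e6 dri2_create_screen (libEGL_mesa.so.0 + 0x236e6)
#13 0x000055e16cdb8a43 eglInitialize (libEGL_mesa.so.0 + 0x10a43)
#14 0x000055e1717f5a50 gdk_display_init_egl (libgtk-4.so.1 + 0x491a50)
#20 0x000055e1715133d9 gtk_init (libgtk-4.so.1 + 0x1af3d9)
#21 0x000055e171c196f8 main (mutter-x11-frames + 0x36f8)
"""

# Assumed frame layout: "#N 0xADDRESS symbol (module + 0xOFFSET)".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+(\S+)\s+\((\S+) \+ 0x[0-9a-f]+\)$")


def parse_frames(text):
    """Return a list of (frame_number, symbol, module) tuples."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def module_chain(frames):
    """List modules from the outermost caller down to the crash site,
    collapsing consecutive frames that belong to the same module."""
    ordered = sorted(frames, reverse=True)  # highest frame number = outermost caller
    return [module for module, _ in groupby(frame[2] for frame in ordered)]


if __name__ == "__main__":
    print(" -> ".join(module_chain(parse_frames(TRACE))))
```

Read this way, the trace shows GTK 4 initialising EGL, Mesa's EGL creating a zink screen, and the abort being raised from inside libvulkan.so.1 rather than from mutter-x11-frames itself.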
                                                                                  

While searching, I came across this Fedora thread, although I’m not sure it’s 100% related.

FYI, I’m using NVIDIA driver version 545.23.06.

I’m using NVIDIA driver version 535.154.05 on CL 40690 with egl-gbm backported from 545.29.06, and I have no issues running GNOME on X11. I prefer running the LTS kernel, so there is that too. Some applications that I use are not yet ready for Wayland, which is why I’m running X11.

Hey marioroy, thanks for responding, and thanks so much for all the hard work you put into your nvidia-driver-on-clear-linux repo. I’ve been using it for some time and it makes the NVIDIA proprietary driver installation so much easier.

Anyway, on a hunch because of the dri2 error, I disconnected my second GPU (I run a dual-GPU setup with two RTX A6000 cards and NVLink). I still got the constant crashing, even at the GDM login screen before logging in. I downgraded to NVIDIA driver 535.154.05 and was able to run a GNOME on Xorg session. I then shut down, reconnected my second GPU, and booted back up. I still got the mutter-x11-frames crashing, and when I tried to start a GNOME on Xorg session, I got a black screen.

So right now I’ve disconnected my second GPU and am running GNOME on Xorg. Things seem OK so far. I might try the LTS kernel as well, with both dual and single GPUs, to see if it makes a difference.

I want to note that everything was indeed working fine prior to the upgrade. I was running CL 40xxx from, I think, October 2023.

Also, it appears mutter-x11-frames is still crashing at the GDM login screen. Once I start a GNOME on Xorg session and log in, the crashing stops.

Looks like CL was a little ahead of the curve and the issue is now popping up more as distros go to Mesa 24. See

and
vkGetInstanceProcAddr returns non-NULL for unsupported functions · Issue #1443 · KhronosGroup/Vulkan-Loader · GitHub.
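
To illustrate why a non-NULL pointer for an unsupported function leads to the abort in the trace above, here is a toy Python model. It is only a sketch under my own assumptions, not the actual Vulkan loader or Mesa code: `ToyLoader`, `make_terminator`, `supports`, and both `query_time_domains_*` helpers are hypothetical names, and an exception stands in for the real `abort()`.

```python
NAME = "GetPhysicalDeviceCalibrateableTimeDomainsKHR"


class UnsupportedByDriver(RuntimeError):
    """Stands in for the abort() seen in the stack trace."""


def make_terminator(name):
    # Hypothetical stand-in for a loader "terminator" stub: it exists for
    # every known entry point but fails if the driver lacks the function.
    def terminator(*args):
        raise UnsupportedByDriver(f"ICD does not support {name}")
    return terminator


class ToyLoader:
    def __init__(self, driver_functions):
        # Functions the (imaginary) driver actually implements.
        self.driver_functions = dict(driver_functions)

    def get_proc_addr(self, name):
        # Never returns None, so a None check says nothing about support.
        return self.driver_functions.get(name, make_terminator(name))

    def supports(self, name):
        # Hypothetical stand-in for checking what the driver advertises.
        return name in self.driver_functions


def query_time_domains_unchecked(loader):
    fn = loader.get_proc_addr(NAME)
    if fn is not None:  # always true in this model
        return fn()
    return []


def query_time_domains_checked(loader):
    if not loader.supports(NAME):
        return []  # skip the optional feature instead of calling the stub
    return loader.get_proc_addr(NAME)()


if __name__ == "__main__":
    loader = ToyLoader({})  # a driver without the function
    print(query_time_domains_checked(loader))
    try:
        query_time_domains_unchecked(loader)
    except UnsupportedByDriver as exc:
        print("would have aborted:", exc)
```

In this model the unchecked caller trusts the returned pointer and hits the terminator stub, which matches the shape of frames #3 and #4 in the trace; the checked caller asks about support first and simply skips the feature.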

I was able to fix the issue by downgrading to Mesa 23.3.6, which I built from source.