On recent CL releases, if eglinfo crashes with NVIDIA graphics, try the 545 display driver.
If you experience high latency on the 6.6 kernel, try the CL LTS kernel. Another option is to enable sched_autogroup by adding an entry to /etc/clr-power-tweaks.conf (create the file if missing).
/proc/sys/kernel/sched_autogroup_enabled 1
Update:
The eglinfo segfaulting is resolved for 525, 535, and Vulkan drivers by backporting libnvidia-egl-gbm.so from 545.29.06.
I tried CachyOS (https://cachyos.org), and in my opinion it has the smoothest UI responsiveness, thanks to its choice of ‘v3 kernels’. The UI stayed unbelievably responsive while compiling a kernel using all cores.
Have you tried setting the sched_autogroup_enabled kernel knob to 1? With this enabled, I have not experienced UI slowness while building kernels. This knob is enabled by default in CachyOS.
sudo tee -a "/etc/clr-power-tweaks.conf" >/dev/null <<'EOF'
/proc/sys/kernel/sched_autogroup_enabled 1
EOF
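Because tee -a appends on every run, repeating the command above leaves duplicate lines in the file. Here is a minimal sketch of an idempotent variant. The helper name ensure_tweak is hypothetical (it is not part of Clear Linux), and it assumes the one "path value" entry per line layout shown above.

```shell
# Append an entry to a tweaks file only if that exact line is missing.
# Usage: ensure_tweak <conf-file> "<path> <value>"
# ensure_tweak is an illustrative helper, not a Clear Linux tool.
ensure_tweak() {
    conf="$1"
    entry="$2"
    # -x matches the whole line, -F treats the entry as a fixed string
    if [ -f "$conf" ] && grep -qxF -- "$entry" "$conf"; then
        return 0
    fi
    printf '%s\n' "$entry" >> "$conf"
}
```

Run it as root against /etc/clr-power-tweaks.conf, or against a scratch file to try it out first. The live value can be read back with cat /proc/sys/kernel/sched_autogroup_enabled.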
The NVIDIA proprietary driver may not build successfully for Linux kernels 6.8.0-rc2, 6.7.3, 6.6.15, and 6.1.76. Edit: See the next post. I refactored the driver installation script to apply the patch automatically.
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol '__rcu_read_lock'
ERROR: modpost: GPL-incompatible module nvidia.ko uses GPL-only symbol '__rcu_read_unlock'
A workaround is patching the NVIDIA kernel source. Download the patch mentioned in the article. Here, I applied the patch to my NVIDIA installation. I’m running 535.154.05.
cd /usr/src/nvidia-535.154.05
sudo patch -p2 < ~/Downloads/nvidia-drivers-470.223.02-gpl-pfn_valid.patch
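If you want to be careful before touching the driver sources, patch can do a dry run first. The sketch below wraps the two commands above. The function name apply_patch_checked is made up for illustration, and the patch file must be given as an absolute path because the function changes directory.

```shell
# Apply a patch only if a dry run succeeds, so a failed hunk never
# leaves the source tree half-patched.
# Usage: apply_patch_checked <source-dir> <absolute-patch-path> [strip-level]
# apply_patch_checked is an illustrative helper name.
apply_patch_checked() {
    src_dir="$1"
    patch_file="$2"
    strip="${3:-2}"
    (
        cd "$src_dir" || exit 1
        # --dry-run reports what would happen without modifying any file;
        # -N skips patches that look already applied instead of prompting
        patch -N -p"$strip" --dry-run < "$patch_file" >/dev/null || exit 1
        patch -N -p"$strip" < "$patch_file"
    )
}
```

For the case above, that would be something like apply_patch_checked /usr/src/nvidia-535.154.05 "$HOME/Downloads/nvidia-drivers-470.223.02-gpl-pfn_valid.patch", run as root since /usr/src is not user-writable.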
I refactored the driver installation script so it can patch the NVIDIA kernel sources to fix bugs. Patching must happen before DKMS is called, and the script handles this automatically. Here is a test run on my machine. DKMS succeeded for the recent XanMod kernels 6.1.76 and 6.7.3.
$ ./install-driver 535
Installing the NVIDIA proprietary driver...
Verifying archive integrity... OK
Uncompressing NVIDIA Accelerated Graphics Driver for Linux-x86_64 535.154.05...
...............................................................................
...............................................................................
...............................................................................
....................................
The NVIDIA driver installation succeeded.
Backporting libnvidia-egl-gbm.so* from 545.29.06.
Applying nvidia-gpl-pfn_valid.patch to /usr/src/nvidia-535.154.05
patching file common/inc/nv-linux.h
Hunk #1 succeeded at 2069 (offset 79 lines).
patching file nvidia/nv-mmap.c
Hunk #1 succeeded at 584 with fuzz 1 (offset 8 lines).
patching file nvidia/os-mlock.c
Hunk #1 succeeded at 115 (offset 13 lines).
Hunk #2 succeeded at 189 (offset 13 lines).
Registering the NVIDIA kernel module sources with DKMS.
Creating symlink /var/lib/dkms/nvidia/535.154.05/source -> /usr/src/nvidia-535.154.05
Building the NVIDIA kernel modules.
Checking dkms modules in 6.1.69-1330.ltsprev.
Checking dkms modules in 6.1.76-126.xmlts-preempt.
Checking dkms modules in 6.6.14-121.xmrt-preempt.
Checking dkms modules in 6.7.3-128.xmedge-preempt.
Updating the X11 output class configuration file.
Running the fix-nvidia-libGL-trigger service.
...
Above, backporting libnvidia-egl-gbm.so* from 545.29.06 resolves the eglinfo segfault under X11.
The XanMod kernels in the Clear Linux clearmod repository will be updated separately (I am still considering what to do for folks who have not applied the patch to the NVIDIA sources).
I added the BORE scheduler patch to XanMod Edge variants in my XanMod on Clear Linux repository. The Burst-Oriented Response Enhancer CPU scheduler works great. For testing, be sure the sched_autogroup_enabled kernel knob is enabled.
I played a 1440p60 HD video in Google Chrome while computing prime numbers in three separate terminal windows. Video playback runs on the CPU for me (using NVIDIA graphics), which makes it possible to test the BORE CPU scheduler.
Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film
https://www.youtube.com/watch?v=aqz-KE-bpKQ (select 1440p60 HD)
https://github.com/marioroy/mce-sandbox (algorithm3.pl is found here)
XanMod Edge 6.7 with kernel.sched_bore     enabled  disabled cl-native
$ ./algorithm3.pl 2e12 in terminal #1       97.310    95.921    97.434
$ ./algorithm3.pl 2e12 in terminal #2       98.077    97.184    98.355
$ ./algorithm3.pl 2e12 in terminal #3       97.522    96.609    97.939
                                           -------   -------   -------
                                           292.909   289.714   293.728
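The totals row can be double-checked with a few lines of awk. This is only a sketch. The name sum_columns is chosen here for illustration and is not part of mce-sandbox.

```shell
# Sum each whitespace-separated numeric column read from stdin and
# print the totals on one line with three decimals.
# sum_columns is an illustrative helper name.
sum_columns() {
    awk '{
             for (i = 1; i <= NF; i++) sum[i] += $i
             if (NF > n) n = NF
         }
         END {
             for (i = 1; i <= n; i++)
                 printf "%s%.3f", (i > 1 ? " " : ""), sum[i]
             print ""
         }'
}

# The three terminal rows from the 1440p60 run above
printf '%s\n' \
    "97.310 95.921 97.434" \
    "98.077 97.184 98.355" \
    "97.522 96.609 97.939" | sum_columns
# prints: 292.909 289.714 293.728
```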
Next, I played the same video but increased the quality to 2160p60 4K.
Big Buck Bunny 60fps 4K - Official Blender Foundation Short Film
https://www.youtube.com/watch?v=aqz-KE-bpKQ (select 2160p60 4K)
https://github.com/marioroy/mce-sandbox (algorithm3.pl is found here)
XanMod Edge 6.7 with kernel.sched_bore     enabled  disabled cl-native
$ ./algorithm3.pl 2e12 in terminal #1       98.952    99.400    98.621
$ ./algorithm3.pl 2e12 in terminal #2       99.778   100.274    99.673
$ ./algorithm3.pl 2e12 in terminal #3       99.031    99.830    99.244
                                           -------   -------   -------
                                           297.761   299.504   297.538
Summary:
The Clear Linux 6.7.3-native kernel drops over 1,000 frames (>25%) in the time taken to run the 2e12 prime computation. Interestingly, the sched_autogroup_enabled knob needs a preempt-enabled kernel to complement it.
The XanMod 6.7.4 preempt-enabled kernel with the BORE scheduler patch took less time overall to run the 2e12 prime computation. Video playback is smooth, with no frame drops, whether kernel.sched_bore is enabled (default) or disabled.
Running 3x the number of logical cores is not typical; this was done to stress test the kernels. The Clear native kernel does okay running one worker per CPU thread, but at 3x the load video playback loses many frames.
The BORE CPU scheduler works so well that I bumped the LTS kernel to 6.6.x.
1. Bump to latest XanMod 6.6.x stable and RT kernels.
2. Rebase the LTS variants from 6.1.x to 6.6.x Clear patch set.
3. Apply the BORE CPU scheduler patch to 6.6.x, similar to 6.7.x.
4. Revert the Multi-LLC select_idle_sibling patch in Edge variants.
It was later requested by kernel developers to drop this patch.
5. Update the readme file.
With the system under load (3x the number of logical cores), being able to open a new terminal window without affecting video playback (no stutter) is remarkable.
What about the RT kernel without BORE? Video playback is smooth under 3x load but requires running Chrome with realtime priority. However, launching a new terminal window causes a brief stutter.
Again, running 3x the number of threads is not typical; this was done to stress test the kernels. The BORE CPU scheduler is amazing. You can disable it, as described in the README.
With the Feb-2024 update 2, one can quickly build the XanMod + BORE CPU Scheduler kernel. The default is to build the generic kernel. Setting LOCALMODCONFIG=1 enables trimming, which builds only the modules you have running.
I captured the time to build the generic and trimmed kernels using 3, 7, 15, and 31 CPU cores. Previously, the generic build took about 43 minutes using 3 CPU cores. The Feb-2024 update 2 decreased the time. A trimmed build saves a great deal of time and storage.
The /lib/modules/[kernel] size includes the NVIDIA driver on my machine.
Generic kernel (all modules configured in Clear config - default):
$ ./fetch-src main
$ time ./xm-build main-preempt
             3 CPUs       7 CPUs      15 CPUs      31 CPUs
real     41m35.725s   19m 5.350s   10m 5.097s    6m10.755s
user    112m 7.095s  114m12.451s  117m40.371s  127m55.995s
sys       9m38.373s    9m49.154s   10m39.254s   12m 2.763s
$ ls -lh rpmbuild.main/RPMS/x86_64/
total 106M
... 71M Feb 15 15:31 linux-xmmain-preempt-6.6.16-133.x86_64.rpm
... 109K Feb 15 15:31 linux-xmmain-preempt-cpio-6.6.16-133.x86_64.rpm
... 16M Feb 15 15:31 linux-xmmain-preempt-dev-6.6.16-133.x86_64.rpm
... 20M Feb 15 15:31 linux-xmmain-preempt-extra-6.6.16-133.x86_64.rpm
... 55K Feb 15 15:31 linux-xmmain-preempt-license-6.6.16-133.x86_64.rpm
$ ./xm-install main-preempt
$ du -sh /lib/modules/6.6.16-133.xmmain-preempt/
456M /lib/modules/6.6.16-133.xmmain-preempt/
Trimmed kernel (only the modules you have running; LOCALMODCONFIG=1):
$ time LOCALMODCONFIG=1 ./xm-build main-preempt
             3 CPUs       7 CPUs      15 CPUs      31 CPUs
real     10m 0.565s    4m54.692s    2m52.431s    1m59.312s
user     26m25.156s   26m49.352s   27m31.223s   29m44.517s
sys       2m 3.640s    2m 6.321s    2m13.948s    2m28.507s
$ ls -lh rpmbuild.main/RPMS/x86_64/
total 51M
... 16M Feb 15 15:42 linux-xmmain-preempt-6.6.16-133.x86_64.rpm
... 89K Feb 15 15:42 linux-xmmain-preempt-cpio-6.6.16-133.x86_64.rpm
... 16M Feb 15 15:42 linux-xmmain-preempt-dev-6.6.16-133.x86_64.rpm
... 19M Feb 15 15:42 linux-xmmain-preempt-extra-6.6.16-133.x86_64.rpm
... 55K Feb 15 15:42 linux-xmmain-preempt-license-6.6.16-133.x86_64.rpm
$ ./xm-install main-preempt
$ du -sh /lib/modules/6.6.16-133.xmmain-preempt/
190M /lib/modules/6.6.16-133.xmmain-preempt/
Indeed, what a time saver! Some parts of the build process use a single CPU core, but most of it runs in parallel. Clock speed also decreases as more CPU cores are in use.
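To put a number on the savings, the "real" times from the two tables can be converted to seconds and divided. This is a rough sketch. The helpers to_seconds and speedup are illustrative names, and the padding space inside values such as "10m 0.565s" has to be removed before passing them in.

```shell
# Convert a time-style value such as 41m35.725s to seconds.
# to_seconds and speedup are illustrative helper names.
to_seconds() {
    printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of two durations, e.g. generic build time over trimmed build time.
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

speedup 41m35.725s 10m0.565s   # 3 CPUs, generic vs trimmed: prints 4.16
speedup 6m10.755s 1m59.312s    # 31 CPUs, generic vs trimmed: prints 3.11
```

So on my numbers the trimmed build is roughly 3x to 4x faster than the generic one, with the gap narrowing as more cores are used.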
I fixed a bug. While doing so, I added support for the latest 535 and 550 driver releases.
1. Add check for 535.161.x (tested 535.161.07).
2. Add check for 550.54.x (tested 550.54.14).
3. Backport NVIDIA egl-gbm library from 550.54.14.
4. Fix if statement applying the DRM hotplug patch.
The new drivers can be found at nvidia.com. Select Tesla and CUDA 12.2 for the 535 update. Pass the path to the installer file as an argument to the install-driver script.
My Clear Linux repositories, ClearMod and NVIDIA driver, are complete. I never intended to keep updating them forever; the reason is time constraints.
I captured results for HZ_1000, HZ_800, HZ_750, HZ_600, and HZ_500. Going forward, the ClearMod project defaults to HZ_800 for the Edge, Main, and LTS variants. Overriding the default is possible with HZ=value.