NVIDIA and XanMod CL updates

Latency results: The XanMod kernels have PREEMPT preemption enabled. Notice the average latency (microseconds) and completion time when comparing kernels.

Unsure which kernel to select? The XanMod 6.1.x kernel is quite fast. However, the XanMod 6.6.x kernel is snappier for the desktop environment. Applications launch faster. The BORE CPU scheduler is amazing. Of course, the Clear patches help too.

Thank you for your work. 550 on CL is working great.

The ClearMod project supports building Clear’s native kernel with the BORE (Burst-Oriented Response Enhancer) CPU scheduler and with preemption enabled.

LOCALMODCONFIG=1 ./xm-build clear-preempt
./xm-install clear-preempt
sync

How did this come about? Above, I saw a regression when running Clear’s native kernel for latency testing. Its max latencies were the lowest, but its average latency under load and its time to compute prime numbers were the highest. It turns out that the Clear native kernel, built this way, performs similarly to the XanMod kernel.

I witnessed Clear’s native kernel 6.8.1 + BORE 5.0.1 + preemption. :blossom:

$ ls /lib/modules
6.1.69-1331.ltsprev  6.6.22-160.xmmain-preempt  6.8.1-160.xmclear-preempt

ClearMod project: I replaced HZ_750 with HZ_720, which resolves a compute regression seen with HZ_750. I also replaced HZ_600 with HZ_625, which improves hackbench results versus HZ_600. So far, no odd behavior with the new entries HZ_625, HZ_720, and HZ_800.

1 /  100 = 0.01
1 /  250 = 0.004
1 /  300 = 0.00333333333333333
1 /  500 = 0.002
1 /  625 = 0.0016
1 /  720 = 0.00138888888888889
1 /  800 = 0.00125
1 / 1000 = 0.001
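
The reciprocals above are the timer tick periods in seconds. As a minimal sketch, the same list can be reproduced in milliseconds; the function name `tick_period_ms` is only illustrative and not part of ClearMod.

```python
# Timer frequencies listed in the post.
HZ_VALUES = [100, 250, 300, 500, 625, 720, 800, 1000]


def tick_period_ms(hz: int) -> float:
    """Return the timer tick period in milliseconds for a given HZ value."""
    return 1000.0 / hz


for hz in HZ_VALUES:
    print(f"HZ_{hz:<5} {tick_period_ms(hz):.4f} ms")
```

Reading the values in milliseconds makes it easier to see that HZ_625 and HZ_800 give short, exact decimals (1.6 ms and 1.25 ms), while HZ_720 does not.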

How did the Hz values come about? HZ_800 was inspired by computing 1 / 800 = 0.00125. That looks “graceful” and powerful. I searched the web to see if HZ_800 is used elsewhere. HZ_625 came from hamadmarri’s baby_linux project. That inspired me to decrease HZ_750 down to HZ_720.

Notice how max latency progressively decreases from HZ_625 to HZ_800 (HZ_1000, included for comparison, comes in slightly above HZ_800). Averages are low for all Hz values. This was testing 4 million pings (1 million per sender, concurrently).

HZ_625

Sender 1: Minimum = 3.000us, Maximum = 20484.250us, Average = 6.947us
Sender 2: Minimum = 3.000us, Maximum = 20484.000us, Average = 7.416us
Sender 3: Minimum = 3.250us, Maximum = 20493.500us, Average = 9.395us
Sender 4: Minimum = 3.250us, Maximum = 20495.750us, Average = 9.214us

HZ_720

Sender 1: Minimum = 3.000us, Maximum = 20365.250us, Average = 6.563us
Sender 2: Minimum = 3.000us, Maximum = 20364.500us, Average = 7.451us
Sender 3: Minimum = 3.250us, Maximum = 20372.250us, Average = 9.186us
Sender 4: Minimum = 3.000us, Maximum = 20371.250us, Average = 8.934us

HZ_800

Sender 1: Minimum = 2.750us, Maximum = 19873.000us, Average = 6.463us
Sender 2: Minimum = 3.000us, Maximum = 19879.500us, Average = 7.356us
Sender 3: Minimum = 3.000us, Maximum = 19882.250us, Average = 9.223us
Sender 4: Minimum = 3.250us, Maximum = 19881.750us, Average = 8.972us

HZ_1000

Sender 1: Minimum = 2.750us, Maximum = 19972.250us, Average = 7.050us
Sender 2: Minimum = 3.000us, Maximum = 19973.250us, Average = 7.440us
Sender 3: Minimum = 2.500us, Maximum = 19983.000us, Average = 9.332us
Sender 4: Minimum = 3.000us, Maximum = 19990.750us, Average = 9.120us
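
To compare runs without eyeballing sixteen lines, the per-sender output can be reduced to one worst-case maximum and one mean average per Hz value. This is a rough sketch that assumes the exact line format shown above; the helper names are hypothetical and only the HZ_800 block is included as sample input.

```python
import re

# Matches lines such as:
# "Sender 1: Minimum = 2.750us, Maximum = 19873.000us, Average = 6.463us"
LINE_RE = re.compile(
    r"Sender (\d+): Minimum = ([\d.]+)us, "
    r"Maximum = ([\d.]+)us, Average = ([\d.]+)us"
)


def parse_senders(text):
    """Return a list of (sender, min_us, max_us, avg_us) tuples."""
    return [
        (int(m.group(1)), float(m.group(2)), float(m.group(3)), float(m.group(4)))
        for m in LINE_RE.finditer(text)
    ]


def summarize(text):
    """Return (worst max latency, mean of the per-sender averages) in microseconds."""
    rows = parse_senders(text)
    worst_max = max(row[2] for row in rows)
    mean_avg = sum(row[3] for row in rows) / len(rows)
    return worst_max, mean_avg


# HZ_800 results copied from the post.
HZ_800 = """\
Sender 1: Minimum = 2.750us, Maximum = 19873.000us, Average = 6.463us
Sender 2: Minimum = 3.000us, Maximum = 19879.500us, Average = 7.356us
Sender 3: Minimum = 3.000us, Maximum = 19882.250us, Average = 9.223us
Sender 4: Minimum = 3.250us, Maximum = 19881.750us, Average = 8.972us
"""

worst, avg = summarize(HZ_800)
print(f"HZ_800: worst max = {worst:.3f}us, mean of averages = {avg:.3f}us")
```

Pasting the other three blocks into `summarize` gives the same two numbers for HZ_625, HZ_720, and HZ_1000.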

The ClearMod project defaults to HZ_800.

I removed XanMod LTS 6.1.y, Main 6.6.y, and RT 6.6.y variants. That leaves only XanMod Edge and Clear’s Native; both 6.8.y. This makes it more manageable, consuming less time to QA.

ClearMod release 165: Bump kernels to 6.8.2

  1. Update Clear native and XanMod edge kernels to 6.8.2.
  2. Enhance fetch script to acquire latest from kernel.org, if needed for Clear.
  3. Add kbuild generic x86_64 levels for Clear.
  4. Disable watermark boosting by default.
  5. Refactor update_curr(), entity_tick() in sched/fair.

Off-topic:

The GNOME 46.0 Xorg environment, particularly gnome-terminal, is not happy. There is a delay of at least 1 second (occurring randomly) before command output appears; e.g. ls. The issue began with Clear 41280. Unfortunately, NVIDIA drivers have not yet reached reliability using Wayland, particularly Xwayland.

Clear 41270 is currently the last stable Xorg/GNOME environment for NVIDIA graphics. In general, it’s advised to move away from Xorg anyway. Maybe the next NVIDIA 555 driver will be better.

Someone recently (using Radeon 780M graphics, embedded in the APU) tried Clear 41300 and 41270 to no avail. Black screen, live desktop image. What does one say? I mentioned a time when Clear Linux was reliable.

I built three variants of the Clear native kernel to compare against (1) the stock native kernel: (2) BORE without preemption, (3) BORE with preemption, and (4) BORE with preemption using HZ=800.

1. 6.8.2-1420.native          HZ=1000
2. 6.8.2-166.xmclear-default  HZ=1000  BORE 5.0.3
3. 6.8.2-166.xmclear-preempt  HZ=1000  BORE 5.0.3
4. 6.8.2-166.xmclear-preempt  HZ=800   BORE 5.0.3

Compute only: On the preempt kernels, running with the idle attribute (chrt -i 0) reaches non-preempt performance.

                       Clear Native    With BORE  Preempt+BORE Preempt+BORE
                           HZ=1000      HZ=1000      HZ=1000      HZ=800

$ ./algorithm3.pl 1e12     14.857s      14.770s      15.283s      15.244s
$ chrt -i 0 \
  ./algorithm3.pl 1e12     14.691s      14.671s      14.633s      14.644s
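
The gaps in the table above are small, so it can help to express them as a percentage of the Clear native time. The following is a minimal sketch using the timings copied from the table; the dictionary layout and the `percent_change` helper are only illustrative.

```python
# Wall-clock times in seconds, copied from the compute-only table.
TIMES = {
    "Clear Native HZ=1000": {"default": 14.857, "chrt -i 0": 14.691},
    "With BORE HZ=1000": {"default": 14.770, "chrt -i 0": 14.671},
    "Preempt+BORE HZ=1000": {"default": 15.283, "chrt -i 0": 14.633},
    "Preempt+BORE HZ=800": {"default": 15.244, "chrt -i 0": 14.644},
}


def percent_change(value: float, baseline: float) -> float:
    """Return how much `value` differs from `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


baseline = TIMES["Clear Native HZ=1000"]
for kernel, runs in TIMES.items():
    for mode, secs in runs.items():
        delta = percent_change(secs, baseline[mode])
        print(f"{kernel:<22} {mode:<10} {secs:7.3f}s  {delta:+.2f}%")
```

A positive percentage means the run took longer than the Clear native kernel in the same mode.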

Next, four tasks running concurrently to capture latency results.

Xorg/GNOME: YouTube playback consumes less CPU on Xorg using NVIDIA.

Chromium Browser:  https://slowroads.io/
   Google Chrome:  https://www.youtube.com/watch?v=aqz-KE-bpKQ (1440p60 HD)

                       Clear Native    With BORE  Preempt+BORE Preempt+BORE
                           HZ=1000      HZ=1000      HZ=1000      HZ=800
$ chrt -i 0 \
  ./algorithm3.pl 2e12     37.579s      37.774s      38.539s      38.767s
$ ./schbench 
Latency percentiles (usec)
            50.0th:           37           30           32           32
            75.0th:          835          476          739          769
            90.0th:         1694         1102         2060         1806
            95.0th:         2372         1790         2676         2668
           *99.0th:         3884         2972         4264         3884
            99.5th:         4168         3340         4840         4424
            99.9th:         5368         4264         6024         5464
            min=0, max=     7120         6119         7607         7512
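
One way to read the schbench table is to pick the kernel with the lowest latency in each row. This is a sketch over the values copied from the table above; the variable names and the `best_kernel` helper are hypothetical.

```python
# Column order matches the table header.
KERNELS = [
    "Clear Native HZ=1000",
    "With BORE HZ=1000",
    "Preempt+BORE HZ=1000",
    "Preempt+BORE HZ=800",
]

# schbench latencies in microseconds, copied from the table.
SCHBENCH = {
    "50.0th": [37, 30, 32, 32],
    "75.0th": [835, 476, 739, 769],
    "90.0th": [1694, 1102, 2060, 1806],
    "95.0th": [2372, 1790, 2676, 2668],
    "99.0th": [3884, 2972, 4264, 3884],
    "99.5th": [4168, 3340, 4840, 4424],
    "99.9th": [5368, 4264, 6024, 5464],
    "max": [7120, 6119, 7607, 7512],
}


def best_kernel(values):
    """Return the kernel name with the lowest latency in one table row."""
    index = min(range(len(values)), key=values.__getitem__)
    return KERNELS[index]


for row, values in SCHBENCH.items():
    print(f"{row:>7}: {best_kernel(values)} ({min(values)} usec)")
```

With these numbers, BORE without preemption has the lowest latency in every row.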

The apples-to-apples comparison with Clear Linux’s native kernel is BORE without preemption. Running background jobs? Either chrt -i 0 or a preempt kernel helps keep the Slow Roads demonstration smooth.

HZ=800 performs better on my system for preempted kernels. For background jobs, running with idle attribute reaches non-preempt performance.

Results captured on an AMD Ryzen Threadripper 3970X machine.

ClearMod Simplification; Release 168.

  1. Single rpmbuild folder where the SPEC files reside.
  2. Four kernels; Clear, Edge, BORE, and ECHO (new).
  3. Rename kernels to shorter names, without preempt suffix.
  4. Keep only essential sched-fair patches beneficial for BORE.
  5. Rename xm-list-kernels to xm-kernels.
  6. Change Hz default to 800. Remove HZ_720.
  7. Bump kernels to 6.8.4.
  8. Build kernels in tmp folder.

Already using ClearMod? Boot into a Clear OS installed kernel. Run ./xm-uninstall all. Afterwards, you can git pull or re-clone the repository. The xm-uninstall script will continue to support removal for older kernels, though deprecated.

clear - Clear Linux native kernel + preemption
bore  - XanMod Edge kernel + preemption + BORE
echo  - XanMod Edge kernel + preemption + ECHO
edge  - XanMod Edge kernel + preemption

The fetch-src script takes no arguments, now that there is a single rpmbuild folder. I renamed xm-list-kernels to xm-kernels.

./fetch-src
./xm-build bore | clear | echo | edge
./xm-install bore | clear | echo | edge [<release>]
./xm-uninstall bore | clear | echo | edge [<release>]
./xm-uninstall all
./xm-kernels

I built two kernels in little time, which is possible with LOCALMODCONFIG set.

$ LOCALMODCONFIG=1 ./xm-build bore
$ LOCALMODCONFIG=2 ./xm-build echo

What does installation look like? I captured the output. The process is NVIDIA-aware and will build the NVIDIA drivers automatically via dkms.

$ ./xm-install bore
Installing linux-xmbore
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:linux-xmbore-license-6.8.4-168   ################################# [ 25%]
   2:linux-xmbore-6.8.4-168           ################################# [ 50%]
   3:linux-xmbore-extra-6.8.4-168     ################################# [ 75%]
   4:linux-xmbore-dev-6.8.4-168       ################################# [100%]
Building kernel drivers for NVIDIA graphics.
done.

$ ./xm-install echo
Installing linux-xmecho
Verifying...                          ################################# [100%]
Preparing...                          ################################# [100%]
Updating / installing...
   1:linux-xmecho-license-6.8.4-168   ################################# [ 25%]
   2:linux-xmecho-6.8.4-168           ################################# [ 50%]
   3:linux-xmecho-extra-6.8.4-168     ################################# [ 75%]
   4:linux-xmecho-dev-6.8.4-168       ################################# [100%]
Building kernel drivers for NVIDIA graphics.
done.

The BORE and ECHO CPU schedulers are amazing. Reminder: do not install too many kernels, so that you do not fill your boot partition. Starting fresh is no problem. Simply boot into a Clear OS installed kernel and run ./xm-uninstall all.
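
Before installing another kernel, it may be worth checking how much room is left on the boot partition. Below is a small sketch using Python's standard `shutil.disk_usage`. It assumes the boot partition is mounted at `/boot`, which may differ on your system, and it falls back to `/` if that directory does not exist.

```python
import os
import shutil

# Assumption: the boot partition is mounted here. Adjust for your system.
BOOT_PATH = "/boot"


def free_mib(path: str) -> float:
    """Return the free space on the filesystem containing `path`, in MiB."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 * 1024)


path = BOOT_PATH if os.path.isdir(BOOT_PATH) else "/"
try:
    print(f"{path}: {free_mib(path):.0f} MiB free")
except OSError as err:
    # Reading an automounted boot partition may need elevated privileges.
    print(f"Could not read {path}: {err}")
```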

Don’t we all go through phases? :smile: Thanks @Businux for your patience. I am running CL again because of ClearMod, and CL has an input-remapper bundle!

I installed CL just to try your ECHO kernel! Thank you for your hard work @marioroy. I do not know the nitty-gritty of kernels, but the GNOME System Monitor during ‘fine-tuning a diffusion model’ gives a good idea of the differences between the vanilla kernel and ECHO!

[Screenshots: GNOME System Monitor, vanilla kernel vs. ECHO kernel]

ClearMod Release 172.

  1. Bump kernel config to Clear 6.8.6, without xz.
  2. Remove xz compression in the SPEC files.