New NVIDIA driver automation

@marioroy could you please share the link to the public archived version of the repo?

The repo was taken offline due to many regressions outside my control; e.g. regressions in Firefox, GNOME, VA-API NVDEC backend driver, NVIDIA proprietary 525 driver, and the missing gears icon on the login screen.

I completed my 3rd major revision for the repo. There is one more test I want to validate before re-releasing to the public.

Highlights:

  1. Resolves Firefox crashing since CL 37700.
  2. Resolves the missing gears icon on the login screen.
  3. GNOME on Xorg and GNOME on Wayland working.
  4. Support for the NVIDIA 525 display driver; particularly HW decode acceleration in Firefox. If you have an NVIDIA Optimus laptop, choose the 520 driver or an earlier release.
  5. Updates to the complementary Chromium-Freeworld installer-updater script; it fetches and installs the libffi 3.4.x dependency to /usr/local/lib64, and can now run the current Chromium-Freeworld release.
  6. Updates to install-driver to accommodate GNOME expecting NVIDIA files to reside in standard locations. The /opt/nvidia path now contains mainly lib32 and lib64. This removed the need for many hacks for making things work. Accordingly, I updated swupd’s picky list in /etc/swupd/config.
  7. Updates to the post-OS update trigger script as well; fewer hacks.

In summary, the experience has been getting worse and worse with each OS release for some time now. Eventually, there’s a limit to the number of hacks needed to make things work.

So, I invited my imaginary friend named Grace. We all sat together in a room; Firefox, GNOME, Mesa, NVDEC driver, NVIDIA display driver, and not to forget CL OS. What is it with you guys? Can’t you guys get along?

Obviously, everybody got along with Grace at their side. :slight_smile:

:grin:


Moments prior to re-releasing the “nvidia-driver-on-clear-linux” update, I thought to do a cursory check against the latest Clear Linux 37810 OS. Unfortunately, the NVIDIA modules do not build on 37810. I reverted to 37800.

Edit: The tool chain is fixed in CL 37850; CL 37840 was not tested. I am waiting for CL 37850 or a later release to become current (not staging) before re-releasing the NVIDIA driver automation, so that folks do not experience the installation failing due to the tool chain regression in CL 37810.

Thanks for doing it! Looking forward to a new automation script. I have my clean CL installed to test it.

cheers.

Completed 3rd re-release.

Recommendations:

  1. Choose the LTS kernel during the OS installation (advanced tab). Edit: The Native kernel is now supported.
  2. Not sure which display driver? Choose v520 over v525. I found v520 to be more stable than v525. One reason is that v525 may be unstable for Microsoft Edge (stalling while changing video quality on YouTube, requiring page refresh).

The list of Browser installer-updater automation has grown. Can you spot the new addition? :slight_smile:

  1. Brave Browser
  2. Chromium-Freeworld
  3. Google Chrome
  4. Microsoft Edge
  5. Vivaldi

The difficult decision is which Clear Linux OS release? Choose CL 37860 or later for NVIDIA graphics. Or rather, do not install CL 37810-37850.
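For anyone scripting around this, here is a minimal sketch of a pre-install check for the release range mentioned above. The function names are made up for illustration, and it assumes that VERSION_ID in /usr/lib/os-release holds the Clear Linux build number; it is not part of the repository.

```bash
# Sketch only: flag the CL builds this post says to avoid for NVIDIA
# graphics (37810 through 37850). Function names are hypothetical.

# cl_release_ok BUILD -> exit 0 if BUILD is outside 37810..37850, else 1.
cl_release_ok() {
    local build=$1
    if [ "$build" -ge 37810 ] && [ "$build" -le 37850 ]; then
        return 1
    fi
    return 0
}

# Assumption: VERSION_ID in /usr/lib/os-release is the CL build number.
current_cl_build() {
    ( . /usr/lib/os-release && printf '%s\n' "$VERSION_ID" )
}

# Example use:
#   build=$(current_cl_build)
#   cl_release_ok "$build" || echo "CL $build is in the 37810-37850 range; update first"
```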

Running Linux 6.1? See the Epilogue section in the README file. Recent Intel CPUs may require adding the ibt=off kernel parameter.

Have a prior “nvidia-driver-on-clear-linux” installation? The 3rd makeover is different; particularly the display driver installation. Also, the launch scripts for the various browsers have changed. Copy the launch scripts, et cetera, anew as described in the documentation.

GNOME on Xorg or Wayland? It depends on whether you want to run Chrome or a similar browser. Device-level font scaling in Chrome doesn’t work on Wayland. This is problematic for HiDPI monitors.

Hi,

I’ve tested the new script. CL 37800, 520 driver. After a reboot I’ve got a black screen with a mouse pointer. Subsequent reboots - only a black screen. Not sure where to look for the problem. I tried to fix CL with a forced repair - did not help. To be honest, this does not help make CL my distro of choice, at least for my hardware, a Dell XPS 9550, which runs Ubuntu 22.04 without any issues.

Best regards and thanks for the effort!
-W

You did not mention which kernel, LTS or native. You may need the ibt=off parameter. CL 37860 has been released with updated X11.

I’m feeling the same way. I’m unable to test NVIDIA Optimus hardware. Maybe the issue is with the nvidia-drm-outputclass.conf file under /etc/X11/xorg.conf.d. Perhaps it’s a bad idea for the install-driver script to overwrite nvidia-drm-outputclass.conf. That section can be commented out. I understand if you have given up on CL.

Since I don’t know which kernel you had installed: the README file contains info on disabling Indirect Branch Tracking.

sudo mkdir -p /etc/kernel/cmdline.d
sudo tee /etc/kernel/cmdline.d/disable-ibt.conf >/dev/null <<'EOF'
ibt=off
EOF
sudo clr-boot-manager update
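After rebooting, it may be worth confirming that the parameter actually reached the running kernel. A small sketch, with a hypothetical helper name, that looks for ibt=off as a whole word in a kernel command line string:

```bash
# has_ibt_off CMDLINE -> exit 0 if CMDLINE contains ibt=off as a
# separate, space-delimited parameter. Helper name is hypothetical.
has_ibt_off() {
    case " $1 " in
        *" ibt=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use after a reboot:
#   if has_ibt_off "$(cat /proc/cmdline)"; then
#       echo "ibt=off is active"
#   else
#       echo "ibt=off is not on the kernel command line"
#   fi
```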

There isn’t much more that I can do with the repository. I will add an uninstaller script which will uninstall the NVIDIA driver, remove configuration added to /etc including the trigger script, and restore CL lib-opengl.

The repository was updated to support Kernel 6.0 and Kernel 6.1. I’m not sure when you obtained the repo. Due to the re-release (I did it twice yesterday), it requires re-cloning from scratch; e.g. git pull may not work. So remove the directory and re-clone.

@Watergap, I saw that you had success here at one point. That rules out the nvidia-drm-outputclass.conf configuration.

Installing KDE on top of GNOME may not work. I’m not sure what steps folks are using to install KDE successfully. That is a CL question.

That leaves the ibt=off kernel parameter if running Linux 6.1.

I have installed 37800 Native. As to the KDE:
sudo swupd bundle-add desktop-kde

It worked fine with the Plasma (X11) config. It was installed before running the NVIDIA script.
Maybe KDE is the issue.

My interest in CL was ignited by reading a performance review, especially regarding Java. Running a java2d benchmark test, I found little to no difference between CL and Ubuntu.

I may give another try with a different version of CL and GNOME.

Thanks again.

Ah, I thought that NVIDIA drivers were not an issue anymore, that’s a shame.
I hope that either NVIDIA makes their drivers open-source someday, so this issue with proprietary stuff won’t happen again, or another company such as AMD or Intel overtakes NVIDIA in terms of graphical performance, for Blender for example… or CL just somehow manages to get along with NVIDIA’s proprietary stuff.

I did a clean installation of CL 37860 + latest NVIDIA 525, Native kernel + default GNOME. It works without any issues and without any tweaking of the automation script. I suspect some glitch exists when a different desktop is used. I tried the KDE and Xfce desktops - got a black screen.

cheers.

I ran the following swupd commands on top of a Clear Linux GNOME installation with success. Uninstall the desktop-autostart bundle to remove the gdm-autostart package from the system. The desktop-kde-apps bundle is not required.

sudo swupd bundle-remove desktop-autostart
sudo swupd bundle-add desktop-kde
sudo swupd bundle-add desktop-kde-apps   # optional

Verified configuration: Kernel 6.1.1, NVIDIA 520 proprietary driver, and selected “Plasma (X11)” from the Sessions drop-down menu. “Plasma (Wayland)” works too.

The following Exec entries in desktop files work in GNOME, but not in KDE. :frowning:

$ grep "^Exec" ~/.local/share/applications/chromium-freeworld.desktop 
Exec=/bin/bash -c "\$HOME/bin/run-chromium-freeworld %U"
Exec=/bin/bash -c "\$HOME/bin/run-chromium-freeworld"
Exec=/bin/bash -c "\$HOME/bin/run-chromium-freeworld --incognito"

The following works in GNOME and KDE. :slight_smile:

$ grep "^Exec" .local/share/applications/chromium-freeworld.desktop 
Exec=/bin/bash -c "exec $HOME/bin/run-chromium-freeworld %U"
Exec=/bin/bash -c "exec $HOME/bin/run-chromium-freeworld"
Exec=/bin/bash -c "exec $HOME/bin/run-chromium-freeworld --incognito"

I updated the desktop files in the nvidia-driver-on-clear-linux repository.
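If you already copied the older desktop files and would rather patch them in place than copy them again, something along these lines should do it. This is a sketch, not a script from the repository; the function name is hypothetical and it relies on GNU sed's -i option.

```bash
# fix_exec_lines FILE: rewrite Exec lines of the form
#   Exec=/bin/bash -c "\$HOME/bin/...
# into
#   Exec=/bin/bash -c "exec $HOME/bin/...
# Lines already in the second form are left untouched.
fix_exec_lines() {
    sed -i 's|^\(Exec=/bin/bash -c "\)\\\$HOME/|\1exec $HOME/|' "$1"
}

# Example use:
#   fix_exec_lines ~/.local/share/applications/chromium-freeworld.desktop
#   grep "^Exec" ~/.local/share/applications/chromium-freeworld.desktop
```

The single quotes keep the shell from expanding $HOME, so the desktop file still contains the literal text $HOME for bash to expand at launch time.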

Some tips:

Running Chrome or a derivative browser? Preferably run on Xorg rather than Wayland. The VDPAU-backend VA-API driver does not work in Wayland. Another reason is that device-level font scaling does not work in Chrome on Wayland, which is a problem for HiDPI monitors.

Running Microsoft Edge? Install the NVIDIA 520 driver versus 525. For my use case, the 520 driver is more stable.

Desktop stutters or tears while moving windows? Enable “Force Full Composition Pipeline”, documented in the top-level README file.

Update:

Installing a kernel manually, or running multiple kernels (LTS and native)? Run check-kernel-dkms manually, including after OS updates. It checks the non-running kernel(s) and refreshes the NVIDIA modules if needed. The script installs missing dkms bundles and runs dkms autoinstall for each kernel on the system.

An issue ticket (feature request) was created for uninstalling a kernel.

Well, mpv exits with this after install + HWAccel:

[vo/gpu] VT_GETMODE failed: Inappropriate ioctl for device
[vo/gpu/opengl] Failed to set up VT switcher. Terminal switching will be unavailable.
*** MESA_GLSL_CACHE_DISABLE is deprecated; use MESA_SHADER_CACHE_DISABLE instead ***
[vo/gpu] Failed to commit ModeSetting atomic request (-13)
[vo/gpu/opengl] Failed to set CRTC for connector 95: Permission denied
*** MESA_GLSL_CACHE_DISABLE is deprecated; use MESA_SHADER_CACHE_DISABLE instead ***
[vo/gpu-next] Can't handle VT release - signal already used
[vo/gpu-next/opengl] Failed to set up VT switcher. Terminal switching will be unavailable.
*** MESA_GLSL_CACHE_DISABLE is deprecated; use MESA_SHADER_CACHE_DISABLE instead ***
[vo/gpu-next] Failed to commit ModeSetting atomic request (-13)
[vo/gpu-next/opengl] Failed to set CRTC for connector 95: Permission denied
*** MESA_GLSL_CACHE_DISABLE is deprecated; use MESA_SHADER_CACHE_DISABLE instead ***
Error opening/initializing the selected video_out (--vo) device.
Video: no video
[1]    42545 segmentation fault (core dumped)

And Blender fails to execute the compilation command, and denoising is missing.

CUDA version 12.0 detected, build may succeed but only CUDA 10.1 to 11.4 are officially supported.
Compiling CUDA kernel ...
"/usr/local/cuda/bin/nvcc" -arch=sm_50 --cubin "/usr/share/blender/scripts/addons/cycles/source/kernel/device/cuda/kernel.cu" -o "$HOME/.cache/cycles/kernels/cycles_kernel_sm_50_1BCBBF7A721073462FF3133D38966C99.cubin" -m64 --ptxas-options="-v" --use_fast_math -DNVCC -I"/usr/share/blender/scripts/addons/cycles/source"
/usr/share/blender/scripts/addons/cycles/source/util/transform_inverse.h(15): error: identifier "ssef" is undefined

/usr/share/blender/scripts/addons/cycles/source/util/transform_inverse.h(15): error: identifier "__m128" is undefined

/usr/share/blender/scripts/addons/cycles/source/util/transform_inverse.h(16): error: identifier "ssef" is undefined
<...>

But other than that, amazing job.

Optimus, X11, native - 6.2.6 - 1290, default settings - 525.89.02


After installing 11.4 for Blender,

nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
In file included from /usr/local/cuda/bin/../targets/x86_64-linux/include/cuda_runtime.h:83,
                 from <command-line>:
/usr/local/cuda/bin/../targets/x86_64-linux/include/crt/host_config.h:139:2: error: #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
  139 | #error -- unsupported GNU version! gcc versions later than 11 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
      |  ^~~~~
Failed to execute compilation command, see console for details.

oh well :slight_smile: some day I’ll learn


Installing Blender LTS from the official site instead of swupd fixed the issue completely! Swapping kernels, drivers and gcc’s only added problems :slight_smile:
Now renders are much faster ^~^
Outstanding job on the scripts.
:heart:

The repo README does suggest the 520 driver is preferred for Optimus configurations. That also means installing the LTS kernel, as the 520 driver modules will not build on Linux 6.2.x. Moreover, Blender reports that CUDA 12.0 is not officially supported.

Fortunately, there’s no need to re-install the OS from scratch.

  1. Install kernel-lts2021 (sudo swupd bundle-add kernel-lts2021 kernel-lts2021-dkms).
  2. Run (bash check-kernel-dkms) in the repo folder to be sure NVIDIA (current version) kernel modules have been built for the LTS kernel.
  3. Reboot the system and select the lts2021 kernel.
  4. Uninstall the 6.2 kernel (sudo swupd bundle-remove kernel-native-current kernel-native kernel-native-dkms linux-dev).
  5. Reboot the system, adding "3 " (with a space) to the kernel arguments.
  6. Install the NVIDIA 520 driver (bash install-driver 520).
  7. Uninstall CUDA 12.0 (bash uninstall-cuda).
  8. Install CUDA 11.8 (bash install-cuda). That will install CUDA 11.8 for the 520 driver.

Hopefully, Blender will work with CUDA 11.8. Otherwise, you will need to download CUDA 11.4 from the web and pass the path to install-cuda.

bash uninstall-cuda
bash install-cuda ~/Downloads/cuda_11.4.4_470.82.01_linux.run
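To see where an installed toolkit falls relative to the range in Blender's message quoted earlier in the thread ("only CUDA 10.1 to 11.4 are officially supported"), a rough check could look like this. Both function names are hypothetical, the version comparison relies on GNU sort -V, and the parsing assumes the usual "release X.Y," wording in the nvcc --version output.

```bash
# parse_nvcc_release: read `nvcc --version` output on stdin and print
# the release number, e.g. 11.4. Assumes a "release X.Y," substring.
parse_nvcc_release() {
    sed -n 's/.*release \([0-9.]*\),.*/\1/p'
}

# version_in_range VERSION MIN MAX -> exit 0 if MIN <= VERSION <= MAX,
# comparing dotted version strings with GNU sort -V.
version_in_range() {
    local v=$1 min=$2 max=$3
    [ "$(printf '%s\n' "$min" "$v" | sort -V | head -n1)" = "$min" ] &&
    [ "$(printf '%s\n' "$v" "$max" | sort -V | tail -n1)" = "$max" ]
}

# Example use:
#   rel=$(/usr/local/cuda/bin/nvcc --version | parse_nvcc_release)
#   version_in_range "$rel" 10.1 11.4 \
#       && echo "CUDA $rel is inside Blender's stated range" \
#       || echo "CUDA $rel is outside Blender's stated range"
```

By this check CUDA 11.8 is outside the stated range as well, which is why 11.4 is the fallback if Blender does not work with 11.8.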

Meant to let you know, you are a huge help, and I highly appreciate the amount of time you put into maintaining this for all of us.

Will try this as I am having nightmares with PyTorch + TF + CUDA. Running on native X___x

@marioroy nice to see you in the forum. Just noticed, NVIDIA has released a new driver.

https://download.nvidia.com/XFree86/Linux-x86_64/530.41.03/

I have not tried it yet.

Update:
Installed it using your amazing and clever repo!!

./install-driver ~/Downloads/NVIDIA-Linux-x86_64-530.41.03.run

There were no errors during installation.



The errors could be coming from /usr/bin/mpv, which is reported to not work.

sudo swupd bundle-remove mpv

The 3rd-party codecs-cuda bundle has working mpv for NVIDIA graphics. See the instructions in the HWAccel folder for installing the codecs-cuda bundle. The mpv command is found in /opt/3rd-party/bin/ which is in your path.
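To confirm which mpv the shell will actually pick up after the switch, a quick check like the following may help. The helper name is hypothetical; /opt/3rd-party/bin is the location mentioned above.

```bash
# mpv_dir: print the directory of the first mpv found on PATH,
# or return non-zero if no mpv is found. Helper name is hypothetical.
mpv_dir() {
    local path
    path=$(command -v mpv) || return 1
    dirname "$path"
}

# Example use:
#   if [ "$(mpv_dir)" = "/opt/3rd-party/bin" ]; then
#       echo "using the codecs-cuda mpv"
#   else
#       echo "mpv resolves elsewhere (or is missing): $(command -v mpv)"
#   fi
```

If it still resolves to /usr/bin, the mpv bundle is probably still installed or /opt/3rd-party/bin comes later in PATH.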