Bash scripts to automate installation of NVIDIA proprietary driver

Available here

Table of Contents

  1. Summary
  2. Detailed description for each script
    1. pre_install.bash
    2. install.bash
    3. post_install.bash
    4. uninstall.bash
    5. pre_update.bash
    6. update.bash
    7. post_update.bash
    8. cleaning.bash
  3. Known Issues

Summary

Based on the Clear Linux tutorial, I wrote several bash scripts that automate the installation, update, and uninstallation of the NVIDIA proprietary driver.

The scripts do not need the execute permission. To run one, execute bash ./<SCRIPT-NAME>.bash WITHOUT sudo.
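One way a script can enforce the no-sudo rule is to check the effective user ID at startup. The sketch below is only an assumption about how such a guard could look, not the scripts’ actual code; refuse_root is a hypothetical name, and the UID is passed as a parameter so the check can be tried without being root.

```shell
# Hypothetical guard: fail when the effective UID is 0 (root).
# With no argument, the UID of the calling user is used.
refuse_root() {
    local uid="${1:-$(id -u)}"
    if [ "$uid" -eq 0 ]; then
        echo "Please run this script as a regular user, without sudo." >&2
        return 1
    fi
    return 0
}

# Possible use at the top of a script:
#   refuse_root || exit 1
```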

To install the driver, the user needs to execute pre_install.bash, install.bash, and post_install.bash, in that order, with a reboot after the first script.

Similarly, to update the driver, the user needs to execute pre_update.bash, update.bash, and post_update.bash, in the same manner.

To uninstall the driver, just execute uninstall.bash. This script is also helpful when an installation or update has failed, because it restores your system to its initial state.

Note that the uninstall script invokes the official installer to remove the files of the NVIDIA proprietary driver, which should be sufficient. In case something is left behind, the cleaning.bash script is provided, but normally you shouldn’t need to execute it.

Detailed description for each script

1. pre_install.bash

  • On Intel systems, disables IOMMU in the kernel.
  • Creates a systemd unit that fixes problems with the libGL library.
  • Installs the kernel-native-dkms or kernel-lts-dkms bundle, depending on your kernel type, if it is not already installed.
  • Disables the nouveau driver by blacklisting it in /etc/modprobe.d/disable-nouveau.conf.
  • Reminds the user to reboot and then run install.bash to proceed with the installation.
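As a rough illustration of two of these steps, the sketch below picks the DKMS bundle from the kernel release string and writes the nouveau blacklist file. It assumes that Clear Linux kernel releases end in ".native" or ".lts", and the blacklist contents follow the Clear Linux tutorial; the function names are hypothetical and the real script may differ.

```shell
# Map a kernel release string (as printed by `uname -r`) to a DKMS bundle name.
# Assumption: release strings end in ".native" or ".lts*".
dkms_bundle_for() {
    case "$1" in
        *.native) echo "kernel-native-dkms" ;;
        *.lts*)   echo "kernel-lts-dkms" ;;
        *)        echo "unsupported kernel: $1" >&2; return 1 ;;
    esac
}

# Write the nouveau blacklist into a modprobe.d directory
# (default: /etc/modprobe.d, which requires root).
write_nouveau_blacklist() {
    local dir="${1:-/etc/modprobe.d}"
    mkdir -p "$dir" &&
    printf 'blacklist nouveau\noptions nouveau modeset=0\n' > "$dir/disable-nouveau.conf"
}

# Possible use:
#   sudo swupd bundle-add "$(dkms_bundle_for "$(uname -r)")"
```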

Note: After the reboot, the GUI desktop environment may not work. In that case, press Ctrl+Alt+F2 to enter tty2, from which you can log in and proceed to the next step.

2. install.bash

  • Locates the NVIDIA driver installer, NVIDIA-Linux-x86_64-<VERSION>.run, in the current directory.
    • If multiple installers are found, chooses the latest.
    • If no installer is found, tries to download the latest NVIDIA driver for Linux x86-64 systems.
  • Modifies the dynamic linker configuration file, /etc/ld.so.conf, to include the NVIDIA driver libraries, which are installed under /opt/nvidia/lib and /opt/nvidia/lib32.
  • Modifies the Xorg configuration file, /etc/X11/xorg.conf.d/nvidia-files-opt.conf, so that Xorg searches for additional modules under /opt/nvidia/lib64/xorg/modules.

  • Installs the driver with the following options:

    --utility-prefix=/opt/nvidia
    --opengl-prefix=/opt/nvidia
    --compat32-prefix=/opt/nvidia
    --compat32-libdir=lib32
    --x-prefix=/opt/nvidia
    --x-module-path=/opt/nvidia/lib64/xorg/modules
    --x-library-path=/opt/nvidia/lib64
    --x-sysconfig-path=/etc/X11/xorg.conf.d
    --documentation-prefix=/opt/nvidia
    --application-profile-path=/etc/nvidia/nvidia-application-profiles-rc.d
    --no-precompiled-interface
    --no-nvidia-modprobe
    --no-distro-scripts
    --force-libglx-indirect
    --glvnd-egl-config-path=/etc/glvnd/egl_vendor.d
    --egl-external-platform-config-path=/etc/egl/egl_external_platform.d
    --dkms
    --silent
    
    • Note: In a previous version of this script, I removed --no-nvidia-modprobe because nvidia-modprobe is needed for the CUDA toolkit to work properly. I added the option back since the official CUDA tutorial has solved this problem.
  • Before the actual installation, the user is reminded to run post_install.bash afterwards and must press a key to continue.
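The "choose the latest installer" step could be done with a version-aware sort. This is a minimal sketch, assuming GNU coreutils (`sort -V`) is available; latest_installer is a hypothetical name and not necessarily how install.bash does it.

```shell
# Print the newest NVIDIA-Linux-x86_64-<VERSION>.run in a directory (default: .).
# `sort -V` compares version numbers numerically, so 430.9 sorts before 430.26.
# Returns non-zero when no installer is found.
latest_installer() {
    local dir="${1:-.}"
    local newest
    newest="$(find "$dir" -maxdepth 1 -name 'NVIDIA-Linux-x86_64-*.run' \
                  -exec basename {} \; | sort -V | tail -n 1)"
    [ -n "$newest" ] && echo "$newest"
}

# Possible use:
#   installer="$(latest_installer)" || echo "no installer found, need to download one"
```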

3. post_install.bash

  • Checks that the NVIDIA kernel modules are correctly loaded on the system. If none are listed, the installation was not successful.
  • Asks the user whether to install a desktop file for nvidia-settings.
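The module check boils down to looking for nvidia entries in the lsmod output. A small sketch, with a hypothetical function name, that reads the lsmod output from standard input:

```shell
# Succeeds when the `lsmod` output on stdin lists at least one module
# whose name starts with "nvidia".
nvidia_modules_loaded() {
    grep -q '^nvidia'
}

# Possible use:
#   lsmod | nvidia_modules_loaded || echo "NVIDIA modules are not loaded" >&2
```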

4. uninstall.bash

  • Removes files created for IOMMU and OpenGL workarounds.
  • Re-enables the nouveau driver by removing /etc/modprobe.d/disable-nouveau.conf.
  • Restores the Xorg configuration.
  • Restores the dynamic linker configuration.
  • Removes the desktop file for nvidia-settings if it was created.
  • Uninstalls the NVIDIA proprietary driver via the official uninstaller, /opt/nvidia/bin/nvidia-uninstall.
  • Reminds the user to reboot.

5. pre_update.bash

  • Verifies that the NVIDIA proprietary driver is currently installed and displays the current driver version.
  • Retrieves the latest driver version and checks whether an update is needed.
  • If there is an update and the installer has not been downloaded yet, downloads the installer.
  • Temporarily sets the boot target to multi-user.target.
  • Reminds the user to reboot and execute update.bash.
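The "is an update needed" decision can be sketched as a version comparison. This is an illustration only, assuming GNU `sort -V`; update_needed is a hypothetical name, and how the real script obtains the two version strings is not shown here.

```shell
# Succeeds when $2 (latest version) is strictly newer than $1 (installed version).
update_needed() {
    local installed="$1"
    local latest="$2"
    [ "$installed" != "$latest" ] &&
    [ "$(printf '%s\n%s\n' "$installed" "$latest" | sort -V | tail -n 1)" = "$latest" ]
}

# Possible use (the value of $latest must come from elsewhere):
#   installed="$(modinfo -F version nvidia)"
#   if update_needed "$installed" "$latest"; then
#       sudo systemctl set-default multi-user.target
#   fi
```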

6. update.bash

  • Updates the driver with the same options as install.bash.
  • Restores the boot target to graphical.target.

7. post_update.bash

  • Like post_install.bash, checks that the NVIDIA kernel modules are loaded.
  • Updates flatpak runtimes.

8. cleaning.bash

  • Sometimes NVIDIA’s official uninstaller leaves certain files behind, for example when the installer did not succeed. This script removes those files:
    • /opt/nvidia/
    • /usr/src/nvidia*/, the directory that holds the source files of the NVIDIA DKMS module
    • /usr/bin/nvidia-modprobe, which is installed if the installer was not invoked with the --no-nvidia-modprobe flag
    • /usr/lib/libGL.so.1, which somehow exists even though we specified a library prefix
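Before deleting anything, it can help to see which of these paths actually exist. The following dry-run sketch only lists them and removes nothing; list_leftovers and its optional root prefix are hypothetical and exist so the check can be tried on a scratch directory.

```shell
# Print the leftover paths that exist below an optional root prefix
# (default: the real filesystem root). Nothing is deleted.
list_leftovers() {
    local root="${1:-}"
    local path
    for path in "$root"/opt/nvidia "$root"/usr/src/nvidia* \
                "$root"/usr/bin/nvidia-modprobe "$root"/usr/lib/libGL.so.1; do
        # An unmatched glob stays literal, so test for existence before printing.
        if [ -e "$path" ]; then
            echo "$path"
        fi
    done
}

# Possible use:
#   list_leftovers
```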

Known Issues

  • It’s been reported that gnome-control-center will not work due to an incorrect libGL [1], but this is fixed by the workarounds provided in post_install.bash.
  • If there is an integrated GPU on an Intel chipset, the user has to disable the Intel VGA driver; otherwise the following error message appears [2].

img

  • It’s been reported that compilation of the NVIDIA DKMS module may fail due to a gcc error [3]; this is a known structural issue of the NVIDIA driver [4]. When this happens, the official uninstaller may leave certain files behind. This is taken care of by cleaning.bash.

Footnotes

[1] GitHub Issue #791 - 2060 rtx: Black screen after login live usb

[2] Clear Linux Forums - Bash scripts to automate installation of NVIDIA proprietary driver

[3] GitHub Issue #974 - Error during compilation of NVIDIA dkms module

[4] GitHub Issue #1725 - not able to compile nvidia dkms module


This file was last updated on 2020/03/10

I wrote my own version of this and tested it a few hours ago. Everything seemed to be working during the install, but GNOME Shell crashed reproducibly. So I backed it out until I have a troubleshooting plan.

My GNOME Shell crashed a few times, and now it’s working.

Maybe it’s because I didn’t run swupd verify --fix --bundles=lib-opengl, which is suggested in the tutorials.

I did that - it still crashed. I ran out of troubleshooting energy and re-installed the laptop. I’ll take another run at it Wednesday.

Don’t forget the --quick switch; otherwise the behavior of swupd verify with the --bundles parameter is not intuitive.

I verified that ‘swupd verify --fix’ doesn’t break the system.

If you’d like to share your code, I can take a look to see what might have caused the problem.

I had problems not being able to open the gnome-control-center after installing the nvidia drivers. I had an issue with loading opengl libraries. It appeared to be a conflict with the loader, as described in this stack overflow post:

I followed the advice from that post and removed the conflicting libraries that weren’t the nvidia ones (which I was pretty nervous about doing). I have a feeling that running the swupd verify --fix will restore them.

I don’t know what the “right” solution here is in terms of either having both exist and making sure the right library is loaded in the right situation, but my solution at least worked to get me able to open gnome-control-center again.

I had that same swrat steam error message installing Nvidia on the Ubuntu 19.04 distro. The solution I found there was just to run the installer a second time and everything worked out alright. I used to see this problem back in the day too.

Why not blacklist anything else?
blacklist nvidiafb
blacklist nv

I can now see the NVIDIA kernel modules under lsmod. Yet I still get that white screen with the error message telling me to press Enter. I ran uninstall.bash, but now I notice that Firefox won’t let me watch any videos on Dailymotion; it says something about a MIME type problem. I suspected video acceleration, either the VDPAU or the ffmpeg libraries, but no: I’m lacking Pepper Flash.

Edit: My laptop of 5 years recently died. I suspect the systemic failures to be the cause of my problems with the drivers.

Hi!

I also have problems installing the NVIDIA driver. I tried the manual way from the Clear Linux tutorial and also the automatic shell scripts. Both gave me the same result: a white screen with the message:

image

So I guess the modules are not loaded correctly. I cannot remember the lsmod | grep ^nvidia output.

I’m on a ThinkPad P72 with a Quadro P5200.

Cheers

@specter and @Sumdewd

I’ve seen this white screen before, after I finished installing the NVIDIA driver but still had my monitor connected to the integrated GPU on the motherboard.
Once I connected the display to my NVIDIA GPU, the white screen disappeared.
@specter, does your laptop come with both an NVIDIA GPU and a motherboard GPU? If so, you might need to disable the latter first.

@Jeff
In a previous edition of the tutorial, there was an installation flag that tells the installer not to create any OpenGL files.
I tried once letting the installer create those OpenGL files, and some applications crashed.
My scripts follow the current edition of the Clear Linux documentation, with the one exception that I allowed the installer to create nvidia-modprobe, which is a dependency of CUDA. But that should not affect the OpenGL library.
I also think it might be swupd verify --fix that restored them. Was the GitHub issue on the libGL error created by you?


I removed the libGL.so.* under /usr/lib64 directory and the gnome-control-center works now.

So somewhere it is loading the wrong libGL. I see many files include RPATH/RUNPATH to /usr/lib64 (mainly RUNPATH due to --enable-new-dtags). I have experienced the same issue which was resolved by stopping it adding -Wl,-rpath,/usr/lib64 to the linker.

Is someone able to run an strace when gnome-control-center fails to load with that error? I would then like to see the contents of /tmp/gcc-strace.log.

strace -f -o /tmp/gcc-strace.log gnome-control-center
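To narrow such a log down to the libGL files that were actually opened, something like the following sketch could be used. It assumes the usual strace line layout, where failed calls end in " = -1 ..."; libgl_opened is a hypothetical helper name.

```shell
# Print every libGL path that was opened successfully according to an strace log.
libgl_opened() {
    grep -E 'open(at)?\(' "$1" \
        | grep 'libGL\.so' \
        | grep -v ' = -1 ' \
        | grep -oE '"[^"]*libGL\.so[^"]*"' \
        | tr -d '"' \
        | sort -u
}

# Possible use:
#   libgl_opened /tmp/gcc-strace.log
```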

This is related to https://github.com/clearlinux/distribution/issues/791
I already deleted the mesa libGL. Would you post your findings in the GitHub issue?

Thanks, doct0rHu, for taking the time. The P72 has the following GPUs:

lspci -v | grep VGA
0:02.0 VGA compatible controller: Intel Corporation Device 3e94 (prog-if 00 [VGA controller])
01:00.0 VGA compatible controller: NVIDIA Corporation GP104GLM [Quadro P5200 Mobile] (rev a1) (prog-if 00 [VGA controller])

The intel gpu uses:

lspci -vs 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Device 3e94 (prog-if 00 [VGA controller])
Subsystem: Lenovo Device 2269
Flags: bus master, fast devsel, latency 0, IRQ 130
Memory at 404a000000 (64-bit, non-prefetchable) [size=16M]
Memory at 80000000 (64-bit, prefetchable) [size=256M]
I/O ports at 3000 [size=64]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: [40] Vendor Specific Information: Len=0c <?>
Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [100] Process Address Space ID (PASID)
Capabilities: [200] Address Translation Service (ATS)
Capabilities: [300] Page Request Interface (PRI)
Kernel driver in use: i915

So I tried to blacklist i915, but that doesn’t seem to work. The NVIDIA GPU was using the nvidia driver and not nouveau, but at startup there was the white error screen.

Interestingly, when I open a terminal (Ctrl+F2) and run startx, the GNOME fallback DM starts.

I cannot disable the Intel GPU because this is a corporate laptop and I don’t have access to the BIOS.

I hope you have some further ideas!

Cheers


I think this might be helpful. The author of that post failed to blacklist the i915 module, changed the GRUB configuration instead, and got it working.
I’m not sure whether this can be applied to Clear Linux, so please back up first.

Thanks. I did not create the GitHub issue, but I will follow its resolution. This is a pretty stupid problem to have. Install NVidia driver…no longer have the right driver linked for simple things like the gnome control center. I can’t imagine the pain of people who actually need openGL to work properly.

Hi doct0rHu,

Thanks for the link! I think this points in the right direction. So, the problem seems to be that the kernel module i915 is compiled into the kernel, and so it cannot be blacklisted like a loadable module. This makes absolute sense for an Intel-centric distribution:

cat /lib/modules/$(uname -r)/modules.builtin

kernel/drivers/gpu/drm/i915/i915.ko
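Instead of reading the whole file, the built-in check can be narrowed to a single module. A small sketch with a hypothetical function name; the second argument defaults to the modules.builtin file of the running kernel:

```shell
# Succeeds when module $1 is listed in a modules.builtin file
# (default: the one belonging to the running kernel).
is_builtin_module() {
    local name="$1"
    local list="${2:-/lib/modules/$(uname -r)/modules.builtin}"
    grep -q "/${name}\.ko\$" "$list"
}

# Possible use:
#   is_builtin_module i915 && echo "i915 is built into the kernel"
```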

The problem now is how to prevent the module from being loaded. This needs to be done in systemd-boot and not in GRUB. I mounted the EFI partition and edited Clear-linux-native-5.0.18-767.conf but couldn’t find a way to prevent i915 from loading.

Maybe someone has a hint for a kernel option in systemd-boot that prevents the module from loading?

Cheers

You can also blacklist modules from the bootloader.

Simply add module_blacklist=i915 to your bootloader’s kernel line.

So to add the new parameter, do:

sudo mkdir -p /etc/kernel/cmdline.d/
echo "module_blacklist=i915" | sudo tee /etc/kernel/cmdline.d/blacklist.conf
sudo clr-boot-manager update

and reboot.
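After the reboot, one way to confirm that the parameter reached the kernel is to look for it in /proc/cmdline. A minimal sketch; has_cmdline_param is a hypothetical helper, and the file argument exists only so the check can be tried on a sample file.

```shell
# Succeeds when the kernel command line (default: /proc/cmdline)
# contains exactly the given parameter.
has_cmdline_param() {
    local param="$1"
    local file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF "$param"
}

# Possible use:
#   has_cmdline_param module_blacklist=i915 && echo "blacklist is active"
```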


@miguelinux
This is great. We shall incorporate this into the NVIDIA driver tutorial if this is confirmed to work.