Fresh install of CL 38700 is really snappy. Thank you CL devs

Indy · March 31, 2023, 12:20pm

I wiped out my old and boring installation and replaced it with the shiny and new Clear Linux desktop 38700. It’s like a breath of fresh air for my PC. And guess what? The NVIDIA driver 530.41 (thanks to @marioroy ) works like a charm too (Just tweak the display settings and watch the instant magic happen!). My PC is now CLEARly the best in the neighbourhood.

And don’t get me started on the speed of CL development. They are on fire! They keep adding new features and improvements every day. I can’t keep up with them. Thank you Clear Linux Development team for making my PC dreams come true.

(Used Bing to write this to get @Businux to reply fast!)

marioroy · March 31, 2023, 1:55pm

@Indy, I am curious how your system performs on the mandelbrot-python auto-zooming demonstration. Do not bother if you run Python by some means other than Miniconda or Anaconda. Miniconda is how I run Python on Clear Linux.

Note: The CUDA and OpenCL demonstrations require double-precision HW on the GPU. This means Intel graphics will not work due to lacking double-precision support. However, the CPU demonstrations should work.

python3 mandel_stream.py --config app.ini 720p
python3 mandel_cuda.py --config app.ini 720p --fma 1
python3 mandel_ocl.py --config app.ini 720p --fma 1

Press the letter “m” or “h” to render with 3x3 or 5x5 super-sampling, respectively.
Then, press the letter “c” to render ~ 183 levels. There are several auto zooms { z, x, c, v, b, g, and t } to various locations. Color schemes (press F1 … F7).
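
For readers wondering what 3x3 or 5x5 super-sampling means here, the following is a minimal pure-Python sketch, not the code from the repo. It assumes the usual approach: each pixel is split into an n x n grid of sample points, and the escape-time results are averaged. The function names, default view bounds, and `max_iter` value are all hypothetical, and the real demonstrations may average colors rather than iteration counts.

```python
def escape_count(cx, cy, max_iter=100):
    """Return the iteration at which z = z*z + c leaves |z| > 2, else max_iter."""
    zx = zy = 0.0
    for i in range(max_iter):
        # One step of z = z*z + c, written out with real arithmetic.
        zx, zy = zx * zx - zy * zy + cx, 2.0 * zx * zy + cy
        if zx * zx + zy * zy > 4.0:
            return i
    return max_iter


def supersampled_value(px, py, width, height, n=3, max_iter=100,
                       x_min=-2.5, x_max=1.0, y_min=-1.0, y_max=1.0):
    """Average the escape counts of an n x n grid of samples inside one pixel."""
    dx = (x_max - x_min) / width    # width of one pixel in the complex plane
    dy = (y_max - y_min) / height   # height of one pixel in the complex plane
    total = 0
    for sy in range(n):
        for sx in range(n):
            # Sample at the center of each sub-cell of the pixel.
            cx = x_min + (px + (sx + 0.5) / n) * dx
            cy = y_min + (py + (sy + 0.5) / n) * dy
            total += escape_count(cx, cy, max_iter)
    return total / (n * n)
```

With n=3 each pixel costs 9 escape-time evaluations and with n=5 it costs 25, which is consistent with the 5x5 runs in this thread taking roughly two and a half times as long as the 3x3 runs.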

NVIDIA RTX 3070 GPU

19.138 seconds (m) 3x3, (c)
47.707 seconds (h) 5x5, (c)

AMD 3970X 32 cores; extra option --num-threads=32

22.761 seconds (m) 3x3, (c)
55.777 seconds (h) 5x5, (c)

AMD 3970X 64 threads - 1 (omitting --num-threads option)

14.112 seconds (m) 3x3, (c)
33.716 seconds (h) 5x5, (c)

marioroy · March 31, 2023, 2:11pm

Clear Linux is a lot of fun for me, personally. This platform is where I learned Numba, OpenCL, and CUDA using Python. Miniconda is a blessing for folks who use Python and do not want to worry about messing with the OS-level Python.

I created the NVIDIA on Clear Linux repo to make installing the NVIDIA driver and CUDA Toolkit easier. The Python repo is my sandbox for learning Python: pycuda, pyopencl, Numba, and pygame. Numba is awesome.

arjan · March 31, 2023, 2:19pm

curious why the OS level python is inferior to the miniconda one… the data I saw on phoronix some time ago is that the clear linux python is at least… quite a bit faster

marioroy · March 31, 2023, 2:56pm

Installing llvmlite, a dependency of Numba, is the reason for using Miniconda. It’s a pain getting llvmlite to build on CL against llvm11.

NumPy 1.22 or newer emits warnings when some library or dependency was built with fast-math enabled. Unfortunately, that is another reason I use Miniconda, with NumPy 1.21.5 installed.

The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
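
For context, these warnings typically appear when the process is running with flush-to-zero behavior, which a library built with fast-math can switch on when it is loaded. The sketch below is only an illustration using the standard library, not NumPy's actual check, and the helper names are hypothetical. It tests whether subnormal values survive in the current process.

```python
import math
import struct
import sys

# Smallest positive subnormal values defined by IEEE 754.
FLOAT64_SMALLEST_SUBNORMAL = math.ldexp(1.0, -1074)  # 2**-1074
FLOAT32_SMALLEST_SUBNORMAL = math.ldexp(1.0, -149)   # 2**-149


def subnormals_preserved():
    """Return True if halving the smallest normal float64 stays nonzero."""
    # sys.float_info.min is the smallest *normal* double; half of it is
    # subnormal and becomes 0.0 when flush-to-zero is active.
    return sys.float_info.min / 2.0 > 0.0


def float32_roundtrip(value):
    """Store a Python float as a 32-bit float and read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]
```

On a system without flush-to-zero, `subnormals_preserved()` returns True. A False result would match the situation the NumPy warnings describe.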

Finally, Miniconda makes it easy to get the modules needed without having to build modules that have many dependencies.

Please know that I tried the OS-level Python first.

Indy · March 31, 2023, 3:03pm

Thanks to your clear instructions, as a Non-IT Professional, I got it up and running. Amazing bit of coding.

Intel 13900K was sweating but the 4090 didn’t break a sweat.

13900K CPU
m=23.045s
h=55.588s

4090 CUDA
m=5.349s
h=12.846s

4090 OpenCL
m=5.605s
h=13.257s

marioroy · March 31, 2023, 3:25pm

That is amazing performance for the Intel processor. It is similar to running on an AMD Threadripper 3970X box using 32 “real” cores (factoring out the logical threads). Wow!

The RTX 4090 double-precision performance is roughly 3.6 to 3.7 times that of the RTX 3070 GPU in this demonstration (5.349 s vs. 19.138 s for 3x3, and 12.846 s vs. 47.707 s for 5x5). Thank you for sharing.
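
The ratio can be checked from the timings posted in this thread. This is a small sketch of the arithmetic, assuming the RTX 3070 figures were also measured with the CUDA demonstration. The dictionary layout and function name are only for illustration.

```python
# Timings in seconds, copied from the posts above.
timings = {
    "3x3 (m)": {"rtx3070": 19.138, "rtx4090": 5.349},
    "5x5 (h)": {"rtx3070": 47.707, "rtx4090": 12.846},
}


def speedup(slow_seconds, fast_seconds):
    """Return how many times faster the second run is than the first."""
    return slow_seconds / fast_seconds


ratios = {
    label: round(speedup(t["rtx3070"], t["rtx4090"]), 2)
    for label, t in timings.items()
}

for label, ratio in ratios.items():
    print(f"{label}: RTX 4090 is {ratio}x faster than RTX 3070")
```

Both ratios land between 3.5 and 3.8 for this workload.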

python3 mandel_cuda.py --config app.ini 1080p --fma 1

Press the letter (u) 7x7 or (i) 9x9 super-sampling.
Press the letter (c).

That’s more work for the GPU.

Indy · March 31, 2023, 3:26pm

I am sure @arjan will sort this issue out so your Mandelbrot program can run without conda.

BTW thanks @arjan for coming here.

Indy · March 31, 2023, 3:38pm

48.519s. Fans started spinning

Indy · March 31, 2023, 3:42pm

How come maths is this beautiful??

marioroy · March 31, 2023, 3:43pm

Ditto. Thank you, @arjan. I tried the OS-level Python, but the frustration of getting various Python modules to build got the better of me.

Numba is amazing, taking Python code to C-level performance via directives (akin to OpenMP). I tried all the variations including my own parallelization using the Python multiprocessing module; queues and sockets.

mandel_queue.py   - Run parallel using a queue for IPC
mandel_stream.py  - Run parallel using a socket for IPC
mandel_parfor.py  - Run parallel using Numba's parfor loop
mandel_ocl.py     - Run on the CPU or GPU using PyOpenCL
mandel_cuda.py    - Run on the GPU using PyCUDA
mandel_kernel.py  - Run on the GPU using cuda.jit

Businux · March 31, 2023, 3:49pm

I see no spelling mistakes

Amen.

“a bit faster” is quite an understatement…

marioroy · March 31, 2023, 3:52pm

Seeing the CUDA demonstration utilize near 100% on the powerful RTX 4090 GPU made my day. So that is working like one would expect.

The examples utilize double-precision capabilities on the CPU or GPU.

Indy · March 31, 2023, 3:53pm

I am glad to help.

Indy · March 31, 2023, 4:04pm

I am sure you both must have seen this development. My phone’s news app selected this for me to read while on the go.

Indy · March 31, 2023, 4:07pm

no4k

Businux · March 31, 2023, 4:10pm

marioroy · March 31, 2023, 4:15pm

Regarding 2160p missing, you can edit app.ini and add a section. Scroll near the bottom of the file. Or you can pass --width N and --height N arguments instead of --config app.ini.
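
As a rough idea of what such a section could look like, here is a sketch that parses an INI file with Python's configparser. The section name `2160p` and the keys `width` and `height` are assumptions made for illustration. Check the existing sections in app.ini for the real key names before copying this.

```python
import configparser

# Hypothetical excerpt; the real app.ini may use different keys.
SAMPLE_INI = """
[1080p]
width = 1920
height = 1080

[2160p]
width = 3840
height = 2160
"""


def load_size(ini_text, section):
    """Return (width, height) as integers from the named section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser.getint(section, "width"), parser.getint(section, "height")


print(load_size(SAMPLE_INI, "2160p"))
```

The alternative mentioned above avoids editing the file at all: pass --width 3840 and --height 2160 on the command line in place of --config app.ini.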

Businux · March 31, 2023, 4:20pm

marioroy · March 31, 2023, 11:29pm

Which display settings did you tweak? I had meant to ask earlier.

Again, thank you for sharing. The RTX 4090 (48.5 secs) is a beast. It is 3.8 times faster than the RTX 3070 (184.0 secs) computing the 1080p (u) 7x7 super-sampling (c) demonstration.
