Message boards : Graphics cards (GPUs) : GPU without X windows

Ed1934158
Joined: 15 Mar 09
Posts: 32
Credit: 3,313,639
RAC: 0
Message 12888 - Posted: 29 Sep 2009 | 8:03:11 UTC

I'd like to run gpu grid on my server. Do I require X windows to run it? Will nvidia drivers work fine?

JackOfAll
Joined: 7 Jun 09
Posts: 40
Credit: 24,377,383
RAC: 0
Message 12890 - Posted: 29 Sep 2009 | 8:39:09 UTC - in response to Message 12888.

I'd like to run gpu grid on my server. Do I require X windows to run it? Will nvidia drivers work fine?


No, you don't have to run X. You just need to load the nvidia kernel module and create the device nodes before starting the client.


#!/bin/bash

# Load the NVIDIA kernel module; give up if that fails.
modprobe nvidia

if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    N3D=$(/sbin/lspci | grep -i NVIDIA | grep -c "3D controller")
    NVGA=$(/sbin/lspci | grep -i NVIDIA | grep -c "VGA compatible controller")

    # One device node per GPU, /dev/nvidia0 upwards, character major 195.
    N=$((N3D + NVGA - 1))
    for i in $(seq 0 "$N"); do
        mknod -m 660 "/dev/nvidia$i" c 195 "$i"
    done

    # Control device node.
    mknod -m 660 /dev/nvidiactl c 195 255
else
    exit 1
fi

____________
Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2

Ed1934158
Message 12896 - Posted: 29 Sep 2009 | 10:41:19 UTC - in response to Message 12890.

Thank you!

Ed1934158
Message 12955 - Posted: 1 Oct 2009 | 15:36:54 UTC - in response to Message 12896.

Ok, I've tried the script, named it nvidia on ubuntu-server, and this is the output:

mknod: `/dev/nvidia0': File exists
mknod: `/dev/nvidiactl': File exists

But after running boinc I get:
01-Oct-2009 17:36:11 [---] No CUDA-capable NVIDIA GPUs found
01-Oct-2009 17:36:11 [---] No coprocessors

JackOfAll
Message 12957 - Posted: 1 Oct 2009 | 15:45:36 UTC - in response to Message 12955.
Last modified: 1 Oct 2009 | 16:09:33 UTC

Ok, I've tried the script, named it nvidia on ubuntu-server, and this is the output:
mknod: `/dev/nvidia0': File exists
mknod: `/dev/nvidiactl': File exists


OK, so module is loaded and the /dev nodes already exist.
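The "File exists" messages are harmless: mknod just refuses to recreate a node that is already there. A variant that skips existing nodes could look roughly like the sketch below. The helper names count_nvidia_controllers and ensure_node are made up for illustration, and the major number 195 is taken from the script earlier in the thread.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of any NVIDIA tool.

# Read lspci-style text on stdin and print how many NVIDIA
# "3D controller" / "VGA compatible controller" lines it contains.
count_nvidia_controllers() {
    grep -i 'nvidia' | grep -c -E '3D controller|VGA compatible controller'
}

# Create a character device node only if it is missing.
# Arguments: path, major, minor.
ensure_node() {
    if [ -c "$1" ]; then
        echo "exists: $1"
    else
        mknod -m 660 "$1" c "$2" "$3"
    fi
}

# Typical use, as root, after 'modprobe nvidia' has succeeded:
#   n=$(/sbin/lspci | count_nvidia_controllers)
#   for i in $(seq 0 $((n - 1))); do ensure_node "/dev/nvidia$i" 195 "$i"; done
#   ensure_node /dev/nvidiactl 195 255
```

Run this way, a second invocation reports the existing nodes instead of printing mknod errors.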


But after running boinc I get:
01-Oct-2009 17:36:11 [---] No CUDA-capable NVIDIA GPUs found
01-Oct-2009 17:36:11 [---] No coprocessors



Have you got the CUDA toolkit installed? Is libcudart.so available for runtime binaries to find?


[clivem@c7p6t:~]$ ldconfig -p | grep cudart
libcudart.so.2 (libc6,x86-64) => /usr/lib64/nvidia/libcudart.so.2
libcudart.so.2 (libc6) => /usr/lib/nvidia/libcudart.so.2
libcudart.so (libc6,x86-64) => /usr/lib64/nvidia/libcudart.so
libcudart.so (libc6) => /usr/lib/nvidia/libcudart.so
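If you want the same check in a script, something along these lines should do. It is a minimal sketch that only inspects text in the format 'ldconfig -p' prints; has_cudart is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: has_cudart is a hypothetical helper name.

# Read 'ldconfig -p' style output on stdin and report whether any
# libcudart entry is present. Returns non-zero if none is found.
has_cudart() {
    if grep -q 'libcudart\.so'; then
        echo "libcudart found in linker cache"
    else
        echo "libcudart NOT found in linker cache"
        return 1
    fi
}

# Typical use:
#   ldconfig -p | has_cudart
```

This only covers the linker cache. A library that is reachable some other way, for example through LD_LIBRARY_PATH, will not show up here.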


EDIT: If you think that's not the problem, grab deviceQuery-2.2.tgz, extract it, make sure the 2 files are executable (chmod them if not) and run 'deviceQuery'. This uses libcudart.so, so it will prove the lib from the toolkit install is available. If that fails, run 'deviceQueryDrv', which doesn't use the toolkit, just the driver libs. If that succeeds but the former has failed, you need to make sure libcudart.so is installed and in the LD path.

Assuming that a CUDA-capable device is reported by 'deviceQuery', BOINC should find it OK.

Ed1934158
Message 12960 - Posted: 1 Oct 2009 | 16:20:09 UTC - in response to Message 12957.

No, I haven't installed cudatoolkit. When I was using Ubuntu with a GUI I only needed to install the nvidia drivers, so I did the same with Ubuntu server.
What is the difference?

JackOfAll
Message 12962 - Posted: 1 Oct 2009 | 16:25:38 UTC - in response to Message 12960.
Last modified: 1 Oct 2009 | 16:30:56 UTC

No, I haven't installed cudatoolkit. I needed to install just nvidia drivers when I was using ubuntu with gui, so I did the same with ubuntu server.
What is the difference?


Erm, I don't know a great deal about the Ubuntu side of things, but I don't understand how that could have been working without the libcudart.so, which is not a part of the driver release. You have to get it from the toolkit.

My recollection from the BOINC code is that it dlopens libcudart.so. If it can't find it then you're going to be told that CUDA is not available or no CUDA devices found. (They've been messing with the messages recently, so what you get depends on the version of BOINC you have installed.)

EDIT: Assuming you're running the 185.xx driver release (CUDA 2.2) you can d/l the toolkit, cudatoolkit_2.2_linux_64_ubuntu8.10.run
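
To see whether a dlopen-style lookup has any chance of finding the library through LD_LIBRARY_PATH, you can walk the directories yourself. The sketch below covers only that one source. The real loader also consults the ldconfig cache and its default directories, and the function name find_in_path_list is made up for illustration.

```shell
#!/bin/bash
# Sketch only: find_in_path_list is a hypothetical helper name.

# find_in_path_list NAME LIST
# LIST is a colon-separated directory list (LD_LIBRARY_PATH format).
# Prints the first DIR/NAME that exists; returns non-zero if none does.
find_in_path_list() {
    local name=$1 list=$2 dir
    local IFS=:
    for dir in $list; do
        [ -n "$dir" ] || continue
        if [ -e "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Typical use:
#   find_in_path_list libcudart.so "$LD_LIBRARY_PATH" || echo "not on LD_LIBRARY_PATH"
```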

Ed1934158
Message 12964 - Posted: 1 Oct 2009 | 16:36:15 UTC - in response to Message 12963.
Last modified: 1 Oct 2009 | 16:38:04 UTC

Ok, here it is. On a computer where Ubuntu GUI/BOINC/CUDA works, I downloaded the drivers:
NVIDIA-Linux-x86_64-190.32-pkg2.run
and BOINC/CUDA works. In the BOINC directory I have the file libcudart.so, and the test you provided gives:

CUDA Device Query (Driver API) statically linked version
There is 1 device supporting CUDA

Device 0: "GeForce 8600 GT"
CUDA Capability Major revision number: 1
CUDA Capability Minor revision number: 1
Total amount of global memory: 536150016 bytes
Number of multiprocessors: 4
Number of cores: 32
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.19 GHz
Concurrent copy and execution: Yes
Run time limit on kernels: Yes
Integrated: No
Support host page-locked memory mapping: No
Compute mode: Default (multiple host threads can use this device simultaneously)

Test PASSED

On ubuntu server I also installed the same drivers and it gives:
/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
Cuda driver error 3 in file 'deviceQueryDrv.cpp' in line 76.


So that is what is strange to me: the same procedure gives a different result. When I used the Ubuntu desktop version on that same computer everything worked fine, but now that I have installed the server version I get this problem.

So you think that installing the cuda toolkit will solve the problem?

Ed1934158
Message 12965 - Posted: 1 Oct 2009 | 16:58:20 UTC - in response to Message 12964.

I installed the cudatoolkit, but the same problem arises:

/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
Cuda driver error 3 in file 'deviceQueryDrv.cpp' in line 76

JackOfAll
Message 12974 - Posted: 2 Oct 2009 | 7:34:09 UTC - in response to Message 12965.

I installed the cudatoolkit the same problem arises:
/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
Cuda driver error 3 in file 'deviceQueryDrv.cpp' in line 76


OK, what version of the nVidia driver did you install on the server?

What is the output from 'ldd ./deviceQueryDrv'?

JackOfAll
Message 12976 - Posted: 2 Oct 2009 | 9:51:57 UTC - in response to Message 12965.


/deviceQueryDrv
CUDA Device Query (Driver API) statically linked version
Cuda driver error 3 in file 'deviceQueryDrv.cpp' in line 76


More.....

Error 3 == CUDA_ERROR_NOT_INITIALIZED. So the nvidia.ko is not loaded?

Make sure it is loaded - 'lsmod | grep nvidia'

Make sure the /dev nodes exist - 'ls -l /dev/nv*'
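
Both checks can be rolled into a couple of small helpers if you want to script them. This is a rough sketch. The function names module_loaded and missing_nodes are hypothetical, and the node paths in the usage lines are the ones the script earlier in the thread creates for a single GPU.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.

# module_loaded NAME
# Reads lsmod-style output on stdin; succeeds if NAME appears in the
# first column (the module name column).
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit found ? 0 : 1 }'
}

# missing_nodes PATH...
# Prints every given path that is not an existing character device.
missing_nodes() {
    local p
    for p in "$@"; do
        [ -c "$p" ] || echo "$p"
    done
}

# Typical use:
#   lsmod | module_loaded nvidia || echo "nvidia module is not loaded"
#   missing_nodes /dev/nvidia0 /dev/nvidiactl
```

If missing_nodes prints nothing and the module is loaded, the problem is somewhere other than the kernel module or the /dev nodes.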
