Message boards : Number crunching : error while computing

labrat42
Joined: 13 May 10
Posts: 7
Credit: 452,806,864
RAC: 0
Message 37551 - Posted: 5 Aug 2014 | 23:49:19 UTC

I have a Linux PC (Ubuntu 14.04) with an NVIDIA GTX 480 GPU. All work units stop after a few seconds with an "error while computing" message. The same system has completed work units from other BOINC projects that use GPUs. I'm using driver 331.38. Any help will be appreciated. Thank you.

captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,323,654,288
RAC: 2,351,655
Message 37552 - Posted: 6 Aug 2014 | 2:41:51 UTC

Bill42,

Take a look at this thread and see if it addresses your errors.

http://www.gpugrid.net/forum_thread.php?id=3736&nowrap=true#36577

Hope that helps.

Nicolas_orleans
Joined: 25 Jun 14
Posts: 14
Credit: 446,219,525
RAC: 0
Message 37553 - Posted: 6 Aug 2014 | 6:55:25 UTC - in response to Message 37552.

There may also be something else going on. I had been crunching successfully with Ubuntu 14.04 + custom client 7.3.15 + driver 340.24 (installed manually after uninstalling the nvidia-304 and nvidia-331 packages) + the CUDA 6.0 application.

Since this morning, however, every newly downloaded task has errored out (20 tasks, to be precise; see http://www.gpugrid.net/results.php?hostid=177839&offset=0&show_names=1&state=5&appid= ).

Error:

<core_client_version>7.3.15</core_client_version>
<![CDATA[
<message>
process exited with code 201 (0xc9, -55)
</message>
<stderr_txt>
# Unable to initialise. Check permissions on /dev/nvidia* (err=100)

</stderr_txt>
]]>
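
Since the error message points at the permissions on /dev/nvidia*, here is a minimal Python sketch for checking them. It is only a sketch: the function name check_device_nodes is my own, and it has to be run as the same user that runs the BOINC client, otherwise the answer says nothing about what the client can see.

```python
import glob
import os
import stat


def check_device_nodes(pattern="/dev/nvidia*"):
    """Return (path, mode string, usable) for every node matching pattern.

    'usable' means the current user has both read and write access,
    which is an assumption about what the GPU application needs.
    """
    report = []
    for path in sorted(glob.glob(pattern)):
        mode = stat.filemode(os.stat(path).st_mode)
        usable = os.access(path, os.R_OK | os.W_OK)
        report.append((path, mode, usable))
    return report


if __name__ == "__main__":
    nodes = check_device_nodes()
    if not nodes:
        print("no /dev/nvidia* nodes found (the driver module may not be loaded)")
    for path, mode, usable in nodes:
        status = "ok" if usable else "NOT read/write for this user"
        print(f"{path}  {mode}  {status}")
```

An empty result would suggest the device nodes were never created, which is a different problem from the nodes existing with the wrong permissions.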


I noticed that Ubuntu 14.04 downloaded the following updates this morning. As I understand it, they should not have affected GPUGRID, since I use driver 340 and the CUDA libraries are provided by the GPUGRID app:
-nvidia-opencl-icd-331-updates (backported from 14.10)
-nvidia-libopencl1-331-updates (backported from 14.10)
-libcuda1-331-updates (backported from 14.10)

I ran sudo apt-get purge on these updated libraries. A few more tasks downloaded and immediately failed as before, and then (around 3-5 minutes after the purge) crunching started working again.

Because of that 3-5 minute delay, I cannot really say whether this was a run of 20 "bad tasks" or whether the Ubuntu update was really responsible for the errors.

Will report if I see other errors.

Nicolas_orleans
Message 37557 - Posted: 7 Aug 2014 | 8:02:22 UTC - in response to Message 37553.

No errors more than 24 hours later, so I guess the errors were caused by the library updates. (Note that nvidia-331 and nvidia-331-updates had been purged from this computer before the issue appeared, so those packages were not affected by the automatic updates.)

Which of these three libraries interferes with GPUGRID calculations and should be excluded from Ubuntu's updates? (I would suspect the third one.)
-nvidia-opencl-icd-331-updates
-nvidia-libopencl1-331-updates
-libcuda1-331-updates
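
Until it is clear which package is at fault, one option is to put all three on hold with apt-mark hold. A minimal sketch that only builds and prints the command is below; the helper name hold_command is my own, and I have not verified whether a hold has any effect on packages that are already purged, so treat that as an assumption.

```python
# The three packages named in this thread; nothing here is executed.
SUSPECT_PACKAGES = [
    "nvidia-opencl-icd-331-updates",
    "nvidia-libopencl1-331-updates",
    "libcuda1-331-updates",
]


def hold_command(packages):
    """Build the apt-mark invocation that marks the given packages as held."""
    return ["sudo", "apt-mark", "hold", *packages]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(" ".join(hold_command(SUSPECT_PACKAGES)))
```

Holding all three does not answer which one actually interferes; it only keeps them from coming back with the next automatic update.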
