Message boards : Graphics cards (GPUs) : GTX 470/480 And linux, all WU crashed.

Author Message
The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17002 - Posted: 13 May 2010 | 16:41:34 UTC

Is there any way to make GPUGRID work with Fermi cards in a Linux environment?

All the WUs I got under Linux (6.04) crashed. I tried Ubuntu/Xubuntu/Kubuntu and SUSE in their latest x64 flavours.

I'm a Linux newbie; I just want to use it to crunch faster.



Snow Crash
Send message
Joined: 4 Apr 09
Posts: 435
Credit: 436,442,699
RAC: 718,613
Level
Gln
Message 17008 - Posted: 13 May 2010 | 19:35:41 UTC - in response to Message 17002.

I'm not sure they have released the Linux version for Fermi yet; it should be any day now. To make sure you get the correct WUs for your card and OS, go into your GPUGrid preferences, make sure "Run test applications" is set to "Yes", and then select "ACEMD" and "BETA" in the "Run only the selected applications" section.
____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17010 - Posted: 13 May 2010 | 19:58:49 UTC - in response to Message 17002.
Last modified: 13 May 2010 | 20:01:50 UTC

I'm a N00B too, so there's a limit to what I can do to help, but being a N00B makes it easier to understand other N00Bs (I guess).

Here's a guide to get the latest Drivers for Linux, which aren't always available the easy way.

1) First step: reset Xorg to its default configuration. Back up the current file first, so you can recover from any mistakes.

In the Terminal:

$ sudo cp /etc/X11/xorg.conf /etc/X11/xorg.conf.original
$ sudo dpkg-reconfigure -phigh xserver-xorg

2) Installing packages and dependencies:

$ sudo apt-get install build-essential linux-headers-`uname -r`

3) Remove old version drivers:

$ sudo apt-get --purge remove $(dpkg -l | grep nvidia | awk '{print $2}')

Download the driver from the NVIDIA website. Check your system architecture first (x86, x86_64...)

4) In my case, 64-bit Ubuntu:

$ wget ftp://download.nvidia.com/XFree86/Linux-x86_64/195.36.24/NVIDIA-Linux-x86_64-195.36.24-pkg2.run -O NVIDIA-Linux-195.pkg.run

5) Now copy the installer to /usr/src and link it, as follows:

$ sudo install NVIDIA-Linux-195.pkg.run /usr/src/
$ sudo ln -s /usr/src/NVIDIA-Linux-195.pkg.run /usr/src/nvidia-driver

Kill X:
6) Time to stop X (GDM). Press "Ctrl+Alt+F1", log in and stop gdm:

$ sudo /etc/init.d/gdm stop
(If you use KDE, run this instead: $ sudo /etc/init.d/kdm stop)

Installing NVIDIA Driver
7) Installing:

$ sudo sh /usr/src/nvidia-driver

8) When it is done, restart your computer:

$ sudo reboot

After boot up, go to terminal and:

$ sudo nvidia-xconfig
(I didn't need to do this, but this step is in the original step-by-step)

BTW, I got this from here: http://forums.nvidia.com/index.php?showtopic=99513 & changed a little bit.

To search for the latest Nvidia driver, go to ftp://download.nvidia.com/XFree86/ to see what they have.
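
The architecture check before step 4 can be scripted. The following is only a sketch: the function name nvidia_dir_url is made up for illustration, and it assumes the FTP directory layout seen in the wget URL above (Linux-x86_64 for 64-bit, Linux-x86 for 32-bit). It prints the directory to browse rather than guessing a driver file name.

```shell
# Hypothetical helper: map `uname -m` output to the NVIDIA FTP directory.
nvidia_dir_url() {
    case "$1" in
        x86_64) echo "ftp://download.nvidia.com/XFree86/Linux-x86_64/" ;;
        i?86)   echo "ftp://download.nvidia.com/XFree86/Linux-x86/" ;;
        *)      echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Show the directory for this machine, or a notice on other architectures.
nvidia_dir_url "$(uname -m)" || echo "no matching directory known"
```

Pick the newest version folder inside that directory and use its .run file in the wget step.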
____________

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Message 17013 - Posted: 13 May 2010 | 22:34:32 UTC - in response to Message 17010.

I think Snow Crash has it right, there has been no Linux specific Fermi app released yet. I know I myself am checking the site many times a day waiting for it. I'm dying to see what my 470s can do vs my 295. I never crunched wu's on windows with my 295 so I have no basis for comparison until they release the Linux app.

I think the gpugrid guys use Linux in the lab, hopefully we'll see the Linux app soon, but so far they have mentioned almost nothing about it whatsoever. I think a lot of development time has been devoted to ATI lately; I really hope that doesn't mean us Linux guys will just have to wait...

The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17015 - Posted: 14 May 2010 | 0:09:01 UTC

Thank you all for the useful information. I'm going to wait a bit before crunching under Linux. I'm glad to be discovering a new OS through GPUGRID participation.

Again thx guys.

PS: Do you think openSUSE 11.2 x64 is a good distro to crunch with and to learn about the Linux world? It's the one I prefer after testing Ubuntu and its derivatives, including Mint, as well as Mandriva and Fedora. What is your advice?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3519
Credit: 938,912,957
RAC: 1,145,106
Level
Glu
Message 17026 - Posted: 14 May 2010 | 10:11:07 UTC - in response to Message 17015.

6.04 tasks are not for Fermi.

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Message 17039 - Posted: 14 May 2010 | 15:14:30 UTC - in response to Message 17015.

SuSE is a great distro, very robust and lots of development. I personally use Ubuntu, mostly because the deb package management system is fantastic.

The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17051 - Posted: 15 May 2010 | 1:12:45 UTC - in response to Message 17039.

In the end I opted for Ubuntu, mainly because installing via Wubi lets Windows 7 and Linux coexist without having to suffer the burden of GRUB 2.

My first workunits were calculated in 425 seconds or less; I'm pretty happy with the results.

Wubi is really an extraordinary idea for a setup like mine. Also, since everything here is installed on an SSD, I have not noticed any slowdown in disk access: the best of both worlds ...


(Google translation from French, sorry if it's not accurate)

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Message 17052 - Posted: 15 May 2010 | 1:30:22 UTC - in response to Message 17051.

My machine has been happily crunching WUs all afternoon, no errors, everything completed and validated.

431 secs is the slowest one I've seen. This is almost twice as fast as the few beta units I crunched while booted into Win7.

The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17055 - Posted: 15 May 2010 | 5:03:00 UTC

After a reboot I have the problem that my GPU is not found. I found a workaround with two shortcuts on my desktop: one to stop the client, the other to restart it.

No "panache" here but it does the job.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 435
Credit: 436,442,699
RAC: 718,613
Level
Gln
Message 17060 - Posted: 15 May 2010 | 10:38:22 UTC - in response to Message 17055.

I think increasing the start up delay instead of manually stopping and starting would work.

<cc_config>
<options>
<start_delay>60</start_delay>
</options>
</cc_config>
____________
Thanks - Steve

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 530
Credit: 309,959,215
RAC: 825,975
Level
Asp
Message 17061 - Posted: 15 May 2010 | 10:51:34 UTC - in response to Message 17060.

It will indeed be interesting to see if that works, but I think it failed last time it was tried.

IIRC, the 'start_delay' in cc_config only controls how long the science apps wait, after BOINC itself has already started (BOINC has to be running, because it's BOINC that reads cc_config).

The trouble described here happens if BOINC starts before the CUDA driver is ready - BOINC has to detect the CUDA card through its drivers before it can schedule CUDA science apps properly.

I think there have been some delayed-load BOINC scripts posted around the place, but I'm afraid I haven't got a link handy.
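
A delayed-load script of that kind might look like the sketch below. It is not one of the posted scripts: the function name wait_for_path is invented, and it assumes that the appearance of the driver's /dev/nvidia0 device node is a good enough sign that the driver is ready. The restart line is left as a comment because it needs root and uses the Debian/Ubuntu init script path.

```shell
# Hypothetical helper: wait until a path exists, polling once per second.
# Usage: wait_for_path PATH [TIMEOUT_SECONDS]; returns 1 on timeout.
wait_for_path() {
    path=$1
    timeout=${2:-60}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Intended use (assumption: /dev/nvidia0 appears once the driver has loaded):
# wait_for_path /dev/nvidia0 120 && sudo /etc/init.d/boinc-client restart
```

The point of polling instead of a fixed delay is that BOINC is only (re)started once the card is actually visible, however long the driver takes.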

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17063 - Posted: 15 May 2010 | 11:23:02 UTC - in response to Message 17060.
Last modified: 15 May 2010 | 11:26:08 UTC

Thanx Snow Crash! Works for me. Just don't put it on the startup programs list, because then you have to stop & start, but otherwise the delay does the trick ;-)
____________

The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17111 - Posted: 18 May 2010 | 8:46:19 UTC

It does not work for me, but a launcher that restarts the BOINC client does the job.

According to my personal tests, Xubuntu computes noticeably faster than Ubuntu or Kubuntu. Give it a try.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17122 - Posted: 18 May 2010 | 15:14:06 UTC - in response to Message 17111.

You're right, I was a bit too optimistic. It worked for me, then it stopped working. I restarted my pc several times & it seemed as if the problem was gone, but then it came back & never worked again. I guess restarting BOINC is the only way to go on Linux.

I've got lots of performance on Ubuntu & Kubuntu 10.04; the only problem is, I've never had so many failed WUs as I have now. I think it's heat though: Linux WUs seem to be far more aggressive than Windows WUs. I don't think it's Ubuntu, because even my Mint Linux 8 PCs (Ubuntu 9.10) started to have them. I don't think it's Ubuntu, the Nvidia drivers, or BOINC, but rather the GPUGRID WUs. Summer hasn't been that hot here & sometimes, with the windows open 24/7, it's even colder than it was during winter, so I think it's something recent.
____________

The Brain QC
Send message
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Level
Ala
Message 17134 - Posted: 18 May 2010 | 17:17:20 UTC
Last modified: 18 May 2010 | 17:31:04 UTC

For the information of Linux users: to create a proper launcher that stops and restarts BOINC when "gpu missing" appears in its message lines, the command line is:


sudo /etc/init.d/boinc-client restart


Remember to have the launcher run it in a terminal, or it may not work as intended.
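
The check for "gpu missing" could also be automated rather than watched for by eye. A minimal sketch, assuming the phrase shows up in the client's messages as quoted above (the exact wording may differ between BOINC versions) and that the messages are piped in on standard input; gpu_missing is a made-up name:

```shell
# Hypothetical helper: succeed if the text on stdin mentions the pattern
# (default "gpu missing", matched case-insensitively).
gpu_missing() {
    grep -qi "${1:-gpu missing}"
}

# Example with canned text standing in for the client's messages.
if printf '%s\n' "Starting BOINC client" "GPU missing" | gpu_missing; then
    # Here the launcher would run: sudo /etc/init.d/boinc-client restart
    echo "restart needed"
fi
```

On a real system the input would come from wherever your client exposes its messages, for example boinccmd --get_messages if your version provides it.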


Here are the stats I get crunching under Xubuntu:

<core_client_version>6.10.17</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.22 GHz
# Total amount of global memory: 1341325312 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 8.179 ms
# Approximate elapsed time for entire WU: 408.944 s
08:07:37 (3004): called boinc_finish

</stderr_txt>
]]>


Average 8 ms/step with a stock GTX 470 and a watercooled Core 2 Quad Q6600 running at 3.6 GHz (450 FSB x 8).

90°C GPU temp and 38°C for the hottest CPU core, at 27°C ambient.

No crashes since the new WUs arrived a couple of days ago. Everything runs fine for me here; yesterday I reached the 120 WUs/day quota after nearly 12 hours of crunching, starting at 00:00. Fermi works well on Linux x64!!!
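
As a sanity check on the log above, the per-step time and the step count reproduce the reported elapsed time. A small sketch using awk (wu_seconds is a made-up helper name):

```shell
# Hypothetical helper: elapsed seconds = ms per step * number of steps / 1000.
wu_seconds() {
    awk -v ms="$1" -v steps="$2" 'BEGIN { printf "%.2f\n", ms * steps / 1000 }'
}

# Values taken from the stderr output: 8.179 ms/step over 50000 steps.
wu_seconds 8.179 50000   # prints 408.95, in line with the 408.944 s reported
```

The small difference comes from the log printing the average step time rounded to three decimals.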

[AF>Suisse>Geneve]titoucha
Send message
Joined: 20 Nov 09
Posts: 4
Credit: 104,789,729
RAC: 0
Level
Cys
Message 17362 - Posted: 27 May 2010 | 14:51:00 UTC - in response to Message 17134.

Hello all.

In my case every task I compute on my GTX480 ends in an error.
I installed the latest NVIDIA driver and the latest stable version of BOINC, and nothing helps: I still get those silly errors.

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3519
Credit: 938,912,957
RAC: 1,145,106
Level
Glu
Message 17363 - Posted: 27 May 2010 | 15:08:59 UTC - in response to Message 17362.

Perhaps this is because you put a GTX285, a Tesla C1060 and a GTX480 all into the same system.

The GTX480 will only crunch CUDA30 tasks, and the other cards will only crunch cuda tasks.

When your GTX480 picks up a cuda task it will fail, then get another task and fail, and so on until it randomly gets a CUDA30 task. The same applies for the Tesla and GTX285, except they will fail cuda30 tasks and crunch cuda tasks.

Put the Fermi into a different system.
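
The rule above can be written down as a tiny lookup. This only illustrates the reasoning and is not how the GPUGRID scheduler is implemented; it assumes the split follows CUDA compute capability (2.x for Fermi cards such as the GTX480, 1.x for the GTX285 and Tesla C1060), and app_for_cc is a made-up name:

```shell
# Hypothetical helper: which kind of task a card can crunch, by the major
# part of its CUDA compute capability (e.g. "2.0" -> major 2).
app_for_cc() {
    case "${1%%.*}" in
        1) echo "cuda" ;;     # pre-Fermi (GTX285, Tesla C1060)
        2) echo "cuda30" ;;   # Fermi (GTX470/480)
        *) echo "unknown compute capability: $1" >&2; return 1 ;;
    esac
}

for card in "GTX285:1.3" "Tesla C1060:1.3" "GTX480:2.0"; do
    echo "${card%%:*} -> $(app_for_cc "${card##*:}")"
done
```

With all three cards in one host the task types get mixed across the cards, which is why separating the Fermi card is the simple fix.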

[AF>Suisse>Geneve]titoucha
Send message
Joined: 20 Nov 09
Posts: 4
Credit: 104,789,729
RAC: 0
Level
Cys
Message 17367 - Posted: 27 May 2010 | 16:48:26 UTC - in response to Message 17363.

That was it, thank you

roundup
Send message
Joined: 11 May 10
Posts: 31
Credit: 63,539,795
RAC: 0
Level
Thr
Message 17895 - Posted: 5 Jul 2010 | 22:39:13 UTC

After the disappointing experience with the poor new Win 7 drivers for Fermi (257.xx, which is even worse than the 197.xx driver under Win 7) I tried to run GPUGrid on Ubuntu 10.04.
Results for my GTX470:
7 out of 8 work units crashed
1 work unit took 36,791.16 seconds for a 6803 credits task: (http://www.gpugrid.net/workunit.php?wuid=1665624). This took place on an i7-920 at stock speed equipped with one stock-clocked GTX470, with BOINC 6.10.17 set to use 7 cores for CPU crunching in order to leave one core for the GPU.
I noticed that the task ran with a CPU usage of about 5% - rather low for a Fermi.

The unit that is running on the machine right now is also extremely slow: After 37 min the progress bar shows 5% - even a GTS 250 under Win 7 is faster. With this speed the GTX470 generates about 16-17k Credits/day :-D
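
The credits/day figure can be cross-checked against the finished task above: 6803 credits in 36,791.16 seconds scales to roughly 16k per day. A quick sketch with awk (credits_per_day is a made-up helper name):

```shell
# Hypothetical helper: scale one task's credit to a 24-hour (86400 s) day.
credits_per_day() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.0f\n", c * 86400 / s }'
}

credits_per_day 6803 36791.16   # prints 15976
```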

I installed the latest driver 256.35 exactly as described in this posting: http://www.gpugrid.net/forum_thread.php?id=2150&nowrap=true#17010

Any ideas what went wrong?

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17896 - Posted: 5 Jul 2010 | 23:04:18 UTC - in response to Message 17895.

I can only guess, because I don't have a GTX470 or an i7-920, and I don't know what mobo you're using. I've had many failed WUs on many different Linux distros, but 64-bit Mint Linux, with either Gnome or KDE, works well for me. I have 4 PCs, and if Gnome isn't good KDE works & vice versa. Both 195.36.31 & 195.36.24 work well for me & they also support the GTX470. I wrote a tutorial here: http://liveonc.weebly.com/index.html that makes it easy to copy & paste into the terminal.

To manually adjust the fan speed (to prevent overheating & reduce the chance that WUs fail due to high temps), install Nvclock via Synaptic & run in a terminal: $ sudo nvclock -f -c 1 -F 99 (and $ sudo nvclock -f -c 2 -F 99 if you have a second GPU, 3 if you have three GPUs & so on).
____________

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17898 - Posted: 6 Jul 2010 | 0:29:55 UTC - in response to Message 17895.
Last modified: 6 Jul 2010 | 0:30:47 UTC

BTW roundup, did you actually follow the steps you pointed to? Because they don't work (for me at least, & I even wrote about that in another thread) on Ubuntu 10.04 or Mint Linux 9; there's a different way of upgrading the Nvidia driver on versions AFTER Ubuntu 9.10 / Mint Linux 8. I didn't bother to find out, because I was getting way too many errors on Ubuntu 10.04 & Mint Linux 9 no matter what driver I tried.

I can't edit a post once 1 hour has passed, nor can I delete it.
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1862
Credit: 629,356
RAC: 0
Level
Gly
Message 17900 - Posted: 6 Jul 2010 | 7:09:07 UTC - in response to Message 17898.

Try the GPUGRID usb key.

gdf

roundup
Send message
Joined: 11 May 10
Posts: 31
Credit: 63,539,795
RAC: 0
Level
Thr
Message 17901 - Posted: 6 Jul 2010 | 7:17:31 UTC - in response to Message 17898.

Thanks, Liveonc.
I followed your steps. The only command that changed with Ubuntu 10.04 is 'sudo /etc/init.d/gdm stop', which now reads 'sudo stop gdm'.
I will give it another try with another distro. Linux is still a lottery, which is what prevents it from having broad success in the desktop world.

roundup
Send message
Joined: 11 May 10
Posts: 31
Credit: 63,539,795
RAC: 0
Level
Thr
Message 17902 - Posted: 6 Jul 2010 | 11:53:27 UTC - in response to Message 17901.
Last modified: 6 Jul 2010 | 12:47:24 UTC

I found the bug (actually my mistake).
During installation some questions have to be answered on the console. When you are asked whether the driver should install the NVIDIA 32-BIT COMPATIBILITY LIBRARIES, you MUST NOT answer 'yes'.
Then it works with Ubuntu 10.04 LTS :-).


To manually adjust the fan speed (to prevent overheating & increase the chances that WU's don't fail due to high temps). Install via Synaptic, Nvclock & write in terminal $sudo nvclock -f -c 1 -F 99 & ($sudo nvclock -f -c 2 -F 99 if you have a second GPU, 3 if you have three GPU's & so on).

This setting has absolutely no effect. I watched the temps in NVIDIA X SERVER SETTINGS. Fan control type is 'variable' and the fan operates between 51% and 57%. Temps are between 78°C and 84°C.
All nvclock settings lead to the following message:
It seems your card isn't officialy supported in NVClock yet.
The reason can be that your card is too new.If you want to try it anyhow [DANGEROUS], use the option -f to force the setting(s).
NVClock will then assume your card is a 'normal', it might be dangerous on other cards.
Also please email the author the pci_id of the card for further investigation.
[Get that value using the -i option].

Even the option -i leads to this message.
Nvclock version is 0.8 beta4. Is there a newer one available (maybe a release that allows for some oc'ing)?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3519
Credit: 938,912,957
RAC: 1,145,106
Level
Glu
Message 17903 - Posted: 6 Jul 2010 | 13:44:30 UTC - in response to Message 17902.
Last modified: 6 Jul 2010 | 13:56:27 UTC

I expect NVClock does not work for Fermi cards, as it is based on the previous generation of architectures. Up to GT200 series only.

http://www.linuxhardware.org/nvclock/

Does the Linux x86_64 256.35 Driver not have a built in OverClocking feature (Something equivalent to NVidia Control Panel)?

- I would check myself, but it would mean installing Linux, the driver and not crunching for a day.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17904 - Posted: 6 Jul 2010 | 14:26:11 UTC - in response to Message 17903.

Hi skgiven,

nobody says you have to stick to Windows or spend an entire day on Linux. Just use Wubi if you want an easy way to quickly install within Windows & remove also with Windows quickly. I'd do it too but Ubuntu 10.04 doesn't like my PC's.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3519
Credit: 938,912,957
RAC: 1,145,106
Level
Glu
Message 17905 - Posted: 6 Jul 2010 | 15:19:09 UTC - in response to Message 17904.
Last modified: 6 Jul 2010 | 15:29:39 UTC

Hi Liveonc,
Can you alter the shader speeds via NVidia X Server Settings on any of your Linux systems?

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 17906 - Posted: 6 Jul 2010 | 16:58:15 UTC - in response to Message 17905.
Last modified: 6 Jul 2010 | 16:58:57 UTC

Hi Skgiven,

I haven't tried; I prefer to flash my GPU instead.

But you're supposed to open xorg.conf:

$ sudo gedit /etc/X11/xorg.conf
(use kate instead of gedit for KDE, or mousepad for Xfce)

and add:

Option "Coolbits" "1"

to: Section "Device"

BTW, I haven't tried myself so I'm not sure. nvclock isn't even needed for this, but I need nvclock to set my fan to 99%

with coolbits you just run in terminal

$nvidia-settings --assign="GPU3DClockFreqs=676,1457"

if, for example, you OC to what I use on my GTX260s

Again I'm not sure, I haven't tried, nor do i need to.
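
To check whether the option is already in place before editing, something like the sketch below could be used. has_coolbits is a made-up helper; it only greps for an uncommented Coolbits line and does not verify that the line sits inside Section "Device".

```shell
# Hypothetical helper: succeed if FILE has an uncommented Option "Coolbits" line.
has_coolbits() {
    grep -Eiq '^[[:space:]]*Option[[:space:]]+"Coolbits"' "$1"
}

conf=/etc/X11/xorg.conf
if [ -r "$conf" ] && has_coolbits "$conf"; then
    echo "Coolbits already set in $conf"
else
    echo "Coolbits not found in $conf"
fi
```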
____________
