
Message boards : Graphics cards (GPUs) : BSODs with Nvidia driver while running GPUGRID

Author Message
aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8345 - Posted: 10 Apr 2009 | 15:32:28 UTC

Hi all,

I can't seem to run GPUGrid WUs without my Nvidia driver causing a BSOD within 30 minutes or so. This is with my PC just sitting otherwise idle on the desktop. I've checked my temperatures and even tried maxing out all the GPU fans in Rivatuner, and the GPUs are not running very hot (~60C under load).

Any ideas on what else might be causing this?

My Setup:

Q6600
8GB DDR2
2x GTX 260(SLI)
1x 8800GT
Vista 64 Ultimate
Nvidia Forceware 182.06
Boinc 6.6.20

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 8346 - Posted: 10 Apr 2009 | 16:43:58 UTC - in response to Message 8345.

Have you tested the newer 182.50 Nvidia driver?

Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 135,911,881
RAC: 51
Level
Cys
Message 8350 - Posted: 10 Apr 2009 | 18:03:26 UTC - in response to Message 8345.

And is SLI turned OFF?
____________

pixelicious.at - my little photoblog

aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8351 - Posted: 10 Apr 2009 | 18:14:54 UTC - in response to Message 8350.

I haven't tried disabling SLI yet. Is it explicitly not supported?

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 47,698,744
RAC: 107,365
Level
Val
Message 8364 - Posted: 11 Apr 2009 | 0:22:45 UTC - in response to Message 8351.

SLI will officially prevent BOINC from seeing more than one GPU. If you want all of your video card cores to be processing BOINC tasks, SLI must be disabled.

____________
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8420 - Posted: 14 Apr 2009 | 17:30:58 UTC - in response to Message 8364.

Ok, I have disabled SLI, but am still having problems:

- I can run CUDA WUs on Seti@home without any issues if GPUGRID is suspended.
- Running GPUGRID (with or without SETI running) causes an Nvidia driver BSOD within 30 minutes or so.

UBT - Ben
Joined: 12 Aug 08
Posts: 8
Credit: 137,219
RAC: 0
Message 8425 - Posted: 14 Apr 2009 | 18:54:42 UTC - in response to Message 8420.

Just a theory, but I assume your power supply unit is up to the job? It may be that trying to feed 3 GPUs at the same time is having an impact. With that type of configuration I would have thought an 800W+ unit would be needed.
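
To put rough numbers on that theory, here is a minimal sketch of a power budget check. Every wattage in it is a placeholder assumption for illustration, not a measured or vendor-quoted value, and the 0.8 safety margin is only an assumed rule of thumb. Substitute figures from the spec sheets of the actual parts before drawing any conclusion.

```python
# Rough PSU headroom estimate.
# ASSUMPTION: all wattages below are illustrative placeholders,
# not measured values for the hardware in this thread.
ASSUMED_DRAW_WATTS = {
    "GTX 260 #1": 180,
    "GTX 260 #2": 180,
    "8800GT": 105,
    "Q6600": 105,
    "board, RAM, drives, fans": 80,
}


def psu_headroom(psu_watts, draws, safety_margin=0.8):
    """Return (total_draw, usable_capacity, headroom) in watts.

    safety_margin is the fraction of the PSU rating treated as usable
    for sustained load (0.8 is an assumed rule of thumb).
    """
    total = sum(draws.values())
    usable = psu_watts * safety_margin
    return total, usable, usable - total


# 800 W is the PSU size suggested in the post above.
total, usable, headroom = psu_headroom(800, ASSUMED_DRAW_WATTS)
print(f"estimated draw {total} W, usable {usable:.0f} W, headroom {headroom:.0f} W")
```

A negative headroom with these placeholder numbers would only mean the budget is tight under the stated assumptions; it says nothing about how the load is split across rails.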

Joe
Joined: 1 Sep 08
Posts: 37
Credit: 5,864,088
RAC: 0
Level
Ser
Message 8432 - Posted: 14 Apr 2009 | 19:27:58 UTC - in response to Message 8425.

Please install the newest driver from Nvidia, disable the Multi-GPU mode and reboot. Now try again. If there are still errors, try only one graphics card; perhaps one has an error. Maybe it is OK for the short Seti WUs, but for GPUGrid...

aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8441 - Posted: 14 Apr 2009 | 22:14:53 UTC - in response to Message 8425.

Just a theory, but I assume your power supply unit is up to the job? It may be that trying to feed 3 GPUs at the same time is having an impact. With that type of configuration I would have thought an 800W+ unit would be needed.


It's a 750-watt Silverstone Zeus unit.
http://www.newegg.com/Product/Product.aspx?Item=N82E16817256006

I'll try the newer Forceware driver when I get a chance. Are there any other benchmarks or stress tests that I can use that will load all my GPUs at once?

aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8445 - Posted: 15 Apr 2009 | 1:34:33 UTC - in response to Message 8441.

Alternately, is there any way to limit which GPU(s) GPUGRID can run WUs on?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 8472 - Posted: 15 Apr 2009 | 20:40:16 UTC - in response to Message 8445.

Alternately, is there any way to limit which GPU(s) GPUGRID can run WUs on?


Not yet.

Regarding benchmarks: running a recent 3D Mark on the 2 GTX 260's, possibly at a higher resolution than the default, should result in a load higher than GPU-Grid on all 3 cards. Furmark is the choice if you want to maximize heat & power draw.. not sure how it handles multiple cards, though.

Distribution of the power over the different rails may also be an issue, which might not be caught by running 3D Mark on 2 GPUs in SLI. How did you connect your GPU power plugs?

MrS
____________
Scanning for our furry friends since Jan 2002

aka1nas
Joined: 9 Apr 09
Posts: 7
Credit: 612,049
RAC: 0
Level
Gly
Message 8474 - Posted: 15 Apr 2009 | 21:47:13 UTC - in response to Message 8472.

The PSU is quad-rail, and has a pair of dual-headed PCI-E power plugs that I am using for the GTX 260s. I am using a dual 4-pin molex --> PCI-E power adapter for the 8800GT. I'd have to check if anything is on the same rail as the latter adapter.

JockMacMad TSBT
Joined: 26 Jan 09
Posts: 31
Credit: 3,877,912
RAC: 0
Level
Ala
Message 8481 - Posted: 16 Apr 2009 | 0:13:49 UTC - in response to Message 8474.

Yeah, I'd definitely say FurMark. It's a beast.

IndianaX
Joined: 9 Apr 09
Posts: 2
Credit: 61,024,475
RAC: 0
Level
Thr
Message 8488 - Posted: 16 Apr 2009 | 7:17:48 UTC

I get a BSOD while playing Supreme Commander for some time and having GPUGRID running in the background :-(
I have Win XP 32-bit and a GTX 285 with the latest drivers from Nvidia!

K1atOdessa
Joined: 25 Feb 08
Posts: 249
Credit: 370,320,941
RAC: 0
Level
Asp
Message 8494 - Posted: 16 Apr 2009 | 12:34:27 UTC - in response to Message 8488.

You should stop GPUGrid before playing games -- you can run into memory issues because the game and GPUGrid are using up all the vidmem. You can do this manually every time or use an option in the cc_config file. Search the forums and you will find it, probably fairly recent.
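
For anyone searching later, a minimal sketch of what such a cc_config.xml could look like, generated with Python. It assumes the BOINC client version in use supports the `<exclusive_app>` option (which suspends computation while the named program runs); check the client configuration documentation for your version. The executable name is a hypothetical stand-in for the game's real process name.

```python
import xml.etree.ElementTree as ET


def build_cc_config(exclusive_apps):
    """Build cc_config.xml text with one <exclusive_app> entry per program.

    ASSUMPTION: the installed BOINC client honors <exclusive_app>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for exe in exclusive_apps:
        ET.SubElement(options, "exclusive_app").text = exe
    return ET.tostring(root, encoding="unicode")


# "SupremeCommander.exe" is a hypothetical process name used for illustration.
xml_text = build_cc_config(["SupremeCommander.exe"])
print(xml_text)
```

The resulting text would be saved as cc_config.xml in the BOINC data directory, after which the client needs to re-read its config file or be restarted.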

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 8504 - Posted: 16 Apr 2009 | 20:12:13 UTC - in response to Message 8474.

I am using a dual 4-pin molex --> PCI-E power adapter for the 8800GT


Power distribution seems fine. The only possible weak spot which I can find is the quote above. It could be that the 8800GT draws just enough power to overload a certain rail if you're running GPU-Grid, whereas under seti the overall GPU power draw is lower and therefore the problem doesn't cause a BSOD.
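
A minimal sketch of the per-rail arithmetic behind that concern: current on a 12 V rail is simply power divided by voltage. The 75 W figure is the rated limit of a 6-pin PCI-E power plug; the 60 W for other devices on the same rail and the 18 A rail limit are hypothetical values chosen only for illustration, not the specifications of this PSU.

```python
RAIL_VOLTS = 12.0


def rail_current(loads_watts):
    """Total current in amps drawn from one 12 V rail by the given loads."""
    return sum(loads_watts) / RAIL_VOLTS


def overloaded(loads_watts, rail_limit_amps):
    """True if the combined loads exceed the rail's current limit."""
    return rail_current(loads_watts) > rail_limit_amps


# 75 W: rated limit of a 6-pin PCI-E plug (fed here via the Molex adapter).
# 60 W: ASSUMED draw of drives and fans sharing the same rail.
# 18 A: HYPOTHETICAL per-rail limit; read the real one off the PSU label.
shared_rail = [75, 60]
print(f"{rail_current(shared_rail):.2f} A, overloaded: {overloaded(shared_rail, 18)}")
```

The useful part is the check itself: list whatever actually hangs off the rail feeding the adapter and compare the sum against the limit printed on the PSU.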

I'd take the 8800GT out and test if GPU-Grid runs. Otherwise I'd start wildly swapping cards in and out in a somewhat organized manner, which would likely lead to seemingly random errors. I'd forget to write these down properly and at the end of the day still wouldn't know what caused the error, but the 3 card config would likely work.

MrS

(just kidding.. well, I hope)
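
Joking aside, writing the plan down first avoids exactly that outcome. Here is a minimal sketch that enumerates every card combination to test, smallest first, so each run can be logged. The card labels are just names for the three cards in this thread, and the result strings are a hypothetical logging convention.

```python
from itertools import combinations

CARDS = ["GTX 260 A", "GTX 260 B", "8800GT"]


def test_plan(cards):
    """Yield every non-empty subset of cards, smallest subsets first."""
    for size in range(1, len(cards) + 1):
        for combo in combinations(cards, size):
            yield combo


# Fill in "ok" or "BSOD" by hand after running GPU-Grid on each configuration.
results = {combo: "untested" for combo in test_plan(CARDS)}
for combo, status in results.items():
    print(f"{' + '.join(combo)}: {status}")
```

Testing single cards first isolates a faulty card; the pairs and the full trio then point at power or slot interactions.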
____________
Scanning for our furry friends since Jan 2002

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 8512 - Posted: 16 Apr 2009 | 22:40:24 UTC - in response to Message 8474.

I am using a dual 4-pin molex --> PCI-E power adapter for the 8800GT.

In every GPU box there was this little adapter ... and a note that said to use it only temporarily, as it is not a recommended configuration. I think you are seeing an example of why this is not recommended and is only suitable as a temporary stop-gap.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 8551 - Posted: 18 Apr 2009 | 9:30:51 UTC - in response to Message 8512.

I don't think it's that bad. I used one for my 9800GTX+ (quite thirsty) without problems. I connected both 4 pin Molex to different "cable threads", don't know if it made a difference.

MrS
____________
Scanning for our furry friends since Jan 2002

Clownius
Joined: 19 Feb 09
Posts: 37
Credit: 30,657,566
RAC: 0
Level
Val
Message 8716 - Posted: 22 Apr 2009 | 12:57:32 UTC

I'm also getting BSODs on my dual GTX 295 setup. I've narrowed it down a little myself, and it usually (but not always) happens when I restart a suspended WU.
I'm thinking either I have a bad second graphics card (but it's stable for days sometimes), or firing up both 295s at once makes the PSU have a heart attack because of the sudden power draw. Not sure though, and testing takes time.
I generally suspend GPUGrid when I'm playing games and need to use Quad SLI mode. Once I've finished the game, I disable SLI (if I used it; some games run perfectly on one GPU) and un-suspend.... then it BSODs.

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level
Gly
Message 8742 - Posted: 22 Apr 2009 | 20:25:11 UTC

Arghh, had a crash on my video card driver after a Windows update ...
Lost the 9+ hour unit, and it downloaded again after I rebooted and fixed the video card error :(
Grrr, why oh why

jboese
Joined: 30 Jul 08
Posts: 21
Credit: 31,229
RAC: 0
Message 8755 - Posted: 23 Apr 2009 | 1:29:55 UTC - in response to Message 8420.

I wouldn't waste time debugging something with your hardware. You already said it runs seti@home WUs fine. My guess is the problem is with the poor code this project seems to be using. Look over at the PS3 forum at all the issues and you get a feel for the quality of this project. I would stick to other projects or even switch to folding@home, as their science is much more cutting edge and, most importantly, it works with no problems on 99% of all setups.

Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Message 8761 - Posted: 23 Apr 2009 | 5:26:58 UTC - in response to Message 8742.

Arghh, had a crash on my video card driver after a Windows update ...
Lost the 9+ hour unit, and it downloaded again after I rebooted and fixed the video card error :(
Grrr, why oh why

Gremlins?

It is easy to sling arrows and make claims about the quality of the code. The problem is that there are significant counter examples. I don't think that I have seen a bad task in at least two weeks if not more. Yes, on occasion, a bad batch gets created, but that is usually because they are trying something new. Does not always work right away.

The loss of a task is one reason to be cautious about making updates. You may want to suspend BOINC first, or wait until the current task has just completed, before taking the risk. That way, if a task crashes, you do not lose much work.
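
As a minimal sketch of the "suspend first" step, the run mode can be switched from a script before an update is applied. This assumes the `boinccmd` command-line tool that ships with BOINC is on the PATH and that the installed client accepts `--set_run_mode`; the helper names are hypothetical.

```python
import subprocess


def run_mode_command(mode, duration_s=None):
    """Build the boinccmd argument list for a run-mode change.

    mode is one of "always", "auto", "never". duration_s, if given, is how
    long the mode should apply before the client reverts to its preference.
    """
    if mode not in ("always", "auto", "never"):
        raise ValueError(f"unknown run mode: {mode}")
    cmd = ["boinccmd", "--set_run_mode", mode]
    if duration_s is not None:
        cmd.append(str(duration_s))
    return cmd


def suspend_boinc(duration_s=None):
    """Suspend all BOINC computation (ASSUMES boinccmd is on the PATH)."""
    subprocess.run(run_mode_command("never", duration_s), check=True)


def resume_boinc():
    """Return BOINC to preference-based scheduling."""
    subprocess.run(run_mode_command("auto"), check=True)
```

Calling `suspend_boinc(3600)` before starting an update would pause work for an hour. A suspended task can then resume only from its last checkpoint, so waiting for the task to finish remains the safest choice.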

jboese
Joined: 30 Jul 08
Posts: 21
Credit: 31,229
RAC: 0
Message 8765 - Posted: 23 Apr 2009 | 7:24:46 UTC - in response to Message 8761.
Last modified: 23 Apr 2009 | 7:25:45 UTC

I will not speak to the Nvidia client as I don't run it, but I can tell you the PS3 side of this project is a horrible mess. Worst DC project I have ever dealt with, and I have run dozens. Locking up users' machines with very poor code, so they can't run any project without a reboot, is not only mickey mouse but downright unethical and against the volunteer nature of BOINC. I just wish I wasn't remote so I could switch it back to folding@home immediately (should have stayed).

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 47,698,744
RAC: 107,365
Level
Val
Message 8768 - Posted: 23 Apr 2009 | 8:57:23 UTC - in response to Message 8765.

I've been crunching GPUGRID almost continuously for 6 weeks or so now, cranking out about 3 work units per day.

Zero errors.
____________
Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.

uBronan
Joined: 1 Feb 09
Posts: 139
Credit: 575,023
RAC: 0
Level
Gly
Message 8777 - Posted: 23 Apr 2009 | 11:32:45 UTC - in response to Message 8761.
Last modified: 23 Apr 2009 | 11:37:34 UTC

Gremlins?


Well, yes, it must be gremlins. I did not allow Windows to update itself, and I did not allow Microsoft to reboot without my knowledge, but it did.
And I did not allow Nvidia to crash their buggy driver :D
When I came to check my PC, I saw my machine running in a 4-bit video mode....
Those darn gremlins ate all the others away, I guess.... darn monsters

P.S. "I saw something furry near my PC, not sure what it was ;)"

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 8800 - Posted: 23 Apr 2009 | 19:50:30 UTC - in response to Message 8755.

I wouldn't waste time debugging something with your hardware. You already said it runs seti@home WUs fine.


That seti@home works tells us that his hardware is not completely broken. But it does not tell us that it can still run GPU-Grid, as the latter code utilizes the hardware more intensively.

And I can't speak for the PS3 as I don't run it. However, I know that the group at Folding@home is relatively large, they're PS3 pioneers and usually deliver quality work. So I'd be surprised if any other project could top their quality. Tying them would already be a "knightly accolade".

And having said that, I can assure you that GPU-Grid is not as bad as you describe PS3-Grid. Most problems actually come from the software environment (OS, driver) and overclocking.

P.S. "I saw something furry near my PC, not sure what it was ;)"


It must be our furry friends, the extraterrestrial apes, who SETI finally found and who're trying to chase away the gremlins!

MrS
____________
Scanning for our furry friends since Jan 2002
