11) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31531)
Posted 3941 days ago by dyeman
Yes the memory load was 1% (also on the WU running now on the "good" 660 - it's been running for almost 24 hours and is saying it is 59% complete). Looks like you have seen or heard of this before?!
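
As a rough cross-check of the progress figures above, a linear extrapolation gives an idea of the total runtime. This is only a sketch: it assumes the WU progresses at a constant rate and rounds "almost 24 hours" up to 24, and the function name `estimate_total_hours` is made up for illustration.

```python
def estimate_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linearly extrapolate total runtime from elapsed time and progress."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# Assumed inputs: ~24 hours elapsed, 59% complete (figures from the post).
elapsed = 24.0
total = estimate_total_hours(elapsed, 0.59)
remaining = total - elapsed
print(f"estimated total: {total:.1f} h, remaining: {remaining:.1f} h")
```

If the constant-rate assumption holds, that puts the whole WU at roughly 40 hours.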

The WU eventually failed after more than 1000,000 seconds with:

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified.
(0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>
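
For anyone comparing several failed WUs, the driver error code and source location can be pulled out of the stderr text with a small script. This is a minimal sketch that assumes the SWAN error line always has the form shown above; `find_cuda_errors` is a hypothetical helper, not part of BOINC or the GPUGRID application. As far as I know, code 999 is `CUDA_ERROR_UNKNOWN` in the CUDA driver API, so it does not point to a specific cause.

```python
import re

# stderr lines copied from the failed task output above
STDERR = """MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59
"""

# Assumed line format: Cuda driver error <code> in file '<file>' in line <line>
PATTERN = re.compile(
    r"Cuda driver error (?P<code>\d+) in file '(?P<file>[^']+)' in line (?P<line>\d+)"
)


def find_cuda_errors(text):
    """Return (error_code, source_file, line_number) for each SWAN driver error."""
    return [
        (int(m["code"]), m["file"], int(m["line"]))
        for m in PATTERN.finditer(text)
    ]


errors = find_cuda_errors(STDERR)
print(errors)  # [(999, 'swanlibnv2.cpp', 1574)]
```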
12) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31494)
Posted 3942 days ago by dyeman
Thanks - I'll certainly try to negotiate something like that (this card is even failing when I set it back to reference clocks!). The reason I got the card was not because of its overclock, but because it has good cooling.

Interestingly the card has actually started working now, but I think this is related to the type of WU: all of the failing WUs were NOELIA_klebe (which generate TDP % well into the 90's according to GPU-Z). The WU that is working OK (so far) is a NOELIA-1MG, which GPU-Z reports as 98% GPU busy but with TDP % only in the mid-60's. It also looks like this will take twice as long to run as the _klebe_'s - no way to get the 24 hour bonus here! For comparison, TDP % is around the mid-70's running a pair of PRIMEGRID PPS Sieves concurrently.

@nanoprobe - I'll try the timer delay registry setting - thanks. Only some of the failures have been accompanied by the "Stopped Responding" message, however - the rest have just quietly stopped.
13) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31457)
Posted 3943 days ago by dyeman
Well it's looking like hardware. I swapped the two cards and the original (that works fine in the old system) is working fine in the new system, while the new card (that was failing in the new system) is also failing in the old system (errored the GPUGRID WU that was in process within a minute or two).

...Now to try to explain this and organise a replacement...

Thanks to all for your help.
14) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31453)
Posted 3943 days ago by dyeman
The system is Intel chipset.

Just tried with card set to reference clocks (980/1033). No help - WU crashed in less than 2 minutes.

Will try to install different drivers and see whether that helps...
15) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31433)
Posted 3943 days ago by dyeman
Thanks Mark,
All my systems have both ATI and NVIDIA cards and this hasn't caused me issues so far (at least not like this!), and all of the drivers are explicitly downloaded and installed. One difference is that the working system has 310.90 while the failing system has 314.07. I might try the older driver on the new system and see if that helps.

16) Message boards : Graphics cards (GPUs) : All WUs on GTX660 failing (Message 31427)
Posted 3943 days ago by dyeman
Have just built a new system with a GTX660 with the intention of running GPUGRID (encouraged by another system with a GTX660 that is working well). So far all 4 WUs that I have received start processing but fail after between 2 and 10 minutes. I think they have all had the "Driver has recovered after stopped working" message in Windows, which I guess is an indication that the GPU hardware has failed. Both the old and new cards are factory overclocked (old one 1006 (1072 boost), new one 1033 (1098 boost)). Memory on both is 1502.3 (which I think is not overclocked). GPU core clocks reported by GPU-Z when running are 1162.7 (new) and 1123.5 (old) respectively. The new 660 is fine running PrimeGrid (two concurrent - pegged at 99% busy according to GPU-Z). Environmentals on both systems seem fine (temp about 60 degrees).

I tried reducing the clocks on the new card to the same level as the old one (1006 (1072 boost)). Still failed (but did run for nearly 10 mins - longer than the others).
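
To put the clock figures side by side, here is a quick sketch of the relative differences. The numbers are the ones quoted in this thread (all assumed to be MHz); the 980 reference base clock is taken from the later "reference clocks (980/1033)" message, and `percent_over` is just an illustrative helper name.

```python
def percent_over(value_mhz: float, baseline_mhz: float) -> float:
    """How far value_mhz sits above baseline_mhz, in percent."""
    return (value_mhz / baseline_mhz - 1.0) * 100.0


# Clocks quoted in the posts; "observed" is the GPU-Z reading under load.
cards = {
    "old": {"base": 1006.0, "boost": 1072.0, "observed": 1123.5},
    "new": {"base": 1033.0, "boost": 1098.0, "observed": 1162.7},
}
REFERENCE_BASE = 980.0  # reference base clock quoted elsewhere in the thread

for name, clocks in cards.items():
    over_ref = percent_over(clocks["base"], REFERENCE_BASE)
    over_boost = percent_over(clocks["observed"], clocks["boost"])
    print(f"{name}: base {over_ref:.1f}% over reference, "
          f"observed {over_boost:.1f}% over rated boost")

gap = percent_over(cards["new"]["observed"], cards["old"]["observed"])
print(f"new card runs {gap:.1f}% faster than the old one under load")
```

On these figures the observed clocks of the two cards differ by only a few percent, which is worth bearing in mind when deciding how far to back off the overclock.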

So does this look like I just need to keep backing off the factory overclock, or is there anything else to look at? Here are the two systems:

old (working) system: 145220
new (failing) system: 155065

Thanks..
17) Message boards : Graphics cards (GPUs) : 100% CPU use (Message 28606)
Posted 4087 days ago by dyeman
Ah thanks..

I guess they hard-coded 'swan_sync'. Pity - since 560 is about the same speed and only really uses 5-10% of a CPU
18) Message boards : Graphics cards (GPUs) : 100% CPU use (Message 28591)
Posted 4088 days ago by dyeman
Hi,
Have just installed GTX660 (non-ti) and at the same time installed 310.90 drivers. It is processing WUs OK - a little faster than my 560ti, but I notice that the program (acemd.2562.cuda42*32) is pegged at 100% of a core, vs almost 0 on the other system with the 560ti (also 310.90). Both systems are Win 7 64bit.

Any ideas what might be causing this?

Thanks!
19) Message boards : Graphics cards (GPUs) : GPUGRID and ATI (Message 19405)
Posted 4919 days ago by dyeman
For me, SWAN_SYNC increases GPU usage from 62% or so to 67% (Win 7 64 bit)
20) Message boards : Number crunching : What a silly idea (Message 11496)
Posted 5387 days ago by dyeman
I am seeing the same thing - a C2D that used to download 2 WUs (1 running, 1 waiting) now downloads 4 (1 running, 3 waiting - I assume 2 WUs per CPU core). This will result in all WUs now missing the 2-day 'bonus' window. Nothing has changed on this machine. It is running BOINC 6.6.28 and NVIDIA 185.85. This is the computer: 29936

