Message boards : Number crunching : Error after 4 Hours

AEM74
Joined: 5 Mar 14
Posts: 16
Credit: 16,903,909
RAC: 0
Message 44239 - Posted: 19 Aug 2016 | 16:27:40 UTC
Last modified: 19 Aug 2016 | 16:27:49 UTC

Hi,

Here is the task in question: https://www.gpugrid.net/result.php?resultid=15241975

It seems that the WU got unstable and terminated, but I have no idea why. No OC'ing, good PSU (Corsair HX750i), no shut downs or graphic glitches, and good cooling (GPU temps are 72C constant on 23C ambient).

Anybody know why?

Thanks.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 44240 - Posted: 19 Aug 2016 | 17:00:27 UTC - in response to Message 44239.

It is probably overclocked too much by the factory. Try reducing the GPU clock in 100 MHz steps. It will probably take only one.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Message 44241 - Posted: 19 Aug 2016 | 17:08:55 UTC - in response to Message 44239.
Last modified: 19 Aug 2016 | 17:10:42 UTC

> Here is the task in question: https://www.gpugrid.net/result.php?resultid=15241975
>
> It seems that the WU got unstable and terminated, but I have no idea why. No OC'ing,

Your card is factory overclocked: according to the GTX 980 Ti specifications on the NVIDIA homepage, its default base clock is 1000 MHz, while your card runs at 1190 MHz according to the task's log:
<stderr_txt>
# GPU [GeForce GTX 980 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 980 Ti
# ECC : Disabled
# Global mem : 4095MB
# Capability : 5.2
# PCI ID : 0000:01:00.0
# Device clock : 1190MHz
# Memory clock : 3505MHz
# Memory width : 384bit
# Driver version : r372_53 : 37254
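For anyone who wants to repeat this check on their own task logs, here is a minimal sketch in Python. It assumes the log contains a "Device clock : NNNNMHz" entry like the one above; the function names and the sample string are made up for illustration and are not part of any GPUGRID or BOINC tool.

```python
import re

# Reference base clock for the GTX 980 Ti, as quoted in this post (assumption:
# adjust this value for other cards).
REFERENCE_BASE_MHZ = 1000


def device_clock_mhz(stderr_text):
    """Return the 'Device clock' value in MHz from a task log, or None if absent."""
    match = re.search(r"Device clock\s*:\s*(\d+)\s*MHz", stderr_text)
    return int(match.group(1)) if match else None


def overclock_percent(actual_mhz, reference_mhz=REFERENCE_BASE_MHZ):
    """Return how far actual_mhz sits above reference_mhz, in percent."""
    return 100.0 * (actual_mhz - reference_mhz) / reference_mhz


# Hypothetical excerpt in the same format as the log above.
sample_log = "# Name : GeForce GTX 980 Ti # Device clock : 1190MHz # Memory clock : 3505MHz"

clock = device_clock_mhz(sample_log)
print(f"{clock} MHz is {overclock_percent(clock):.0f}% above {REFERENCE_BASE_MHZ} MHz")
```

With the values from this task, that works out to a factory overclock of about 19% over the reference base clock.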

That is not excessive in itself (my GTX 980 Tis run at 1380-1400 MHz), but it could cause these errors if the GPU voltage is not adequate for this frequency. Raising the GPU voltage, however, will raise the GPU temperature, so it's better to reduce the frequency and/or the temperature.

> good PSU (Corsair HX750i), no shut downs or graphic glitches, and good cooling (GPU temps are 72C constant on 23C ambient).

Check the GPU voltage with a GPU monitoring tool such as GPU-Z, Nvidia Inspector or MSI Afterburner. Setting a more aggressive fan profile to further lower the GPU temperature could fix this error.

AEM74
Joined: 5 Mar 14
Posts: 16
Credit: 16,903,909
RAC: 0
Message 44242 - Posted: 20 Aug 2016 | 1:14:21 UTC

I'm really hesitant to reduce the clocks, as I also game and I'm frankly too lazy to switch between OC profiles. However, the fans usually run at 35%, so I can always raise the fan curve and up the voltage.

Also, my WU is almost complete, with no terminations so far. I think the GPU was overloaded at random, but if it becomes a frequent problem, I will resort to upping the voltage and fan speed, and as a last resort lowering the frequency.

Thanks for the responses.
