1) Message boards : Graphics cards (GPUs) : OC'ing an (Message 47112)
Posted 2 days ago by Profile Retvari Zoltan
Well... you cannot generalize... but looking at the EVGA GTX 1070 FTW thermal image, I'd say that Erich56 has a point. As I wrote, there are cards that run for several years at 80°C, and there are other designs that fail at 70°C after only a couple of months.
I agree.

Side note: if you crunch, avoid temperature fluctuations by all means, as they cause a lot of mechanical stress to the PCB by repeatedly expanding and contracting the conducting paths, lands and solder joints.
That's called thermal fatigue, caused by thermal cycling.
Fewer thermal cycles = longer lifespan.
Lower amplitude of the thermal cycle = longer lifespan.
As cards normally start from room temperature, the latter translates to:
Lower maximum temperature = longer lifespan.
Applying liquid nitrogen (-196°C = -321°F) cooling causes a thermal cycle roughly 4 times larger than a card going from room temperature to 80°C.
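The "roughly 4 times" figure above follows from comparing the temperature swings directly. A minimal sketch, assuming a room temperature of 20°C (the post doesn't state one) and the 80°C load temperature mentioned above:

```python
# Compare thermal-cycle amplitudes (delta-T) for a graphics card.
ROOM_C = 20.0    # assumed ambient temperature (not stated in the post)
LOAD_C = 80.0    # load temperature from the post
LN2_C = -196.0   # boiling point of liquid nitrogen

normal_cycle = LOAD_C - ROOM_C   # swing for a normally cooled card: 60 K
ln2_cycle = ROOM_C - LN2_C       # swing down to liquid nitrogen: 216 K

print(normal_cycle)              # 60.0
print(ln2_cycle)                 # 216.0
print(ln2_cycle / normal_cycle)  # 3.6, i.e. roughly 4 times larger
```

The exact ratio depends on the assumed room temperature, but it stays in the 3.5-4x range for any realistic ambient.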

A black screen or stripes may be the result after a while.
Failed cards can't process workunits, so failed workunits are also a sign.

Which means: 24/7 operation is much better for a graphics card's lifespan than crunching only in the daytime or only at night (so that the card sits cool for half of every day).
I fully agree.

... and then even OC the card all the way in order to meet the time limit. Another thing that speaks for lower GPU temperatures is the cooling phase in between two jobs, when the GPU temperature drops to 40-50°C for 10-20 seconds and then increases again.
The chip itself can withstand far more (and far larger) thermal cycles than the PCB as a whole. The PCB won't cool down that fast, so this doesn't have as big an impact on the card's lifespan as letting the whole card reach room temperature.

You simply will have less mechanical stress by keeping the temperature difference as little as possible.
I fully agree.

As a consequence, a graphics card will age faster on short runs than on long runs, as there are many more cooling intervals.
That's another reason for crunching long runs only on fast cards, and short runs only on slower cards.
It's also a reason not to set the "Suspend GPU while the computer is in use" option.

Running two tasks in parallel (and therefore having no sharp decline of load at the end of one job) will cushion that effect by the way.
2) Message boards : Number crunching : Bug in BOINC or GPUGRID? (Message 47093)
Posted 4 days ago by Profile Retvari Zoltan
It will suspend within ~30-120 seconds at most. (Though it's abnormal for it to take that long.)
3) Message boards : Graphics cards (GPUs) : EVGA GeForce GTX 1080 Ti SC Black vs. 1080 FTW? (Message 47086)
Posted 4 days ago by Profile Retvari Zoltan
Zoltan, what brand 1080 Ti did you get, and what clock speed and memory frequency are you running it at?
It's a Gigabyte AORUS GeForce GTX 1080Ti.
By default its GPU runs at 1987 MHz and the memory at 5000 MHz.
I've set it to 2000/5500 MHz, but it keeps switching back.
Luckily I can re-apply the overclocked frequencies without a restart.

Also have you considered moving that card to linux as 9.14 is still the app on linux with none of the slowdowns of 9.18?
Yes, but I don't notice the slowdown, perhaps because I didn't have this card before the 9.18 app.
4) Message boards : Number crunching : Version 9.18 Takes longer (Message 47083)
Posted 4 days ago by Profile Retvari Zoltan
But I don't see the page listing the applications at the moment. Either I am not looking in the right place, or maybe they are updating it.
You can find them here. This page is not linked from the GPUGrid pages, so it's no wonder that you didn't see it.
5) Message boards : Number crunching : PABLO_bound2KIX2CMY workunits (Message 47069)
Posted 6 days ago by Profile Retvari Zoltan
Something is wrong with some of these workunits.
Even though there are some successful ones in this batch, all 3 that I've received failed immediately (on every host they've been sent to) with the following error:
# The simulation has become unstable. Terminating to avoid lock-up (1) # Attempting restart (step 5000)

One of them had this error also:
SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1965.
The following workunit progressed very slowly, so I've restarted the host (the previous error probably downclocked the GPU, but I didn't check).
6) Message boards : Number crunching : extremely high error rates (Message 47068)
Posted 6 days ago by Profile Retvari Zoltan
Thanks. I'm well aware of <exclude_gpu>. I am the one that requested that David A implement it into BOINC :) I'm directly responsible for its existence, originally requested to prevent certain apps from running on the primary GPU because they made my display laggy!
I know; I intended this for the others you referred to who have the same problem.

However, I'm not going to use it as a workaround to fix this server issue.
This is not a server issue; it's a compiler issue.
However, it could be avoided by the server if it didn't send work to hosts equipped with GTX 660 Ti cards. That policy wouldn't filter out your host, though, as it also has a GTX 970, so the server doesn't know about the lesser cards in it.

Instead, the tasks will continue to error on my GTX 660 Ti GPUs, until MJH and staff step up to better identify and then fix the issues.
I think it's an unnecessary display of protest, as there are enough unsupervised hosts to make the statistics worse anyway.

They've hinted at some bug, but did not give appropriate info for anybody to do anything to fix it... So what is the nature of the problem?
I don't know, but it must be a nasty one, as the GTX 670 & GTX 680 (both CC 3.0) are working fine with the new app.
7) Message boards : News : App update 17 April 2017 (Message 47067)
Posted 6 days ago by Profile Retvari Zoltan
is it official that Fermi is no longer supported for gpugrid?
Fermi cards (the GTX 4xx & GTX 5xx series) are CC 2.0 and CC 2.1, which is below the required minimum of CC 3.0, so they are no longer supported by GPUGrid.
Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC 3.0, e.g. the GTX 660 Ti) will fail all tasks.
See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.
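The CC 3.0 cut-off above amounts to a simple version comparison. An illustrative sketch (not GPUGrid's actual server code) with a few example cards and their compute capabilities from NVIDIA's CC table:

```python
# Minimum CUDA compute capability required, as stated above.
MIN_CC = (3, 0)

# Example cards and their compute capabilities (major, minor).
CARDS = {
    "GTX 580": (2, 0),      # Fermi - unsupported
    "GTX 560 Ti": (2, 1),   # Fermi - unsupported
    "GTX 660 Ti": (3, 0),   # Kepler - meets the minimum, but currently
                            # fails tasks due to the compiler issue
    "GTX 970": (5, 2),      # Maxwell
    "GTX 1080 Ti": (6, 1),  # Pascal
}

def is_supported(cc):
    """A card meets the requirement when its CC is at least the minimum
    (lexicographic tuple comparison handles major/minor correctly)."""
    return cc >= MIN_CC

for name, cc in CARDS.items():
    status = "supported" if is_supported(cc) else "unsupported"
    print(f"{name}: CC {cc[0]}.{cc[1]} - {status}")
```

Note that, as the post explains, meeting the CC minimum is necessary but not sufficient right now: the GTX 660 Ti passes this check yet still fails tasks.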
8) Message boards : Graphics cards (GPUs) : EVGA GeForce GTX 1080 Ti SC Black vs. 1080 FTW? (Message 47063)
Posted 7 days ago by Profile Retvari Zoltan
Thank you! I was wondering because I have had the 1080ti+hybridkit on preorder for a few weeks now as they are severely backordered, but the 1080hybrid is available to order today. Wanted to see if it was worth the wait. Seems it is...

Any differing opinions?
No. You should wait for the Ti; it's worth it!
If you browse the "Performance" tab, you can see it for yourself.
9) Message boards : Number crunching : extremely high error rates (Message 47058)
Posted 8 days ago by Profile Retvari Zoltan
Since I have a PC that has GTX 970 alongside 2x GTX 660 Ti (SM3.0).... that means that I'm still failing a lot of tasks, until the app is fixed.
Who knows how long that could take. Perhaps you should exclude those GPUs in your cc_config.xml in the meantime.

So, the "918" app is not running fine... for me.
I imagine I'm not the only one.
I copy my method here for you and everybody else:
Copy the following to your clipboard:
notepad c:\ProgramData\BOINC\cc_config.xml
Press Windows key + R, then paste and press enter.
If you see an empty file, copy and paste the following text:
<cc_config>
  <options>
    <exclude_gpu>
      <url>www.gpugrid.net</url>
      <device_num>1</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
  </options>
</cc_config>
The value in the <device_num> tag should be adapted to your system.
You can have as many <exclude_gpu> sections in your cc_config.xml as there are GPUs you need to disable.
If your cc_config.xml already has an <options> section, insert the <exclude_gpu> ... </exclude_gpu> section (including both tags) right after the <options> tag.
Click File -> Save.
If BOINC manager is running, click Options -> Read config files.
Alternatively, restart BOINC manager (choosing to stop the science applications upon exiting).
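For instance, a complete cc_config.xml that excludes two GPUs from GPUGrid could look like the sketch below (the device numbers 1 and 2 are placeholders; use the numbers BOINC assigns on your own system):

```xml
<cc_config>
  <options>
    <exclude_gpu>
      <url>www.gpugrid.net</url>
      <device_num>1</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
    <exclude_gpu>
      <url>www.gpugrid.net</url>
      <device_num>2</device_num>
      <type>NVIDIA</type>
    </exclude_gpu>
  </options>
</cc_config>
```

Each GPU you want to keep off GPUGrid gets its own <exclude_gpu> section, all inside the single <options> section.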
10) Message boards : Graphics cards (GPUs) : Pascal (GTX10x0) overclocking (Message 47051)
Posted 8 days ago by Profile Retvari Zoltan
Thank you for your advice. It's working.

I'd mixed up the short and the long names of the power states.
Performance level 3 - P0
Performance level 2 - P2
Performance level 1 - P5
Performance level 0 - P8

Compute mode is a mixture of P0 and P2 (i.e. the memory clock comes from P2, but the GPU clock comes from P0).

Next 10