1) Message boards : Graphics cards (GPUs) : Maxwell in February (Message 37929)
Posted 1 day ago by Profile Retvari Zoltan*
I estimate that the new GM204 will be about 45% faster than a 780ti.

That's a bit of an optimistic estimate, as (1216/928)*(16/15)=1.3977,
but...
1. my GTX 780Ti is always boosting to 1098MHz,
2. the 1219MHz boost clock seems to be a bit high, as the GTX 750Ti's boost clock is only 1085MHz, and it's a lesser chip.
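Plugging the numbers into a quick script makes the comparison easier to see. This is just a back-of-the-envelope sketch: the 1216MHz figure comes from the formula above, the 1098MHz figure is my card's observed boost, and the assumption that performance scales linearly with SM count and clock is of course a simplification.

    # Rough scaling estimate: GM204 (16 SMM) vs GTX 780Ti (15 SMX)
    gm204_clock   = 1216.0   # MHz, rumoured boost clock
    gtx780ti_ref  = 928.0    # MHz, reference clock used in the formula above
    gtx780ti_real = 1098.0   # MHz, the boost my GTX 780Ti actually holds

    print((gm204_clock / gtx780ti_ref)  * (16.0 / 15.0))   # ~1.40 -> ~40% faster
    print((gm204_clock / gtx780ti_real) * (16.0 / 15.0))   # ~1.18 -> ~18% faster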

We'll see it soon.

BTW there's an error in the chart, as the GTX780Ti has 15*192 CUDA cores.
2) Message boards : Graphics cards (GPUs) : Maxwell in February (Message 37928)
Posted 1 day ago by Profile Retvari Zoltan*
How does a 5 SMM, 640-core/40 TMU/60W-TDP GTX 750Ti perform ~7% better than a 4 SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX 650Ti/boost compute-time/power-consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX 660 (5 SMX/960-core/140W-TDP) compute times.

That's very easy to answer:
The SMXes of the GTX 650Ti and the GTX 660 are superscalar, so only (approximately) two-thirds of their cores can be utilized (512 and 640, respectively).

If this is the case, then why do GPU utilization programs (MSI Afterburner, EVGA Precision) show 90%+ for most GPUGRID tasks? Are these programs not accounting for the type of (scalar or superscalar) architecture? If only 2/3 of the cores are active, won't GPU utilization be at ~66% instead of the typical 90%? These programs are capable of monitoring bus usage, memory controller (frame buffer) load, video processing, power draw, and much more.

The "GPU utilization" is not equivalent of the "CUDA cores utilization". These monitoring utilities are right in showing that high GPU utilization, as they showing the utilization of the untis which feeding the CUDA cores with work. I think the actual CUDA cores utilization can't be monitored.
3) Message boards : Graphics cards (GPUs) : Maxwell in February (Message 37924)
Posted 1 day ago by Profile Retvari Zoltan*
How does a 5 SMM, 640-core/40 TMU/60W-TDP GTX 750Ti perform ~7% better than a 4 SMX, 768-core/110-130W-TDP Kepler with more TMUs (64), while smashing the GTX 650Ti/boost compute-time/power-consumption ratios? Core/memory speed differences? The GTX 750Ti is close (~5%) to GTX 660 (5 SMX/960-core/140W-TDP) compute times.

That's very easy to answer:
The SMXes of the GTX 650Ti and the GTX 660 are superscalar, so only (approximately) two-thirds of their cores can be utilized (512 and 640, respectively).
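A quick sanity check of the numbers above, under the assumption that each superscalar Kepler SMX can only keep about 128 of its 192 cores busy without extra instruction-level parallelism:

    # Effective Kepler cores if only ~2/3 of each SMX (128 of 192) is usable
    def effective_kepler_cores(smx_count, usable_per_smx=128):
        return smx_count * usable_per_smx

    print(effective_kepler_cores(4))   # GTX 650Ti: 768 physical -> 512 effective
    print(effective_kepler_cores(5))   # GTX 660:   960 physical -> 640 effective

    # So the GTX 750Ti's 640 scalar Maxwell cores are competing against
    # roughly 512-640 effectively used Kepler cores, not against 768 or 960.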
4) Message boards : Graphics cards (GPUs) : Maxwell in February (Message 37921)
Posted 1 day ago by Profile Retvari Zoltan*
So now it's a matter of deciding which card would be best for GPUGrid.
The boost clock of the GTX 980 looks very good, and so do the 64 ROPs... but the 780Ti has 2880 CUDA cores...?

The GTX 780Ti is superscalar, so not all of its 2880 CUDA cores can be utilized by the GPUGrid client. The actual number of utilized CUDA cores of the GTX 780Ti is somewhere between 1920 and 2880 (most likely near the lower end), and this could be different for each workunit batch.
If they really manufacture the GM204 on 28nm lithography, then this is only a half step towards a new GPU generation. The performance-per-watt ratio of the new GPUs will be slightly better, and (if the data in this chart are correct) I expect the GTX 980 could be 15~25% faster than the GTX 780Ti (here at GPUGrid). Once we have the real GPUGrid performance of the GTX 980, we'll know how many of the 2880 CUDA cores of the GTX 780Ti are actually utilized by the GPUGrid client. But as NVidia chose to move back to a scalar architecture, I suspect the superscalar architecture of the Keplers (and the later Fermis) wasn't as successful as expected.
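To illustrate the last point, here is a hypothetical worked example (all inputs are placeholders, not measurements): once someone posts a real GTX 980 vs GTX 780Ti speedup at GPUGrid, the effectively utilized core count of the 780Ti falls out of simple cores-times-clock scaling.

    gtx980_cores     = 2048      # 16 SMM x 128 cores (per the chart)
    gtx980_clock     = 1216.0    # MHz, assumed boost clock
    gtx780ti_clock   = 1098.0    # MHz, observed boost clock
    measured_speedup = 1.15      # hypothetical GPUGrid result: 980 is 15% faster

    # perf ~ effective_cores * clock, so:
    effective_780ti = gtx980_cores * gtx980_clock / (gtx780ti_clock * measured_speedup)
    print("effective GTX 780Ti cores: %.0f of 2880" % effective_780ti)   # ~1972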
5) Message boards : News : gpugrid.net layout upgrade (Message 37918)
Posted 1 day ago by Profile Retvari Zoltan*
...I included the GPU description in the rank...

This revealed some errors in the total computing time (Timestamp) in the NOELIA_SH2 and NOELIA_tpam2 groups.
6) Message boards : News : gpugrid.net layout upgrade (Message 37886)
Posted 5 days ago by Profile Retvari Zoltan*
Wow!

I like the new version very much.
7) Message boards : Number crunching : Errors piling up, bad batch of NOELIA? (Message 37875)
Posted 6 days ago by Profile Retvari Zoltan*
Those which run for a long time before the error shows up (the NOELIA_tpam2 workunits) are failing because they get a lot of "The simulation has become unstable. Terminating to avoid lock-up" messages before they actually fail. This kind of error is usually caused by:
- too high GPU frequency
- too high GDDR5 frequency
- too high GPU temperature
- too low GPU voltage
The bad batch this thread is about consists of NOELIA_TRP188 workunits, which usually fail right after the start.

I disagree that any of the 4 points is the issue.

Of course there are more possibilities, but these 4 points are the most frequent ones, and they can be checked easily by tuning the card with software tools (like MSI Afterburner). Furthermore, these errors could be caused by a faulty (or inadequate) power supply, or by the aging of the components (especially the GPU). Those are much harder to fix, but you can still have a stable system with such components if you reduce the GPU/GDDR5 frequency. It's better to have a 10% slower system than one producing (more and more frequent) random errors.

My Nvidia Quadro K4000 GPU is completely stock with absolutely no modifications or overclocking applied. So the frequencies are right.

The statement in the second sentence does not follow from the first. The frequencies (for the given system) are right when there are no errors. The GPUGrid client pushes the card very hard, like the infamous FurMark GPU test, so we've had a lot of surprises over the years (regarding stock frequencies).

Also, the case of my host (a Dell Precision T7610) is completely unmodified and the case fans are regulated automatically as always. The GPU runs at 75°C, which is well on the safe side. Further, I haven't performed a driver update for months.

It may seem really strange, but a card can produce errors even below 80°C. I have two GTX 780Ti's in the same system: one of them is the NVidia standard design, the other is an OC model (BTW both of them are Gigabyte). I had errors with the OC model right from the start, while its temperature stayed under 70°C (only with GPUGrid; no other testing tool showed any errors), but reducing its GDDR5 frequency from 3500MHz to 2700MHz (!) solved my problem. After a BIOS update this card is running error-free at 2900MHz, but that's still way below the factory setting.

May I add that I hadn't noticed any failed WUs on my system until now. Within 5 days, 5 NOELIA WUs failed.

If you check the logs of your successful tasks, those also contain these "The simulation has become unstable. Terminating to avoid lock-up" messages, so you were lucky that those workunits were successful. If you check my (similar NOELIA) workunits, none of them has these messages.
So give reducing the GPU frequency a try (it's harder to reduce the GDDR5 frequency, as you have to flash the GPU's BIOS).
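If you want to check your own logs quickly, a small sketch like this will do; the path pattern is illustrative, so adjust it to wherever your BOINC client keeps the task stderr output.

    import glob

    PATTERN = "The simulation has become unstable"

    for path in glob.glob("stderr*.txt"):        # point this at your task logs
        with open(path, errors="replace") as f:
            hits = sum(PATTERN in line for line in f)
        print("%s: %d instability warning(s)" % (path, hits))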
8) Message boards : Number crunching : Errors piling up, bad batch of NOELIA? (Message 37863)
Posted 8 days ago by Profile Retvari Zoltan*
Those which run for a long time before the error shows up (the NOELIA_tpam2 workunits) are failing because they get a lot of "The simulation has become unstable. Terminating to avoid lock-up" messages before they actually fail. This kind of error is usually caused by:
- too high GPU frequency
- too high GDDR5 frequency
- too high GPU temperature
- too low GPU voltage
The bad batch this thread is about consists of NOELIA_TRP188 workunits, which usually fail right after the start.
9) Message boards : Server and website : Bad Noelias (Message 37827)
Posted 13 days ago by Profile Retvari Zoltan*
I love this television: on the first channel there's the successful infinite series "NOELIA errors", and on the second channel a new documentary series, "Lament over the high RAC of Bitcoin Utopia".

Well, you've missed the title of that documentary on the second channel, as it is
"How to turn BOINC into a commercial platform?"
10) Message boards : Number crunching : Errors piling up, bad batch of NOELIA? (Message 37826)
Posted 13 days ago by Profile Retvari Zoltan*
I'm having the same issue with NOELIA_TRP188 workunits.
All of these run into the following error after 2 seconds:
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified

