
Message boards : Graphics cards (GPUs) : GPUGRID and Fermi

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 16513 - Posted: 23 Apr 2010 | 17:16:55 UTC

We managed to easily install the two cards, a GTX470 and a GTX480, donated by ftpd (thanks).

ACEMD does not run on them, as we knew from previous tests on GPUGRID. We are now investigating why.

It is strange because the app crashes with error 700 only on the Fermi cards, not on the other cards in the same machine.

gdf

Profile GDF
Message 16524 - Posted: 24 Apr 2010 | 18:54:59 UTC - in response to Message 16513.

Now ACEMD works on Fermi. Working on the optimizations.
Most likely on Monday we will have a working Fermi application out for Linux and possibly Windows.

gdf

Profile GDF
Message 16548 - Posted: 26 Apr 2010 | 7:13:20 UTC - in response to Message 16524.

So far, the GTX480 is only 30% faster than a GTX285.
Fermi is completely different from a G200 chip; code optimized for the G200, as ACEMD is, is not just plug&play.

We are trying to understand what the problem is. A factor of two (100% faster) should really be the minimum for this hardware.

I'll keep you posted.
gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 16557 - Posted: 26 Apr 2010 | 13:02:43 UTC - in response to Message 16548.

CC1.3 optimization did not happen overnight!

I take it that you are comparing a non-optimized Fermi app to the optimized GTX285 app?

I think you said CC1.3 cards are now optimized to 200% of the original app's speed via several updates. If so, then compared to a non-optimized GTX285 the Fermi would be about 2.6 times as fast, in a non-optimized to non-optimized comparison.

So which is it, an optimized or non-optimized comparison?

Either way, 30% better than a GTX285 is 130% better than last Friday!
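The comparison arithmetic above can be sketched quickly (the 2x optimization factor and the 30% figure come from the posts; the rest is just a back-of-the-envelope check):

```python
# Back-of-the-envelope check of the optimized vs non-optimized comparison.
# From the posts: the CC1.3 app was optimized to ~200% (2x) of the original,
# and the Fermi result is quoted as 30% faster than the optimized GTX285 app.
optimization_factor = 2.0   # optimized GTX285 app vs original app
fermi_vs_optimized = 1.30   # GTX480 vs optimized GTX285 app

# If the 30% figure is against the *optimized* app, then relative to the
# original (non-optimized) app the GTX480 would be:
fermi_vs_original = fermi_vs_optimized * optimization_factor
print(fermi_vs_original)  # 2.6 -> "about 2.6 times as fast"
```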



PS. For anyone that doesn't know,
Fermi means 10 to the power of minus 15 metres (10^-15m).

Profile GDF
Message 16558 - Posted: 26 Apr 2010 | 13:49:20 UTC - in response to Message 16557.


Fermi means 10 to the power of minus 15 metres (10^-15m).


That's fempto not fermi...

Profile skgiven
Message 16559 - Posted: 26 Apr 2010 | 14:40:52 UTC - in response to Message 16558.
Last modified: 26 Apr 2010 | 14:44:46 UTC

In English, Fermi = femto

I know it means shut too. PS. There is no p in femto!

Did you compare the Fermi card's performance to that of an optimized app running on a GTX285, or a non-optimized app (the original ACEMD)?

Thanks,

Profile GDF
Message 16560 - Posted: 26 Apr 2010 | 14:43:52 UTC - in response to Message 16559.

The optimized application.

gdf

Profile skgiven
Message 16561 - Posted: 26 Apr 2010 | 14:45:44 UTC - in response to Message 16560.

Then there is hope!
Thanks,

Profile liveonc
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Message 16562 - Posted: 26 Apr 2010 | 16:13:51 UTC

One question: if you succeed in optimizing your app to make Fermi cards run twice as fast as the GTX285, would that mean that someone crazy enough to buy a Tesla & crunch for GPUGRID.net would be 4 times faster than a Fermi? And if so, & the Fermi is only crippled through the Nvidia driver, would that "potentially" mean that if Nvidia uncripples it, or "someone" hacks it, the GTX480 would run 8 times faster than a GTX285, given that you can get it to run twice as fast while crippled?
____________

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 16564 - Posted: 26 Apr 2010 | 16:32:02 UTC - in response to Message 16562.

One question, if you succeed in optimizing your app to make Fermi cards run twice as fast as the GTX285, would that mean that someone crazy enough to buy a Tesla & crunch for GPUGRID.net would be 4 times faster than a Fermi

DP (double precision), as I understand it, is crippled from 1/2 to 1/5th of the SP speed, so for DP it could conceivably be at most 2.5 times faster. However, AFAIK GPUGRID doesn't use DP, so there probably wouldn't be a speed increase at all. MilkyWay speed may increase to a reasonable level when they get the v3.x app working, but from early reports it is still slow compared to ATI.
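The 1/2-vs-1/5 figures above work out to the quoted 2.5x ceiling; a quick arithmetic sketch (both ratios come from the post):

```python
# Tesla-class Fermi: DP throughput at 1/2 of SP speed (per the post).
# GeForce Fermi: DP capped at 1/5 of SP speed.
uncapped_dp = 1 / 2   # DP relative to SP, uncapped
capped_dp = 1 / 5     # DP relative to SP, driver-capped

# Maximum conceivable DP advantage of the uncapped card:
max_dp_speedup = uncapped_dp / capped_dp
print(max_dp_speedup)  # 2.5
```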

Profile liveonc
Message 16566 - Posted: 26 Apr 2010 | 17:19:15 UTC - in response to Message 16564.

So that's to say that the Fermi GTX480 was crippled from the start to 1/2 & couldn't undergo a softmod, in the same way that 8xxx & 9xxx cards could become a Quadro. But still, if GPUGRID.net doesn't use DP it doesn't really matter...
____________

Profile Beyond
Message 16567 - Posted: 26 Apr 2010 | 17:25:11 UTC - in response to Message 16566.

So that's to say that the Fermi GTX480 was crippled from the start to 1/2 & couldn't undergo a softmod,

DP speed is always slower than SP speed, so the 1/2 speed is not because it's crippled. Dropping it to 1/5th is crippling it.

Profile GDF
Message 16569 - Posted: 26 Apr 2010 | 20:38:23 UTC - in response to Message 16567.
Last modified: 26 Apr 2010 | 20:55:58 UTC

Currently, the GTX480 is 35% faster than the fastest version of ACEMD on a GTX285.

There is little more we can do, I think. The CUDA 3 compiler seems much worse at optimizing the code than the 2.2 version. If this is the case, then 3.1 could bring some improvement.

Tomorrow, we will clean up and submit the new application as a beta.

gdf
PS: no doubt this is much less than expected

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 16572 - Posted: 26 Apr 2010 | 21:00:03 UTC - in response to Message 16559.

In English, Fermi = femto


LOL! This is the first time I have seen this. It looks totally ridiculous.. but apparently it's true.

MrS
____________
Scanning for our furry friends since Jan 2002

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 16575 - Posted: 26 Apr 2010 | 21:14:47 UTC - in response to Message 16569.

Tomorrow, we clean up and submit the new application in beta.


Hooray !!!
I am all set up and ready for testing. I have a day job (east coast USA) so I may not be able to provide up-to-the-minute coverage, but I will report any results as soon as I see them on my machine.

____________
Thanks - Steve

Profile GDF
Message 16583 - Posted: 27 Apr 2010 | 7:47:16 UTC - in response to Message 16575.

We now have a good performance model of the application. The maximum we should have expected from Fermi is a reduction in time of 60% compared to a GTX275. We now think that this will actually be achievable in the future. The problem is that CUDA 3 is slower, even on GTX200, by almost 15%. Indeed, we are 60% faster when both applications are compiled for CUDA 3, but only 35% faster compared to the CUDA 2.2 build.

On our machine the running temperature is 91 degrees, equivalent to a GTX275.
The only real problem is the price. Later I will report the time and running temperature of the GTX470.

gdf
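The 60%/35%/15% figures in this post are roughly self-consistent; a quick sketch (all three numbers are taken from the post, the division is the only step added):

```python
# Sanity check of the post's figures: the GTX480 is 60% faster than a GTX275
# when both run CUDA 3 builds, and CUDA 3 builds are ~15% slower than
# CUDA 2.2 builds. How does the GTX480 compare to the faster CUDA 2.2 build?
speedup_vs_cuda3 = 1.60    # GTX480 (cuda3) vs GTX275 (cuda3), from the post
cuda22_advantage = 1.15    # CUDA 2.2 build vs CUDA 3 build, from the post

speedup_vs_cuda22 = speedup_vs_cuda3 / cuda22_advantage
print(round(speedup_vs_cuda22, 2))  # ~1.39, in line with the quoted 35%
```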

Profile skgiven
Message 16585 - Posted: 27 Apr 2010 | 12:17:22 UTC - in response to Message 16583.

Boinc Messages:
NVIDIA GPU 0: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 307 GFLOPS peak)

Boinc Tasks:
ACEMD – GPU molecular dynamics 6.03 (cuda) p45-IBUCH_101b_pYEEI...

Would this task run faster if I had CUDA version 2.2, or does it just depend on what you compile it with (2.2)?

Thanks,

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 16586 - Posted: 27 Apr 2010 | 13:20:49 UTC - in response to Message 16585.

Yes, it's what it's compiled with.


____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Profile Zydor
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Message 16589 - Posted: 27 Apr 2010 | 19:41:43 UTC - in response to Message 16583.

.........The only real problem is the price.......
gdf


That's a big problem - especially bearing in mind that a 295 - taking up the same card space in a machine - takes out a 480. Why should the world at large pay that premium for a hot & noisy 480? I think NVidia have the pricing wrong given the performance level.

It's bad enough trying to find a 480 supplier; the other problem at present is that no one with a 295 wants to let it go for a Fermi, as they get no performance boost, and 295s are now EOL.

There is no way the market in general will be pushed around like that - it will just go ATI. They are pushing once-loyal consumers over the edge.

Regards
Zy

Profile skgiven
Message 16591 - Posted: 27 Apr 2010 | 20:43:31 UTC - in response to Message 16589.

WHEN Fermi IS optimized by GPUGrid it WILL outperform the GTX 295!
Presently the highly optimized GTX295 just about outperforms a totally non-optimized Fermi, which is, by the way, not yet the finished article.
There will be a CUDA 3000 & driver update.
There will be a GPUGrid refinement for Fermi,
and there will be other Fermis.

By the way, there is little price difference between the GTX480 and the GTX295, and both are as hard to find.
That said, NVidia's internal implosion of a strategy has recently been completely and utterly bonkers!

PS. Your RAC is still at Zero and yet you still keep banging on. Have you no shame?

Profile Zydor
Message 16594 - Posted: 28 Apr 2010 | 0:35:00 UTC - in response to Message 16591.

Enough ..................

I will express my view on a project I love whether you like it or not.

I will crunch on the project when I can get a card that will perform to the cost-performance ratio that I can afford. I have moved to an ATI card due to reasons connected with my work, and they have nothing to do with you.

I want to restart the crunching that I did for over a year for a very personal reason.

I will express my views - politely - on a card that I am interested in buying if it performs the way I want it to perform.

You don't like it - tough - and I'll thank you to keep your crass opinions of my "shame" to yourself, I am not interested ........................

After 35 years of software development running teams the size of which you could only dream about, it's about time you zipped it. You don't own GPU Grid, so keep your personal opinions of others to yourself.

If that earns me a ban - fine - because that was the most appalling, crass and arrogant comment I have seen on a BOINC board for a very long time.

GOODBYE............................

Profile liveonc
Message 16602 - Posted: 28 Apr 2010 | 12:32:29 UTC
Last modified: 28 Apr 2010 | 13:09:42 UTC

Wow! Maybe he's going to f@h. BTW I had to go there too with my 8800GT 256MB, which didn't have enough RAM for GPUGRID.net; the 9400 IGP likes GPUGRID.net though, 3 days a week ;-)

f@h took over the PS3s, ATI runs better there, & if I was miffed with GPUGRID.net I'd go there too. But Nvidia is the way & Fermi is the future of Nvidia. That GPUGRID.net is 8th in combined credits & 5th in RAC, & that f@h gets more work done than the combined BOINC community, says something too. Is the future bright, or very bleak?
____________

Profile GDF
Message 16604 - Posted: 28 Apr 2010 | 12:46:58 UTC - in response to Message 16602.

The future will be brighter when ATI also works nicely.

gdf

Profile GDF
Message 16605 - Posted: 28 Apr 2010 | 13:21:36 UTC - in response to Message 16604.

Performance for a reference molecular system (DHFR):
GTX275 9.0 ms/step (cuda3.0)
GTX275 8.5 ms/step (cuda2.2)
GTX470 7.1 ms/step (cuda3.0)
GTX480 5.9 ms/step (cuda3.0)

Running temperature of about 92 degrees for all of them.

gdf
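The relative speedups follow directly from those ms/step figures (smaller ms/step is faster; all four numbers are from the post):

```python
# ms/step for the DHFR reference system, as reported in the post.
ms_per_step = {
    "GTX275 (cuda3.0)": 9.0,
    "GTX275 (cuda2.2)": 8.5,
    "GTX470 (cuda3.0)": 7.1,
    "GTX480 (cuda3.0)": 5.9,
}

# Use the fastest GTX275 build as the baseline and print each speedup.
baseline = ms_per_step["GTX275 (cuda2.2)"]
for card, ms in ms_per_step.items():
    print(f"{card}: {baseline / ms:.2f}x vs GTX275 (cuda2.2)")
# e.g. the GTX480 works out to 8.5 / 5.9, roughly 1.44x
```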

Profile Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 16609 - Posted: 28 Apr 2010 | 16:39:21 UTC - in response to Message 16591.

PS. Your RAC is still at Zero and yet you still keep banging on. Have you no shame?

A RAC of zero only means that the participant has not been able to work on a particular project for some time. I was, I think, number 8 or 9 on the project some time back ... but I am falling because I am slowly replacing my Nvidia cards with ATIs because of the price-performance and the power drawn for said same ... when I replace my last three Nvidia cards my RAC also will fall to zero ... not because I don't believe in the project, but because the project is not supporting me ...

Now, I don't think Zydor blames GPU Grid, and neither do I, for not having an ATI application already ... but ... he did do 1M CS and that is nothing to disrespect. And, if and when GPU Grid does have a working ATI application, I am sure that, if he can, he, like I, will be putting GPU Grid back in our project rotations ...

The point Zy was trying to make is that even after optimization it is unlikely that the speed improvement is going to be high enough to justify the cost of the card or the power it draws. Perhaps he did not say it artfully, perhaps you misunderstood ... but just because someone has a RAC of zero does not mean that we should not give them a fair hearing ...

More sharply to the point, there does not even seem to be a doubling of performance between the GTX295 and its putative replacement. Generally, one expects that the next generation GPU doubles or triples the performance of the prior generation of cards. For example, the preliminary numbers I have for the 4870 vs 5870 cards are that the 5870s are twice as productive, much quieter, and don't draw that much more power ...

Profile GDF
Message 16610 - Posted: 28 Apr 2010 | 17:06:19 UTC - in response to Message 16609.

Sorry guys, we are still testing.
Hopefully tomorrow we can get out the new beta.

gdf

Profile liveonc
Message 16612 - Posted: 28 Apr 2010 | 18:10:20 UTC

Fermi's still too rich for me; this is just a hobby for me. I don't plan to use any newly developed anti-psychotic medicine. But others might, & God knows how many (drug-)experimenting youths, war vets, & traumatized civilians in war zones will need them, on top of everybody else that's sick or stressed.

But apart from not gaining enough performance/price out of Fermi, it's great work on the part of GPUGRID.net, BOINC, & Nvidia to get as much out of the GPUs as possible.

I can't prove this, apart from that I got much more work done on my 8800GT 256MB on Linux than on Windows, but I'm about to take the two GTX260s in my Windows PC, which has a RAC of 46-59K, & switch them over to the Linux PC using one GTX260, which has a RAC of 27-29K (and which also stalls due to lacking system RAM - 2GB, which I will also swap with the 4GB of the Windows PC). In 2 weeks to a month, I'll know for sure whether it is indeed true that you get much more done on Linux vs Windows.

Someone told me that this was not the case, & that GPUGRID.net would get as much work done on Windows or Linux, but I beg to differ & will soon know if it's true. If this is the case, people should know - if it's credits they want & GPUGRID.net wants WUs completed faster.
____________

dajeepster
Joined: 22 Nov 09
Posts: 1
Credit: 130,920,129
RAC: 0
Message 16613 - Posted: 28 Apr 2010 | 18:26:58 UTC - in response to Message 16610.

my GTX480 is itching to go... it's hungry for some wu with a little chianti :D

Snow Crash
Message 16615 - Posted: 28 Apr 2010 | 21:34:05 UTC - in response to Message 16610.

Hopefully tomorrow we can get out the new beta.

gdf

We will be here to support you whenever you are ready!

____________
Thanks - Steve

samsausage
Joined: 18 Nov 08
Posts: 12
Credit: 70,480,919
RAC: 0
Message 16616 - Posted: 28 Apr 2010 | 23:28:44 UTC

Quote:
More sharply to the point there does not even seem to be a doubling of performance between the GTX295 and its putative replacement. Generally, one expects that the next generation GPU doubles or triples the performance of the prior generation of cards. For example, the preliminary numbers I have for the 4870 vs 5870 cards I have is that the 5870s are twice as productive, much quieter, and don't draw that much more power ...




Just to point out that the GTX295 is a two-GPU card and the GTX480 is a single-GPU card. Therefore, on a GPU-to-GPU comparison the performance did increase by quite a bit, and the drivers/software are not yet optimized for the 480, whereas the 200-series cards have had a long time to get optimized.

Profile Paul D. Buck
Message 16618 - Posted: 29 Apr 2010 | 5:02:54 UTC - in response to Message 16616.


More sharply to the point there does not even seem to be a doubling of performance between the GTX295 and its putative replacement. Generally, one expects that the next generation GPU doubles or triples the performance of the prior generation of cards. For example, the preliminary numbers I have for the 4870 vs 5870 cards I have is that the 5870s are twice as productive, much quieter, and don't draw that much more power ...


Just to point out the GTX295 is a 2 GPU card and the GTX480 is a Single GPU Card. Therefore on a GPU to GPU comparison the performance did increase by quiet a bit and the drivers/software is not optimized for the 480 yet where the 200 series cards have had a long time to get optimized.

This is correct, but misleading too, in that the 4870 and 5870 are also single-GPU cards ... based on the numbers that I recall people telling me about the Fermi, it either barely matches or lags behind the 5870, and that is not the best from ATI ... and it is far older ...

Generally speaking, when Nvidia comes out with a card that is supposed to compete with ATI (and the converse), one expects that the new card will have 1.5 to 3 times the performance of the rival's top-of-the-line card ... that is not the case here ... there is a valid point, almost, that the compiler is not ready, but that misses the point that it SHOULD have been ready and working and stable ... and it isn't ...

Yes, the ATI world has issues - though, as GDF noted in another thread tonight, they may have figured out where the issue is ... but part of that is the learning curve of coming up with an understanding of the tools and architecture of a different system ... they know CUDA, now they have to learn Stream ... the same, but different ... we know it can be done, witness MW, Collatz and now DNETC ... soon GPU Grid? We can but hope ...

samsausage
Message 16622 - Posted: 29 Apr 2010 | 12:07:22 UTC - in response to Message 16618.


part of that is the learning curve of coming up with the understanding of the tools and architecture


Exactly what is the case with Fermi: it has brought major changes to the architecture and to how data is processed.

Profile GDF
Message 16627 - Posted: 29 Apr 2010 | 15:01:25 UTC - in response to Message 16622.

acemdbeta6.23 (cuda3) for Windows should work on Fermi.
We could not test it on Windows, as the cards are in Linux machines for now.

gdf

Profile Paul D. Buck
Message 16628 - Posted: 29 Apr 2010 | 15:01:45 UTC - in response to Message 16622.

part of that is the learning curve of coming up with the understanding of the tools and architecture


Exactly what is the case with Fermi, it has brought major changes to the architecture and how data is processed

Except Fermi should have been backwards compatible, and does not appear to be so ...

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 16630 - Posted: 29 Apr 2010 | 15:54:18 UTC

Processed two beta WU with gtx480 = OK

# 9 - 8 min 0 secs
# 10 - 7 min 29 secs


____________
Ton (ftpd) Netherlands

Snow Crash
Message 16631 - Posted: 29 Apr 2010 | 16:49:19 UTC

ftpd - looks good! That is on WinXP, correct?
I think my card downclocked (clockrate reports as 0.81) and is not waking up properly, as I am throwing errors on all WUs.
It will be another 4 hours before I get home and get things straightened out.
I will post back as soon as I can.
____________
Thanks - Steve

ftpd
Message 16632 - Posted: 29 Apr 2010 | 17:21:01 UTC

Processed another two beta WUs with the GTX480 on Windows XP Pro, 6.10.50 (boinc-mgr).

# 16 - 7 min 0 secs
# 50 - 7 min 51 secs

Eight other jobs were being processed at the same time!

I have no remote control of my GTX470, so Monday.


____________
Ton (ftpd) Netherlands

Profile GDF
Message 16634 - Posted: 29 Apr 2010 | 18:43:37 UTC - in response to Message 16632.

We have added a way for you to control the application's CPU usage.
If you set the environment variable in Windows:

SWAN_SYNC=0

then it will use a full CPU core and be around 20-30% faster.
Unset it and it goes back to the standard low CPU usage.

gdf
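On the application side, a switch like this is typically just an environment-variable lookup at startup. A minimal hypothetical sketch (the real ACEMD logic is not public; only the variable name SWAN_SYNC and the behavior described in the post are taken from it):

```python
import os

def use_busy_wait_sync() -> bool:
    """Return True if SWAN_SYNC is set, i.e. the app should dedicate a full
    CPU core to polling the GPU instead of using low-CPU blocking sync."""
    return os.environ.get("SWAN_SYNC") is not None

# Simulate setting the variable as described in the post.
os.environ["SWAN_SYNC"] = "0"
print(use_busy_wait_sync())  # True -> full-core polling, ~20-30% faster per the post
```

Note the presence of the variable is what matters in this sketch, matching the post's "unset it and it goes back" behavior.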

=[PULSAR]=
Joined: 22 Feb 10
Posts: 9
Credit: 16,172,951
RAC: 0
Message 16636 - Posted: 29 Apr 2010 | 18:58:39 UTC
Last modified: 29 Apr 2010 | 19:02:12 UTC

So, to try out the new beta for CUDA 3.0, all we have to do is select ACEMD beta in the GPUGRID preferences?

Profile GDF
Message 16637 - Posted: 29 Apr 2010 | 19:13:23 UTC - in response to Message 16636.
Last modified: 29 Apr 2010 | 19:13:56 UTC

Yes, and you need to have a CUDA 3 driver.

=[PULSAR]=
Message 16639 - Posted: 29 Apr 2010 | 19:38:34 UTC

Just gave it a shot on Win7 64-bit and got an instant computation error; that was with the 197.41 drivers. I tried to install the dev CUDA 3.0 drivers, version 197.13, and it said I had no supported hardware. This is on an EVGA GTX480.

ExtraTerrestrial Apes
Message 16640 - Posted: 29 Apr 2010 | 19:49:15 UTC

Generally speaking, when Nvidia comes out with a card that is supposed to compete with ATI (and the converse) one expects that the new card will have 1.5 to 3 times the performance of the rival;s top-of-the-line card...


Assuming this would turn practically every card into a failure. Consider this:

- GP-GPU-wise it's still a little wild west, but gaming-wise both designs are quite mature - they're using their transistors in an efficient way, there's no easy way to dramatically improve this, and the low-hanging fruit has all been eaten by now

- both use the same process at the same manufacturer and thus get similar power efficiency at the transistor level

Apart from special new features, the only way left to (greatly) improve performance is by using more transistors. Asking for an advantage of a factor of 3 would require about 3 times as many transistors. And since such a chip could likely not be clocked as high (longer signal paths for cross-chip communication, larger scatter of transistor performance across the chip), you'd probably need anywhere from 3 to 4 times as many transistors. Fine - but you can't use an infinite number of them. Power consumption goes up roughly linearly with transistor count, and you're currently capped at a maximum of 300 W. And you have to be able to manufacture the huge chip at all.. :p

As you can see, nVidia is already pushing it quite hard (some say too hard) at 3.2 billion transistors. Yet at best they could expect about 50% higher performance than Cypress (not talking about fancy architecture tricks or new features).

So you can't really blame them for not delivering 1.5 times the performance of the best ATI chip - actually they failed because they tried to do so!
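The scaling argument can be made concrete with rough numbers (the 300 W cap is from the post; the ~250 W baseline board power is an assumed round figure for a GTX480-class card, not from the post):

```python
# Rough sketch of the scaling argument: if power grows roughly linearly with
# transistor count, a much bigger chip quickly blows through the power cap.
baseline_power_w = 250.0   # assumed round figure for a GTX480-class board
power_cap_w = 300.0        # maximum board power cited in the post

for factor in (1.1, 1.5, 3.0):
    scaled_power = baseline_power_w * factor
    print(f"{factor}x transistors -> ~{scaled_power:.0f} W, "
          f"under the cap: {scaled_power <= power_cap_w}")
# Only very modest scaling stays under the 300 W limit.
```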

Except Fermi should have been backwards compatible, and does not appear to be so ...


It does run general CUDA code just fine (if it doesn't, it's probably a driver bug). However, it doesn't run the code hand-crafted for the GT200 - because it's a different chip.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Message 16641 - Posted: 29 Apr 2010 | 20:23:52 UTC - in response to Message 16640.

http://www.gpugrid.net/show_host_detail.php?hostid=63357
http://www.gpugrid.net/results.php?hostid=63357

The betas are doing well on a GTX470. Tried shutting down BOINC and restarting; the task started at the beginning (but I would expect that on such short tasks).
GPU at 54% usage (EVGA Precision; might not tell you much, but suggests potential).
GPU temp at 81 deg C.
http://www.techpowerup.com/gpuz/79dud/

ftpd
Message 16642 - Posted: 29 Apr 2010 | 20:46:30 UTC - in response to Message 16634.

Where do I have to change that in Windows XP Pro?
____________
Ton (ftpd) Netherlands

Profile nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,308,230,581
RAC: 0
Message 16643 - Posted: 29 Apr 2010 | 20:58:47 UTC - in response to Message 16642.
Last modified: 29 Apr 2010 | 21:23:50 UTC

My Computer/right click -> Manage -> Computer Management (Local)/right click -> Properties -> Advanced -> Environment Variables -> System variables -> New -> add


Should it work on 6.22 + a GTX260? It seems not to work: CPU load is 0-3% on a 4-CPU Xeon for both values 0 and 1.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16647 - Posted: 29 Apr 2010 | 22:03:57 UTC - in response to Message 16643.
Last modified: 29 Apr 2010 | 22:26:38 UTC

Tried another close & open Boinc after 13min run time of a task. It picked up at 12min and completed OK.
Also tried Suspend & Resume; the tasks started from zero, again.

PS. The SWAN_SYNC=0 variable works on Boinc 6.10.43 W7 Pro, after closing and reopening Boinc.
Excellent, Thank You!

Already starting to see the difference per task.

how to set SWAN_SYNC=0 on win7...
Just enter SWAN_SYNC=0 twice ;)

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16648 - Posted: 29 Apr 2010 | 22:28:50 UTC - in response to Message 16647.

Your card seems to be running at lower clocks than it should.
gdf

Tried another close & open Boinc after 13min run time of a task. It picked up at 12min and completed OK.
Also tried Suspend & Resume; the tasks started from zero, again.

PS. The SWAN_SYNC=0 variable works on Boinc 6.10.43 W7 Pro, after closing and reopening Boinc.
Excellent, Thank You!

Already starting to see the difference per task.

how to set SWAN_SYNC=0 on win7...
Just enter SWAN_SYNC=0 twice ;)

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16650 - Posted: 29 Apr 2010 | 22:56:11 UTC - in response to Message 16648.
Last modified: 29 Apr 2010 | 23:04:38 UTC

It is running at stock GTX470 speeds!
Perhaps W7 64 is a bit slow?
Now only running 6 CPU Boinc tasks (mostly WCG), using 88% CPU. CPU @ 3.3GHz:

This task is much faster :)

It will be left running, as is; to only pick up betas tomorrow (30/04/10), but I may not be around much.

Good Luck,

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16651 - Posted: 29 Apr 2010 | 23:06:01 UTC - in response to Message 16650.

It is running at stock GTX470 speeds!
Perhaps W7 64 is a bit slow?
Now only running 6 CPU Boinc tasks (mostly WCG), using 88% CPU. CPU @ 3.3GHz:

This task is much faster :)

It will be left running, as is; to only pick up betas tomorrow (30/04/10), but I may not be around much.

Good Luck,


If you look at the output of ACEMD here:
http://www.gpugrid.net/result.php?resultid=2243388
Your clock seems to be at only 0.8 GHz, and indeed the time per step is high.

gdf

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16652 - Posted: 29 Apr 2010 | 23:08:30 UTC - in response to Message 16647.

So do we add SWAN_SYNC=0 in both the User and System?

Tried another close & open Boinc after 13min run time of a task. It picked up at 12min and completed OK.
Also tried Suspend & Resume; the tasks started from zero, again.

PS. The SWAN_SYNC=0 variable works on Boinc 6.10.43 W7 Pro, after closing and reopening Boinc.
Excellent, Thank You!

Already starting to see the difference per task.

how to set SWAN_SYNC=0 on win7...
Just enter SWAN_SYNC=0 twice ;)

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16654 - Posted: 29 Apr 2010 | 23:34:35 UTC - in response to Message 16651.

yes, core count says 112

upgraded to latest Boinc Beta, but says,
30/04/2010 00:20:42 NVIDIA GPU 0: GeForce GTX 470 (driver version 19703, CUDA version 3000, compute capability 2.0, 1248MB, 726 GFLOPS peak)

8-GIANNI_TESTDHFR6-2-10-RND0996_0 took 11min 24sec.

will restart,

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16655 - Posted: 29 Apr 2010 | 23:43:56 UTC - in response to Message 16652.

system

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16656 - Posted: 29 Apr 2010 | 23:45:12 UTC - in response to Message 16654.

yes, core count says 112

This is wrong reporting. The problem is this:
# Clock rate: 0.81 GHz


g

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16657 - Posted: 30 Apr 2010 | 0:02:36 UTC - in response to Message 16656.

After the restart Boinc still thinks the clock is 0.81 (6.10.50).

But this task finished in just under 8 min!
Odd.

comfortw
Avatar
Send message
Joined: 28 Oct 08
Posts: 9
Credit: 1,740,304,089
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16660 - Posted: 30 Apr 2010 | 3:23:49 UTC


Hi! please help I can't get beta WU for GTX470, my details are as below

4/30 Starting BOINC client version 6.10.50 for windows_intelx86
4/30 log flags: file_xfer, sched_ops, task
4/30 Libraries: libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
4/30 Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
4/30 Running under account admin
4/30 Processor: 4 AuthenticAMD AMD Phenom(tm) 9650 Quad-Core Processor [Family 16 Model 2 Stepping 3]
4/30 Processor: 512.00 KB cache
4/30 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 syscall nx lm svm sse4a osvw ibs page1gb rdtscp 3dnowext 3dnow
4/30 OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 2, (05.01.2600.00)
4/30 Memory: 3.00 GB physical, 4.84 GB virtual
4/30 Disk: 127.99 GB total, 72.23 GB free
4/30 Local time is UTC +8 hours
4/30 NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)
4/30 NVIDIA GPU 1: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)
4/30 Version change (6.10.45 -> 6.10.50)
4/30 GPUGRID URL http://www.gpugrid.net/; Computer ID 51533; resource share 100
4/30 PrimeGrid URL http://www.primegrid.com/; Computer ID 124572; resource share 100
4/30 GPUGRID General prefs: from GPUGRID (last modified 13-Apr-2010 12:34:07)
4/30 GPUGRID Computer location: home
4/30 GPUGRID General prefs: no separate prefs for home; using your defaults
4/30 Reading preferences override file
4/30 Preferences:
4/30 max memory usage when active: 2763.38MB
4/30 max memory usage when idle: 2763.38MB
4/30 max disk usage: 10.00GB
4/30 max download rate: 1024000 bytes/sec
4/30 max upload rate: 1024000 bytes/sec
4/30 (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
4/30 Not using a proxy
4/30 Running CPU benchmarks
4/30 Suspending computation - running CPU benchmarks
4/30 Benchmark results:
4/30 Number of CPUs: 4
4/30 2296 floating point MIPS (Whetstone) per CPU
4/30 4740 integer MIPS (Dhrystone) per CPU
4/30 PrimeGrid Restarting task llr_sob_47464748_0 using llrSOB version 511
4/30 GPUGRID Sending scheduler request: To fetch work.
4/30 GPUGRID Requesting new tasks for GPU
4/30 GPUGRID Scheduler request completed: got 0 new tasks
4/30 GPUGRID Message from server: No work sent
4/30 GPUGRID Message from server: No work is available for ACEMD beta version

same problem on ealier beta version 6.10.45.

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16661 - Posted: 30 Apr 2010 | 5:04:08 UTC - in response to Message 16660.

Hi! please help I can't get beta WU for GTX470, ....

Beta work is always hard to come by ... they only issue a few tasks to validate the code ... then make changes and repeat ...

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16666 - Posted: 30 Apr 2010 | 8:46:41 UTC - in response to Message 16657.

After the restart Boinc still thinks the clock is 0.81 (6.10.50).

But this task finished in just under 8 min!
Odd.



If you use SWAN_SYNC=0 (if you should notice the use of a full CPU), then
# Time per step (avg over 30000 steps): 15.143 ms
should be around 7.1 ms
or at least that's what it is in Linux.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16667 - Posted: 30 Apr 2010 | 8:50:47 UTC - in response to Message 16656.

yes, core count says 112

This is wrong reporting. The problem is this:
# Clock rate: 0.81 GHz

g


So why the 0.81 clock rate?

30/04/2010 09:21:34 NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1248MB, 726 GFLOPS peak)
GPUZ says 1215 or 1220MHz shaders,
http://www.techpowerup.com/gpuz/79dud/

http://www.gpugrid.net/results.php?hostid=63357
http://www.gpugrid.net/result.php?resultid=2244231

PS. EVGA Precision (Jan 2010) mixes the core & RAM readings up and can't change timings:
core 1674, shader 810 (correct), RAM 405.

My card is Lazy!

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16679 - Posted: 30 Apr 2010 | 14:43:10 UTC - in response to Message 16667.

PS. EVGA Precision (Jan 2010) mixes the core & RAM readings up and can't change timings:
core 1674, shader 810 (correct), RAM 405.

My card is Lazy!

Try MSI Afterburner, the best GPU tweaking and fan control program I've used. Works better than EVGA precision for me:

MSI Afterburner GPU Control

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16681 - Posted: 30 Apr 2010 | 15:49:37 UTC

Did the beta WUs dry up?
the last one I got was at 30 Apr 2010 5:16:29 UTC.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16683 - Posted: 30 Apr 2010 | 16:06:57 UTC - in response to Message 16681.

These are the last two I picked up today,

2246323 1416677 30 Apr 2010 12:14:27 UTC 30 Apr 2010 13:00:35 UTC Completed and validated 1,684.90 198.56 187.28 280.92 ACEMD beta version v6.22 (cuda)
2246484 1416768 30 Apr 2010 13:12:52 UTC 30 Apr 2010 13:37:52 UTC Completed and validated 731.35 713.28 187.28 280.92 ACEMD beta version v6.23 (cuda30)

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16685 - Posted: 30 Apr 2010 | 17:36:20 UTC

Looks like they just started up again :-)
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16690 - Posted: 30 Apr 2010 | 19:21:39 UTC - in response to Message 16685.

I have been trying to work out why this GTX470 is slower than it should be. (0.81)

Unfortunately I messed this Beta up, when I changed some timings using MSI Afterburner.
2247794 1417527 30 Apr 2010 18:59:09 UTC 30 Apr 2010 19:02:25 UTC Error while computing 77.60 71.07 187.28 --- ACEMD beta version v6.23 (cuda30)

I did manage to increase the Boinc rating to 945 GFLOPS peak, but for some reason it only lets me turn the shaders up from 810MHz to 1055MHz!

Any ideas?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16696 - Posted: 30 Apr 2010 | 21:38:19 UTC - in response to Message 16690.
Last modified: 30 Apr 2010 | 22:22:57 UTC

The Fermi Beta tasks seem to be working well.

I think some crapware that came with this Asus card might be to blame for the shader clock speeds, though I suspect having Boinc 6.10.18 to begin with did not help either, as it was reporting about 110GFlops peak.
Boinc/GPUGrid now reports the shader clocks correctly, as does GPU-Z, but I did have to use EVGA Precision to set the shaders to the right speed!

Note, however, that the task run time is just the same! So either the shader speed was only a reporting issue and W7 is much slower than XP with Fermis, or I still have a problem;

Recent task
Run time 743.539049
CPU time 714.313
stderr out

<core_client_version>6.10.50</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.30 GHz
# Total amount of global memory: 1309081600 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 14.815 ms
# Approximate elapsed time for entire WU: 740.739 s
called boinc_finish

</stderr_txt>
]]>

earlier task
Run time 740.302343
CPU time 719.539
stderr out

<core_client_version>6.10.50</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1309081600 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 14.753 ms
# Approximate elapsed time for entire WU: 737.672 s
called boinc_finish

</stderr_txt>
]]>

Boinc is still reporting the wrong number of processors and cores. Put 6.10.45 back on, same situation.

Did some task suspends, shutdowns and clocked the GTX470 to that of a GTX480 without any issues.
Saw some speed increase in doing so, but GPU load fell from 65% to 60%
GPU 63 deg C, fan at 73% for above.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16698 - Posted: 30 Apr 2010 | 23:31:08 UTC

I have been trying to get the ACEMD beta to download tasks and it keeps telling me it is unavailable. I've tried this on 6 different occasions over the last 2 days. I hope more ACEMD beta work units become available soon and in greater number.

I've got 2 480 GTX cards ready to crunch!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16699 - Posted: 30 Apr 2010 | 23:43:16 UTC - in response to Message 16698.

Just select to do Beta work only (no other tasks) and try again.
You are picking up other work units that won't work on Fermi.

There are plenty right now!

=[PULSAR]=
Send message
Joined: 22 Feb 10
Posts: 9
Credit: 16,172,951
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16703 - Posted: 1 May 2010 | 4:53:11 UTC

So after doing some math my OC GTX480 is averaging 32-33k PPD what are the chances that will improve in the near future? What is a single GTX295 averaging these days?

comfortw
Avatar
Send message
Joined: 28 Oct 08
Posts: 9
Credit: 1,740,304,089
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16704 - Posted: 1 May 2010 | 6:03:12 UTC - in response to Message 16699.

>Just select to do Beta work Only; no other tasks, and try again.

Just remind also to check "Run test application?" to yes.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16705 - Posted: 1 May 2010 | 7:02:09 UTC

After adding the swan_sync=0 system variable (twice, Windows XP Pro),
CPU use stays on 0.21 CPU + 1 NVIDIA (after restarting Boinc 6.10.50).

Correct?

Last two beta WUs last night finished within 6 min 30 sec, without changing the system variables!
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16706 - Posted: 1 May 2010 | 7:22:38 UTC - in response to Message 16705.

Look at the CPU time reported by your last task compared to earlier ones.

=[PULSAR]=
Send message
Joined: 22 Feb 10
Posts: 9
Credit: 16,172,951
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16707 - Posted: 1 May 2010 | 7:36:10 UTC

How do you change the cpu usage? Right now its at 0.11 cpu

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16708 - Posted: 1 May 2010 | 7:38:06 UTC - in response to Message 16705.

After adding the swan_sync=0 system variable (twice, Windows XP Pro),
CPU use stays on 0.21 CPU + 1 NVIDIA (after restarting Boinc 6.10.50).

Correct?

Last two beta WUs last night finished within 6 min 30 sec, without changing the system variables!


You are almost at the right speed.
# Time per step (avg over 50000 steps): 7.366 ms
It could be that the low priority is making it impossible to get to top speed with this small molecular system, or that Windows is slower than Linux. I will submit something bigger on Monday.

gdf

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16710 - Posted: 1 May 2010 | 8:36:10 UTC - in response to Message 16708.

Last beta WU this morning # 63 = 6 min 25 secs, including the system variable change.

Better now?

Nice weekend!


____________
Ton (ftpd) Netherlands

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16713 - Posted: 1 May 2010 | 9:13:15 UTC

@sk: 400 MHz core and 800 MHz shaders sounds like a power saving mode to me. Weren't there problems before where some cards didn't wake up from 2D power saving mode properly? Not sure which tool would display the correct clock in real time, though.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16717 - Posted: 1 May 2010 | 12:45:01 UTC - in response to Message 16713.
Last modified: 1 May 2010 | 13:34:49 UTC

I guess there is a mixture of problems:
Firstly, you are right; it is defaulting to a reduced speed, but I cannot be sure exactly what is dropping or by how much; the tools are not reliable! I have to use a number of tweaking tools to get it to read and change the clock speeds correctly, which is not easy. Actually, I still have my doubts about any of the readings - it may not actually be doing what it says, or perhaps only partially or intermittently. My return times are still too far out from what they should be.

GPUZ shows that the GPU rates can actually drop more than once. Right down to the following (at present settings),
- GPU core clock - 50.5MHz (normal 608MHz), (OC 707MHz)
- GPU mem clock - 67.5MHz (normal 837MHz), (OC 863MHz times two, I hope)!
- GPU shader clock - 101.0MHz (normal 1215MHz), (OC 1414MHz)

I noticed the memory controller load remains at 9% when the card is doing next to nothing!

The other issues could be to do with the Asus tweaking app that came with it, the OS (Win7 x64), drivers, or Boinc versions. We know Linux is faster, and XP is slightly faster too, so it is hard to say what is going on.
Anyway, when I have time I will put it into an XP system, or a server - perhaps Mon/Tue.
For now I will let it run some Betas, and report back. Perhaps in a few days there will be updates to the tools, drivers and Boinc! Perhaps the drivers already work better on XP than W7.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16719 - Posted: 1 May 2010 | 14:35:28 UTC - in response to Message 16707.

How do you change the cpu usage? Right now its at 0.11 cpu

If you already have the SWAN_SYNC environment variable set to 0 then it should be using a full thread, but BOINC still reports the percentage use just as if you did not have the variable set. To double-check, open Task Manager and take a look ... when I first open it, acemdbeta_* reports a small percentage (0-3%) of use, but if I wait a couple of seconds it will jump up to a full thread being used. My guess is that it is working properly, but BOINC does not bother updating after the task is started (acemdbeta has not yet spun up to a full thread) and Task Manager caches the initial value (like BOINC), but after it refreshes it reports correctly.
____________
Thanks - Steve

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16741 - Posted: 2 May 2010 | 11:36:17 UTC

@sk: BOINC hardly knows what a GPU is. There's no way it could alter its clock speeds and / or power saving settings; it just calls the executable (the GPU-Grid app). The memory clocks are OK, though: multiply them by 4 and you get the "advertising numbers". My HD4870 normally runs the mem at 900 real-MHz = 3600 marketing-MHz. And a memory controller load of 9% might be OK for 67 MHz: the desktop is in a frame buffer in GPU mem and needs to be read / updated according to the display refresh rate. 9% does sound high for such a load, but who knows what else is going on...
To check the clock speeds you could use the artefact tester in one of the nice tools. Set it to display a static image, watch the FPS and adjust the clocks. That should give nice feedback, since it's a windowed application.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16744 - Posted: 2 May 2010 | 13:10:43 UTC - in response to Message 16717.
Last modified: 2 May 2010 | 13:12:11 UTC

We know Linux is faster, and XP is slightly faster too,

SK, XP32 and XP64 are quite a bit faster than either Win 7 32 or 64 for GPUGRID. Nothing has changed in the newer GPUGRID versions. I've been monitoring the times hoping the problem would be fixed. It hasn't been.

The relative speeds are still as described in this thread:

http://www.gpugrid.net/forum_thread.php?id=1729

And before that, here:

http://www.gpugrid.net/forum_thread.php?id=1449

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16746 - Posted: 2 May 2010 | 15:00:58 UTC - in response to Message 16744.

We are going way off topic here but, Beyond, I'll quote you,

when I switched from XP-64 to W7-64, compute time went up from 27k to 30k per WU

OK, so you found an 11% or 12% difference, even with Aero off and W7 largely optimized for crunching GPUGrid tasks. I found less when I last looked, but either way I think it is reasonable to say there is a slight difference between XP and Vista/W7, as I also mentioned a big difference with Linux (35% faster, or something like that) - semantics.
I still think some of this is down to the CPU side of GPUGrid WUs being slower under W7 than XP; after all, it has to make that blue circle go round ;). Perhaps the environment variable (SWAN_SYNC=0) will reduce some of this difference, or an app update could (given that Vista/W7 are not any slower on other GPU projects).

Who knows, when a Fermi app is optimized and released it might not make any difference if we use XP or Vista.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16748 - Posted: 2 May 2010 | 16:49:20 UTC - in response to Message 16704.

>Just select to do Beta work Only; no other tasks, and try again.

Just remind also to check "Run test application?" to yes.


Thank you! I already selected only to run the ACEMD beta but didn't know that "Run test application" had to be switched to yes. That did the trick and I'm now downloading beta work units for my 480 GTX!

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16749 - Posted: 2 May 2010 | 17:42:39 UTC
Last modified: 2 May 2010 | 17:48:48 UTC

I have 2 480 GTX cards now completing beta work units and they seem to be very slow. I'm running W7 x64 and both cards clocked at 775mhz/1000mhz with fan on auto. These GPU clocks are stable after many hours of successful gaming. GPU usage on both cards is around 40%. I added the SWAN_SYNC=0 but so far each work unit is only using 3% of all of my processor (i7 920 with hyperthreading enabled, 3.4ghz overclock). My BOINC options allow for full CPU usage at all times.

I could really appreciate some help to get each work unit to use the entire CPU thread and not just 3%.

http://www.gpugrid.net/result.php?resultid=2259951

<core_client_version>6.10.50</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 1
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
# Device 1: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 24.510 ms
# Approximate elapsed time for entire WU: 1225.485 s
called boinc_finish

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16750 - Posted: 2 May 2010 | 18:30:57 UTC

The tasks on GPUGrid don't use the CPU to do useful work; increasing this percentage will not substantially make the task run faster ... mostly the CPU side is a "helper" thread to load data onto the GPU and remove results ... unlike SaH's ATI application or Einstein's CUDA application, where substantial processing takes place on the CPU side ...
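
That helper-thread pattern can be sketched like this - a toy model with a simulated device, not GPUGRID's actual code: the CPU-side helper only stages data and drains results (cheap copies and bookkeeping), while the "device" does the expensive part.

```python
# Toy model of a GPU "helper" thread: it feeds work to a device and drains
# results, doing no heavy computation itself. The device is simulated here.
import queue
import threading


def device_loop(inbox, outbox):
    """Stands in for the GPU: consumes staged work, produces results."""
    while True:
        item = inbox.get()
        if item is None:            # sentinel: no more work
            break
        outbox.put(item * 2.0)      # the "expensive" part happens here


def helper(steps, inbox):
    """The CPU-side helper: cheap staging only (copies, bookkeeping)."""
    for i in range(steps):
        inbox.put(float(i))
    inbox.put(None)


def run(steps=4):
    inbox, outbox = queue.Queue(), queue.Queue()
    dev = threading.Thread(target=device_loop, args=(inbox, outbox))
    dev.start()
    helper(steps, inbox)
    dev.join()
    return [outbox.get() for _ in range(steps)]
```

`run(4)` returns `[0.0, 2.0, 4.0, 6.0]`: all the arithmetic happened in the "device" thread, which is why raising the helper's CPU share alone doesn't speed the task up.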

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16751 - Posted: 2 May 2010 | 19:16:26 UTC - in response to Message 16750.

Danger30Q, your last task ran in 15min, which is faster than the 18min for the previous Betas. So you made some improvement whatever you did.

I have the same problem with my GTX470. I had to overclock it to get the time down from 1091sec (18min) to about 690sec (11.5min). But this is a bad fix!
MrS suggested my card was defaulting to some power saving mode and the clocks were not going back up correctly. Don’t know if the problem is to do with the drivers or the card, or what, but it is not the fault of Boinc, according to MrS.

Paul, the CPU usage can speed up the project. Hence a card on a board with a 1.6GHz AMD Athlon will not get through as many tasks as a card on a board with an i7-980X.
Also, Vista/W7 slows down the project compared to XP, and Linux is much faster as it appropriates a full core to support the GPUGrid task.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16758 - Posted: 2 May 2010 | 23:32:22 UTC

Does anyone know how to get each work unit to use a full CPU thread? I used SWAN_SYNC=0 on Win7 x64 and my BOINC preferences are set to use 100% of the CPU if needed. I'm still only at 3% CPU usage for each work unit. I think if I can get that to 12% (full thread usage, with 8 total threads on the i7 920) then my work units should finish a bit faster.

tng*
Send message
Joined: 3 May 09
Posts: 8
Credit: 204,386,894
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 16759 - Posted: 3 May 2010 | 0:03:53 UTC - in response to Message 16751.

I have the same problem with my GTX470. I had to overclock it to get the time down from 1091sec (18min) to about 690sec (11.5min). But this is a bad fix!
MrS suggested my card was defaulting to some power saving mode and the clocks were not going back up correctly. Don’t know if the problem is to do with the drivers or the card, or what, but it is not the fault of Boinc, according to MrS.

Paul, the CPU usage can speed up the project. Hence a card on a board with a 1.6GHz AMD Athlon will not get through as many tasks as a card on a board with an i7-980X.
Also, Vista/W7 slows down the project compared to XP, and Linux is much faster as it appropriates a full core to support the GPUGrid task.


Something that I ran across somewhere: this system started out doing beta WUs in ~1200 seconds. There's a setting in the NVIDIA control panel, under 3D Settings>Manage 3D settings, called "Power management mode". Setting this to "Prefer maximum performance" instead of the default "Adaptive" brought times down to below 1000 seconds. SWAN_SYNC=0 brought times down below 900 seconds, mostly around or below 850.

On the principle of "What's worth doing, is worth overdoing.", after seeing the improvement from SWAN_SYNC=0, I tried disabling hyperthreading. The one task that I ran with hyperthreading off did run faster, and with a shorter time per step, but not so much so that I was willing to continue the experiment.

[Question for project staff] Is SWAN_SYNC=0 just continuous polling?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16762 - Posted: 3 May 2010 | 1:19:36 UTC - in response to Message 16759.
Last modified: 3 May 2010 | 2:06:39 UTC

tng, I just put the Fermi to have a go at your suggestion.

Boinc,
03/05/2010 02:15:24 NVIDIA GPU 0: (driver version 19621, CUDA version 3000, compute capability 2.0, 1280MB, 50 GFLOPS peak) :)

Installed driver,
03/05/2010 02:40:34 NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1248MB, 726 GFLOPS peak)

Set option to max performance.

Ran a Fermi Beta,
Still taking about 11min.
3 May 2010 1:37:41 UTC 3 May 2010 1:51:13 UTC Completed and validated 656.59 649.57 187.28 280.92 ACEMD beta version v6.23 (cuda30)
Boinc still reading sleeping clock rates?
http://www.gpugrid.net/result.php?resultid=2262368

Used Asus SmartDoctor to up the engine to 707, then EVGA Precision to up the shaders and then MSI Afterburner to up the fan speed. Restarted Boinc,
03/05/2010 02:53:16 NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1248MB, 1165 GFLOPS peak)

Started another task,
time down to just under 10min.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16763 - Posted: 3 May 2010 | 2:14:44 UTC

Nevermind, I got SWAN_SYNC now entered properly and each work unit now uses an entire thread all for itself. I also changed the Global Settings "Power management mode" to "Prefer maximum performance". That should help in crunching and in gaming.
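[A hedged illustration, not part of the original post: SWAN_SYNC is an ordinary environment variable, so the science app only sees it if it was set system-wide (or in BOINC's environment) before BOINC started. This Python sketch mimics that inheritance check; ACEMD itself is of course not Python, and the helper name is made up for illustration.]

```python
# Illustrative sketch: a child process inherits SWAN_SYNC only if the
# variable was present in the parent's environment at launch time, which
# is why it must be a *system* variable set before BOINC starts.
import os
import subprocess
import sys

def swan_sync_visible(env):
    """Return the SWAN_SYNC value a freshly launched process would see."""
    out = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('SWAN_SYNC', 'unset'))"],
        env=env, capture_output=True, text=True)
    return out.stdout.strip()

# With it set before launch (as a system variable would be), it is inherited:
print(swan_sync_visible({**os.environ, "SWAN_SYNC": "0"}))  # prints "0"
# Without it, the child sees nothing:
print(swan_sync_visible(
    {k: v for k, v in os.environ.items() if k != "SWAN_SYNC"}))  # prints "unset"
```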

tng*
Send message
Joined: 3 May 09
Posts: 8
Credit: 204,386,894
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 16764 - Posted: 3 May 2010 | 2:31:37 UTC - in response to Message 16762.

tng, I just put the Fermi to have a go at your suggestion.

Boinc,
03/05/2010 02:15:24 NVIDIA GPU 0: (driver version 19621, CUDA version 3000, compute capability 2.0, 1280MB, 50 GFLOPS peak) :)

Installed driver,
03/05/2010 02:40:34 NVIDIA GPU 0: GeForce GTX 470 (driver version 19741, CUDA version 3000, compute capability 2.0, 1248MB, 726 GFLOPS peak)

Set option to max performance.

Ran a Fermi Beta,
Still taking about 11min.
3 May 2010 1:37:41 UTC 3 May 2010 1:51:13 UTC Completed and validated 656.59 649.57 187.28 280.92 ACEMD beta version v6.23 (cuda30)
Boinc still reading sleeping clock rates?
http://www.gpugrid.net/result.php?resultid=2262368


A clock rate of 0.81 GHz is standard for a GTX 470. I don't think BOINC reads the clock rate from the card; I think it just IDs the card and reports the clock rate based on that.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16765 - Posted: 3 May 2010 | 8:22:04 UTC

My first beta-test on the GTX470 = OK.
#145 took 6 min 19 sec,
including the system-file changes.
BOINC 6.10.50 says 0.17 CPU,
Task Manager says 1.0 CPU.
Windows XP Pro.

____________
Ton (ftpd) Netherlands

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16766 - Posted: 3 May 2010 | 8:41:17 UTC - in response to Message 16759.



[Question for project staff] Is SWAN_SYNC=0 just continuous polling?

Yes.


gdf
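[Editor's toy sketch, not ACEMD's actual code: "continuous polling" means the CPU thread spins re-checking a completion flag instead of sleeping until signalled. That is why SWAN_SYNC=0 pins a full CPU thread per GPU task, but also why it reacts to kernel completion with minimal latency. The function and flag names here are invented for illustration.]

```python
# Toy comparison of a blocking wait vs a busy-wait (polling) loop.
import threading
import time

def wait_blocking(event, timeout=1.0):
    """Blocking wait: the thread sleeps in the kernel until signalled,
    so it shows ~0% CPU while waiting."""
    return event.wait(timeout)

def wait_polling(flag, timeout=1.0):
    """Busy-wait: the thread spins re-checking the flag, consuming a full
    CPU core, but notices completion almost immediately."""
    deadline = time.monotonic() + timeout
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
        if flag["done"]:
            return iterations
    return iterations

flag = {"done": False}
done_event = threading.Event()

def gpu_kernel_stub():
    time.sleep(0.05)          # stand-in for a GPU kernel finishing
    flag["done"] = True
    done_event.set()

threading.Thread(target=gpu_kernel_stub).start()
spins = wait_polling(flag)
print(f"polling loop iterated {spins} times before the 'kernel' finished")
```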

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16767 - Posted: 3 May 2010 | 8:49:06 UTC

gtx480 now 5 min 21 secs
Time per step (avg over 50000 steps): 6.371 ms
# Approximate elapsed time for entire WU: 318.529 s

Better now?
____________
Ton (ftpd) Netherlands

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16770 - Posted: 3 May 2010 | 9:07:57 UTC - in response to Message 16767.

gtx480 now 5 min 21 secs
Time per step (avg over 50000 steps): 6.371 ms
# Approximate elapsed time for entire WU: 318.529 s

Better now?


Yes. In Linux running standalone it takes 5.8/5.9 ms. So it is almost there now.

gdf
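[Editor's arithmetic, not from the project: the elapsed times quoted in these posts follow directly from the per-step time multiplied by the 50,000 steps in a beta WU, and the Linux figure GDF quotes implies roughly a 9% gap.]

```python
# Sanity check: elapsed WU time = time per step x step count.
steps = 50_000
windows_ms_per_step = 6.371   # GTX 480 figure from the post above
linux_ms_per_step = 5.85      # midpoint of the 5.8/5.9 ms quoted for Linux

windows_s = steps * windows_ms_per_step / 1000
linux_s = steps * linux_ms_per_step / 1000
print(f"Windows: {windows_s:.1f} s")   # 318.5 s, matching the ~318.5 s logged
print(f"Linux:   {linux_s:.1f} s")     # 292.5 s
print(f"Windows overhead: {(windows_s / linux_s - 1) * 100:.0f}%")  # 9%
```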

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16774 - Posted: 3 May 2010 | 11:07:22 UTC - in response to Message 16765.

My first beta-test on gtx470 = OK
# 145 takes 6 min 19 secs
including systemfiles changed
boinc 06.10.50 says 0.17 cpu
taskmanager says 1.0 cpu
windows xp-pro

Your first test on your GTX470 ran well, but the next two dropped to half speed.
My problem is that my card does not come out of half speed, ever. Basically it tries to default to power saving mode no matter what I do. The tools I have must be reading the maximum values, but the app never sees them.
I think the drivers are to blame.

PS. 0.81 is not normal - it is just typical of the wrong values being reported.

Clock rate: 1.30 GHz
This is what I clocked it to, but it is not performing like it should, so it must still be dropping to half that.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16775 - Posted: 3 May 2010 | 11:39:20 UTC

The first one was running "stand-alone".
All other jobs were paused! 8 processors!


____________
Ton (ftpd) Netherlands

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16778 - Posted: 3 May 2010 | 13:42:29 UTC

It's like BOINC is only recognizing the core clock on the cards instead of the shader clock. All of the work units on my GTX 480 cards have reported only 0.8 GHz, which is very close to half of my shader clock of 1550 MHz. BOINC doesn't seem to realize that the Fermi cards run their CUDA cores at the shader clock.
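[For what it's worth, the GFLOPS figures BOINC prints earlier in this thread are consistent with a simple peak estimate of CUDA cores x clock x 2 FLOPs per cycle. This is an inference from the logged numbers, not BOINC's documented formula; the function is an editor's sketch.]

```python
# Inferred from the logs above: BOINC's "GFLOPS peak" looks like
# cores x clock (GHz) x 2 FLOPs/cycle.
def peak_gflops(cuda_cores, clock_ghz, flops_per_cycle=2):
    """Peak single-precision GFLOPS if every core retires
    flops_per_cycle FLOPs per clock."""
    return cuda_cores * clock_ghz * flops_per_cycle

# GTX 470: 448 CUDA cores.
print(round(peak_gflops(448, 0.81)))  # 726, matching the low-clock reading logged
print(round(peak_gflops(448, 1.30)))  # 1165, matching the full shader-clock reading
```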

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16782 - Posted: 3 May 2010 | 14:18:37 UTC - in response to Message 16778.

For these present Fermi Betas,
a GTX480 should complete in about 5min 30sec,
a GTX470 should finish in about 6min 30sec.

My times are almost twice that, as are yours (Danger30Q).
We are both using Win7 64-bit;
ftpd is using XP x86.
There might be something in that - perhaps the Win 7 drivers are not up to the job yet?

We both optimized as well as we could, so I'm going to pull some systems apart and set up a test XP system to try this GTX470 on. I hope there are still betas around when I get it set up :)
- will let you know how I get on.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16783 - Posted: 3 May 2010 | 14:40:16 UTC - in response to Message 16774.

My problem is that my card does not come out of half speed, ever. Basically it tries to default to power saving mode no matter what I do. The tools I have must be reading the maximum values, but the app never sees them.
I think the drivers are to blame.

I seem to remember that you mentioned somewhere that your card was an ASUS. Have you tried uninstalling the ASUS programs that came with the card? They're the worst junk ever. I'm sure you've checked all the usual suspect settings like forcing max performance, yada yada. Another very likely possibility is that the ASUS BIOS is misbehaving; I'd check with them for an update.

Interestingly, I have 3 GT 240 GDDR5/512 cards that supposedly have identical specs, all different brands. The Gigabyte is just slightly faster than the MSI, and the ASUS is much slower than either. I switched them between machines and the ASUS is still always much slower. I've also had 2 ASUS 9600GSO cards: one was very slow and the other had a fan that failed after a few months. I also have an ASUS HD 4770 ATI card, but that runs fine (I have 8 HD 4770 cards from 4 different manufacturers and they all run perfectly). Maybe ASUS only designs well for GPUs that begin with the letter "A" :-)

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16785 - Posted: 3 May 2010 | 15:18:41 UTC

3-5-2010 17:10:39 GPUGRID Sending scheduler request: To fetch work.
3-5-2010 17:10:39 GPUGRID Requesting new tasks for CPU and GPU
3-5-2010 17:10:43 GPUGRID work fetch suspended by user
3-5-2010 17:10:44 GPUGRID Scheduler request completed: got 0 new tasks
3-5-2010 17:10:44 GPUGRID Message from server: No work sent
3-5-2010 17:11:01 GPUGRID work fetch resumed by user
3-5-2010 17:11:20 GPUGRID Sending scheduler request: To fetch work.
3-5-2010 17:11:20 GPUGRID Requesting new tasks for CPU and GPU
3-5-2010 17:11:25 GPUGRID Scheduler request completed: got 0 new tasks
3-5-2010 17:11:25 GPUGRID Message from server: No work sent
3-5-2010 17:12:00 GPUGRID Sending scheduler request: To fetch work.
3-5-2010 17:12:00 GPUGRID Requesting new tasks for GPU
3-5-2010 17:12:05 GPUGRID Scheduler request completed: got 0 new tasks
3-5-2010 17:12:05 GPUGRID Message from server: No work sent
3-5-2010 17:12:40 GPUGRID Sending scheduler request: To fetch work.
3-5-2010 17:12:40 GPUGRID Requesting new tasks for CPU
3-5-2010 17:12:45 GPUGRID Scheduler request completed: got 0 new tasks
3-5-2010 17:12:45 GPUGRID Message from server: No work sent
3-5-2010 17:12:45 GPUGRID Message from server: Fermi-class GPU needed
3-5-2010 17:15:21 GPUGRID Sending scheduler request: To fetch work.
3-5-2010 17:15:21 GPUGRID Requesting new tasks for GPU
3-5-2010 17:15:26 GPUGRID Scheduler request completed: got 0 new tasks
3-5-2010 17:15:26 GPUGRID Message from server: No work sent

No more work for GTX260 on my HOME computer which may crunch all applications?
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16786 - Posted: 3 May 2010 | 16:16:25 UTC - in response to Message 16785.

Double check your home profile!
It looks like your GTX260 host is using your Fermi task profile.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16787 - Posted: 3 May 2010 | 16:51:36 UTC - in response to Message 16786.
Last modified: 3 May 2010 | 17:15:46 UTC

Kev,

(Of course) I did that already. This morning it was working OK, but this afternoon .....

Perhaps the betas ask for a Fermi GPU?
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16790 - Posted: 3 May 2010 | 19:22:10 UTC - in response to Message 16787.
Last modified: 3 May 2010 | 20:10:21 UTC

Sorry Ton, just checking the obvious! If there are no other Betas it would make good sense to check for Fermi.

Tried the GTX470 on XP but any drivers I tried kept blue screening the system. It was an old system so I put it in another W7 system (Will try XP again later).
Installed the latest Beta driver (19775).
Stayed away from the Asus-ware this time.
GPUZ (0.4.2) says GPU 608MHz, RAM 837MHz, Shaders 1215MHz.
Swan_sync set to zero.

Boinc says,
03/05/2010 19:41:17 NVIDIA GPU 0: GeForce GTX 470 (driver version 19775, CUDA version 3000, compute capability 2.0, 1248MB, 181 GFLOPS peak)

Eventually picked up a Beta Fermi task...
17min 37sec to run,
and Clock rate: 0.81 GHz, again!

1,057.33 904.20 187.28 280.92 ACEMD beta version v6.23 (cuda30)

Set Power management to maximum performance, restarted Boinc (6.10.43).

Next task set took 960sec (16min)

Changed settings using EVGA Precision to core 1800, shaders 1300, RAM 650. Still have no faith in these readings.

Restarted Boinc,
03/05/2010 20:03:23 NVIDIA GPU 0: GeForce GTX 470 (driver version 19775, CUDA version 3000, compute capability 2.0, 1248MB, 291 GFLOPS peak)

Ran another Beta. EVGA says 49 to 50% GPU usage; GPU-Z says 50% too!

Task took about 16min 30sec. So, basically the EVGA changes did nothing!

Boinc 6.10.51 says,
03/05/2010 20:40:10 NVIDIA GPU 0: GeForce GTX 470 (driver version 19775, CUDA version 3000, compute capability 2.0, 1248MB, 1165 GFLOPS peak)

Ended up installing the Asus SmartDoctor to try and get things working normally. GPUZ says GPU 707, Mem 906, Shaders 1414

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16792 - Posted: 3 May 2010 | 20:45:30 UTC - in response to Message 16790.

I have Precision 1.9.2 and it reports for my 480 correctly.

When I look at the graphs it shows...
Core = 701
Shader = 1401
Memory = 1850

When I look at the sliders they say
Processor Clock = 1401 (shaders)
Memory Clock = 1850

IIRC you can only download this version from EVGA if you have registered one of their 470s or 480s. I have not seen it floating around the internet yet ... doesn't mean it's not there, I just have not seen it.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16793 - Posted: 3 May 2010 | 21:07:25 UTC - in response to Message 16792.

I'm stuck using 1.9.1 - and it was released in Jan!

It says,
Core Clock 1851 - that sounds like the shaders
Shader Clock 1300 - sounds like RAM
Memory Clock 650 - sounds like Core

A right mess.
For the time being these Asus cards look bad!
I will try it in another XP system tomorrow, and if it's still at half speed it will get RTM'd.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16804 - Posted: 4 May 2010 | 11:47:36 UTC
Last modified: 4 May 2010 | 11:49:01 UTC

<core_client_version>6.10.51</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 6.416 ms
# Approximate elapsed time for entire WU: 320.780 s
called boinc_finish

</stderr_txt>

My first beta WU (#83) using BOINC 6.10.51 with driver 197.75: 5 min 24 sec.

Correct?
____________
Ton (ftpd) Netherlands

eternal_fantasy
Send message
Joined: 3 May 10
Posts: 5
Credit: 5,341,531
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwat
Message 16806 - Posted: 4 May 2010 | 12:05:19 UTC

Hello, just want to post my experience running a GTX 480.

There seem to be very few beta work units for the new Fermi GPUs to crunch... I kept getting "No work is available for ACEMD beta version" after around 20 completed WUs per day, and that is at ~8 minutes per WU.

With unaltered account settings I get 2 WUs downloaded, crunching one with the other standing by. After a few are completed and uploaded, requests for additional WUs are declined even though I have only one WU downloaded and crunching, with none waiting, resulting in down time while the completed WU uploads and another downloads.

Finally, I have noticed that quite frequently WUs from GPUGRID (and only GPUGRID) have trouble downloading to my PC. They get bogged down to around 2 kB/s; I basically have to cancel the download when I see it and request another, but more often than not the request is refused and I have to wait a while whilst my GPU lies idle.

This is my first foray into GPUGRID, and I would like some thoughts on the above issues.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16807 - Posted: 4 May 2010 | 12:09:52 UTC - in response to Message 16806.

I think they are a bit behind in creating beta work units for the Fermi cards. There's not much we can do but wait. I'm only crunching GPUGrid about 8 hours a day because of the lack of beta work units.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16808 - Posted: 4 May 2010 | 12:11:23 UTC

EVGA Precision 1.9.3 is now available for download to the public and works with Fermi.

http://www.evga.com/precision/

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16809 - Posted: 4 May 2010 | 12:45:52 UTC - in response to Message 16808.

Using EVGA Precision 1.9.3,
I get nothing for Core Clock in the right-hand pane - no reading at all - but I can change it a bit using ASUS Smart Doctor and see it in the left-hand pane.

I can move the shaders and the memory clock using EVGA Precision (not 100% sure what they should be)!

What are your settings at?

My GTX470 core is 704 (default is 608),
memory is 1751 (915 to 1945 range),
shaders are 1407 (1675 to 2680 range).

Thanks,

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16810 - Posted: 4 May 2010 | 12:46:19 UTC - in response to Message 16808.

Just posting because I'd love to see some beta WUs available so I can test my new 2x470 setup (Linux app).

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16811 - Posted: 4 May 2010 | 13:29:48 UTC

<core_client_version>6.10.51</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.21 GHz
# Total amount of global memory: 1341718528 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 50000 steps): 7.576 ms
# Approximate elapsed time for entire WU: 378.781 s
called boinc_finish

</stderr_txt>

My first beta WU with the GTX470, using 6.10.51 and driver 197.75, took 6 min 25 sec.
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16812 - Posted: 4 May 2010 | 13:34:04 UTC - in response to Message 16811.

Thanks Ton,

My Setup and poor results:

W7 64bit with GTX470
Boinc 6.10.51, Driver 197.75
Core Clock 703MHz (ASUS Smart Doctor + GPU-Z + EVGA Precision 1.9.3)
Shader Clock 1406MHz (ASUS Smart Doctor + GPU-Z + EVGA Precision 1.9.3)
RAM:
3348MHz according to ASUS Smart Doctor,
1806MHz going by EVGA Precision 1.9.3,
903MHz going by GPU-Z 0.4.2
No Other Boinc Tasks (just GPUGrid Fermi Betas)
Swan_Sync=0
NVidia Control panel,
Power Management Mode set to Prefer Maximum Performance
Last 3 Tasks (at the above settings),
2275201 1436286 4 May 2010 12:56:57 UTC 4 May 2010 13:18:32 UTC Completed and validated 915.99 814.70 187.28 280.92 ACEMD beta version v6.23 (cuda30)
2275143 1436248 4 May 2010 12:37:16 UTC 4 May 2010 13:03:03 UTC Completed and validated 914.72 814.14 187.28 280.92 ACEMD beta version v6.23 (cuda30)
2275127 1436237 4 May 2010 12:29:29 UTC 4 May 2010 12:45:17 UTC Completed and validated 888.60 812.34 187.28 280.92 ACEMD beta version v6.23 (cuda30)

So my tasks are still taking about 15min 30sec rather than 6min 25sec.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16813 - Posted: 4 May 2010 | 14:06:04 UTC - in response to Message 16812.

Did you set PhysX (NVIDIA) to disabled? That is the correct setting.
Also set system performance to high speed!

Good luck!
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16816 - Posted: 4 May 2010 | 15:36:45 UTC - in response to Message 16813.
Last modified: 4 May 2010 | 16:25:11 UTC

PhysX was on, within the NVIDIA control panel.

When I switched it off it made no difference:

2275632 1436530 4 May 2010 14:52:38 UTC 4 May 2010 15:12:24 UTC Completed and validated 901.78 816.36 187.28 280.92 ACEMD beta version v6.23 (cuda30)
2275578 1436458 4 May 2010 14:38:53 UTC 4 May 2010 14:57:15 UTC Completed and validated 900.44 815.89 187.28 280.92 ACEMD beta version v6.23 (cuda30)

Going to try nTune!

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16818 - Posted: 4 May 2010 | 17:07:13 UTC - in response to Message 16816.

I'm trying to determine why I can't seem to pull any WUs on my 470 set up. I have verified that my gpugrid settings are such that I am only pulling beta work units, but I am still not able to get anything. Others on my team are pulling WUs, but mine is not getting anything. Is this because I am using Linux?

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16822 - Posted: 4 May 2010 | 19:52:46 UTC

Would running the GTX 480s in SLI with Windows 7 x64 be the reason both cards show only 40% GPU usage? I've verified with both EVGA Precision 1.9.3 and GPU-Z that my overclocked settings of 1550 MHz shaders and 2000 MHz memory are correct and holding while running GPUgrid.

I'll try turning PhysX off but I don't see why that would matter.

I'll also run some Vantage and game benchmarks to see if I'm getting the results I should be. I haven't done any benchmarks with these new 480s but playing Battlefield Bad Company 2 has seemed fine.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16823 - Posted: 4 May 2010 | 20:06:16 UTC - in response to Message 16822.

Has anyone managed to get a GTX480 to crunch a Beta in 5 min 30 sec (or thereabouts) when running Windows 7 or Vista?

Ditto for a GTX470 (just 6min 25sec)?

We know people have them working on XP at these speeds, and on Linux (in the labs).


Thanks,

samsausage
Send message
Joined: 18 Nov 08
Posts: 12
Credit: 70,480,919
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16824 - Posted: 4 May 2010 | 21:51:21 UTC

Same problem here with Win 7 64-bit and one 480 card: it's taking about 9 minutes to do one WU, and that is mildly overclocked to 1600 shaders.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16827 - Posted: 4 May 2010 | 23:02:18 UTC - in response to Message 16824.
Last modified: 4 May 2010 | 23:03:05 UTC

Thanks,
The 1600 shaders are a reasonable overclock for a GTX480, at this stage.
I would hope they can go above 1650, but who knows what live tasks will bring?

Your times are about 1.6 times that of the best reported Win XP task. I'm getting similar results, albeit on a GTX470. So perhaps the issue is with the Vista/W7 drivers? Either that or someone is not telling us something!

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16828 - Posted: 5 May 2010 | 10:49:53 UTC - in response to Message 16827.

The beta has been promoted to the application acemd6.73 cuda3.
gdf

eternal_fantasy
Send message
Joined: 3 May 10
Posts: 5
Credit: 5,341,531
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwat
Message 16829 - Posted: 5 May 2010 | 11:15:51 UTC - in response to Message 16828.

The beta has been promoted to the application acemd6.73 cuda3.
gdf

Does that mean that we can now accept non-beta work units for our Fermi GPUs?

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16830 - Posted: 5 May 2010 | 13:06:40 UTC - in response to Message 16829.

I only seem to be able to pull ACEMD - GPU molecular dynamics v6.04 (cuda). What settings do I need to enable or set to pull the new version? And is that version number the same on Linux and Windows?

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16834 - Posted: 5 May 2010 | 16:16:16 UTC - in response to Message 16830.
Last modified: 5 May 2010 | 16:19:20 UTC

GDF ... this is frustrating: we process the betas, you say you have released it to production, but then you don't provide directions on what we need to set in our preferences to get the new version.

Is it a Linux version, a Windows version, both???

I'll start guessing on my Win7 x64:

The ACEMD ver2 setting gave me an old 6.03, which crashed as would be expected on a 480 :-(

Now trying ACEMD and we will see what happens.

I am at work without a remote connection to my PC with the 480, but apparently I am no longer getting work, so I can't tell what is going on.
____________
Thanks - Steve

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16835 - Posted: 5 May 2010 | 16:54:49 UTC - in response to Message 16834.

There were no workunits in the acemd queue. We are now submitting many.
These are for Fermi only.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16836 - Posted: 5 May 2010 | 17:58:48 UTC - in response to Message 16835.
Last modified: 5 May 2010 | 18:16:11 UTC

Built another XP system x86 sp3,
Installed the GTX470 with the latest Beta (19775)
Installed Boinc and attached to GPUGrid:
05/05/2010 18:28:25 NVIDIA GPU 0: GeForce GTX 470 (driver version 19775, CUDA version 3000, compute capability 2.0, 1280MB, 1089 GFLOPS peak)

Configured the system to optimize for GPUGrid tasks (not attached to any other project, using auto report and set the Swan_Sync system variable to zero).

Ran a Beta,
2282519 1440796 5 May 2010 17:24:02 UTC 5 May 2010 17:34:42 UTC Completed and validated 382.67 326.08 187.28 280.92 ACEMD beta version v6.23 (cuda30)
Closed Boinc during the run and reopened it; the task still managed to finish in 6 min 37 sec.
This is about the expected time (no overclocking of any kind).

Now running a 6.73 task,
2282490 1440606 5 May 2010 17:20:07 UTC 10 May 2010 17:20:07 UTC In progress --- --- --- --- Full-atom molecular dynamics v6.73 (cuda30)

This long task looks OK so far: about 30 min in and 9% complete, so expected to finish in roughly 6 h. GPU usage is presently about 75%. I have it set to use one full CPU core. Temps were 91 deg C, so I turned the fan up to 79% and the temps dropped to 82 deg C. At that speed it is not quite as loud as my GTX260! The clock rates are staying true. Looks like the problems I was having were all down to using W7 and possibly the drivers for W7.

WhiteFireDragon
Avatar
Send message
Joined: 22 Jun 09
Posts: 5
Credit: 74,526,885
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 16837 - Posted: 5 May 2010 | 18:50:30 UTC - in response to Message 16836.


Now running a 6.73 task,
2282490 1440606 5 May 2010 17:20:07 UTC 10 May 2010 17:20:07 UTC In progress --- --- --- --- Full-atom molecular dynamics v6.73 (cuda30)

This long task looks OK so far: about 30 min in and 9% complete, so expected to finish in roughly 6 h. GPU usage is presently about 75%. I have it set to use one full CPU core. Temps were 91 deg C, so I turned the fan up to 79% and the temps dropped to 82 deg C. At that speed it is not quite as loud as my GTX260! The clock rates are staying true. Looks like the problems I was having were all down to using W7 and possibly the drivers for W7.

how are you getting these new WU's to work? what did you have to set in the preferences?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16839 - Posted: 5 May 2010 | 19:07:25 UTC - in response to Message 16837.

Run all applications worked for me.

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16841 - Posted: 5 May 2010 | 19:23:09 UTC - in response to Message 16839.

I had "run all" set and I grabbed six 6.03s that all errored out.

WhiteFireDragon
Avatar
Send message
Joined: 22 Jun 09
Posts: 5
Credit: 74,526,885
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 16842 - Posted: 5 May 2010 | 19:25:01 UTC - in response to Message 16839.

And this is with WinXP, right? So it shouldn't matter if I used Win7, right?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16843 - Posted: 5 May 2010 | 19:39:53 UTC - in response to Message 16842.
Last modified: 5 May 2010 | 19:41:27 UTC

Yes, I am now using XP, but you will still be able to pick up tasks using Win 7.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16844 - Posted: 5 May 2010 | 19:41:18 UTC
Last modified: 5 May 2010 | 19:42:42 UTC

Win7 x64: I set it to the ACEMD application only and have downloaded 2 v6.73 (cuda30) WUs, and while I can't see them directly they have not errored after a couple of hours of processing. I will update in a couple more hours when I get home.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16845 - Posted: 5 May 2010 | 19:46:28 UTC - in response to Message 16844.
Last modified: 5 May 2010 | 20:17:14 UTC

I think the server can now determine that you have a Fermi, and allocate tasks accordingly.

Reached 51% in 2h 30min on this GTX470 (native clocks), so it looks like it will take 5h.
- not bad, considering my OC'd GTX260 does a short task in about 6h (albeit on W7). So I'm guessing a GTX470 is slightly faster than a GTX295. When a few results are in we will know.

eternal_fantasy
Send message
Joined: 3 May 10
Posts: 5
Credit: 5,341,531
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwat
Message 16846 - Posted: 5 May 2010 | 20:22:07 UTC - in response to Message 16844.

Win7 x64, I set to ACEMD application only and have downloaded 2 WU v6.73 (cuda30) and while I can't see them directly they have not errored after a couple hours of processing. I will update in a couple more hours when I get home.

Did you accept "Run test applications" in the preference menu?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16847 - Posted: 5 May 2010 | 22:57:35 UTC - in response to Message 16846.
Last modified: 5 May 2010 | 23:23:55 UTC

That task finished in 4.82h on a GTX470 (stock)

2282490 1440606 5 May 2010 17:20:07 UTC 5 May 2010 22:21:15 UTC Completed and validated 17,345.38 17,027.81 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)

That is 59.4K per day :)
I guess a GTX480 could do over 70K per day at stock, for now.

For the next task I raised the shaders a bit (1430), and the fan (83%).
Perhaps 4.5h.
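The per-day figure above follows from simple arithmetic on runtime and per-task credit. A quick sketch to reproduce the 59.4K number (values copied from the result line above; the only assumption is back-to-back crunching with no idle time):

```python
# Reproduce the credits-per-day estimate: one 11,931.63-credit task
# completed in 17,345.38 s on a stock GTX470.
SECONDS_PER_DAY = 86_400

def credits_per_day(task_seconds: float, task_credits: float) -> float:
    """Daily credit rate, assuming tasks run back-to-back at a constant pace."""
    return SECONDS_PER_DAY / task_seconds * task_credits

rate = credits_per_day(17_345.38, 11_931.63)
print(f"{rate:,.0f} credits/day")  # about 59.4K, matching the post
```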

=[PULSAR]=
Send message
Joined: 22 Feb 10
Posts: 9
Credit: 16,172,951
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16849 - Posted: 6 May 2010 | 5:17:37 UTC
Last modified: 6 May 2010 | 5:17:59 UTC

Hmmm, just enabled ACEMD and unchecked beta, updated the client, and it's still giving me 6.23 beta WU's. Run test apps is "yes".

Any ideas? Out of 6.73 WU's already?

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16850 - Posted: 6 May 2010 | 5:24:48 UTC - in response to Message 16849.

I haven't been able to pull anything since this started, beta or otherwise.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16852 - Posted: 6 May 2010 | 6:05:52 UTC - in response to Message 16850.

I haven't been able to pull anything since this started, beta or otherwise.

And since the apps were updated the switch below hasn't been working on any of my 5 machines. It worked before.

"If no work for selected applications is available, accept work from other applications? - yes"

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16853 - Posted: 6 May 2010 | 7:29:04 UTC - in response to Message 16845.

Reached 51% in 2h 30min on this GTX470 (native clocks), so it looks like it will take 5h.
- not bad, considering my OC'd GTX260 does a short task in about 6h (albeit on W7). So I'm guessing a GTX470 is slightly faster than a GTX295. When a few results are in we will know.

My factory OCed GTX 260/216 today did the last 2 v6.72 long WUs in 6:45:43 and 6:45:33 in XP. Assuming a GTX 295 is twice as fast as my 260, the GTX 470 doesn't match it yet, at least with this app and current drivers. Looks like your 260 is suffering a heavy Win7 penalty while your 470 is running the faster XP as well.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16855 - Posted: 6 May 2010 | 8:19:47 UTC

p25-IBUCH_0511_pYEEI_long_100505-0-40-RND1276_0
Workunit 1440675
Created 5 May 2010 17:06:49 UTC
Sent 5 May 2010 17:25:27 UTC
Received 6 May 2010 8:13:32 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 35174
Report deadline 10 May 2010 17:25:27 UTC
Run time 16731.948278
CPU time 16567.52
stderr out <core_client_version>6.10.51</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
# Time per step (avg over 1230000 steps): 13.348 ms
# Approximate elapsed time for entire WU: 16684.689 s
called boinc_finish

</stderr_txt>
]]>


p24-IBUCH_0511_pYEEI_long_100505-0-40-RND5021_0
Workunit 1440674
Created 5 May 2010 17:06:49 UTC
Sent 5 May 2010 17:25:24 UTC
Received 6 May 2010 8:13:32 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 35174
Report deadline 10 May 2010 17:25:24 UTC
Run time 16619.810539
CPU time 16464.64
stderr out <core_client_version>6.10.51</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 1250000 steps): 13.292 ms
# Approximate elapsed time for entire WU: 16615.186 s
called boinc_finish

</stderr_txt>
]]>

Received two 6.73 WUs for the GTX480, long I-Buch. Duration 4:38:51 and 4:36:59.
Received points each = 11,931.63!
Yet no more WUs available???

____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16856 - Posted: 6 May 2010 | 8:39:32 UTC - in response to Message 16855.
Last modified: 6 May 2010 | 8:51:07 UTC

Beyond, neither of our GTX260 comparisons is too special! My comparison was not good as I compared a GTX260 on Win7 x64 running a 6.03 task to a Fermi task.

I think your comparison to a 6.72 task was more appropriate, as they run faster, but it would have been a better reference if your card was also on XP x86 (rather than X64, which is slightly faster) and more importantly, ran at the default stock speed of 1242MHz, rather than 1.6GHz; your card is about 30% faster than reference clock rates (and performs more like a ref. GTX285):
2282512 1440614 5 May 2010 17:36:33 UTC 6 May 2010 7:06:43 UTC Completed and validated 24,333.88 4,988.16 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

Stock GTX470 (608MHz, 1215MHz):
2282490 1440606 5 May 2010 17:20:07 UTC 5 May 2010 22:21:15 UTC Completed and validated 17,345.38 17,027.81 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)
2282519 1440796 5 May 2010 17:24:02 UTC 5 May 2010 17:34:42 UTC Completed and validated 382.67 326.08 187.28 280.92 ACEMD beta version v6.23 (cuda30)

GTX470 with 715MHz GPU and Shaders (linked) at 1430Mhz:
2284963 1442319 6 May 2010 3:10:42 UTC 6 May 2010 7:34:20 UTC Completed and validated 15,465.16 15,365.53 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)
2283030 1440680 5 May 2010 19:44:47 UTC 6 May 2010 3:10:42 UTC Completed and validated 15,487.81 15,366.70 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)
2283885 1441680 5 May 2010 22:52:03 UTC 6 May 2010 3:15:44 UTC Completed and validated 326.50 313.22 187.28 280.92 ACEMD beta version v6.23 (cuda30)

ftpd, your GTX480 stock time of 16,731 is better than my GTX470 stock time of 17,345, but not by quite as much as I would have expected (about 1500). Were you running any CPU tasks at the time?

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16859 - Posted: 6 May 2010 | 9:44:36 UTC - in response to Message 16856.

Kev,

All CPUs (8) were used for other BOINC tasks while processing GPUGRID (GTX480).
I am now running one job on the GTX470, also with all 8 processors used!
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16860 - Posted: 6 May 2010 | 9:46:03 UTC - in response to Message 16856.

This GTX260sp216 card is at 1.4GHz (a bit closer to 1.242GHz) and uses XP x86:
2278410 1438105 5 May 2010 2:20:21 UTC 5 May 2010 15:22:50 UTC Completed and validated 29,996.34 4,593.03 7,954.42 11,931.63 ACEMD - GPU molecular dynamics v6.03 (cuda)

cenit
Send message
Joined: 11 Nov 09
Posts: 23
Credit: 668,841
RAC: 0
Level
Gly
Scientific publications
watwat
Message 16865 - Posted: 6 May 2010 | 12:18:39 UTC - in response to Message 16572.

In English, Fermi = femto


LOL! The first time I see this. It looks totally ridiculous.. but apparently is true.

MrS

[OT]
it's a tribute to Enrico Fermi: 10^-15 m is used when measuring nuclear dimensions, so femtometers are also called fermis.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16868 - Posted: 6 May 2010 | 14:34:03 UTC

The result of my first gtx470 normal 6.73 job.

p49-IBUCH_0510_pYEEI_long_100505-1-40-RND0275_0
Workunit 1443153
Created 6 May 2010 8:34:17 UTC
Sent 6 May 2010 8:37:55 UTC
Received 6 May 2010 14:27:17 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 35268
Report deadline 11 May 2010 8:37:55 UTC
Run time 20281.515625
CPU time 20152.63
stderr out <core_client_version>6.10.51</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.21 GHz
# Total amount of global memory: 1341718528 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.21 GHz
# Total amount of global memory: 1341718528 bytes
# Number of multiprocessors: 14
# Number of cores: 112
# Time per step (avg over 210000 steps): 16.415 ms
# Approximate elapsed time for entire WU: 20518.322 s
called boinc_finish

</stderr_txt>
]]>

No changes made on the card, just the swan_sync change!
Windows XP Pro, all 8 processors used for other BOINC CPU tasks!


____________
Ton (ftpd) Netherlands

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16869 - Posted: 6 May 2010 | 14:52:06 UTC - in response to Message 16856.
Last modified: 6 May 2010 | 14:53:23 UTC

I think your comparison to a 6.72 task was more appropriate, as they run faster, but it would have been a better reference if your card was also on XP x86 (rather than X64, which is slightly faster) and more importantly, ran at the default stock speed of 1242MHz, rather than 1.6GHz

My shaders are set to 1600, which is on the modest side of what ETA has suggested. Thought you were doing the same (per your previous posts). Core and memory are at the MSI factory set speed. The card runs cool, it's at 59C right now and 50% fan speed even though it has the OC and is the model with the single fan. Of interest: what's the Fermi temp and fan?

To clear this up: XP64 is not at all faster than XP32 in GPUGRID; they are the same speed. They are, however, significantly faster than Win7 of any variety, as outlined here:

http://www.gpugrid.net/forum_thread.php?id=1729

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16870 - Posted: 6 May 2010 | 15:34:50 UTC - in response to Message 16852.
Last modified: 6 May 2010 | 15:46:55 UTC

I haven't been able to pull anything since this started, betA or otherwise.

And since the apps were updated the switch below hasn't been working on any of my 5 machines. It worked before.

"If no work for selected applications is available, accept work from other applications? - yes"

The above setting is now working again. v6.72 availability seems to be spotty though.

Edit: One thing that's irritating is having to babysit the settings/clients constantly to get v6.72 WUs.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16873 - Posted: 6 May 2010 | 16:44:09 UTC - in response to Message 16868.


# Time per step (avg over 210000 steps): 16.415 ms

no reason for me to rush to the shop to get GTX470...

____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16874 - Posted: 6 May 2010 | 16:53:29 UTC - in response to Message 16870.

I’m not picking up any 6.72 tasks, just 6.03, Betas and 6.73 (Fermi) tasks; I can't be bothered to babysit tasks myself.

The Fermi seems to default to 91 deg C if you leave the fan on Auto. I would not recommend doing that!
When I had it crunching at stock I turned the fan up to 79% and the temps dropped to 82 deg C
The case was not good for cooling, and sat in a corner in a warm room!

When overclocked to 715/1430 I had to up the fan speed to 83% to keep it at 82 deg C.
I just built that system up using XP to test the Fermi actually worked properly (as it was about half speed on W7) to decide if I wanted to return it. The card is now awaiting a new computer to run in.

I also wanted stock bench marks. So I ran the first full task at stock and then ran 3 full tasks and 2 Betas at 715/1430.
So,
a GTX470 should presently get you around 59.4K credits per day on XP x86 (optimized as previously described),
and if overclocked to 715/1430 it should bring home about 66.5K.

As CPU usage can be optimized so that a full core is set aside for these tasks, I would expect that an x64 operating system would have a slight edge over an x86 system.
Also, the faster the CPU, the faster the task will complete (but only by a few percent), and obviously if you use your CPU heavily elsewhere it can slow the tasks down.

When GPUGrid manages to build and test an optimized Fermi application, these figures should improve.

ftpd,
To benefit from the SWAN_SYNC=0 environment variable, I think you also need to set BOINC to only use 7 threads (as you crunch CPU tasks as well); this will free up a CPU core/thread for use with GPUGrid. I expect this is the reason your tasks run slower than they could. Also, you did run some Betas with no CPU usage and these were faster.
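For anyone unsure where SWAN_SYNC actually goes: it is an environment variable that must be visible to the BOINC client before it launches the ACEMD app. A minimal sketch (bash shown; on Windows XP the usual route is a system variable added under My Computer > Properties > Advanced > Environment Variables, followed by a BOINC restart; the exact menu path here is from memory, so treat it as a pointer rather than gospel):

```shell
# Make SWAN_SYNC=0 visible to the BOINC client and the ACEMD tasks it
# spawns. Export it in the shell (or startup script) that launches BOINC.
export SWAN_SYNC=0

# Confirm the value a child process would inherit:
echo "SWAN_SYNC=$SWAN_SYNC"
```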

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16875 - Posted: 6 May 2010 | 17:07:17 UTC - in response to Message 16874.

The Fermi seems to default to 91 deg C if you leave the fan on Auto. I would not recommend doing that!
When I had it crunching at stock I turned the fan up to 79% and the temps dropped to 82 deg C
The case was not good for cooling, and sat in a corner in a warm room!

When overclocked to 715/1430 I had to up the fan speed to 83% to keep it at 82 deg C.

Wow, think I'll buy a couple of Fermis next winter and sell my furnace :-)

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16876 - Posted: 6 May 2010 | 17:48:39 UTC - in response to Message 16875.

Wow, think I'll buy a couple of Fermis next winter and sell my furnace :-)

An OC'ed GTX275 was really heating the room in my house this winter. I turned off the heating there, but it did well, saving me a couple of bucks :-) I'm not kidding, it was one of the hottest rooms.

____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16877 - Posted: 6 May 2010 | 18:47:19 UTC - in response to Message 16876.
Last modified: 6 May 2010 | 18:48:14 UTC

My Palit GTX260sp216 uses about the same amount of electricity as this GTX470, but makes more noise! It has 2 fans, which explains the noise and the lower temps. When NVidia releases its stranglehold on Fermi card designs I'm sure there will be many Fermis with non-standard heatsinks and fans. I can't imagine that the present designs are particularly good.
At least this Fermi is slightly shorter than the GTX260, does more work and is not quite as noisy.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16887 - Posted: 7 May 2010 | 12:27:02 UTC - in response to Message 16877.

When overclocked to 715/1430 I had to up the fan speed to 83% to keep it at 82 deg C.

sounds strange. My GTX275, heavily OC'ed to 702/1582 at 1.1V @ 75% fan speed, gives 67 deg C max, quite often around 65. So, no doubt, the GTX400 series runs hotter than the GTX200.

____________

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16889 - Posted: 7 May 2010 | 15:11:06 UTC

I found out why my SLI GTX 480s were taking on average 1100 seconds to complete the beta work units. I turned off SLI and that did the trick. My last few work units have completed in 700 seconds at stock speeds. Oddly, it still shows my GPU clock speed as 0.81GHz. I've noticed other users will get this to 1.4GHz or so.

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16892 - Posted: 7 May 2010 | 15:15:05 UTC - in response to Message 16889.
Last modified: 7 May 2010 | 15:15:31 UTC

When can we expect a decent pool of WUs to draw from? They seem to get pulled so fast that it's impossible to continually crunch, or in my case, to get a single WU. Don't mean to sound whiny, really am just curious.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16895 - Posted: 7 May 2010 | 16:47:44 UTC

Getting the v6.72 WUs is often a problem too.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16898 - Posted: 7 May 2010 | 18:04:06 UTC - in response to Message 16895.
Last modified: 7 May 2010 | 18:04:21 UTC

We will try to upload more workunits and also to upload the Linux application.
gdf

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16899 - Posted: 7 May 2010 | 18:13:03 UTC - in response to Message 16898.

gdf, is that why I'm not pulling any WUs? Is the app not even available?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16900 - Posted: 7 May 2010 | 20:02:29 UTC - in response to Message 16899.
Last modified: 7 May 2010 | 20:04:03 UTC

If you had left them on a Win7 platform you might well have picked up Fermi tasks (even though they would have run slowly)!
Sounds like they are trying to upload the Linux version this evening, along with more tasks (presumably for Windows and Linux). Tasks should run faster on Linux than on Win7, so hopefully you will start getting through some work soon.

My guess is that it would be best to receive both Betas and normal work units. Keep an eye on what you are picking up, just in case you start picking up non-Fermi tasks (which would all fail). If you do, select No New Tasks from within BOINC. Then select to pick up only Betas, and not the 6.04 tasks, before re-enabling new tasks in BOINC.

Good luck,

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 16901 - Posted: 7 May 2010 | 21:16:13 UTC - in response to Message 16900.

Yeah, I'm currently set for betas and ACEMD only. Can't seem to pull a beta, so I'm just gonna assume everything works on my end and I just have rotten luck getting a beta unit. Tired of troubleshooting ghosts. Just gonna leave this alone for the time being.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16907 - Posted: 8 May 2010 | 13:30:26 UTC - in response to Message 16901.
Last modified: 8 May 2010 | 13:39:33 UTC

Put this Fermi (GTX470) into another system (AMD 5200+ CPU, 2GB DDR2, XP x86 SP3) basically to have a look at it compared to when it was on the faster system (faster CPU and system design, but same XP x86 sp3, driver and boinc versions). I'm expecting some differences.

Running Stock, without optimization, to begin with.
First Beta returned in 481 sec (8min):
2299002 1451202 8 May 2010 12:25:54 UTC 8 May 2010 12:36:05 UTC Completed and validated 481.17 55.67 187.28 280.92 ACEMD beta version v6.23 (cuda30)

Now running a long Fermi task.

Noticed a few things:

When running the Beta, GPU usage was 82%
Running the long 6.73 WU, GPU uses 64%

When running the Beta, CPU usage (without optimization) was 15%
When running the 6.73 WU, CPU usage is 22%

Also, this different system has better cooling:
GPU 84 deg C when running the Beta, and 83 deg C when running the 6.73 WU.
Auto Fan speed 56% and 55% respectively.



On a previous system (with a better CPU) running the GTX470 at stock, a Beta finished in 383sec (6min 23sec) when optimized (swan_sync=0, not crunching CPU tasks, report tasks immediately), for the same 280.92 credits as the above Beta.
- So it took 26% longer on the system with the slower CPU and when swan_sync=0 was not enabled;
put the other way, using a faster system (CPU) and swan_sync=0 shortened the return time by about 20%.

The current 6.73WU will tell me how much slower this live WU is when using a slower system (mainly down to CPU) and without enabling swan_sync=0.

A 6.73 WU (running at stock) finished in 17,345 sec (4h 50min) for 11,931.63 credits (suggesting 59400 credits per day).

Next, I will run the Beta and 6.73 WU's with swan_sync=0 enabled:

That will show the difference between running with swan_sync enabled and not enabled; how much faster swan_sync can make the task run and the credit difference.
It will also show the difference between running the tasks on different systems (the eight-threaded system compared to this dual-core AMD 5200+ CPU); the influence of the CPU.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16910 - Posted: 8 May 2010 | 17:06:03 UTC - in response to Message 16907.

Thank you for that info skgiven. You are really dissecting these new cards and I'm sure a lot of us new 470/480 owners appreciate. I know I certainly do.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16912 - Posted: 8 May 2010 | 19:43:42 UTC - in response to Message 16910.
Last modified: 8 May 2010 | 20:34:28 UTC

The long Fermi Work Unit completed in 22884sec (6h 21min), on XP x86, AMD 5200+, with no optimization, and running no CPU tasks:
2298270 1450742 8 May 2010 12:32:12 UTC 8 May 2010 18:58:06 UTC Completed and validated 22,884.27 7,483.39 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)

So without using the swan_sync=0 variable, and using a lesser system (CPU) than before, this task took about 32% longer.
- Put the other way, a faster CPU and the swan_sync=0 variable let a 6.73 WU finish in about 24% less time
(slightly better than the 26% observed for the 6.23 Beta).

- Now running another long WU (at stock) with the swan_sync=0 variable in place .


PS. Danger30Q,
When I had my GTX470 on W7 x64, and the same CPU as you, its performance was poor - similar to the performance you are seeing now; your quickest WU is 21,806 sec. I am still seeing this level of performance when using no optimizations, but under XP x86 and on a lesser system (especially the CPU).

After trying everything I could think of with W7, I installed XP x86 on the same system (just different HDD), I set swan_sync=0 and found that my GTX470 completed a work unit much faster; 25% less time than your GTX480.

When I overclocked the card to 715MHz (with linked Shader rates) it completed a task (on XP x86, with swan_sync=0 on),
in 41% less time than your GTX480 on W7 64bit.
By my calculations a GTX480 should be 21% faster than a GTX470, so that is some difference!

I cant tell what setup you use, but I would guess you are crunching on your CPU and perhaps you are not using the swan_sync=0 variable?
- Think about this - A GT240 (£50 to £80) can do as much work as an Intel i7-980X (about £855).

I would strongly suggest that for now, and where possible, anyone with a Fermi should use XP or Linux, rather than Vista or W7, should use swan_sync=0 (XP), and should leave at least one CPU thread free to allow the GPU to excel.

Tomorrow we will have a fair idea how much the CPU matters when it comes to Fermi (swan_sync=0 now in place).

Provisionally (25% complete), it looks like about 20500sec to finish on the lesser system (and CPU) with swan_sync=0 enabled.
- This indicates the importance of the overall system performance, and in particular the CPU (just 17306sec on an 8 threaded system). So it would appear that the AMD 5200+ takes 18% more time than an i7-920, using the same GPU and operating system.
Although this is not exact, it is clear that the CPU makes a big difference!
It also strongly suggests that swan_sync=0 is an important factor (about 8%).

Good luck,

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16920 - Posted: 9 May 2010 | 4:19:57 UTC - in response to Message 16912.
Last modified: 9 May 2010 | 4:23:04 UTC

I do have SWAN_SYNC=0 enabled and both cards definitely have their own CPU thread. I'm not using my CPU for anything else except browsing the forums with Firefox. My CPU is only at 3.4ghz. My gpu usage on both cards is hovering around 55-57% at all times. Win7 x64 is just slow.

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16930 - Posted: 9 May 2010 | 17:56:30 UTC

Is there any way I can run my GTX 295 along with my (2) GTX 480s on an EVGA Classified motherboard? I remember hearing about issues with the Classified not being able to run 4 GPUs in GPUgrid, but I hope I'm wrong.

The 480s replaced my GTX 295 and I may either sell it or put it in a separate Linux-based home server and run GPUgrid on it. Until I decide what to do with it, I'd love to run it along with my 480s. I have a Corsair 1000W power supply.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16932 - Posted: 9 May 2010 | 19:19:42 UTC - in response to Message 16930.
Last modified: 9 May 2010 | 20:05:32 UTC

When I put the GTX470 into an AMD 5200+ system (without using swan_sync=0), it took 22,884sec to complete a task (11,931 credits), which would earn 45K per day.
With swan_sync=0 in use, it took 20,325sec to complete a similar task. At that rate it would earn about 50.7K per day.
So, swan_sync=0 enabled on an AMD 5200+ system lets the task complete in about 11% less time (without it, the run takes 12.6% longer).

2298987 1451188 9 May 2010 0:52:49 UTC 9 May 2010 18:48:17 UTC Completed and validated 20,301.06 20,075.09 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)
2298907 1451132 8 May 2010 12:36:05 UTC 9 May 2010 0:46:56 UTC Completed and validated 20,354.42 20,161.84 7,954.42 11,931.63 Full-atom molecular dynamics v6.73 (cuda30)

On an i7 system with a similar setup (XP x86, swan_sync=0, native clocks, report tasks immediately) a task completed in 17,345sec (59K per day).
So, the faster system (i7) increased the speed significantly (by about 17%) compared to the slower system (5200+).
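The percentages in these comparisons are easy to mix up, because "X% less time" and "X% faster" are not the same number. A small sketch checking the figures above (runtimes copied from the posts; nothing else assumed):

```python
# "Time saved" and "speedup" give different percentages for the same pair
# of runtimes, so compute both explicitly.
def pct_time_saved(slow_s: float, fast_s: float) -> float:
    """How much less wall time the faster run took."""
    return (slow_s - fast_s) / slow_s * 100

def pct_speedup(slow_s: float, fast_s: float) -> float:
    """How much higher the faster run's throughput is."""
    return (slow_s / fast_s - 1) * 100

# AMD 5200+, same GPU: 22,884 s without swan_sync=0, 20,325 s with it.
print(f"swan_sync=0: {pct_time_saved(22_884, 20_325):.1f}% less time, "
      f"{pct_speedup(22_884, 20_325):.1f}% higher throughput")
# With swan_sync=0 on both: 20,325 s (AMD 5200+) vs 17,345 s (i7).
print(f"i7 system:   {pct_time_saved(20_325, 17_345):.1f}% less time, "
      f"{pct_speedup(20_325, 17_345):.1f}% higher throughput")
```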

Conclusions,
Don't use Vista or Win7 with a Fermi.
Put your Fermi into a fast system with a fast CPU.
Use swan_sync=0, and free up a core/thread (set BOINC to use 99% of the cores) so the GPU can benefit from the extra speed that will provide.

Now running a task at 715MHz GPU, 1430MHz Shaders, 1700MHz RAM:
The case it is in has better cooling, so at 66% fan the temp is 73 deg C.
The GPU usage dropped from about 64% to 60% when I upped the clocks!
Memory usage is only 382MB.
Task due to finish in 18,900sec; about 7% faster. On an i7 the same increase in clock rate sped the task up by 12%.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Scientific publications
watwatwatwat
Message 16935 - Posted: 10 May 2010 | 9:20:18 UTC - in response to Message 16932.

There is now a large (~1000) heap of production WUs on the acemd queue.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16939 - Posted: 10 May 2010 | 14:37:30 UTC - in response to Message 16935.

There is now a large (~1000) heap of production WUs on the acemd queue.

Except that now it seems to only send to the Fermi v6.73 app and won't send to the v6.72 app for the rest of us? Why has the much faster v6.72 app been disabled and the slow v6.03 app left active? It can't be to make Fermi look faster than it really is, I hope :-(

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16944 - Posted: 10 May 2010 | 16:38:22 UTC - in response to Message 16939.

There is now a large (~1000) heap of production WUs on the acemd queue.

Except that now it seems to only send to the Fermi v6.73 app and won't send to the v6.72 app for the rest of us?

Thanks much for turning v6.72 back on :-)

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16945 - Posted: 10 May 2010 | 20:33:58 UTC

skgiven,

I wonder if I have to put swan_sync=0 for linux running machine?
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16947 - Posted: 10 May 2010 | 21:51:18 UTC - in response to Message 16945.
Last modified: 10 May 2010 | 21:54:19 UTC

You do not need to make any such environment variable changes with the present Linux application. It already uses a full core by default. You can see this by looking at your GPU and CPU times to complete any 6.04 WU. Your daily return speaks for itself:
41,755 credits per day on a GTX275.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16956 - Posted: 11 May 2010 | 10:34:35 UTC

Are the WUs for Fermi cards smaller?

Crunching one WU now takes 2 hrs 16 min (GTX480).

Correct?

Please allow us to download more WUs then, instead of 2 per card/CPU.
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16957 - Posted: 11 May 2010 | 10:35:39 UTC - in response to Message 16947.
Last modified: 11 May 2010 | 10:42:45 UTC

These shorter Fermi WUs bring more points.

2312870 1453463 11 May 2010 2:32:38 UTC 11 May 2010 7:28:13 UTC Completed and validated 8,796.39 8,486.95 4,503.74 6,755.61 Full-atom molecular dynamics v6.73 (cuda30)
2311711 1453000 10 May 2010 21:13:38 UTC 11 May 2010 5:01:24 UTC Completed and validated 8,710.02 8,430.89 4,503.74 6,755.61 Full-atom molecular dynamics v6.73 (cuda30)

D273r71-TONI_HERGunb1-0-100-RND0189_1
D309r257-TONI_HERGunb1-0-100-RND7525_1

The last few longer Fermi tasks (IBUCH_0511_pYEEI_long) would only allow me to get about 53.8K per day (on this slow but OC'd system) and only used about 60% GPU.

These new shorter tasks would allow me to get 66.7K per day (on the same slow system).

The same card with the same clocks in an i7 should get 78K per day.

Has there been a Fermi app improvement or is it just that these TONI_HERGunb tasks run particularly fast (GPU usage is now about 82%)?
We are talking about a 24% improvement.
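[Editor's note] The per-day figures in this post follow directly from credit divided by runtime, extrapolated to 24 hours. A quick sketch using the two validated TONI_HERGunb results quoted above (and assuming continuous crunching):

```python
# Extrapolate credits/day from the two results listed above:
# (granted credit, GPU run time in seconds) per task.
results = [(6755.61, 8796.39), (6755.61, 8710.02)]

SECONDS_PER_DAY = 86400
per_day = [credit * SECONDS_PER_DAY / runtime for credit, runtime in results]
average = sum(per_day) / len(per_day)
print(f"~{average / 1000:.1f}K credits/day")
```

This lands on roughly the 66.7K/day figure quoted in the post.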

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16958 - Posted: 11 May 2010 | 12:36:51 UTC

skgiven,
thanx man :-) I thought there was a way (other than getting a Fermi) to improve my RAC.

I'm keeping an eye on this thread to know when I should get a Fermi :-)
____________

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 16967 - Posted: 11 May 2010 | 18:58:17 UTC

I'm not sure if it's the newer work units but my GPU usage on both of the 480s is around 70%. A few days ago, I was lucky to hit 60%. Excellent job if this is the result of Fermi work unit improvements.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 16998 - Posted: 13 May 2010 | 15:35:18 UTC - in response to Message 16605.

Performance for a reference molecular system (DHFR):
GTX275 9.0 ms/step (cuda3.0)
GTX275 8.5 ms/step (cuda2.2)
GTX470 7.1 ms/step (cuda3.0)
GTX480 5.9 ms/step (cuda3.0)

Running temperature of about 92 degrees for all of them.

gdf


In the latest code we reached:
GTX480 5.45 ms/step (cuda3.0)

gdf
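[Editor's note] Since ms/step is time per unit of work, relative speed is just the ratio of the figures GDF posted. A small sketch comparing them against the GTX275 cuda3.0 baseline:

```python
# GDF's DHFR benchmark figures from this thread (lower is better).
ms_per_step = {
    "GTX275 (cuda2.2)": 8.5,
    "GTX275 (cuda3.0)": 9.0,
    "GTX470 (cuda3.0)": 7.1,
    "GTX480 (cuda3.0)": 5.9,
    "GTX480 (latest code)": 5.45,
}

baseline = ms_per_step["GTX275 (cuda3.0)"]
for card, ms in ms_per_step.items():
    # Speedup over the baseline is baseline_ms / card_ms.
    print(f"{card}: {ms} ms/step = {baseline / ms:.2f}x vs GTX275 cuda3.0")
```

On these numbers the latest GTX480 code is about 65% faster than a GTX275 on cuda3.0 (9.0 / 5.45 ≈ 1.65).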

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17000 - Posted: 13 May 2010 | 16:15:41 UTC - in response to Message 16998.

Excellent improvements, we look forward to more :cheers:
Those stats are for stock shader settings and stock voltages?
I'm also assuming they are from Linux systems? Can you share with us which flavor and version of Linux, and the CPU/RAM specs?
____________
Thanks - Steve

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17001 - Posted: 13 May 2010 | 16:25:15 UTC - in response to Message 17000.

Excellent improvements, we look forward to more :cheers:
Those stats are for stock shader settings and stock voltages?
I'm also assuming they are from Linux systems? Can you share with us which flavor and version of Linux, and the CPU/RAM specs?


Fedora core 10, GTX480 at stock clocks, X58 chipset and i7 2.66Ghz.

gdf

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 17003 - Posted: 13 May 2010 | 16:58:42 UTC - in response to Message 17001.

So then is it possible for us to run Fermis on Linux now? I didn't see an announcement.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 17004 - Posted: 13 May 2010 | 17:11:06 UTC - in response to Message 17003.

So then is it possible for us to run Fermis on Linux now? I didn't see an announcement.

wake up man :-) I'm crunching GPUGRID on ubuntu a year already :-)
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17005 - Posted: 13 May 2010 | 17:54:55 UTC - in response to Message 17004.

He was referring to the Fermi capable app being released to the Linux public.

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 17011 - Posted: 13 May 2010 | 20:19:02 UTC

ouch... sorry
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17023 - Posted: 14 May 2010 | 8:38:23 UTC - in response to Message 17005.
Last modified: 14 May 2010 | 8:38:36 UTC

I will release today the beta for Linux.
Promised.
gdf

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 17030 - Posted: 14 May 2010 | 12:31:56 UTC - in response to Message 17023.

And I will crunch them!!

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17042 - Posted: 14 May 2010 | 15:52:35 UTC - in response to Message 17030.

The new acemdbeta for Linux 64 bit is on.

gdf

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 17047 - Posted: 14 May 2010 | 17:58:39 UTC - in response to Message 17042.

So far I have completed and validated 2 of them; looks good. Damn they are small, finished in 422 secs.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17077 - Posted: 15 May 2010 | 18:00:43 UTC - in response to Message 17047.

CUDA 3.1 beta is available to developers.
So shortly after 3.0; maybe they have sorted out some of the inefficiencies of the 3.0 version.
We have not tried it yet.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17079 - Posted: 15 May 2010 | 19:10:28 UTC - in response to Message 17077.
Last modified: 15 May 2010 | 19:26:34 UTC

maybe they have sorted out some of the inefficiencies of the 3.0 version.
gdf

More likely they want developers to bug-test it for them before they release their Fermi-based Teslas. You can see the updates they made here. See anything tasty?
Testing for them would be fine if they gave you a box of toys.

It won't be available to general public (GTX470 and GTX480) users until next month, so no rush.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17089 - Posted: 17 May 2010 | 9:11:44 UTC - in response to Message 16998.

Performance for a reference molecular system (DHFR):
GTX275 9.0 ms/step (cuda3.0)
GTX275 8.5 ms/step (cuda2.2)
GTX470 7.1 ms/step (cuda3.0)
GTX480 5.9 ms/step (cuda3.0)

Running temperature of about 92 degrees for all of them.

gdf


In the latest code we reached:
GTX480 5.45 ms/step (cuda3.0)

gdf


In the latest code and with cuda3.1beta we reached:
GTX480 4.9 ms/step (cuda3.1beta)

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17090 - Posted: 17 May 2010 | 11:04:18 UTC - in response to Message 17089.
Last modified: 17 May 2010 | 11:08:32 UTC

Was that 11% improvement solely a result of using CUDA 3.1?

- was it the same code used to get 5.45 on CUDA 3.0?

Betting Slip
Send message
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17091 - Posted: 17 May 2010 | 11:24:11 UTC - in response to Message 17090.

Was that 11% improvement solely a result of using CUDA 3.1?


Stop slavering SK it's disgusting.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17092 - Posted: 17 May 2010 | 11:30:17 UTC - in response to Message 17090.

Yes, just cuda3.1.

gdf

GPUGRID Role account
Send message
Joined: 15 Feb 07
Posts: 134
Credit: 1,349,535,983
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 17093 - Posted: 17 May 2010 | 12:52:05 UTC - in response to Message 17090.

was it the same code used to get 5.45 on CUDA 3.0?


Better performance in the CUDA 3.1 FFT library.

MJH

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 17106 - Posted: 18 May 2010 | 3:05:39 UTC

So compared to your cuda3.0 benchmark of 5.9ms for the 480 GTX, this new cuda3.1beta gives an entire 1s improvement at 4.9ms. That is a significant improvement!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17141 - Posted: 18 May 2010 | 20:11:12 UTC - in response to Message 17106.

It's a "1 ms per step" improvement, not 1 s ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 17173 - Posted: 20 May 2010 | 13:20:55 UTC

I created a dual boot with XP Pro and I installed boinc and the nvidia drivers. The only work units that I'm downloading are the v6.03 and they immediately error out. When I use Win 7 with my 480s, I only receive the v6.73 work units and those complete just fine.

What must I do with XP to get the v6.73 work units? Thank you in advance.

Betting Slip
Send message
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17174 - Posted: 20 May 2010 | 13:36:13 UTC - in response to Message 17173.

I created a dual boot with XP Pro and I installed boinc and the nvidia drivers. The only work units that I'm downloading are the v6.03 and they immediately error out. When I use Win 7 with my 480s, I only receive the v6.73 work units and those complete just fine.

What must I do with XP to get the v6.73 work units? Thank you in advance.


In your account go to GPUGRID preferences and uncheck ACEMD ver 2
____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,785,911,851
RAC: 9,152,841
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17176 - Posted: 20 May 2010 | 13:44:31 UTC - in response to Message 17173.

Your Windows XP and Windows 7 installations will have different HostIDs. Perhaps they've been set to different venues?

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 17180 - Posted: 20 May 2010 | 16:02:03 UTC - in response to Message 17176.

Richard, what specifically do you mean by different venues?

I did uncheck ACEMDv2 and will report back tonight once I get home and switch over to XP.

I appreciate all the help.

nico342
Send message
Joined: 20 Oct 08
Posts: 11
Credit: 2,647,627
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 17181 - Posted: 20 May 2010 | 16:44:22 UTC - in response to Message 17180.

Is there any news about crunching with Fermi cards and Linux ?

pwolfe
Send message
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 17183 - Posted: 20 May 2010 | 17:03:47 UTC - in response to Message 17181.

Is there any news about crunching with Fermi cards and Linux ?


My 470s have been happily crunching beta units for several days now.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17184 - Posted: 20 May 2010 | 17:11:16 UTC - in response to Message 17180.
Last modified: 20 May 2010 | 17:11:46 UTC

@Danger ...

To check which Location your PC is using (this is what Richard was referring to as the venue)...
GPUGrid Computers on this account
http://www.gpugrid.net/hosts_user.php

the third column should be named "Location"


then go to GPUGrid preferences
http://www.gpugrid.net/prefs.php?subset=project
to check your settings for the "Location" you set your WinXP instance to.
____________
Thanks - Steve

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 17215 - Posted: 22 May 2010 | 0:18:28 UTC

Thanks for the help everyone in getting XP Pro set up. I have completed 2 work units significantly quicker on XP compared to Win7, and that's without SWAN_SYNC=0 enabled. I'll post some comparisons once a few work units complete with XP Pro and SWAN_SYNC enabled. I'll try to find the same work units to compare between XP and Win7. I'm sure my results will be similar to what others have already posted, but maybe there have been some optimizations.

It's nice to see XP using over 97% gpu usage compared to the maximum ~70% (if lucky) on Win7.

STE\/E
Send message
Joined: 18 Sep 08
Posts: 368
Credit: 315,972,298
RAC: 98,666
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 17249 - Posted: 24 May 2010 | 8:37:57 UTC - in response to Message 16604.

The future will be brighter when also ATI works nicely.

gdf


As I no longer own any NVIDIA Cards I keep waiting, I'm almost Tempted to go out & buy a couple of Fermi's ... NOT ... haha

____________
STE\/E

Danger30Q
Send message
Joined: 11 Jul 09
Posts: 21
Credit: 3,021,211
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 17333 - Posted: 26 May 2010 | 12:24:54 UTC

Looks like the work unit pool for 6.73 has dried up. I can't get any more on either of my systems.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,785,911,851
RAC: 9,152,841
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17340 - Posted: 26 May 2010 | 13:38:16 UTC - in response to Message 17333.

Looks like the work unit pool for 6.73 has dried up. I can't get any more on either of my systems.

Go fishing for v6.05 instead. The Fermi version seems to work fine.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17434 - Posted: 29 May 2010 | 14:35:15 UTC

My experience with WinXP 32 has shown that there is only a very minor difference between HT ON and HT OFF for an i7-920 when you have SWAN_SYNC=0.
____________
Thanks - Steve

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17435 - Posted: 29 May 2010 | 16:14:08 UTC - in response to Message 16932.


Now running a task at 715MHz GPU, 1430MHz Shaders, 1700MHz RAM:

What voltage did you set to drive the GTX470 with those settings? At stock voltage the application crashed so I increased it to 1.050 V.

Any other recommendations? Thanks in advance :-) !

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17439 - Posted: 29 May 2010 | 19:10:24 UTC - in response to Message 17435.

Any other recommendations?


Well.. yes. Better don't increase the voltage ;)
It reduces chip longevity and drives power consumption up. On a chip like Fermi it's also a considerable factor that increased temperature (due to voltage increase) increases the leakage quite a bit, so your card becomes less power efficient (=higher electricity cost). I'd rather try to improve cooling & temperatures. That would also give you an increased frequency headroom. Probably not as much as voltage increases, but without the negative side effects.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17442 - Posted: 29 May 2010 | 19:58:12 UTC - in response to Message 17439.

My GTX470 is at 704MHz GPU, 1407MHz shaders and 854MHZ RAM (x4).
Voltages are not increased!

chumbucket843
Send message
Joined: 22 Jul 09
Posts: 21
Credit: 195
RAC: 0
Level

Scientific publications
wat
Message 17449 - Posted: 30 May 2010 | 4:29:43 UTC - in response to Message 17439.

Any other recommendations?


Well.. yes. Better don't increase the voltage ;)
It reduces chip longevity and drives power consumption up. On a chip like Fermi it's also a considerable factor that increased temperature (due to voltage increase) increases the leakage quite a bit, so your card becomes less power efficient (=higher electricity cost). I'd rather try to improve cooling & temperatures. That would also give you an increased frequency headroom. Probably not as much as voltage increases, but without the negative side effects.

MrS

OCing is not as detrimental as you make it sound. Modern processes are designed to handle very high and very low temperatures. A major consideration when designing a chip is robustness; in fact it is so important that they are overly conservative with clocks and volts, especially with server/professional chips. That's free performance for us. The 400 series is great for OCing too, partially from architecture and partially from high leakage.

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17454 - Posted: 30 May 2010 | 8:59:52 UTC - in response to Message 17449.
Last modified: 30 May 2010 | 9:53:30 UTC

Thanks to all for your help.
skgiven, I am using your settings now.

Edit:
Sorry, but with the standard setting of 0.962 V the application crashes. I have to select at least 0.975 V to have a stable GTX470 with skgiven's clock settings :-/

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17458 - Posted: 30 May 2010 | 11:21:48 UTC - in response to Message 17454.

@roundup: well, if you want to overclock to a given frequency then there's no guarantee that your chip can make it at a certain voltage. Chips vary. The increase to 0.975 V is not too much, though :)
The jump from 0.96 to 1.05 V would have increased your power consumption by 19%, i.e. would have brought you from 210 W TDP to 250 W. That's not even factoring in increased leakage due to higher temperatures and the power consumption increase due to frequency. If you go to 0.975 V it's a much more modest 2.7% increase.
BTW: the frequency increase itself increases your power consumption by 15% - but that's OK since you're also getting 15% more performance out of the card.

@chumbucket843: actually they make the chips more vulnerable to damage by shrinking the transistor dimensions. Having a dopant atom swap places with a neighbouring atom starts to hurt if your channel is only a couple of atoms long. We're not that small yet, but it serves to illustrate the problem.
For CPUs I'd agree that they can take quite a beating. So far I've personally seen only one single chip fail (and I've seen many). However, the situation is different for GPUs: the high-end chips are already being driven quite hard, at the edge of stability. And usually they run at 80 - 95°C, much higher than CPUs. Not because they could take it by design, but rather because it's too expensive and loud to cool them any better. And because no one's gaming 24/7. They're made so that most of them survive the warranty period under typical loads - which does not include BOINC.
And the GTX-Fermi cards are not professional cards. They're meant for gamers (hence the crippling of double precision performance). And the high leakage is actually something nVidia would love to get rid of - however, it's a byproduct of the process variation in TSMC's infamous 40 nm process. So there's not much they could do about it without crippling performance. They've got one thing going for them, though: on an absolute scale their stock voltage is quite low, to keep power consumption somewhat in check (one may argue whether they succeed at that - but that's not the point). Hence the chip degradation due to voltage alone (not talking about temperature here) is not as strong as for other chips, and is thus less important upon voltage increases. So practically you'll only have to deal with temperature & power consumption increases upon voltage increases.

BTW: I think overclocking and overvolting have to be clearly distinguished. I love overclocking, as it provides performance basically for free and actually increases efficiency. But I don't generally recommend high voltages, as they reduce the chip lifetime and efficiency (past a certain point).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17470 - Posted: 30 May 2010 | 20:05:37 UTC - in response to Message 17458.

Fortunately the Fermis can be over-volted in very small increments. I initially tried increasing the voltage when I was struggling to use a Fermi with Win7 (I failed to get reasonable performance, and that is still the picture). I was able to increase my GTX470 to about 750MHz, if I remember correctly, and stay under 1V - not that I was ever going to keep it at that!

I think a small tweak in voltage is reasonable if it allows a reasonable performance gain relative to the power usage. So a 15% increase in performance while increasing power usage by 15% seems fair enough. If you take into consideration the full power used by the system, it might be more like an 8% increase in power consumption.

I agree that increasing the voltage too much is not just wasteful in terms of power consumption, but also reduces the longevity of your card.
I forked out £320 to crunch with a Fermi; I want to crunch, not crash.
- Getting burnt is bad, but burning yourself is just stupid!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17471 - Posted: 30 May 2010 | 21:32:28 UTC - in response to Message 17470.

Agreed... just want to clarify a detail, if it wasn't clear before: 15% power for 15% performance is only due to the frequency increase. Touching the voltage adds to this (in fact multiplies). So in roundup's example he'd get 1.15 * 1.19 = 1.37, i.e. a 37% increase in power consumption if he went for the higher clocks at 1.05 V.
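[Editor's note] The arithmetic in this exchange is the standard dynamic-power approximation P ∝ f·V² (leakage and temperature effects are excluded, so real increases will be somewhat higher). A few lines of Python reproduce the numbers, using the voltages and +15% clock figure from the thread:

```python
# Dynamic power scales roughly with frequency times voltage squared.
def power_scale(v_stock, v_new, clock_scale=1.0):
    """Relative power draw after a voltage and/or frequency change."""
    return clock_scale * (v_new / v_stock) ** 2

voltage_only = power_scale(0.962, 1.05)        # ~1.19: +19% from voltage alone
with_clocks  = power_scale(0.962, 1.05, 1.15)  # ~1.37: +37% combined
modest_bump  = power_scale(0.962, 0.975)       # ~1.027: the "+2.7%" case

print(f"{voltage_only:.2f} {with_clocks:.2f} {modest_bump:.3f}")
```

The three results match the 19%, 37% and 2.7% figures quoted in these posts.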

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 851
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17528 - Posted: 4 Jun 2010 | 15:48:18 UTC - in response to Message 17092.

Yes, just cuda3.1.

gdf


When do you plan to release this CUDA3.1 (beta) client?

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 17555 - Posted: 10 Jun 2010 | 11:09:35 UTC - in response to Message 17528.

Only when CUDA3.1 is out.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17561 - Posted: 11 Jun 2010 | 0:41:12 UTC - in response to Message 17555.

GeForce/ION Release 256 BETA 257.15 May 24, 2010

http://www.nvidia.com/Download/Find.aspx?lang=en-us
http://www.nvidia.com/object/winxp-257.15-beta.html
http://www.nvidia.com/object/win7-winvista-32bit-257.15-beta.html

Adds support for CUDA Toolkit 3.1 which includes significant performance increases for double precision math operations. See CUDA Zone for more details.

The XP driver has been working fine for weeks.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,785,911,851
RAC: 9,152,841
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17567 - Posted: 11 Jun 2010 | 10:07:07 UTC

I've recently moved my Fermi GTX 470 from host 71413 (Windows 7, 32-bit) to host 43404 (Windows XP, 32-bit, running as a service). I moved the 9800GTX+ in the opposite direction.

On both hosts, I started the Fermi with driver 197.75, and then upgraded to driver 257.15_beta to test some CUDA 3.1 stuff for another project. I don't think the speed of the current cuda30 v6.05 ACEMD2 changed significantly with the driver change: if anything, it was slightly slower on the Beta driver (as you might expect). I think we'll have to wait for a new cuda31 app as well before we see any benefit from the driver.

What was significant was the increase in speed when I put the Fermi into the WinXP box. Times went down from 19,000+ seconds (with Swan_Sync and 95% CPU usage) to 11,000+ seconds and under 15% CPU. It's difficult to tell how much of that is due to a more modern hardware platform, and how much was the operating system, but it was a dramatic change.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17568 - Posted: 11 Jun 2010 | 11:07:30 UTC - in response to Message 17567.

Until the app is compiled with the CUDA 3.1 toolkit, you will see no change from using the 257.15 driver.

It has been well reported (by me and others) that Fermi cards don't work well under Vista or Win 7. They work, but at 60% speed. I don't know why, but it is either the driver or the app.

Basically, if you have a Fermi, use XP or Linux.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,785,911,851
RAC: 9,152,841
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17569 - Posted: 11 Jun 2010 | 11:40:01 UTC - in response to Message 17568.

Those reports are largely what prompted me to make the swap - I just thought I'd post some more actual figures.

The 9800GTX+ which moved in the opposite direction also slowed down, but to a lesser extent - maybe 20% - and started using a lot more CPU. Now, its new host has a slower CPU, so you would expect it to need more seconds - but not a four-fold increase. That must be down to Windows 7, too.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17571 - Posted: 11 Jun 2010 | 16:12:00 UTC - in response to Message 17569.

Those reports are largely what prompted me to make the swap - I just thought I'd post some more actual figures.

The 9800GTX+ which moved in the opposite direction also slowed down, but to a lesser extent - maybe 20% - and started using a lot more CPU. Now, its new host has a slower CPU, so you would expect it to need more seconds - but not a four-fold increase. That must be down to Windows 7, too.

Yep, you got it. The slowdown for Win7 was reported as soon as the OS was officially released. It's specific to GPUGRID and nothing has been done to resolve the problem so far. Here's a thread on the subject:

http://www.gpugrid.net/forum_thread.php?id=1729

mreuter80
Send message
Joined: 6 Feb 10
Posts: 1
Credit: 5,434,095
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwatwatwatwat
Message 17574 - Posted: 11 Jun 2010 | 17:13:57 UTC - in response to Message 17571.

It's not specific to GPUGRID. Other applications also suffer from Windows' WDDM drivers.
An explanation can be found here: http://forums.nvidia.com/lofiversion/index.php?t150236.html
As per this thread, NVIDIA is working on a solution with MS.

Profile bloodrain
Send message
Joined: 11 Dec 08
Posts: 32
Credit: 748,159
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwatwatwatwat
Message 17613 - Posted: 15 Jun 2010 | 0:04:54 UTC - in response to Message 16513.

hehe. One thing I've learned with just-released video cards is that bad memory tends to be the first issue with them.

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17623 - Posted: 15 Jun 2010 | 13:54:02 UTC - in response to Message 17471.

Agreed.. just want to clarify a detail, if it wasn't clear before: 15% more power for 15% more performance holds only for a pure frequency increase. Touching the voltage adds to this (in fact it multiplies). So in roundup's example he'd get 1.15 * 1.19 = 1.37, i.e. a 37% increase in power consumption if he went for the higher clocks at 1.05 V.

MrS

One more interesting observation regarding the voltage:
I now have a second GTX470 (same type, same manufacturer) running in a different computer (i7-860 @ 3.5 GHz, let's call it PC2). This 470 automatically selects a setting of 1.025 V - even without overclocking. In the first computer (i7-920 @ stock, PC1 for now) the setting was 0.962 V, with stable overclocking possible at 0.975 V.
I swapped the graphics cards between the computers to check whether the different voltage setting is caused by the card or by other parts of the computer (motherboard, processor, power supply, etc.). The result is that in both computers the same voltage setting as before occurred: PC2 operates the GTX470 again @ 1.025 V, PC1 again @ 0.962 V.
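The multiplicative scaling MrS quotes can be sketched with the usual dynamic-power approximation P ∝ f·V². This is only a rough model - real cards also have static leakage - and the `power_scale` helper name is purely illustrative:

```python
def power_scale(f_ratio, v_new, v_old):
    """Relative power for a clock ratio and a voltage change.

    Rough CMOS dynamic-power model P ~ f * V^2; static leakage on a
    real GPU adds to this, so treat it as a first-order estimate.
    """
    return f_ratio * (v_new / v_old) ** 2

# MrS's example: +15% clock at 1.05 V instead of roundup's stock 0.962 V
print(round(power_scale(1.15, 1.05, 0.962), 2))  # 1.37, i.e. +37% power
```

The 1.19 factor in MrS's arithmetic is exactly the squared voltage ratio (1.05/0.962)².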


Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17625 - Posted: 15 Jun 2010 | 14:58:20 UTC - in response to Message 17623.

Perhaps you have a PCIE voltage overclock set on the motherboard. This can happen if it is linked to the bus.
Alternatively, you have a bad Motherboard or the reading is incorrect.
Check that you did not raise it in the NVIDIA Control Panel or some other software (if you had a different card installed previously).

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17627 - Posted: 15 Jun 2010 | 16:30:17 UTC

@roundup - The Fermis have their voltages individually tweaked for stability before being sent out, so you will see differences between cards. Some chips are just better than others and need less voltage. I believe this is the situation you are observing, as you say the voltage is the same per card no matter which board you install it in.
____________
Thanks - Steve

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17629 - Posted: 15 Jun 2010 | 18:26:39 UTC - in response to Message 17627.

@Snow: actually he said the opposite is true, the PCs apparently run their cards at different voltages, no matter which card is put in.

One thing I would check is whether both systems use the same software versions, as far as possible. Most important is the tool used to read out and set the voltage. It could be that, after some time with Fermis, the developer realized a mistake in reading out the voltages (this can happen disturbingly easily) and corrected it in a later release.

A forgotten setting somewhere in some tuning software could obviously also be possible.

MrS
____________
Scanning for our furry friends since Jan 2002

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17630 - Posted: 15 Jun 2010 | 21:53:48 UTC - in response to Message 17629.

@Snow: actually he said the opposite is true, the PCs apparently run their cards at different voltages, no matter which card is put in.

Exactly.
When I put the second card in the second computer, no tuning software was active. The same applied when I swapped the cards between the computers. At the time of the swap there was no GPU-relevant tuning software in place - just the 197.75 driver.
It's only the CPU in the second computer that has been tuned, from 2.8 GHz to 3.5 GHz (with a proper cooling solution, of course). This computer raises the voltage of both GTX 470s to 1.025 V.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 851
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17631 - Posted: 16 Jun 2010 | 0:44:27 UTC - in response to Message 17555.

This brand new 257.21 WHQL driver isn't beta and supports CUDA 3.1.
Does it mean, from (y)our point of view, that CUDA 3.1 is out?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17635 - Posted: 16 Jun 2010 | 10:00:32 UTC - in response to Message 17631.
Last modified: 16 Jun 2010 | 10:02:46 UTC

Yes, CUDA 3.1 has been released to the General Public by NVidia. Thanks for the post.
I started a new thread for CUDA 3.1

Roundup, did you check the BIOS (to see if the OC was linking PCIE to your bus/QPI)? How are you OC'ing the i7 (BIOS or software)?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17641 - Posted: 16 Jun 2010 | 20:57:35 UTC

I'd try switching to stock settings for the motherboard and check which voltage the card gets. Afterwards OC again, of course ;) And can you tell a difference in GPU temperature between the two machines, preferably using the same card? Maybe with the case open to reduce the effect of case air flow. I know these numbers will fluctuate quite a bit, but if you can see a clear difference then the voltage difference is certainly for real.
And maybe do a sanity check first: monitor the temperature over a few minutes while running GPU-Grid to make sure it's stable. Then change the voltage to the value corresponding to the other PC and check if you can spot a temperature and/or fan speed change at all.

MrS
____________
Scanning for our furry friends since Jan 2002

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17647 - Posted: 17 Jun 2010 | 8:09:51 UTC - in response to Message 17635.


Roundup, did you check the BIOS (to see if the OC was linking PCIE to your bus/QPI)? How are you OC'ing the i7 (BIOS or software)?

I am OC'ing the i7-860 in the BIOS: raising the base clock, with the QPI and RAM multipliers tuned down a bit. The i7-920 operates at stock speed.


I'd try switching to stock settings for the motherboard and check which voltage the card gets. Afterwards OC again, of course ;) And can you tell a difference in GPU temperature between the two machines, preferably using the same card? Maybe with the case open to reduce the effect of case air flow. I know these numbers will fluctuate quite a bit, but if you can see a clear difference then the voltage difference is certainly for real.
And maybe do a sanity check first: monitor the temperature over a few minutes while running GPU-Grid to make sure it's stable. Then change the voltage to the value corresponding to the other PC and check if you can spot a temperature and/or fan speed change at all.

Even with stock settings for the motherboards and graphics cards, one GTX470 desires 1.025 V.
For the checks of GPU temperature and fan speed both cases were open - in fact they have been open since I installed the cards :).
I got the following results (GPUs at stock speeds / stock voltages):
GPU 1: 0.962 V, GPU load 65%, 79°C, Fan 52%
GPU 2: 1.025 V, GPU load 74%, 85°C, Fan 57%

Then I set the voltage of GPU 2 down to 0.975 V. After 10 minutes I got the following values:
GPU 2: 0.975 V, GPU load 74%, 82°C, Fan 54%.

Now I OC'ed both GPUs to 702/1708/1404 (GPU/Mem/Shader). This delivered the following results:
GPU 1: 0.975 V, GPU load 63%, 82°C, Fan 54% (I already knew that I have to increase the voltage for GPU 1 to 0.975 V for stable operation)
OC at 0.975 V led to a black screen for GPU 2 after 6 minutes. However, rebooting showed that the ACEMD2 application did NOT crash.

Swapping the cards between the two computers does not show significantly different results. GPU 2 led to a black screen immediately in the other PC when OC'ed and set to 0.975 V.

It seems to me that NVIDIA sets different factory voltages depending on the quality of certain parts of the cards.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17652 - Posted: 17 Jun 2010 | 14:19:30 UTC - in response to Message 17647.

I am OC'ing the i7-860 in the BIOS: raising the base clock, with the QPI and RAM multipliers tuned down a bit. The i7-920 operates at stock speed.
In the BIOS of your i7-860 system, is the PCIE clock linked to the bus, and does the bus automatically increase the voltage on that board?

Even with stock settings for the motherboards and graphics cards, one GTX470 desires 1.025 V.
For the checks of GPU temperature and fan speed both cases were open - in fact they have been open since I installed the cards :).
I got the following results (GPUs at stock speeds / stock voltages):
GPU 1: 0.962 V, GPU load 65%, 79°C, Fan 52%
GPU 2: 1.025 V, GPU load 74%, 85°C, Fan 57%

So both cards did not automatically go to 1.025V?
Only the second GPU on the i7-860?

Architecturally there is a big difference between the two motherboards in question, with the controller being on the CPU for the i7-860. However, I think what may have happened is that the first card installed set the voltage. So, earlier, when you moved the cards, it kept the voltage it had before, thinking it was the exact same card, which made it appear that the GPU voltage was motherboard dependent.

Then I set the voltage of GPU 2 down to 0.975 V. After 10 minutes I got the following values:
GPU 2: 0.975 V, GPU load 74%, 82°C, Fan 54%.

That looks like a good stable voltage for that system.

Now I OC'ed both GPUs to 702/1708/1404 (GPU/Mem/Shader). This delivered the following results:
GPU 1: 0.975 V, GPU load 63%, 82°C, Fan 54% (I already knew that I have to increase the voltage for GPU 1 to 0.975 V for stable operation)
OC at 0.975 V led to a black screen for GPU 2 after 6 minutes. However, rebooting showed that the ACEMD2 application did NOT crash.

So the GPU failed, but the app is essentially run via the CPU, which was fine (being native), and presumably recovered to the last checkpoint. Anyway, that Voltage was not enough to support GPU2 at that frequency.

Swapping the cards between the two computers does not show significantly different results. GPU 2 led to a black screen immediately in the other PC when OC'ed and set to 0.975 V.
To me this suggests that the card is less stable on the i7-920, so as I suggested above, perhaps the 860 allows for a greater voltage range (automatically regulating the voltage to the draw, up to a point). It could for example supply 0.980 rather than exactly 0.975, or the 920 could be supplying 0.970! Only a small difference, but perhaps enough to let the 860 run for 10 minutes while the 920 failed immediately.

It seems to me that NVIDIA sets different factory voltages depending on the quality of certain parts of the cards.
For sure, and as Steve said, "The Fermis have their voltages individually tweaked for stability before being sent out, so you will see differences between cards".
This would naturally apply even to the same card range from the same manufacturer, as it depends on the individual quality (leakage) of the core.

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 17653 - Posted: 17 Jun 2010 | 14:56:38 UTC - in response to Message 17652.

In the BIOS of your i7-860 system, is the PCIE clock linked to the bus, and does the bus automatically increase the voltage on that board?

No, the PCIE clock in the 860 is running at stock speed. Voltage settings for the board (MSI P55-GD65) are set to auto - so yes, it increases voltage automatically.


So both cards did not automatically go to 1.025V?
Only the second GPU on the i7-860?

No, just GPU 2 automatically operates at 1.025 V - in both PCs. The other card automatically went to 0.962 V - also in both PCs. I have to increase the voltage by one step for that card to enable stable overclocking.

It seems to me that NVIDIA sets different factory voltages depending on the quality of certain parts of the cards.
For sure, and as Steve said, "The Fermis have their voltages individually tweaked for stability before being sent out, so you will see differences between cards".
This would naturally apply even to the same card range from the same manufacturer, as it depends on the individual quality (leakage) of the core.

Yes, all the testing fully confirmed Snow Crash's posting regarding the individually tweaked voltages.

Thanks for your help interpreting the test results.
@MrSpadge: Thanks for suggesting the test setup.

chumbucket843
Send message
Joined: 22 Jul 09
Posts: 21
Credit: 195
RAC: 0
Level

Scientific publications
wat
Message 17722 - Posted: 28 Jun 2010 | 1:42:46 UTC

CUDA 3.1 is out.
http://developer.nvidia.com/object/cuda_3_1_downloads.html

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18722 - Posted: 19 Sep 2010 | 19:59:41 UTC - in response to Message 17722.
Last modified: 19 Sep 2010 | 20:01:38 UTC

The developer version of CUDA 3.2 is presently in Beta and the latest Beta driver supports CUDA 3.2.
Hopefully by the end of the month these will bring further improvements, particularly to GTX460’s.

I decided to consolidate some of my older systems; basically I bought a new GTX470 to replace 3 systems; each with a single GT240.
The new Asus ENGTX470 is now in my i7-920 system, along with an older ENGTX470. This consolidation of resources is to reduce my electric bill and the number of systems I use while improving the amount of crunching I do.

Anyway, I noticed ASUS made a few changes to their ENGTX470 card since its first release:
It now says Ver 2 on the back, the fan can run faster, it stays cooler, and the VDDC is only 0.9370 Volts – even when overclocked to 731MHz ;)

There are no obvious external changes to the card, but it does come with new Firmware (70.00.21.00.03). I picked it up for £219 inc.
When idle the system uses 158W. With the cards overclocked to 731MHz and 715MHz the system draws around 500W at the socket (crunching 6 CPU tasks and 2 Fermi tasks).

I saved about 400W overall, and will do more work.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18723 - Posted: 20 Sep 2010 | 8:24:35 UTC - in response to Message 18722.

and the VDDC is only 0.9370 Volts – even when overclocked to 731MHz


That would be pure luck - or a side effect of a more mature 40 nm process at TSMC. Or maybe both ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Bedrich Hajek
Send message
Joined: 28 Mar 09
Posts: 467
Credit: 8,378,322,716
RAC: 9,641,992
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18727 - Posted: 20 Sep 2010 | 23:28:58 UTC

I hope this version will improve the performance of Windows 7 machines, hopefully bringing them on par with XP - or is this just wishful thinking?

roundup
Send message
Joined: 11 May 10
Posts: 57
Credit: 1,575,120,193
RAC: 7,991,076
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18734 - Posted: 21 Sep 2010 | 17:02:24 UTC - in response to Message 18722.

Anyway, I noticed ASUS made a few changes to their ENGTX470 card since its first release:
It now says Ver 2 on the back, the fan can run faster, it stays cooler, and the VDDC is only 0.9370 Volts – even when overclocked to 731MHz ;)

Hi skgiven,
what is your clock setting for the memory in this 731MHz setup?
I guess you are using 1462 MHz for the Shaders, right?

Greetings

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18739 - Posted: 21 Sep 2010 | 22:23:27 UTC - in response to Message 18734.

... in this 731MHz setup?
I guess you are using 1462 MHz for the Shaders, right?


The shaders always run at twice the core clock on Fermi-based chips (GF100, 104, 106, 108).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18740 - Posted: 21 Sep 2010 | 22:43:52 UTC - in response to Message 18739.
Last modified: 22 Sep 2010 | 13:04:22 UTC

Yeah, the shaders are linked to the GPU clock, so they are at 1462 MHz.

I left the GDDR at NVIDIA's reference 3348MHz, though it reads as 1674MHz in EVGA Precision and half that again in GPU-Z.

I presumed increasing the RAM speed makes little difference to performance, and might heat up the card and increase power usage, but I did not try it (well, not since I got errors with the first card when trying desperately to run it on Win7).

The GTX480 uses a RAM speed of 3696MHz, possibly to balance against its native 700MHz core, so perhaps I should be trying to match the RAM speeds better (3859MHz for the 731MHz card).

I just started to try the other card at 1887/3774, to see if it makes any difference:
A quick look and it seems stable.
Memory controller load fell from 22% to 19% and there was no change in temps or noticeable change in system power usage, but I will have to wait a while to see if it finishes a task, and if it is any faster.

- It turned out that I started to get lots of failures; one card stopped running and I even had a system reboot. So both cards have now been pegged back to 715MHz, with default RAM speeds.

Any RAM speed improvement would not offset the cost of result failures and reboots, so it's definitely not worth it.

Perhaps with a single card in the system, cooling would be easier and slightly upping the RAM might be worthwhile, but I don't have the time to test this any further. I also remember that there were concerns about the GF100 memory controller, which is probably why they chose 4GHz max memory when they could have used 5GHz. My guess is that the controller failed, not the RAM.

People might be more successful upping the RAM on GF104 or GF106 cards, but I would suggest you test them another way before doing it here. I also doubt you will see a significant improvement, and you may see a slight reduction in performance for the tasks that actually run successfully.
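The clock relationships in this post can be sketched as a small calculation - a hypothetical helper assuming only the fixed 2:1 Fermi shader:core ratio stated above, and using the GTX480's reference 3696MHz memory / 700MHz core as the ratio to match (the `fermi_clocks` name and defaults are illustrative, not from any tool):

```python
def fermi_clocks(core_mhz, ref_core=700, ref_mem=3696):
    """Shader clock and ratio-matched memory clock for a Fermi core OC.

    Fermi shaders run at exactly twice the core clock; the memory
    figure simply scales the GTX480's reference mem:core ratio to
    the new core clock.
    """
    shader = 2 * core_mhz
    mem_matched = round(ref_mem * core_mhz / ref_core)
    return shader, mem_matched

print(fermi_clocks(731))  # (1462, 3860) - within rounding of the 3859MHz figure above
```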

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 851
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18771 - Posted: 27 Sep 2010 | 16:55:21 UTC - in response to Message 18722.

The developer version of CUDA 3.2 is presently in Beta and the latest Beta driver supports CUDA 3.2.
Hopefully by the end of the month these will bring further improvements, particularly to GTX460’s.

Actually it's more than a beta, it's a Release Candidate.

...The new Asus ENGTX470 is now in my i7-920 system, along with an older ENGTX470. ... (crunching 6 CPU tasks and 2 Fermi tasks).

You should take into account that an i7-920 actually has 4 cores, and Hyper-Threading doubles them. But there is only one FP unit in each core, which is what most projects use. Therefore HT won't double the number of simultaneous FP tasks. That means if you run only 4 FP tasks on this i7-920, you'd get the same overall performance (tasks will finish in half the time, or will do twice as many calculations in the same time - it depends on the project). (Correct me if you have different experience.) Considering how GPUGRID works, the best performance you could achieve would be running 4 CPU tasks and 2 GPUGRID tasks. But I haven't experimented with an i7 and GPUGRID.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18772 - Posted: 27 Sep 2010 | 19:54:21 UTC - in response to Message 18771.

That means if you run only 4 FP tasks on this i7-920, you'd get the same overall performance (tasks will finish in half the time, or will do twice as many calculations in the same time - it depends on the project). (Correct me if you have different experience.)


To achieve this you'd need either:
- 100% regular code: no mispredicted branches, no data that must be fetched from caches or memory, or
- main memory behaving like an infinitely large L1 cache with an access latency of 1 cycle.

Sadly, neither applies to the real world; the CPU often has to wait for something. That's why higher memory speeds, larger and faster caches etc. all help performance. HT can be used to keep a core busy when one thread runs into such a situation, even if both threads are purely using FP.

Personally I have seen very good BOINC performance of i5 / i7 CPUs with HT on.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 851
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18773 - Posted: 27 Sep 2010 | 22:05:15 UTC - in response to Message 18772.
Last modified: 27 Sep 2010 | 22:25:23 UTC

Sadly neither applies to the real world, the CPU often has to wait for something. That's why higher memory speeds, larger and faster caches etc. all help performance. HT can be used to keep a core busy if one thread runs into such a situation, even if both threads would be purely using FP.

Personally I have seen very good BOINC performance of i5 / i7 CPUs with HT on.

MrS

You're right about that - that's why I wrote "it depends on the project": the higher the FP utilization, the less gain with HT turned on. Crunching tasks use far more FP than "real world" applications do, behaving more like a benchmark. Also, an FP operation takes much longer than an integer or data-movement op and uses a different part of the execution unit, so the CPU can run ahead with code execution and prefetching while the FP operation lasts.
When the i7 was released, I was experimenting with rosetta@home (the other project I crunch for), and I made three observations:
1. The Core i7's FP is not any faster than the Core2 Duo's or Core2 Quad's at the same clock speed (while integer performance and overall system performance are significantly better, and the i7 is available at higher stock speeds than the C2D or C2Q).
2. HT doesn't double Rosetta's performance, nor does it make a significant improvement (I don't remember the exact numbers, but it was somewhere under 10%).
3. Rosetta's RAC (and BOINC's CPU benchmark results) increase almost in direct proportion to the C2Q's and i7's clock speed, regardless of FSB and RAM speed, cache size or even the motherboard. (I had a Core2 Quad 6600 for a long time, first in a G965-chipset board, later in a P45-chipset board, and later still I got a C2Q 9550 in another P45-chipset board. So I concluded that the tasks are limited only by FP speed.)
I'm really curious how much performance gain you can get with HT turned on, and in which project.
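As a toy illustration of why HT helps so little for FP-bound crunching, here is a sketch under the assumptions discussed in this exchange: one FP unit per core, with the second hardware thread only filling stall cycles. The ~10% figure is taken from the Rosetta observation above; the `fp_throughput` name and the flat-gain model are purely illustrative:

```python
def fp_throughput(tasks, cores=4, ht_gain=0.10):
    """Toy model of relative FP throughput on a 4-core HT CPU.

    Assumes one FP unit per core; once all FP units are busy, the
    second hardware thread on each core only fills stall cycles,
    modelled here as a flat ht_gain (~10%, per the Rosetta numbers).
    """
    if tasks <= cores:
        return float(tasks)                 # each task owns an FP unit
    return round(cores * (1 + ht_gain), 2)  # FP units saturated + HT stall-filling

print(fp_throughput(4))  # 4.0
print(fp_throughput(8))  # 4.4 - only ~10% more than with 4 tasks
```

A 50% HT gain, as MrS reports for SETI classic on Northwood, would correspond to ht_gain=0.50 in this model - the gain depends entirely on how often the code stalls.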

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18780 - Posted: 29 Sep 2010 | 8:20:58 UTC - in response to Message 18773.

I haven't turned HT off yet (as there was no reason to), but I encourage you to try it with Einstein CPU WUs. Also, back in the SETI classic days the original Northwood with HT gained 50% throughput. Granted, the current app is probably more optimized and thus sees less gain, but it shows the potential of HT for BOINC.
And somewhat surprisingly: it didn't matter if I threw some ABC (pure integer) work into the mix, the Einstein times did not really change.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Fred J. Verster
Send message
Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18782 - Posted: 29 Sep 2010 | 11:02:32 UTC - in response to Message 18780.
Last modified: 29 Sep 2010 | 11:03:23 UTC

When looking at the last validated results, I noticed that the ACEMD2: GPU molecular dynamics v6.11 (cuda31) app (too) often errors out on the 200-series NVIDIA cards, while the Fermis - in this case a GTX480 & GTX470 (on an ASUS P5E mobo with a QX9650 CPU), all running 'stock' - do the job without errors.

200 vs 400 series

Different CUDA version.

I do have three cards of the 200 series and older: 8500GT, 8900GTX+ and GTS250. Can I put them to use on GPUGRID, and which app should I use?
____________

Knight Who Says Ni N!

Post to thread

Message boards : Graphics cards (GPUs) : GPUGRID and Fermi
