
Message boards : Graphics cards (GPUs) : ACEMD2 6.12 cuda and 6.13 cuda31 for windows and linux

GDF (project administrator)
Message 19362 - Posted: 9 Nov 2010 | 11:59:09 UTC
Last modified: 9 Nov 2010 | 14:04:53 UTC

I have just benchmarked 6.12 under Linux on a GTX275 and there is less than 2% difference in time between 6.12 and 6.05.

For Linux, 6.12 also uses less CPU than 6.05, and so it is slower. If you want to get maximum speed, use the environment variable SWAN_SYNC=0. Just set it in .bashrc (if you use bash as a shell):
export SWAN_SYNC=0
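
For anyone unsure where that line goes, here is a minimal sketch (assuming BOINC is started from a shell that reads ~/.bashrc; if your distro starts BOINC as a service, the variable has to be set in that service's environment instead):

echo 'export SWAN_SYNC=0' >> ~/.bashrc   # make the setting permanent
source ~/.bashrc                         # apply it to the current shell
env | grep SWAN_SYNC                     # verify: should print SWAN_SYNC=0

Then restart the BOINC client from that shell so the science app inherits the variable.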

We have asked Anderson to add a simpler way to set this variable; there will be a specific configuration option for it in the BOINC client.

Also, 6.12 is capable of higher performance, but we need all applications to be updated (even the cuda3.2 ones) before we can enable that mode of computing.

Please report here what sort of problems you have with 6.12, politely and accurately. We simply don't see any problem on our machines.

gdf

Saenger
Message 19371 - Posted: 9 Nov 2010 | 16:39:27 UTC - in response to Message 19362.
Last modified: 9 Nov 2010 | 16:41:33 UTC

For Linux, 6.12 also uses less CPU than 6.05, and so it is slower. If you want to get maximum speed, use the environment variable SWAN_SYNC=0. Just set it in .bashrc (if you use bash as a shell):
export SWAN_SYNC=0

WTF?
In what GUI do I set this variable?
What is .bashrc?
How will it interact with any other program I'm running?

And please don't use in-talk; I'm a user, not a nerd. I hate terminal inputs, but I will make them if it's well explained why I should, what I'm doing there, and why there is no decent, i.e. GUI, way of doing this.

Edit:
I've just started a new 6.12 task, and my GPU is freezing; I think its main task at the moment is really just drawing my screen.

Saenger
Message 19382 - Posted: 9 Nov 2010 | 20:27:48 UTC - in response to Message 19371.

I've just started a new 6.12 task, and my GPU is freezing; I think its main task at the moment is really just drawing my screen.

It has now been running for 3:50 h and has done 2.888%; if it goes on at that "speed" it will be finished in just under 130 h, or nearly 6 days. That may or may not be within the deadline, which is about that long.
With the old app a WU of that type would have been finished in 17 h, so this new app increases the crunching time by a whopping factor of 8, not, as expected, the speed.

GDF
Message 19384 - Posted: 9 Nov 2010 | 20:39:37 UTC - in response to Message 19382.

Well, clearly a factor of 8 is not a general issue.
What if you update the drivers to get the cuda3.1 app?

gdf


Saenger
Message 19386 - Posted: 9 Nov 2010 | 21:37:49 UTC - in response to Message 19384.
Last modified: 9 Nov 2010 | 21:38:06 UTC

Well, clearly a factor of 8 is not a general issue.
What if you update the drivers to get the cuda3.1 app?

gdf

I have the highest-numbered drivers from the Ubuntu repository; the version number is 195.36.24.
I experimented with my old card and some tinkering in the past, and it didn't work at all; I wasn't even able to reinstall the proper driver that came with Ubuntu, proprietary though it is. I had to do a complete system reinstall and tedious file backups afterwards, with a friend of mine as a helping hand. I don't want to have that experience again.

If you guarantee me that nothing will go wrong, I will try to hand-install some newer stuff, but I need precise instructions, and of course a way to undo everything in case something goes wrong.

GDF
Message 19389 - Posted: 9 Nov 2010 | 22:16:09 UTC - in response to Message 19386.

No, I can't guarantee that.
You might want to try the USB Linux stick, without installing anything on your system (see the join section).
gdf

Bikermatt
Message 19391 - Posted: 9 Nov 2010 | 23:04:01 UTC

The 6.12 app has cut 8 to 10 thousand seconds off my GT240 task times in Ubuntu 10.4 with NVIDIA driver 260.19.12!

I was about to retire my 240s, but I think I will try to get them back into a system now. Is there any way they can be put in the same system as a Fermi?

Betting Slip
Message 19408 - Posted: 10 Nov 2010 | 10:28:14 UTC - in response to Message 19391.

Is 6.12 a release app now?

skgiven
Message 19410 - Posted: 10 Nov 2010 | 11:44:25 UTC - in response to Message 19408.

Yes, 6.12 is in use:

GT200 cards on Linux can run ACEMD2: GPU molecular dynamics v6.12 (cuda)
Fermi cards on Linux can run ACEMD2: GPU molecular dynamics v6.06 (cuda30)

GT200 cards on Windows can run ACEMD2: GPU molecular dynamics v6.12 (cuda)
Fermi cards on Windows can run ACEMD2: GPU molecular dynamics v6.11 (cuda31)

The test app for Fermi on Linux is ACEMD beta version v6.37 (cuda31)

Betting Slip
Message 19412 - Posted: 10 Nov 2010 | 12:49:39 UTC - in response to Message 19410.

Just reattached again and got 6.11 :( and downgraded drivers and still got 6.11.

skgiven
Message 19414 - Posted: 10 Nov 2010 | 16:11:33 UTC - in response to Message 19412.
Last modified: 10 Nov 2010 | 16:49:25 UTC

For GT200 series cards you need to use an earlier driver (before 257.15) to run the 6.12 app for Windows.
197.45 might be the best one.

If you use one of the more recent drivers you will end up running the ACEMD2: GPU molecular dynamics v6.11 (cuda31) app, which only works well for Fermi cards.

Betting Slip
Message 19415 - Posted: 10 Nov 2010 | 16:54:42 UTC - in response to Message 19414.

I'll just stay off the project. I'm not going back to an ancient driver just to get the right app.

GDF
Message 19416 - Posted: 10 Nov 2010 | 17:45:20 UTC - in response to Message 19415.

In fact, you shouldn't. The cuda31 app should run on all cards.
Only the cuda app is restricted to G200 cards; that's due to the incompatibility of cuda2 with Fermi. Nevertheless, the best driver is not the latest one but the last before the change of technology (in this case G200 to Fermi), which is the 195 driver.

gdf


Saenger
Message 19417 - Posted: 10 Nov 2010 | 18:12:41 UTC - in response to Message 19416.

In fact, you shouldn't. The cuda31 app should run on all cards.
Only the cuda app is restricted to G200 cards; that's due to the incompatibility of cuda2 with Fermi. Nevertheless, the best driver is not the latest one but the last before the change of technology (in this case G200 to Fermi), which is the 195 driver.

I have the 195 driver running, and I can't see anything even remotely "best" on my machine. The WU has now been running for 25:34 h and is still at 17.168%, so another 100 h of waiting before it finishes, unless I decide to let my GPU crunch something else rather than just sit idle. 45°C is just idle temperature; with DNetC I get to 70°. The only good thing is that the 4 CPU WUs all get a whole core, but I would rather give one away than waste my GPU time so much.

The 260 drivers are not in a repository, and everything I've read about manual installation of NVIDIA drivers on various Linux boards says that you have to babysit your machine very closely, and that losing the screen completely after a kernel update is a very likely possibility.

GDF
Message 19423 - Posted: 11 Nov 2010 | 9:55:47 UTC - in response to Message 19417.

I have uploaded the new cuda3.1 for Linux to get some further feedback from a larger pool of machines. This version will work with the GTX460.

gdf

[AF>Libristes>GNU-Linux] ...
Message 19424 - Posted: 11 Nov 2010 | 11:11:58 UTC - in response to Message 19423.

For what it's worth, since the upgrade, all the WUs have been erroring out on my Linux box.

I run Ubuntu 10.10 64-bit (2.6.35-22 kernel), BOINC client 6.10.58 (the default one from Berkeley's site - manual install, not from a repository), and an NVIDIA 9800 GTX+ GPU with NVIDIA drivers 256.53 (manual install, not from repositories).

Should I upgrade/downgrade my NVIDIA drivers?
Should I switch to BOINC 6.12 (which is still not recommended on Berkeley's site)?

The message I get in the BOINC Manager is:
"GPUGRID Output file 406-KASHIF_HIVPR_n1_unbound_so_ba1-62-100-RND1104_0_0 for task 406-KASHIF_HIVPR_n1_unbound_so_ba1-62-100-RND1104_0 absent"

GDF
Message 19425 - Posted: 11 Nov 2010 | 12:48:03 UTC - in response to Message 19424.
Last modified: 11 Nov 2010 | 12:52:21 UTC

I see the problem, new version coming up.

gdf


Bikermatt
Message 19426 - Posted: 11 Nov 2010 | 13:16:07 UTC

6.12 tasks were running well on my GT240; now I have three in a row that have died.

<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
process exited with code 127 (0x7f, -129)
</message>
<stderr_txt>

</stderr_txt>
]]>

This is the same error that my GTX 470s are getting now, and they were getting the same error on the beta work units.

Both systems are Ubuntu 10.4 with driver 260.19.12

The project has been reset on both systems, any suggestions?

GDF
Message 19427 - Posted: 11 Nov 2010 | 13:20:02 UTC - in response to Message 19426.

6.13 cuda31 for Linux is now out.

gdf

skgiven
Message 19428 - Posted: 11 Nov 2010 | 17:58:28 UTC - in response to Message 19427.
Last modified: 11 Nov 2010 | 17:58:49 UTC

Running CUDA 3.1 on GT240s does not presently offer high resource utilization or credit. On a quad-GT240 system I have been running tasks on the CUDA 3.1 app for several days. The run times vary between 70,000 sec and 107,000 sec. Only the KASHIF_HIVPR tasks have a reasonably short runtime and finish between 70,000 sec and 75,000 sec (within 24 h). However, even these tasks are about 38% slower than on the 6.05 app, where KASHIF_HIVPR took between 51,000 sec and 54,000 sec.
Most other tasks take over 24 h and miss the 24 h return-time bonus. Hence the system's daily average credit dropped from 60K to 34K.

Examples of KASHIF_HIVPR tasks (all W7 x64 1.59GHz or Vista x64 1.60GHz):

(6.05 app) Time per step = 51ms V
(6.05 app) Time per step = 47ms V
(6.05 app) Time per step = 52ms V
(6.05 app) Time per step = 54ms 7
(6.05 app) Time per step = 53ms 7
(6.12 app) Time per step = 58ms 7
(6.12 app) Time per step = 59ms 7
(6.12 app) Time per step = 55ms 7
(6.11 app) Time per step = 78ms V
(6.11 app) Time per step = 72ms V
(6.11 app) Time per step = 75ms V
(6.11 app) Time per step = 72ms V

Examples of IBUCH_*_pYEEI tasks:

(6.05 app) Time per step = 48ms V
(6.05 app) Time per step = 52ms V
(6.05 app) Time per step = 49ms V
(6.05 app) Time per step = 53ms 7
(6.12 app) Time per step = 58ms 7
(6.12 app) Time per step = 58ms 7
(6.12 app) Time per step = 58ms 7
(6.11 app) Time per step = 70ms V
(6.11 app) Time per step = 73ms V
(6.11 app) Time per step = 70ms V
(6.11 app) Time per step = 92ms 7

Note the card in the Windows 7 system (7) is usually about 5% slower than the Vista (V) systems' cards, due to the card spec, a slight speed difference and different system hardware (CPU).

For GT240’s on Windows, 6.12 is on average (over different tasks) about 8% slower than 6.05, and 6.11 is around 43% slower than 6.05.

Beyond
Message 19429 - Posted: 11 Nov 2010 | 18:25:03 UTC - in response to Message 19428.

Running CUDA 3.1 on GT240s does not presently offer high resource utilization or credit. On a quad-GT240 system I have been running tasks on the CUDA 3.1 app for several days. The run times vary between 70,000 sec and 107,000 sec. Only the KASHIF_HIVPR tasks have a reasonably short runtime and finish between 70,000 sec and 75,000 sec (within 24 h). However, even these tasks are about 38% slower than on the 6.05 app, where KASHIF_HIVPR took between 51,000 sec and 54,000 sec.

For GT240’s on Windows, 6.12 is on average (over different tasks) about 8% slower than 6.05, and 6.11 is around 43% slower than 6.05.

At least. Does this slowdown include using swan_sync and losing a CPU core with 6.11 & 6.12?

Bikermatt
Message 19431 - Posted: 11 Nov 2010 | 20:33:31 UTC

In Linux, with app 6.12 and driver 260.19.12, on a GT 240 at stock clocks:

KASHIF_HIVPR tasks ~42ms per step

IBUCH_*_pYEEI tasks: ~38ms per step

My first 6.13 task just finished in Linux, also on driver 260.19.12, on a GTX 470.

It was an IBUCH and it ran about 800 sec longer than on the 6.06 app, so not good for this one, but we will see after a few different tasks have run.


[AF>Libristes>GNU-Linux] ...
Message 19432 - Posted: 11 Nov 2010 | 20:47:27 UTC - in response to Message 19431.

The 6.13 release seems to have corrected the problem I had on my Linux box. Thanks.
But the new WUs are taking 100+ hours on my 9800 GTX+, which doesn't give me any leeway with the 5-day deadline.
Even if I leave my computer running 24/7 (which I usually do), I'm not even sure I'll meet that deadline every time, as I sometimes have to dual-boot into Windows to let my children play a couple of games.
Any chance of having shorter WUs for the low-end graphics cards?

Saenger
Message 19433 - Posted: 11 Nov 2010 | 21:31:55 UTC - in response to Message 19417.

The 260-drivers are not in a repository, and all I've read about the manual installation of nvidia drivers in different Linux boards is that you have to maintain and babysit your machine very very close, and the possibility to loose the screen completely after a kernel update seems to be a very likely possibility.

I've found a way to install the new drivers via a package manager and DKMS (thank you, Ralf Recker):
This is the address of a PPA (Personal Package Archive), thus nothing official, but it worked:
https://launchpad.net/~ubuntu-x-swat/+archive/x-updates
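
The steps boil down to something like this (a sketch for Ubuntu 10.04/10.10; the package name nvidia-current is an assumption and may differ for other driver series):

sudo add-apt-repository ppa:ubuntu-x-swat/x-updates   # register the PPA and its signing key
sudo apt-get update                                   # refresh the package lists
sudo apt-get install nvidia-current                   # assumed name of the 260-series driver package
sudo reboot                                           # reload the kernel module cleanly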

I had to restart the computer afterwards, as the update included some stuff close to the kernel, but so what.

Now I've got the 260.19.12 installed, but the WU isn't running any faster. Perhaps I have to get a new one, as this one had already started; I'll let you know.

GDF
Message 19435 - Posted: 11 Nov 2010 | 21:45:32 UTC - in response to Message 19433.

Tomorrow I will run benchmarks on our G200 systems. We practically run only on Fermi now in the lab.

Every time there is a change in the application, it costs a lot of effort to tune the systems, for you guys and for us, but it is a necessary step to move the application forward and to keep pace with new drivers and cards.

gdf

skgiven
Message 19438 - Posted: 12 Nov 2010 | 0:29:30 UTC - in response to Message 19435.

I would like to have an on-site option whereby the cruncher could select specific applications in their member profile. The researchers could still set the default app as now, by driver, but allow an override that the cruncher can select online. That way we could use whatever driver we deem useful to our individual needs (within the app requirements). It would, for example, allow some people to use a 260.99 driver and run the cuda (6.12 or 6.13) app. There are many normal computer-usage reasons to have different (mostly newer) drivers (than 195, for example), and there are obvious benefits to the project; crunchers who know what they are doing naturally want to optimize for performance. It would certainly be handier for me to use an up-to-date driver and run the 6.12 app than to use a 4-port KVM, and better for the project: a 43% increase in that system's productivity.

Beyond
Message 19439 - Posted: 12 Nov 2010 | 1:45:59 UTC - in response to Message 19438.

Seconded

Bikermatt
Message 19440 - Posted: 12 Nov 2010 | 3:17:06 UTC

GTX 470 driver 260.19.12

IBUCH_*_pYEEI tasks:

6.06 app ~ 11.2ms per step
6.13 app ~ 11.8ms per step

The one input_*-TONI task that I have run was 0.7 ms per step slower as well.

On a positive note, I do have a GTX 460 running in Linux now; I can't imagine it could perform worse than it did with the 6.11 app in Win7.

Saenger
Message 19442 - Posted: 12 Nov 2010 | 8:33:38 UTC - in response to Message 19433.

Now I've got the 260.19.12 installed, but the WU isn't running any faster. Perhaps I have to get a new one, as this one had already started; I'll let you know.

It looks like it went considerably faster nevertheless.
p249-IBUCH_1_pYEEI_101109-0-20-RND9042_0

What I did besides installing a new driver was, after another hint from Ralf via PM, to set the nice value to 0, and the temperature went up from 40° to 55°.

I'm at work at the moment; I'll say more once I'm at my computer again in the evening.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

skgiven
Message 19443 - Posted: 12 Nov 2010 | 12:59:51 UTC - in response to Message 19442.

Bikermatt, a week or so ago nobody could even use a GTX460 with Linux. Let’s hope some of our slowdown is offset by new Linux crunchers.

Your GTX460 takes 21 ms per step for an IBUCH task under Linux and 32 ms under Win7. Although there might be some WU-to-WU difference, most of your Win7 tasks take around 29 to 32 ms per step.

Anyone tried a GTS450 on Linux?

CElliott
Message 19447 - Posted: 12 Nov 2010 | 18:50:46 UTC

I think I may know why I am seeing frequent errors with the acemd2_6.11_windows_intelx86__cuda31.exe client. Every now and then one of my GPUs becomes very slow -- the GTS 250 in a computer that has a 250 and a GTX 460. I think it may be overclocked too much; I have slowed it down, and it did work well with Seti@Home. I can tell it is not doing anything because the fraction done only advances about 0.010% per minute, whereas normal progress is about 0.08% per minute, and the GPU temperature is only about 36 degrees C, against a normal of about 46. The solution is to reboot the computer, and a reset may help as well. However, when the system restarts, the workunit always errors out. The last time, I watched it in Windows XP Task Manager and Boincmgr. It appears that what "may" have happened is that BOINC tried to start the app on device 0 (the 460) several times, and then tried device 1 (the 250), which already had a task running. Here is the <stderr> section:

<stderr_out>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using device 1
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 460"
# Clock rate: 2.05 GHz
# Total amount of global memory: 1073283072 bytes
# Number of multiprocessors: 7
# Number of cores: 56
# Device 1: "GeForce GTS 250"
# Clock rate: 1.78 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
# Using device 0
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 460"
# Clock rate: 2.05 GHz
# Total amount of global memory: 1073283072 bytes
# Number of multiprocessors: 7
# Number of cores: 56
# Device 1: "GeForce GTS 250"
# Clock rate: 1.78 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 16
# Number of cores: 128
# Using device 0
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 460"
# Clock rate: 2.05 GHz
# Total amount of global memory: 1073283072 bytes
# Number of multiprocessors: 7
# Number of cores: 56
# Device 1: "GeForce GTS 250"
# Clock rate: 1.78 GHz
# Total amount of global memory: 1073545216 bytes
# Number of multiprocessors: 16
# Number of cores: 128
SWAN : FATAL : Failure executing kernel sync [swan_fast_fill] [700]
Assertion failed: 0, file swanlib_nv.c, line 124

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
</stderr_txt>

I see this error message often: 'MDIO ERROR: cannot open file "restart.coor"', and as far as I can tell, restart.coor is always where it should be. Is it possible that the app times out trying to read restart.coor when the hard disk is busy?

ExtraTerrestrial Apes
Message 19452 - Posted: 13 Nov 2010 | 11:34:53 UTC - in response to Message 19447.

The restart message is not an actual error; just ignore it. 2.05 GHz is quite high for a GTX460, though. Are you running at elevated temperatures with extreme cooling? In case of problems with that setup, I suggest lowering the clock as a first step.

MrS

biodoc
Message 19453 - Posted: 13 Nov 2010 | 11:42:59 UTC - in response to Message 19442.


What I've done besides installing a new driver was, after another hint by Ralf via PM to set the nice-value to 0, and the temperature went up from 40° to 55°.


I noticed the Linux ver 6.13 app has a default nice value of 10. At nice=10, my GTX460 GPU temp is 44C and the %CPU for the app is close to 0. If I manually set nice to 5, I see the same low temp and %CPU. If I set the nice value to 4, the GPU temp jumps to 58C and %CPU=12. Would it be possible to distribute the app with a default nice value of 4?
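
For anyone who wants to test this on a running task, a sketch (the process-name pattern acemd2 is an assumption; check the slot directory for the exact binary name):

pgrep -f acemd2                          # find the PID(s) of the running app
sudo renice -n 4 -p $(pgrep -f acemd2)   # lower the nice value to 4 (raising priority needs root)
watch -n 5 nvidia-smi -a                 # watch GPU temperature/utilization respond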

64 bit Ubuntu 10.04
boinc 6.10.58
GTX460
nvidia driver version 260.19.14

Thanks very much for developing a Linux app that works with Fermi cards!!

Beyond
Message 19454 - Posted: 13 Nov 2010 | 14:53:49 UTC - in response to Message 19453.


What I did besides installing a new driver was, after another hint from Ralf via PM, to set the nice value to 0, and the temperature went up from 40° to 55°.

I noticed the Linux ver 6.13 app has a default nice value of 10. At nice=10, my GTX460 GPU temp is 44C and the %CPU for the app is close to 0. If I manually set nice to 5, I see the same low temp and %CPU. If I set the nice value to 4, the GPU temp jumps to 58C and %CPU=12. Would it be possible to distribute the app with a default nice value of 4?

On my Windows machines I also boost the priority (via eFMer Priority64), generally to "high". GPUGRID runs much faster and still uses only a small portion of 1 CPU; no SWAN_SYNC needed.

Bikermatt
Message 19455 - Posted: 13 Nov 2010 | 15:43:35 UTC

The 6.13 app is running slower on my GTX 460 in Win7 compared to the 6.11 app.

The GTX 460 is running well in Linux on the 6.13 app; so far I am seeing around 21 ms per step for the IBUCH tasks.

p2-IBUCH_15_PQpYEEIPI_101019-14-40-RND7762_2

# Time per step (avg over 1250000 steps): 33.258 ms
# Approximate elapsed time for entire WU: 41573.039 s

application version ACEMD2: GPU molecular dynamics v6.13 (cuda31)

p25-IBUCH_3_PQpYEEIPI_101019-14-40-RND3646_1

# Time per step (avg over 275000 steps): 31.505 ms
# Approximate elapsed time for entire WU: 39381.489 s

application version ACEMD2: GPU molecular dynamics v6.11 (cuda31)

skgiven
Message 19456 - Posted: 13 Nov 2010 | 18:02:35 UTC - in response to Message 19454.

Tested my GTX260 (at factory settings) on Kubuntu.

While running a nice fast GIANNI task without setting swan_sync to zero, it was very slow; after 12 h it had not reached 50% complete (47.5%), so it would not have finished within 24 h.

I then freed up a CPU core, configured swan_sync=0 and restarted, and the task sped up considerably:
it finished in about 15½ h, suggesting the task would have finished in around 7 h if I had used swan_sync from the start. Just under 12 ms per step.

I'm now running one of the slower IBUCH tasks, but it should still finish in around 9 h 40 min.

The faster tasks used to take my GTX260 around 6½ h on XP, when the card was overclocked and a CPU core was freed up.


Bikermatt, have you left a CPU core free for your GTX460 on Win7?
You might also want to try Beyond's method of increasing priority using eFMer Priority64.

Beyond
Message 19457 - Posted: 13 Nov 2010 | 18:59:19 UTC - in response to Message 19456.

Tested my GTX260 (at factory settings) on Kubuntu.

While running a nice fast GIANNI task without setting swan_sync to zero, it was very slow; after 12 h it had not reached 50% complete (47.5%), so it would not have finished within 24 h.

I then freed up a CPU core, configured swan_sync=0 and restarted, and the task sped up considerably:
it finished in about 15½ h, suggesting the task would have finished in around 7 h if I had used swan_sync from the start. Just under 12 ms per step.

Have you tried setting the nice level as suggested above by several people? It would be interesting to compare this to using SWAN_SYNC.

I know of no other DC projects that act this way. I suspect it's a programming issue that needs to be addressed.

GDF
Message 19458 - Posted: 13 Nov 2010 | 19:04:32 UTC - in response to Message 19457.

The GIANNI tasks are using the new algorithm for faster speed. It is a test.
Hopefully, soon all the simulations will use it.
It should be quite a bit faster on every card.

gdf


BorgHunter
Message 19459 - Posted: 14 Nov 2010 | 1:27:57 UTC

I've been having a problem, starting with 6.12, where my GPU utilization is only around 10%. I run Rosetta and WCG on my four CPU cores, and previously I never had a problem with GPUGRID; it'd have healthy GPU utilization and I'd speed along nicely through tasks. Starting with 6.12, and including 6.13 (cuda31), I need to tell BOINC to allocate only 75% of my CPU cores (i.e. 3 of my 4 cores); then my GPU utilization jumps to around 80%.
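
(For reference, the same 75% setting can be applied per host without the GUI by dropping a global_prefs_override.xml into the BOINC data directory; a sketch, assuming the data directory is /var/lib/boinc-client:)

sudo tee /var/lib/boinc-client/global_prefs_override.xml >/dev/null <<'EOF'
<global_preferences>
   <max_ncpus_pct>75.0</max_ncpus_pct>
</global_preferences>
EOF
boinccmd --read_global_prefs_override   # tell the running client to pick up the override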

Here's my uname:

borghunter@apollo ~ $ uname -a
Linux apollo 2.6.35-ARCH #1 SMP PREEMPT Sat Oct 30 21:22:26 CEST 2010 x86_64 Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz GenuineIntel GNU/Linux


Here's what 75% CPU allocation in BOINC looks like in nvidia-smi:
borghunter@apollo ~ $ nvidia-smi -a

==============NVSMI LOG==============


Timestamp : Sat Nov 13 19:18:28 2010

Driver Version : 260.19.21


GPU 0:
Product Name : GeForce GTX 275
PCI Device/Vendor ID : 5e610de
PCI Location ID : 0:1:0
Board Serial : 212899432077126
Display : Connected
Temperature : 70 C
Fan Speed : 40%
Utilization
GPU : 78%
Memory : 16%

And this is with 100% CPU allocation:
borghunter@apollo ~ $ nvidia-smi -a

==============NVSMI LOG==============


Timestamp : Sat Nov 13 19:26:30 2010

Driver Version : 260.19.21


GPU 0:
Product Name : GeForce GTX 275
PCI Device/Vendor ID : 5e610de
PCI Location ID : 0:1:0
Board Serial : 212899432077126
Display : Connected
Temperature : 63 C
Fan Speed : 40%
Utilization
GPU : 9%
Memory : 2%

GDF
Message 19460 - Posted: 14 Nov 2010 | 9:24:41 UTC - in response to Message 19459.

Hi,
this is because, from this version on, the default is not to use a full CPU core to drive the GPU. If you want the old behaviour back, just add export SWAN_SYNC=0 to your .bashrc.

gdf


ftpd
Message 19461 - Posted: 14 Nov 2010 | 9:51:32 UTC

Hi Gianni,

My GTX295 card uses at most 50-56% of the GPU.
My GTX480 card uses at most 48% of the GPU.

Temp = OK

All machines: Windows XP Pro - 260.99 driver - BOINC 6.10.58,
running application 6.13 (cuda31).

Very slow!!

Correct?

biodoc
Message 19462 - Posted: 14 Nov 2010 | 11:09:33 UTC

I've had 2 of the KASHIF WUs fail with this output:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 460"
# Clock rate: 1.53 GHz
# Total amount of global memory: 804454400 bytes
# Number of multiprocessors: 7
# Number of cores: 56
SIGABRT: abort called
Stack trace (13 frames):
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31(boinc_catch_signal+0x4d)[0x47d48d]
/lib/libc.so.6(+0x33af0)[0x7f9be5b11af0]
/lib/libc.so.6(gsignal+0x35)[0x7f9be5b11a75]
/lib/libc.so.6(abort+0x180)[0x7f9be5b155c0]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x48abeb]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x433d50]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x430246]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x42f957]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x41480d]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x407ae0]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x408346]
/lib/libc.so.6(__libc_start_main+0xfd)[0x7f9be5afcc4d]
../../projects/www.gpugrid.net/acemd2_6.13_x86_64-pc-linux-gnu__cuda31[0x407849]

Exiting...

</stderr_txt>
]]>

One KASHIF WU finished though:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 460"
# Clock rate: 1.53 GHz
# Total amount of global memory: 804454400 bytes
# Number of multiprocessors: 7
# Number of cores: 56
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 1000000 steps): 28.011 ms
# Approximate elapsed time for entire WU: 28010.660 s
22:21:57 (4713): called boinc_finish

</stderr_txt>
]]>

64 bit Ubuntu 10.04
boinc 6.10.58
GTX460
nvidia driver version 260.19.14

GDF
Message 19463 - Posted: 14 Nov 2010 | 11:39:56 UTC - in response to Message 19462.

On Linux, at normal priority 0 and with SWAN_SYNC=0, the %GPU used should be around 95%.

gdf

Tom Philippart
Message 19464 - Posted: 14 Nov 2010 | 13:08:31 UTC - in response to Message 19461.

Hi Gianni,

My GTX295 card uses at most 50-56% of the GPU.
My GTX480 card uses at most 48% of the GPU.

Temp = OK

All machines: Windows XP Pro - 260.99 driver - BOINC 6.10.58,
running application 6.13 (cuda31).

Very slow!!

Correct?


Around 50% usage on a GTX260 under Win7 too!

ExtraTerrestrial Apes
Message 19465 - Posted: 14 Nov 2010 | 13:29:12 UTC
Last modified: 14 Nov 2010 | 13:29:37 UTC

If you use one CPU core by default, people complain that they want their cores back. If you try not to use it, GPU performance suffers and people complain. Personally I'd go with the latter, but one thing is clear: deciding for either version will never satisfy everyone.

GDF, you said you're talking with David about adding some feature to BOINC so it will be easier to configure this behaviour than by using an environment variable. Couldn't you use methods already existing in BOINC? For example, in Rosetta under "my account/Rosetta@Home settings" they've added a setting "Target CPU run time" where I can decide on any number of hours. Somehow this gets passed to the application. And it's tied to the 4 profiles, so one can make different choices for different PCs.

I think that's just what we need. And probably set the default to "use one core", as SWAN_SYNC=1 appears to have a heavy impact on GPU performance.

MrS

Bobrr
Message 19469 - Posted: 14 Nov 2010 | 17:37:49 UTC - in response to Message 19443.

I have a GTS 450 w/ 1024 MB GDDR5.
'Processor: 1 AuthenticAMD AMD Athlon(tm) 64 Processor 3200+ [Family 15 Model 47 Stepping 0]
Processor: 512.00 KB cache
Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow up rep_good pni lahf_lm
OS: Linux: 2.6.35-22-generic
Memory: 1.96 GB physical, 5.74 GB virtual
Disk: 682.02 GB total, 639.43 GB free'

It doesn't seem to be recognised in GPUGRID.

skgiven
Message 19472 - Posted: 14 Nov 2010 | 18:13:06 UTC - in response to Message 19469.

BobR, you will need the latest NVidia driver.

Bobrr
Message 19476 - Posted: 14 Nov 2010 | 19:03:26 UTC - in response to Message 19472.

Latest drivers installed and working.

'OpenGL version string: 4.1.0 NVIDIA 260.19.21'

Any other clues?

Richard Haselgrove
Message 19477 - Posted: 14 Nov 2010 | 19:06:02 UTC - in response to Message 19476.

Latest drivers installed and working.

'OpenGL version string: 4.1.0 NVIDIA 260.19.21'

Any other clues?

Ensure that all graphics drivers have finished loading before BOINC starts. Other Linux users (I'm not one, I just read the posts on the boards) sometimes suggest a delay in the BOINC startup script.
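
Something along these lines, as a sketch (the 30-second figure, the script name and the init-script path are assumptions; adjust for your distro):

#!/bin/sh
# boinc-delayed-start.sh: wait for the NVIDIA driver before starting BOINC
sleep 30                                   # give X and the driver time to load
until nvidia-smi -a >/dev/null 2>&1; do    # poll until the GPU answers
    sleep 5
done
/etc/init.d/boinc-client start             # now BOINC should see the coprocessor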

Bobrr
Message 19478 - Posted: 14 Nov 2010 | 19:26:13 UTC - in response to Message 19477.

The drivers were installed last week, and the system, including BOINC, has been rebooted since then.

'# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 260.19.21 (buildmeister@builder101) Thu Nov 4 21:47:28 PDT 2010'

Did I miss anything?

Richard Haselgrove
Message 19479 - Posted: 14 Nov 2010 | 19:43:38 UTC - in response to Message 19478.

The drivers were installed last week, and the system, including BOINC, has been rebooted since then.

'# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 260.19.21 (buildmeister@builder101) Thu Nov 4 21:47:28 PDT 2010'

Did I miss anything?

Your host 84614 is not reporting any coprocessors visible to BOINC. You probably need to wait for a Linux BOINC specialist to advise you on that problem.

skgiven
Message 19481 - Posted: 15 Nov 2010 | 0:36:50 UTC - in response to Message 19479.

BobR, your GPU is not listed here and there might be a problem with that, unless you just took the card out before posting, or uninstalled the driver.

When I looked at your task list I could see that you did download and run many tasks while the GPU was present. The problem is that they all failed. Most ran and immediately failed on the v6.06 (cuda30) application, which does not work on a GTS450, giving you a bad credit rating. As a result you will only get one new task a day until you start completing tasks.

http://www.gpugrid.net/results.php?hostid=84614

Maximum daily WU quota per CPU 1/day,
http://www.gpugrid.net/show_host_detail.php?hostid=84614

This task ran for 31h before failing, with the error message "process got signal 11."

The cuda3.1 app should work, but that task time is too long. To give yourself a good chance of finishing a task you should leave a CPU core free in BOINC Manager and use swan_sync=0. This will significantly reduce the run time; it should cut it in half at least.

Bobrr
Message 19482 - Posted: 15 Nov 2010 | 4:22:01 UTC - in response to Message 19481.

Thank you for the explanation. It does explain why the host was not doing so well. I purchased this card specifically to provide more computing power, as it was touted to be a great double-precision-capable card. It should have increased the power to produce more results, especially in Linux. I understand this card has compute capability 2.3. Also, according to 2 other BOINC projects (Einstein and MilkyWay), this card doesn't show up in the summary either.

How do I determine which projects to run to take best advantage of the card? I'm not worried about points or credits, just being able to work on WUs as efficiently as possible.

Thank you.

skgiven
Message 19484 - Posted: 15 Nov 2010 | 9:40:11 UTC - in response to Message 19482.

Hi BobR. As your GPU is not listed here, at Einstein or at MW, I would say it is not being recognised, but perhaps this is a server reporting issue (it is a new card type, GF106).
What does the 13th line down of the BOINC Manager Messages say about the card? (Post back.) It should look something like this:
14/11/2010 15:50:57 NVIDIA GPU 0: GeForce GTS 450 (driver version 260.19, CUDA version 3100, compute capability 2.1, 1024MB, 602 GFLOPS peak).
I thought the GTS450 was a CC2.1 card; if it is a CC2.2 card there might need to be an application update before it works, but I cannot confirm this, and I am not sure if BOINC identifies card types by Compute Capability. Apps are selected according to CC at Folding.
On Einstein your tasks mostly failed, and even the ones that appeared to run actually had issues; "Error writing shared memory data (size limit exceeded)!" was listed repeatedly in the task details. This suggests the project does not support your card as yet, but you would need to confirm that over at Einstein.

If I am correct, the GTS450 can use up to 2GB of system memory to back up the card's 1GB memory.

You have not run any tasks at MW. I suggest you ask whether the GTS450 works there before trying.
As you have a 3200+ CPU, I would reiterate: do not use it to crunch alongside the GPU; just crunch with the GPU. The CPU is needed to support the GPU and run the system, and that CPU would not do anywhere near the work of that GPU.
I still think your GTS450 might work at GPUGrid now, but you would need to set swan_sync to zero and not use the CPU core to crunch other projects; for it to run for 31h suggests it can run. Of course, because this is a new GF106 design, you might need to wait for a CUDA3.2-based app. I think you should set swan_sync to zero in your .bashrc, stop crunching CPU projects, restart, and give it a go again here. There is no public GPUGrid app that needs over 500MB of GDDR space to run.
As for which projects to crunch, that's up to you to decide – do you prefer astrophysics, nuclear research, maths, or projects such as this one, which advances science, computer modelling and medical research?

PS. So that nobody falls for the NVIDIA OEM trap:
DO NOT buy an OEM GTS450 – they only have 144 shaders (not 192) and 3 SMs, not 4. It is basically a GTS440 with GDDR5 and reasonable clock rates. As for the GTS430, it would not be any better than a GT240, so it is not worth buying to crunch with.

Bobrr
Message 19485 - Posted: 15 Nov 2010 | 14:02:06 UTC - in response to Message 19484.

Line 13 reads:
'Sat 13 Nov 2010 08:00:09 PM EST No usable GPUs found'
Interestingly enough, that was the message a few days ago, after the install and before I updated the drivers. The messages from Einstein and GPUGRID are:
'Sat 13 Nov 2010 08:00:09 PM EST Einstein@Home Application uses missing NVIDIA GPU'
'Sat 13 Nov 2010 08:00:10 PM EST GPUGRID Application uses missing NVIDIA GPU'
Which is confusing. I will get in touch with Einstein@home as well.

'If I am correct, the GTS450 can use up to 2GB of system memory to back up the card's 1GB memory.' I may have to up my RAM!

As to MW, it hasn't run any tasks since before the new card was installed. Interesting as well.

As to the CPU, I'll work on changing the setup as recommended.

As to the projects, I was enquiring as to which would support the GTS450 best, not which to crunch. I'll sort that out as well.

Thank you for your support, and keep up the good work. I saw an article at Einstein@Home asking for suggestions on how to increase the number of participants. In Canada, and I'm sure across most of the participating nations, libraries have computers sitting idle or mostly idle. They would be a good source for expanding the systems. I will be bringing this up with my local library, and I encourage anyone else reading this to do so as well.

skgiven
Message 19486 - Posted: 15 Nov 2010 | 15:02:43 UTC - in response to Message 19485.

I guess you need to install the drivers again.

Richard Haselgrove
Message 19487 - Posted: 15 Nov 2010 | 15:08:38 UTC - in response to Message 19486.

I guess you need to install the drivers again.

And, as I said a few posts ago, ensure that the drivers have time to load and fully initialise before BOINC tries to use them. This is clearly a BOINC/driver issue, since it affects all projects equally.

Beyond
Message 19489 - Posted: 15 Nov 2010 | 18:18:36 UTC - in response to Message 19458.

The GIANNI tasks are using the new algorithm for faster speed. It is a test.
Hopefully, soon all the simulations will use it.
It should be quite a bit faster on every card.

gdf


GDF, the GIANNI_DHFR500 tasks are running well here with boosted priority and no SWAN_SYNC even on my GT 240 cards.

SK, I tested the GIANNI_DHFR500 WUs with no SWAN_SYNC myself. My GT 240 card with a modest shader OC to 1500MHz ran these 2 GIANNI_DHFR500 WUs in 41772.922 and 41906.609 seconds. GPU usage was a steady 89%. This was without setting SWAN_SYNC, just boosting the GPUGRID app's priority to high. All 4 CPU projects ran at normal speeds as did AQUA. No cores wasted.

http://www.gpugrid.net/result.php?resultid=3293056
http://www.gpugrid.net/result.php?resultid=3277797

You have one machine that also has finished 2 GIANNI_DHFR500 WUs, I assume using SWAN_SYNC. The times are considerably slower even though you are using a higher OC to 1600MHz. The times are 49102.764 and 46696.179 seconds:

http://www.gpugrid.net/result.php?resultid=3287168
http://www.gpugrid.net/result.php?resultid=3283754

While SWAN_SYNC obviously works, I think it's akin to hunting mice with an elephant gun. Boosting priority and/or regulating polling via the application seems a more elegant solution. For instance, Collatz uses the following method to control polling:

>> Polling behavior for the GPU within the Brook runtime: b (default 1)
>> See the option w for starters. If that time has elapsed, the GPU polling starts. This can be done
>> by continuously checking if the task has finished (b-1), enabling the fastest runtimes, but potentially
>> creating a high CPU load (a bit dependent on driver version). Second possibility is to release the time
>> slice allotted by the OS, so other apps can run (b0). The catch is that there is some interaction with
>> the priority. The time slice is only released to other tasks of the same priority. So raising the priority
>> effectively disables the release and the behavior is virtually identical to setting this parameter
>> to -1. If a raised priority and a low CPU time is wanted, one should leave it at the default of 1. This
>> suspends the task for at least 1 millisecond, enabling also tasks of lower priority to use the CPU in the
>> meantime. One can use also b2 or b3 if one wants a smoother system behavior.
>> Possible values:
>> b-1: busy waiting
>> b0: release time slice to other tasks of same priority
>> b1, b2 or b3: release time slice for at least 1, 2, or 3 milliseconds respectively

Seems this would have the same effect as SWAN_SYNC without using an entire CPU core.


Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19490 - Posted: 15 Nov 2010 | 22:08:03 UTC - in response to Message 19489.
Last modified: 15 Nov 2010 | 22:08:45 UTC

>> Polling behavior for the GPU within the Brook runtime: b (default 1)
>> See the option w for starters. If that time has elapsed, the GPU polling starts. This can be done
>> by continuously checking if the task has finished (b-1), enabling the fastest runtimes, but potentially
>> creating a high CPU load (a bit dependent on driver version). Second possibility is to release the time
>> slice allotted by the OS, so other apps can run (b0). The catch is that there is some interaction with
>> the priority. The time slice is only released to other tasks of the same priority. So raising the priority
>> effectively disables the release and the behavior is virtually identical to setting this parameter
>> to -1. If a raised priority and a low CPU time is wanted, one should leave it at the default of 1. This
>> suspends the task for at least 1 millisecond, enabling also tasks of lower priority to use the CPU in the
>> meantime. One can use also b2 or b3 if one wants a smoother system behavior.
>> Possible values:
>> b-1: busy waiting
>> b0: release time slice to other tasks of same priority
>> b1, b2 or b3: release time slice for at least 1, 2, or 3 milliseconds respectively

Seems this would have the same effect as SWAN_SYNC without using an entire CPU core.


The longest kernel in the DHFR workunit is on the order of a couple of milliseconds..., but maybe we can raise the priority instead of changing the polling method.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19492 - Posted: 16 Nov 2010 | 3:49:39 UTC - in response to Message 19490.

The longest kernel in the DHFR workunit is on the order of a couple of milliseconds..., but maybe we can raise the priority instead of changing the polling method.

I've been setting the priority at high but maybe it could be a bit lower and still get the same boost.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19493 - Posted: 16 Nov 2010 | 4:02:51 UTC - in response to Message 19489.
Last modified: 16 Nov 2010 | 4:04:41 UTC

Tested my GTX260 (at factory settings) on Kubuntu.

While running one of these nice fast GIANNI tasks without setting swan_sync to zero, it was very slow; after 12h it had not reached 50% completion (47.5%), so it would not have finished within 24h.

I then freed up a CPU core, configured swan_sync=0, restarted and the task sped up considerably:
It finished in about 15½h, suggesting the task would have finished in around 7h if I had used swan_sync from the start. Just under 12ms per step.

Have you tried setting the nice level as suggested above by several people? It would be interesting to compare this to using SWAN_SYNC.

SK, I tested the GIANNI_DHFR500 WUs with no SWAN_SYNC myself. My GT 240 card with a modest shader OC to 1500MHz ran these 2 GIANNI_DHFR500 WUs in 41772.922 and 41906.609 seconds. GPU usage was a steady 89%. This was without setting SWAN_SYNC, just boosting the GPUGRID app's priority to high. All 4 CPU projects ran at normal speeds as did AQUA. No cores wasted.

http://www.gpugrid.net/result.php?resultid=3293056
http://www.gpugrid.net/result.php?resultid=3277797

You have one machine that also has finished 2 GIANNI_DHFR500 WUs, I assume using SWAN_SYNC. The times are considerably slower even though you are using a higher OC to 1600MHz. The times are 49102.764 and 46696.179 seconds:

http://www.gpugrid.net/result.php?resultid=3287168
http://www.gpugrid.net/result.php?resultid=3283754

Two more GIANNI_DHFR500 WUs, 1 from each of my other GT 240 cards. Faster yet. Times of 37537.328 seconds for the 1st card (1500MHz, 97% GPU) and 35969.656 seconds for the 2nd card (1550MHz, 99% GPU). Again, no SWAN_SYNC, priority simply boosted to high via eFMer Priority 64.

http://www.gpugrid.net/result.php?resultid=3294938
http://www.gpugrid.net/result.php?resultid=3294258

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19495 - Posted: 16 Nov 2010 | 10:20:23 UTC - in response to Message 19493.
Last modified: 16 Nov 2010 | 10:46:28 UTC

Thanks Beyond, eFMer Priority will have another home for a while.

These GIANNI_DHFR500 tasks are very fast; probably the fastest tasks we have seen to date, 17.5 to 25ms per step on GT240s. I hope these advancements make their way into the other WUs. Pleased to see one turn up on my GTX470, just about to start.

Although I had a few pauses/restarts during my GIANNI_DHFR500 GT240 runs (which usually adds a few minutes to run time), a big difference between our cards' speeds is that I'm stuck using Vista; I think 11% was the ballpark figure for 6.05, probably about the same for 6.12. Nevertheless, I'm sure the priority increase does help; the difference between our times is >11%. It would still need to be mapped out for Fermis and non-Fermis and on XP, Vista, W7, Linux (nice values). Did you run any same-type tasks without priority increases, to see what the differences are? This would really need to be compared on the same system (and perhaps with different priorities), and then confirmed on other systems and apps. The trouble with doing this on a GT240 is that it takes a few days, and you have to change the priority at the beginning of a task run (difficult for me with 4 cards all at different run stages and running different task types).

While swan_sync seems to be important on Linux for all cards, on Windows it is only important for Fermis, not GT240s running the 6.12 app (it makes little or no difference). It might help with the 6.13 app on a GF200-series card, but I have not tested this. So I suspect priority is not a replacement for swan_sync, rather an alternative that can also be used alongside it, or an option for non-Fermis. I think one of my GIANNI_DHFR500 tasks (possibly the second one) used swan_sync, but there is little difference in CPU time. When swan_sync is used with a Fermi it does use a full core/thread.

Both of my GIANNI_DHFR500 tasks used slightly less CPU time than yours did. While this may be down to using a high priority, it might just be a difference in CPU performance. So to test whether just increasing priority is an alternative to swan_sync, I think we would need to test this on a Fermi: with swan_sync=0 and default (low) priority vs swan_sync not in use and high priority. I remember looking at this in the past, and if my memory serves me right we can get away with “Set Above Normal” priority. I'm trying this now (switched on last night mid-run) on my GT240s. Just need to wait a day to get clean results.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19498 - Posted: 16 Nov 2010 | 11:31:40 UTC - in response to Message 19495.

The GIANNI_DHFR500 task errored out after 108sec on the GTX470:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>

http://www.gpugrid.net/result.php?resultid=3298906

XP x86, i7-920

I also see another one that errored 2 days ago:
http://www.gpugrid.net/result.php?resultid=3292121

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 16,606
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19499 - Posted: 16 Nov 2010 | 11:59:41 UTC - in response to Message 19498.

The GIANNI_DHFR500 task errored out after 108sec on the GTX470:

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>


My GIANNI_DHFR500 tasks are erroring out the same way after various computing times on a GTX480 (overclocked to 800MHz)

task 3282907, 3281784, 3281542, 3281228, 3279134, 3278165

Meanwhile, other highly GPU-utilizing tasks (such as KASHIF_HIVPRs) run correctly on these overclocked cards.

When I clocked it back to factory settings (while running another GIANNI_DHFR500 task, so the tasks page still shows the higher frequency), it finished correctly.
Task 3294885

So these GIANNI_DHFR500 tasks seem to be more sensitive to overclocking on Fermis than other tasks.

ftpd
Send message
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19500 - Posted: 16 Nov 2010 | 12:00:52 UTC
Last modified: 16 Nov 2010 | 12:06:15 UTC

197-GIANNI_DHFR500-2-99-RND7994_1
Workunit 2076439
Created 15 Nov 2010 17:28:35 UTC
Sent 15 Nov 2010 19:41:38 UTC
Received 16 Nov 2010 7:21:08 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 35174
Report deadline 20 Nov 2010 19:41:38 UTC
Run time 9372.934303
CPU time 9349.922
stderr out <core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
SWAN: Using synchronization method 0
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 2000000 steps): 4.685 ms
# Approximate elapsed time for entire WU: 9370.085 s
called boinc_finish

</stderr_txt>
]]>


Validate state Valid
Claimed credit 7491.18171296296
Granted credit 11236.7725694444

@skgiven,

This one is OK.
Windows XP Pro, driver 260.99, BOINC manager 6.10.58, swan_sync=0

Good luck!

436-GIANNI_DHFR500-2-99-RND8894_1
Workunit 2076552
Created 15 Nov 2010 17:03:33 UTC
Sent 15 Nov 2010 17:15:30 UTC
Received 16 Nov 2010 7:26:10 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 47762
Report deadline 20 Nov 2010 17:15:30 UTC
Run time 22050.75
CPU time 1663.875
stderr out <core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 1
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 2000000 steps): 11.023 ms
# Approximate elapsed time for entire WU: 22045.188 s
called boinc_finish

</stderr_txt>
]]>


Validate state Valid
Claimed credit 7491.18171296296
Granted credit 11236.7725694444
application version ACEMD2: GPU molecular dynamics v6.13 (cuda31)

--------------------------------------------------------------------------------
And another one with GTX295!

--------------------------------------------------------------------------------
____________
Ton (ftpd) Netherlands

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19501 - Posted: 16 Nov 2010 | 12:06:49 UTC - in response to Message 19499.

I will put them back to stock, keep the fans high and see how they get on. I have 5 threads free on the CPU and it's at stock, with turbo off, and the system is well cooled.

Your 4.034ms per step looks exceptional. That could bring home 120K credits per day, and at stock.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19502 - Posted: 16 Nov 2010 | 13:24:24 UTC - in response to Message 19501.

These tasks might be a bit sensitive, so I decided to give it a fair chance. Suspended all work, restarted system, reset cards back to stock, upped the fan to 80% and allowed 1 GIANNI_DHFR500 task to run by itself; nothing running on the other card and no CPU tasks running.

So far so good - about 37% complete after 68min.

GPU Utilization at 95%
GPU Temp at 59 deg C
GPU Fan at 4850rpm
Entire System (XP X86) only using 648MB

Virtual memory size 50.14MB
Working set size 37.01MB
CPU usage fluctuates between 13% and 18% very rapidly. Using one full core + a bit (probably just the system, Boinc, Task Manager, FF).

To be honest I would be more than happy if GPUGrid required 2 or 3 threads of my i7-920, if they are needed to support two Fermis and bring this sort of improvement. Each GTX470 could do 100K per day for a very worthwhile project.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 16,606
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19503 - Posted: 16 Nov 2010 | 15:08:13 UTC - in response to Message 19502.
Last modified: 16 Nov 2010 | 15:22:56 UTC

I have another successful GIANNI_DHFR500 task (3299428).
This time the GPU ran at 800MHz, and I've raised the GPU's core voltage a bit more to 1.075V (it was 1.050V before).
The interesting part is that the average time per step at 800MHz (4.038 ms) is actually a little higher than at 700MHz (4.034 ms).

Bobrr
Send message
Joined: 2 Jul 10
Posts: 7
Credit: 27,699,565
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 19505 - Posted: 16 Nov 2010 | 16:45:02 UTC - in response to Message 19487.

The GTS450 is showing up again in BOINC:
'Tue 16 Nov 2010 11:19:53 AM EST NVIDIA GPU 0: GeForce GTS 450 (driver version unknown, CUDA version 3020, compute capability 2.1, 1023MB, 421 GFLOPS peak)'
'driver version unknown' is a bit odd, as it shows up in Ubuntu as 260.19.21

The card also shows up in GPUGRID under host ID: 84614.

It is currently running a task:
'Tue 16 Nov 2010 11:36:24 AM EST GPUGRID Starting task r475s1f1_r130s2-TONI_MSM5-0-4-RND8696_0 using acemd2 version 613'

I'll keep an eye on it. Thanks again for the assistance.

Profile Microcruncher*
Avatar
Send message
Joined: 12 Jun 09
Posts: 4
Credit: 185,737
RAC: 0
Level

Scientific publications
watwat
Message 19506 - Posted: 16 Nov 2010 | 16:51:08 UTC - in response to Message 19505.
Last modified: 16 Nov 2010 | 16:55:21 UTC

'driver version unknown' is a bit odd, as it shows up in Ubuntu as 260.19.21

That's quite normal. Here (Ubuntu 64 Bit, GTX 460, 260.19.12) it looks similar:

Di 16 Nov 2010 13:47:05 CET NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 3020, compute capability 2.1, 1023MB, 650 GFLOPS peak)

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19507 - Posted: 16 Nov 2010 | 16:54:26 UTC - in response to Message 19503.
Last modified: 16 Nov 2010 | 17:03:25 UTC

Retvari Zoltan, I have seen this behaviour a few times in the past, one fairly recently. When you increase the GPU frequency, tasks run faster and faster until performance peaks, at the apex of a sort of bell curve, and then less and less fast until you reach the point where they run slower than at stock, just before most tasks start failing. You have to find that sweet spot, and it might change for different task types, so it is better to err on the side of caution. Basically, data has to be resent within the card because it did not arrive correctly. May be a feature of the memory controller ;p
Try 725MHz, 750MHz and 775MHz and see which setting finishes these tasks quicker.

That GIANNI_DHFR500 task completed OK on my system. 5.546 ms per step on a stock GTX470 is good going.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19508 - Posted: 16 Nov 2010 | 17:21:43 UTC - in response to Message 19506.

'driver version unknown' is a bit odd, as it shows up in Ubuntu as 260.19.21

That's quite normal. Here (Ubuntu 64 Bit, GTX 460, 260.19.12) it looks similar:

Di 16 Nov 2010 13:47:05 CET NVIDIA GPU 0: GeForce GTX 460 (driver version unknown, CUDA version 3020, compute capability 2.1, 1023MB, 650 GFLOPS peak)


Here it's
Mo 15 Nov 2010 20:30:06 CET NVIDIA GPU 0: GeForce GT 240 (driver version unknown, CUDA version 3020, compute capability 1.2, 511MB, 257 GFLOPS peak)

And my Nvidia settings say 260.19.12 under ubuntu10.4-64bit
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19509 - Posted: 16 Nov 2010 | 17:23:54 UTC - in response to Message 19499.

My GIANNI_DHFR500 tasks are erroring out the same way after various computing times on a GTX480 (overclocked to 800MHz)

Meanwhile, other highly GPU-utilizing tasks (such as KASHIF_HIVPRs) run correctly on these overclocked cards.

When I clocked it back to factory settings (while running another GIANNI_DHFR500 task, so the tasks page still shows the higher frequency), it finished correctly.

So these GIANNI_DHFR500 tasks seem to be more sensitive to overclocking on Fermis than other tasks.

The GIANNI_DHFR500s run at a higher GPU usage percentage than most other WUs, 97-99% on some of my machines. This alone would tend to push a too-heavily-OCed card beyond its limits.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19510 - Posted: 16 Nov 2010 | 17:47:46 UTC - in response to Message 19495.

Thanks Beyond, eFMer Priority will have another home for a while.

You're welcome. Just trying to find the most efficient solution so both GPU and CPU projects can be run optimally.

These GIANNI_DHFR500 tasks are very fast; probably the fastest tasks we have seen to date, 17.5 to 25ms per step on GT240s. I hope these advancements make their way into the other WUs. Pleased to see one turn up on my GTX470, just about to start.

I finally got a GIANNI_DHFR500 WU on my GTX 260 too, so will be interesting to see how that goes.

It would still need to be mapped out for Fermis and non-Fermis and on XP, Vista, W7, Linux (nice values). Did you run any same-type tasks without priority increases, to see what the differences are?

Like you I started to, but at the default project priority the GPU is not fed properly and you will see either a low or a markedly sawtooth-shaped GPU usage graph, along with slow WU progress. No point in testing that further, as the default priority is not the correct setting, at least on any of my XP64 machines.

While swan_sync seems to be important on Linux for all cards, on Windows it is only important for Fermis, not GT240s running the 6.12 app (it makes little or no difference). It might help with the 6.13 app on a GF200-series card, but I have not tested this. So I suspect priority is not a replacement for swan_sync, rather an alternative that can also be used alongside it, or an option for non-Fermis. I think one of my GIANNI_DHFR500 tasks (possibly the second one) used swan_sync, but there is little difference in CPU time. When swan_sync is used with a Fermi it does use a full core/thread.

Linux users have also reported a huge WU speedup by adjusting their priority (nice) settings.

Both of my GIANNI_DHFR500 tasks used slightly less CPU time than yours did. While this may be down to using a high priority, it might just be a difference in CPU performance. ... I remember looking at this in the past, and if my memory serves me right we can get away with “Set Above Normal” priority. I'm trying this now (switched on last night mid-run) on my GT240s. Just need to wait a day to get clean results.

For me 1,000-5,000 seconds of CPU time for a 40,000 second WU is very acceptable, much better than the 40,000 seconds of CPU used with SWAN_SYNC. Please let us know how the Above Normal priority works for you.


Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19511 - Posted: 16 Nov 2010 | 17:48:32 UTC - in response to Message 19507.
Last modified: 16 Nov 2010 | 18:03:55 UTC

BobR, good to see you back up and running.

Thanks for listing that data.
"driver version unknown" might just be a Boinc client/driver reporting issue. Perhaps Boinc expects a similar format to Windows; three numbers one dot and two more numbers (260.99) rather than nnn.nn.nn?

That 421 GFLOPS peak is also wrong, it should be about 602, and Ralf's 650 GFLOPS peak for a GTX460 is also low; it should be 907.
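For reference, those "should be" figures come from peak single-precision GFLOPS = cores x shader clock (GHz) x 2 (one multiply-add per core per clock). A quick sketch of the arithmetic with bc, assuming the reference shader clocks of 1.566GHz for a stock GTS 450 and 1.35GHz for a stock GTX 460 (worth verifying for any given card):

# peak GFLOPS = cores x shader clock (GHz) x 2 ops per clock
echo '192 * 1.566 * 2' | bc    # GTS 450: ~601 GFLOPS
echo '336 * 1.350 * 2' | bc    # GTX 460: ~907 GFLOPS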

I will also try eFMer Priority on my Fermis tonight and report back, tomorrow hopefully. I would like to run it without swan_sync=0 and with swan_sync=0, to see if they are worth using together or just separately. If it makes no difference running them together, but they equally speed up the projects, then an increased priority could be added project-side to the app and we would not have to use swan_sync at all, or perhaps just with Linux (depending on nice values). Alternatively, both together might improve runtime even more, and then swan_sync could be kept as an option and above-normal priority could be a default setting. We'll see...

Profile Microcruncher*
Avatar
Send message
Joined: 12 Jun 09
Posts: 4
Credit: 185,737
RAC: 0
Level

Scientific publications
watwat
Message 19512 - Posted: 16 Nov 2010 | 18:32:10 UTC - in response to Message 19511.
Last modified: 16 Nov 2010 | 18:34:46 UTC

That 421 GFLOPS peak is also wrong, it should be about 602 and Ralf's 650 GFlops peak for a GTX460 is also low; should be 907.

The GFLOPS display is pretty much useless. BOINC 6.10.58 for Windows reports 363 GFLOPS for the GTX 460 (independent of the shader clocks) while my old GTX 260 was rated at 477 GFLOPS at stock clock.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,518,011,851
RAC: 8,607,124
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19514 - Posted: 16 Nov 2010 | 19:06:11 UTC - in response to Message 19512.

That 421 GFLOPS peak is also wrong, it should be about 602 and Ralf's 650 GFlops peak for a GTX460 is also low; should be 907.

The GFLOPS display is pretty much useless. BOINC 6.10.58 for Windows reports 363 GFLOPS for the GTX 460 (independent of the shader clocks) while my old GTX 260 was rated at 477 GFLOPS at stock clock.

Until we can get an answer for David Anderson's question:

Is it the case that all compute capability 2.1 chips
have 48 cores per processor?
I can't get a clear answer from nvidia on this.
-- David

the peak GFlops estimate in BOINC is going to stay wrong. Has anyone else got a way of getting a clear answer from NVidia, not only for this generation of chips, but hopefully a software API that will work on future generations as well?

Profile Microcruncher*
Avatar
Send message
Joined: 12 Jun 09
Posts: 4
Credit: 185,737
RAC: 0
Level

Scientific publications
watwat
Message 19515 - Posted: 16 Nov 2010 | 20:00:05 UTC - in response to Message 19514.

Has anyone else got a way of getting a clear answer from NVidia, not only for this generation of chips, but hopefully a software API that will work on future generations as well?

No. Of course an API would be nice, but it should be kept simple, or otherwise the BOINC programmers end up writing and maintaining a sysinfo tool only to display one not very interesting number. It matters that BOINC detects the GPUs, but displaying the "correct" theoretical performance for a card like my GTX 460 should be far down the priority list (it behaves with one app like a 336 SP card and with another like a 224 SP card, because the two warp schedulers per multiprocessor can't always make use of the extra 16 cores each MP has).

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19517 - Posted: 16 Nov 2010 | 22:41:24 UTC - in response to Message 19515.

There’s little chance of NVidia divulging the architectural structure of future cards.
As for CC 2.1, yes. The 48:1 ratio is synonymous with CC2.1, hence the compliance of all low- and mid-range Fermi cards. This is set to continue for the immediate future; there will not be a GF114 version for several months, and those GTX560s will probably toe the line, or just not be CC2.1.
Should another card turn up with a different ratio I would expect NVidia to create a new capability (CC3.0) – very unlikely for some time; the GTX 580 is still CC 2.0 (32:1 ratio) and the forthcoming GTX 570 will presumably follow suit. So there might not even be a new CC until the move to 28nm, and that is a fair bit away; more chance of several new BOINC versions between now and then.

GPU Caps Viewer reports GPU Compute Capability.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19519 - Posted: 16 Nov 2010 | 23:33:52 UTC - in response to Message 19510.

These GIANNI_DHFR500 tasks are very fast; probably the fastest tasks we have seen to date, 17.5 to 25ms per step on GT240s. I hope these advancements make their way into the other WUs. Pleased to see one turn up on my GTX470, just about to start.

I finally got a GIANNI_DHFR500 WU on my GTX 260 too, so will be interesting to see how that goes.

Finished the GIANNI_DHFR500 WU on my GTX 260 now: priority high, 1530 MHz, XP64. It ran in 16904.656 seconds with only 832 seconds of CPU time, 8.452 ms/step. All in all very happy with how the GIANNI_DHFR500 WUs are running with priority boost.

Bobrr
Send message
Joined: 2 Jul 10
Posts: 7
Credit: 27,699,565
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 19522 - Posted: 17 Nov 2010 | 3:13:39 UTC - in response to Message 19511.

Well, it worked for a little while. Then the system froze up completely and I had to reboot. The next messages are as follows:
'Tue 16 Nov 2010 10:02:13 PM EST No usable GPUs found
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Missing coprocessor for task r475s1f1_r130s2-TONI_MSM5-0-4-RND8696_0
Tue 16 Nov 2010 10:02:15 PM EST GPUGRID URL http://www.gpugrid.net/; Computer ID 84614; resource share 10000
Tue 16 Nov 2010 10:02:15 PM EST Reading preferences override file
Tue 16 Nov 2010 10:02:15 PM EST Preferences:
Tue 16 Nov 2010 10:02:15 PM EST max memory usage when active: 501.96MB
Tue 16 Nov 2010 10:02:15 PM EST max memory usage when idle: 2007.86MB
Tue 16 Nov 2010 10:02:33 PM EST max disk usage: 20.00GB
Tue 16 Nov 2010 10:02:33 PM EST (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Tue 16 Nov 2010 10:02:33 PM EST Not using a proxy
Tue 16 Nov 2010 10:03:53 PM EST GPUGRID task r475s1f1_r130s2-TONI_MSM5-0-4-RND8696_0 suspended by user'

The 'Tasks' window shows 'GPU missing'

I'll keep working BOINC with other projects and hope something new comes along. Unless you have other suggestions to try.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19524 - Posted: 17 Nov 2010 | 9:14:03 UTC - in response to Message 19522.

These failures could be caused by many things, but a few likely candidates would be: a shortage of RAM (get more, or don't run CPU apps); the card overheating and hanging the system (check this, then up the fan speed to cool the card & system better); an inappropriate power supply; an unstable operating system (Linux) needing a re-install (caused by the driver itself, HDD issues, or update/app issues); a corrupt hard drive; or malware/bad apps on the system.

biodoc
Send message
Joined: 26 Aug 08
Posts: 183
Credit: 6,466,114,375
RAC: 1,393,151
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19526 - Posted: 17 Nov 2010 | 11:29:28 UTC

I've had no problems since I switched driver versions from 260.19.14 to 260.19.21. I'm also using SWAN_SYNC=0, since there are no other options for Linux right now, although I'd like to get 80% of that core back for other projects. :)

I've finished 11 WUs since the driver upgrade:

http://www.gpugrid.net/results.php?hostid=82590


64 bit Ubuntu 10.04
boinc 6.10.58
GTX460
nvidia driver version 260.19.21

lkiller123
Send message
Joined: 24 Dec 09
Posts: 22
Credit: 15,875,809
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 19549 - Posted: 19 Nov 2010 | 0:23:32 UTC

New update on the 65nm problem.

I've done more testing recently..

Specs: EVGA GTX 260 65nm (stock clock at 576/1242/999)
EVGA GTX 260 55nm (stock clock at 626/1350/1053)
Driver version: 260.61
CUDA version: 3020 (which I assume to be 3.2)
Processor: Phenom II X4 940 at 3.5Ghz, running WCG.
BOINC version: 6.10.58 64bit
OS: Windows 7 Ultimate x64

This is the computer I am testing with.

On the computer, the 65nm GTX 260 is shown as device 1 while the 55nm is shown as device 0.

As you might recall, the 65nm had been struggling to run GPUGRID; there had been WU crashes where Windows prompts you to "close the program", as well as some driver crashes.

However, when running on 6.13 (and the new driver as well), the problem disappeared. While the WUs still error out (sadly), they never crash; not even a WU hang has occurred on the 65nm.

The 65nm card even managed to finish two work units [WU 1][WU 2].

I will test for a couple more days, and will do so for each new ACEMD version.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19556 - Posted: 19 Nov 2010 | 19:09:46 UTC - in response to Message 19549.

Firstly I think a restart is in order, as you are getting what look like runaway failures; they could be from some other software on the system.

While I'm not saying you can't run a Phenom II 940 at 3.5GHz and crunch (I have one, so I know what it can do), I think it is a bit on the high side, especially if you are having problems in a dual-GTX260 system, and in particular when one is the older 65nm version, which as you know is notoriously difficult to crunch with here. At 3.5GHz the additional power draw and heat production is substantial (unless you have a water-cooled system, and even then the capacitors don't get cooled as much). I found that 3.3GHz was plenty, but I generally run at stock and would suggest that for now you do too; you are testing after all.

Basically, try all the tricks: upping the GPU fan speed (EVGA Precision or another app), using eFMer Priority to raise the application priority, and leaving at least one CPU core free (preferably all 4). I think the Windows error message to close the CUDA app is the result of a timeout, so increasing the thread's priority should help, as should freeing up cores.

What I'm saying is, give yourself as much chance as possible to run these tasks successfully, and add applications back later. Even if it means turning off your firewalls and antivirus software it's worthwhile, especially if you use several (it removes possible problem apps); if you are worried about security, disable BOINC networking, close all other programs, disable updates, and unplug the Ethernet cable for the duration of the runs.

Good Luck,

Profile Fred J. Verster
Send message
Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19563 - Posted: 20 Nov 2010 | 0:31:48 UTC - in response to Message 19556.
Last modified: 20 Nov 2010 | 1:06:47 UTC

Noticed, sorry if this is a bit off topic, that if CPU speed increases, the reported CPU use drops
from 0.20 to 0.16 CPU + 1 GPU (GTX480 @ 1.43-1.5GHz) (CPU @ 3.366GHz, now 3.4GHz; 3,500 Whetstone Mops/sec; 10,000 Dhrystone Mops/sec),
according to the BOINC manager info.
Didn't know it was a BOINC feature, or am I wrong?

Doesn't have much effect though; with Einstein it's hardly any difference.
(And the 'GPU app' doesn't do anything most of the time. I skip GPU use on Einstein and will keep it unticked until they have/develop/etc. a better GPU app.)
I don't mean to be all negative, but this doesn't contribute at all to GPU processing and I don't know why they use it!

But I'm getting off TOPIC..........
____________

Knight Who Says Ni N!

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19567 - Posted: 20 Nov 2010 | 8:21:43 UTC - in response to Message 19563.

Not at all, this is just how BOINC computes the amount of CPU used. It's really just a piece of text of no practical use.

gdf

Noticed, sorry if this is a bit off topic, that if CPU speed increases, the reported CPU use drops
from 0.20 to 0.16 CPU + 1 GPU (GTX480 @ 1.43-1.5GHz) (CPU @ 3.366GHz, now 3.4GHz; 3,500 Whetstone Mops/sec; 10,000 Dhrystone Mops/sec),
according to the BOINC manager info.
Didn't know it was a BOINC feature, or am I wrong?

Doesn't have much effect though; with Einstein it's hardly any difference.
(And the 'GPU app' doesn't do anything most of the time. I skip GPU use on Einstein and will keep it unticked until they have/develop/etc. a better GPU app.)
I don't mean to be all negative, but this doesn't contribute at all to GPU processing and I don't know why they use it!

But I'm getting off TOPIC..........

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19584 - Posted: 21 Nov 2010 | 14:06:01 UTC

I'm still waiting for my second Toni; so far most of the WUs I get run at best half as fast as they did before the "upgrade", which was a very bad downgrade for me.
Currently I mostly get Kashifs, and they take 2-3 times as long as with the older app. OK, they don't use that much CPU now, but that was a minor issue compared to the total crunch time being used now; and if the GPU is being used at all, its temperatures are still not as high as they used to be.

Is there a possibility to stop the wasteful non-Tonis from being delivered to my computer?
Should I simply abort all others until I get a suitable one?
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19586 - Posted: 21 Nov 2010 | 14:20:56 UTC - in response to Message 19584.

Have you tried to use export SWAN_SYNC=0 in your .bashrc?
Soon, we will make this much easier.
gdf

I'm still waiting for my second Toni; so far most of the WUs I get run at best half as fast as they did before the "upgrade", which was a very bad downgrade for me.
Currently I mostly get Kashifs, and they take 2-3 times as long as with the older app. OK, they don't use that much CPU now, but that was a minor issue compared to the total crunch time being used now; and if the GPU is being used at all, its temperatures are still not as high as they used to be.

Is there a possibility to stop the wasteful non-Tonis from being delivered to my computer?
Should I simply abort all others until I get a suitable one?
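For anyone unsure what GDF's suggestion amounts to in practice, here is a minimal sketch, assuming BOINC runs under your own account with bash as the shell (if the client runs as a separate "boinc" user, as with the Ubuntu package, ~/.bashrc is not read; see the workaround later in the thread):

# append the variable to your shell startup file
echo 'export SWAN_SYNC=0' >> ~/.bashrc

# reload and verify
source ~/.bashrc
echo $SWAN_SYNC    # should print 0

# then restart the BOINC client from this shell so it inherits the variable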

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19587 - Posted: 21 Nov 2010 | 14:28:48 UTC - in response to Message 19586.

Have you tried to use export SWAN_SYNC=0 in your .bashrc?
Soon, we will make this much easier.
gdf

I've tried with the one file that includes .bashrc in its name, although I've told you and others many times that I'm a user, not a nerd, and that the file is nowhere on my machine as you described it.
I won't alter any remotely-named file without knowing what's going to happen and having guarantees that nothing will go wrong with my machine in general.

To tell you what I've done:
I've included the lines
# for GPUgrid in BOINC
# configures an interactive environmental variable
export SWAN_SYNC=0

in the file /etc/bash.bashrc
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19589 - Posted: 21 Nov 2010 | 17:47:33 UTC

What am I doing wrong now??? My Linux machines are as good as winter heating, but not more than that: 71711 62706 70922. I know I was complaining all summer long about Linux being overly aggressive & overheating, but that was summer and it's winter now. BTW, it would have been nice not to have had the windows open all summer long; now I'm freezing & just burning electricity.
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19590 - Posted: 21 Nov 2010 | 18:28:57 UTC - in response to Message 19589.

There must be something wrong. The timing is too slow.
Try to add
export SWAN_SYNC=0
in your .bashrc file.

gdf

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19591 - Posted: 21 Nov 2010 | 19:01:00 UTC - in response to Message 19590.

There must be something wrong. The timing is too slow.
Try to add
export SWAN_SYNC=0
in your .bashrc file.

gdf

There is no such file on my system, for f*** sake.

If you want to keep ordinary users here, not just nerds, and using command line instructions is definitely a nerdy behaviour, please communicate on a way that users can understand you!

And please explain where this dubious .bashrc is supposed to be, what exactly it should look like, and what it will do to the machine. If you just give unintelligible answers in professional terminology, don't expect any user to understand you.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19594 - Posted: 21 Nov 2010 | 21:04:21 UTC - in response to Message 19591.

.bashrc is hidden. That's what the "." does in Linux. I made this how-to: BOInc 4 N00Bs

BTW GDF, 71711 has always been using SWAN_SYNC=0, maybe it's something else? I did add export SWAN_SYNC=0 to the other two, but I don't think it'll help. :-(
____________

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19597 - Posted: 22 Nov 2010 | 5:37:10 UTC - in response to Message 19594.
Last modified: 22 Nov 2010 | 5:38:59 UTC

.bashrc is hidden. That's what the "." does in Linux. I made this how-to: BOInc 4 N00Bs

There's no hidden file with that name either; I know how to make hidden files visible in Nautilus.

Why are you writing about a .profile, while here it's always called .bashrc?
Where exactly is this .profile or .bashrc supposed to be, so I can alter it?
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19601 - Posted: 22 Nov 2010 | 6:32:32 UTC - in response to Message 19597.
Last modified: 22 Nov 2010 | 6:33:41 UTC

Can't explain beyond a truthful "I don't know", but you can google it; it was a long time ago that I did this. Just open the terminal and type sudo gedit .profile, add the line export SWAN_SYNC=0, save, then restart Linux. You can check it's there by opening a terminal after you've restarted and typing env.
____________
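Spelled out, that recipe is roughly the following sketch (assuming Ubuntu with gedit installed; sudo is likely unnecessary for a file in your own home directory, and ~/.profile is only read at login, hence the restart):

cd ~                      # make sure you are in your home directory
gedit .profile            # add the line below, then save and close
export SWAN_SYNC=0

# after logging out and back in (or restarting), verify:
env | grep SWAN_SYNC      # should show SWAN_SYNC=0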

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19602 - Posted: 22 Nov 2010 | 8:49:47 UTC - in response to Message 19594.

It must not be set correctly, otherwise the use of the different synchronization method would appear in the log of the results.

Type echo $SHELL to check which shell you have.

gdf

.bashrc is hidden. That's what the "." does in Linux. I made this how-to: BOInc 4 N00Bs

BTW GDF, 71711 has always been using SWAN_SYNC=0, maybe it's something else? I did add export SWAN_SYNC=0 to the other two, but I don't think it'll help. :-(

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19619 - Posted: 22 Nov 2010 | 17:18:59 UTC - in response to Message 19602.

/bin/bash

I just type env. Doesn't that also tell you what you need to know, & whether SWAN_SYNC=0 is present? But I must be doing something wrong, because nothing is happening.
____________

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19621 - Posted: 22 Nov 2010 | 17:43:30 UTC - in response to Message 19597.

.bashrc is hidden. That's what the "." does in Linux. I made this how-to: BOInc 4 N00Bs

There's no hidden file with that name either; I know how to make hidden files visible in Nautilus.

Some have reported good results by boosting the priority in Linux (it also works well in Windows). I'm not a Linux expert, but you should be able to do this in the process manager by lowering the "nice" value. There may be a way or a program to automatically set the nice value you want for any given process. Maybe someone knows how to do that in Linux and can post it here?
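One possible way, sketched with standard tools and assuming the GPUGRID process name contains "acemd" (check yours with ps aux first); note that setting a nice value below 0 needs root:

# raise the priority of the running GPUGRID app (lower nice value = higher priority)
sudo renice -n -5 -p $(pgrep -f acemd)

# or re-apply it automatically once a minute:
while true; do
  pgrep -f acemd | xargs -r sudo renice -n -5 -p
  sleep 60
done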

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19623 - Posted: 22 Nov 2010 | 18:19:47 UTC - in response to Message 19621.

Run time 174918.919998
CPU time 270.99

It is clear that swan_sync was not working. Did you restart after setting it?

Setting the nice value (akin to setting the priority, I think) might help, but again I'm not a Linux expert either, and if I knew how, I would have already tested it and posted a nice how-to :)

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19624 - Posted: 22 Nov 2010 | 18:42:55 UTC

p115-IBUCH_1_pYEEI_101109-9-20-RND4744_0 took this:
Run time 95757.294388
CPU time 3680.72
Claimed credit 7954.42

An old IBuch from before the big mess with the new sloppy crap of an app had this (p30-IBUCH_8_pYEEI_101027-3-4-RND5091_0):
Run time 61218.70
CPU time 59819.61
Claimed credit 7954.42

So an increase of ~50% in crunch time.


My last completed Kashif:
53-KASHIF_HIVPR_n1_bound_so_ba1-70-100-RND8429_1
Run time 138546.228961
CPU time 1301.94
Claimed credit 6409.23

One of the same type from before the mess (264-KASHIF_HIVPR_n1_bound_so_ba1-34-100-RND1335_1):
Run time 54414.69
CPU time 53990.94
Claimed credit 6409.23

Crunch time has tripled.


The only Toni so far:
input_r199s1-TONI_MSM5-2-4-RND9936_1
Run time 45747.43243
CPU time 2241.32
Claimed credit 3539.96

I have none with the same amount of claimed credits, thus no same type.
The shortest one was this (S50-TONI_HERGMETAXDOFE-22-50-RND4899_1):
Run time 49040.07
CPU time 48419.91
Claimed credit 5764.79

Rt/Cc old: 8.51
Rt/Cc new: 12.93

Crunch time has increased by ~50%
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile X-Files 27
Avatar
Send message
Joined: 11 Oct 08
Posts: 95
Credit: 68,023,693
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19634 - Posted: 23 Nov 2010 | 0:47:55 UTC

If you're using Ubuntu and installed BOINC through apt-get, then export SWAN_SYNC=0 will not work by default: the BOINC client runs as user "boinc", which cannot see the env var you set on your account.

Workaround is:
stop the client
edit /etc/default/boinc-client
change BOINC_USER to root
sudo -i
export SWAN_SYNC=0
start the client

Best Method:
edit /etc/environment -> this is the equivalent of the system env vars in Windows
then add SWAN_SYNC=0

To verify:
echo $SWAN_SYNC

It should output 0
____________
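As a concrete sketch of that "best method" (assuming an Ubuntu package install; note that /etc/environment takes plain KEY=value lines with no "export", and is only re-read at login or reboot):

# add the variable system-wide
echo 'SWAN_SYNC=0' | sudo tee -a /etc/environment

# reboot (or log out and back in), then verify:
echo $SWAN_SYNC    # should output 0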

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19637 - Posted: 23 Nov 2010 | 2:21:28 UTC - in response to Message 19634.

Thanks X-Files 27, thanks & done. Now back to hoping.

BTW I've read that many were against running boinc as root, which is understandable being a DC Project. But I've got nothing in my PCs, they're just crunchboxes to me & this is just a hobby.
____________

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19638 - Posted: 23 Nov 2010 | 5:34:17 UTC - in response to Message 19634.

To verify:
echo $SWAN_SYNC

It should output 0

As that is the output on my machine, it seems that despite the many unhelpful half-answers I got here, it may have worked nevertheless.

Despite that, the crunch time has still worsened big time, as you can see in my last post. If the new app does more than double the science of the old one, it just means you can ditch the next generation of cards because of this project's excessive demands; if not, it's just crap.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19641 - Posted: 23 Nov 2010 | 10:55:42 UTC - in response to Message 19638.
Last modified: 23 Nov 2010 | 10:56:23 UTC

The 6.13 app is slow for GT240s, on Windows and Linux. It seems to work well with the CC1.3 cards, but not the CC1.2 cards. Use an older driver and run the 6.12 app.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19643 - Posted: 23 Nov 2010 | 12:14:37 UTC - in response to Message 19641.
Last modified: 23 Nov 2010 | 12:15:45 UTC

The 6.13 app is slow for GT240s, on Windows and Linux. It seems to work well with the CC1.3 cards, but not the CC1.2 cards. Use an older driver and run the 6.12 app.

So I have to downgrade my system to something inferior to suit the sloppy programming of the app? For weeks you tried to persuade me to upgrade my old driver and fiddle around in some dubious innards of my core system (or whatever SWAN_SYNC does), so I could run your so-much-better new app, and now you slap me in the face and say: F*** off with your current drivers, get back the outdated stuff or bugger off. Very fine behaviour indeed.

If you weren't the only available project for my system with valid and good science, I couldn't get away from here fast enough.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19644 - Posted: 23 Nov 2010 | 12:38:42 UTC - in response to Message 19643.

Hi Saenger,

While I understand your frustration, BOINC computing on GPUs just is not as easy as we all would like it to be. There are many variables involved between the apps the project creates, the drivers NVIDIA publishes, each GPU family, and BOINC itself. It's complicated, that's just how it is, and this project is still officially listed as a beta project. GDF and team, along with volunteers here on the forum, are trying their best. If this is the only project you deem worthwhile for your NVIDIA card and you want to participate, then you might consider accepting the project defaults, like the recent decision to use less CPU (which does run slower overall), or you could learn a little *nerd* to get the most crunching from your equipment. While asking questions and engaging in a vigorous discussion is welcome, I would like to remind you of the forum posting rules, which in part state: "No messages that are deliberately hostile or insulting."
____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19645 - Posted: 23 Nov 2010 | 15:20:26 UTC - in response to Message 19644.

Even I can relate to Saenger's frustration. It's not fun to have something work, & then "not" work, because of optimizations & improvements gone bad. Whether you're MS or Apple, if something goes wrong, you'll get a million thumbs down & that kid from Springfield who always says "Ha, ha!"

I myself DO want my PCs to die. But I want them all to die a natural & noble death. I'd have to cheat someone or cheat myself, if I wanted to do a complete overhaul of my GPUs. So unless Nvidia is paying you to care about their GPU sales, please still consider the people with all the old stuff. ;-)
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19647 - Posted: 23 Nov 2010 | 15:41:18 UTC - in response to Message 19645.
Last modified: 23 Nov 2010 | 15:41:32 UTC

I can just say that we have changed the way the application works in Linux because some users complained about the use of a CPU.
Now some others complain because the GPU is not used well enough.

As I said in a previous post, we are making the GPU utilization mode a project variable, so that you can set it in an easier way from Linux and from Windows.
It just requires some time.

gdf

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19648 - Posted: 23 Nov 2010 | 17:39:05 UTC - in response to Message 19647.
Last modified: 23 Nov 2010 | 17:39:46 UTC

I can just say that we have changed the way the application works in Linux because some users complained about the use of a CPU.
Now some others complain because the GPU is not used well enough.

As I said in a previous post, we are making the GPU utilization mode a project variable, so that you can set it in an easier way from Linux and from Windows.
It just requires some time.

gdf

I was one of those who complained about the secret use of the CPU for the project; I had absolutely no complaints about the use of it in general. The only bad part was that the app unnecessarily pretended not to use a CPU while it used a whole core. If the Linux app had just said (1 CPU + 1 GPU) instead of (0.15 CPU + 1 GPU), everything would have been fine. That would probably have been totally easy for any first-year programming student to implement.

Instead you chose to implement something different to utilize less CPU, and with that made next to no use of the GPU along the way. For non-high-end cards it is now impossible to crunch a WU in the desired time frame of 24h, even for 24/7 crunchers like me; a task that was no problem with the old app.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19652 - Posted: 23 Nov 2010 | 20:26:40 UTC - in response to Message 19648.

The amount of CPU shown is fixed by the scheduler, so it's the same for Windows and Linux. We would be pretty happy to use 1 CPU for each run and get maximum speed, but then others would not like it. That's why we decided to give the opportunity to choose.

gdf

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19655 - Posted: 23 Nov 2010 | 21:35:05 UTC - in response to Message 19652.
Last modified: 23 Nov 2010 | 22:22:08 UTC

The amount of CPU shown is fixed by the scheduler, so it's the same for Windows and Linux

How come other projects have CPU-only and GPU/CPU WUs in stock? As you don't even have any validation process on individual WUs, there is no chance that Windows and Linux computers will run the same WU, so there should be absolutely no problem differentiating here.

That's why we decided to give the opportunity to choose.

If there was anything to choose....
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Crystal Pellet
Send message
Joined: 25 Mar 10
Posts: 18
Credit: 2,568,073
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 19660 - Posted: 24 Nov 2010 | 9:16:25 UTC - in response to Message 19655.

Sänger wrote:
If there was anything to choose....

You have the choice to run the old application (still available) with an app_info.xml file as discussed in another thread.

I'm still running the old Windows 6.05 version, with 43.478 ms per step on an old GT240 (bought this spring) ;)

77-KASHIF_HIVPR_n1_unbound_so_ba1-70-100-RND5835_0 2103684 23 Nov 2010 17:12:36 UTC 24 Nov 2010 5:29:12 UTC Completed and validated 43,490.56 5,742.65 6,322.41 9,483.62 Anonymous platform

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19662 - Posted: 24 Nov 2010 | 13:33:39 UTC - in response to Message 19660.

The old application will currently fail on several Ignasi tasks and on all the Gianni tasks. So don't.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19667 - Posted: 25 Nov 2010 | 8:55:35 UTC - in response to Message 19662.

6.12 is also just about as fast as 6.05:

http://www.gpugrid.net/result.php?resultid=3336470
48.331 ms for a GT240 at 1.60GHz.

43.478 ms for a GT240 at 1.76GHz.
http://www.gpugrid.net/result.php?resultid=3341116

48.331/43.478 = 1.11, and 1.76/1.60 = 1.10; the difference in time per step matches the difference in clock speed.

Both Vista systems with similar CPUs, both KASHIF_HIVPR tasks.

Crystal Pellet
Send message
Joined: 25 Mar 10
Posts: 18
Credit: 2,568,073
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 19671 - Posted: 25 Nov 2010 | 15:37:33 UTC - in response to Message 19667.

skgiven wrote:
6.12 is also just about as fast as 6.05 ...

You are comparing apples and oranges. The tasks are not similar. Only the time per step looks equal.

Your example tasks:

6.12:
Run time 52299.189003
CPU time 3931.506
Credits 9025.06

6.05:
Run time 43490.557104
CPU time 5742.646
Credits 9483.62

The runtime on 6.12 is ~20% longer and the credits are lower. The CPU usage is ~30% lower.

Btw: I was able to save a task from an acemd2 crash by ignoring the MS message and rebooting: http://www.gpugrid.net/result.php?resultid=3346345
It looks like I'll have to downgrade my NVIDIA driver from 260.99 to 197.45 again.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19672 - Posted: 25 Nov 2010 | 16:39:57 UTC
Last modified: 25 Nov 2010 | 16:40:23 UTC

Ok, so now I've tried adding export SWAN_SYNC=0 to .profile

I've also changed BOINC_USER from boinc to root, & added SWAN_SYNC=0 to /etc/environment

When running boinc-client as root, I've typed echo $SWAN_SYNC & the answer is 0

So WTF is wrong now? 71711

I even had to cancel tasks on 62706 & 70922 because they weren't getting anywhere anytime soon.
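One thing worth checking: the shell where echo $SWAN_SYNC prints 0 is not necessarily the environment that the boinc-client daemon, and hence acemd, was started with. A sketch for verifying what the running app actually sees, assuming a Linux /proc filesystem and that the process name contains "acemd" (run it as root or with sudo):

# dump the environment of the first running acemd process
pid=$(pgrep -f acemd | head -n 1)
tr '\0' '\n' < /proc/$pid/environ | grep SWAN_SYNC

If that prints nothing, the daemon never inherited the variable. On Debian/Ubuntu-style installs, putting the export into the file the init script sources (often /etc/default/boinc-client) and then running /etc/init.d/boinc-client restart may work, though that is a guess about this particular setup.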
____________

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19673 - Posted: 25 Nov 2010 | 16:58:17 UTC - in response to Message 19672.

So WTF is wrong now? 71711

Nothing new, just what I complained about for weeks ;)
Just abort all Kashifs asap; they will not run in time on any Linux machine worth less than a few thousand quid.
Other ones will perhaps make it within the 2-day deadline, but probably none will be back at the project within the desired 1-day time frame.

The project doesn't care about Linux crunchers with ordinary cards inside.

Rubbish apps, and every day new and contradicting hints about what to do (upgrade drivers, downgrade drivers, swan-sync whatever, nothing but worthless drivel about why and where to put what).
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19674 - Posted: 25 Nov 2010 | 17:27:48 UTC

They say all Davids are good, but not all Goliaths are bad. But it's things like these which make all Davids eventually take up a sling & try to take out every Goliath they see. Wouldn't Goliath rather have a woman take him out instead? Robbing Hood was a thief, but The Sheriff started it!

I know it's all about the money, but good will goes both ways, & so does bad will. So do they want people to work for peanuts, or is it dog eat dog, in a world created for Humans?

What started out as mini golf, is starting to look an awful lot like a golf club. People still pay to play mini golf, but only rich people can afford to be a member of a golf club.

I thought everyone was busy smoking cigars in Dubai, so why are they turning mini golf into golf? Don't they have towers to build, formula 1 races to win, football teams to take over, & other vanity projects to pass their time with?
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19675 - Posted: 25 Nov 2010 | 18:58:51 UTC - in response to Message 19671.

You are comparing apples and oranges. The tasks are not similar. Only the time per step looks equal.

OK, here is another, more apple-like task,
http://www.gpugrid.net/result.php?resultid=3287765

It ran on a card at 1.34GHz, while your task ran at 1.76GHz - your card is clocked 31% faster. My task took 29% longer to run. Same credits.

I also got several MS acemd2-crash errors. I left the error messages open, closed Boinc, rebooted, and the tasks recovered.

The 260.99 NVIDIA driver caused me serious problems on Vista: the clock speeds dropped to their lowest level and it would not let me set them higher; even using EVGA Precision did not force them to stay high. These are Very BAD drivers for many, if not most, 200-series cards.

The situation appears to be the same on Linux; I have looked at several people's 6.13 tasks and they are too slow. I'm now trying Ubuntu 10.10, but only installed it last night. CPU tasks run fine, but getting the GPU right is a pain. When I tell the NVidia X server (PowerMizer) to "Prefer maximum performance", the setting is not saved; on restart it is back to Adaptive.

I think the recommended Boinc installation may not be the way forward for GPU users (Applications, Ubuntu Software Centre, search for Boinc), and the default driver appears to be the equivalent of 260.99, at least in terms of being slow and not having controls that work. I have Boinc loading automatically, but even after delaying its start it appears to be running in power-saver mode. 10.10 is different from earlier versions, and I'm no Linux expert either. Even swan_sync=0 does not seem to stick, but as I say, it's another learning curve. I did manage to use swan_sync with Kubuntu 10.04, but I was running the 6.12 app and using a reasonable driver.

On Ubuntu 10.10 I had to use sudo nautilus /etc/boinc-client to be able to edit the cc_config.xml file; without it I had no access to the file or folder. A bit too much unwanted security all round :(
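As an aside, running a whole file manager as root is heavier than needed; a terminal editor, or a one-off permission change, does the same job. A sketch, assuming the Ubuntu package layout with the config under /etc/boinc-client:

# edit the client configuration directly in a terminal editor
sudo nano /etc/boinc-client/cc_config.xml

# or make the file writable by the boinc group once, then edit normally
sudo chgrp boinc /etc/boinc-client/cc_config.xml
sudo chmod g+w /etc/boinc-client/cc_config.xml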

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19676 - Posted: 26 Nov 2010 | 10:34:25 UTC - in response to Message 19675.

On my Phenom II 940 system with four GT240's I had been running 4 heavy CPU tasks, 4 lightweight FreeHAL tasks and a WUProp task (also lightweight). The performance of the 6.12 app was a bit sluggish at times; GPU utilization was only about 70%. With the odd project backoff, and not running tasks at peak power-usage times (3.5 times the cheapest rate), I had not been getting some tasks back in time for full credit, annoyingly missing out by less than an hour in most cases.

I'm now using swan_sync=0 without CPU tasks and the 6.12 app is much faster. It does use a full core/thread per GPU, and to benefit from swan_sync=0 you do need to free up a core, but the tasks are faster and the credit is much better.
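For anyone wondering how to free that core without babysitting tasks: the simplest route is BOINC's own processor limit, either "use at most X% of processors" in the Manager's preferences or a global_prefs_override.xml in the BOINC data directory. A sketch for a quad core, where 75% leaves one core free (the percentage is an assumption to adapt to your CPU):

<global_preferences>
  <max_ncpus_pct>75.0</max_ncpus_pct>
</global_preferences>

Restart the client, or use the Manager's option to re-read the preferences, for it to take effect.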

3351679 2103512 9:36:23 UTC Completed and validated 44,048.93 35,309.45 6,016.70 9,025.06 ACEMD2: GPU molecular dynamics v6.12 (cuda)

43ms per step on a GT240 @ 1.6GHz (and thats with a couple of restarts and on Vista).

I'm just going to use 3 GPU's until I have finished the partially completed CPU tasks, and then stop running any CPU tasks on that system. The RAC for that system could rise to over 60K per day. Much better than my present 43.8K RAC.

On my i7-920 with 2 GPU's I use 5 CPU cores.

I managed to totally mess up my Ubuntu system trying to change drivers, now the hard drive is causing issues and I can't reinstall Ubuntu. I might go back to Kubuntu 10.04 if I can dig out a replacement drive.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19677 - Posted: 26 Nov 2010 | 12:04:32 UTC

Considering the very alpha status of the current awful apps, and the availability of a thing called

Run test applications?
This helps us develop applications, but may cause jobs to fail on your computer
in your preferences, why did the project team let this rubbish loose on unsuspecting crunchers? Why wasn't it tested before? What is a test application good for if it's not used for testing, as obviously happened here?
The current apps have failed completely from day one, and the project team did nothing at all to bring the good ones back, or to help us victims of the bad decision with real hints rather than senseless wild guesses and contradicting suggestions.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19678 - Posted: 26 Nov 2010 | 15:55:21 UTC - in response to Message 19677.

One of the problems is that the project MUST distinguish between cards' capabilities in order to send the correct app (6.12 or 6.13) and the correct tasks for that app.
The natural progression is to support new GPUs and utilize improvements from new CUDA drivers, otherwise the project would stall; one of this project's main areas of research is improving molecular modelling, and that requires the latest cards and the latest CUDA apps.

This time the project introduced the 6.13 app to better cater for Fermi cards, and to actually give Linux users with a GTX460 or GTS450 (and presumably all the other 48:1-ratio cards, mostly OEM) a working app. This part is a total success; Fermis are working well on Windows and Linux under the 6.13 app. It's only a problem for users of earlier cards that are using newish drivers, some of which don't even work properly (260.89 and 260.99).

The best application for previous generations of cards is the 6.12 app. On Windows this means using a driver between 195 and 197.45.
Why? Because those drivers ship with pre-3.1 CUDA support.
Specifically: the 197.45 driver is the most up-to-date WHQL driver before the first driver that shipped with CUDA 3.1 support (the 257.15 Beta, or the 257.21 WHQL driver). So if you use a 257.15 or later driver you have CUDA 3.1, and because the project decides which app to send by the driver's CUDA capabilities, you get the 6.13 app.

If you have an NVidia driver between 195 and 197.45 you will get the 6.12 app to run tasks. These typically run faster than under the 6.13 app.
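If you are unsure what CUDA level your driver reports, the BOINC event log prints the detected version at startup (Saenger quotes such a line further down this thread). On Linux you can also ask the loaded driver directly; a sketch:

# show the loaded NVIDIA kernel module and driver version
cat /proc/driver/nvidia/version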

Why not have a selection box in my profile to run the 6.12 app with 257.15 or later drivers, then? I can't answer this for sure, but I know cards before Fermi do not benefit, in terms of speed, from anything beyond CUDA 2.2. While the drivers with CUDA 2.3 and CUDA 3.0 support are almost as fast as the drivers that first supported CUDA 2.2 (the 195 drivers), I expect the 257.15 through 258.96 drivers to be much slower running the 6.12 app, and we know the latest NVidia drivers make many 200-series GPUs go to sleep. So I guess that even with the CUDA 3.1 driver, the 6.12 app would run no faster than the 6.13 app.

Why release lots of new apps? That's a major part of what this project is about: application development. When it leads to significant speed increases it makes the bumps worthwhile, and that aids the bio-medical research in the long run.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19679 - Posted: 26 Nov 2010 | 16:19:51 UTC - in response to Message 19678.
Last modified: 26 Nov 2010 | 16:23:51 UTC

If you have an NVidia driver between 195 and 197.45 you will get the 6.12 app to run tasks. These typically run faster than under the 6.13 app.

That's where the mess started for me.
I had this "better" driver, but the WUs were extremely slow, about 2-10 times slower than before the "update". I had to abort them because some would not have made even a 4-day deadline, and this project has a real deadline of 2 days; anything slower is a waste of capacity.

I was convinced by all of you here to update to the newer, better drivers, which I run now. It's a 260.19.21, running at 550/1700/1340 MHz. I even included the dubious SWAN_SYNC somewhere, although all the hints in this forum about where to put it were plain useless. And I manually set the nice value to 0 as soon as I can for every new WU.

Suddenly you completely change your story, and the new driver is no longer far better, but plain shit. Why did I even consider using it? (Well, because you told me to ;)

My card tells BOINC exactly what it's capable of:
NVIDIA GPU 0: GeForce GT 240 (driver version unknown, CUDA version 3020, compute capability 1.2, 511MB, 257 GFLOPS peak)

If this is not good for 6.13, it's the project's responsibility not to send them to me, not mine to downgrade my system.

From my own experience there is no difference between 6.12 and 6.13; both are ridiculously slow compared to the one used before, 6.04.

I just saw that one of the Kashifs would perhaps have made it within 30h, but as the last ones took close to, or even a bit over, 48h, I of course aborted them asap, as anything longer than 48h is useless.

Edit:
If you included the credit claim in the name, I could simply abort all WUs with more than 5500.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19685 - Posted: 26 Nov 2010 | 22:59:21 UTC

Yes, quite often you are receiving suggestions that are nothing more than guesses. The project does not have every version of every card, driver and OS, much less your exact system/configuration. Neither do the volunteers that are trying to help you. You say you want a complete guarantee that nothing will happen to your system... not gonna happen with any piece of software in the world (take a look at any EULA), so I think you're being a little unrealistic there. You say you're not a nerd and get outright hostile when someone suggests you know something about the operating system you use. You say you don't want to switch drivers but then you do it anyway and post your results in a way that is outright hostile and insulting.

Clearly you are unhappy about the suggestions you are getting, so how about trying to be constructive? Yes, there are projects that *sometimes* work like magic, which I think is what you really want, but look around: none of the CUDA projects (in my opinion) are doing hard-core molecular biology and bringing advancements to the field the way that GPUGrid does. You've been here long enough to understand the state of the project, so how about when you test things and post back about the success or failure, you leave the attitude at the door.
____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19686 - Posted: 26 Nov 2010 | 23:43:08 UTC - in response to Message 19685.

Hi Snow Crash,

sorry to say that I'm equally as frustrated as Saenger, & if I don't voice my frustration in the same way, it's because I choose to word it differently. He's using Linux & says that he's not so into all this Linux stuff. I find it refreshing that someone who isn't a "Guru" or "Wizard" even bothers to put so much effort into something he's having difficulties with.

That something isn't easy & doesn't just work is why so many are staying away from Linux & CUDA. If people still go for it, I don't see how it's helpful to discourage the few that do. It's not good for the future prospects of Linux or CUDA.

If you feel that this excellent research should get all the support it deserves, don't discourage people from wanting to contribute. It's not that I don't value the Gurus & Wizards behind all this research, nor do I neglect the fact that Nvidia sells GPUs, & I don't believe this world can revolve without money. But if gpugrid wants people to support their project, they need to consider all these factors & encourage their members instead of driving them to find another project.
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19691 - Posted: 27 Nov 2010 | 11:54:16 UTC

I have no problem with people trying to learn how to get this project up and running with any combination of cards, drivers, CPUs and GPUs. What I am trying to discourage is the attitude that if it does not all work perfectly then people are justified in posting rude, insulting and downright hostile comments. That is not good for the project at all.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19693 - Posted: 27 Nov 2010 | 12:09:30 UTC - in response to Message 19691.

Try running GPUGrid only tasks and leaving your CPU free.
I have said this many times before: in order to benefit from swan_sync=0 you need to leave a CPU core free. If you don't, you will not benefit, and the CPU tasks will dominate the CPU, slowing down the GPU tasks massively.

http://www.gpugrid.net/results.php?hostid=87478

I just reinstalled Ubuntu 10.10 and have a GTX260-216 and a GT240 in it.
One task returned OK in 20K sec using the 6.13 app. It's probably not fully optimised yet, but it's not bad; 6h to run on a GTX260 at stock.
I'm not running any CPU tasks, just to test this.
The GT240 task should finish in about 18h.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19695 - Posted: 27 Nov 2010 | 15:42:02 UTC - in response to Message 19693.

Try running GPUGrid only tasks and leaving your CPU free.

Why should I use BOINC if I want to participate in just one project? I could run Folding then.

GPUgrid had no problem using a full core with the old app, without even mentioning it in its description. Now you proclaim to have a sooo much better app, but it needs more manual fiddling than ever before, methinks. That's OK for beta WUs, but leave it out of production mode.

    * I've got the most current driver.
    * I've somehow managed to get "0" as the answer to echo $SWAN_SYNC.
    * I change the nice value to "0" asap for every WU (a script for this follows below).
    * I hand-select the downloaded WUs and abort those known not to make the 2-day deadline.



Imho that's far more than you can expect from anyone crunching here. If you want normal users, not just extreme nerds, to crunch here, you have to keep the project KISS, and it's far from that now.
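For what it's worth, the per-WU nice fiddling at least can be scripted away; a rough sketch, assuming the process name contains "acemd" and that you have sudo rights:

# bump all running acemd processes from low priority up to nice 0
sudo renice -n 0 -p $(pgrep -f acemd)

Run that from cron every few minutes and newly started WUs get caught automatically.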
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19696 - Posted: 27 Nov 2010 | 16:28:51 UTC - in response to Message 19695.

Crying foul again?

Take some advice from Mr. T.

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19697 - Posted: 27 Nov 2010 | 17:03:35 UTC - in response to Message 19696.
Last modified: 27 Nov 2010 | 17:04:07 UTC

Crying foul again?

Take some advice from Mr. T.

Thanks for ridiculing me.
As I don't have the words
Project tester
Volunteer tester

under my name, I expect tested WUs in my BOINC. What's wrong with that expectation?

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19698 - Posted: 27 Nov 2010 | 17:25:08 UTC - in response to Message 19693.

Try running GPUGrid only tasks and leaving your CPU free.
I have said this many times before: in order to benefit from swan_sync=0 you need to leave a CPU core free. If you don't, you will not benefit, and the CPU tasks will dominate the CPU, slowing down the GPU tasks massively.

CPU projects do valuable science too. People have expressed over and over again that they want to run other projects on their CPUs while running GPUGRID. I've tested and posted alternatives above and asked you to check my results. Yet no one involved here seems to care about working with projects from other scientists. Anyway, you'll be glad to hear that I won't be bugging you guys so much, as I've moved most of my GPUs to other science. I won't be leaving entirely, but I also won't be wasting my time testing alternatives to try to improve things, since we're all just ignored anyway. I will still be around from time to time and wish you all the BEST in your endeavors. If things improve I'll move the bulk of my GPUs back. Thanks for the project!

Regards/Beyond

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19699 - Posted: 27 Nov 2010 | 17:26:26 UTC - in response to Message 19697.

You are the unspoken and yet most important thing, a Cruncher.

Research projects come and go, scientists come and go, CA's come and go, science apps, drivers, GPU's, CPUs and systems come and go, but crunchers remain crunchers.

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19705 - Posted: 28 Nov 2010 | 6:22:37 UTC
Last modified: 28 Nov 2010 | 6:24:18 UTC

I get BSODs from it (under Windows). I have stopped running GPUgrid on my GTX295 rigs for the time being. There is another thread about it running in Number Crunching here.

I would suggest the cuda 3.0 app, or cuda 3.2 when it's available.
____________
BOINC blog

Oktan
Send message
Joined: 28 Mar 09
Posts: 16
Credit: 953,280,454
RAC: 1
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19711 - Posted: 29 Nov 2010 | 13:43:33 UTC

Hi there. 6.13 works well if I run the task alone with all 4 CPU cores free, but then something strange happens: every 20-40 sec the task changes cores; it never stays on one core.
Yes, I have the SWAN_SYNC=0 thingy.

Yes, I am a noob, please get it working.

Keep up the good work.

MVH/ Oktan

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19712 - Posted: 29 Nov 2010 | 17:03:44 UTC - in response to Message 19711.

This is normal, and a good thing; the OS scheduler decides which core to use based on things like load and heat.

Your results match up well with my GTX260 on Linux (Ubuntu 10.10).
Note I have a GT240 in there too, and when I restart Boinc the tasks often jump to the other GPU.
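That said, if anyone really wants the process to stay put, Linux lets you pin CPU affinity by hand; a sketch, assuming a single acemd process:

# pin the acemd process to CPU core 0
taskset -cp 0 $(pgrep -f acemd | head -n 1)

As noted above, though, the hopping itself is harmless, so there is normally nothing to gain.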

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 388,572
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19765 - Posted: 4 Dec 2010 | 14:52:02 UTC - in response to Message 19695.

Try running GPUGrid only tasks and leaving your CPU free.

Why should I use BOINC if I want to participate in just one project? I could run Folding then.



So you don't want to try something long enough to see if it works, and then possibly change back later?

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 727,920,933
RAC: 388,572
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19766 - Posted: 4 Dec 2010 | 15:05:04 UTC - in response to Message 19695.

Try running GPUGrid only tasks and leaving your CPU free.

Why should I use BOINC if I want to participate in just one project? I could run Folding then.

GPUgrid had no problem using a full core with the old app, without even mentioning it in its description. Now you proclaim to have a sooo much better app, but it needs more manual fiddling than ever before, methinks. That's OK for beta WUs, but leave it out of production mode.


So you haven't noticed that this whole project is still in beta test?

You might think about switching to the Rosetta@Home project, which got past the beta-test stage, then started adding "improvements" so fast that the resulting level of problems made it look like it was still in EARLY beta test.

How many CPU cores does your computer have? If it's more than one, first try telling BOINC to stop running any CPU projects, and see how much that helps GPUGRID. Then try telling it to run CPU projects on all but one of the CPU cores, and see if that gets you closer to the situation you want.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19769 - Posted: 4 Dec 2010 | 18:48:48 UTC - in response to Message 19766.

For the few members that have two or more different cards in one system, in my case a GT240 and a GTX260, there is a simple way to finish tasks quicker; in my case the 30h tasks on a GT240. If a long work unit is running on the smaller card, suspend the other task (running on the larger card), restart Boinc, and your long task should run on the big GPU (assuming it is in the top PCIE slot). Then just resume the other task, and it will run on the lesser GPU.

Warning: do not suspend both tasks and then start them in the order you want to run them without exiting Boinc, or both may crash.
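A heads-up for later readers: BOINC clients from around version 7.0 onwards can at least steer a project away from a chosen device without the suspend/restart dance, via an exclusion in cc_config.xml; a sketch, not applicable to the 6.x clients current when this was posted:

<cc_config>
  <options>
    <exclude_gpu>
      <url>http://www.gpugrid.net/</url>
      <device_num>1</device_num>
    </exclude_gpu>
  </options>
</cc_config>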

JAMES DORISIO
Send message
Joined: 6 Sep 10
Posts: 8
Credit: 2,414,542,626
RAC: 1,825,132
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19772 - Posted: 4 Dec 2010 | 23:05:39 UTC - in response to Message 19652.
Last modified: 4 Dec 2010 | 23:08:55 UTC

The amount of CPU is fixed by the scheduler, so it's the same for Windows and Linux. We would be pretty happy to use 1 CPU for each run and get maximum speed, but then others would not like it. That's why we decided to give the opportunity to choose.

gdf


Any idea how long to implement this option: weeks? months?

Thanks Jim

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19835 - Posted: 9 Dec 2010 | 16:41:39 UTC

Wow, now I've even gone back to using the Nvidia 195 driver on my Linux boxes, after lots & lots of fun with SWAN_SYNC=0 & other fun stuff too. It's even worse with older drivers. 71711 70922 62706

But at least it's a cold winter with snow in Denmark ;-)
____________

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19837 - Posted: 9 Dec 2010 | 17:05:33 UTC - in response to Message 19835.

Wow, now I've even gone back to using the Nvidia 195 driver on my Linux boxes, after lots & lots of fun with SWAN_SYNC=0 & other fun stuff too. It's even worse with older drivers. 71711 70922 62706

But at least it's a cold winter with snow in Denmark ;-)

I crunched with these older drivers before, with a similar result to what you describe here, and was vehemently convinced by the usual suspects to use the new one, as it's soooo far better........ until it turned out to make no difference.
I told them that it made no difference, but they nevertheless now try to convince me to get my old driver back, because it's sooooo far better ;)
Now at least there's a second cruncher to confirm what I've been telling them for quite some time, only they do not listen. Let's see what their next suggestion will be: even older drivers? New beta stuff from nVidia?

They obviously don't want to tell the truth: that they are not interested in people who don't invest a few hundred Euro every year just in cards, plus an electricity bill of another few hundred Euro.

It's OK; if they only want rich nerds they could say so. But if they pretend that normal, mid-budget cards like the G200 series, which is far from old, are suitable for this project, they don't play fair. They are only suitable if they are extremely micromanaged by hand and running 24/7, and that's not the basis for a BOINC project.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19880 - Posted: 13 Dec 2010 | 0:10:09 UTC - in response to Message 19837.
Last modified: 13 Dec 2010 | 0:10:49 UTC

Okay, still too soon to say, but at least it looks normal now. I reinstalled Linux on one of my PCs yesterday. Nothing worked, so there was nothing to lose. I decided to try Mint Linux 10, since the Boinc client & Nvidia drivers are now as they should be; there's no more need to chase betas all the time.

It "might" have been one of the Mint Linux 8 updates that messed with something, though I haven't a clue what it could be. But after the first reinstall, I'm looking at a 30-hour WU on a PC using a GT240.

So I decided to reinstall the other 2 PCs; it all looks better than hopeless. My apologies for bitching so much. Maybe later I'll try tweaking with SWAN_SYNC=0 & other stuff, but for now I'm just glad to get it working again.

Cheers!
____________

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19883 - Posted: 13 Dec 2010 | 11:49:35 UTC - in response to Message 19880.
Last modified: 13 Dec 2010 | 11:50:20 UTC

I attached the four other projects I support, reinstalled the remaining two PCs, attached gpugrid.net as well as the other projects on the two newly reinstalled PCs, and wrote my last post. Now it's the same ol', same ol'. So something happened between only having gpugrid.net run on the first newly reinstalled PC, where a GT240 was at 80% after 24 hours, & after I attached other projects, where it's at 93% after 37 hours. So I "assume" that "maybe" the new WUs from gpugrid.net running on Linux don't like other projects & vice versa.
____________

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1576
Credit: 5,518,011,851
RAC: 8,607,124
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19884 - Posted: 13 Dec 2010 | 12:11:47 UTC - in response to Message 19883.
Last modified: 13 Dec 2010 | 12:12:02 UTC

I don't think it's anything to do with 'liking' or 'not liking'.

But they will be occupying CPU cores. And if you want GPUGrid to run at the highest possible speed, GPUGrid would like to have a CPU core back, please.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 19891 - Posted: 13 Dec 2010 | 21:44:56 UTC - in response to Message 19884.

True, but Windows doesn't have this problem, nor did I have any problem running 5 different projects on Linux prior to 6.12 & 6.13; now I've removed 2 projects from my PCs running Linux. I hope it's "good enough", because if the argument for this change was to lighten the burden that CPU-hungry gpugrid.net WUs were placing on other projects, this new change is having the opposite effect.
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19893 - Posted: 13 Dec 2010 | 22:30:54 UTC - in response to Message 19891.

Ultimately, how much you wish to contribute comes down to your evaluation of GPUGrid compared to other projects and how much you want to support GPUGrid. I assessed GPUGrid primarily on the science, not my professional IT expertise. I decided to redirect my overall Boinc contribution here because I understand the research and the amount of work a GPU can do here compared to a CPU.
Initially it was not easy; I had one barely useful GPU, and my efforts to buy what I thought would be useful GPUs failed (192 is an evil number). Lesser GPU projects offer up more points per hour, so it took perseverance to reach my current level of contribution, which I think is reasonable for my means. I'm happy that my current contribution costs less, in terms of electricity, than it did 6 months ago. This is mostly due to upgrading GPUs.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19894 - Posted: 13 Dec 2010 | 22:40:50 UTC

Liveonc,

thumbs up for your patience! Regarding the actual problem: as I understand it, it should be enough to leave one CPU core free. How many other projects you run shouldn't matter. And judging by what SK says, you also need to do this in Windows to get maximum performance.

Sänger wrote:
only they do not listen.


That's not true. They've clearly listened, but haven't got a solution out of the door yet.

They obviously don't want to tell the truth: that they are not interested in people who don't invest a few hundred Euro every year just in cards, plus an electricity bill of another few hundred Euro.


That's what you see. For me this is not obvious at all.

I see that GPU-Grid is by definition interested in the fastest cards (they need results back ASAP in order to be productive) and needs to make its rather complex software work with them. While doing so they constantly need to fight bugs in CUDA libraries, drivers etc., as well as optimize the algorithms and the actual science. This is not a small task at all.

Having said this doesn't mean I wouldn't like to see a better solution to the current problem(s). I'm just trying to put things into perspective.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Fred J. Verster
Send message
Joined: 1 Apr 09
Posts: 58
Credit: 35,833,978
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19895 - Posted: 13 Dec 2010 | 22:45:16 UTC - in response to Message 19891.
Last modified: 13 Dec 2010 | 22:57:20 UTC

I have noticed too often that a wingman using a 200-series (NVidia) card errors out in 5 seconds, on tasks where some GT240's and GTX260-216's gave a correct result.
Isn't that a waste of resources and time?

I have a host with a GTS250, but it is not able to run any GPUgrid work. Only my WIN XP64 host (X9650 @ 3.55GHz + GTX480 @ 1400 engine clock, memory at 3880MHz) can.
Temperatures on this host are also fine, because no casing is used; with two Fermis running at full load they put out a lot of heat, 650 Watt when I ran a 470 together with the 480, but that gave too much 'trouble'.
I have often wondered why some 200-series cards crash a unit.


And yesterday, another 'new' experience: not enough virtual memory to continue, on a 64-bit WIN host with 4 (2x2) GiB of DDR2 533MHz!
And the hard drives on 3 hosts were so fragmented that I had to run chkdsk X: /f (/v) before I could defragment!
Why are WUs sent to hosts which produce, if anything at all, only errors?

It is clear that some WUs cannot be computed by e.g. a GTS250, but the GTX295 also makes too many errors, in too many cases.
GTX295 & GTX480 .
GTS250 & GTX480 .
GTX260 & GTX480 .
9500GT & GTX480 .
GOOD results with 'older cards', compared with a 480.

I also noticed a message saying I should update to CUDA 2.2 (I run 3.1) or update the app.
Or is it the Compute Capability that is meant, since there is the big difference between the 200 series (CC 1.0-1.3) and the 400 (Fermi) series (CC 2.0, 2.1)?
____________

Knight Who Says Ni N!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19909 - Posted: 14 Dec 2010 | 17:42:21 UTC - in response to Message 19895.

Hi Fred, I looked at some of those errors. You can certainly find them :)

One guy has a single GTX295 with runaway errors. I suggested he use 197.45 and run the 6.12 app.

Someone else has a GTX260-192; every task fails because their card is incapable of working on GPUGrid.

Another user for some reason is continuously aborting tasks.
I'm not sure why anyone would want to attach a GeForce 9500 GT in the first place, let alone keep aborting tasks. Again, I made a suggestion, just in case they read their PMs.

Werkstatt
Send message
Joined: 23 May 09
Posts: 121
Credit: 321,525,386
RAC: 477,412
Level
Asp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19914 - Posted: 15 Dec 2010 | 0:27:26 UTC

Hi All,

I see postings like "They obviously don't want to tell the truth: that they are not interested in people who don't invest a few hundred Euro every year just in cards, plus an electricity bill of another few hundred Euro."
I see posts describing ways to optimize speed and posts explaining the troubles that the devs see.
I'm also a victim of the 'evil number 192', but I found a way to sell that card.
I upgraded to the cheapest GTX460 I could get (€149.90, not 'a few hundred') and experimented a bit.
It's not a pure crunching computer; it's used for regular work and it runs CPU WUs as well.
The results are here: http://www.gpugrid.net/results.php?userid=25200
I can finish two tasks a day and have a card using less than 170 Watt.
One trick is to use the Windows Task Manager to increase the priority of the acemd WUs when my system is not being used for some hours. And I can easily switch back if I need the system for my work. It's not as effective as swan_sync, but more flexible.
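The same trick can be done from a command prompt, which makes it scriptable; a sketch, borrowing the 6.13 executable name quoted elsewhere in this thread (32768 is the Win32 priority value for Above Normal):

wmic process where name="acemd2_6.13_windows_intelx86__cuda31.exe" CALL setpriority 32768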
I would like to encourage everyone to experiment a bit to get the most out of their hardware. Science is 'finding new ways', not collecting virtual credits. And it's sometimes a matter of 'trial and error'.
If this project has real potential to help save lives, all the trouble is worth going through.

Alexander

Profile Saenger
Avatar
Send message
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwat
Message 19915 - Posted: 15 Dec 2010 | 5:40:11 UTC

So that's a 3-digit Euro amount just for the card; that's a lot.
And that's another Euro per day for electricity, ~350 per year; that's a lot too.
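The arithmetic behind that estimate, for anyone checking it; a sketch assuming roughly 167W of continuous extra draw and a tariff of ~0.25 EUR/kWh (both figures are assumptions, plug in your own):

0.167 kW x 24 h = 4 kWh per day
4 kWh x 0.25 EUR/kWh = 1 EUR per day, i.e. ~365 EUR per year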

And this is not about getting credits, otherwise I would long ago have left this project for one of the imho futile maths projects; it's about meeting the 2-day deadline so the crunching is useful. If the WUs take longer than 2 days, it's no longer useful crunching, just credits.

I have to do a lot of manual adjusting for every single WU, as soon as possible after it arrives on my computer.
I have to delete apparently-too-long WUs asap, although I don't know in advance how fast they would run; the project knows beforehand but fails to mark the WUs.

Now they have even distributed extra-long WUs to every cruncher, which will definitely lead to a huge amount of wasted crunch time, as a lot of them will take more than 2 days.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19918 - Posted: 15 Dec 2010 | 10:45:25 UTC - in response to Message 19915.
Last modified: 15 Dec 2010 | 10:52:25 UTC

A GTS450 is only about 40% faster than a GT240, but around 60% more power hungry, while a GTX460 is slightly more than twice as fast as a GT240 but uses around 150W; so it just about matches the GT240 in terms of energy efficiency when crunching here. The GTS450 and GTX460 are also more expensive in terms of performance per purchase price. The reason for this relatively poor performance compared to the high-end Fermis is that the GTS450 and GTX460 have shaders that are inaccessible when crunching here.

For GPU crunchers running costs are very important, and the most energy-efficient cards in terms of points per Watt are the GTX580 and GTX570. These are expensive cards, around £260/300 Euros for the GTX570. However, even at that price they offer some crunchers an opportunity. Crunchers with several older cards and systems could sell what they have and build a new system around one of these more energy-efficient cards. These cards can do as much work as several older cards and at the same time reduce running costs. You might not even have to spend anything overall if you get a reasonable price for your existing cards/system.

For example,
if I sell 4 or 5 GT240's (the most energy-efficient previous-generation cards) I would have enough money to buy one GTX470. The Fermi would do about the same amount of work but cost slightly less to run. If I throw in my GTX260 I would be able to buy a GTX570, which would do about the same work but draw only 219W rather than 520W.
If I also replaced my quad-GT240 system (4GB DDR2, Phenom II 940, TDP 125W) with a slightly lesser but much more energy-efficient (45W) quad-core CPU and used a Fermi, my overall power usage would fall dramatically and I would have 3 CPU cores to crunch on. The good thing is I would not have to spend anything overall.

Obviously this is not for the average cruncher, and I have not seen an entry-level Fermi card that betters the GT240; all the lesser Fermis have the unfavourable 48:1 ratio with their inaccessible shaders. Only the top Fermis use the 32:1 ratio. While I could sell 4 GT240's and buy one GTX460, it would not do as much work. There is little merit in moving away from GT240's towards a GTS450 or GTX460. It only makes sense if you have several pre-Fermi cards and are prepared to get a top-end Fermi.

As for the long tasks, I am running one on a GT240 and it will take around 30h using the optimization methods I prefer, well inside 48h. So for me the 50% bonus will make it very worthwhile in terms of credit.

-
GTS450 on Linux: 43.187 ms per step. Probably not well optimized.
Same task type/credit on a GT240 with the 6.12 app: 41.422 ms per step.

TomaszPawel
Send message
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19971 - Posted: 17 Dec 2010 | 11:16:49 UTC
Last modified: 17 Dec 2010 | 11:34:51 UTC

Windows 7 64bit + 263.06 x64 + GTX470 + Swan_Sync 0

GPU load 45% on p39-IBUCH_7_pYEEI_101214-1-4-RND2298_1, acemd2 version 613.

Also, acemd2_6.13_windows_intelx86__cuda31.exe uses 100% of a CPU core! (25% of the quad core)

Is this normal?

I also crunch AQUA. When I suspend AQUA, GPU load goes to 50%.

Any ideas how to boost GPU load?

For comparison, PrimeGrid for CUDA gets 99% GPU load with AQUA running on 4 cores, and its CPU usage is 1-2% of the max 25%.
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 19981 - Posted: 17 Dec 2010 | 21:54:29 UTC - in response to Message 19971.

Tomasz, the GPU utilization seems a bit low, but I think I have the reason.

I see slightly higher utilization than that on my GTX470's:
My i7-920 system with two GTX470's is presently crunching two IBUCH tasks. When I freed up 2 CPU threads the GPU utilization was between 60% and 63%.
When I freed up another CPU thread the GPU utilization rose to between 62% and 66%. Similar observations, but slightly higher utilization numbers. So why?

Well, firstly, Win XP is faster than Win7, but the difference in this case may be largely down to the systems: a Q8200 @ 2.33GHz compared to an i7-920 @ 2.8GHz. These tasks rely a lot on the CPU. It would be interesting to see how they perform on a better CPU. Someone with an i7-980X might want to post up some data.

Not sure if this would make any difference or not, but the 260.99 WHQL is the recommended driver for the GTX470, not the 263.06 driver (that one is for the GTX500-series cards and that effort of a GTX460 SE).

I think in this case running 2 GPU tasks at once would be useful.


Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 19987 - Posted: 18 Dec 2010 | 9:38:38 UTC - in response to Message 19981.

GPU utilization of some workunits should get better from the next release, probably in January. We know why these are slower, and it has been fixed.

Also, the problem of long workunits will be solved either by two applications (so that you can decide) or by better selecting the hosts which can compute them.


gdf
