
Message boards : Graphics cards (GPUs) : acemd2 611 cuda3.1 for Fermi

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 18079 - Posted: 20 Jul 2010 | 9:10:12 UTC

The new application for cuda3.1 is now out, please report here.

gdf

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 470
Message 18081 - Posted: 20 Jul 2010 | 10:10:35 UTC
Last modified: 20 Jul 2010 | 10:14:24 UTC

Got one 6.11 task.
It's running at 91-93% GPU usage, and completed 50% in 52 minutes. (GTX480 at stock clocks, SWAN_SYNC=0)
It's very promising.

Profile Retvari Zoltan
Message 18082 - Posted: 20 Jul 2010 | 11:13:35 UTC - in response to Message 18081.

Got one 6.11 task.
It's running at 91-93% GPU usage, and completed 50% in 52 minutes. (GTX480 stock clocking, SWAN_SYNC=0)
It's very promising.


It completed successfully, and much faster (6236 sec) than with 6.05 (8200 sec); that's approx. 23% faster.
# Time per step (avg over 1300000 steps): 4.794 ms
# Approximate elapsed time for entire WU: 6232.813 s
GPU load was 90-93%, and the Memory Controller Load was 22%; both are higher than they used to be with the 6.05 client (66% and 14%).
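
As a quick sanity check on the two log lines above, the elapsed time should simply be the average time per step multiplied by the number of steps. A minimal sketch (the function name is only for illustration, it is not part of the app):

```python
def elapsed_seconds(ms_per_step: float, steps: int) -> float:
    """Total run time in seconds from the average time per step (in ms)."""
    return ms_per_step * steps / 1000.0


# Values taken from the two "#" log lines above.
estimate = elapsed_seconds(4.794, 1_300_000)
print(f"estimated elapsed time: {estimate:.1f} s")
```

The estimate lands within about a second of the logged 6232.813 s; the small gap is presumably because the per-step figure is rounded to three decimals.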

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 18083 - Posted: 20 Jul 2010 | 13:06:39 UTC - in response to Message 18082.
Last modified: 20 Jul 2010 | 13:09:28 UTC

Faster depends on what way you work it out!

For your two TONI tasks I make it 23% less time, which is 30% faster

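Task ID | Work unit ID | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application
2696626 | 1716504 | 20 Jul 2010 9:12:35 UTC | 20 Jul 2010 11:04:52 UTC | Completed and validated | 6,235.95 | 6,196.56 | 4,727.93 | 7,091.89 | ACEMD2: GPU molecular dynamics v6.11 (cuda31)
2696083 | 1716202 | 20 Jul 2010 6:30:00 UTC | 20 Jul 2010 8:51:33 UTC | Completed and validated | 8,135.92 | 7,649.42 | 4,727.93 | 7,091.89 | ACEMD2: GPU molecular dynamics v6.05 (cuda30)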
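
Both figures are right; they just use different baselines. A minimal sketch using the run times from the two results above (the function names are only for illustration):

```python
def time_saved_pct(old_s: float, new_s: float) -> float:
    """Percentage reduction in run time, relative to the old run time."""
    return (old_s - new_s) / old_s * 100.0


def speedup_pct(old_s: float, new_s: float) -> float:
    """Percentage increase in throughput, relative to the old throughput."""
    return (old_s / new_s - 1.0) * 100.0


# Run times (sec) of the v6.05 and v6.11 TONI tasks listed above.
old_run, new_run = 8135.92, 6235.95
print(f"less time: {time_saved_pct(old_run, new_run):.1f}%")
print(f"faster:    {speedup_pct(old_run, new_run):.1f}%")
```

"Less time" divides the saving by the old run time, while "faster" divides the old run time by the new one, so the second number is always the larger of the two.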

I'm also running a 3.1 task on a GTX470. It's been running for a while and looks slightly faster (but not by as much as the TONI tasks); it should finish in 7420 sec total. It is a different type of WU (468-KASHIF_HIVPR_auto_spawn_2_90_ba1-28-100-RND7028_0). A similar 3.0 task on my card took 8208 sec. So the 3.1 KASHIF_HIVPR tasks are about 10% faster than the 3.0 tasks.

GPU says 96% usage, temp 76 °C, fan at 83%, clocked to 704 MHz.

Profile Retvari Zoltan
Message 18085 - Posted: 20 Jul 2010 | 14:06:14 UTC - in response to Message 18083.
Last modified: 20 Jul 2010 | 14:06:43 UTC

Faster depends on what way you work it out!

For your two TONI tasks I make it 23% less time, which is 30% faster


You're right.

Profile skgiven
Message 18086 - Posted: 20 Jul 2010 | 14:41:36 UTC - in response to Message 18085.

Well, my first 6.11 task finished in 7402.484 sec (as predicted, about 10% faster than a similar 6.05 task). I have picked up another 6.11 WU; this time it's an IBUCH :)

I see a few people with GTX460 cards are running these 6.11 based tasks, but no results back yet.

Profile Retvari Zoltan
Message 18088 - Posted: 20 Jul 2010 | 15:17:58 UTC

My second 6.11 WU was a TONI too, and it was just as fast.
I'm very pleased with this performance.
Now I've got two new TONIs, but I'd really like to have a GCY.

roundup
Joined: 11 May 10
Posts: 57
Credit: 1,693,095,193
RAC: 13,094,131
Message 18089 - Posted: 20 Jul 2010 | 15:25:28 UTC - in response to Message 18086.

Has anyone finished a 6.11 WU under Win7?
I know that CUDA 3.0 tasks finish faster under driver 197.75 than under 257.xx. So I am hesitating to change drivers again. I'd be happy to read about your experience with CUDA 3.1 under Win7.

Jirza
Joined: 30 Jun 10
Posts: 4
Credit: 1,108,522
RAC: 0
Message 18090 - Posted: 20 Jul 2010 | 17:08:16 UTC - in response to Message 18089.

My card finished v6.11 under Win7:
http://www.gpugrid.net/result.php?resultid=2696901 ACEMD2: GPU molecular dynamics v6.11 (cuda31)
http://www.gpugrid.net/result.php?resultid=2692327 ACEMD2: GPU molecular dynamics v6.05 (cuda30)

Profile skgiven
Message 18091 - Posted: 20 Jul 2010 | 17:21:57 UTC - in response to Message 18090.
Last modified: 20 Jul 2010 | 17:51:24 UTC

That is almost 60% faster for the IBUCH tasks under Win7 using the 6.11 app.
It looks like that improvement holds for XP on a GTX470 as well.

roundup, I think it is worth a try; as long as you get plenty of IBUCH tasks, it should be a good improvement. If you get an even spread of tasks (with 10%, 30% and 60% improvement), it should cover the driver's losses (as long as they were not over 33%)!

roundup
Message 18092 - Posted: 20 Jul 2010 | 17:57:28 UTC - in response to Message 18091.


roundup, I think it is worth a try; [...]

Okay skgiven, you've won ;-). I'll give it a try.

coldFuSion
Joined: 22 May 10
Posts: 20
Credit: 85,355,427
RAC: 0
Message 18093 - Posted: 20 Jul 2010 | 18:19:02 UTC - in response to Message 18090.

My card finish v6.11 under Win7
http://www.gpugrid.net/result.php?resultid=2696901 ACEMD2: GPU molecular dynamics v6.11 (cuda31)
http://www.gpugrid.net/result.php?resultid=2692327 ACEMD2: GPU molecular dynamics v6.05 (cuda30)


That's about a 40% improvement O_O

Profile Bikermatt
Joined: 8 Apr 10
Posts: 37
Credit: 3,839,902,185
RAC: 0
Message 18094 - Posted: 20 Jul 2010 | 18:21:06 UTC

Here's the first off my 1GB 460. Slower than I had hoped in Win7. GPU showed about 73% load during the task.

I am going to try some with Swan_Sync=0 now and see what happens.


http://www.gpugrid.net/result.php?resultid=2697154

Profile skgiven
Message 18096 - Posted: 20 Jul 2010 | 19:24:49 UTC - in response to Message 18094.
Last modified: 20 Jul 2010 | 20:06:17 UTC

Vista and Win7 are slow - nothing has changed there; that is 88% slower than a similar 6.05 task I ran on my GTX470.
Remember to free up your CPU; if you run other CPU projects on every CPU core, using swan_sync will make little difference! You have to free up at least one core/thread:
On my i7-920, I leave 2 threads free, and it is significantly faster than with only 1 thread free.

Also, the improvement for GF100 Fermis was only about 10% for the KASHIF_HIVPR 6.11 WUs (your last run).

The TONI tasks (which you have this time) are about 30% faster on 6.11 than on 6.05, and the IBUCH tasks are about 59% faster. So your task should be a bit faster this time.

Profile skgiven
Message 18098 - Posted: 20 Jul 2010 | 23:10:30 UTC - in response to Message 18096.

This IBUCH_101b_pYEEI task was about 13% faster. We seem to have a large variety of 6.11 tasks and improvements over the 6.05 tasks.

Profile Bikermatt
Message 18099 - Posted: 21 Jul 2010 | 1:40:43 UTC

Swan_Sync=0 is showing an 80% GPU load with this KASHIF_HIVPR task compared to the 73% GPU load on the last KASHIF_HIVPR with no Swan_Sync.


Profile Retvari Zoltan
Message 18103 - Posted: 21 Jul 2010 | 10:17:08 UTC

So far, so good.
I've got one erratic task, which has failed 3 times already.

Profile skgiven
Message 18104 - Posted: 21 Jul 2010 | 10:48:26 UTC - in response to Message 18103.

I have an invalid task with a NaN error:

Name I529-TONI_KIDln-23-100-RND9840_0
Workunit 1719104
Created 21 Jul 2010 4:49:41 UTC
Sent 21 Jul 2010 5:54:43 UTC
Received 21 Jul 2010 8:44:18 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 98 (0x62)
Computer ID 71363
Report deadline 26 Jul 2010 5:54:43 UTC
Run time 2838.5625
CPU time 2835.453
stderr out

<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 470"
# Clock rate: 1.41 GHz
# Total amount of global memory: 1341718528 bytes
# Number of multiprocessors: 14
# Number of cores: 112
SWAN: Using synchronization method 0
MDIO ERROR: cannot open file "restart.coor"
ERROR: file deven.cpp line 855: # Energies have become nan

called boinc_finish

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 4727.92939814815
Granted credit 0
application version ACEMD2: GPU molecular dynamics v6.11 (cuda31)

Bikermatt, that's a 9% improvement for that type of task by using swan_sync. Not bad.

Profile Retvari Zoltan
Message 18105 - Posted: 21 Jul 2010 | 11:07:14 UTC
Last modified: 21 Jul 2010 | 11:20:40 UTC

This task caused an "acemd2_6.11_windows_intelx86__cuda31.exe has stopped working" error, and I didn't click "OK"; I restarted my PC instead. (I'm used to doing that every time I see this error message, because if I click "OK" then the task will abort, and the processing time is lost.)
After the restart, the task finished fine, just like every other task with that error message (and restart).

Profile skgiven
Message 18107 - Posted: 21 Jul 2010 | 11:34:54 UTC - in response to Message 18105.

Yes, I have seen this "stopped working" error too, and I try to do the same thing: a hot restart ASAP usually allows the task to pick up at the last checkpoint. If we had a way to tell a failed task to go back to the last checkpoint, it would save quite a few tasks. In Windows there are a lot of false failures; for example, IE sometimes shows a failure message saying it has crashed, but if you just click the X in the top right corner, IE works fine. If you click OK, Windows closes IE. Unfortunately, clicking the X in the case of an acemd task kills it.

roundup
Message 18108 - Posted: 21 Jul 2010 | 12:15:06 UTC - in response to Message 18091.

That is almost 60% faster for the IBUCH tasks under Win7 using the 6.11 app.
[...]
roundup, I think it is worth a try; as long as you get plenty of IBUCH tasks, it should be a good improvement. If you get an even spread of tasks (with 10%, 30% and 60% improvement), it should cover the drivers losses (as long as it was not over 33%)!

I am really happy to have installed the new driver. Thanks for the advice, skgiven!
This computer never completed a 6803 credit task (TONI_CAPBIND) under Vista/7 in about 10k seconds before:
http://www.gpugrid.net/result.php?resultid=2700834.
Obviously driver 258.96 running CUDA3.1 tasks is much faster under Vista/7 than driver 257.21 running CUDA3.0 units - the difference is about 25% for the TONI tasks. With the former 257 driver it took an average of 16k seconds for a 5,965 credits task. Example (IBUCH): http://www.gpugrid.net/workunit.php?wuid=1713213.

The latest 258 driver with CUDA3.1 is a big step forward for the Win7 users.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 18109 - Posted: 21 Jul 2010 | 13:56:33 UTC - in response to Message 18108.

Obviously driver 258.96 running CUDA3.1 tasks is much faster under Vista/7 than driver 257.21 running CUDA3.0 units - the difference is about 25% for the TONI tasks. With the former 257 driver it took an average of 16k seconds for a 5,965 credits task. Example (IBUCH): http://www.gpugrid.net/workunit.php?wuid=1713213.

I think you need to compare similar WUs. This computer shows 5 of the 6803 credit TONI_CAPBIND WUs. The 2 run under CUDA30 had times of 12,994 and 13,905. There were 3 run under CUDA31: 2 successes with times of 10,330 and 10,076, and 1 failure with a time of 15,897. In the best of worlds that is nowhere near 25%, and if you count the failure it is actually slower for CUDA31.

roundup
Message 18110 - Posted: 21 Jul 2010 | 14:23:50 UTC - in response to Message 18109.

Obviously driver 258.96 running CUDA3.1 tasks is much faster under Vista/7 than driver 257.21 running CUDA3.0 units - the difference is about 25% for the TONI tasks. With the former 257 driver it took an average of 16k seconds for a 5,965 credits task. Example (IBUCH): http://www.gpugrid.net/workunit.php?wuid=1713213.

I think you need to compare similar WUs. This computer shows 5 of the 6803 credit TONI_CAPBIND WUs. The 2 run under CUDA30 had times of 12,994 and 13905. There were 3 run under CUDA31, 2 successes with times of 10,330 and 10,076, and 1 failure with a time of 15,897. In the best of worlds nowhere near 25% and if you count the failure actually slower for CUDA31.

I know why there was a failure: My fault, when I tested with different BOINC and environmental settings.
Also the HIVPR units are much quicker: Now 8,759.88 seconds. I never delivered a HIVPR unit with less than 11k seconds before, not even on a different (and much faster) PC that also is equipped with a GTX470.

Profile Beyond
Message 18111 - Posted: 21 Jul 2010 | 14:35:32 UTC - in response to Message 18110.

I know why there was a failure: My fault, when I tested with different BOINC and environmental settings.
Also the HIVPR units are much quicker: Now 8,759.88 seconds. I never delivered a HIVPR unit with less than 11k seconds before, not even on a different (and much faster) PC that also is equipped with a GTX470.

That is faster. Just wondering, what BOINC & environmental settings are you referring to? Also did you use swan_sync for both CUDA30 & CUDA31?

Profile skgiven
Message 18113 - Posted: 21 Jul 2010 | 15:20:15 UTC - in response to Message 18111.
Last modified: 21 Jul 2010 | 15:31:06 UTC

I saw these two KASHIF_HIVPR tasks, one using CUDA3.0 and then using CUDA3.1 (with the latest molecular dynamics app; v6.11). They ran on the same Vista system. The v6.11 app was 44% faster:

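Task ID | Work unit ID | Sent | Time reported | Status | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application
2616205 | 1663394 | 4 Jul 2010 11:22:11 UTC | 9 Jul 2010 10:27:45 UTC | Completed and validated | 12,636.57 | 12,351.77 | 4,428.01 | 4,428.01 | ACEMD2: GPU molecular dynamics v6.05 (cuda30)
2701801 | 1713122 | 21 Jul 2010 10:42:25 UTC | 21 Jul 2010 14:14:34 UTC | Completed and validated | 8,759.88 | 8,882.59 | 4,428.01 | 6,642.02 | ACEMD2: GPU molecular dynamics v6.11 (cuda31)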

I ran a similar KASHIF_HIVPR task on my GTX470, which took 7417 sec (driver 257.21). My task ran about 18% faster than yours, both using v6.11, but my card is overclocked to 704 MHz (about 16% faster than stock), and I also freed up 2 cores on my i7-920.

Roundup, is your GTX470 at stock?
I'm asking because I am using XP x86!
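
For reference, the percentages in this post can be reproduced from the run times quoted above. A minimal sketch; the 607 MHz stock core clock for the GTX470 is my assumption (it is not stated in this thread), and the helper name is only for illustration:

```python
def pct_gain(larger: float, smaller: float) -> float:
    """Percentage by which `larger` exceeds `smaller`."""
    return (larger / smaller - 1.0) * 100.0


# v6.05 vs v6.11 run times (sec) on the same Vista system.
app_gain = pct_gain(12636.57, 8759.88)
# Vista v6.11 run time vs the XP v6.11 run time (sec).
xp_gain = pct_gain(8759.88, 7417.0)
# Overclocked core clock vs the ASSUMED stock core clock (MHz).
oc_gain = pct_gain(704.0, 607.0)

print(f"v6.11 over v6.05 on Vista: {app_gain:.0f}%")
print(f"XP run over Vista run:     {xp_gain:.0f}%")
print(f"overclock over stock:      {oc_gain:.0f}%")
```

If the stock clock assumption holds, the overclock alone is about the same size as the XP-versus-Vista gap, which is why the question about stock clocks matters.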

coldFuSion
Message 18115 - Posted: 21 Jul 2010 | 17:26:54 UTC
Last modified: 21 Jul 2010 | 17:32:31 UTC

I have seen some very significant improvement in a few specific WUs.

However I thought I would post results from one of the seemingly least efficient WUs I have run: IBUCH_freebind_pYEEI_100706. This one only pushes my GPU usage into the 70 something percent.

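Task ID | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application | Time per step
2670817 | 16,609.06 | 16,413.48 | 7,954.42 | 11,931.63 | ACEMD2: GPU molecular dynamics v6.05 (cuda30) | 13.284 ms/step
2701432 | 15,484.38 | 15,454.58 | 7,954.42 | 11,931.63 | ACEMD2: GPU molecular dynamics v6.11 (cuda31) | 12.385 ms/step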

That's approximately a 7% improvement.
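
The run times and the ms/step figures tell the same story, which is a useful cross-check. A minimal sketch with the numbers from the two results above (variable names are only for illustration):

```python
# Run times (sec) and time per step (ms) for the v6.05 and v6.11 tasks above.
run_v605, run_v611 = 16609.06, 15484.38
step_v605, step_v611 = 13.284, 12.385

runtime_gain = (run_v605 / run_v611 - 1.0) * 100.0
step_gain = (step_v605 / step_v611 - 1.0) * 100.0

print(f"gain from run times: {runtime_gain:.1f}%")
print(f"gain from ms/step:   {step_gain:.1f}%")
```

Both ratios come out at a little over 7%, so the improvement is not an artefact of start-up or checkpoint overhead in one of the runs.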

Profile skgiven
Message 18116 - Posted: 21 Jul 2010 | 17:44:21 UTC - in response to Message 18115.
Last modified: 21 Jul 2010 | 17:55:07 UTC

I agree that different work types are showing different performances, but the important thing is that they have all improved. No Fermi tasks perform worse using the latest 6.11 app. The latest improvements have allowed the GTX460 cards to work, and brought about a big improvement for Vista and W7 users (who needed it the most).
Like you, I have a GTX470 and use XP. Yesterday and this morning I ran tasks using the older 257.21 driver, as you are doing. I have now changed to the latest driver, just on the off chance it brings an improvement my way. I don't think it will (just for Vista and W7), but I will try it and report back.

roundup
Message 18118 - Posted: 21 Jul 2010 | 18:14:04 UTC - in response to Message 18113.
Last modified: 21 Jul 2010 | 18:15:28 UTC

I saw these two KASHIF_HIVPR tasks, one using CUDA3.0 and then using CUDA3.1 (with the latest molecular dynamics app; v6.11). They ran on the same Vista system. The v6.11 app was 44% faster:

[...]
My task ran about 18% faster than yours, both using v6.11, but my system is overclocked to 704MHz (about 16% faster than stock), and I also freed up 2 cores on my i7-920.

Roundup, is your GTX470 at stock?
I'm asking because I am using XP x86!

The GTX470 is at 702/1708/1404, SWAN_SYNC is set. I only freed up one core on the i7-920 @ stock.
The 44% advantage of CUDA3.1 over CUDA3.0 is amazing.

I think your 18% advantage is caused by 2 instead of 1 freed up core and XP instead of Vista.
The disadvantage of Vista/7 has become smaller, but is still there.

Profile Retvari Zoltan
Message 18134 - Posted: 22 Jul 2010 | 9:11:00 UTC

I've got another "...cuda31.exe has stopped working" error (while watching some HD video), a failed task, and a task cancelled by the server. This last one is a novelty for me.

Profile skgiven
Message 18137 - Posted: 22 Jul 2010 | 12:20:58 UTC - in response to Message 18134.
Last modified: 23 Jul 2010 | 20:39:25 UTC

On my GTX470 (XP), I found no performance change using the latest driver (258.96) instead of the older 257.21 driver,
[edit]
but I did find that the 6.11 app brought an average 11% improvement from the 6.05 app:

- IBUCH_freebind_pYEEI . . . . 9.85% faster
- IBUCH_101b_pYEEI . . . . . . 10.25%
- TONI_CAPBINDsp1 . . . . . . . 12.00%
- TONI_CAPBINDsp2 . . . . . . . 12.18%
- KASHIF_HIVPR_auto_spawn . . 10.48%
[/edit]
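
The 11% average is a plain mean of the five per-task figures listed above. A minimal sketch (the dictionary is only a convenient way to hold the numbers from the list):

```python
from statistics import mean

# Per-task improvement of v6.11 over v6.05 on the GTX470 (XP), in percent.
gains_pct = {
    "IBUCH_freebind_pYEEI": 9.85,
    "IBUCH_101b_pYEEI": 10.25,
    "TONI_CAPBINDsp1": 12.00,
    "TONI_CAPBINDsp2": 12.18,
    "KASHIF_HIVPR_auto_spawn": 10.48,
}

average = mean(gains_pct.values())
print(f"average improvement: {average:.2f}%")
```

A plain mean treats every task type as equally common; if the server hands out some types more often than others, a weighted average would be the fairer figure.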

roundup, I think having 2 cores free expedites the task by about a further 3%, but no more than about 8%, so a GTX470 may only be 10 to 15% slower on Vista/Win7 than it would be on XP. More task types are required to be sure about this generalization, and there are task variations.
Going by your tasks, on Vista, moving to the latest driver and running the 6.11 app you saw these improvements:

- IBUCH_201b_pYEEI, 57% faster
- KASHIF_HIVPR, . . . 39% faster
- TONI_CAPBINDsp2, 33% faster
- TONI_CAPBINDsp1, 26% faster

Bedrich Hajek
Joined: 28 Mar 09
Posts: 468
Credit: 8,463,447,716
RAC: 11,008,613
Message 18166 - Posted: 26 Jul 2010 | 1:38:48 UTC

It looks like my 480 cards running on Win 7 are around 30% to 40% faster on acemd2 611 compared to 605, which means that a 480 is, on average, slightly faster than a 285 card running on Win XP. This is a good improvement, but a 480 should really be twice as fast as a 285, since it has double the cores. So the drivers for these cards still need to be updated.

coldFuSion
Message 18171 - Posted: 26 Jul 2010 | 17:15:59 UTC - in response to Message 18166.

Looks like, my 480 cards running on Win 7, are around 30% to 40% faster on acemd2 611 compared to 605, which means that it is, on average, slightly faster than a 285 card running on win xp. This is a good improvement, but 480 should really be twice as fast as 285, since it has double the cores. So the drivers for these cards, still need to be updated.



Apples to oranges.
You can't really compare a new card running on Win7 with an older card running on WinXP.

The drivers on Win7 don't need to be "updated"; M$ needs to abandon WDDM, which has unacceptable overhead that directly and significantly affects performance.

IMHO

Werkstatt
Joined: 23 May 09
Posts: 121
Credit: 321,525,386
RAC: 12,239
Message 18173 - Posted: 26 Jul 2010 | 19:47:07 UTC - in response to Message 18171.

Looks like, my 480 cards running on Win 7, are around 30% to 40% faster on acemd2 611 compared to 605, which means that it is, on average, slightly faster than a 285 card running on win xp. This is a good improvement, but 480 should really be twice as fast as 285, since it has double the cores. So the drivers for these cards, still need to be updated.



Apples to oranges
You can't really make a comparison to a new card running on Win7 to an older card running on WinXP

The drivers on Win7 don't need to be "updated" M$ needs to abandon WDDM which has unacceptable overhead that directly and significantly affects performance.

IMHO


One question, just to make it clear:
If a HW monitor says the GPU is running at ~95% on Win7, does that mean that it is as efficient as running at 95% on XP? Currently I do not own a Fermi card (only a GTX260/192), but I plan to buy a GTX460, and it would cause only minor problems to switch back to XP.

Regards,
Alexander

Profile skgiven
Message 18174 - Posted: 26 Jul 2010 | 20:11:58 UTC - in response to Message 18173.

No. Win7 is not as efficient as XP. Under Win7 the 95% total includes the Win7 driver overhead (perhaps 10 or 15%). Win7 is still slower than WinXP due to the driver architecture.
The latest GPUGrid application, along with the latest driver, resulted in a speed bump for the first Fermi cards (GF100) and enabled the second Fermi cards (GF104) to work. Prior to these updates the GF100 cards were almost half as fast on Vista and Win7 as on XP.

Werkstatt
Message 18175 - Posted: 26 Jul 2010 | 20:22:58 UTC - in response to Message 18174.

Uups!!
And what about Kubuntu? On one of my systems I could run the required apps in a VM under linux.

Regards,
Alexander

Profile Retvari Zoltan
Message 18176 - Posted: 26 Jul 2010 | 20:27:19 UTC - in response to Message 18173.


One question, just to make it clear:
If a HW monitor says the GPU is running at ~95% on Win7, does that mean that it is as efficient as running at 95% on XP? Currently I do not own a Fermi card (only a GTX260/192), but I plan to buy a GTX460, and it would cause only minor problems to switch back to XP.

Regards,
Alexander


In one word:
No.

I have WinXP, Vista, and Win7 on the same hardware. I started GPUgrid on Win7, and the GPU monitoring was showing a very high percentage, but the tasks took almost twice as long as on (others') WinXP systems. Then I learned about the WDDM overhead, so I switched 'back' to WinXP. The GPU monitoring still shows the same high GPU usage, but the tasks run much faster. And one dedicated core with the SWAN_SYNC=0 environment variable setting made the tasks run a little (about 10-15%) faster still.
Switching back to XP is recommended for the sake of crunching efficiency. You can even create a multi-boot environment: WinXP for crunching, and Win7 for gaming. Or you can have GPUgrid running under Linux on a USB stick.
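
If you are not sure whether the SWAN_SYNC=0 setting is actually in effect, a minimal sketch like the following checks the environment of the current process. It only assumes the variable name and value used in this thread; the function name is hypothetical, and the check says nothing about what the running BOINC client itself sees:

```python
import os
from collections.abc import Mapping


def swan_sync_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True if SWAN_SYNC is set to "0", the value used in this thread."""
    return env.get("SWAN_SYNC") == "0"


if __name__ == "__main__":
    state = "set to 0" if swan_sync_enabled() else "not set to 0"
    print(f"SWAN_SYNC is {state} in this process's environment")
```

A process inherits its environment when it is launched, so a newly added variable is only seen by programs started after it was set.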

Profile skgiven
Message 18177 - Posted: 26 Jul 2010 | 21:56:46 UTC - in response to Message 18175.
Last modified: 26 Jul 2010 | 22:02:55 UTC

Don't boot to XP first and then to a Linux virtual machine.
First of all, you can't use XP x86, as the Linux app is x64 (the Windows app is x86). Although you could do it from XP x64, you would have the overhead of XP, the virtualisation software and Linux, and three sets of potential problems. Better to use either XP or Linux.
If you must use Vista or Win7, then disk/USB booting to Linux (FatPuppy) is an option.

Werkstatt
Message 18179 - Posted: 26 Jul 2010 | 22:46:16 UTC - in response to Message 18177.

Thanks a lot for your answers.

My main system has to run under a 64-bit OS due to memory requirements, and it must be Windows. Since XP64 is no option, I have to use Win7. (Once upon Vista I had a working system ;-)

My other system is dual boot XP/Win7. I can check the speed difference there. It's not necessary to run Win7 every day.
My idea was to install Kubuntu and run the two most used Windows apps in a VM under Linux. These apps are tested and work fine in a VM. But will that bring a noticeable increase in speed?

Regards,
Alexander

Werkstatt
Message 18180 - Posted: 27 Jul 2010 | 7:30:33 UTC - in response to Message 18179.


My other system is dual boot XP/Win7. I can check the speed difference there.


I tested it overnight. Since GPUGRID fails on my GTX260/192, I had to test it with Einstein and Milkyway. And YES, you're absolutely right, it's much faster! I don't have exact numbers, but it seems to be in the range of 20%.

Alexander

Profile Beyond
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 18181 - Posted: 27 Jul 2010 | 12:33:48 UTC - in response to Message 18180.

I tested it overnight. Since GPUGRID fails on my GTX260/192, I had to test it with Einstein and Milkyway. And YES, you're absolutely right, its pretty much faster! I don't have exact numbers, but it seems to be in the range of 20%.

You have a speed difference between WinXP and Win7 in MilkyWay? At least with the ATI cards there's no difference at all, not even one second.

Werkstatt
Message 18183 - Posted: 27 Jul 2010 | 14:08:13 UTC - in response to Message 18181.

No, the posted difference refers to the E@H app. I wanted to run further tests, but currently two WUs failed after ~75 min.
This is the reason why I want to replace my unloved GTX260.
Anyway, nice to hear that there is no difference with the ATI cards. I cannot test that since my ATI system is not dual boot and I don't want to do too many experiments with it; I need this system running correctly.

It's really interesting that so many issues regarding nVidia seem to be nonexistent on the ATI side. Let's see what's coming up with the OpenCL apps.

Alexander

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 18194 - Posted: 28 Jul 2010 | 18:17:57 UTC - in response to Message 18183.

It's really interesting that so many issues regarding nVidia seem to be nonexistent on the ATI side. Let's see what's coming up with the OpenCL apps.


The MW app for ATI is hand-crafted. This is possible because the code is comparatively simple. You can't do this with an app as complex as GPU-Grid, or at least not within a finite time. That's why you have to rely on external libraries here, which makes things driver dependent and more vulnerable to errors that you cannot control or fix yourself.
Try doing the same with ATIs and you'll probably run into similar problems. At least ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Christopher Herr
Joined: 1 Apr 10
Posts: 13
Credit: 106,905,353
RAC: 0
Level
Cys
Message 18367 - Posted: 18 Aug 2010 | 13:34:09 UTC

Hello everyone,

Now that my Fermi GTX 480 finally seems to have really picked up the pace on XP x64 with SWAN_SYNC=0, I have some questions:
1. Why has this performance not been possible so far under Linux x64? Is this solely because the drivers aren't as well-crafted for Linux? I used the newest driver available from NVidia and the one that ships with Linux itself.

2. Now under XP, BOINC suddenly reports 1345 GFLOPS peak, instead of 778 GFLOPS before. The latter can't simply be driver overhead on Win 7 x64, can it?

3. Why does a WU running on XP x64 crash as badly as this one? Just thought I'd contact you, as the stderr output says to do so. Annoying, as it was nearly finished...

Thx and cheers,

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 18368 - Posted: 18 Aug 2010 | 17:03:30 UTC - in response to Message 18367.
Last modified: 18 Aug 2010 | 17:16:14 UTC

1. You ran two CUDA 3.0 tasks under Linux (6.06) and not CUDA 3.1 tasks for Fermi. You used the 2.6.32-24-generic driver. The latest is 256.44.

2. BOINC 6.10.17 (old Linux ver) vs 6.10.58 (latest Win ver). You did not have your GTX480 on W7.

3. Your error message, "The system can not find the path specified", appears to be a one-off event. Perhaps the message will help the developers.
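
As a rough sanity check on the 1345 GFLOPS figure from question 2: it matches the usual theoretical single-precision peak for a GTX 480, assuming the reference specs (480 CUDA cores, 1401 MHz shader clock) and 2 FLOPs per core per cycle. Those inputs are assumptions taken from the card's reference specification, not from this thread, and the helper name `peak_gflops` is made up for illustration. This is only a sketch of the textbook formula; it does not reproduce how BOINC itself computes the number, and it does not explain the older 778 figure.

```python
def peak_gflops(cores, shader_clock_mhz, flops_per_cycle=2):
    """Theoretical single-precision peak in GFLOPS.

    cores            -- number of shader cores (assumed: 480 for a GTX 480)
    shader_clock_mhz -- shader clock in MHz (assumed: 1401 at stock)
    flops_per_cycle  -- FLOPs per core per cycle (assumed: 2, one multiply-add)
    """
    # cores * MHz * FLOPs/cycle gives MFLOPS; divide by 1000 for GFLOPS.
    return cores * shader_clock_mhz * flops_per_cycle / 1000.0


gtx480_peak = peak_gflops(480, 1401)
print(f"GTX 480 theoretical peak: {gtx480_peak:.2f} GFLOPS")
```

With those assumed inputs the result rounds to 1345, so the newer BOINC version appears to be reporting the card's full theoretical peak.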

Thanks,

Profile Christopher Herr
Joined: 1 Apr 10
Posts: 13
Credit: 106,905,353
RAC: 0
Level
Cys
Message 18371 - Posted: 19 Aug 2010 | 13:33:48 UTC - in response to Message 18368.

Hi skgiven!

1. Yeah, I only ran two tasks under Linux x64, and I know that is not representative at all. But I was fairly disappointed with the performance, which is why I tried XP right away. I hate to contradict you, but I used NVidia's 256.44 first and the generic drivers later (not both at the same time), because I knew that the Fermi gets a performance boost from new drivers.
I tend to check first whether there's a new driver for any of my hardware. Both were disappointing, to say the least.

2. To be honest, I'm not a Linux geek, more of a Linux newbie. So I couldn't figure out how to install the newest BOINC for Linux AND get it to run. The manual for installing on Linux from the BOINC website wasn't helpful at all for me.
And I did have my GTX 480 running under Win 7, because I don't have any other capable NVidia graphics adapter aside from an ION, which crunched exactly one WU but was then too slow to report a second WU in time.

3. I hope this is helpful for further development; no problem, and you're welcome!

Cheers,
Christopher

Profile liveonc
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Message 18374 - Posted: 19 Aug 2010 | 19:07:16 UTC - in response to Message 18371.
Last modified: 19 Aug 2010 | 19:07:34 UTC

Hi Christopher Herr,

I made this newbie guide; I use it myself when I install, reinstall, or upgrade Mint Linux 8 (Ubuntu 9.10). If you're using a higher version of Mint or Ubuntu, the Nvidia driver upgrade steps won't work, but the steps to upgrade BOINC still work. I myself can't get higher versions of Mint/Ubuntu to run stably on my hardware. I'm told that the steps I use are time consuming, but it's just copy and paste, so it's not that bad.
____________

Profile Christopher Herr
Joined: 1 Apr 10
Posts: 13
Credit: 106,905,353
RAC: 0
Level
Cys
Message 18377 - Posted: 20 Aug 2010 | 6:01:58 UTC

Hello liveonc,

Thanks a lot! Will try that the next time I tinker with a full Linux install.

All the best,
Christopher
____________
