Message boards : Number crunching : New 'long' workunits in queue

Author Message
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 20541 - Posted: 28 Feb 2011 | 16:28:45 UTC

A new batch of 'long' workunits is out, named SMDTRYP5LONG. They are approximately 4 times as long as the usual ones.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 20543 - Posted: 28 Feb 2011 | 16:52:50 UTC

Hi Toni,

Can you confirm that these WUs each represent 20 ns of simulation, and whether they are running at the new high utilization rate? If they are 20 ns and the typical entire run length is 500 ns (as with older runs), there would only need to be 25 WUs in a chain instead of 100. I would imagine these long (really long) WUs provide substantial overall efficiencies, due not only to the app efficiency but also to fewer turnovers between WUs. And because they are "opt in" only, it will likely be only high-end cards crunching them, so they should still return within 24 hours. I'll be turning GPUGrid back on when I get home tonight. :thumbsup:

____________
Thanks - Steve
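
The chain-length arithmetic in Steve's question can be sketched as follows. This is only an illustration: the 500 ns total and the 20 ns per WU are his figures, while the 5 ns per WU for the usual workunits is merely implied by his "100 WUs" number, and the function name is made up for this example.

```python
import math

def workunits_per_chain(total_ns: float, ns_per_wu: float) -> int:
    """Number of consecutive workunits needed to cover total_ns of simulation."""
    if ns_per_wu <= 0:
        raise ValueError("ns_per_wu must be positive")
    return math.ceil(total_ns / ns_per_wu)

# Figures taken from the post; 5 ns per short WU is an inferred assumption.
print(workunits_per_chain(500, 20))  # long WUs, as Steve assumes
print(workunits_per_chain(500, 5))   # usual WUs, implied by "instead of 100"
```

Fewer links in a chain means fewer waits for a result to come back before the next WU can be issued.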

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level
Trp
Message 20552 - Posted: 1 Mar 2011 | 0:20:11 UTC

I'm crunching two of these right now. Both are running on GTX 480 at 72-75% GPU usage (SWAN_SYNC=1, CPU at 4GHz). They will finish earlier than I expected: 2h27m 43.2%, 4h53m 86.1%
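
A rough way to turn the two progress readings above into a finish-time estimate is linear extrapolation. The sketch below assumes progress advances at a constant rate, which the thread does not confirm; the helper names are invented for this example.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an elapsed time given as hours and minutes to minutes."""
    return hours * 60 + minutes

def estimate_total_minutes(elapsed_min: float, fraction_done: float) -> float:
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_min / fraction_done

def fmt(total_min: float) -> str:
    """Format minutes as e.g. '5h40m'."""
    h, m = divmod(round(total_min), 60)
    return f"{h}h{m:02d}m"

# The two readings from the post: 2h27m at 43.2% and 4h53m at 86.1%.
for (h, m), done in [((2, 27), 0.432), ((4, 53), 0.861)]:
    print(fmt(estimate_total_minutes(to_minutes(h, m), done)))
```

Both readings extrapolate to roughly the same total, a little under six hours, well below the advertised 8-12 hours.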

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 20557 - Posted: 1 Mar 2011 | 9:25:44 UTC - in response to Message 20552.
Last modified: 1 Mar 2011 | 11:54:01 UTC

> Can you confirm that these WUs each represent 20 ns of simulation

Almost: these TRYP5LONGs had a preliminary 10 ns of simulation time, and no WU turnaround - I just needed those 10 ns per run. (This will change of course)

Longer WUs are indeed a huge bonus for us, because overall latency is much reduced. For example, for TRYP5LONGs, I just need to wait for one result per chain rather than four in a row.

Special thanks to all the "long" crunchers, then.

Kirby54925
Joined: 21 Jan 11
Posts: 31
Credit: 70,061,988
RAC: 0
Level
Thr
Message 20565 - Posted: 1 Mar 2011 | 12:09:45 UTC - in response to Message 20557.

Looks like the long workunit queue is running low. Will you repopulate it, or is this just a test run for you?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 20571 - Posted: 1 Mar 2011 | 13:39:08 UTC - in response to Message 20565.

Make sure your systems are set up to also run short tasks.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 20577 - Posted: 1 Mar 2011 | 15:05:12 UTC - in response to Message 20571.

The long runs will be more sporadic than the short ones.

gdf

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level
Trp
Message 20590 - Posted: 2 Mar 2011 | 1:39:38 UTC

The "If no work for selected applications is available, accept work from other applications?" option is not working (from the "Preferences for this project" section).

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level
Trp
Message 20592 - Posted: 2 Mar 2011 | 2:10:42 UTC
Last modified: 2 Mar 2011 | 2:18:49 UTC

I think you should en masse uncheck the "ACEMD for long runs of 8-12 hours on fastest GPU" option for users with low RAC. I'm crunching a bunch of these long WUs that previously failed on some older cards. As it is now, this is a waste of time, electricity and computing power.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 6,169
Level
Trp
Message 20593 - Posted: 2 Mar 2011 | 5:00:35 UTC - in response to Message 20590.

> The "If no work for selected applications is available, accept work from other applications?" option is not working (from the "Preferences for this project" section).

It's kinda strange. Now it seems to be working.


I received these error messages until now:

2011.03.02. 3:22:54 GPUGRID Sending scheduler request: To fetch work.
2011.03.02. 3:22:54 GPUGRID Requesting new tasks for GPU
2011.03.02. 3:22:56 GPUGRID Scheduler request completed: got 0 new tasks
2011.03.02. 3:22:56 GPUGRID Message from server: No work sent
2011.03.02. 3:22:56 GPUGRID Message from server: No work is available for ACEMD beta version
2011.03.02. 3:22:56 GPUGRID Message from server: No work is available for Long runs (8-12 hours on fastest card)
2011.03.02. 3:22:56 GPUGRID Message from server: ACEMD beta version is not available for your type of computer.
2011.03.02. 3:22:56 GPUGRID Message from server: No work available for the applications you have selected. Please check your preferences on the web site.

2011.03.02. 3:23:31 GPUGRID Sending scheduler request: To fetch work.
2011.03.02. 3:23:31 GPUGRID Requesting new tasks for CPU and GPU
2011.03.02. 3:23:32 GPUGRID Scheduler request completed: got 0 new tasks
2011.03.02. 3:23:32 GPUGRID Message from server: No work sent
2011.03.02. 3:23:32 GPUGRID Message from server: No work is available for ACEMD beta version
2011.03.02. 3:23:32 GPUGRID Message from server: No work is available for Long runs (8-12 hours on fastest card)
2011.03.02. 3:23:32 GPUGRID Message from server: ACEMD beta version is not available for your type of computer.
2011.03.02. 3:23:32 GPUGRID Message from server: No work available for the applications you have selected. Please check your preferences on the web site.

2011.03.02. 3:24:36 GPUGRID update requested by user
2011.03.02. 3:24:37 GPUGRID Sending scheduler request: Requested by user.
2011.03.02. 3:24:37 GPUGRID Requesting new tasks for CPU and GPU
2011.03.02. 3:24:38 GPUGRID Scheduler request completed: got 0 new tasks
2011.03.02. 3:24:38 GPUGRID Message from server: No work sent
2011.03.02. 3:24:38 GPUGRID Message from server: No work is available for ACEMD beta version
2011.03.02. 3:24:38 GPUGRID Message from server: No work is available for Long runs (8-12 hours on fastest card)
2011.03.02. 3:24:38 GPUGRID Message from server: ACEMD beta version is not available for your type of computer.
2011.03.02. 3:24:38 GPUGRID Message from server: No work available for the applications you have selected. Please check your preferences on the web site.




But now it looks like this:

2011.03.02. 5:43:44 GPUGRID Finished upload of R600-TONI_SMDTRYP5LONG-0-1-RND6708_1_4
2011.03.02. 5:43:44 GPUGRID Sending scheduler request: To report completed tasks.
2011.03.02. 5:43:44 GPUGRID Reporting 1 completed tasks, requesting new tasks for CPU and GPU
2011.03.02. 5:43:46 GPUGRID Scheduler request completed: got 1 new tasks
2011.03.02. 5:43:46 GPUGRID Message from server: No work can be sent for the applications you have selected
2011.03.02. 5:43:46 GPUGRID Message from server: No work is available for ACEMD beta version
2011.03.02. 5:43:46 GPUGRID Message from server: No work is available for Long runs (8-12 hours on fastest card)
2011.03.02. 5:43:46 GPUGRID Message from server: ACEMD beta version is not available for your type of computer.

2011.03.02. 5:43:46 GPUGRID Message from server: Your preferences allow work from applications other than those selected
2011.03.02. 5:43:46 GPUGRID Message from server: Sending work from other applications
2011.03.02. 5:43:48 GPUGRID Started download of F375-TONI_SMDTRYP5-0-LICENSE

I swear that I didn't change anything in my preferences since my previous post.
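
For anyone comparing log excerpts like the ones above, the interesting lines can be pulled out with a small script. This is a minimal sketch that assumes the message wording shown in this post ("Scheduler request completed: got N new tasks" and "Message from server: ..."); other client versions may word these lines differently, and the function name is invented here.

```python
import re

GOT_TASKS = re.compile(r"Scheduler request completed: got (\d+) new tasks")
SERVER_MSG = re.compile(r"Message from server: (.*)$")

def summarize(log_lines):
    """Return (tasks received per scheduler request, list of server messages)."""
    tasks, messages = [], []
    for line in log_lines:
        m = GOT_TASKS.search(line)
        if m:
            tasks.append(int(m.group(1)))
            continue
        m = SERVER_MSG.search(line)
        if m:
            messages.append(m.group(1).strip())
    return tasks, messages

# Four lines copied from the log excerpts in the post above.
sample = [
    "2011.03.02. 3:22:56 GPUGRID Scheduler request completed: got 0 new tasks",
    "2011.03.02. 3:22:56 GPUGRID Message from server: No work sent",
    "2011.03.02. 5:43:46 GPUGRID Scheduler request completed: got 1 new tasks",
    "2011.03.02. 5:43:46 GPUGRID Message from server: Sending work from other applications",
]

tasks, messages = summarize(sample)
print(tasks)
print(messages)
```

The fallback only shows up in the later request, where the server reports "Sending work from other applications" and one task is received.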

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 20596 - Posted: 2 Mar 2011 | 11:57:03 UTC - in response to Message 20592.
Last modified: 2 Mar 2011 | 12:04:45 UTC

> I think you should en masse uncheck the "ACEMD for long runs of 8-12 hours on fastest GPU" option for users with low RAC. I'm crunching a bunch of these long WUs previously failed on some older card. This is a waste of time, electricity and computing power as it is now.


Is there a way to at least prevent CC1.1 and CC1.2 users from selecting long tasks, to make it CC1.3 and Fermi only?

- Though that wouldn't help this guy's GTX295, this guy's GTS450 or this guy's GTX460.

They seem to work better on the high-end cards (GTX465, 470, 480, 570 and 580).

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 20598 - Posted: 2 Mar 2011 | 16:49:01 UTC - in response to Message 20596.

We could make it Fermi-only, but then people might complain.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 20602 - Posted: 3 Mar 2011 | 14:35:35 UTC - in response to Message 20598.

Then, as Zoltan suggests, a stricter allocation based primarily on task completion success rate would be the way forward; no one wants tasks failing after 8 to 12 h.

I also think that the option to run other tasks if long WUs are not available should be locked to "yes" when people subscribe to the long tasks:

Run only the selected applications:
ACEMD standard: no
ACEMD beta: yes
ACEMD for long runs (8-12 hours on fastest GPU): yes

If no work for selected applications is available, accept work from other applications? yes [set to yes when long runs = yes]

Profile silent Float
Joined: 13 Jul 09
Posts: 3
Credit: 5,626,566
RAC: 0
Level
Ser
Message 20609 - Posted: 4 Mar 2011 | 12:39:57 UTC - in response to Message 20598.

> We could make it fermi only, but then people might complain.
>
> gdf


Complaint !!

My GTX460 OC (GF104) -> for "long runs" 50'500 sec. -> NO ERRORS

There are many computers with GTX275, GTX280 and GTX285 cards running without errors and with much better run times, between 35'000 and 50'000 sec.

I hope you do not close access to "long runs" for all these GPUs.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 20610 - Posted: 4 Mar 2011 | 13:36:34 UTC - in response to Message 20609.

Don't worry, it's all going to stay open.
gdf

> > We could make it fermi only, but then people might complain.
> >
> > gdf
>
> Complaint !!
>
> My GTX460 OC (GF104) -> for "long runs" 50'500 sec. -> NO ERRORS
>
> There are many computers with GTX275, GTX280, GTX285 without errors and much better RunTime , between 35'000 and 50'000 sec.
>
> I hope you do not close access for "long runs" to all these GPU.


Profile nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,308,230,581
RAC: 0
Level
Met
Message 20631 - Posted: 8 Mar 2011 | 10:05:51 UTC
Last modified: 8 Mar 2011 | 10:18:14 UTC

Run time: 35,545.84 s, CPU time: 35,059.03 s, claimed credit: 38,584.72, granted credit: 57,877.08
GTX560Ti WinXP 64bit, Swan_Sync=0, driver 26726, GPU load 95%, time per step 14.190 ms. Host ID 31329.
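
Two quick checks can be run on the figures in this post. The sketch below is just arithmetic on the posted numbers: it assumes the run time is in seconds and that the whole run was spent stepping at the reported 14.190 ms per step. The thread does not explain why granted credit exceeds claimed credit, so the ratio is reported without interpretation.

```python
# Figures copied from the post above.
run_time_s = 35_545.84
claimed_credit = 38_584.72
granted_credit = 57_877.08
time_per_step_ms = 14.190

# Ratio of granted to claimed credit.
credit_ratio = granted_credit / claimed_credit

# Implied number of simulation steps, assuming the entire run time
# was spent stepping at the reported rate (an assumption).
implied_steps = run_time_s / (time_per_step_ms / 1000.0)

print(f"granted / claimed = {credit_ratio:.3f}")
print(f"implied steps = {implied_steps:,.0f}")
```

The ratio comes out at 1.5 and the implied step count at about 2.5 million.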
