
Message boards : Server and website : fast result returns

Author Message
Profile GDF
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 14 Mar 07
Posts: 1862
Credit: 629,356
RAC: 0
Level
Gly
Message 15660 - Posted: 10 Mar 2010 | 8:24:38 UTC

In order to increase the number of fast results, we have changed the way the server handles quickly returned jobs.
If you return your results within 24 hours, you get a bonus of 50%. If you return them within two days, the bonus is 25%, as before.
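
A rough Python sketch of the rule just described may help when reading the rest of the thread. It is not the server's code: the helper names, the treatment of the exact 24 h and 48 h boundaries, and the idea that the bonus simply multiplies a base credit are all assumptions made for illustration.

```python
def bonus_fraction(turnaround_hours: float) -> float:
    """Bonus as a fraction of base credit (hypothetical helper, not project code)."""
    if turnaround_hours < 0:
        raise ValueError("turnaround cannot be negative")
    if turnaround_hours <= 24:   # inclusive boundary is an assumption
        return 0.50
    if turnaround_hours <= 48:   # "within two days"
        return 0.25
    return 0.0


def credit_with_bonus(base_credit: float, turnaround_hours: float) -> float:
    """Apply the bonus to a base credit; the multiplicative form is assumed."""
    return base_credit * (1.0 + bonus_fraction(turnaround_hours))
```

Under these assumptions, a result with a base credit of 100 returned after 11 hours would be granted 150, and the same result returned after 30 hours would be granted 125.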

Workunit size will slowly increase once the new application, which is 30% faster, is out. We will try to create a queue of small workunits for smaller cards.


Current status:
== Turnaround time of today's results (sent-received within X days) ==
turnaround (days)   count
1                   2705
2                   604
3                   131
4                   64
5                   29
6                   11
>6                  15


GDF

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3516
Credit: 933,272,507
RAC: 1,047,955
Level
Glu
Message 15672 - Posted: 10 Mar 2010 | 14:19:09 UTC - in response to Message 15660.
Last modified: 10 Mar 2010 | 14:59:30 UTC

Many people will need to turn down their cache to benefit from the points for sub-24h returns. I would have preferred sub-36h (as I have several GT240s which usually take about 10-15h per task depending on system and OC), though this should be fine for any GTX260 or better (<7h turnaround).

Risky, should work units not be available!

Basically, unless you can return a work unit in less than 12 hours, you don't want a task sitting in the queue all the time; the next one should arrive only very close to when it is due to run.

There may be differences between how the various Boinc versions behave, WRT cache settings, and they could interfere with other projects (CPU and GPU).

I hope you don't overshoot the anticipated 30% increase in performance, or some cards might not be capable of returning WUs within 24h, especially the CC1.1 cards.

In Boinc Manager Advanced View, Advanced Preferences, network usage tab, Additional work buffer 0.10 should work for me – it will download a new task about 0.10 days before the present running task is due to finish. From about 0.01 to 0.3 days buffer would actually work. People should note that it is better to download a new task slightly before sending a completed task, otherwise you would be trying to download and upload at the same time, which could delay uploading.
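
To make the buffer values above easier to picture, here is a small Python sketch that converts a buffer given in days into hours and minutes. It assumes, as the post describes, that the client fetches a new task roughly that long before the running one finishes; the function name is made up for this example.

```python
def buffer_lead_time_hours(buffer_days: float) -> float:
    """Convert a work buffer in days to hours (hypothetical helper)."""
    return buffer_days * 24.0


# Buffer values mentioned in the post: 0.01, 0.10 and 0.3 days.
for days in (0.01, 0.10, 0.30):
    hours = buffer_lead_time_hours(days)
    print(f"{days:.2f} days = {hours:.2f} h ({hours * 60:.0f} min)")
```

So a 0.10 day buffer corresponds to roughly 2.4 hours of lead time, while 0.01 days is only about a quarter of an hour.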

Do the completed tasks have to be Reported to get the bonus, or just uploaded???

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 14 Mar 07
Posts: 1862
Credit: 629,356
RAC: 0
Level
Gly
Message 15673 - Posted: 10 Mar 2010 | 16:14:09 UTC - in response to Message 15672.

They need to get reported.

Ideally, you should not have to worry about getting the second WU in the queue. You should be able to increase the priority of GPUGRID as much as possible without negative effects. We already limit the number of workunits to avoid you getting too many of them.
What we can probably do is double the size of the WU and give 50% within 2 days and 25% more within 4 days, or something like that (1.5 days/3 days).

What do you guys think of this?

gdf

Snow Crash
Joined: 4 Apr 09
Posts: 434
Credit: 433,932,599
RAC: 752,665
Level
Gln
Message 15678 - Posted: 10 Mar 2010 | 19:15:21 UTC - in response to Message 15673.

I think that bonus points don't really matter any more.
Because you only get 2 WUs per GPU, if you have fast cards (2XX) you get the bonus. If you have slow cards (9XXX) then you usually do not. There just is not that much room in between.

Looking at the numbers posted on yesterday's returns you are already getting 76% within 24 hours so these people are not the focus.

I don't see any change possible for the 7% that return later than 48 hours as I doubt their cards are actually capable of doing better.

That leaves the 17% that return between 24 and 48 hours.
How best can we help them to return quicker?
An additional bonus may help, but maybe it is just a matter of education and giving people guidance on how to set their cache size appropriately.
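
The 76% / 17% / 7% split can be reproduced from the turnaround table in the first post. The Python sketch below does the arithmetic; it assumes that the "1" row means results returned within one day and the "2" row means results returned between one and two days.

```python
# Counts copied from the turnaround table in the opening post.
counts = {"1": 2705, "2": 604, "3": 131, "4": 64, "5": 29, "6": 11, ">6": 15}
total = sum(counts.values())


def share(keys):
    """Percentage of all results that fall into the given turnaround rows."""
    return 100.0 * sum(counts[k] for k in keys) / total


within_24h = share(["1"])
between_24_and_48h = share(["2"])
later_than_48h = share(["3", "4", "5", "6", ">6"])

print(f"total results: {total}")
print(f"within 24 h:   {within_24h:.1f}%")
print(f"24 to 48 h:    {between_24_and_48h:.1f}%")
print(f"later:         {later_than_48h:.1f}%")
```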

There are other ways to try to get the bonus, but they take more effort than most people are willing to put in and in some cases are detrimental to the goal of reducing overall return times.

I am assuming that most of the WUs were executed by the quicker app and if so then increasing the runtime will make it even more difficult for the people in the 24-48 hour bracket to make enough of a change to come in under 24 hours.

I think that if you make really long-running WUs it will increase the failure and late return rates (do the stats from before the new app upgrade show that late return rates have gone down? The error rate might still be high just because it is a new version). Another reason to leave WU lengths short is that, like most crunchers, I like to see my numbers move fast; it just makes me feel like I am getting more done :-) If the server can handle the return rate as it is today, I would leave the WU size alone or maybe increase it slightly. I do not think WUs 2x longer will work particularly well.

There is a way to make sure your wus get reported as soon as they are complete but most other projects don't like you to do it because reporting causes a heavy database hit.

Add this to your cc_config.xml:
<cc_config>
  <options>
    <report_results_immediately>1</report_results_immediately>
  </options>
</cc_config>
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3516
Credit: 933,272,507
RAC: 1,047,955
Level
Glu
Message 15680 - Posted: 10 Mar 2010 | 20:05:21 UTC - in response to Message 15673.

If you can identify faster cards (GTX260 and above) and increase the length for those that would be perfect.
The idea of doubling the length and having a 2-day return (for max points) is also better, because it means that with a cache of, say, 0.5 days a card such as a GT240 will be able to work through the task and download the next WU about halfway through. So there will be less chance of running out of tasks (due to an Internet outage at the user's end or at your server).
Of course this rests on your ability to send out 2 types of task, but there would also be benefits to the project. Should a short task not be returned in time from a slow GPU, you could still allocate (prioritise) that short task to a fast GPU; safe in the knowledge of a quick return.

When you decide what you will do, make it well known and perhaps put together a table of advice for different card specs.

OldChap
Joined: 5 Jul 09
Posts: 14
Credit: 22,132,314
RAC: 0
Level
Pro
Message 15682 - Posted: 11 Mar 2010 | 0:48:00 UTC - in response to Message 15680.

You might consider how the users configure their rigs...I guess most will just run "straight out of the box" so for a 98 gtx running one WU and having one in the queue the current length and speed makes it possible to hit the deadlines (at least if overclocked).

So, my thought is that rather than define a queue by time perhaps consider defining it by number so that if that number is set at 0.5 the next WU is not downloaded until the WU being worked on has completed 50%
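
The suggestion above amounts to a fetch rule based on progress instead of time. A minimal Python sketch of that idea follows; it is purely hypothetical (BOINC does not necessarily offer such a setting), and the function and parameter names are invented here.

```python
def should_fetch_next(progress: float, threshold: float = 0.5) -> bool:
    """Return True once the running WU has reached the given fraction done.

    progress and threshold are fractions between 0.0 and 1.0. This models
    the proposal in the post, not an existing client option.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be between 0.0 and 1.0")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return progress >= threshold
```

With the threshold at 0.5, a WU that is 30% complete would not trigger a download, while one that is 50% complete would.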

I've pulled a couple of cards from F@H and put on GRID and will continue to do this each time they can't keep their stuff together... It is not leading edge equipment (gx2's) but if such kit were to fall into a second tier of points (read wanted less by GPUGrid) then the incentive to participate 24/7 is diminished.

With regard to the 24hour deadline and larger WU's you will need to decide where to draw the line in the sand ...who will fall into that second tier.

The one concern that really does have to be resolved if you make the move to longer WU's is loss upon a failure. In fah I get credited for the work done so far upon a failure.... here, no such thing ...I ran 2* 260 192's for a while and failures with those....well you know that story...a lot of time effort and money for nothing...not a good thing.

Whatever you decide, make the interface simple...automatic if possible, with options for those who have the time to optimise for their own usage

Profile MarkJ
Volunteer moderator
Project tester
Volunteer tester
Joined: 24 Dec 08
Posts: 723
Credit: 149,720,182
RAC: 0
Level
Cys
Message 15685 - Posted: 11 Mar 2010 | 10:50:50 UTC

My machines have either a GTX295 or GTX275. They tend to do the newer work units in 6 hours. I also run Seti cuda, which means they will typically get the gpu 60% of the time.

Due to my ISP download limits I have an 8 hour window in which to do most of my comms. 1.5 days for the 50% bonus would be good, but 1 day and I tend to miss it by about one or two hours. I think the size is about right for the faster cards, so wouldn't suggest increasing the size of the work units.
____________
BOINC blog

Profile X-Files 27
Joined: 11 Oct 08
Posts: 95
Credit: 68,023,693
RAC: 0
Level
Thr
Message 15687 - Posted: 11 Mar 2010 | 13:56:53 UTC

I wouldn't increase the size unless there is some kind of recovery method after a few crash/error. Increasing the size is equal to increasing the chance of failures.

4xx = .5 day = 75%
2xx = 1day = 50%
9xxx/8xxx = 2 days = 25 %

____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 14 Mar 07
Posts: 1862
Credit: 629,356
RAC: 0
Level
Gly
Message 15688 - Posted: 11 Mar 2010 | 14:40:43 UTC - in response to Message 15687.

I wouldn't increase the size unless there is some kind of recovery method after a few crash/error. Increasing the size is equal to increasing the chance of failures.

4xx = .5 day = 75%
2xx = 1day = 50%
9xxx/8xxx = 2 days = 25 %


This is an option.

gdf

Snow Crash
Joined: 4 Apr 09
Posts: 434
Credit: 433,932,599
RAC: 752,665
Level
Gln
Message 15690 - Posted: 11 Mar 2010 | 15:45:50 UTC - in response to Message 15688.
Last modified: 11 Mar 2010 | 15:46:45 UTC

Clearly this is a call by the project to substantially increase the overall rate of return. I am guessing that this push is centered around the requirements / commitments made as part of the publicly funded project GPUGrid recently won. The schiz project is great news for GPUGrid and proves that what we have all been working towards is worthwhile. We have all spent so much time, equipment, and electricity ... now is the time to rise up and show that we really support what the project needs ... set your BOINC cache to .35
Let's not let them down now!

As a cruncher I really appreciate this conversation. I've flipped my position on WU size. If we go back to the average runtime from before the app upgrade then error rates will likely be the same. No higher, no lower. If the runtimes are extended beyond that it might encourage people to quit messing with their machines and get stable (because otherwise it is too painful if you crash after many hours).

I can also see that, if new bonus levels are added for 12 and 24 hour returns, the majority of returns that already happen within 48 hours are done on hardware capable of returning quicker if we all set our BOINC cache size to .35. There is no detriment to people who return between 24 and 48 hours, as the bonus level for them stays the same as it is today, and if I remember correctly GDF said they would try to make smaller WUs specifically for this group.

Everyone, please set your BOINC cache to .35 so GDF and team can get back to working on the real science.
____________
Thanks - Steve

Snow Crash
Joined: 4 Apr 09
Posts: 434
Credit: 433,932,599
RAC: 752,665
Level
Gln
Message 15876 - Posted: 21 Mar 2010 | 14:58:45 UTC - in response to Message 15688.

I wouldn't increase the size unless there is some kind of recovery method after a few crash/error. Increasing the size is equal to increasing the chance of failures.

4xx = .5 day = 75%
2xx = 1day = 50%
9xxx/8xxx = 2 days = 25 %


This is an option.

gdf


What is the value of adding the class of GPU into the bonus formula?
Unless you set up "small", "medium", "large" WU applications types and then each of them has a different bonus schedule? I think this is starting to get more complicated than necessary.

200b class gpu does a WU in about 6 hours.
6 hours - 25% from new app = 3.5 runtime
double that to get 7 hour runtime.
Double that for max WU in process = 14 hours.
New bonus target of 12 hours so we only need to delay BOINC Manager from pulling new work for 2 hours after each WU finishes. Should be pretty easy :-)

Have you seen any improvement in overall turnaround times?
How about new traffic due to higher pointing?

____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3516
Credit: 933,272,507
RAC: 1,047,955
Level
Glu
Message 15887 - Posted: 21 Mar 2010 | 20:27:42 UTC - in response to Message 15876.
Last modified: 21 Mar 2010 | 21:23:05 UTC

Returned one longer task (result details) today after 11h on a GTX260sp216 (system) for a 50% bonus (p48-IBUCH_1016_pYEEI_long_100319-0-4-RND9408):

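    Sent: 21 Mar 2010 4:48:49 UTC | Reported: 21 Mar 2010 19:44:14 UTC | Status: Completed and validated | Run time (sec): 39,561.06 | CPU time (sec): 9,017.50 | Claimed credit: 7,954.42 | Granted credit: 11,931.63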


It used slightly more processor time (but 0.23 cores is not too bad).

Of course the other paired card (a GTX295) failed to complete the WU. So you won't be making the jumps you were hoping for this time. Perhaps you can allocate tasks on a potential success rate basis? That same GTX295 rarely finishes a task (RAC 1301)!

I hope my GT240 cards keep picking up smaller tasks, as they would be borderline to complete within 24 hours (and they are overclocked).
My guess is that the DDR5 versions would make it with an hour to spare,
the GDDR3 would be within 30min either side,
but a natively clocked DDR3 card would miss out by about an hour.

Betting Slip
Joined: 5 Jan 09
Posts: 282
Credit: 443,294,926
RAC: 827,126
Level
Gln
Message 16084 - Posted: 30 Mar 2010 | 11:53:33 UTC - in response to Message 15690.


Everyone, please set your BOINC cache to .35 so GDF and team can get back to working on the real science.



I have my cache set to 0.01 and it makes no difference. I don't see why GPUGRID can't just swap one completed unit for one new unit. SIMPLES.
____________

Snow Crash
Joined: 4 Apr 09
Posts: 434
Credit: 433,932,599
RAC: 752,665
Level
Gln
Message 16144 - Posted: 3 Apr 2010 | 12:24:08 UTC - in response to Message 16084.

It looks like your tasks are not doubling up immediately, are you manually managing them now?
____________
Thanks - Steve

Betting Slip
Joined: 5 Jan 09
Posts: 282
Credit: 443,294,926
RAC: 827,126
Level
Gln
Message 16145 - Posted: 3 Apr 2010 | 14:09:47 UTC - in response to Message 16144.

No, I do report the odd one manually but it's not that machine which struggles to report within 24hrs.
Because the server polls the machines every 5hrs if you finish the unit just after the machine is polled it sits waiting to be reported for another 5hrs which means you miss the 24hr deadline.
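
The effect described above can be put into numbers with a short Python sketch. It takes the 5 hour contact interval from this post at face value (the real interval may differ) and assumes the worst case, where a task finishes right after a contact; the helper names are hypothetical.

```python
def worst_case_turnaround_hours(hours_to_finish: float,
                                contact_interval_hours: float = 5.0) -> float:
    """Time from download to report if the result waits a full interval."""
    return hours_to_finish + contact_interval_hours


def makes_bonus_window(hours_to_finish: float,
                       window_hours: float = 24.0,
                       contact_interval_hours: float = 5.0) -> bool:
    """True if even the worst-case report time still falls inside the window."""
    worst = worst_case_turnaround_hours(hours_to_finish, contact_interval_hours)
    return worst <= window_hours
```

Under these assumptions, a task that is finished 20 hours after download could be reported as late as 25 hours and miss the 24 hour window, whereas one finished after 18 hours would be safe.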
Caches are set 0.01 for max efficiency project wise. I don't believe in building a large cache for any project because it doesn't aid the project especially one like GPUGRID.


____________

Snow Crash
Joined: 4 Apr 09
Posts: 434
Credit: 433,932,599
RAC: 752,665
Level
Gln
Message 16148 - Posted: 3 Apr 2010 | 15:54:22 UTC - in response to Message 16145.

Because the server polls the machines every 5hrs if you finish the unit just after the machine is polled it sits waiting to be reported for another 5hrs which means you miss the 24hr deadline.


Most other projects don't like you to use the option I'll post below because it is a fairly heavy database call and they like to keep those to a minimum, but if you add the ...

<report_results_immediately>1</report_results_immediately>

option to your cc_config file it will force your PC to register the task as complete with the server right after it finishes uploading (within a minute or so). This might help you to make the 24 hour deadline.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3516
Credit: 933,272,507
RAC: 1,047,955
Level
Glu
Message 16154 - Posted: 3 Apr 2010 | 20:44:24 UTC - in response to Message 16148.

Thanks Snow Crash,
It is helping with my RAC.
For Windows 7 and Vista I needed to go into the c:\programdata\Boinc folder, create a cc_config.xml file, and then add the code. By default this folder is hidden!
(Actually, I just edited another xml file and used Save As.)
Remember to open Boinc in advanced view, click the Advanced tab and select Read config file.
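
For anyone who would rather script the steps above than edit the file by hand, here is a minimal Python sketch. It assumes you pass in your own BOINC data directory (the path is not hard-coded, since it differs between systems), and the function name is invented for this example. It creates cc_config.xml if it is missing, or sets the option in an existing file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def enable_immediate_reporting(data_dir: str) -> Path:
    """Create or update cc_config.xml in data_dir (hypothetical helper)."""
    config_path = Path(data_dir) / "cc_config.xml"

    if config_path.exists():
        # Keep whatever options are already in the file.
        tree = ET.parse(str(config_path))
        root = tree.getroot()
    else:
        root = ET.Element("cc_config")
        tree = ET.ElementTree(root)

    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    flag = options.find("report_results_immediately")
    if flag is None:
        flag = ET.SubElement(options, "report_results_immediately")
    flag.text = "1"

    tree.write(str(config_path), encoding="utf-8", xml_declaration=True)
    return config_path
```

After running it, the client still has to re-read the file, for example with Read config file as described above.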

Betting Slip
Joined: 5 Jan 09
Posts: 282
Credit: 443,294,926
RAC: 827,126
Level
Gln
Message 16268 - Posted: 13 Apr 2010 | 7:55:53 UTC - in response to Message 16148.

Because the server polls the machines every 5hrs if you finish the unit just after the machine is polled it sits waiting to be reported for another 5hrs which means you miss the 24hr deadline.


Most other projects don't like you to use the option I'll post below because it is a fairly heavy database call and they like to keep those to a minimum, but if you add the ...

<report_results_immediately>1</report_results_immediately>

option to your cc_config file it will force your PC to register the task as complete with the server right after it finishes uploading (within a minute or so). This might help you to make the 24 hour deadline.


Thanks Snow Crash

I am now getting just under 24hr on one of my machines which was affected by the time the WU was reported and this helps a lot.


____________

511513y
Joined: 15 Apr 14
Posts: 3
Credit: 608,321
RAC: 22,226
Level
Gly
Message 37833 - Posted: 5 Sep 2014 | 10:26:44 UTC

This thread is a few years old.
Can anyone confirm if this bonus is still in effect?
After missing a deadline by 20 minutes I was looking at changing my work unit buffer anyway.

Carlos Augusto Engel
Joined: 5 Jun 09
Posts: 24
Credit: 393,087,594
RAC: 854,494
Level
Asp
Message 37842 - Posted: 5 Sep 2014 | 15:56:46 UTC - in response to Message 37833.

Yes, still the same.
I finished one of Noelia's short WUs in less than 24h and received 31,500.00.
In your case, a similar Noelia WU gave 21,000.00 after 24h.
____________

MrJo
Joined: 18 Apr 14
Posts: 22
Credit: 194,527,483
RAC: 656,730
Level
Ile
Message 37846 - Posted: 5 Sep 2014 | 18:18:13 UTC - in response to Message 37833.
Last modified: 5 Sep 2014 | 18:19:17 UTC

After missing a deadline by 20 minutes I was looking at changing my work unit buffer anyway.

Precisely because the bonus is still granted only within 24 hours, it makes sense to keep the buffer as small as possible. The clock starts when the WU is downloaded, so if the WU sits idle on your hard drive, valuable points are wasted. Therefore, I have chosen this setting (0.01 days):



____________
Regards, Josef

511513y
Joined: 15 Apr 14
Posts: 3
Credit: 608,321
RAC: 22,226
Level
Gly
Message 37854 - Posted: 6 Sep 2014 | 20:11:00 UTC

Good to know. For a laptop I have a relatively fast crunching GPU, but even on "short" runs the length seems to vary.
I've had some take 8 hours, others closer to 20.
The longer ones are credited better, but if they push too long I won't be fast enough to get the bonus.

Currently testing with a NOELIA task estimated at 36 hours.
It's actually at 13% with 2 hours in, so if THAT holds steady, it should be done in 16 hours from arrival.
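
The estimate above is a straight-line extrapolation from the progress so far. A small Python sketch of the arithmetic, assuming progress stays linear for the rest of the run (which it may not), with a made-up function name:

```python
def estimated_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate total run time from progress so far (assumes linear progress)."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


total = estimated_total_hours(2.0, 0.13)   # 13% done after 2 hours
remaining = total - 2.0
print(f"estimated total: {total:.1f} h, remaining: {remaining:.1f} h")
```

With 13% done after 2 hours this gives a total of a little over 15 hours, which fits the roughly 16 hours from arrival quoted above once some time between download and start is allowed for.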

On the CPU end, which I use for other projects, the shorter buffer is helping collect a variety of tasks, but it still seems to want to complete shorter deadlines first, and let long deadline projects sit around for days.

The resource share setting seems useless in affecting this, so I'm suspending projects manually if they take too much CPU time
