Message boards : News : Long WUs are out - 50% bonus
ignasi
Message 19912 - Posted: 14 Dec 2010 | 19:05:15 UTC

Double-sized WUs are out: *variant*_long*. They carry a 50% bonus on the credits.

Saenger
Message 19913 - Posted: 14 Dec 2010 | 19:48:47 UTC - in response to Message 19912.

Double-sized WUs are out: *variant*_long*. They carry a 50% bonus on the credits.

Is there a possibility to opt out of those monsters, as they would probably miss the 2-day deadline on normal computers?
Or do we have to abort them manually? And how do we recognize them?
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Retvari Zoltan
Message 19916 - Posted: 15 Dec 2010 | 6:20:05 UTC - in response to Message 19912.
Last modified: 15 Dec 2010 | 6:22:25 UTC

The first one of these completed in 7 h 32 m (27,116.891 s).
Time per step (avg over 2,500,000 steps): 10.847 ms
Claimed credit: 23,878 (that's a 50% increase compared to a *_IBUCH_?_pYEEI_long_* WU)
Granted credit: 35,817 (standard 50% fast-return bonus)
That's 1.32 credits/sec, which is not bad at all :) My fastest GIANNI_DHFR1000 gives 1.5 credits/sec.
GPU usage is 62-64% on a GTX 580, and 62-67% on a GTX 480.

ftpd
Message 19917 - Posted: 15 Dec 2010 | 9:08:18 UTC
Last modified: 15 Dec 2010 | 9:10:09 UTC

10-IBUCH_8_variantP_long-0-2-RND0787_0
Workunit 2166428
Created 14 Dec 2010 19:08:59 UTC
Sent 14 Dec 2010 22:55:38 UTC
Received 15 Dec 2010 9:05:29 UTC
Server state Over
Outcome Success
Client state None
Exit status 0 (0x0)
Computer ID 35174
Report deadline 19 Dec 2010 22:55:38 UTC
Run time 35497.448383
CPU time 35436.44
stderr out <core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 1.40 GHz
# Total amount of global memory: 1610153984 bytes
# Number of multiprocessors: 15
# Number of cores: 120
SWAN: Using synchronization method 0
MDIO ERROR: cannot open file "restart.coor"
# Time per step (avg over 2500000 steps): 14.200 ms
# Approximate elapsed time for entire WU: 35500.370 s
called boinc_finish

</stderr_txt>
]]>


Validate state Valid
Claimed credit 23878.0787037037
Granted credit 35817.1180555555
application version ACEMD2: GPU molecular dynamics v6.13 (cuda31)

My first long WU!

Give me more please!
____________
Ton (ftpd) Netherlands

ignasi
Message 19919 - Posted: 15 Dec 2010 | 11:12:20 UTC - in response to Message 19917.

Well,

It seems that long WUs aren't, after all, that much of a gain for everybody. We have dropped by 1,000 WUs in progress in a single day. We may be pushing the computations too far here. The last thing we want is to scare people away from GPUGRID.

We are going to reconsider the strategy for "fast-track" WUs.

What do you think?

cheers,
ignasi

ftpd
Message 19920 - Posted: 15 Dec 2010 | 11:17:32 UTC - in response to Message 19919.
Last modified: 15 Dec 2010 | 11:18:36 UTC

Ignasi,

If I had to choose between very long WUs and small quick WUs, I'd take the quick ones, but then in a greater amount.

I am NOT scared, but please give some cards (GTS 250) something to crunch too.
____________
Ton (ftpd) Netherlands

Richard Haselgrove
Message 19921 - Posted: 15 Dec 2010 | 11:48:47 UTC - in response to Message 19919.

It seems that long WUs aren't, after all, that much of a gain for everybody. We have dropped by 1,000 WUs in progress in a single day. We may be pushing the computations too far here. The last thing we want is to scare people away from GPUGRID.

You may have to consider external factors, as well as your own internal choices.

Do you have a medium/long term record of that "WUs in progress" figure? I suspect that you may have been affected by that other big NVidia beast in the BOINC jungle - SETI@home.

They have been effectively out of action since the end of October. You may well have been benefitting from extra resources during that time. SETI has been (slowly and intermittently) getting back up to speed over the last week or so, and as they do so, you will inevitably lose some volunteers (or some share of their machines).

Having said that, I've got IBUCH_*_variantP_long running on all four hosts at the moment - I'll comment on your other questions when they've finished, in 5 - 38 hours from now.

ignasi
Message 19922 - Posted: 15 Dec 2010 | 11:53:41 UTC - in response to Message 19920.

Sure.

I am submitting non-long WUs at the moment.

There's always plenty of work for everybody.

i

skgiven (volunteer moderator)
Message 19923 - Posted: 15 Dec 2010 | 12:12:53 UTC - in response to Message 19922.

We have dropped by 1,000 WUs in progress in a single day.

'In progress' includes both running tasks and queued tasks.
As the running tasks take longer, fewer tasks sit in the queue; that's why fewer show as in progress. You will need to wait a few days for things to equilibrate.
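
To put toy numbers on it (a sketch, not project statistics): clients buffer work by estimated time, so when each task takes twice as long, every host queues roughly half as many tasks.

hosts = 3000               # hypothetical active hosts, not a real figure
buffer_days = 1.0          # work each client tries to keep queued

def in_progress(task_days):
    running = 1                        # one task crunching per GPU
    queued = buffer_days / task_days   # clients buffer by time, not by count
    return int(hosts * (running + queued))

print(in_progress(0.5))    # 9000 in progress with normal WUs
print(in_progress(1.0))    # 6000 with double-length WUs, same throughput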

Don't hit the panic button.

Retvari Zoltan
Message 19926 - Posted: 15 Dec 2010 | 13:47:39 UTC - in response to Message 19919.
Last modified: 15 Dec 2010 | 14:16:20 UTC

Well,

It seems that long WUs aren't, after all, that much of a gain for everybody. We have dropped by 1,000 WUs in progress in a single day. We may be pushing the computations too far here. The last thing we want is to scare people away from GPUGRID.

We are going to reconsider the strategy for "fast-track" WUs.

What do you think?

cheers,
ignasi

Ignasi, you don't read the other topics in your forum, do you? :)

I think you should separate the _long_ workunits from the normal WUs, or even create _short_ WUs for crunchers with older cards. If you can't develop an automated separation process, you should make it possible for the users to do it on their own (but not by aborting long WUs one by one manually). Some computers are equipped with multiple, very different cards, so this is a complicated problem.

The best solution would be limiting the running time of a WU, instead of using a fixed simulation timeframe (as sketched below). As far as I know, this is almost impossible for GPUGRID to implement. My other project (rosetta@home) gives the user the opportunity to set a desired WU running time; you should offer this option in some way or another. My computers are on 24/7, so I could (and would) do even a 100 ns simulation if it were up to me, and if the credit/time ratio were the same (or higher).

Another way to get around this problem: you could create a new project under BOINC for these _long_ WUs - let's say it's called FermiGRID - and encourage users with faster cards to join FermiGRID and set GPUGRID as a backup project. You should contact NVidia about the naming of the new project before it starts; maybe they'd consider it advertisement (and give you something in exchange), or maybe they'd consider it a copyright infringement and send their lawyers after you. :)
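
Sketch of the runtime-limit idea: a desired wall-clock budget translates directly into a per-host step count, because the app already measures a stable time per step. The numbers and the helper are invented for illustration; this is not GPUGRID's actual workunit generator.

def steps_for_target(target_hours, time_per_step_ms):
    # How many MD steps fit in the user's preferred wall-clock budget.
    budget_ms = target_hours * 3600.0 * 1000.0
    return int(budget_ms / time_per_step_ms)

print(steps_for_target(12, 14.2))   # ~3.04 million steps on a GTX 480
print(steps_for_target(12, 34.5))   # ~1.25 million steps on a slower card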

Richard Haselgrove
Message 19927 - Posted: 15 Dec 2010 | 14:15:39 UTC - in response to Message 19921.

...
Having said that, I've got IBUCH_*_variantP_long running on all four hosts at the moment - I'll comment on your other questions when they've finished, in 5 - 38 hours from now.

Well, one has now failed - task 3445790.

Exit status -40 (0xffffffffffffffd8)
SWAN: FATAL : swanBindToTexture1D failed -- texture not found

I hope that isn't a low memory outcome on a 512MB card - if it is, the others are going to go the same way.

And I hope the new shorter batch aren't all going to be *_HIVPR_n1_[un]bound_* - they are just as difficult on my cards.

ignasi
Message 19928 - Posted: 15 Dec 2010 | 14:16:00 UTC - in response to Message 19926.

@skgiven
Correct.
It's a matter of a few days, as you say.

@Retvari Zoltan
I do my best, thank you.

The *long* WUs are not meant to be something regular at all; we have always stated that.
The problem is that I have overused them in the rush to get key results back for publication. The solution to this issue is not just extending WU length. It has to be well thought out, and the degrees of freedom to adjust pinpointed. Classifying WUs by card compute capability is certainly an option.

Back to science,
i

Retvari Zoltan
Message 19930 - Posted: 15 Dec 2010 | 15:33:13 UTC - in response to Message 19928.
Last modified: 15 Dec 2010 | 15:37:28 UTC

I do my best, thank you.

I'm sorry, I didn't mean to offend you.

The *long* WUs are not meant to be something regular at all; we have always stated that.
The problem is that I have overused them in the rush to get key results back for publication.

We crunchers, on the other end of the project, see this problem from a very different viewpoint. When a WU fails, the cruncher is disappointed, and if many WUs keep failing, the cruncher will leave this project for a more successful one.
We don't see the progress of the subprojects you are working on, and cannot choose the appropriate subproject for our GPUs.

The solution to this issue is not just extending WU length. It has to be well thought out, and the degrees of freedom to adjust pinpointed. Classifying WUs by card compute capability is certainly an option.

I am (and I suppose every cruncher in this forum is) just guessing at the best solution for the project, because I don't have the information needed to pick (or invent) the right one. But you (I mean GPUGRID) have that information: the precise number of crunchers, and their WU return times. You just have to process that information wisely.

You should create a little application for simulating GPUGRID (if you don't have one already). This application should have some variables in it, for example WU length, subprojects, and processing reliability. If you play with those variables a little from time to time, you can choose the best thing to change in the whole project. You can even simulate a very different GPUGRID, to see if the new one is worth the hassle, just like when SETI was transformed to BOINC.
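
A toy version of such a simulator is only a few lines. The numbers below are invented placeholders; the point is just the structure, with WU length, host speed and failure rate as the knobs.

import random

def simulate(num_wus, steps, ms_per_step_range=(10.0, 35.0), fail_rate=0.05):
    # Crude Monte Carlo of return times (in hours) for one batch of WUs.
    returned = []
    for _ in range(num_wus):
        if random.random() < fail_rate:
            continue                                      # lost to an error
        ms_per_step = random.uniform(*ms_per_step_range)  # host speed varies
        returned.append(steps * ms_per_step / 1000.0 / 3600.0)
    return sorted(returned)

random.seed(1)
for steps in (1250000, 2500000):          # normal vs *_long* batch
    times = simulate(1000, steps)
    print(steps, "steps:", round(times[len(times) // 2], 1), "h median,",
          round(times[-1], 1), "h slowest")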

ftpd
Message 19933 - Posted: 15 Dec 2010 | 17:04:55 UTC - in response to Message 19922.

Ignasi,

My GTS 250 has NOT received any downloads for 24 hrs now.

All KASHIF WUs keep failing after several hours on this card.

Is that the reason?


____________
Ton (ftpd) Netherlands

dataman
Message 19935 - Posted: 15 Dec 2010 | 17:28:20 UTC

I just cruised by to say I'm lovin' the new WUs. ~21 hours on a GTX 260, but only ~55% utilization of the card. ??? Nice credits too :)

Good job.
____________

Saenger
Message 19936 - Posted: 15 Dec 2010 | 18:39:28 UTC - in response to Message 19928.

The problem is that I have overused them in the rush to get key results back for publication. The solution to this issue is not just extending WU length. It has to be well thought out, and the degrees of freedom to adjust pinpointed. Classifying WUs by card compute capability is certainly an option.

That's what I asked for in this thread.

At the moment, especially by sending such monsters to everyone participating, even normal crunchers who run their cards only 8 h per day and don't own the latest, most expensive cards, you are alienating those crunchers. I had to abort one of these monsters after 15 h because it would never have made the 48 h deadline; better to waste just 15 h than 48 h.

As you seem to know quite well beforehand how demanding your WUs will be (the fixed credits give a good clue to that), the adjustment could be to allow only certain types of WUs for each cruncher: for mine, usually nothing bigger than a 5,000-credit claim, or up to an 8,000-credit claim if nothing else is available (see the sketch below).

I think you could even do that in the scheduler. I'm no programmer, but it shouldn't be so hard to put the GPUs in our computers into classes and send WUs according to their capabilities. Those capabilities are known to BOINC.
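
As a rough sketch of that claim-limit rule, using my example numbers (the helper is hypothetical, nothing like the real scheduler):

def pick_wu(host_limit, fallback_limit, available_claims):
    # Prefer WUs within the host's usual claim limit; fall back if none fit.
    fits = [c for c in available_claims if c <= host_limit]
    if fits:
        return max(fits)
    fallback = [c for c in available_claims if c <= fallback_limit]
    return max(fallback) if fallback else None

print(pick_wu(5000, 8000, [3200, 4700, 7600, 23878]))  # -> 4700
print(pick_wu(5000, 8000, [7600, 23878]))              # -> 7600 (fallback)
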
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

ftpd
Message 19937 - Posted: 15 Dec 2010 | 18:57:35 UTC

Computer ID 47762
Report deadline 19 Dec 2010 23:11:00 UTC
Run time 63982.28125
CPU time 17871.09
stderr out <core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
# Using device 0
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
MDIO ERROR: cannot open file "restart.coor"
ERROR: file tclutil.cpp line 31: get_Dvec() element 0 (b)
called boinc_finish

</stderr_txt>
]]>


Validate state

This one failed after almost 18 hrs. Windows XP, GTX 295.

Next one, please
____________
Ton (ftpd) Netherlands

ftpd
Message 19939 - Posted: 15 Dec 2010 | 20:58:58 UTC

Computer ID 47762
Report deadline 19 Dec 2010 23:11:00 UTC
Run time 71377.171875
CPU time 18345.22
stderr out <core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
# Using device 1
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
MDIO ERROR: cannot open file "restart.coor"
# Using device 1
# There are 2 devices supporting CUDA
# Device 0: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939327488 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1.24 GHz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Time per step (avg over 65000 steps): 34.528 ms
# Approximate elapsed time for entire WU: 86320.913 s
called boinc_finish

</stderr_txt>
]]>


Validate state Valid
Claimed credit 23878.0787037037
Granted credit 35817.1180555555
application version ACEMD2: GPU molecular dynamics v6.13 (cuda31)

This one was successful.

Next one is processing!
____________
Ton (ftpd) Netherlands

Werkstatt
Message 19940 - Posted: 15 Dec 2010 | 21:11:52 UTC

Ignasi,
is there a way to sort out the long WUs with an app_info.xml? If, for example, the 6.13 app were only used for the shorter WUs and a (suggested) 6.14 app for the longer WUs, everyone could decide for himself which type he wants to crunch. And without an app_info he would get whatever is available.
Alexander

skgiven (volunteer moderator)
Message 19948 - Posted: 16 Dec 2010 | 10:44:57 UTC - in response to Message 19940.

Presently, tasks are allocated according to the CUDA capability your card has, which in turn is determined by your driver:

Use a full driver after 197.45 and you will get the 6.13 app to run tasks on.
Use 197.45 or earlier (down to 195) and you will use the 6.12 app to run tasks.

So if we extended the existing system with another app, we would still be asking crunchers to uninstall and reinstall drivers to crunch small tasks.
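
In pseudo-scheduler terms, the current rule is roughly this (a sketch of the logic described above, not the project's actual server code):

def pick_app(driver_version):
    # App selection keyed to the driver, as described above (sketch only).
    if driver_version > 197.45:     # newer drivers carry a newer CUDA runtime
        return "ACEMD2 v6.13 (cuda31)"
    if driver_version >= 195.0:     # 195 through 197.45
        return "ACEMD2 v6.12"
    return None                     # driver too old: no GPUGRID work

A separate short-task app would need yet another branch here, still keyed to the driver, which is why it would again mean driver juggling for crunchers.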

Some of the recent suggestions are starting to go round in circles; many of the suggestions we have seen in the last day or so have been suggested before, some several times.

The best place for suggestions is the Wish List.

Fred J. Verster
Message 19956 - Posted: 16 Dec 2010 | 14:53:35 UTC - in response to Message 19948.

No problems with the long WUs, but I was checking on my XP64 host because computing times got longer on 'the same' WU types. After I put a 470 next to the 480, GPU-Z reported the 470 running in PCI-E x2 mode and the 480 in x1 mode?! Which is of course slower, not x16, by about a factor of 1.5.

OK, (a bit) off topic, but very odd, and I haven't found a (BIOS) setting or a reason why it does this. Looks like it doesn't like NVIDIA cards!

I have 2 ASUS P5E mobos, one with a Q6600 and ATI HD4850 & HD5870 cards, both running PCI-E x16 ver 2.0. The other has an X9650 (@3.5 GHz) and a GTX 480 and a 470; the only difference is the OS type. When I try to force the cards into x16 mode in the BIOS, it suddenly stops, no warning or fault, it just 'hangs'. Maybe it's the X38 chipset, as all the cards are from ASUS as well.

But I do see an unusual number of computation errors on 200-series cards, also on FX (Quadro) cards. (Too much difference in architecture between the 200 and the 400/500 series?)

____________

Knight Who Says Ni N!

Saenger
Message 19958 - Posted: 16 Dec 2010 | 15:24:18 UTC - in response to Message 19948.

Presently, tasks are allocated according to the CUDA capability your card has, which in turn is determined by your driver:

My compute capability of 1.2 hasn't changed with the change of drivers, only the CUDA version number.
NVIDIA GPU 0: GeForce GT 240 (driver version unknown, CUDA version 3020, compute capability 1.2, 511MB, 257 GFLOPS peak)

If you don't use what you already know from a) our compute capabilities and b) the specific requirements of each and every WU to give the computers just those WUs that they are capable of, it's your active decision to waste computing power by giving demanding WUs to low-performing computers.

You know beforehand what will be wasted; you just don't care about it.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

dataman
Message 19959 - Posted: 16 Dec 2010 | 23:05:49 UTC

Have we run out of *long* WUs? I was getting about 1 in 4, but now they have stopped. Or has something else changed? I like them :)
____________

Richard Haselgrove
Message 19960 - Posted: 16 Dec 2010 | 23:10:34 UTC

Hmmmm. Task 3445649.

A run time of 162,790 seconds and a CPU time of 42,746 seconds is an awful lot of resources to throw at a "SWAN: FATAL : swanBindToTexture1D failed -- texture not found".

Any light to throw on this new error message yet? It's not been reported in the last 12 months, except by me - and that was on a completely different host.

Richard Haselgrove
Message 19961 - Posted: 16 Dec 2010 | 23:12:03 UTC - in response to Message 19959.

Have we run out of *long* WUs? I was getting about 1 in 4, but now they have stopped. Or has something else changed? I like them :)

You're welcome to my resend. Have fun with it ;-)

Richard Haselgrove
Message 19965 - Posted: 17 Dec 2010 | 0:22:21 UTC

Another of my hosts has just completed WU 2166545 - p19-IBUCH_10_variantP_long-0-2-RND8229.

It took 48.5 hours, so the forgone 'quick return' bonus cancelled out the 50% 'long' bonus. But, more importantly - and contrary to some complaints here - no second copy was created, and the work I did was 100% useful for the science.

skgiven (volunteer moderator)
Message 19966 - Posted: 17 Dec 2010 | 0:23:18 UTC - in response to Message 19961.

I would be more worried about the "No heartbeat from core client for 30 sec - exiting" message.

This bit has been seen in the past:
- exit code -40 (0xffffffd8)

To speculate, I think this means the error is project-related, rather than BOINC- or system-related; otherwise I think a zero would have been returned. Perhaps one of the scientists can confirm and elucidate this message:
"SWAN: FATAL : swanBindToTexture1D failed -- texture not found"

To fail after a 45 h runtime is not a good situation, hence the present lack of long tasks. It's been passed to a GTX 275.

Richard Haselgrove
Message 19967 - Posted: 17 Dec 2010 | 0:40:20 UTC - in response to Message 19966.

I would be more worried about the "No heartbeat from core client for 30 sec - exiting" message.

Twice, early in the task lifetime? No, I'm not worried about that. I did do this month's Windows security updates in the middle of this run (under manual supervision - I don't allow fully automatic updates), and had a couple of lockups afterwards. But I used the machine for some layout work earlier this evening, and it was running fine: and the error happened while the machine was otherwise idle, and I was monitoring the tasks remotely via BoincView as normal.

I'm more worried about "SWAN: FATAL : swanBindToTexture1D failed -- texture not found", which - like you - I suspect to be an application or task definition error: probably the latter, since a similar task has just finished successfully on an identical card to my first observation of that error message.

skgiven (volunteer moderator)
Message 19968 - Posted: 17 Dec 2010 | 0:43:51 UTC - in response to Message 19966.

A second task is not (normally) sent out until 48 h after the initial task was sent, so as yours returned in 48.5 h, I guess the server just did not get round to issuing a resend before your task came back. Your task would have been fully used even if another copy had been issued: yours would be the first back, and a resend in most cases would not start immediately, so a task returned slightly later than 48 h is still the most useful, and un-started resends can be recalled. After 3 or 4 days this is no longer the case; the resends would have completed. Sometimes the resends fail, but not too often, as they tend to go to reliable and faster cards. Which begs the question why this allocation method is not used from the start to pick long-task hosts.

Richard Haselgrove
Message 19969 - Posted: 17 Dec 2010 | 0:54:06 UTC - in response to Message 19968.

A second task is not (normally) sent out until 48 h after the initial task was sent, so as yours returned in 48.5 h, I guess the server just did not get round to issuing a resend before your task came back. Your task would have been fully used even if another copy had been issued: yours would be the first back, and a resend in most cases would not start immediately, so a task returned slightly later than 48 h is still the most useful, and un-started resends can be recalled. After 3 or 4 days this is no longer the case; the resends would have completed. Sometimes the resends fail, but not too often, as they tend to go to reliable and faster cards. Which begs the question why this allocation method is not used from the start to pick long-task hosts.

We know all this. My comment was primarily aimed at Saenger, who seemed to be under the impression that a new task would be created, allocated, downloaded, and run unconditionally at 48 hours and 1 second - thus wasting electricity and CPU cycles on the second host. I thought a counter-example might help to set his mind at rest.

Retvari Zoltan
Message 19972 - Posted: 17 Dec 2010 | 11:52:49 UTC - in response to Message 19969.

I've received two reissued IBUCH_*_variantP_long_ WUs.
Workunit 2166526
Workunit 2166431
I think these long WUs are waiting to be reissued; that's why we haven't received many of them in the past 48 hours.

Saenger
Message 19975 - Posted: 17 Dec 2010 | 14:54:41 UTC - in response to Message 19969.

A second task is not (normally) sent out until 48 h after the initial task was sent, so as yours returned in 48.5 h, I guess the server just did not get round to issuing a resend before your task came back. Your task would have been fully used even if another copy had been issued: yours would be the first back, and a resend in most cases would not start immediately, so a task returned slightly later than 48 h is still the most useful, and un-started resends can be recalled. After 3 or 4 days this is no longer the case; the resends would have completed. Sometimes the resends fail, but not too often, as they tend to go to reliable and faster cards. Which begs the question why this allocation method is not used from the start to pick long-task hosts.

We know all this. My comment was primarily aimed at Saenger, who seemed to be under the impression that a new task would be created, allocated, downloaded, and run unconditionally at 48 hours and 1 second - thus wasting electricity and CPU cycles on the second host. I thought a counter-example might help to set his mind at rest.

OK, so it's not 48 h but 49 or 50; so what?
The WU is in reality being ditched far before the communicated deadline of 4 days.

____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

skgiven (volunteer moderator)
Message 19976 - Posted: 17 Dec 2010 | 15:05:00 UTC - in response to Message 19972.
Last modified: 17 Dec 2010 | 15:09:17 UTC

http://www.gpugrid.net/workunit.php?wuid=2166431
A fine example of the reason not to send these to CC1.1 cards.

The resend could go out after 20 min or 5 h, but if it goes to a fast card, it will be back fairly quickly. Very few tasks returned after 4 days would be worth anything, but after 2 days and a few hours their value to the project is still high.

Saenger, from your earlier post: compute capability (CC) is fixed by the type of card. So a GT 240 will always be CC 1.2, the GTS 250 will always be CC 1.1, and a GTX 470 will always be CC 2.0, no matter which working NVidia driver is installed.

The CUDA version included in the drivers could be anything from 2.2 through 3.2, depending on the installed driver. It is the CUDA version supported by the drivers that determines which app can be run. Usually several drivers support the same CUDA version.

Retvari Zoltan
Message 19977 - Posted: 17 Dec 2010 | 18:35:58 UTC - in response to Message 19976.

http://www.gpugrid.net/workunit.php?wuid=2166431
A fine example of the reason not to send these to CC1.1 cards.

The resend could go out after 20 min or 5 h, but if it goes to a fast card, it will be back fairly quickly. Very few tasks returned after 4 days would be worth anything, but after 2 days and a few hours their value to the project is still high.

...and the other one I've mentioned (Workunit 2166526) is a fine example of the reason not to send these to hosts with low RAC.

Retvari Zoltan
Message 19978 - Posted: 17 Dec 2010 | 18:51:00 UTC - in response to Message 19977.

...and the other one I've mentioned (Workunit 2166526) is a fine example of the reason not to send these to hosts with low RAC.

Actually, the RAC is a very good and readily available basis for selecting hosts automatically for fast result returns. The only thing that has to be well balanced is that the "rush"-type workunits shouldn't drag a host's RAC below the selection level. I can't recall if it has been suggested before; it's so obvious. Is this as complicated to implement as the other ideas?
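
Expressed as code, with the balancing caveat built in as a hysteresis band (the numbers are arbitrary placeholders, not a proposal for actual thresholds):

LONG_WU_RAC_CUTOFF = 20000     # example threshold only
HYSTERESIS = 0.8               # keep hosts qualified until RAC falls well below

qualified = set()

def eligible_for_long(host_id, rac):
    # RAC-based selection that won't drop a host just because long WUs
    # temporarily depressed its RAC a little.
    if rac >= LONG_WU_RAC_CUTOFF:
        qualified.add(host_id)
    elif host_id in qualified and rac < LONG_WU_RAC_CUTOFF * HYSTERESIS:
        qualified.discard(host_id)
    return host_id in qualified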

skgiven (volunteer moderator)
Message 19979 - Posted: 17 Dec 2010 | 20:47:48 UTC - in response to Message 19978.

The task duration correction factor, found in Computer Details, may be key to the allocation of resends, and it has potential use as a method of determining which systems to send long tasks to, but I don't know how easy any server-side changes are to make, as I have never worked on a BOINC server. I do work on servers, though, so I can understand reluctance to move too far from the normal installation and setup. Linux tends to be about as useful as NT4 when it comes to system updates, program/service installations and drivers.

Richard Haselgrove
Message 19980 - Posted: 17 Dec 2010 | 20:59:52 UTC - in response to Message 19979.

I think TDCF is likely to be a most unhelpful measure of past performance - not least because I have a suspicion that the current duration estimates (as defined by <rsc_fpops_est>) don't adequately reflect the work complexity of different task types, and the efficiency of processing of the ones, like GIANNI, which take advantage of the extra processing routines in the newer applications.

And if the estimates vary, then the correction factors will vary too. TDCF will merely reflect the initial fpops_est error of the most recently run tasks.

skgiven (volunteer moderator)
Message 19991 - Posted: 18 Dec 2010 | 15:30:53 UTC - in response to Message 19980.

a most unhelpful measure of past performance

The newest system I added has a TDCF of 4.01 (GTX260 + GT240).
My quad GT240 system’s TDCF is 3.54.
The dual GTX470 system has a TDCF of 1.34.

I think it’s working reasonably well (thanks mainly to the restrictions the scientists normally impose upon themselves) but GPUGrid is somewhat vulnerable to changes in hardware, observed and estimated run-times, mixed CPU usages (swan_sync on/off, free CPU or not, external CPU project usages), and changes in the app (via driver changes).

Richard Haselgrove
Message 19992 - Posted: 18 Dec 2010 | 15:40:42 UTC - in response to Message 19991.

s0r78-TONI_HIVMSMWO1-0-6-RND2382_1: 9,930.70 seconds
p32-IBUCH_1_pYEEI_long_101130-8-10-RND3283_1: 45,855.31 seconds

Same host, same card. Both jobs were issued with the same
<rsc_fpops_est>1000000000000000.000000</rsc_fpops_est>

TDCF will never be able to cope fully with an almost five-fold variation between consecutive tasks.
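
To see why, feed those two runtimes into a simplified duration-correction rule. The fast-up/slow-down shape approximates what the BOINC client does, but the constants and the 20,000 s initial estimate here are assumptions for illustration:

def update_dcf(dcf, estimated_s, actual_s):
    # Fast-up / slow-down correction, as the BOINC client approximately does.
    ratio = actual_s / estimated_s
    if ratio > dcf:
        return ratio                    # overruns correct the estimate at once
    return dcf + 0.1 * (ratio - dcf)    # underruns pull it down only slowly

dcf, estimate = 1.0, 20000.0            # assumed initial estimate of 20,000 s
for actual in (45855.31, 9930.70, 45855.31, 9930.70):
    dcf = update_dcf(dcf, estimate, actual)
    print(round(dcf, 2))                # 2.29, 2.11, 2.29, 2.11 -- never settles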

ignasi
Message 19999 - Posted: 21 Dec 2010 | 14:52:51 UTC - in response to Message 19960.
Last modified: 21 Dec 2010 | 14:53:03 UTC

Hmmmm. Task 3445649.

A run time of 162,790 seconds and a CPU time of 42,746 seconds is an awful lot of resources to throw at a "SWAN: FATAL : swanBindToTexture1D failed -- texture not found".

Any light to throw on this new error message yet? It's not been reported in the last 12 months, except by me - and that was on a completely different host.


Can you report that in "Number crunching" please?

thanks

Beyond
Message 20023 - Posted: 24 Dec 2010 | 16:28:40 UTC - in response to Message 19921.

It seems that long WUs aren't, after all, that much of a gain for everybody. We have dropped by 1,000 WUs in progress in a single day. We may be pushing the computations too far here. The last thing we want is to scare people away from GPUGRID.

You may have to consider external factors, as well as your own internal choices.

Do you have a medium/long term record of that "WUs in progress" figure? I suspect that you may have been affected by that other big NVidia beast in the BOINC jungle - SETI@home.

They have been effectively out of action since the end of October. You may well have been benefitting from extra resources during that time. SETI has been (slowly and intermittently) getting back up to speed over the last week or so, and as they do so, you will inevitably lose some volunteers (or some share of their machines).

Having said that, I've got IBUCH_*_variantP_long running on all four hosts at the moment - I'll comment on your other questions when they've finished, in 5 - 38 hours from now.

I would suspect another reason is the new GPU app at PrimeGrid. Fermi cards are the performance leaders. No babysitting required. Credit is high, probably too high IMO. WUs are short. Most importantly, the project admins listen to the users and are very proactive about solving problems and considering user suggestions.

wiyosaya
Message 20027 - Posted: 24 Dec 2010 | 23:55:20 UTC - in response to Message 19919.

Well,

It seems that long WUs aren't, after all, that much of a gain for everybody. We have dropped by 1,000 WUs in progress in a single day. We may be pushing the computations too far here. The last thing we want is to scare people away from GPUGRID.

We are going to reconsider the strategy for "fast-track" WUs.

What do you think?

cheers,
ignasi

Since I have an older card, an 8800 GT, I would prefer short WUs over long ones. At this very moment, my 8800 GT is crunching a WU that looks like it will take 36 hours of run time, and the WU is not flagged "long". There's no way I can shut down my PC with this one and still earn the "quick return" credit.


____________
