Credit where Credit is due ...

Message boards : Graphics cards (GPUs) : Credit where Credit is due ...

Author	Message
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0	Message 16619 - Posted: 29 Apr 2010 \| 5:07:28 UTC
	Here goes Paul, tilting at windmills again ... This Task Set (or whatever terminology UCB is using this week) had a 100% failure rate on all that attempted it ... now, one of my favorite projects pays even for failures because the failures are equally interesting to the project ... I don't know how often this happens, but to me, this is a clear case where the project should also be paying (in my opinion) because we made the good faith effort to produce a result for a flawed task ... we can debate the pay rate ... but my point is, it was not the fault of the people who did the work that you asked them to do something that was not possible ... I know, I know, cobblestones are worthless ... well, if they are ... what is the beef with paying them?
	ID: 16619 \| Rating: 0

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0	Message 16620 - Posted: 29 Apr 2010 \| 7:53:05 UTC - in response to Message 16619. Last modified: 29 Apr 2010 \| 7:53:27 UTC
	Sorry Paul - is the "other" project which is paying for failures BOINC-based?
	ID: 16620 \| Rating: 0

MarkJ Volunteer moderator Volunteer tester Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0	Message 16624 - Posted: 29 Apr 2010 \| 12:23:37 UTC - in response to Message 16620.
	Quote (Toni): Sorry Paul - is the "other" project which is paying for failures BOINC-based?
	Well, CPDN does and it's BOINC-based, but it depends on trickles to get there. Given that GPUgrid doesn't use trickles, that might present an issue (or an opportunity - you could have larger WUs using trickles).
	____________ BOINC blog
	ID: 16624 \| Rating: 0

Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0	Message 16626 - Posted: 29 Apr 2010 \| 14:59:09 UTC - in response to Message 16624.
	Quote (Toni): Sorry Paul - is the "other" project which is paying for failures BOINC-based?
	Quote (MarkJ): Well, CPDN does and it's BOINC-based, but it depends on trickles to get there. Given that GPUgrid doesn't use trickles, that might present an issue (or an opportunity - you could have larger WUs using trickles).
	As MarkJ noted, one of the projects is CPDN ... I was also thinking of WCG, where the new sub-project DDDT2 has some molecules that "blow up" when modeled, and the application reports a failure. They just went from a 5 to a 3 failure test (3 replications vice 5), and they pay because the failure of the application to solve essentially demonstrates that that line of research is a dead end ... in this case they do a post-award after all the failed results are in ... When they have a suite of "successful" tasks, they create the next generation of tasks, which is larger and builds on the ones that "worked".
	Back to CPDN: their tasks run for hundreds of hours, of course, which is far more than GPU Grid and WCG, but the principle is the same ... if the task has a likelihood of failure, do not punish the willing for a failure of the project ...
	ID: 16626 \| Rating: 0

ExtraTerrestrial Apes Volunteer moderator Volunteer tester Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0	Message 16635 - Posted: 29 Apr 2010 \| 18:53:20 UTC
	That would mean something along the lines of "If enough people can't run a WU, there must be something wrong project-wise, so pay them at least something"? Sounds fair enough, if it doesn't somehow lead to cheating. I.e. program your own app which
	- checks other results of a WU
	- if it finds an error, returns the same error and a certain runtime
	- if it doesn't find a previous error, generates a probable one and a certain runtime
	MrS
	____________ Scanning for our furry friends since Jan 2002
	ID: 16635 \| Rating: 0

Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0	Message 16638 - Posted: 29 Apr 2010 \| 19:34:27 UTC - in response to Message 16635. Last modified: 29 Apr 2010 \| 19:38:30 UTC
	Quote (ExtraTerrestrial Apes): That would mean something along the lines of "If enough people can't run a WU, there must be something wrong project-wise, so pay them at least something"? Sounds fair enough, if it doesn't somehow lead to cheating. I.e. program your own app which - checks other results of a WU - if it finds an error, returns the same error and a certain runtime - if it doesn't find a previous error, generates a probable one and a certain runtime
	I will grant that there is that small class of users who are crass enough that they would invest the effort ... the problem is that instead of looking for those people, we choose to punish (in effect) those who are sincerely trying to help and, through no fault of their own, can't ...
	Now, on a more practical matter, few tasks fail like this on any of the projects. On CPDN you have to have part of the work done and return the trickles; on WCG I am not sure, but I think they have other sanity checks as well ... I am not sure whether a failed task here returns partly completed files or not ... but the point is that the person would have to get one of the rare ones, check to find it is failing, and then forge a satisfactory result ... quite a feat ... easier just to run the damn thing, I would think ... Besides, he/she would have to be the last of the loop to know that it was worthwhile to attempt to forge the failure ... Again, quite a feat of arms, as it were ...
	{edit-add} BTW, I will point out that the 5-12 hour run times on a GPU are equivalent to 300-1,200 hours on the CPU, well above CPDN's run time equivalency at the current stage of their models ... so we are in the same general vicinity of total calculations done in a model ... the GPUs just do it so much faster that we lose sight of the massive amount of work that is actually being accomplished ... as I have noted elsewhere and elsewhen, an MW task takes hours to run on a CPU core, but I am running them off in less than 2 minutes on my GPUs ...
	ID: 16638 \| Rating: 0
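
The GPU-versus-CPU figures in the post above imply a rough speedup factor. A minimal sketch of that arithmetic follows; pairing the low ends (5 h GPU with 300 h CPU) and the high ends (12 h with 1,200 h) is an assumption, since the post gives only the two ranges.

```python
# Run-time ranges quoted in the post, in hours.
gpu_hours = (5, 12)
cpu_hours = (300, 1200)

# Assumption: the low ends correspond to each other, and so do the high ends.
speedups = [cpu / gpu for gpu, cpu in zip(gpu_hours, cpu_hours)]

for gpu, cpu, factor in zip(gpu_hours, cpu_hours, speedups):
    print(f"{gpu} h on GPU ~ {cpu} h on CPU: about {factor:.0f}x")
```

Under that pairing, the quoted numbers correspond to a GPU doing roughly 60 to 100 times the work per hour of a CPU core.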

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0	Message 16664 - Posted: 30 Apr 2010 \| 7:55:26 UTC - in response to Message 16638. Last modified: 30 Apr 2010 \| 10:45:28 UTC
	Cheating is a concern, in fact, but so, IMHO, are random errors due to overclocking/driver issues. Up to now, the vast majority of the errors that we see are due to misconfigured hosts rather than WU mistakes. Incidentally, that's the reason why we can't reliably (i.e. automatically) figure out "erroneous" WUs right away.
	ID: 16664 \| Rating: 0

Snow Crash Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0	Message 16674 - Posted: 30 Apr 2010 \| 13:14:16 UTC
	How about awarding credits for failures only after a WU reaches the "too many failures" status?
	____________ Thanks - Steve
	ID: 16674 \| Rating: 0
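
Snow Crash's suggestion could be sketched roughly as follows. This is only an illustration, not GPUGRID's or BOINC's actual code: the `Result` record, the `post_award_credits` function, the threshold of 5 and the flat hourly rate are all hypothetical. It also assumes one result per host per WU, and that any successful result means the WU itself was fine.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One returned result for a work unit (hypothetical record)."""
    host_id: int
    succeeded: bool
    runtime_hours: float


def post_award_credits(results, max_errors=5, rate_per_hour=10.0):
    """Return {host_id: credit} for failed results of one WU.

    Nothing is awarded until the WU has reached the "too many failures"
    threshold, and nothing is awarded if any host completed it.
    """
    failures = [r for r in results if not r.succeeded]
    if len(failures) < max_errors:
        return {}  # WU has not been written off yet
    if any(r.succeeded for r in results):
        return {}  # someone finished it, so the WU was not impossible
    return {r.host_id: r.runtime_hours * rate_per_hour for r in failures}


# Five hosts each spent 2 hours before failing.
failed = [Result(host_id=i, succeeded=False, runtime_hours=2.0) for i in range(5)]
awards = post_award_credits(failed)
```

Because the award is computed only after the last failure is in, a host cannot know in advance whether a forged failure would pay, which is the point Paul makes about having to be "the last of the loop".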

Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0	Message 16675 - Posted: 30 Apr 2010 \| 14:27:48 UTC - in response to Message 16664.
	Quote (Toni): Cheating is a concern, in fact, but so, IMHO, are random errors due to overclocking/driver issues. Up to now, the vast majority of the errors that we see are due to misconfigured hosts rather than WU mistakes. Incidentally, that's the reason why we can't reliably (i.e. automatically) figure out "erroneous" WUs right away.
	I would think you would be able to correlate the reliability index of the participating computers as part of the process ... at any rate,
	- this is a suggestion to consider ...
	- it does not have to be automatic ...
	- it should, as I originally suggested (I thought), and as Snow Crash suggested, be after "too many failures" is reached ...
	- cheating and mis-configuration are legitimate concerns of the project ...
	- tasks that are impossible to process are equally legitimate concerns of the participant ...
	- we do best when cooperation is the word of the day, and everyone's concerns are considered ...
	Thanks for taking the time to think about this ... :)
	ID: 16675 \| Rating: 0
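
One way to read the "reliability index" idea in the post above is sketched below. It is a guess at what such a check might look like, not something the project is known to do: the function name, the per-host success-rate table and both thresholds are hypothetical.

```python
def likely_bad_workunit(failing_hosts, host_success_rate,
                        min_rate=0.95, min_reliable=3):
    """Flag a WU as probably faulty if enough normally reliable hosts failed it.

    failing_hosts: ids of hosts whose result for this WU was an error.
    host_success_rate: {host_id: fraction of past tasks completed OK}.
    Hosts with no history count as unreliable (rate 0.0).
    """
    reliable = [h for h in failing_hosts
                if host_success_rate.get(h, 0.0) >= min_rate]
    return len(reliable) >= min_reliable


# Hosts 1-3 almost never fail; host 4 is an unstable overclocked machine.
rates = {1: 0.99, 2: 0.98, 3: 0.97, 4: 0.40}
flagged = likely_bad_workunit([1, 2, 3, 4], rates)
not_flagged = likely_bad_workunit([1, 4], rates)
```

A check like this only produces candidates for review, so it fits the point that the decision "does not have to be automatic".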

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 16678 - Posted: 30 Apr 2010 \| 14:39:52 UTC - in response to Message 16674.
	Perhaps it could be done for Betas; the limited Beta numbers would deter sneaky programmers from building apps to steal points. Betas are usually the tasks that fail the most because the task/app has a problem. The normal tasks, as said, usually fail as a result of other problems (system stability, other programs [especially games], bad configurations, OC, BOINC problems, hard restarts, summer heat, failing cards, hardware limitations [RAM or drive space], network issues...). So rewarding people for messing up their system configuration is counterproductive - better to encourage stability and give help in the forum.
	If a normal batch of tasks starts failing on several users' systems, perhaps points could be awarded for time spent, on the basis that the tasks should have been put through Betas!
	My main concern is that this would take up too much time for the scientists. If they had to spend 2 h a week awarding points, then over a year that's 100 hours they could have been spending developing faster apps, which in turn would result in more points anyway! Oh, and more science ;) What do you want, a few more points now, or your ATI cards to work in 2 months rather than 6 months?
	ID: 16678 \| Rating: 0

Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0	Message 16687 - Posted: 30 Apr 2010 \| 19:00:08 UTC - in response to Message 16678.
	You posit this as an either/or, to the exclusion of both ... This one instance is an example ... I could also have, and probably should have, pointed back to the many runs where the use of incorrect parameters or other problems caused the users significant issues and many failed tasks ... less common for the moment ... but who knows when that might return?
	The other point is that the whole reason for issuing a task multiple times is to detect these issues ... but a task that fails on 5 different systems is not likely a task that is failing for the reasons you posit ... I mean, I cannot choose the tasks to run; the only way I can do that is to abort lots of tasks and check each one on the off-chance I can find myself as the 5th of the group that has already failed 4 times before ... but in doing that I run my total tasks per day into the toilet as well ...
	Sorry, but everyone seems to be looking for even the slimmest excuse to not think about doing something like this, on the off-chance that one person somewhere, some time, might get something they don't deserve, instead of thinking of all of those who are putting in hours of compute time for no pay, and that is a far larger group ... Or are you saying that the vast majority of participants are cheaters? Heck, even the OC crowd is fairly small ... most of the people I know don't OC; it is not all that common because it is hard to do right and easy to do wrong (which means lots of bad results) ...
	ID: 16687 \| Rating: 0

skgiven Volunteer moderator Volunteer tester Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0	Message 16692 - Posted: 30 Apr 2010 \| 20:19:50 UTC - in response to Message 16687.
	I think that is a slightly limited take on what I said. I don't like the idea of the scientists doing unnecessary work, especially if it takes away from the science, and as I explained, we could all end up getting fewer points as a result of limiting development of apps for both NVidia and ATI cards!
	ID: 16692 \| Rating: 0
