Message boards : Number crunching : ADRIA_2OV5_CONF_CLOSED

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 44260 - Posted: 23 Aug 2016 | 8:37:20 UTC
Last modified: 23 Aug 2016 | 8:51:37 UTC

These WUs error out immediately with:

ERROR: file mdioload.cpp line 229: Error reading parmtop file

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 44262 - Posted: 23 Aug 2016 | 9:24:01 UTC - in response to Message 44260.

Yes, I got two of them in the last hour, and both errored out after 2 seconds on a GTX 960 (Ubuntu 16.04).

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Message 44264 - Posted: 23 Aug 2016 | 11:14:31 UTC

That's a bad batch, all workunits from this batch have failed on my hosts too.

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 44270 - Posted: 23 Aug 2016 | 15:49:09 UTC - in response to Message 44264.

That's a bad batch, all workunits from this batch have failed on my hosts too.

I've had 9 of these fail in the first few seconds but this one is still running after 7 hours:

https://www.gpugrid.net/workunit.php?wuid=11700315

It's the only one preceded by a 0 (0-ADRIA_2OV5_CONF_CLOSED). The others are all preceded by a 1, 2 or 3.
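
For anyone who wants to tally their own failures by that leading number, here is a minimal Python sketch. It assumes the number sits directly before "-ADRIA_" in the task name, as in task names such as e1s34_2-ADRIA_2OV5_CONF_OPEN2-0-1-RND3144_4. The helper name step_prefix and the third sample name are made up for illustration and are not taken from the project.

```python
import re
from collections import Counter
from typing import Optional

# Assumption: the number of interest is the one right before "-ADRIA_".
PREFIX_RE = re.compile(r"(\d+)-ADRIA_")


def step_prefix(task_name: str) -> Optional[int]:
    """Return the number immediately preceding "-ADRIA_", or None if absent."""
    match = PREFIX_RE.search(task_name)
    return int(match.group(1)) if match else None


task_names = [
    "e1s34_2-ADRIA_2OV5_CONF_OPEN2-0-1-RND3144_4",
    "e1s35_2-ADRIA_2OV5_CONF_OPEN1-0-1-RND7299_3",
    "e1s1_0-ADRIA_2OV5_CONF_CLOSED-0-1-RND0000_0",  # hypothetical name
]

# Count how many tasks carry each leading number.
counts = Counter(step_prefix(name) for name in task_names)
print(counts)
```

Pairing each count with the task outcome (failed or still running) would show whether the failures really cluster on the non-zero prefixes.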

Erich56
Joined: 1 Jan 15
Posts: 1120
Credit: 8,874,495,176
RAC: 33,547,015
Message 44281 - Posted: 24 Aug 2016 | 5:54:44 UTC

the bad thing with this problem is that once 3 such tasks have been downloaded (and failed after a few seconds), the host will not be able to download any other tasks within the next 24 hours :-(
(BOINC notice: "the computer has finished a daily quota of 3 tasks")

caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Message 44284 - Posted: 24 Aug 2016 | 7:56:19 UTC - in response to Message 44281.

the bad thing with this problem is that once 3 such tasks have been downloaded (and failed after a few seconds), the host will not be able to download any other tasks within the next 24 hours :-(
(BOINC notice: "the computer has finished a daily quota of 3 tasks")

The computer you are referring to has a maximum of 3 tasks per day for that type of task. Not all computers have that tight a restriction. The restriction is based on factors I am not aware of, but there are rules about that somewhere. Maybe it depends on how many tasks the host can complete successfully in a row? Maybe on its ratio of failures to successes? Maybe it is lowered after recent failures and raised over time when there are none? I really don't know. But here is the link to the host computer you are referring to and its max tasks per day for each type of task.
https://www.gpugrid.net/host_app_versions.php?hostid=205584
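
To make the last guess concrete, here is a toy Python model of a quota that shrinks on failures and grows on successes. Everything in it is an assumption for illustration: the function names, the "minus one on failure, double on success" rule, and the cap are not confirmed as GPUGRID's or BOINC's actual settings.

```python
from typing import List


def update_quota(quota: int, succeeded: bool, max_quota: int) -> int:
    """Assumed rule: double on success (capped), subtract one on failure (floor of 1)."""
    if succeeded:
        return min(quota * 2, max_quota)
    return max(quota - 1, 1)


def simulate(outcomes: List[bool], start: int, max_quota: int) -> List[int]:
    """Return the quota after each task outcome, in order."""
    quota = start
    history = []
    for ok in outcomes:
        quota = update_quota(quota, ok, max_quota)
        history.append(quota)
    return history


# Five immediate failures starting from a hypothetical quota of 4.
print(simulate([False] * 5, start=4, max_quota=50))
# Two successes starting from the floor of 1.
print(simulate([True, True], start=1, max_quota=50))
```

Under a rule like this, a bad batch would push a host's daily limit down quickly, and a few good results would bring it back up.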

If you go to the Your Account page and click View next to Computers on this account, then click Details for each computer, then Show next to Application details, you can see the max tasks allowed per task type on each of your hosts.

If someone would like to fill us in on the details of the rules that make this max go up and down, please let us/Erich56 know here:
http://www.gpugrid.net/forum_thread.php?id=4360
Thank you.
____________
1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!"
Ephesians 6:18-20, please ;-)
http://tbc-pa.org

Erich56
Joined: 1 Jan 15
Posts: 1120
Credit: 8,874,495,176
RAC: 33,547,015
Message 44290 - Posted: 24 Aug 2016 | 18:48:43 UTC
Last modified: 24 Aug 2016 | 18:57:27 UTC

Again, this afternoon, several ADRIA WUs (2OV5_CONF_OPEN) crashed after a few seconds. This time on a different host, the one with the GTX970. Good that here the number of allowed daily tasks is higher than on the host with the 750Ti, so no further downloads were blocked.
However, still the question is: what's wrong with these ADRIA tasks?

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 44293 - Posted: 25 Aug 2016 | 3:11:24 UTC - in response to Message 44290.

Again, this afternoon, several ADRIA WUs (2OV5_CONF_OPEN) crashed after a few seconds. This time on a different host, the one with the GTX970. Good that here the number of allowed daily tasks is higher than on the host with the 750Ti, so no further downloads were blocked.
However, still the question is: what's wrong with these ADRIA tasks?

They're most likely improperly defined. Unfortunately the admins just let them run until they have too many errors instead of canceling them.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 10,380,648,466
RAC: 15,214,776
Message 44294 - Posted: 25 Aug 2016 | 3:40:44 UTC - in response to Message 44293.

Again, this afternoon, several ADRIA WUs (2OV5_CONF_OPEN) crashed after a few seconds. This time on a different host, the one with the GTX970. Good that here the number of allowed daily tasks is higher than on the host with the 750Ti, so no further downloads were blocked.
However, still the question is: what's wrong with these ADRIA tasks?

They're most likely improperly defined. Unfortunately the admins just let them run until they have too many errors instead of canceling them.


I had a couple of the "OPEN" tasks crash on me too.

e1s34_2-ADRIA_2OV5_CONF_OPEN2-0-1-RND3144_4 11706928 25 Aug 2016 | 3:29:47 UTC 25 Aug 2016 | 3:34:52 UTC Error while computing 2.19 0.16 --- Long runs (8-12 hours on fastest card) v8.48 (cuda65)

e1s35_2-ADRIA_2OV5_CONF_OPEN1-0-1-RND7299_3 11706879 25 Aug 2016 | 2:27:46 UTC 25 Aug 2016 | 3:34:52 UTC Error while computing 2.06 0.11 --- Long runs (8-12 hours on fastest card) v8.48 (cuda65)

They should be canceled.


Erich56
Joined: 1 Jan 15
Posts: 1120
Credit: 8,874,495,176
RAC: 33,547,015
Message 44297 - Posted: 26 Aug 2016 | 5:41:04 UTC
Last modified: 26 Aug 2016 | 5:41:18 UTC

my hosts are still downloading these WUs, and - as usual - they error out after a few seconds.
Why are they not removed from the batch?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 44304 - Posted: 26 Aug 2016 | 20:16:54 UTC - in response to Message 44297.

Looks like the batch was cancelled early today:

Exit status 202 (0xca) EXIT_ABORTED_BY_PROJECT

https://www.gpugrid.net/workunit.php?wuid=11700325
Too many errors (may have bug) WU cancelled

https://www.gpugrid.net/workunit.php?wuid=11700315
WU cancelled

Some of these tasks actually completed, but many didn't run; they failed to load due to a file read error. Some are still in progress.
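
If you are scanning your own task logs for these, a small Python helper like the one below can print an exit status in the same "decimal (hex) name" form shown above. It is only a sketch: the function name and the lookup table are made up here, and the table holds just the one code quoted in this thread.

```python
# Only the code quoted in this thread is listed; any other status prints as UNKNOWN.
KNOWN_EXIT_CODES = {202: "EXIT_ABORTED_BY_PROJECT"}


def describe_exit_status(status: int) -> str:
    """Format an exit status as decimal, lowercase hex, and a name if known."""
    name = KNOWN_EXIT_CODES.get(status, "UNKNOWN")
    return f"Exit status {status} (0x{status:x}) {name}"


print(describe_exit_status(202))
```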

____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 44308 - Posted: 27 Aug 2016 | 1:47:25 UTC - in response to Message 44304.

Looks like the batch was cancelled early today:

https://www.gpugrid.net/workunit.php?wuid=11700315
WU cancelled

Some of these tasks actually completed, but many didn't run; they failed to load due to a file read error. Some are still in progress.

One of my boxes completed the WU above successfully. Now it's cancelled? Huh?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 44310 - Posted: 27 Aug 2016 | 10:35:32 UTC - in response to Message 44308.
Last modified: 27 Aug 2016 | 10:35:46 UTC

My guess is that the whole batch needs to be built properly for it to be worth much. If too many WU's are duds then I guess the run isn't much use. I would say, "at least you got 'credit' for your effort" but it's like painting a wall and then having to paint it a different colour, and you pay for the paint!

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 44313 - Posted: 27 Aug 2016 | 11:59:23 UTC - in response to Message 44310.

it's like painting a wall and then having to paint it a different colour, and you pay for the paint!

Funny :-) Lately it's been: paint the wall, "No that's not fast enough. PAINT IT AGAIN."

caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Message 44318 - Posted: 27 Aug 2016 | 14:12:50 UTC

https://www.gpugrid.net/server_status.php
Shows the OPEN ones have been completely removed from the system.

Erich56
Joined: 1 Jan 15
Posts: 1120
Credit: 8,874,495,176
RAC: 33,547,015
Message 44324 - Posted: 29 Aug 2016 | 19:28:48 UTC

one of my hosts once more received an ADRIA WU a few hours ago, and that one also failed after a few seconds.

Obviously, they have not been removed completely.

caffeineyellow5
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Message 44338 - Posted: 30 Aug 2016 | 7:16:44 UTC

They were, then added again. :-(
