Message boards : News : More tasks: MDAD*

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 54797 - Posted: 21 May 2020 | 12:47:33 UTC
Last modified: 21 May 2020 | 13:23:42 UTC

I'm filling up the task queue again; these are called MDAD plus a suffix.
Happy crunching!

T

Killersocke
Joined: 18 Oct 13
Posts: 51
Credit: 333,404,147
RAC: 53,704
Level
Asp
Message 54799 - Posted: 21 May 2020 | 13:21:28 UTC - in response to Message 54797.

thx
you make the user happy :-)

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 119,735,355
RAC: 438,195
Level
Cys
Message 54801 - Posted: 21 May 2020 | 13:31:47 UTC

Thanks,
but the certs are not OK:



05/21/20 15:26:56 | GPUGRID | [http] HTTP error: SSL peer certificate or SSH remote key was not OK
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (IN), TLS change cipher, Client hello (1):
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (IN), TLS handshake, Finished (20):
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: ALPN, server did not agree to a protocol
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: Server certificate:
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: subject: CN=www.ps3grid.net
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: start date: May 3 10:33:30 2020 GMT
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: expire date: Aug 1 10:33:30 2020 GMT
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: subjectAltName does not match www.gpugrid.org
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: SSL: no alternative certificate subject name matches target host name 'www.gpugrid.org'
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: Closing connection 133
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
05/21/20 15:26:56 | GPUGRID | [http] HTTP error: SSL peer certificate or SSH remote key was not OK
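
For anyone puzzling over the log above: the server presented a certificate issued for www.ps3grid.net while the client asked for www.gpugrid.org, so the name check failed. Below is a minimal Python sketch of that kind of check. The function name `hostname_matches` is made up for illustration, and the matching rule is a simplified assumption, not the code that BOINC or its HTTP library actually runs.

```python
def hostname_matches(cert_names, host):
    """Return True if host matches any name listed in the certificate.

    Simplified rule (an assumption for illustration): names compare
    case-insensitively, and a leading "*" label matches exactly one label.
    """
    host_labels = host.lower().rstrip(".").split(".")
    for name in cert_names:
        name_labels = name.lower().rstrip(".").split(".")
        if len(name_labels) != len(host_labels):
            continue
        if name_labels[0] == "*" and len(name_labels) > 2:
            # The wildcard covers only the left-most label.
            if name_labels[1:] == host_labels[1:]:
                return True
        elif name_labels == host_labels:
            return True
    return False


# Names taken from the log: certificate subject vs. requested host.
print(hostname_matches(["www.ps3grid.net"], "www.gpugrid.org"))
```

With the two names from the log the check fails, which is consistent with the "no alternative certificate subject name matches target host name" line.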

Erich56
Joined: 1 Jan 15
Posts: 695
Credit: 3,280,645,583
RAC: 550,395
Level
Arg
Message 54802 - Posted: 21 May 2020 | 13:35:59 UTC

I, too, can't download anything:


21.05.2020 15:26:41 | | Project communication failed: attempting access to reference site
21.05.2020 15:26:42 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:27:09 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc
21.05.2020 15:27:10 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc: transient HTTP error
21.05.2020 15:27:10 | GPUGRID | Backing off 00:11:44 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc
21.05.2020 15:27:11 | | Project communication failed: attempting access to reference site
21.05.2020 15:27:12 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:28:25 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file
21.05.2020 15:28:26 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file: transient HTTP error
21.05.2020 15:28:26 | GPUGRID | Backing off 00:07:54 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file
21.05.2020 15:28:27 | | Project communication failed: attempting access to reference site
21.05.2020 15:28:28 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:28:28 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file
21.05.2020 15:28:29 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file: transient HTTP error
21.05.2020 15:28:29 | GPUGRID | Backing off 00:05:36 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file
21.05.2020 15:28:30 | | Project communication failed: attempting access to reference site
21.05.2020 15:28:31 | | Internet access OK - project servers may be temporarily down.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 54803 - Posted: 21 May 2020 | 13:38:24 UTC - in response to Message 54801.

Try again please

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 119,735,355
RAC: 438,195
Level
Cys
Message 54805 - Posted: 21 May 2020 | 13:48:42 UTC

Did a reset of the project.
Now I got 2 tasks downloaded without any problems.


Thanks!

[CSF] Thomas H.V. Dupont
Joined: 20 Jul 14
Posts: 727
Credit: 96,837,728
RAC: 19,130
Level
Thr
Message 54806 - Posted: 21 May 2020 | 14:02:19 UTC

Thanks Toni!
____________
[CSF] Thomas H.V. Dupont
Founder of the team CRUNCHERS SANS FRONTIERES
www.crunchersansfrontieres.org

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54809 - Posted: 21 May 2020 | 14:08:51 UTC

See note in HTTPS thread, message 54807

Windows downloading OK, Linux failing. Users - please mention which OS you are using when reporting.

Erich56
Joined: 1 Jan 15
Posts: 695
Credit: 3,280,645,583
RAC: 550,395
Level
Arg
Message 54812 - Posted: 21 May 2020 | 14:20:21 UTC

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.

thimios
Joined: 10 Jan 09
Posts: 5
Credit: 141,141,164
RAC: 68,183
Level
Cys
Message 54816 - Posted: 21 May 2020 | 15:05:16 UTC - in response to Message 54812.

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.



If you restart BOINC, the problem will resolve itself.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 66
Credit: 739,971,692
RAC: 6,642,597
Level
Lys
Message 54822 - Posted: 21 May 2020 | 15:33:40 UTC - in response to Message 54816.

all three of my Linux hosts have downloaded, processed, and reported work successfully. thanks :)

Erich56
Joined: 1 Jan 15
Posts: 695
Credit: 3,280,645,583
RAC: 550,395
Level
Arg
Message 54823 - Posted: 21 May 2020 | 15:35:11 UTC - in response to Message 54812.

Earlier, I wrote:

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.


Since the problem did not go away, I reset GPUGRID and could then download new tasks :-)

What I was wondering about, though: the master URL in the newly downloaded "account_www.gpugrid.net.xml" file (BOINC folder) still says
"...<master_url>http://www.gpugrid.net/</master_url>..."

I would have expected it to read "https" ...
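
For anyone who wants to check their own file, here is a rough Python sketch that pulls `<master_url>` out of an account file's contents and reports whether it uses https. The sample XML is a trimmed-down, hypothetical stand-in (a real account file holds more fields), and the helper name `read_master_url` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def read_master_url(xml_text):
    """Return the text of the first <master_url> element, or None."""
    root = ET.fromstring(xml_text)
    # iter() also visits the root itself, so the element is found
    # wherever it sits in the tree.
    node = next(root.iter("master_url"), None)
    if node is None or node.text is None:
        return None
    return node.text.strip()


# Hypothetical, trimmed-down file contents for illustration only.
sample = "<account><master_url>http://www.gpugrid.net/</master_url></account>"
url = read_master_url(sample)
print(url, "uses https:", url.startswith("https://"))
```

To run it against a real file, read the file's text first and pass it to `read_master_url`; the path depends on where the BOINC data folder lives on your system.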

Matt Cheetham
Joined: 21 Apr 20
Posts: 4
Credit: 2,927,104
RAC: 76,061
Level
Ala
Message 54826 - Posted: 21 May 2020 | 15:59:55 UTC

Oh what a lovely surprise. Suddenly realized I had a task running for GPUGrid and was surprised thinking it was one of the very last re-runs of an old task in the list.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2184
Credit: 15,800,186,335
RAC: 794,418
Level
Trp
Message 54837 - Posted: 21 May 2020 | 18:54:33 UTC
Last modified: 21 May 2020 | 18:54:53 UTC

There are some bad workunits in the new batch.

EXCEPTIONAL CONDITION: src\mdio\bincoord.c, line 193: "nelems != 1"
https://www.gpugrid.net/workunit.php?wuid=20162190
https://www.gpugrid.net/workunit.php?wuid=20162534
https://www.gpugrid.net/workunit.php?wuid=20009439
https://www.gpugrid.net/workunit.php?wuid=20009346
https://www.gpugrid.net/workunit.php?wuid=20009664
and
ERROR: src\mdsim\trajectory.cpp line 135: Simulation box has to be rectangular!
https://www.gpugrid.net/workunit.php?wuid=20009564

Ian&Steve C.
Joined: 21 Feb 20
Posts: 66
Credit: 739,971,692
RAC: 6,642,597
Level
Lys
Message 54838 - Posted: 21 May 2020 | 18:57:57 UTC - in response to Message 54837.

can confirm. I'm seeing a high number of bad WUs coming through here too.

Trotador
Joined: 25 Mar 12
Posts: 95
Credit: 1,617,990,324
RAC: 1,472,342
Level
His
Message 54839 - Posted: 21 May 2020 | 19:08:32 UTC

+1

joukohan
Joined: 17 Oct 16
Posts: 5
Credit: 16,833,796
RAC: 173,307
Level
Pro
Message 54842 - Posted: 21 May 2020 | 19:46:16 UTC
Last modified: 21 May 2020 | 19:51:48 UTC

My Windows machine has already validated one WU OK, but WUs on my Debian machine end with the same error:

EXCEPTIONAL CONDITION: /home/user/conda/conda-bld/acemd3_1570536635323/work/src/mdio/bincoord.c, line 193: "nelems != 1"


EDIT: Now it seems to be crunching better on Debian too; the done% is actually going up instead of hitting an instant error.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 66
Credit: 739,971,692
RAC: 6,642,597
Level
Lys
Message 54844 - Posted: 21 May 2020 | 20:28:42 UTC

not sure if something should be done about the really high number of bad WUs. one of my systems just went through like 100 bad ones.

=Lupus=
Joined: 10 Nov 07
Posts: 10
Credit: 2,906,632
RAC: 91,627
Level
Ala
Message 54848 - Posted: 21 May 2020 | 21:07:32 UTC - in response to Message 54844.
Last modified: 21 May 2020 | 21:07:58 UTC

Ah, and I thought the fault was on my machine's side... seems WUs are slightly shaky

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54850 - Posted: 21 May 2020 | 21:26:16 UTC

Most of the 0-50 tasks seem to be bad - 0-10 are usually OK.
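
The "0-50" and "0-10" shorthand refers to the two numbers near the end of each task name. Here is a small Python sketch for pulling them out, assuming names follow the pattern seen elsewhere in this thread, for example 1ezgA00_320_4-TONI_MDADex1se-0-50-RND5032_5. The helper name `split_task_name` is made up, and calling the two numbers "step" and "total" is a guess; the project does not define them in this thread.

```python
import re

# Assumed layout: <prefix>-<step>-<total>-RND<digits>_<digits>
TASK_RE = re.compile(
    r"^(?P<prefix>.+)-(?P<step>\d+)-(?P<total>\d+)-RND\d+_\d+$"
)


def split_task_name(name):
    """Return (prefix, step, total), or None if the name does not fit."""
    match = TASK_RE.match(name)
    if match is None:
        return None
    return match.group("prefix"), int(match.group("step")), int(match.group("total"))


# Task names copied from posts in this thread.
for task in (
    "1ezgA00_320_4-TONI_MDADex1se-0-50-RND5032_5",
    "1a8oA00_348_3-TONI_MDADpr4sa-9-10-RND7509_0",
):
    print(split_task_name(task))
```

Grouping a failed-task list by the (step, total) pair is a quick way to see whether the errors cluster in one kind of task.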

Ian&Steve C.
Joined: 21 Feb 20
Posts: 66
Credit: 739,971,692
RAC: 6,642,597
Level
Lys
Message 54851 - Posted: 21 May 2020 | 22:19:25 UTC

80-90% of what I'm downloading is bombing out. Setting NNT (No New Tasks) until it calms down. All of the errors are kicking me into long backoffs and just wasting time.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54852 - Posted: 21 May 2020 | 22:32:31 UTC

What's more, I'm starting to get resends of the tasks which couldn't negotiate an SSL server name earlier today, and need manual tweaking to download. Night-time here, and I don't want to stop BOINC to muck about, because they're on machines with mixed GPUs and can't be relied on to restart on the right card.

They'll just have to wait it out overnight and I'll sort them out in the morning.

Erich56
Joined: 1 Jan 15
Posts: 695
Credit: 3,280,645,583
RAC: 550,395
Level
Arg
Message 54861 - Posted: 22 May 2020 | 4:49:32 UTC - in response to Message 54848.

... seems WUs are slightly shaky

I've had 3 faulty ones last night, all with "195 (0xc3) EXIT_CHILD_FAILED":

http://www.gpugrid.net/result.php?resultid=25128849
http://www.gpugrid.net/result.php?resultid=25125004
http://www.gpugrid.net/result.php?resultid=25082116

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 54862 - Posted: 22 May 2020 | 7:22:25 UTC - in response to Message 54861.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54863 - Posted: 22 May 2020 | 7:48:40 UTC - in response to Message 54862.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

OK, so long as you know - I'll carry on burning them off as quickly as I can ;-)

You're going to have a bit of an extra bandwidth bill this month for us downloading the files that were created.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 54864 - Posted: 22 May 2020 | 8:07:10 UTC - in response to Message 54863.

I'm cancelling them.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54865 - Posted: 22 May 2020 | 8:28:57 UTC - in response to Message 54864.

I'm cancelling them.

And it's working. All GPUs are either running productive work, or have viable tasks waiting to run after backup projects have finished. Thank you.

BladeD
Joined: 1 May 11
Posts: 7
Credit: 85,802,520
RAC: 98,345
Level
Thr
Message 54867 - Posted: 22 May 2020 | 9:37:19 UTC

5/22/2020 4:37:15 AM | GPUGRID | Started download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file
5/22/2020 4:37:16 AM | | Project communication failed: attempting access to reference site
5/22/2020 4:37:16 AM | GPUGRID | Temporarily failed download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file: transient HTTP error
5/22/2020 4:37:16 AM | GPUGRID | Backing off 04:46:52 on download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file
5/22/2020 4:37:17 AM | | Internet access OK - project servers may be temporarily down.



Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54868 - Posted: 22 May 2020 | 9:48:39 UTC - in response to Message 54867.

It often happens here - the project server is very busy, and I think constrained for bandwidth. Wait a couple of minutes and try again.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,181,640
RAC: 56,780
Level
Ala
Message 54869 - Posted: 22 May 2020 | 10:09:31 UTC

23 of 23 don't work

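Columns: task name, work unit ID, computer ID, sent, time reported, status, run time (sec), CPU time (sec), credit, application
1ezgA00_320_4-TONI_MDADex1se-0-50-RND5032_5 20179781 543598 22 May 2020 | 7:57:25 UTC 22 May 2020 | 8:47:14 UTC Error while computing 5.30 0.00 --- New version of ACEMD v2.10 (cuda101)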
1ev0A00_450_3-TONI_MDADex1se-0-50-RND6497_4 20179280 543598 22 May 2020 | 7:22:19 UTC 22 May 2020 | 7:24:03 UTC Error while computing 6.17 0.00 --- New version of ACEMD v2.10 (cuda101)
1eu3A02_348_4-TONI_MDADex1se-0-50-RND0288_5 20179102 543598 22 May 2020 | 7:20:15 UTC 22 May 2020 | 7:22:19 UTC Error while computing 6.17 0.02 --- New version of ACEMD v2.10 (cuda101)
1etb200_348_4-TONI_MDADex1se-0-50-RND4587_5 20178954 543598 22 May 2020 | 7:18:06 UTC 22 May 2020 | 7:20:15 UTC Error while computing 6.16 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8gA04_379_2-TONI_MDADex1se-0-50-RND1569_7 20176164 543598 22 May 2020 | 6:56:28 UTC 22 May 2020 | 7:18:06 UTC Error while computing 6.38 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8gA04_450_3-TONI_MDADex1se-0-50-RND0934_6 20176186 543598 22 May 2020 | 6:06:23 UTC 22 May 2020 | 6:08:27 UTC Error while computing 6.25 0.02 --- New version of ACEMD v2.10 (cuda101)
1ba5A00_413_3-TONI_MDADex1sb-0-50-RND8087_6 20163061 543598 22 May 2020 | 6:04:55 UTC 22 May 2020 | 6:06:23 UTC Error while computing 7.15 0.00 --- New version of ACEMD v2.10 (cuda101)
1eb4A02_413_2-TONI_MDADex1se-0-50-RND9618_7 20176742 543598 22 May 2020 | 6:02:37 UTC 22 May 2020 | 6:04:55 UTC Error while computing 9.20 0.02 --- New version of ACEMD v2.10 (cuda101)
1encA00_320_0-TONI_MDADex1se-0-50-RND5173_2 20178305 543598 22 May 2020 | 6:00:23 UTC 22 May 2020 | 6:02:37 UTC Error while computing 6.12 0.00 --- New version of ACEMD v2.10 (cuda101)
1edqA03_348_4-TONI_MDADex1se-0-50-RND8101_5 20177101 543598 22 May 2020 | 5:58:59 UTC 22 May 2020 | 6:00:23 UTC Error while computing 6.07 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8uA00_450_4-TONI_MDADex1se-0-50-RND7534_2 20176331 543598 22 May 2020 | 5:55:17 UTC 22 May 2020 | 5:56:55 UTC Error while computing 7.13 0.00 --- New version of ACEMD v2.10 (cuda101)
1ej6A01_450_0-TONI_MDADex1se-0-50-RND9222_4 20177970 543598 22 May 2020 | 5:53:54 UTC 22 May 2020 | 5:55:17 UTC Error while computing 6.55 0.00 --- New version of ACEMD v2.10 (cuda101)
1e5wA04_413_4-TONI_MDADex1se-0-50-RND9110_7 20175698 543598 22 May 2020 | 5:52:10 UTC 22 May 2020 | 5:53:54 UTC Error while computing 6.57 0.02 --- New version of ACEMD v2.10 (cuda101)
1efpB00_413_3-TONI_MDADex1se-0-50-RND6574_6 20177356 543598 22 May 2020 | 5:50:31 UTC 22 May 2020 | 5:52:10 UTC Error while computing 5.85 0.00 --- New version of ACEMD v2.10 (cuda101)
1e20A00_320_4-TONI_MDADex1se-0-50-RND7445_6 20175171 543598 22 May 2020 | 5:48:44 UTC 22 May 2020 | 5:50:31 UTC Error while computing 6.53 0.02 --- New version of ACEMD v2.10 (cuda101)
1e8uA00_450_0-TONI_MDADex1se-0-50-RND6268_2 20176319 543598 22 May 2020 | 5:46:13 UTC 22 May 2020 | 5:48:44 UTC Error while computing 6.37 0.00 --- New version of ACEMD v2.10 (cuda101)
1e7lA02_450_4-TONI_MDADex1se-0-50-RND8816_6 20175989 543598 22 May 2020 | 5:43:58 UTC 22 May 2020 | 5:46:13 UTC Error while computing 30.73 0.66 --- New version of ACEMD v2.10 (cuda101)
1e6dM01_379_2-TONI_MDADex1se-0-50-RND3537_5 20175771 543598 22 May 2020 | 5:42:03 UTC 22 May 2020 | 5:43:58 UTC Error while computing 6.23 0.00 --- New version of ACEMD v2.10 (cuda101)
1ej5A00_413_0-TONI_MDADex1se-0-50-RND4767_1 20177877 543598 22 May 2020 | 5:40:10 UTC 22 May 2020 | 5:42:03 UTC Error while computing 6.15 0.00 --- New version of ACEMD v2.10 (cuda101)
1e6vA03_348_1-TONI_MDADex1se-0-50-RND5457_5 20175828 543598 22 May 2020 | 5:38:16 UTC 22 May 2020 | 5:40:10 UTC Error while computing 6.56 0.00 --- New version of ACEMD v2.10 (cuda101)

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54870 - Posted: 22 May 2020 | 10:11:31 UTC - in response to Message 54869.

Read the older posts in this thread. There was a problem, but it's over - they've been cancelled.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,181,640
RAC: 56,780
Level
Ala
Message 54871 - Posted: 22 May 2020 | 10:14:20 UTC - in response to Message 54870.

Cancelled? I still get them:

1eb6A00_450_3-TONI_MDADex1se-0-50-RND7847_4 20176835 543598 22 May 2020 | 5:20:15 UTC 22 May 2020 | 5:38:16 UTC Error while computing 6.38 0.00 --- New version of ACEMD v2.10 (cuda101)
1eokA00_379_1-TONI_MDADex1se-0-50-RND6186_0 20178430 543598 22 May 2020 | 5:56:55 UTC 22 May 2020 | 5:58:59 UTC Error while computing 6.91 0.00 --- New version of ACEMD v2.10 (cuda101)
1a8oA00_348_3-TONI_MDADpr4sa-9-10-RND7509_0 20009637 543598 11 May 2020 | 3:23:47 UTC 11 May 2020 | 7:04:15 UTC Error while computing 6.10 0.02 --- New version of ACEMD v2.10 (cuda101)

Not only ...ex1... but even ...pr4sa...

tullio
Joined: 8 May 18
Posts: 167
Credit: 47,443,737
RAC: 201,208
Level
Val
Message 54872 - Posted: 22 May 2020 | 10:39:56 UTC

Tasks 0-50 seem to work right now. I had 36 failures.
Tullio

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 54873 - Posted: 22 May 2020 | 10:42:22 UTC - in response to Message 54871.

Cancellation is always flaky. Let them wither.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54874 - Posted: 22 May 2020 | 11:37:24 UTC - in response to Message 54871.

Cancelled? I still get them

You got them - past tense. You had already returned them before Toni got into the office this morning and started thinking about what to do.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,181,640
RAC: 56,780
Level
Ala
Message 54875 - Posted: 22 May 2020 | 11:51:06 UTC - in response to Message 54874.

Cancelled? I still get them

You got them - past tense. You had already returned them before Toni got into the office this morning and started thinking about what to do.


Yes, the first 20 WUs, but I read that he had stopped the bad ones, and I got at least 3 bad ones after that. I didn't know there was some latency when he stops them. But now it looks good and is working fine, so it's no problem. (I posted the work list only so he could see whether there was a system error on some work that he had not noticed. I know you are all working hard on it.) No hard feelings. Thank you for all your support.

Pop Piasa
Joined: 8 Aug 19
Posts: 91
Credit: 104,518,948
RAC: 700,539
Level
Cys
Message 54887 - Posted: 22 May 2020 | 21:50:34 UTC

So far I'm batting .500 on this batch.
54 successes and 54 'exceptional condition' errors.

🤔I'm curious if we are purposely "pushing the envelope" here, Toni. It looks like we're exploring the outer boundaries of the acemd3 program viability from under my rock.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1001
Credit: 2,426,954,553
RAC: 2,811,505
Level
Phe
Message 54888 - Posted: 22 May 2020 | 22:01:58 UTC - in response to Message 54862.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

Just pulling forward what Toni has already written in this thread. The only envelope we're pushing is that of one very tired researcher, who - like all of us - makes mistakes from time to time.

robertmiles
Joined: 16 Apr 09
Posts: 482
Credit: 553,802,986
RAC: 50,461
Level
Lys
Message 54889 - Posted: 22 May 2020 | 22:11:37 UTC
Last modified: 22 May 2020 | 22:15:42 UTC

I've only had one task since the latest changes.

https://www.gpugrid.net/result.php?resultid=25200010

Its output showed some dump sections, but it appears to have downloaded, run, and uploaded correctly otherwise. Marked as Valid.

Pop Piasa
Joined: 8 Aug 19
Posts: 91
Credit: 104,518,948
RAC: 700,539
Level
Cys
Message 54892 - Posted: 22 May 2020 | 23:17:27 UTC
Last modified: 22 May 2020 | 23:42:26 UTC

Grosso appears to be feeling the strain of so many WUs failing and hosts requesting replacement downloads. Things are pretty slow on my end, only one host at a time getting anything, and that download is intermittent.

It looks to me like the shutting down of SETI@home triggered an unexpected hardware bottleneck for many other projects.

I wonder if a policy of 2 'spares' per GPU might alleviate this some.

[AF] fansyl
Joined: 26 Sep 13
Posts: 14
Credit: 1,456,272,932
RAC: 354,523
Level
Met
Message 54940 - Posted: 24 May 2020 | 17:14:38 UTC

Thanks for the new work!

All seems to be OK today :)

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 5,394,765
RAC: 31,888
Level
Ser
Message 55034 - Posted: 4 Jun 2020 | 5:46:28 UTC

Since June 2nd I have had nothing but computation errors ("Berechnungsfehler"); each WU ends after a few seconds.
Hoping for a solution,

vonboedefeldt

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 947
Credit: 4,353,973
RAC: 1,134
Level
Ala
Message 55035 - Posted: 4 Jun 2020 | 10:37:31 UTC - in response to Message 55034.

Since June 2nd I have had nothing but computation errors ("Berechnungsfehler"); each WU ends after a few seconds.
Hoping for a solution,

vonboedefeldt


It worked on another PC... try rebooting?

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 119,735,355
RAC: 438,195
Level
Cys
Message 55036 - Posted: 4 Jun 2020 | 10:39:34 UTC - in response to Message 55034.

Check your task list and check the work unit list.
If there are more than 2 others in the work unit list, then don't worry:
there is something wrong with that barch.

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 5,394,765
RAC: 31,888
Level
Ser
Message 55037 - Posted: 4 Jun 2020 | 11:21:32 UTC - in response to Message 55035.

I tried, but got the same result as before.

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 5,394,765
RAC: 31,888
Level
Ser
Message 55038 - Posted: 4 Jun 2020 | 11:23:14 UTC - in response to Message 55036.

Did you mean batch, or what is "barch"?

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 119,735,355
RAC: 438,195
Level
Cys
Message 55039 - Posted: 4 Jun 2020 | 14:50:26 UTC - in response to Message 55038.

Yes, batch of work.
