
Message boards : Number crunching : ACEMD 3 very long

[AF>Libristes] alain65
Joined: 30 May 14
Posts: 6
Credit: 1,062,571,853
RAC: 5,812,246
Level
Met
Message 60558 - Posted: 3 Jul 2023 | 10:07:26 UTC

Hello!

For the past 24 hours I have had 2 ATM beta tasks in progress, with only 25% and 17% done.
Should I wait?
Or is it possible that they will finish in error?
____________
PCs are like air conditioning: they become useless when you open Windows (L.T)

In a world without walls and fences, who needs windows and gates?

Bedrich Hajek
Joined: 28 Mar 09
Posts: 455
Credit: 7,524,739,466
RAC: 8,174,920
Level
Tyr
Message 60559 - Posted: 3 Jul 2023 | 10:34:44 UTC

I just finished crunching one of these units successfully, but one of the outfiles is too big to upload at 891.78 MB.

https://www.gpugrid.net/workunit.php?wuid=27512927

Mon 03 Jul 2023 05:55:20 AM EDT | GPUGRID | Computation for task e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1 finished
Mon 03 Jul 2023 05:55:27 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_0
Mon 03 Jul 2023 05:55:27 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_1
Mon 03 Jul 2023 05:55:30 AM EDT | GPUGRID | Finished upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_0
Mon 03 Jul 2023 05:55:30 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_2
Mon 03 Jul 2023 05:55:47 AM EDT | GPUGRID | Finished upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_1
Mon 03 Jul 2023 05:55:47 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 05:55:49 AM EDT | GPUGRID | Backing off 00:03:22 on upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 05:55:49 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_10
Mon 03 Jul 2023 05:55:50 AM EDT | GPUGRID | Finished upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_10
Mon 03 Jul 2023 05:55:55 AM EDT | GPUGRID | Finished upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_2
Mon 03 Jul 2023 05:59:12 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 05:59:15 AM EDT | GPUGRID | Backing off 00:07:46 on upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:01:18 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:01:21 AM EDT | GPUGRID | Backing off 00:08:36 on upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:09:58 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:10:01 AM EDT | GPUGRID | Backing off 00:16:02 on upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:26:05 AM EDT | GPUGRID | Started upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
Mon 03 Jul 2023 06:26:08 AM EDT | GPUGRID | Backing off 00:57:31 on upload of e1s76_MALT_100ns_37-QUICO_AdB_MALT_test-0-1-RND0358_1_9
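For anyone who wants to spot this pattern in their own event log, here is a minimal sketch in Python. It only assumes the "Backing off HH:MM:SS on upload of <file>" wording seen in the log above; the function names and the retry threshold are made up for illustration and are not part of BOINC.

```python
import re
from datetime import timedelta

# Matches the "Backing off HH:MM:SS on upload of <file>" part of an event-log line.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def backoffs(log_lines):
    """Return {filename: [timedelta, ...]} for every upload back-off in the log."""
    found = {}
    for line in log_lines:
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            wait = timedelta(hours=int(hours), minutes=int(minutes), seconds=int(seconds))
            found.setdefault(name, []).append(wait)
    return found


def stuck_uploads(log_lines, min_retries=3):
    """List files that backed off at least min_retries times (hypothetical threshold)."""
    return sorted(
        name for name, waits in backoffs(log_lines).items() if len(waits) >= min_retries
    )
```

Feeding it the lines of the log above would flag only the _9 file, since that is the only upload that keeps backing off.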


That's more than 22 hours of crunching, wasted. Please stop making the same mistake over and over again...


So, I had to abort the transfer, but I got credit for it anyway.



Bedrich Hajek
Joined: 28 Mar 09
Posts: 455
Credit: 7,524,739,466
RAC: 8,174,920
Level
Tyr
Message 60560 - Posted: 3 Jul 2023 | 10:45:14 UTC

BTW: The units that you describe are not ATM betas, but ACEMD 3: molecular dynamics simulations for GPUs v2.20 (cuda1121), just for the record.



[AF>Libristes] alain65
Joined: 30 May 14
Posts: 6
Credit: 1,062,571,853
RAC: 5,812,246
Level
Met
Message 60561 - Posted: 3 Jul 2023 | 11:41:03 UTC - in response to Message 60560.

Yes, it's true. I made a mistake :( These are ACEMD 3...


I'll let them finish and let you know what happens when I send them off.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 927
Credit: 21,713,707,333
RAC: 153,077,930
Level
Trp
Message 60562 - Posted: 3 Jul 2023 | 13:41:11 UTC

Indeed, these MALT units are long. One of my A4000s picked one up, and at 26 hrs of runtime it's about 96% complete.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 927
Credit: 21,713,707,333
RAC: 153,077,930
Level
Trp
Message 60563 - Posted: 3 Jul 2023 | 15:00:49 UTC - in response to Message 60562.
Last modified: 3 Jul 2023 | 15:02:16 UTC

annnnnd the _9 too big to upload issue again. it was nearly 900MB

Not dealing with it. Aborted the transfer of the big file and reported as-is.

got credit anyway: https://gpugrid.net/result.php?resultid=33527902

ServicEnginIC
Joined: 24 Sep 10
Posts: 543
Credit: 5,197,980,308
RAC: 17,045,093
Level
Tyr
Message 60564 - Posted: 3 Jul 2023 | 15:35:53 UTC - in response to Message 60563.

annnnnd the _9 too big to upload issue again. it was nearly 900MB

I thought admins would have dealt with that by now... Wrong

Keith Myers
Joined: 13 Dec 17
Posts: 1225
Credit: 4,349,091,144
RAC: 16,194,481
Level
Arg
Message 60565 - Posted: 3 Jul 2023 | 16:09:47 UTC - in response to Message 60563.
Last modified: 3 Jul 2023 | 16:13:47 UTC

I've got one in a similar situation: too big to upload. I've never had any luck, though, with aborting the transfer and still getting credit for it.

I just get an aborted task and no credit.

Taking my chances once again, we will see if I am lucky this time.

[Edit] What do you know . . . for once I got credit for the aborted task. Lost some bonus though for letting it sit there trying to upload.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 927
Credit: 21,713,707,333
RAC: 153,077,930
Level
Trp
Message 60566 - Posted: 3 Jul 2023 | 19:07:58 UTC - in response to Message 60565.

i bet a large portion of those 97 left out in the field are stuck uploading and the users aren't aware of what to do with them.

ServicEnginIC
Joined: 24 Sep 10
Posts: 543
Credit: 5,197,980,308
RAC: 17,045,093
Level
Tyr
Message 60567 - Posted: 4 Jul 2023 | 5:38:36 UTC

Task: e1s85_MALT_100ns_42-QUICO_AdB_MALT_test-0-1-RND3115_2
Resulted in a e1s85_MALT_100ns_42-QUICO_AdB_MALT_test-0-1-RND3115_2_9 file 891.61 MB in size.

Following Ian&Steve C.'s kind instructions:
I waited until the transfer status "Upload: retry in..." was shown.



I selected "Abort Transfer"



I answered "Yes" to the "Abort File Transfer" prompt.
Task 33527760 (e1s85_MALT_100ns_42-QUICO_AdB_MALT_test-0-1-RND3115_2) was reported as finished.
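For headless hosts without the BOINC Manager, the same abort can probably be done from the command line. The sketch below only builds the command; it relies on boinccmd's `--file_transfer URL filename {retry | abort}` option. The project URL passed in is an assumption: it has to match the URL your client has registered for GPUGRID (check with `boinccmd --get_project_status`). The helper name is made up for illustration.

```python
import shlex


def abort_transfer_command(project_url, filename, boinccmd="boinccmd"):
    """Build the boinccmd call that aborts one pending file transfer."""
    return [boinccmd, "--file_transfer", project_url, filename, "abort"]


# Assumed project URL; replace it with the one your client actually reports.
cmd = abort_transfer_command(
    "https://www.gpugrid.net/",
    "e1s85_MALT_100ns_42-QUICO_AdB_MALT_test-0-1-RND3115_2_9",
)
print(shlex.join(cmd))
```

Run the printed command on the host itself (or pass `cmd` to `subprocess.run`) once the upload shows as retrying, then update the project so the task gets reported.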



Now I'm one of those six users in the last 24h.



95 ACEMD 3 tasks are still in progress as of now.
Server status live

[AF>Libristes] alain65
Joined: 30 May 14
Posts: 6
Credit: 1,062,571,853
RAC: 5,812,246
Level
Met
Message 60568 - Posted: 4 Jul 2023 | 7:13:44 UTC



How can I change the title from atm beta to ACEMD 3?
I'm really sorry for the mistake...

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1554
Credit: 4,997,841,851
RAC: 7,488,809
Level
Arg
Message 60569 - Posted: 4 Jul 2023 | 7:21:59 UTC - in response to Message 60568.
Last modified: 4 Jul 2023 | 8:07:33 UTC

Go back and edit your last post - it's recent enough. It should show you an editing box for the title, as well as allowing you to edit the body text.

That trick only works if you make a brand new post, which you did. It doesn't work if your post was a reply or a quote.

Edit: you can only edit your post (and hence the title) within one hour of writing it, and that's coming up. If you leave it too long, you'll have to make a new post, and edit that instead.

JochenZ
Joined: 15 Aug 09
Posts: 2
Credit: 327,624,742
RAC: 793,581
Level
Asp
Message 60570 - Posted: 4 Jul 2023 | 16:59:30 UTC

Hello,

Are there any options other than aborting/deleting the ACEMD 3 file?
Or will this problem be solved so I can upload a 891 MB result?

Thanks

Keith Myers
Joined: 13 Dec 17
Posts: 1225
Credit: 4,349,091,144
RAC: 16,194,481
Level
Arg
Message 60571 - Posted: 4 Jul 2023 | 17:38:12 UTC - in response to Message 60570.

No, under the current server limitations, it is impossible to upload the _9 result file.

The only option is to abort the task transfer.

But you will get credit for it because the bulk of the result files already successfully transferred.

I just aborted my second too-big task and got credit for it.

If I remember the previous commentary on this problem, the scientist said the task science is valid even without the _9 file.

[AF>Libristes] alain65
Joined: 30 May 14
Posts: 6
Credit: 1,062,571,853
RAC: 5,812,246
Level
Met
Message 60578 - Posted: 8 Jul 2023 | 8:03:28 UTC - in response to Message 60569.
Last modified: 8 Jul 2023 | 8:07:34 UTC

Go back and edit your last post - it's recent enough. It should show you an editing box for the title, as well as allowing you to edit the body text.



Thank you very much for this information. I'll do it right away.


I gave up on uploading the first completed job, and I did get the 850,000 credits. On the other hand, I didn't get anything for the second one.

Edit: unfortunately this only works for the body text, not for the title.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1554
Credit: 4,997,841,851
RAC: 7,488,809
Level
Arg
Message 60579 - Posted: 8 Jul 2023 | 9:03:55 UTC - in response to Message 60578.

That was a reply post - "in response to" - and editing that does not let you alter the title.

If you make a brand new post - "Post to thread" - you can edit the title, as well as the post.

[AF>Libristes] alain65
Joined: 30 May 14
Posts: 6
Credit: 1,062,571,853
RAC: 5,812,246
Level
Met
Message 60580 - Posted: 8 Jul 2023 | 11:36:39 UTC
Last modified: 8 Jul 2023 | 11:39:47 UTC

OK!
Thank you :)
Done...

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 85
Credit: 1,205,088,498
RAC: 813,359
Level
Met
Message 60584 - Posted: 10 Jul 2023 | 22:07:32 UTC - in response to Message 60563.

aborted the transfer of the big file and reported as-is.
got credit anyway

Thx for the advice!
Got some credits for 2d+ crunching... : )

Padanian
Joined: 1 May 09
Posts: 8
Credit: 16,534,026
RAC: 0
Level
Pro
Message 60587 - Posted: 12 Jul 2023 | 13:11:46 UTC

Me too. Trying to abort e1s20_MALT_100ns_9-QUICO_AdB_MALT_test-0-1-RND6616_3_9 now.

Validated. Credited.
