
Message boards : Number crunching : ATM: Free Energy Calculations new application

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59751 - Posted: 18 Jan 2023 | 19:50:34 UTC
Last modified: 18 Jan 2023 | 20:30:40 UTC

Just starting the thread for discussion of this new application. ATM = AToM.

after a little snafu with the first batch (incorrect config files), the latest batch seems to run on my system. no idea for runtime yet or if it will finish successfully.

This is another Python-based application. the package ships with the python environment similar to how the PythonGPU Reinforcement Learning (RL) app does.

Test Bench:
Xeon E5-2697Av4 (16c/32t)
64GB DDR4-2400 RDIMM (ECC)
RTX 3060 12GB
Ubuntu 22.04.1


So far observed behavior:
-uses ~97% of the GPU core, ~45% GPU memory bus, ~0-1% PCIe bus, close to full power use.
-about 400-500MB VRAM used (low, like acemd3)
-does not like to be paused and resumed, or BOINC stopped and restarted. it causes the task to fail

unknown total runtime expectation since the one task I had failed when I restarted BOINC lol.
____________

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59752 - Posted: 18 Jan 2023 | 20:36:00 UTC - in response to Message 59751.

about the restart failure. looks like it fails trying to create a directory that already exists.

mkdir: cannot create directory 'atm_tmp': File exists


needs some work to allow for that.
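
A minimal sketch of a restart-tolerant setup step, assuming the task's setup script is the thing that creates `atm_tmp` (the thread does not show that script). The helper name `ensure_workdir` is hypothetical; in a shell script, `mkdir -p` would behave the same way.

```python
import os

def ensure_workdir(path):
    """Create the working directory, treating 'already exists' as success.

    exist_ok=True keeps a second call (for example after a BOINC restart)
    from raising FileExistsError, which a plain os.mkdir would do.
    """
    os.makedirs(path, exist_ok=True)
    return os.path.abspath(path)
```

Calling `ensure_workdir("atm_tmp")` twice in a row leaves the directory and its contents untouched the second time.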
____________

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59754 - Posted: 18 Jan 2023 | 20:55:38 UTC

another quality of life improvement should be adding a <weight> line to the main task in the job.xml file. right now with 2 tasks in the file, and no weights defined, I'm guessing it splits it 50/50 and it thinks the task is 50% done once the extraction phase is complete.
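
A rough sketch of why the weights matter, assuming the wrapper reports overall progress as completed weight divided by total weight (that is my guess above, not something confirmed by the project). The function name `overall_fraction_done` is made up for illustration.

```python
def overall_fraction_done(weights, current_index, current_fraction=0.0):
    """Overall progress for a job made of sequential tasks.

    weights          -- one weight per task, in run order
    current_index    -- index of the task that is running now
    current_fraction -- progress of the running task, 0.0 to 1.0
    """
    total = sum(weights)
    completed = sum(weights[:current_index])
    return (completed + weights[current_index] * current_fraction) / total
```

With two equally weighted tasks, `overall_fraction_done([1, 1], 1)` gives 0.5 the moment extraction finishes. With weights like `[1, 99]` the same point would show as 0.01, which is much closer to reality.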
____________

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59755 - Posted: 18 Jan 2023 | 21:33:51 UTC
Last modified: 18 Jan 2023 | 21:34:03 UTC

task ran to completion in about an hour. but hit an error and threw it all away because the file size is too big.

upload failure: <file_xfer_error>
<file_name>T11_4-RAIMIS_TEST_ATM-0-1-RND7054_2_0</file_name>
<error_code>-131 (file size too big)</error_code>
</file_xfer_error>


what a waste.
____________

Keith Myers
Joined: 13 Dec 17
Posts: 1301
Credit: 5,525,616,959
RAC: 8,503,032
Message 59771 - Posted: 19 Jan 2023 | 18:06:13 UTC

Have over a dozen of quick-failing ATM tasks.

The wrapper does not have a correctly named tar file or something.

02:56:29 (1242346): wrapper: running /bin/tar (xf input.tar.bz2)
/bin/tar: This does not look like a tar archive
bzip2: (stdin) is not a bzip2 file.
/bin/tar: Child returned status 2
/bin/tar: Error is not recoverable: exiting now

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59792 - Posted: 24 Jan 2023 | 16:01:40 UTC - in response to Message 59755.

looks like the small batch of tasks that went out today is better set up. ran for about an hour and completed successfully without the file size issue when complete.

great :)

still would like a little more background info on these tasks, what they are doing, and the goal of the research.
____________

FritzB
Joined: 7 Apr 15
Posts: 11
Credit: 2,165,453,600
RAC: 3,337,979
Message 59899 - Posted: 10 Feb 2023 | 22:25:09 UTC
Last modified: 10 Feb 2023 | 22:25:52 UTC

This one https://www.gpugrid.net/workunit.php?wuid=27399736 has been running for about 11 hours and has been stuck at 66.666% for at least 4 hours now. There is almost no load on the GPU. Just a few percent (3-5) once in a while, but constantly some load on the memory controller (10-30). Hope it will finish some day :)

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59936 - Posted: 16 Feb 2023 | 16:24:10 UTC

still no official communication from the project about these tasks.

the recent batches have been very hit or miss and exhibit much different behavior than my initial post.

"TL2" tasks, ran for hours and hours with little to no GPU or CPU use. I aborted them and moved on.

"TL3" tasks yesterday, also had little to no GPU or CPU use, but did complete in about 30 mins.

"TL4" tasks today seem like a repeat of TL2. no GPU use, runs for hours with no progress.

also weights need to be defined in the job.xml file so the tasks don't jump to 75% after a few seconds and then sit there for hours doing nothing.
____________

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1584
Credit: 6,209,931,851
RAC: 8,224,154
Message 59937 - Posted: 16 Feb 2023 | 17:01:14 UTC

Just been sent a TL4 from WU 27405970. I see you've aborted two previous tasks from the same WU, Ian, on two different machines. Did you get any CPU usage figures from previous runs? I think I'll start it up with the GTX 1660 plus one core, but I'll probably abort it myself if it doesn't show much response.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59938 - Posted: 16 Feb 2023 | 17:04:18 UTC - in response to Message 59937.

they spin up multiple processes like the Python tasks do. but i didn't catch them at the very beginning to see if they spike in use or anything like that.

once they get going, they basically sit idle as far as the GPU and CPU go. little to no use at all. i just killed them rather than letting them sit there for hours occupying my GPU.
____________

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1584
Credit: 6,209,931,851
RAC: 8,224,154
Message 59939 - Posted: 16 Feb 2023 | 17:30:46 UTC

OK, I've set 3 CPUs for continuity from the current Python task, and I've put weights of 1-1-1-97 in the job file so I can see what's happening.

My normal remote monitoring console shows the current average CPU usage, and I've put nvidia-smi on a five second loop. If either of those drops to zero, I'll abort it.

Chocks away!
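
For anyone wanting to reproduce the five-second `nvidia-smi` loop, here is a small sketch. It assumes `nvidia-smi` is on the PATH; the function names are mine and purely illustrative, and the abort decision is left to the person watching.

```python
import subprocess
import time

# Ask nvidia-smi for GPU utilization only, as bare numbers, one line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=utilization.gpu",
         "--format=csv,noheader,nounits"]

def parse_utilization(output):
    """Turn nvidia-smi CSV output into a list of integer percentages.

    Lines that are not plain numbers (for example "[N/A]") are skipped.
    """
    values = []
    for line in output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values

def watch(interval_s=5, samples=12):
    """Print GPU utilization every interval_s seconds, samples times."""
    for _ in range(samples):
        result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
        print(time.strftime("%H:%M:%S"), parse_utilization(result.stdout))
        time.sleep(interval_s)
```

Running `watch()` prints one timestamped line per sample, which is enough to spot a task that has gone idle.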

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1584
Credit: 6,209,931,851
RAC: 8,224,154
Message 59941 - Posted: 16 Feb 2023 | 18:15:43 UTC

I see what you mean. Nearly half an hour in, CPU usage is showing around 25% of a single core, and GPU usage spiked once, to 41%, after about a quarter of an hour. It's one way of saving electricity, but I'd rather be doing something useful. Aborting.

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,271,814,882
RAC: 902,320
Message 59961 - Posted: 22 Feb 2023 | 17:28:34 UTC

1.13 ATM running fine for me.
Keep aborting them and I'll run them for you.

zombie67 [MM]
Joined: 16 Jul 07
Posts: 207
Credit: 1,955,411,456
RAC: 4,653,958
Message 59964 - Posted: 23 Feb 2023 | 3:08:08 UTC

FWIW, the first task I received completed successfully.

http://www.gpugrid.net/workunit.php?wuid=27410175
____________
Reno, NV
Team: SETI.USA

FritzB
Joined: 7 Apr 15
Posts: 11
Credit: 2,165,453,600
RAC: 3,337,979
Message 59967 - Posted: 23 Feb 2023 | 8:18:47 UTC - in response to Message 59964.
Last modified: 23 Feb 2023 | 8:19:26 UTC

I've also finished one:
https://www.gpugrid.net/workunit.php?wuid=27410166

We're both using Linux Mint. It seems to crash on Win 10 machines (computer #600532 is mine, too).

zombie67 [MM]
Joined: 16 Jul 07
Posts: 207
Credit: 1,955,411,456
RAC: 4,653,958
Message 59968 - Posted: 23 Feb 2023 | 15:33:06 UTC

Overnight, I had 4 of these tasks cancelled by the server.
____________
Reno, NV
Team: SETI.USA

KAMasud
Joined: 27 Jul 11
Posts: 137
Credit: 523,901,354
RAC: 0
Message 59969 - Posted: 23 Feb 2023 | 16:50:20 UTC - in response to Message 59961.

1.13 ATM running fine for me.
Keep aborting them and I'll run them for you.

_______________

Same here. I quite enjoy completing these WUs. There should be a way to analyse these WUs as to why it is happening on certain machines. We are mostly running the same hardware and OS. It would be fun to see the results.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59970 - Posted: 23 Feb 2023 | 17:27:24 UTC
Last modified: 23 Feb 2023 | 17:29:36 UTC

keep in mind these are beta tasks, and the batch being sent NOW is not necessarily the same as the batch sent last week and won't be the same as whatever is sent sometime in the future, until they get all the bugs worked out. the tasks last week basically ran with no perceived use of the GPU or CPU, so what were they doing? who knows. no official word from the project about these tasks at all. I wasn't willing to let the GPU/CPU be occupied for hours on end with the task spinning its wheels when they could be doing something more useful.

maybe this current batch has been tweaked from last week and that's why they are working OK. for those that have completed this latest batch, did they have any meaningful use of the GPU or CPU? it also seems this batch was released with a new Windows application (they were Linux only before) for testing.
____________

KAMasud
Joined: 27 Jul 11
Posts: 137
Credit: 523,901,354
RAC: 0
Message 59971 - Posted: 24 Feb 2023 | 5:28:26 UTC - in response to Message 59970.

keep in mind these are beta tasks, and the batch being sent NOW is not necessarily the same as the batch sent last week and won't be the same as whatever is sent sometime in the future, until they get all the bugs worked out. the tasks last week basically ran with no perceived use of the GPU or CPU, so what were they doing? who knows. no official word from the project about these tasks at all. I wasn't willing to let the GPU/CPU be occupied for hours on end with the task spinning its wheels when they could be doing something more useful.

maybe this current batch has been tweaked from last week and that's why they are working OK. for those that have completed this latest batch, did they have any meaningful use of the GPU or CPU? it also seems this batch was released with a new Windows application (they were Linux only before) for testing.

_______________________

Well, most of us know that Abouh reads every word written on these threads and, without much song and dance, makes changes. He is the Only Admin on all the projects who diligently attends. Maybe, quite possibly. No arguments with your tweaking statement.

Erich56
Joined: 1 Jan 15
Posts: 1097
Credit: 7,394,557,676
RAC: 8,742,409
Message 59972 - Posted: 24 Feb 2023 | 8:36:54 UTC - in response to Message 59971.

Well, most of us know that Abouh reads every word written on these threads and, without much song and dance, makes changes. He is the Only Admin on all the projects who diligently attends. Maybe, quite possibly. No arguments with your tweaking statement.

well, Abouh is the only one from the project team who actively communicates with us volunteers - which is great.
All others obviously don't care, and this has been like this over the years, unfortunately.
For example: 9 days ago I asked in the ACEMD 4 thread when new ACEMD 4 tasks will be around, or whether this subproject is dead.
No reply so far; whereas a reply could be very simple, not longer than just a line :-(

You know what I want to say ... it's kind of disappointing at times :-(

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59973 - Posted: 24 Feb 2023 | 13:27:38 UTC - in response to Message 59971.

keep in mind these are beta tasks, and the batch being sent NOW is not necessarily the same as the batch sent last week and won't be the same as whatever is sent sometime in the future, until they get all the bugs worked out. the tasks last week basically ran with no perceived use of the GPU or CPU, so what were they doing? who knows. no official word from the project about these tasks at all. I wasn't willing to let the GPU/CPU be occupied for hours on end with the task spinning its wheels when they could be doing something more useful.

maybe this current batch has been tweaked from last week and that's why they are working OK. for those that have completed this latest batch, did they have any meaningful use of the GPU or CPU? it also seems this batch was released with a new Windows application (they were Linux only before) for testing.

_______________________

Well, most of us know that Abouh reads every word written on these threads and, without much song and dance, makes changes. He is the Only Admin on all the projects who diligently attends. Maybe, quite possibly. No arguments with your tweaking statement.


that's great and all, but abouh is not the researcher working with this application. Abouh deals with the research with the Python RL tasks.

These ATM tasks look to be run by Raimis.

(the researcher names are in the filenames of the WUs)

____________

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59974 - Posted: 24 Feb 2023 | 13:30:55 UTC

https://gpugrid.net/result.php?resultid=33321222

ran for 10+ hours, failed due to file size limit after an otherwise successful computation.

:(
____________

Erich56
Joined: 1 Jan 15
Posts: 1097
Credit: 7,394,557,676
RAC: 8,742,409
Message 59975 - Posted: 24 Feb 2023 | 14:39:26 UTC - in response to Message 59974.

... failed due to file size limit
:(

I am just trying to remember with which other application we've had the same problem some time ago - last year or 2 years ago ???

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59976 - Posted: 24 Feb 2023 | 14:51:34 UTC - in response to Message 59975.

... failed due to file size limit
:(

I am just trying to remember with which other application we've had the same problem some time ago - last year or 2 years ago ???


it's happened a few times in the past with acemd3 tasks.

see here from July 2021: https://www.gpugrid.net/forum_thread.php?id=5239#57117
____________

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,271,814,882
RAC: 902,320
Message 59977 - Posted: 24 Feb 2023 | 18:19:21 UTC

Yea, I got my first ATM checkpoint :-)
Now my list of ATM ULs are stuck.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59978 - Posted: 24 Feb 2023 | 18:43:59 UTC - in response to Message 59977.

Yea, I got my first ATM checkpoint :-)
Now my list of ATM ULs are stuck.


the uploads are nearly 700MB in size, so this is likely the same problem from my link that we saw over a year ago. their server can't accept something that big. I don't think they ever figured out how to adjust the settings of their file server; they just tried to keep the file sizes below the limit, which they seem to have forgotten about. nothing you do will get them to upload.

I've disabled ATM until they get it together with them.
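
A small sketch of the kind of pre-upload check that would catch this before hours of work are thrown away. The actual limit on the project's side is not stated anywhere in this thread, so `limit_bytes` is a placeholder the caller has to supply, and `oversized_outputs` is a made-up helper name.

```python
import os

def oversized_outputs(paths, limit_bytes):
    """Return (path, size_in_bytes) for every file larger than limit_bytes."""
    too_big = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit_bytes:
            too_big.append((path, size))
    return too_big
```

An empty list means every output file is within the limit that was passed in; anything else is a file that would likely hit the -131 error.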
____________

ServicEnginIC
Joined: 24 Sep 10
Posts: 568
Credit: 6,937,279,524
RAC: 12,293,171
Message 59979 - Posted: 24 Feb 2023 | 21:20:56 UTC - in response to Message 59978.

On past chance, I bet and lost.
Currently, I'm only processing ACEMD tasks, when available. I happened to catch one this morning.

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,271,814,882
RAC: 902,320
Message 59980 - Posted: 24 Feb 2023 | 23:20:56 UTC

GDF, Should I Abort these 12 completed ATM WUs that won't upload or is there a reasonable chance you'll fix it?

zombie67 [MM]
Joined: 16 Jul 07
Posts: 207
Credit: 1,955,411,456
RAC: 4,653,958
Message 59981 - Posted: 25 Feb 2023 | 1:18:39 UTC

Well, I just achieved my 100 hours, which was my 1st priority. I will abort and reset (if necessary) the completed tasks I have. If/when the project gets its act together, I'll be back.
____________
Reno, NV
Team: SETI.USA

gemini8
Joined: 3 Jul 16
Posts: 31
Credit: 1,450,750,176
RAC: 2,723,060
Message 59986 - Posted: 26 Feb 2023 | 11:04:13 UTC

For me it's just this:

So 26 Feb 2023 11:57:00 CET | GPUGRID | Started upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:02 CET | GPUGRID | Backing off 04:12:16 on upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:19 CET | GPUGRID | Started upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:22 CET | GPUGRID | Backing off 05:10:06 on upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0

No message about the size, just about backing off.
Hooray!
____________
Greetings, Jens

FritzB
Joined: 7 Apr 15
Posts: 11
Credit: 2,165,453,600
RAC: 3,337,979
Message 59988 - Posted: 26 Feb 2023 | 11:56:45 UTC - in response to Message 59986.

I just aborted the upload (not the workunit) and then it was reported as valid.

https://www.gpugrid.net/results.php?hostid=604029

gemini8
Joined: 3 Jul 16
Posts: 31
Credit: 1,450,750,176
RAC: 2,723,060
Message 59989 - Posted: 26 Feb 2023 | 13:55:33 UTC - in response to Message 59988.

I just aborted the upload (not the workunit) and then it was reported as valid.

https://www.gpugrid.net/results.php?hostid=604029

Indeed, this worked out for me as well.
But is there a result that can be used?
____________
Greetings, Jens

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 59990 - Posted: 26 Feb 2023 | 14:03:51 UTC - in response to Message 59986.

For me it's just this:
So 26 Feb 2023 11:57:00 CET | GPUGRID | Started upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:02 CET | GPUGRID | Backing off 04:12:16 on upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:19 CET | GPUGRID | Started upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0
So 26 Feb 2023 11:57:22 CET | GPUGRID | Backing off 05:10:06 on upload of TL9_55-RAIMIS_TEST_ATM-0-1-RND1804_0_0

No message about the size, just about backing off.
Hooray!


There won’t be any message about why it failed until you enable debugging messages. See the previous link I posted about when this issue happened 1.5 years ago.
____________

kksplace
Joined: 4 Mar 18
Posts: 53
Credit: 1,633,996,749
RAC: 3,259,371
Message 59991 - Posted: 26 Feb 2023 | 14:12:44 UTC - in response to Message 59988.

I just aborted the upload (not the workunit) and then it was reported as valid.


Partially successful for me. I attempted with two of these and one ended up as "Upload failed" while the other "Completed and validated".

fzs600
Joined: 14 Nov 10
Posts: 2
Credit: 638,367,557
RAC: 3,277,772
Message 59992 - Posted: 26 Feb 2023 | 15:57:47 UTC - in response to Message 59988.

I just aborted the upload (not the workunit) and then it was reported as valid.

https://www.gpugrid.net/results.php?hostid=604029

Indeed, this worked out for me as well.

mikey
Joined: 2 Jan 09
Posts: 292
Credit: 2,859,508,615
RAC: 11,647,930
Message 59993 - Posted: 26 Feb 2023 | 16:27:45 UTC - in response to Message 59992.

I just aborted the upload (not the workunit) and then it was reported as valid.

https://www.gpugrid.net/results.php?hostid=604029

Indeed, this worked out for me as well.


It worked on multiple PCs for me too.

Speedy
Joined: 19 Aug 07
Posts: 42
Credit: 28,391,082
RAC: 0
Message 60008 - Posted: 4 Mar 2023 | 3:36:01 UTC - in response to Message 59755.

task ran to completion in about an hour. but hit an error and threw it all away because the file size is too big.

upload failure: <file_xfer_error>
<file_name>T11_4-RAIMIS_TEST_ATM-0-1-RND7054_2_0</file_name>
<error_code>-131 (file size too big)</error_code>
</file_xfer_error>


what a waste.

No it's not a waste in my opinion because you found something out. You found that "the file size was too big" so it can be corrected so it doesn't happen again hopefully. :-)

Erich56
Joined: 1 Jan 15
Posts: 1097
Credit: 7,394,557,676
RAC: 8,742,409
Message 60010 - Posted: 4 Mar 2023 | 6:31:15 UTC

this now is a topic also on this thread:
https://www.gpugrid.net/forum_thread.php?id=5379
which has been opened by the developer Quico

Magiceye04
Joined: 1 Apr 09
Posts: 24
Credit: 67,905,687
RAC: 0
Message 60178 - Posted: 25 Mar 2023 | 13:28:56 UTC

How can I get ATM?
The server status page tells me there are more than a hundred WUs ready to send at the moment.
BOINC Manager tells me:
Sa 25 Mär 2023 14:20:07 CET | GPUGRID | No tasks are available for ATM: Free energy calculations of protein-ligand binding

The PC is running Ubuntu 20 LTS with a GeForce 1070 Ti and driver 470.16.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1039
Credit: 40,148,957,483
RAC: 28,921,098
Message 60179 - Posted: 25 Mar 2023 | 13:38:05 UTC - in response to Message 60178.

you need to enable beta/test applications in your project preferences
____________

Magiceye04
Joined: 1 Apr 09
Posts: 24
Credit: 67,905,687
RAC: 0
Message 60180 - Posted: 25 Mar 2023 | 13:40:43 UTC
Last modified: 25 Mar 2023 | 14:06:37 UTC

Ah, thanks. I had missed the "test application" setting.

Now I have to wait some hours for the download to be finished.

ServicEnginIC
Joined: 24 Sep 10
Posts: 568
Credit: 6,937,279,524
RAC: 12,293,171
Message 60313 - Posted: 12 Apr 2023 | 22:11:05 UTC

So far, I have noticed abnormal progress reporting on ATM tasks.
Progress usually jumped from 0% to 0.199% in a short first step, and then directly to 100% in a second long step, staying so until task completion.
Along this second step, estimated time remaining was not shown ( "---" shown instead)

Example of wrong progress notification:
CDK2_29_26_5-QUICO_ATM_OFF_STEPS-2-5-RND5867_0

Today, I caught two ATM tasks showing a linear progression and accurate estimated time remaining.
At this moment, both of them are still in progress.

Examples of right progress notification:
Tyk2_jmc_23_jmc_27_2-QUICO_ATM_OFF12_STEPS-0-5-RND0292_2
Tyk2_jmc_23_ejm_55_5-QUICO_ATM_OFF12_STEPS-0-5-RND1896_3

Keith Myers
Joined: 13 Dec 17
Posts: 1301
Credit: 5,525,616,959
RAC: 8,503,032
Message 60314 - Posted: 13 Apr 2023 | 2:12:35 UTC

There is still a mix of old, broken progress tasks along with fixed progress tasks in rotation.

Just depends on whether you get a new _0 or an older _x wingman task.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1584
Credit: 6,209,931,851
RAC: 8,224,154
Message 60315 - Posted: 13 Apr 2023 | 7:04:17 UTC - in response to Message 60314.

No, it's not the replication number.

The clue is in the task name: one with "STEPS-0-5" will show normal progress, one with any other "STEPS-n-5" will jump quickly to 100%.

The old, very long running, tasks processed 341 samples all in one go. The new shorter ones have been split into five shorter runs, processing 70 samples each (confusingly numbered 0 to 4 - I've never seen a 'steps-5-5').

Number zero - the first in the chain - processes samples 1 to 70, which is what the progress display expects. The second processes samples 71 to 140 - so it starts beyond the finishing point. And so on.
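
To put numbers on that, here is a small sketch. It assumes the displayed progress is simply the absolute sample number divided by the 70 samples one task is expected to process, capped at 100%. That formula is my inference from the behaviour, not something taken from the application's code, and both function names are invented for the example.

```python
SAMPLES_PER_TASK = 70  # each of the five runs processes this many samples

def naive_progress(sample):
    """Progress as apparently shown: absolute sample number over 70, capped at 1."""
    return min(sample / SAMPLES_PER_TASK, 1.0)

def chunk_aware_progress(sample, chunk_index):
    """Progress measured from the first sample of this run (chunk 0 to 4)."""
    start = chunk_index * SAMPLES_PER_TASK
    done_in_chunk = max(sample - start, 0)
    return min(done_in_chunk / SAMPLES_PER_TASK, 1.0)
```

For the first run, sample 35 gives 0.5 either way. For the second run, sample 71 already gives `naive_progress(71) == 1.0`, while `chunk_aware_progress(105, 1)` gives the 0.5 you would actually expect halfway through.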

ServicEnginIC
Joined: 24 Sep 10
Posts: 568
Credit: 6,937,279,524
RAC: 12,293,171
Message 60317 - Posted: 13 Apr 2023 | 10:36:53 UTC - in response to Message 60315.

Nice explanation.
That fully explains the behavior.
