
All ATM tasks error out or have to be aborted

Greg _BE
Joined: 30 Jun 14
Posts: 126
Credit: 99,844,439
RAC: 174,283
Message 60478 - Posted: 22 May 2023 | 21:05:08 UTC

I can't seem to get any ATM tasks to run.
They either terminate on their own with a computation error, or I abort them after they reach "100%" and get stuck in an endless loop until I kill them.
For now I have suspended this project because it's pointless to run stuff that doesn't work on my system.

Here is a combination of the BOINC task properties (first section) and a run log (second section).

Name p38_m2z_maa_3-QUICO_ATM_Sage_xTB_14-4-5-RND6656_0

Application ATMbeta: Free energy calculations of protein-ligand binding 1.09 (cuda1121)
Workunit name p38_m2z_maa_3-QUICO_ATM_Sage_xTB_14-4-5-RND6656
State Running
Received 5/20/2023 10:22:37 PM
Report deadline 5/25/2023 10:22:36 PM
Estimated app speed 32,450.25 GFLOPs/sec
Estimated task size 1,000,000,000 GFLOPs
Resources 0.988 CPUs + 1 NVIDIA GPU (device 0)
CPU time at last checkpoint 00:00:00
CPU time 02:18:29
Elapsed time 00:51:04
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 7,218.47 MB
Working set size 2,101.48 MB
Directory slots/0
Process ID 17800

Debug State: 2 - Scheduler: 2

This runs on a GTX 1080 and one virtual core.
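
As a rough sanity check on the task properties above, the estimated task size and estimated app speed can be turned into a runtime figure and compared with the elapsed time. This is only a sketch: it assumes the estimate is simply task size divided by app speed, which may not be exactly how the client computes it, and the function name is made up for illustration.

```python
# Rough sanity check using the values from the task properties in this post.
# Assumption: estimated runtime ~= estimated task size / estimated app speed.

def estimated_runtime_seconds(task_size_gflops, app_speed_gflops_per_sec):
    """Return a naive runtime estimate in seconds (hypothetical helper)."""
    return task_size_gflops / app_speed_gflops_per_sec

# "Estimated task size 1,000,000,000 GFLOPs"
# "Estimated app speed 32,450.25 GFLOPs/sec"
est = estimated_runtime_seconds(1_000_000_000, 32_450.25)

# "Elapsed time 00:51:04"
elapsed = 51 * 60 + 4

print(f"naive estimate: {est / 3600:.1f} h")
print(f"elapsed so far: {elapsed / 3600:.2f} h")
print(f"elapsed / estimate: {elapsed / est:.1%}")
```

If that assumption is anywhere near right, a "Fraction done" of 100% after 51 minutes does not line up with the task's own size estimate.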



What does run.log show?
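
Since the log is long, here is a small sketch for summarizing it instead of reading it line by line. It counts the finished samples, averages their duration, and compares that with the "Additional number of samples" line. The regular expressions are written against the log format pasted in this post, the function name is hypothetical, and the remaining-time figure assumes later samples take about as long as the ones already finished and that the log covers a single start of the task.

```python
import re
import sys
from statistics import mean

# Patterns based on the run.log lines shown in this post (an assumption:
# other versions of the app may log differently).
# e.g. "... - INFO - sync_re - Finished: sample 211 (duration: 750.968 s)"
SAMPLE_RE = re.compile(r"Finished: sample (\d+) \(duration: ([\d.]+) s\)")
# e.g. "... - INFO - sync_re - Additional number of samples: 70"
EXTRA_RE = re.compile(r"Additional number of samples: (\d+)")


def summarize(log_text):
    """Summarize sample progress from run.log text (hypothetical helper)."""
    extra = None        # samples requested for this run
    sample_secs = {}    # sample number -> duration in seconds
    for line in log_text.splitlines():
        m = EXTRA_RE.search(line)
        if m:
            extra = int(m.group(1))
            continue
        # Per-replica "Finished" lines do not match: they have
        # ", replica N" between the sample number and "(duration".
        m = SAMPLE_RE.search(line)
        if m:
            sample_secs[int(m.group(1))] = float(m.group(2))

    done = len(sample_secs)
    avg = mean(sample_secs.values()) if sample_secs else None
    remaining = None
    if extra is not None and avg is not None:
        remaining = max(extra - done, 0) * avg
    return {
        "samples_requested": extra,
        "samples_done": done,
        "avg_sample_seconds": avg,
        "est_remaining_seconds": remaining,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_log.py run.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for key, value in summarize(fh.read()).items():
            print(f"{key}: {value}")
```

Comparing samples_done with samples_requested gives a progress figure that comes from the application's own log rather than from the "Fraction done" value in the BOINC Manager.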



Setup environment

D:\data\slots\0>set HOMEPATH=D:\data\slots\0

D:\data\slots\0>set PATH=D:\data\slots\0;D:\data\slots\0\Library\usr\bin;D:\data\slots\0\Library\bin;C:\Windows\system32;C:\Windows

D:\data\slots\0>set PYTHONPATH=D:\data\slots\0\Lib\python3.9\site-packages

D:\data\slots\0>set SYSTEMROOT=C:\Windows
Create a temporary directory

D:\data\slots\0>set TEMP=D:\data\slots\0\tmp

D:\data\slots\0>mkdir D:\data\slots\0\tmp
Install AToM

D:\data\slots\0>set REPO_URL=git+https://github.com/raimis/AToM-OpenMM.git@d7931b9a6217232d481731f7589d64b100a514ac

D:\data\slots\0>python.exe -m pip install git+https://github.com/raimis/AToM-OpenMM.git@d7931b9a6217232d481731f7589d64b100a514ac || exit 14
Collecting git+https://github.com/raimis/AToM-OpenMM.git@d7931b9a6217232d481731f7589d64b100a514ac
Cloning https://github.com/raimis/AToM-OpenMM.git (to revision d7931b9a6217232d481731f7589d64b100a514ac) to d:\data\slots\0\tmp\pip-req-build-n4bnfm46
Resolved https://github.com/raimis/AToM-OpenMM.git to commit d7931b9a6217232d481731f7589d64b100a514ac
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: async-re
Building wheel for async-re (setup.py): started
Building wheel for async-re (setup.py): finished with status 'done'
Created wheel for async-re: filename=async_re-3.3.0-py3-none-any.whl size=40735 sha256=b78cd7a2db0c0a4584d16c9a7967a2bdafe4f91754d0514cf1d4be7fd9e7038f
Stored in directory: c:\users\greg\appdata\local\pip\cache\wheels\e3\94\02\5d2f795e8088e5cda09e48b0a167d6325c316862c02fa11467
Successfully built async-re
Installing collected packages: async-re
Successfully installed async-re-3.3.0

D:\data\slots\0>python.exe -m pip list
Package Version
------------ -------
async-re 3.3.0
atmmetaforce 0.3
configobj 5.0.8
numpy 1.24.2
OpenMM 8.0.0
pip 23.0.1
setuptools 67.6.0
six 1.16.0
wheel 0.40.0
Configure AToM

D:\data\slots\0>echo localhost,0:0,1,CUDA,,D:\data\slots\0\tmp 1>nodefile
Extract restart

D:\data\slots\0>tar.exe xjvf restart.tar.bz2 || true
r0/p38_m2z_maa_ckpt.xml
r1/p38_m2z_maa_ckpt.xml
r10/p38_m2z_maa_ckpt.xml
r11/p38_m2z_maa_ckpt.xml
r12/p38_m2z_maa_ckpt.xml
r13/p38_m2z_maa_ckpt.xml
r14/p38_m2z_maa_ckpt.xml
r15/p38_m2z_maa_ckpt.xml
r16/p38_m2z_maa_ckpt.xml
r17/p38_m2z_maa_ckpt.xml
r18/p38_m2z_maa_ckpt.xml
r19/p38_m2z_maa_ckpt.xml
r2/p38_m2z_maa_ckpt.xml
r20/p38_m2z_maa_ckpt.xml
r21/p38_m2z_maa_ckpt.xml
r3/p38_m2z_maa_ckpt.xml
r4/p38_m2z_maa_ckpt.xml
r5/p38_m2z_maa_ckpt.xml
r6/p38_m2z_maa_ckpt.xml
r7/p38_m2z_maa_ckpt.xml
r8/p38_m2z_maa_ckpt.xml
r9/p38_m2z_maa_ckpt.xml
Run AToM

D:\data\slots\0>set CONFIG_FILE=p38_m2z_maa_asyncre.cntl

D:\data\slots\0>python.exe Scripts\rbfe_explicit_sync.py p38_m2z_maa_asyncre.cntl || exit 22
2023-05-21 12:44:43 - INFO - sync_re - Configuration:
2023-05-21 12:44:43 - INFO - sync_re - JOB_TRANSPORT: LOCAL_OPENMM
2023-05-21 12:44:43 - INFO - sync_re - BASENAME: p38_m2z_maa
2023-05-21 12:44:43 - INFO - sync_re - RE_SETUP: YES
2023-05-21 12:44:43 - INFO - sync_re - TEMPERATURES: 300
2023-05-21 12:44:43 - INFO - sync_re - LAMBDAS: 0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95, 1.00
2023-05-21 12:44:43 - INFO - sync_re - DIRECTION: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1
2023-05-21 12:44:43 - INFO - sync_re - INTERMEDIATE: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
2023-05-21 12:44:43 - INFO - sync_re - LAMBDA1: 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.10, 0.20, 0.30, 0.40, 0.50, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00
2023-05-21 12:44:43 - INFO - sync_re - LAMBDA2: 0.00, 0.10, 0.20, 0.30, 0.40, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.40, 0.30, 0.20, 0.10, 0.00
2023-05-21 12:44:43 - INFO - sync_re - ALPHA: 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10
2023-05-21 12:44:43 - INFO - sync_re - U0: 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110., 110.
2023-05-21 12:44:43 - INFO - sync_re - W0COEFF: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
2023-05-21 12:44:43 - INFO - sync_re - DISPLACEMENT: 22.0, 22.0, 22.0
2023-05-21 12:44:43 - INFO - sync_re - WALL_TIME: 9999
2023-05-21 12:44:43 - INFO - sync_re - CYCLE_TIME: 60
2023-05-21 12:44:43 - INFO - sync_re - CHECKPOINT_TIME: 300
2023-05-21 12:44:43 - INFO - sync_re - NODEFILE: nodefile
2023-05-21 12:44:43 - INFO - sync_re - SUBJOBS_BUFFER_SIZE: 0
2023-05-21 12:44:43 - INFO - sync_re - PRODUCTION_STEPS: 2000
2023-05-21 12:44:43 - INFO - sync_re - PRNT_FREQUENCY: 2000
2023-05-21 12:44:43 - INFO - sync_re - TRJ_FREQUENCY: 40000
2023-05-21 12:44:43 - INFO - sync_re - LIGAND1_ATOMS: ['5596', '5597', '5598', '5599', '5600', '5601', '5602', '5603', '5604', '5605', '5606', '5607', '5608', '5609', '5610', '5611', '5612', '5613', '5614', '5615', '5616', '5617', '5618', '5619', '5620', '5621', '5622', '5623', '5624', '5625', '5626', '5627', '5628', '5629', '5630', '5631', '5632', '5633', '5634', '5635', '5636', '5637', '5638', '5639', '5640']
2023-05-21 12:44:43 - INFO - sync_re - LIGAND2_ATOMS: ['5641', '5642', '5643', '5644', '5645', '5646', '5647', '5648', '5649', '5650', '5651', '5652', '5653', '5654', '5655', '5656', '5657', '5658', '5659', '5660', '5661', '5662', '5663', '5664', '5665', '5666', '5667', '5668', '5669', '5670', '5671', '5672', '5673', '5674', '5675', '5676', '5677', '5678', '5679', '5680', '5681', '5682', '5683']
2023-05-21 12:44:43 - INFO - sync_re - LIGAND1_CM_ATOMS: 5601
2023-05-21 12:44:43 - INFO - sync_re - LIGAND2_CM_ATOMS: 5646
2023-05-21 12:44:43 - INFO - sync_re - RCPT_CM_ATOMS: ['460', '483', '494', '501', '550', '745', '755', '771', '1178', '1337', '1363', '1654', '1673', '1689', '1703', '1720', '1739', '1756', '1763', '1773', '2532', '2685']
2023-05-21 12:44:43 - INFO - sync_re - CM_KF: 25.00
2023-05-21 12:44:43 - INFO - sync_re - CM_TOL: 10
2023-05-21 12:44:43 - INFO - sync_re - POS_RESTRAINED_ATOMS: ['4', '19', '51', '57', '71', '91', '112', '136', '153', '168', '187', '201', '223', '237', '256', '280', '295', '319', '325', '340', '364', '385', '402', '416', '435', '454', '460', '476', '483', '494', '501', '511', '532', '539', '550', '566', '577', '587', '597', '617', '629', '643', '665', '679', '686', '705', '729', '745', '755', '771', '793', '815', '834', '845', '877', '883', '903', '920', '931', '950', '969', '986', '996', '1018', '1042', '1056', '1077', '1101', '1116', '1135', '1159', '1178', '1197', '1219', '1236', '1253', '1275', '1292', '1307', '1321', '1337', '1356', '1363', '1382', '1401', '1413', '1429', '1449', '1471', '1477', '1487', '1511', '1522', '1541', '1556', '1571', '1591', '1605', '1617', '1633', '1654', '1673', '1689', '1703', '1720', '1739', '1756', '1763', '1773', '1785', '1804', '1818', '1832', '1851', '1867', '1889', '1900', '1917', '1939', '1958', '1972', '1984', '1996', '2013', '2029', '2046', '2066', '2085', '2104', '2125', '2142', '2161', '2180', '2204', '2211', '2230', '2252', '2273', '2292', '2309', '2320', '2330', '2342', '2361', '2380', '2397', '2421', '2433', '2452', '2482', '2488', '2499', '2513', '2532', '2542', '2558', '2572', '2587', '2599', '2610', '2625', '2644', '2666', '2685', '2704', '2716', '2736', '2743', '2765', '2782', '2796', '2808', '2820', '2835', '2852', '2866', '2873', '2894', '2910', '2920', '2934', '2958', '2982', '3003', '3027', '3045', '3051', '3066', '3085', '3102', '3121', '3135', '3159', '3176', '3193', '3214', '3228', '3245', '3259', '3275', '3287', '3306', '3330', '3341', '3357', '3364', '3375', '3394', '3411', '3421', '3436', '3455', '3474', '3488', '3495', '3519', '3533', '3552', '3580', '3586', '3593', '3607', '3619', '3636', '3655', '3667', '3684', '3703', '3725', '3744', '3763', '3782', '3806', '3825', '3841', '3848', '3870', '3876', '3883', '3893', '3908', '3927', '3946', '3968', '3990', '4009', '4020', '4031', '4046', '4057', '4067', 
'4091', '4105', '4126', '4145', '4162', '4173', '4192', '4206', '4223', '4248', '4254', '4276', '4293', '4307', '4327', '4337', '4351', '4367', '4387', '4406', '4413', '4423', '4445', '4451', '4470', '4480', '4496', '4508', '4527', '4546', '4561', '4583', '4600', '4619', '4635', '4654', '4666', '4677', '4689', '4711', '4735', '4754', '4768', '4778', '4788', '4805', '4815', '4834', '4844', '4861', '4871', '4892', '4912', '4922', '4939', '4960', '4977', '4997', '5003', '5015', '5027', '5050', '5056', '5072', '5082', '5102', '5108', '5129', '5141', '5158', '5169', '5189', '5204', '5215', '5239', '5251', '5270', '5289', '5308', '5320', '5335', '5359', '5381', '5392', '5411', '5425', '5446', '5458', '5473', '5489', '5508', '5519', '5539', '5563', '5577', '5591']
2023-05-21 12:44:43 - INFO - sync_re - POSRE_FORCE_CONSTANT: 25.0
2023-05-21 12:44:43 - INFO - sync_re - POSRE_TOLERANCE: 1.5
2023-05-21 12:44:43 - INFO - sync_re - ALIGN_LIGAND1_REF_ATOMS: ['5', '1', '20']
2023-05-21 12:44:43 - INFO - sync_re - ALIGN_LIGAND2_REF_ATOMS: ['5', '1', '20']
2023-05-21 12:44:43 - INFO - sync_re - ALIGN_KF_SEP: 2.5
2023-05-21 12:44:43 - INFO - sync_re - ALIGN_K_THETA: 25.0
2023-05-21 12:44:43 - INFO - sync_re - ALIGN_K_PSI: 25.0
2023-05-21 12:44:43 - INFO - sync_re - UMAX: 200.00
2023-05-21 12:44:43 - INFO - sync_re - ACORE: 0.062500
2023-05-21 12:44:43 - INFO - sync_re - UBCORE: 100.0
2023-05-21 12:44:43 - INFO - sync_re - FRICTION_COEFF: 0.100000
2023-05-21 12:44:43 - INFO - sync_re - TIME_STEP: 0.004
2023-05-21 12:44:43 - INFO - sync_re - OPENMM_PLATFORM: CUDA
2023-05-21 12:44:43 - INFO - sync_re - VERBOSE: no
2023-05-21 12:44:43 - INFO - sync_re - HMASS: 1.5
2023-05-21 12:44:43 - INFO - sync_re - MAX_SAMPLES: +70
2023-05-21 12:44:43 - INFO - sync_re - State parameters
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.0, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.0, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.05, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.1, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.1, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.2, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.15, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.3, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.2, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.4, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.25, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.3, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.1, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.35, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.2, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.4, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.3, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.45, 'atmdirection': 1.0, 'atmintermediate': 0.0, 'lambda1': 0.4, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.5, 'atmdirection': 1.0, 'atmintermediate': 1.0, 'lambda1': 0.5, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=1.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.5, 'atmdirection': -1.0, 'atmintermediate': 1.0, 'lambda1': 0.5, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=1.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.55, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.4, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.6, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.3, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.65, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.2, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.7, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.1, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.75, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.5, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.8, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.4, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.85, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.3, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.9, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.2, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 0.95, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.1, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - State: {'lambda': 1.0, 'atmdirection': -1.0, 'atmintermediate': 0.0, 'lambda1': 0.0, 'lambda2': 0.0, 'alpha': Quantity(value=0.1, unit=mole/kilocalorie), 'u0': Quantity(value=110.0, unit=kilocalorie/mole), 'w0': Quantity(value=0.0, unit=kilocalorie/mole), 'temperature': Quantity(value=300.0, unit=kelvin)}
2023-05-21 12:44:43 - INFO - sync_re - Started: ATM setup
2023-05-21 12:44:43 - INFO - sync_re - Started: create system
warning: AddRestraintForce() is deprecated. Use addVsiteRestraintForceCMCM()
warning: AddRestraintForce() is deprecated. Use addVsiteRestraintForceCMCM()
2023-05-21 12:44:46 - INFO - sync_re - Running with a 4.000000 fs time-step with bonded forces integrated 4 times per time-step
2023-05-21 12:44:46 - INFO - sync_re - Finished: create system (duration: 3.515999999999849 s)
2023-05-21 12:44:46 - INFO - sync_re - Started: create worker
2023-05-21 12:44:46 - INFO - sync_re - Device: CUDA 0
2023-05-21 12:45:24 - INFO - sync_re - Finished: create worker (duration: 37.702999999999975 s)
2023-05-21 12:45:24 - INFO - sync_re - Started: create replicas
2023-05-21 12:45:24 - INFO - sync_re - Loading checkpointfile r0/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:27 - INFO - sync_re - Loading checkpointfile r1/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:29 - INFO - sync_re - Loading checkpointfile r2/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:32 - INFO - sync_re - Loading checkpointfile r3/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:34 - INFO - sync_re - Loading checkpointfile r4/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:37 - INFO - sync_re - Loading checkpointfile r5/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:39 - INFO - sync_re - Loading checkpointfile r6/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:42 - INFO - sync_re - Loading checkpointfile r7/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:44 - INFO - sync_re - Loading checkpointfile r8/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:47 - INFO - sync_re - Loading checkpointfile r9/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:49 - INFO - sync_re - Loading checkpointfile r10/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:52 - INFO - sync_re - Loading checkpointfile r11/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:54 - INFO - sync_re - Loading checkpointfile r12/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:57 - INFO - sync_re - Loading checkpointfile r13/p38_m2z_maa_ckpt.xml
2023-05-21 12:45:59 - INFO - sync_re - Loading checkpointfile r14/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:02 - INFO - sync_re - Loading checkpointfile r15/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:04 - INFO - sync_re - Loading checkpointfile r16/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:07 - INFO - sync_re - Loading checkpointfile r17/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:09 - INFO - sync_re - Loading checkpointfile r18/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:12 - INFO - sync_re - Loading checkpointfile r19/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:14 - INFO - sync_re - Loading checkpointfile r20/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:17 - INFO - sync_re - Loading checkpointfile r21/p38_m2z_maa_ckpt.xml
2023-05-21 12:46:19 - INFO - sync_re - Replica 0: cycle 211, state 5
2023-05-21 12:46:19 - INFO - sync_re - Replica 1: cycle 211, state 2
2023-05-21 12:46:19 - INFO - sync_re - Replica 2: cycle 211, state 3
2023-05-21 12:46:19 - INFO - sync_re - Replica 3: cycle 211, state 11
2023-05-21 12:46:19 - INFO - sync_re - Replica 4: cycle 211, state 1
2023-05-21 12:46:19 - INFO - sync_re - Replica 5: cycle 211, state 6
2023-05-21 12:46:19 - INFO - sync_re - Replica 6: cycle 211, state 1
2023-05-21 12:46:19 - INFO - sync_re - Replica 7: cycle 211, state 10
2023-05-21 12:46:19 - INFO - sync_re - Replica 8: cycle 211, state 4
2023-05-21 12:46:19 - INFO - sync_re - Replica 9: cycle 211, state 8
2023-05-21 12:46:19 - INFO - sync_re - Replica 10: cycle 211, state 7
2023-05-21 12:46:19 - INFO - sync_re - Replica 11: cycle 211, state 20
2023-05-21 12:46:19 - INFO - sync_re - Replica 12: cycle 211, state 12
2023-05-21 12:46:19 - INFO - sync_re - Replica 13: cycle 211, state 14
2023-05-21 12:46:19 - INFO - sync_re - Replica 14: cycle 211, state 9
2023-05-21 12:46:19 - INFO - sync_re - Replica 15: cycle 211, state 13
2023-05-21 12:46:19 - INFO - sync_re - Replica 16: cycle 211, state 19
2023-05-21 12:46:19 - INFO - sync_re - Replica 17: cycle 211, state 15
2023-05-21 12:46:19 - INFO - sync_re - Replica 18: cycle 211, state 17
2023-05-21 12:46:19 - INFO - sync_re - Replica 19: cycle 211, state 16
2023-05-21 12:46:19 - INFO - sync_re - Replica 20: cycle 211, state 18
2023-05-21 12:46:19 - INFO - sync_re - Replica 21: cycle 211, state 21
2023-05-21 12:46:19 - INFO - sync_re - Finished: create replicas (duration: 55.406000000000176 s)
2023-05-21 12:46:19 - INFO - sync_re - Started: update replicas
2023-05-21 12:46:28 - INFO - sync_re - Finished: update replicas (duration: 9.1099999999999 s)
2023-05-21 12:46:28 - INFO - sync_re - Finished: ATM setup (duration: 105.7349999999999 s)
2023-05-21 12:46:28 - INFO - sync_re - Started: ATM simulations
2023-05-21 12:46:28 - INFO - sync_re - Additional number of samples: 70
2023-05-21 12:46:28 - INFO - sync_re - Started: sample 211
2023-05-21 12:46:28 - INFO - sync_re - Started: sample 211, replica 0
2023-05-21 12:47:05 - INFO - sync_re - Finished: sample 211, replica 0 (duration: 36.125 s)
2023-05-21 12:47:05 - INFO - sync_re - Started: sample 211, replica 1
2023-05-21 12:47:35 - INFO - sync_re - Finished: sample 211, replica 1 (duration: 30.8900000000001 s)
2023-05-21 12:47:35 - INFO - sync_re - Started: sample 211, replica 2
2023-05-21 12:48:07 - INFO - sync_re - Finished: sample 211, replica 2 (duration: 31.9849999999999 s)
2023-05-21 12:48:07 - INFO - sync_re - Started: sample 211, replica 3
2023-05-21 12:48:38 - INFO - sync_re - Finished: sample 211, replica 3 (duration: 30.54600000000005 s)
2023-05-21 12:48:38 - INFO - sync_re - Started: sample 211, replica 4
2023-05-21 12:49:09 - INFO - sync_re - Finished: sample 211, replica 4 (duration: 31.297000000000025 s)
2023-05-21 12:49:09 - INFO - sync_re - Started: sample 211, replica 5
2023-05-21 12:49:41 - INFO - sync_re - Finished: sample 211, replica 5 (duration: 31.375 s)
2023-05-21 12:49:41 - INFO - sync_re - Started: sample 211, replica 6
2023-05-21 12:50:11 - INFO - sync_re - Finished: sample 211, replica 6 (duration: 30.672000000000025 s)
2023-05-21 12:50:11 - INFO - sync_re - Started: sample 211, replica 7
2023-05-21 12:50:43 - INFO - sync_re - Finished: sample 211, replica 7 (duration: 31.6099999999999 s)
2023-05-21 12:50:43 - INFO - sync_re - Started: sample 211, replica 8
2023-05-21 12:51:14 - INFO - sync_re - Finished: sample 211, replica 8 (duration: 30.843000000000075 s)
2023-05-21 12:51:14 - INFO - sync_re - Started: sample 211, replica 9
2023-05-21 12:51:42 - INFO - sync_re - Finished: sample 211, replica 9 (duration: 28.672000000000025 s)
2023-05-21 12:51:42 - INFO - sync_re - Started: sample 211, replica 10
2023-05-21 12:52:13 - INFO - sync_re - Finished: sample 211, replica 10 (duration: 30.5 s)
2023-05-21 12:52:13 - INFO - sync_re - Started: sample 211, replica 11
2023-05-21 12:52:43 - INFO - sync_re - Finished: sample 211, replica 11 (duration: 30.312999999999874 s)
2023-05-21 12:52:43 - INFO - sync_re - Started: sample 211, replica 12
2023-05-21 12:53:15 - INFO - sync_re - Finished: sample 211, replica 12 (duration: 31.797000000000025 s)
2023-05-21 12:53:15 - INFO - sync_re - Started: sample 211, replica 13
2023-05-21 12:53:48 - INFO - sync_re - Finished: sample 211, replica 13 (duration: 32.5 s)
2023-05-21 12:53:48 - INFO - sync_re - Started: sample 211, replica 14
2023-05-21 12:54:20 - INFO - sync_re - Finished: sample 211, replica 14 (duration: 32.53099999999995 s)
2023-05-21 12:54:20 - INFO - sync_re - Started: sample 211, replica 15
2023-05-21 12:54:52 - INFO - sync_re - Finished: sample 211, replica 15 (duration: 32.34400000000005 s)
2023-05-21 12:54:52 - INFO - sync_re - Started: sample 211, replica 16
2023-05-21 12:55:24 - INFO - sync_re - Finished: sample 211, replica 16 (duration: 31.75 s)
2023-05-21 12:55:24 - INFO - sync_re - Started: sample 211, replica 17
2023-05-21 12:55:56 - INFO - sync_re - Finished: sample 211, replica 17 (duration: 31.875 s)
2023-05-21 12:55:56 - INFO - sync_re - Started: sample 211, replica 18
2023-05-21 12:56:29 - INFO - sync_re - Finished: sample 211, replica 18 (duration: 32.4849999999999 s)
2023-05-21 12:56:29 - INFO - sync_re - Started: sample 211, replica 19
2023-05-21 12:56:59 - INFO - sync_re - Finished: sample 211, replica 19 (duration: 30.077999999999975 s)
2023-05-21 12:56:59 - INFO - sync_re - Started: sample 211, replica 20
2023-05-21 12:57:30 - INFO - sync_re - Finished: sample 211, replica 20 (duration: 31.2650000000001 s)
2023-05-21 12:57:30 - INFO - sync_re - Started: sample 211, replica 21
2023-05-21 12:58:00 - INFO - sync_re - Finished: sample 211, replica 21 (duration: 30.593999999999824 s)
2023-05-21 12:58:00 - INFO - sync_re - Started: exchange replicas
2023-05-21 12:58:00 - INFO - sync_re - Replica 18: 17 --> 18
2023-05-21 12:58:00 - INFO - sync_re - Replica 20: 18 --> 17
2023-05-21 12:58:00 - INFO - sync_re - Finished: exchange replicas (duration: 0.047000000000025466 s)
2023-05-21 12:58:00 - INFO - sync_re - Started: update replicas
2023-05-21 12:58:09 - INFO - sync_re - Finished: update replicas (duration: 8.812000000000126 s)
2023-05-21 12:58:09 - INFO - sync_re - Started: write replicas samples and trajectories
2023-05-21 12:58:09 - INFO - sync_re - Finished: write replicas samples and trajectories (duration: 0.015999999999849024 s)
2023-05-21 12:58:09 - INFO - sync_re - Started: checkpointing
2023-05-21 12:58:59 - INFO - sync_re - Finished: checkpointing (duration: 50.031000000000176 s)
2023-05-21 12:58:59 - INFO - sync_re - Finished: sample 211 (duration: 750.9680000000001 s)
2023-05-21 12:58:59 - INFO - sync_re - Started: sample 212
2023-05-21 12:58:59 - INFO - sync_re - Started: sample 212, replica 0
2023-05-21 12:59:30 - INFO - sync_re - Finished: sample 212, replica 0 (duration: 30.687999999999874 s)
2023-05-21 12:59:30 - INFO - sync_re - Started: sample 212, replica 1
2023-05-21 13:00:00 - INFO - sync_re - Finished: sample 212, replica 1 (duration: 30.25 s)
2023-05-21 13:00:00 - INFO - sync_re - Started: sample 212, replica 2
2023-05-21 13:00:31 - INFO - sync_re - Finished: sample 212, replica 2 (duration: 30.797000000000025 s)
2023-05-21 13:00:31 - INFO - sync_re - Started: sample 212, replica 3
2023-05-21 13:01:02 - INFO - sync_re - Finished: sample 212, replica 3 (duration: 30.467999999999847 s)
2023-05-21 13:01:02 - INFO - sync_re - Started: sample 212, replica 4
2023-05-21 13:01:31 - INFO - sync_re - Finished: sample 212, replica 4 (duration: 29.71900000000005 s)
2023-05-21 13:01:31 - INFO - sync_re - Started: sample 212, replica 5
2023-05-21 13:02:02 - INFO - sync_re - Finished: sample 212, replica 5 (duration: 30.90599999999995 s)
2023-05-21 13:02:02 - INFO - sync_re - Started: sample 212, replica 6
2023-05-21 13:02:32 - INFO - sync_re - Finished: sample 212, replica 6 (duration: 29.96900000000005 s)
2023-05-21 13:02:32 - INFO - sync_re - Started: sample 212, replica 7
2023-05-21 13:03:03 - INFO - sync_re - Finished: sample 212, replica 7 (duration: 30.391000000000076 s)
2023-05-21 13:03:03 - INFO - sync_re - Started: sample 212, replica 8
2023-05-21 13:03:33 - INFO - sync_re - Finished: sample 212, replica 8 (duration: 30.34400000000005 s)
2023-05-21 13:03:33 - INFO - sync_re - Started: sample 212, replica 9
2023-05-21 13:04:04 - INFO - sync_re - Finished: sample 212, replica 9 (duration: 30.79599999999982 s)
2023-05-21 13:04:04 - INFO - sync_re - Started: sample 212, replica 10
2023-05-21 13:04:34 - INFO - sync_re - Finished: sample 212, replica 10 (duration: 30.375 s)
2023-05-21 13:04:34 - INFO - sync_re - Started: sample 212, replica 11
2023-05-21 13:05:04 - INFO - sync_re - Finished: sample 212, replica 11 (duration: 30.063000000000102 s)
2023-05-21 13:05:04 - INFO - sync_re - Started: sample 212, replica 12
2023-05-21 13:05:34 - INFO - sync_re - Finished: sample 212, replica 12 (duration: 29.827999999999975 s)
2023-05-21 13:05:34 - INFO - sync_re - Started: sample 212, replica 13
2023-05-21 13:06:05 - INFO - sync_re - Finished: sample 212, replica 13 (duration: 30.952999999999975 s)
2023-05-21 13:06:05 - INFO - sync_re - Started: sample 212, replica 14
2023-05-21 13:06:35 - INFO - sync_re - Finished: sample 212, replica 14 (duration: 30.264999999999873 s)
2023-05-21 13:06:35 - INFO - sync_re - Started: sample 212, replica 15
2023-05-21 13:07:04 - INFO - sync_re - Finished: sample 212, replica 15 (duration: 28.563000000000102 s)
2023-05-21 13:07:04 - INFO - sync_re - Started: sample 212, replica 16
2023-05-21 13:07:16 - INFO - sync_re - Finished: sample 212, replica 16 (duration: 12.0 s)
2023-05-21 13:07:16 - INFO - sync_re - Started: sample 212, replica 17
2023-05-21 13:07:27 - INFO - sync_re - Finished: sample 212, replica 17 (duration: 11.530999999999949 s)
2023-05-21 13:07:27 - INFO - sync_re - Started: sample 212, replica 18
2023-05-21 13:07:39 - INFO - sync_re - Finished: sample 212, replica 18 (duration: 11.938000000000102 s)
2023-05-21 13:07:39 - INFO - sync_re - Started: sample 212, replica 19
2023-05-21 13:07:51 - INFO - sync_re - Finished: sample 212, replica 19 (duration: 11.5 s)
2023-05-21 13:07:51 - INFO - sync_re - Started: sample 212, replica 20
2023-05-21 13:08:03 - INFO - sync_re - Finished: sample 212, replica 20 (duration: 11.967999999999847 s)
2023-05-21 13:08:03 - INFO - sync_re - Started: sample 212, replica 21
2023-05-21 13:08:14 - INFO - sync_re - Finished: sample 212, replica 21 (duration: 11.672000000000025 s)
2023-05-21 13:08:14 - INFO - sync_re - Started: exchange replicas
2023-05-21 13:08:14 - INFO - sync_re - Replica 4: 1 --> 1
2023-05-21 13:08:14 - INFO - sync_re - Replica 6: 1 --> 1
2023-05-21 13:08:14 - INFO - sync_re - Replica 7: 10 --> 11
2023-05-21 13:08:14 - INFO - sync_re - Replica 3: 11 --> 10
2023-05-21 13:08:14 - INFO - sync_re - Finished: exchange replicas (duration: 0.06300000000010186 s)
2023-05-21 13:08:14 - INFO - sync_re - Started: update replicas
2023-05-21 13:08:23 - INFO - sync_re - Finished: update replicas (duration: 8.827999999999975 s)
2023-05-21 13:08:23 - INFO - sync_re - Started: write replicas samples and trajectories
2023-05-21 13:08:23 - INFO - sync_re - Finished: write replicas samples and trajectories (duration: 0.0 s)
2023-05-21 13:08:23 - INFO - sync_re - Started: checkpointing
2023-05-21 13:09:12 - INFO - sync_re - Finished: checkpointing (duration: 49.266000000000076 s)
2023-05-21 13:09:13 - INFO - sync_re - Finished: sample 212 (duration: 613.1569999999999 s)
2023-05-21 13:09:13 - INFO - sync_re - Started: sample 213
2023-05-21 13:09:13 - INFO - sync_re - Started: sample 213, replica 0
2023-05-21 13:09:59 - INFO - sync_re - Finished: sample 213, replica 0 (duration: 46.4369999999999 s)
2023-05-21 13:09:59 - INFO - sync_re - Started: sample 213, replica 1
2023-05-21 13:10:45 - INFO - sync_re - Finished: sample 213, replica 1 (duration: 45.733999999999924 s)
2023-05-21 13:10:45 - INFO - sync_re - Started: sample 213, replica 2
2023-05-21 13:11:33 - INFO - sync_re - Finished: sample 213, replica 2 (duration: 48.28200000000015 s)
2023-05-21 13:11:33 - INFO - sync_re - Started: sample 213, replica 3
2023-05-21 13:12:24 - INFO - sync_re - Finished: sample 213, replica 3 (duration: 51.3119999999999 s)
2023-05-21 13:12:24 - INFO - sync_re - Started: sample 213, replica 4
2023-05-21 13:13:16 - INFO - sync_re - Finished: sample 213, replica 4 (duration: 51.922000000000025 s)
2023-05-21 13:13:16 - INFO - sync_re - Started: sample 213, replica 5
2023-05-21 13:14:08 - INFO - sync_re - Finished: sample 213, replica 5 (duration: 51.375 s)
2023-05-21 13:14:08 - INFO - sync_re - Started: sample 213, replica 6
2023-05-21 13:14:59 - INFO - sync_re - Finished: sample 213, replica 6 (duration: 51.53099999999995 s)
2023-05-21 13:14:59 - INFO - sync_re - Started: sample 213, replica 7
2023-05-21 13:15:51 - INFO - sync_re - Finished: sample 213, replica 7 (duration: 51.8130000000001 s)
2023-05-21 13:15:51 - INFO - sync_re - Started: sample 213, replica 8
2023-05-21 13:16:42 - INFO - sync_re - Finished: sample 213, replica 8 (duration: 51.0619999999999 s)
2023-05-21 13:16:42 - INFO - sync_re - Started: sample 213, replica 9
2023-05-21 13:17:34 - INFO - sync_re - Finished: sample 213, replica 9 (duration: 51.98500000000013 s)
2023-05-21 13:17:34 - INFO - sync_re - Started: sample 213, replica 10
2023-05-21 13:18:26 - INFO - sync_re - Finished: sample 213, replica 10 (duration: 52.0 s)
2023-05-21 13:18:26 - INFO - sync_re - Started: sample 213, replica 11
2023-05-21 13:19:18 - INFO - sync_re - Finished: sample 213, replica 11 (duration: 52.28099999999995 s)
2023-05-21 13:19:18 - INFO - sync_re - Started: sample 213, replica 12
2023-05-21 13:20:10 - INFO - sync_re - Finished: sample 213, replica 12 (duration: 51.266000000000076 s)
2023-05-21 13:20:10 - INFO - sync_re - Started: sample 213, replica 13
2023-05-21 13:21:03 - INFO - sync_re - Finished: sample 213, replica 13 (duration: 53.233999999999924 s)
2023-05-21 13:21:03 - INFO - sync_re - Started: sample 213, replica 14
2023-05-21 13:21:55 - INFO - sync_re - Finished: sample 213, replica 14 (duration: 52.03099999999995 s)
2023-05-21 13:21:55 - INFO - sync_re - Started: sample 213, replica 15
2023-05-21 13:22:49 - INFO - sync_re - Finished: sample 213, replica 15 (duration: 53.75 s)
2023-05-21 13:22:49 - INFO - sync_re - Started: sample 213, replica 16
2023-05-21 13:23:42 - INFO - sync_re - Finished: sample 213, replica 16 (duration: 53.077999999999975 s)
2023-05-21 13:23:42 - INFO - sync_re - Started: sample 213, replica 17
2023-05-21 13:24:34 - INFO - sync_re - Finished: sample 213, replica 17 (duration: 52.327999999999975 s)
2023-05-21 13:24:34 - INFO - sync_re - Started: sample 213, replica 18
2023-05-21 13:25:27 - INFO - sync_re - Finished: sample 213, replica 18 (duration: 52.82900000000018 s)
2023-05-21 13:25:27 - INFO - sync_re - Started: sample 213, replica 19
2023-05-21 13:26:20 - INFO - sync_re - Finished: sample 213, replica 19 (duration: 52.92099999999982 s)
2023-05-21 13:26:20 - INFO - sync_re - Started: sample 213, replica 20
2023-05-21 13:27:13 - INFO - sync_re - Finished: sample 213, replica 20 (duration: 53.40700000000015 s)
2023-05-21 13:27:13 - INFO - sync_re - Started: sample 213, replica 21
2023-05-21 13:28:05 - INFO - sync_re - Finished: sample 213, replica 21 (duration: 52.28099999999995 s)
2023-05-21 13:28:05 - INFO - sync_re - Started: exchange replicas
2023-05-21 13:28:05 - INFO - sync_re - Finished: exchange replicas (duration: 0.047000000000025466 s)
2023-05-21 13:28:05 - INFO - sync_re - Started: update replicas
2023-05-21 13:28:14 - INFO - sync_re - Finished: update replicas (duration: 9.030999999999949 s)
2023-05-21 13:28:14 - INFO - sync_re - Started: write replicas samples and trajectories
2023-05-21 13:28:14 - INFO - sync_re - Finished: write replicas samples and trajectories (duration: 0.0 s)
2023-05-21 13:28:14 - INFO - sync_re - Started: checkpointing
2023-05-21 13:29:04 - INFO - sync_re - Finished: checkpointing (duration: 50.016000000000076 s)
2023-05-21 13:29:04 - INFO - sync_re - Finished: sample 213 (duration: 1191.953 s)
2023-05-21 13:29:04 - INFO - sync_re - Started: sample 214
2023-05-21 13:29:04 - INFO - sync_re - Started: sample 214, replica 0
2023-05-21 13:29:58 - INFO - sync_re - Finished: sample 214, replica 0 (duration: 53.28099999999995 s)
2023-05-21 13:29:58 - INFO - sync_re - Started: sample 214, replica 1
2023-05-21 13:30:50 - INFO - sync_re - Finished: sample 214, replica 1 (duration: 52.483999999999924 s)
2023-05-21 13:30:50 - INFO - sync_re - Started: sample 214, replica 2
2023-05-21 13:31:43 - INFO - sync_re - Finished: sample 214, replica 2 (duration: 53.172000000000025 s)
2023-05-21 13:31:43 - INFO - sync_re - Started: sample 214, replica 3
2023-05-21 13:32:37 - INFO - sync_re - Finished: sample 214, replica 3 (duration: 53.141000000000076 s)
2023-05-21 13:32:37 - INFO - sync_re - Started: sample 214, replica 4
2023-05-21 13:33:30 - INFO - sync_re - Finished: sample 214, replica 4 (duration: 53.63999999999987 s)
2023-05-21 13:33:30 - INFO - sync_re - Started: sample 214, replica 5
2023-05-21 13:34:22 - INFO - sync_re - Finished: sample 214, replica 5 (duration: 51.98500000000013 s)
2023-05-21 13:34:22 - INFO - sync_re - Started: sample 214, replica 6
2023-05-21 13:35:07 - INFO - sync_re - Finished: sample 214, replica 6 (duration: 44.85900000000038 s)
2023-05-21 13:35:07 - INFO - sync_re - Started: sample 214, replica 7
2023-05-21 13:35:51 - INFO - sync_re - Finished: sample 214, replica 7 (duration: 44.35999999999967 s)
2023-05-21 13:35:51 - INFO - sync_re - Started: sample 214, replica 8
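
The per-sample totals in a log like this can be tallied with a short script. What follows is a minimal Python sketch, assuming the log keeps the line format shown above (`Finished: sample N (duration: X s)`); the function name `sample_durations` is hypothetical and is not part of AToM-OpenMM or BOINC.

```python
import re

# Matches per-sample summary lines such as:
#   "... - INFO - sync_re - Finished: sample 212 (duration: 613.1569999999999 s)"
# Per-replica lines ("sample 212, replica 9 (duration: ...)") do not match,
# because a comma follows the sample number instead of " (duration".
SAMPLE_DONE = re.compile(r"Finished: sample (\d+) \(duration: ([\d.]+) s\)")


def sample_durations(log_text):
    """Return {sample_number: duration_in_seconds} for completed samples."""
    durations = {}
    for line in log_text.splitlines():
        match = SAMPLE_DONE.search(line)
        if match:
            durations[int(match.group(1))] = float(match.group(2))
    return durations


if __name__ == "__main__":
    # Three lines copied from the log excerpt above.
    excerpt = """\
2023-05-21 13:08:14 - INFO - sync_re - Finished: sample 212, replica 21 (duration: 11.672000000000025 s)
2023-05-21 13:09:13 - INFO - sync_re - Finished: sample 212 (duration: 613.1569999999999 s)
2023-05-21 13:29:04 - INFO - sync_re - Finished: sample 213 (duration: 1191.953 s)
"""
    for sample, seconds in sorted(sample_durations(excerpt).items()):
        print(f"sample {sample}: {seconds / 60:.1f} min")
```

In the excerpt above, sample 212 is logged at about 613 s and sample 213 at about 1192 s, so the time per sample is not constant on this host.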

The common message in the stderr file contains this:

Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
Detected memory leaks!
Dumping objects ->
..\api\boinc_api.cpp(309) : {3078512} normal block at 0x000002153E930FD0, 8 bytes long.
Data: < 6@ > 00 00 36 40 15 02 00 00
..\lib\diagnostics_win.cpp(417) : {3077232} normal block at 0x000002153E919030, 1080 bytes long.
Data: <8> h > 38 3E 00 00 CD CD CD CD 68 01 00 00 00 00 00 00
..\zip\boinc_zip.cpp(122) : {278} normal block at 0x000002153E92E7F0, 260 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Object dump complete.

This deprecated item seems to be the cause.
I have no memory leaks; that's just a stupid line.

I run PrimeGrid, LHC, SiDock and others just fine.

Does anyone know what the cause of this deprecated whatever is?
I saw something on GitHub but didn't really understand it.
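
For context, the first stderr line is a Python deprecation warning: newer OpenMM releases expose the package as `openmm`, while the old `simtk.openmm` path still imports but prints this notice. A warning of this kind does not by itself stop a run. Below is a minimal sketch of how a script can prefer the new module name and fall back to the old one. The helper `import_first_available` is hypothetical (it is not taken from AToM-OpenMM) and uses `importlib` so that it runs even where OpenMM is not installed.

```python
import importlib


def import_first_available(names):
    """Import and return the first module in `names` that is installed."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"none of these modules could be imported: {names}")


if __name__ == "__main__":
    try:
        # Prefer the new package name; fall back to the deprecated one.
        mm = import_first_available(["openmm", "simtk.openmm"])
        print("using", mm.__name__)
    except ImportError as err:
        print(err)
```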

Keith Myers
Joined: 13 Dec 17
Posts: 1274
Credit: 4,777,831,959
RAC: 4,780,403
Message 60479 - Posted: 22 May 2023 | 23:05:54 UTC

You do realize that most of the tasks with naming other than 0-5 jump right to 100% completion from the start?

That is normal. You should let them run until they finish. Those tasks take around 10K seconds on my 2080 Ti.

They are not stalled. If you view your disk activity you should see activity spiking every few minutes as the task writes out its progress in the slot files.
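
One way to confirm that a task is still writing progress, as described above, is to check how recently a file in the slot directory was modified. The following is a minimal Python sketch; the function name `seconds_since_modified`, the example path, and the 15-minute threshold are assumptions for illustration, not BOINC settings.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago `path` was last written to."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


if __name__ == "__main__":
    # Hypothetical location; substitute the slot directory BOINC reports.
    log_path = r"D:\data\slots\0\run.log"
    threshold = 15 * 60  # assumed cutoff in seconds, adjust to taste
    try:
        age = seconds_since_modified(log_path)
    except OSError:
        print(f"{log_path} not found")
    else:
        status = "still being written" if age < threshold else "possibly stalled"
        print(f"{log_path}: last write {age:.0f} s ago, {status}")
```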

Greg _BE
Message 60482 - Posted: 23 May 2023 | 6:52:41 UTC - in response to Message 60479.

You do realize that most of the tasks with naming other than 0-5 jump right to 100% completion from the start?

That is normal. You should let them run until they finish. Those tasks take around 10K seconds on my 2080 Ti.

They are not stalled. If you view your disk activity you should see activity spiking every few minutes as the task writes out its progress in the slot files.



Interesting... OK... will restart them tonight and see what happens.
BUT... the computation error tasks (look in my profile)... that's not normal.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1022
Credit: 32,367,107,483
RAC: 289,239,659
Message 60485 - Posted: 23 May 2023 | 11:57:21 UTC - in response to Message 60482.
Last modified: 23 May 2023 | 11:59:13 UTC

these tasks are still in beta testing. some errors are normal.

you will occasionally get errors if some tasks are sent out missing some files.
you will occasionally get "energy is NaN" errors
you will get errors if you suspend or interrupt the run of the task: they cannot be interrupted and will fail on restart because they don't resume from checkpoint properly.

if you aren't willing to accept these kinds of situations, then I would recommend disabling beta testing in your project preferences and sticking to the non-beta tasks (PythonGPU and ACEMD3) when they are available.

Greg _BE
Message 60490 - Posted: 23 May 2023 | 18:20:35 UTC - in response to Message 60485.

these tasks are still in beta testing. some errors are normal.

you will occasionally get errors if some tasks are sent out missing some files.
you will occasionally get "energy is NaN" errors
you will get errors if you suspend or interrupt the run of the task: they cannot be interrupted and will fail on restart because they don't resume from checkpoint properly.

if you aren't willing to accept these kinds of situations, then I would recommend disabling beta testing in your project preferences and sticking to the non-beta tasks (PythonGPU and ACEMD3) when they are available.



Thanks for the feedback.

Question: what is a "normal" runtime to completion despite the status saying 100%?

Keith Myers
Message 60493 - Posted: 23 May 2023 | 18:27:39 UTC - in response to Message 60490.
Last modified: 23 May 2023 | 18:28:56 UTC

10,000 to 12,000 seconds typically on all my hosts. IOW roughly 2 3/4 to 3 1/3 hours.
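
The conversion from seconds to hours is simple arithmetic and easy to check. A minimal Python sketch follows; the helper name `seconds_to_hours` is hypothetical.

```python
def seconds_to_hours(seconds):
    """Convert a duration in seconds to hours."""
    return seconds / 3600


if __name__ == "__main__":
    # The two run times quoted above, in seconds.
    for seconds in (10_000, 12_000):
        print(f"{seconds} s = {seconds_to_hours(seconds):.2f} h")
```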

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1571
Credit: 5,362,011,851
RAC: 8,880,102
Message 60494 - Posted: 23 May 2023 | 18:29:27 UTC - in response to Message 60493.

10,000 to 12,000 seconds typically on all my hosts.

It varies, depending on the GPU you're using.

Greg _BE
Message 60496 - Posted: 23 May 2023 | 21:26:32 UTC - in response to Message 60494.
Last modified: 23 May 2023 | 21:27:29 UTC

10,000 to 12,000 seconds typically on all my hosts.

It varies, depending on the GPU you're using.



A 1080 typically, maybe my 1050 on occasion.
I'll see if the 3-5 hour mark works the next time something is sent out,
provided it doesn't error out on its own.

Thanks guys
