Message boards : News : CPU jobs on Linux

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1914
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 48820 - Posted: 5 Feb 2018 | 19:30:28 UTC

Hi, we need more CPUs on Linux to run QM simulations. Anybody can help?

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 155
Credit: 135,281,463
RAC: 1,053,499
Level
Cys
Scientific publications
wat
Message 48821 - Posted: 5 Feb 2018 | 20:05:56 UTC

Do you want to make them more appealing to crunch?? Take the QMML tasks OFF the Boinc CreditNew credit award mechanism and assign them fixed values like you do for the gpu tasks.

biodoc
Send message
Joined: 26 Aug 08
Posts: 121
Credit: 847,311,950
RAC: 4,566
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 48823 - Posted: 5 Feb 2018 | 20:47:06 UTC

Sure, I can help.

Matt Kowal
Send message
Joined: 27 May 14
Posts: 1
Credit: 1,188,500
RAC: 0
Level
Ala
Scientific publications
wat
Message 48824 - Posted: 5 Feb 2018 | 21:18:15 UTC
Last modified: 5 Feb 2018 | 21:23:18 UTC

This forum news post was syndicated to your Twitter account; however, the link is broken.

Relevant post: https://twitter.com/gpugrid/status/960604705171808256

The link resolves to https://www.gpugrid.net/extra_arg_utm_source.html

I have reposted your call to the BOINC subreddit

klepel
Send message
Joined: 23 Dec 09
Posts: 153
Credit: 2,361,165,663
RAC: 2,411,802
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 48825 - Posted: 5 Feb 2018 | 21:45:30 UTC - in response to Message 48821.
Last modified: 5 Feb 2018 | 21:47:43 UTC

Do you want to make them more appealing to crunch?? Take the QMML tasks OFF the Boinc CreditNew credit award mechanism and assign them fixed values like you do for the gpu tasks.

+1

PS: Unfortunately, they frequently crash my computer with my strongest GPU, so I will not run them on this computer.

John C MacAlister
Send message
Joined: 17 Feb 13
Posts: 179
Credit: 137,176,461
RAC: 19,335
Level
Cys
Scientific publications
watwatwatwatwatwatwatwatwat
Message 48826 - Posted: 5 Feb 2018 | 22:51:50 UTC - in response to Message 48820.

Unfortunately, I gave up on Linux and run Win 10.


Hi, we need more CPUs on Linux to run QM simulations. Anybody can help?


____________
John

Profile GDF
Message 48827 - Posted: 5 Feb 2018 | 23:23:41 UTC - in response to Message 48821.

Do you want to make them more appealing to crunch?? Take the QMML tasks OFF the Boinc CreditNew credit award mechanism and assign them fixed values like you do for the gpu tasks.



We are testing this.

Keith Myers
Message 48828 - Posted: 6 Feb 2018 | 0:22:33 UTC - in response to Message 48826.

I've had good luck with the Linux apps up until the recent gpu application errors that started this month. The cpu tasks ran fine.

I and others have voiced our displeasure with the credit awarded for the flops used for the QM tasks in this thread:
New Student and QMML Project

Profile GDF
Message 48833 - Posted: 6 Feb 2018 | 9:02:58 UTC - in response to Message 48828.

We are testing different credit systems now.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Send message
Joined: 9 Dec 08
Posts: 748
Credit: 4,285,282
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 48834 - Posted: 6 Feb 2018 | 10:45:39 UTC - in response to Message 48833.
Last modified: 6 Feb 2018 | 10:46:23 UTC

We don't use CreditNew but the previous credit system. In any case, two changes were made yesterday:

* CPU threads are limited to 4 (you should still be able to crunch multiple WUs at once, please check)
* Credits should be in line with other projects'

Let us know.

Profile Bikermatt
Send message
Joined: 8 Apr 10
Posts: 37
Credit: 2,673,572,873
RAC: 4,897,768
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 48836 - Posted: 6 Feb 2018 | 12:54:52 UTC - in response to Message 48834.


* CPU threads are limited to 4 (you should still be able to crunch multiple WUs at once, please check)

Let us know.


Boinc is still assigning all of my 32 threads to one task even though it is only using 4.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 48837 - Posted: 6 Feb 2018 | 13:29:06 UTC - in response to Message 48820.

Hi, we need more CPUs on Linux to run QM simulations. Anybody can help?

I have three machines on it, but I can run only one work unit at a time on average. That is because when any more start up at once, they error out, as has been discussed before. And I run two cores per work unit for efficiency. But if you could solve the start-up problem, I could run more.

Jim1348
Message 48838 - Posted: 6 Feb 2018 | 13:55:26 UTC

I don't see that you have made an announcement on the BOINC forum yet. The Projects section would probably be best.
http://boinc.berkeley.edu/dev/forum_forum.php?id=11

biodoc
Message 48839 - Posted: 6 Feb 2018 | 14:24:00 UTC

I have an Intel 2600K with 8 logical cores. I wanted to reserve 2 cores for feeding 2 GPUs that are running Folding@home, so I set the computing preference in BOINC to use, at most, 80% of processors (6 cores).

Data on 19 WUs before GPUGrid changes (processor usage varied but 5-6 cores on average I think)

Average run time (sec): 3,129.23
Average CPU time (sec): 16,122.19
Average credit per WU: 228.7
Average WU per day: 27.6
PPD: 6314.8

Data on 15 WUs after GPUGrid changes (processor usage was 4 cores even though set at 6)

Average run time (sec): 3,566.32
Average CPU time (sec): 14,114.28
Average credit per WU: 819.284
Average WU per day: 24.2
PPD: 19848.5

Summary: Processor usage seems to be maxed out at 4. Run time has increased, CPU time has decreased, WU completed per day has decreased and PPD has increased significantly.
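
A quick way to sanity-check the PPD figures above: they appear to follow from the average run time and the average credit per WU alone. The Python sketch below assumes PPD = (86,400 seconds per day / average run time) × average credit per WU. That formula is inferred from the posted numbers, not stated in the post, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def points_per_day(avg_run_time_s: float, avg_credit_per_wu: float) -> float:
    """Estimate PPD, assuming one WU runs at a time, back to back."""
    wus_per_day = SECONDS_PER_DAY / avg_run_time_s
    return wus_per_day * avg_credit_per_wu


# Averages taken from the post above.
ppd_before = points_per_day(3129.23, 228.7)
ppd_after = points_per_day(3566.32, 819.284)

print(f"before: {ppd_before:.1f} PPD")
print(f"after:  {ppd_after:.1f} PPD")
print(f"ratio:  {ppd_after / ppd_before:.2f}x")
```

Under that assumption the results land within a point of the posted 6314.8 and 19848.5, which is roughly a threefold increase in PPD despite the longer run time.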

Toni
Message 48840 - Posted: 6 Feb 2018 | 14:57:09 UTC - in response to Message 48836.
Last modified: 6 Feb 2018 | 15:09:56 UTC



Boinc is still assigning all of my 32 threads to one task even though it is only using 4.


Ouch. This should not happen. May be fixed now.

mmonnin
Send message
Joined: 2 Jul 16
Posts: 184
Credit: 423,501,239
RAC: 2,245,528
Level
Gln
Scientific publications
wat
Message 48842 - Posted: 6 Feb 2018 | 15:50:39 UTC - in response to Message 48837.

Hi, we need more CPUs on Linux to run QM simulations. Anybody can help?

I have three machines on it, but I can run only one work unit at a time on average. That is because when any more start up at once, they error out, as has been discussed before. And I run two cores per work unit for efficiency. But if you could solve the start-up problem, I could run more.


I agree. There were too many issues that had not been resolved. On top of that, the credit was much worse than even CreditNew. I see the credit has been changed; I'm just saying there are reasons for the lack of CPU time.

Keith Myers
Message 48845 - Posted: 6 Feb 2018 | 17:48:28 UTC - in response to Message 48834.

Are you sure that you are using an "older" credit mechanism? In the "New Student and QMML" thread, Richard Haselgrove corrected my assumption that you might be using an "older" mechanism:

Be careful of your terms. 'CreditNew' has been the default BOINC mechanism since 2010. I suspect this is what GPUGrid is using for these tasks: the support mechanisms for 'even older credit' have been removed from the codebase.

I wonder where you found the older codebase that has been removed that contained the "older" credit award algorithm. If you do in fact have such, I would like access to it. Or have it reinstituted into the BOINC Github codebase.

It would be helpful in persuading the BOINC maintainers that there is in fact a way to return to the older credit algorithm. One of their stated reasons why they said they would not change from CreditNew is that they said they no longer have the original code and can't replicate it.

That said, it looks like the award for QC cpu tasks is much more appealing now.

Keith Myers
Message 48847 - Posted: 6 Feb 2018 | 19:21:26 UTC - in response to Message 48839.

Hmmm ... I just ran a new QC task with the supposed new credit. Not seeing any difference.

Run time 3,292.22
CPU time 12,925.47
Validate state Valid
Credit 110.63

I used 4 cores to generate 110 credits for 54 minutes of compute time.

I can use one core to generate 108 credits for 60 minutes of compute time for SETI CPU tasks.

No reason to run these tasks still for me.
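
To put the comparison above on a common footing, credit can be normalized per core-hour. A minimal sketch in Python, assuming all 4 cores were busy for the full run time (the CPU time of 12,925.47 s suggests slightly less); the function name is hypothetical.

```python
def credit_per_core_hour(credit: float, run_time_s: float, cores: int) -> float:
    """Credit earned per core per hour of wall-clock run time."""
    core_hours = cores * run_time_s / 3600.0
    return credit / core_hours


# QC task from this post: 110.63 credits, 3,292.22 s run time, 4 cores.
qc_rate = credit_per_core_hour(110.63, 3292.22, cores=4)

# SETI CPU figure quoted in this post: 108 credits for 60 minutes on one core.
seti_rate = credit_per_core_hour(108.0, 3600.0, cores=1)

print(f"QC:   {qc_rate:.1f} credits per core-hour")
print(f"SETI: {seti_rate:.1f} credits per core-hour")
```

By this measure the QC task paid a bit over 30 credits per core-hour, against 108 for the SETI task.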

STARBASEn
Avatar
Send message
Joined: 17 Feb 09
Posts: 58
Credit: 762,514,044
RAC: 1,813,935
Level
Glu
Scientific publications
watwatwatwatwat
Message 48850 - Posted: 6 Feb 2018 | 23:50:37 UTC

Since just after mid day yesterday (UTC) I have noticed an increase in credit awarded for the QC WU's. Doing some quick calcs, it appears the increase is about 4.5x of what we were getting. It also appears they are more fixed in value proportional to the size of the WU. My faster machines are getting over 500 credits/hr (4 cores) compute time whereas the slower machines are getting proportionately less/hr and still getting equivalent credit but over a longer period of time than the faster ones.

Well, at least my avg credit per day will not be taking as much of a hit per day as it has been without Linux GPU WU's :).

So far, I have 5 machines with 4 cores each running the QC project and will add one more when I install the memory upgrade on one of the headless systems.

Keith Myers
Message 48851 - Posted: 7 Feb 2018 | 0:28:50 UTC - in response to Message 48850.

I'm not sure the higher credit is not due to the larger molecule size in the latest tasks that Dominik explained to me here.

Trotador
Send message
Joined: 25 Mar 12
Posts: 88
Credit: 1,253,193,955
RAC: 726,641
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 48857 - Posted: 7 Feb 2018 | 6:26:25 UTC

No higher credits for the tasks I've crunched yesterday and today.

I will stop and try to understand what is happening.

NUCCpod_NAPTIMELABS_01
Send message
Joined: 18 Aug 17
Posts: 6
Credit: 133,454,498
RAC: 609,543
Level
Cys
Scientific publications
wat
Message 48858 - Posted: 7 Feb 2018 | 6:33:41 UTC - in response to Message 48837.

I've also been having trouble with work units erroring out in this way. I have over 150 cpu cores spread out across a variety of machines, all under linux. Just a few moments ago I attempted to attach to GPUGRID only to have computational error after computational error. I hope this is fixed shortly as I would love to have GPUGRID as one of my default projects.

Keith Myers
Message 48859 - Posted: 7 Feb 2018 | 7:46:25 UTC

Yes, credit is doing something very strange. I got 111 credits for 3292 seconds of cpu time.

klepel got 1362 credits for the same time.

Run time 3,262.28
CPU time 12,834.34
Validate state Valid
Credit 1,361.79

Task 13118994

kain
Send message
Joined: 3 Sep 14
Posts: 140
Credit: 301,799,425
RAC: 1,124,040
Level
Asp
Scientific publications
watwatwatwatwat
Message 48862 - Posted: 7 Feb 2018 | 12:23:54 UTC - in response to Message 48847.


I used 4 cores to generate 110 credits for 54 minutes of compute time.

I can use one core to generate 108 credits for 60 minutes of compute time for SETI CPU tasks.

No reason to run these tasks still for me.


Are you seriously comparing SETI to GPUGRID?!

mmonnin
Message 48863 - Posted: 7 Feb 2018 | 12:53:02 UTC - in response to Message 48862.


I used 4 cores to generate 110 credits for 54 minutes of compute time.

I can use one core to generate 108 credits for 60 minutes of compute time for SETI CPU tasks.

No reason to run these tasks still for me.


Are you seriously comparing SETI to GPUGRID?!


CreditNew is CreditNew.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 48864 - Posted: 7 Feb 2018 | 13:33:40 UTC - in response to Message 48863.


I used 4 cores to generate 110 credits for 54 minutes of compute time.

I can use one core to generate 108 credits for 60 minutes of compute time for SETI CPU tasks.

No reason to run these tasks still for me.


Are you seriously comparing SETI to GPUGRID?!


CreditNew is CreditNew.

You cannot compare one project's credit to another.

computezrmle
Send message
Joined: 10 Jun 13
Posts: 5
Credit: 58,199,491
RAC: 284,962
Level
Thr
Scientific publications
wat
Message 48866 - Posted: 7 Feb 2018 | 16:54:10 UTC - in response to Message 48864.

You cannot compare one project's credit to another.

At least it should be comparable as described in the BOINC documentation.
See: http://boinc.berkeley.edu/trac/wiki/CreditNew#Cross-projectversionnormalization

Keith Myers
Message 48867 - Posted: 7 Feb 2018 | 17:01:28 UTC - in response to Message 48864.


I used 4 cores to generate 110 credits for 54 minutes of compute time.

I can use one core to generate 108 credits for 60 minutes of compute time for SETI CPU tasks.

No reason to run these tasks still for me.


Are you seriously comparing SETI to GPUGRID?!


CreditNew is CreditNew.

You cannot compare one project's credit to another.

One of the stated objectives of CreditNew is to make credit the same across all projects for the same amount of cobblestones used to compute.

mmonnin
Message 48872 - Posted: 7 Feb 2018 | 18:35:25 UTC

Thus my comment about CreditNew. CPU projects that have higher or lower than typical RAC are most likely using something other than CreditNew, like a fixed credit or another algorithm.

Keith Myers
Message 48873 - Posted: 7 Feb 2018 | 18:46:31 UTC - in response to Message 48872.

Yes I understood your post and sentiment. My post was directed at the other poster's incredulous comment.

This project itself utilizes both mechanisms: CreditNew for CPU tasks and fixed credit awards for GPU tasks.

As far as I have been able to find, that is unique among projects. Usually it is either/or, not both.

mmonnin
Message 48876 - Posted: 7 Feb 2018 | 22:11:38 UTC - in response to Message 48873.

Yes I understood your post and sentiment. My post was directed at the other poster's incredulous comment.

This project itself utilizes both mechanisms: CreditNew for CPU tasks and fixed credit awards for GPU tasks.

As far as I have been able to find, that is unique among projects. Usually it is either/or, not both.


I wasn't referencing you as I didn't quote you. ;)

I agree.

Profile Michael H.W. Weber
Send message
Joined: 9 Feb 16
Posts: 38
Credit: 461,894,057
RAC: 1,013,525
Level
Gln
Scientific publications
watwat
Message 48893 - Posted: 10 Feb 2018 | 15:13:42 UTC
Last modified: 10 Feb 2018 | 15:18:45 UTC

Would you please list the QC project progress on the server status page as well:

http://www.gpugrid.net/server_status.php

Thanks.

Another issue is that your app does not dynamically allocate CPU cores according to the BOINC settings. Instead it claims all physically present cores. That is a major problem when trying to run computations on the GPU as well because to do so, the GPU project automatically (or I manually) reserve(s) one CPU core per GPU task.
Example: When the BOINC manager is set to use 7 of the 8 cores to do CPU computations, your CPU client grabs all 8 cores (or since recently 2x 4 cores).
That is not acceptable.
Please fix this to attract more people to donate CPU cycles to your project.

Michael.
____________
President of Rechenkraft.net - Germany's first and largest distributed computing organization.
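
The core-budget arithmetic behind this complaint can be sketched in a few lines of Python. The sketch assumes the BOINC client rounds the "use at most N% of the processors" preference down to a whole number of cores, and that each multithreaded QC task is budgeted a fixed thread count. Both function names are invented for illustration.

```python
import math


def usable_cpus(total_cpus: int, max_pct: float) -> int:
    """Cores BOINC may use under a 'use at most max_pct% of CPUs' preference.

    Assumption: the client rounds down and never goes below 1.
    """
    return max(1, math.floor(total_cpus * max_pct / 100))


def tasks_that_fit(cpu_budget: int, threads_per_task: int) -> int:
    """How many fixed-size multithreaded tasks fit inside the core budget."""
    return cpu_budget // threads_per_task


# Example from the post: 7 of 8 cores allowed, 4 threads per QC task.
budget = usable_cpus(8, 87.5)
print("core budget:", budget)
print("4-thread tasks that fit:", tasks_that_fit(budget, 4))
```

With 7 of 8 cores allowed and 4 threads per task, only one QC task fits in the budget, so a client that starts two 4-thread tasks has taken the core meant for the GPU.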

Profile SMTB1963
Avatar
Send message
Joined: 27 Jun 10
Posts: 38
Credit: 425,643,996
RAC: 1,825,048
Level
Gln
Scientific publications
watwatwatwatwatwatwatwatwat
Message 48894 - Posted: 10 Feb 2018 | 17:48:45 UTC - in response to Message 48893.

Another issue is that your app does not dynamically allocate CPU cores according to the BOINC settings. Instead it claims all physically present cores.


I'm seeing this behavior as well. On my Ryzen 1700X system with 2 GPUs, these WUs basically take over all CPUs and throw the GPUs into "Waiting to run".

I suppose one could set max_concurrent in an app_config.xml to fix this...what would be the proper app name to use?

Keith Myers
Message 48895 - Posted: 10 Feb 2018 | 17:54:26 UTC - in response to Message 48894.

QC is the proper app name. This is how I limit QC to 2 threads per task.

<app_config>
  <app>
    <name>QC</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>QC</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>2.000000</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>


I also run just 1 task at a time to avoid the application's flaw where two tasks starting at the same time error out.
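
For anyone who would rather generate this file than type it, here is a small Python sketch that writes the same settings. The element names are the ones used in the snippet above, and the `<app_config>` root element follows the usual BOINC app_config.xml layout. The default values and the function name are assumptions for illustration, and the resulting text would still need to be saved as app_config.xml in the GPUGRID project directory.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str = "QC", threads: int = 2,
                     max_concurrent: int = 1) -> str:
    """Return app_config.xml text limiting an mt app's threads and concurrency."""
    root = ET.Element("app_config")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = "mt"
    ET.SubElement(version, "avg_ncpus").text = str(threads)
    ET.SubElement(version, "cmdline").text = f"--nthreads {threads}"

    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config()
print(xml_text)
```

After saving the file, the client has to re-read its config files (or be restarted) before the new limits take effect.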

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Scientific publications
wat
Message 48988 - Posted: 19 Feb 2018 | 8:24:05 UTC

An error on a wu (task 17040682):

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectionError(MaxRetryError("HTTPSConnectionPool(host='repo.continuum.io', port=443): Max retries exceeded with url: /pkgs/main/linux-64/repodata.json.bz2 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc2a3f56860>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))",),)


Traceback (most recent call last):
File "pre_script.py", line 13, in <module>
raise Exception("Error installing h5py")
Exception: Error installing h5py
08:49:09 (1252): $PROJECT_DIR/miniconda/bin/python exited; CPU time 0.469942
08:49:09 (1252): app exit status: 0x1
08:49:09 (1252): called boinc_finish(195)
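
The stderr above points at a network problem rather than the application itself: the failure is "Temporary failure in name resolution" for repo.continuum.io. A minimal Python sketch for checking whether a host can resolve that name before retrying tasks follows. The function name is hypothetical, and a successful lookup does not guarantee that the download itself will work.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if DNS resolution for hostname succeeds."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Raised for lookup failures such as "Temporary failure in name resolution".
        return False


if __name__ == "__main__":
    host = "repo.continuum.io"  # host named in the error message above
    status = "resolves" if can_resolve(host) else "does NOT resolve"
    print(f"{host} {status}")
```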

Profile Logan Carr
Send message
Joined: 12 Aug 15
Posts: 240
Credit: 60,038,511
RAC: 114,084
Level
Thr
Scientific publications
wat
Message 49092 - Posted: 24 Feb 2018 | 0:21:06 UTC - in response to Message 48820.

Hi, we need more CPUs on Linux to run QM simulations. Anybody can help?


okay, I'm going to install Linux right now on my computer and it should be ready tonight or early tomorrow.
____________
Cruncher/Learner in progress.

biodoc
Message 49094 - Posted: 24 Feb 2018 | 12:04:32 UTC - in response to Message 48988.


Traceback (most recent call last):
File "pre_script.py", line 13, in <module>
raise Exception("Error installing h5py")
Exception: Error installing h5py
08:49:09 (1252): $PROJECT_DIR/miniconda/bin/python exited; CPU time 0.469942
08:49:09 (1252): app exit status: 0x1
08:49:09 (1252): called boinc_finish(195)


If you are running the latest distros from Ubuntu or Mint you may need to install the python-support package.

wget http://launchpadlibrarian.net/109052632/python-support_1.0.15_all.deb

sudo dpkg -i python-support_1.0.15_all.deb

kain
Message 49096 - Posted: 24 Feb 2018 | 15:57:29 UTC

One of my CPUs is crunching linux jobs, I will add two more next week.

kain
Message 49164 - Posted: 16 Mar 2018 | 11:53:15 UTC

I have added three. Athlon 5350, I3 3240, I3 6100. Not the best ones but doing their job. I can add more but I would like to know how many WUs can we expect for the Linux QM app?

BTW ATM just 92 users are crunching QM, that's sad :(

Jim1348
Message 49165 - Posted: 16 Mar 2018 | 13:05:14 UTC - in response to Message 49164.
Last modified: 16 Mar 2018 | 13:09:04 UTC

I have added another machine, and now have four on it (2 i7-3770, 1 i7-4790 and 1 Ryzen 1700), with two to four cores per machine allocated via the resource share. The main problem in running them is that when two or more start up at the same time, they error out. That happens mainly during reboots, but otherwise they never start up at the same time. I leave my machines running 24/7, so I don't reboot very often.

And to minimize the problem, you can run with the default 4 cores per work unit and only 4 cores (or less) per machine on average, so that they usually don't start more than one work unit at a time anyway. In that way, it is a manageable problem for me, though it would be best if they fix it. I am sure more people would then be willing to run it.

Also a Windows version would help of course, and for that they had better be seriously thinking about VirtualBox.

captainjack
Send message
Joined: 9 May 13
Posts: 147
Credit: 1,056,117,961
RAC: 1,790,249
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49167 - Posted: 16 Mar 2018 | 19:28:12 UTC

Jim 1348 said:

Also a Windows version would help of course, and for that they had better be seriously thinking about VirtualBox.


If you want to go the virtualbox route, you can create your own virtualbox instance, install your favorite flavor of Linux (I chose Ubuntu), make sure that gcc is installed, install BOINC and start running the Linux version of the Quantum Chemistry tasks.

It took me a few tries to figure out that gcc needed to be installed, but now they seem to be running fine.
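
A quick pre-flight check along these lines can save a few failed tasks. The Python sketch below only tests whether gcc is on the PATH. gcc is the one requirement named in this post; any other packages the app may need are not covered, and the function name is made up for illustration.

```python
import shutil


def missing_tools(tools):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(["gcc"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("gcc found on PATH")
```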

mmonnin
Message 49168 - Posted: 17 Mar 2018 | 15:09:22 UTC - in response to Message 49164.

I have added three. Athlon 5350, I3 3240, I3 6100. Not the best ones but doing their job. I can add more but I would like to know how many WUs can we expect for the Linux QM app?

BTW ATM just 92 users are crunching QM, that's sad :(


People aren't running it since it has so many issues that haven't been addressed.

kain
Message 49169 - Posted: 17 Mar 2018 | 16:42:59 UTC

Lucky me (not even a single problem on my end)

flashawk
Send message
Joined: 18 Jun 12
Posts: 284
Credit: 2,443,879,847
RAC: 3,166,028
Level
Phe
Scientific publications
watwatwatwatwatwatwatwat
Message 49171 - Posted: 18 Mar 2018 | 4:07:07 UTC

The main reason why I came here was to put my GPU's to work, my CPU cores are all very busy with CPDN.

mmonnin
Message 49172 - Posted: 18 Mar 2018 | 12:35:01 UTC - in response to Message 49169.

Lucky me (not even a single problem on my end)


Start two at once..

kain
Message 49173 - Posted: 18 Mar 2018 | 12:39:17 UTC - in response to Message 49172.

I understand that. I'm using only 4C or 2C/4T CPUs, so I can't start two at once. That's probably why I don't have any issues :)

Jim1348
Message 49174 - Posted: 18 Mar 2018 | 13:37:13 UTC - in response to Message 49167.

Jim 1348 said:
Also a Windows version would help of course, and for that they had better be seriously thinking about VirtualBox.


If you want to go the virtualbox route, you can create your own virtualbox instance, install your favorite flavor of Linux (I chose Ubuntu), make sure that gcc is installed, install BOINC and start running the Linux version of the Quantum Chemistry tasks.

Thanks, but I have five Ubuntu machines. I was offering that only as advice if they want to increase their processing power. They need to enlist the Windows users.

Profile GDF
Message 49179 - Posted: 20 Mar 2018 | 21:53:21 UTC - in response to Message 49174.

The Linux QM app runs fine, but it requires some packages to be installed. People should expect an infinite number of workunits...

We will run this forever.

kain
Message 49180 - Posted: 20 Mar 2018 | 22:03:00 UTC - in response to Message 49179.

The Linux QM app runs fine, but it requires some packages to be installed. People should expect an infinite number of workunits...

We will run this forever.



Wow, wow, wow! :D
I will add some more CPUs.

When can we expect some info about scientific results created with this app?

Profile GDF
Message 49193 - Posted: 22 Mar 2018 | 10:59:44 UTC - in response to Message 49180.


When can we expect some info about scientific results created with this app?


Months away, we have just started.

(Ryle)
Send message
Joined: 7 Jun 09
Posts: 21
Credit: 818,987,646
RAC: 248,951
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 49197 - Posted: 22 Mar 2018 | 13:16:35 UTC

Could someone who knows how describe, in the FAQ section, exactly which dependencies are needed to make the QC app run flawlessly? I think that is a good place for it.

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49204 - Posted: 24 Mar 2018 | 9:23:17 UTC - in response to Message 49173.
Last modified: 24 Mar 2018 | 9:31:11 UTC

I understand that. I'm using only 4C or 2C/4T CPUs, so I can't start two at once. That's probably why I don't have any issues :)


Once those issues have been solved I will gladly set a 16/32t Threadripper on the QC tasks to shovel them away ;)
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

JoergF
Message 49205 - Posted: 25 Mar 2018 | 8:57:18 UTC
Last modified: 25 Mar 2018 | 9:04:26 UTC

And to minimize the problem, you can run with the default 4 cores per work unit and only 4 cores (or less) per machine on average, so that they usually don't start more than one work unit at a time anyway. In that way, it is a manageable problem for me, though it would be best if they fix it. I am sure more people would then be willing to run it.


I have set up an i3 Sandy Bridge machine (2C/4T) on Ubuntu and use the default config as suggested above, but all tasks report a calculation error after just 1-3 minutes. Any idea what is going wrong here?

http://www.gpugrid.net/result.php?resultid=17336453
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

captainjack
Message 49206 - Posted: 25 Mar 2018 | 12:18:12 UTC

JoergF

Try installing gcc and see if that helps.

In a terminal session

sudo apt-get install gcc

JoergF
Message 49207 - Posted: 25 Mar 2018 | 18:36:21 UTC - in response to Message 49206.

JoergF

Try installing gcc and see if that helps.

In a terminal session
sudo apt-get install gcc


Thank you... I have installed it. However, I am not able to test it, as there seems to be a daily quota of 2 tasks per computer and I have to wait for the next day. Really, in view of >10,000 unsent QC tasks, that limitation is somewhat surprising.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

captainjack
Message 49208 - Posted: 25 Mar 2018 | 20:34:23 UTC

JoergF said:

However I am not able to test it as there seems to be a daily quota of 2 tasks per computer and I have to wait for the next day


There is a daily limit if a cruncher starts turning in more than the usual number of errors. Once you start turning in valid tasks, the daily limit will go back up. I have been caught by it a few times when running test tasks.

Hopefully you will get better results tomorrow.

JoergF
Message 49209 - Posted: 26 Mar 2018 | 6:53:19 UTC - in response to Message 49208.

Thank you.. seems to work now :)
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

[VENETO] boboviz
Message 49211 - Posted: 26 Mar 2018 | 19:13:52 UTC - in response to Message 49179.

People should expect infinite number of workunits...
We will run this for ever.


Oh, well.
So we dream of an OpenCL client (or an SSE/AVX CPU optimization)

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49212 - Posted: 26 Mar 2018 | 20:07:51 UTC - in response to Message 49211.

Oh, well.
So we dream of an OpenCL client (or an SSE/AVX CPU optimization)

As you probably know, the FMA version of TN-Grid is faster than the AVX version. At least that was my result comparing a Ryzen 1700 (FMA) to an i7-4770 (AVX), both on Ubuntu.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Scientific publications
wat
Message 49213 - Posted: 27 Mar 2018 | 8:20:51 UTC - in response to Message 49212.

As you probably know, the FMA version of TN-Grid is faster than the AVX version. At least that was my result comparing a Ryzen 1700 (FMA) to an i7-4770 (AVX), both on Ubuntu.


I know.
And, I forgot, a Windows app!!! :-)

klepel
Send message
Joined: 23 Dec 09
Posts: 153
Credit: 2,361,165,663
RAC: 2,411,802
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49215 - Posted: 27 Mar 2018 | 17:58:25 UTC - in response to Message 49179.

The linux QM app runs fine but it requires some packages to be installed. People should expect infinite number of workunits...

We will run this for ever.

I still think the main obstacle to gaining a wider Linux contributor base is the start-up error that occurs when two WUs start at the same time. In my view, this hinders the use of the app on newer computers with more than 4 threads/cores, as the other available threads/cores are not loaded by other BOINC projects.

kain
Send message
Joined: 3 Sep 14
Posts: 140
Credit: 301,799,425
RAC: 1,124,040
Level
Asp
Scientific publications
watwatwatwatwat
Message 49216 - Posted: 27 Mar 2018 | 18:36:09 UTC - in response to Message 49215.

The linux QM app runs fine but it requires some packages to be installed. People should expect infinite number of workunits...

We will run this for ever.

I still think the main obstacle to gaining a wider Linux contributor base is the start-up error that occurs when two WUs start at the same time. In my view, this hinders the use of the app on newer computers with more than 4 threads/cores, as the other available threads/cores are not loaded by other BOINC projects.


True. I'm using only 2c/4t and 4c CPUs because of that. Ryzen 8c/16t is crunching WCG.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49217 - Posted: 27 Mar 2018 | 21:02:46 UTC
Last modified: 27 Mar 2018 | 21:45:43 UTC

You can sort of learn to live with its idiosyncrasies after a while. They are more an annoyance at first. But I wonder if a non multi-core version would fix the start-up problems? It would be worth looking into.

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49270 - Posted: 15 Apr 2018 | 9:16:06 UTC
Last modified: 15 Apr 2018 | 9:19:17 UTC

May I touch upon this matter again after 2.5 to 3 weeks... at the risk of being a pain in the neck... is there any progress regarding concurrent QC tasks? I would really like to use an 8-16 core CPU (instead of my i3) and run several jobs at the same time. I hope the admins can take some time to follow up, in view of the many jobs we still have to crunch.

Thanks.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile Logan Carr
Send message
Joined: 12 Aug 15
Posts: 240
Credit: 60,038,511
RAC: 114,084
Level
Thr
Scientific publications
wat
Message 49298 - Posted: 17 Apr 2018 | 14:10:21 UTC - in response to Message 48820.

Hey,

What PC specs do you recommend for these Linux work units (minimum CPU and RAM)? Thanks
____________
Cruncher/Learner in progress.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49305 - Posted: 17 Apr 2018 | 21:41:46 UTC - in response to Message 49298.

I have five QC running at the moment on two machines, and they are taking from 230 MB to 260 MB. I would plan on at least 300 MB to be safe.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49308 - Posted: 18 Apr 2018 | 8:26:36 UTC - in response to Message 49298.

I now have one running at 1016 MB for a few minutes, but now down to 932 MB. It looks like the upper limit is a bit elastic.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49309 - Posted: 18 Apr 2018 | 11:11:00 UTC - in response to Message 49308.

I now have one running at 1016 MB for a few minutes, but now down to 932 MB. It looks like the upper limit is a bit elastic.

Just to confirm, that's 1GB per work unit?

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49310 - Posted: 18 Apr 2018 | 11:18:15 UTC - in response to Message 49305.
Last modified: 18 Apr 2018 | 11:20:10 UTC

I have five QC running at the moment on two machines, and they are taking from 230 MB to 260 MB. I would plan on at least 300 MB to be safe.


Pardon me for jumping in, but does that mean it is now possible to run several QC tasks at the same time? For example, 4 jobs on the Ryzen 1700?
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Send message
Joined: 9 Dec 08
Posts: 748
Credit: 4,285,282
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 49311 - Posted: 18 Apr 2018 | 11:32:10 UTC - in response to Message 49310.
Last modified: 18 Apr 2018 | 11:33:45 UTC

It is possible to run as many as you want concurrently. A bug unfortunately prevents simultaneous starts. I am investigating possible workarounds, but there is no timeline yet, sorry. Requirements are in some other thread; they are rather mild (the only thing you need is to have the gcc package installed).

mmonnin
Send message
Joined: 2 Jul 16
Posts: 184
Credit: 423,501,239
RAC: 2,245,528
Level
Gln
Scientific publications
wat
Message 49312 - Posted: 18 Apr 2018 | 11:44:27 UTC - in response to Message 49310.

I have five QC running at the moment on two machines, and they are taking from 230 MB to 260 MB. I would plan on at least 300 MB to be safe.


Pardon me for jumping in, but does that mean it is now possible to run several QC tasks at the same time? For example, 4 jobs on the Ryzen 1700?


Best to just run 1 per computer via app_config or you'll eventually get two starting at once. They'll crash at once, and end up in this loop.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49313 - Posted: 18 Apr 2018 | 13:46:14 UTC - in response to Message 49309.

I now have one running at 1016 MB for a few minutes, but now down to 932 MB. It looks like the upper limit is a bit elastic.

Just to confirm, that's 1GB per work unit?

Yes, but now I see 1466 MB for a single work unit, the most I have seen.

Note, however, that I use an app_config to limit them to only one CPU core per work unit, but I do not limit the number of work units that can run at a time. It would undoubtedly be a more efficient use of memory to allow the default value of four cores to run on a single work unit. It probably would not use much more memory than a single core, but what the maximum is I don't really know. I have 32 GB, so I have never really paid much attention to it. I use BOINCTasks to measure the memory use, by the way.
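
For rough planning, the figures reported in this thread (230 MB up to 1466 MB per work unit) can be turned into an estimate of how many single-core QC tasks fit in memory. This is only a sketch: the function name is made up, the 2048 MB reserve for the OS and other programs is an assumption, and the real peak per task may be higher than anything observed so far.

```python
def max_tasks_in_ram(total_ram_mb: int, peak_per_task_mb: int, reserve_mb: int = 2048) -> int:
    """Return how many tasks fit if each one may reach its observed peak.

    reserve_mb is an assumed allowance for the OS and other programs.
    """
    usable = total_ram_mb - reserve_mb
    if usable <= 0 or peak_per_task_mb <= 0:
        return 0
    return usable // peak_per_task_mb


if __name__ == "__main__":
    # 32 GB machine, sized against the 1466 MB peak mentioned above
    print(max_tasks_in_ram(32 * 1024, 1466))
    # 4 GB machine, same peak
    print(max_tasks_in_ram(4 * 1024, 1466))
```

Sizing against the highest value seen rather than the 230 to 260 MB typical value is the cautious choice, since the upper limit appears to be elastic.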

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49314 - Posted: 18 Apr 2018 | 16:00:00 UTC - in response to Message 49311.

It is possible to run as many as you want concurrently. A bug unfortunately prevents simultaneous starts.


Okay, thank you :)
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile Logan Carr
Send message
Joined: 12 Aug 15
Posts: 240
Credit: 60,038,511
RAC: 114,084
Level
Thr
Scientific publications
wat
Message 49316 - Posted: 18 Apr 2018 | 21:12:02 UTC - in response to Message 49311.
Last modified: 18 Apr 2018 | 21:13:14 UTC

Requirements are in some other thread; they are rather mild (the only thing you need is to have the gcc package installed).


I have looked around and cannot find any requirements for this specific project. The only CPU requirements I see are here
http://www.gpugrid.net/join.php

But that looks like general use to me.

I will make my question a little simpler... would a Core i3 work fine with this project, and if so, would a gen 1 be fine, or gen 2, or newer? My budget for another PC dedicated to CPU work is about 100 USD. I would not mind a small desktop without a big GPU slot either. Thank you.
____________
Cruncher/Learner in progress.

NUCCpod_NAPTIMELABS_01
Send message
Joined: 18 Aug 17
Posts: 6
Credit: 133,454,498
RAC: 609,543
Level
Cys
Scientific publications
wat
Message 49317 - Posted: 18 Apr 2018 | 22:26:57 UTC - in response to Message 49316.


I will make my question a little simpler... would a Core i3 work fine with this project, and if so, would a gen 1 be fine, or gen 2, or newer? My budget for another PC dedicated to CPU work is about 100 USD. I would not mind a small desktop without a big GPU slot either. Thank you.


For questions like this, WUProp is a fantastic resource. Check the link and you can see things like time to complete for various CPUs and other requirements like memory. In this case it looks like you may need 776.5 MB of RAM, and an i3 should do just fine.

Profile Logan Carr
Send message
Joined: 12 Aug 15
Posts: 240
Credit: 60,038,511
RAC: 114,084
Level
Thr
Scientific publications
wat
Message 49318 - Posted: 18 Apr 2018 | 23:13:18 UTC - in response to Message 49317.


I will make my question a little simpler... would a Core i3 work fine with this project, and if so, would a gen 1 be fine, or gen 2, or newer? My budget for another PC dedicated to CPU work is about 100 USD. I would not mind a small desktop without a big GPU slot either. Thank you.


For questions like this, WUProp is a fantastic resource. Check the link and you can see things like time to complete for various CPUs and other requirements like memory. In this case it looks like you may need 776.5 MB of RAM, and an i3 should do just fine.



Thank you very much.
____________
Cruncher/Learner in progress.

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49372 - Posted: 29 Apr 2018 | 8:43:29 UTC - in response to Message 49311.
Last modified: 29 Apr 2018 | 8:44:03 UTC

It is possible to run as many as you want concurrently. A bug unfortunately prevents simultaneous starts. I am investigating possible workarounds but no timeline yet, sorry.


One simple solution would be to create a TEMP file in the application directory. The first thing a job does is try to create this file. If the create command fails (because the file is already there or a concurrent create/write conflict occurred), the job must back off for a while. If it succeeds, this job is allowed to start, and the others will try again in a couple of seconds. The starting job should delete the TEMP file in a timely manner so that the others can also get started. So the TEMP file effectively works like a (do-not-start) flag. You may also stop all tasks from time to time and delete the TEMP file at regular intervals, to make sure that no failed or cancelled task leaves the file in place forever.

Admittedly, there are also other, more professional mechanisms such as semaphores available in the OS, but the above is possibly the fastest solution.
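
To make the idea concrete, here is a minimal sketch of such a start-up flag in Python. It is not the project's code: the file name, the 2-second retry, and the 300-second "stale" threshold are all assumptions chosen for illustration. The key point is that creating the file with `O_CREAT | O_EXCL` is atomic, so only one job can win.

```python
import os
import time

LOCK_PATH = "qc_startup.lock"   # hypothetical file name
STALE_AFTER_S = 300             # assumed: older locks were left behind by a dead task


def acquire_start_lock(path=LOCK_PATH, retry_s=2.0, stale_after_s=STALE_AFTER_S):
    """Block until this process has created the lock file, then return."""
    while True:
        try:
            # O_CREAT | O_EXCL makes creation atomic: exactly one caller succeeds.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.write(fd, str(os.getpid()).encode())
            os.close(fd)
            return
        except FileExistsError:
            # Another job is starting up. Remove the file if it looks abandoned.
            try:
                if time.time() - os.path.getmtime(path) > stale_after_s:
                    os.remove(path)
                    continue
            except FileNotFoundError:
                continue  # the holder just released it; retry immediately
            time.sleep(retry_s)


def release_start_lock(path=LOCK_PATH):
    """Delete the lock file so the next waiting job may start."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A job would call `acquire_start_lock()` before its start-up phase and `release_start_lock()` right after it. The age-based cleanup is a crude replacement for the periodic manual deletion and is not race-free: two waiting jobs could both decide the file is stale, and one of them could delete a lock that was just re-created.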
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

JoergF
Avatar
Send message
Joined: 20 Apr 15
Posts: 274
Credit: 887,370,590
RAC: 1,752,527
Level
Glu
Scientific publications
watwat
Message 49373 - Posted: 29 Apr 2018 | 17:12:07 UTC
Last modified: 29 Apr 2018 | 17:13:52 UTC

...You may also stop all tasks


Edit: I mean to pause the tasks, not to cancel them, of course. By the way, does the error also show up when multiple tasks are paused and then restarted/continued at the same time?
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49376 - Posted: 30 Apr 2018 | 12:28:51 UTC

I am having a lot of CPU Quantum Chemistry WUs error out as soon as they start on my Ubuntu machine. Some work but most error out. Is this inherent with the CPU app or is it my machine?

Below are the machine's tasks:
http://www.gpugrid.net/results.php?hostid=424454

ChristianVirtual
Send message
Joined: 16 Aug 14
Posts: 17
Credit: 378,257,954
RAC: 79
Level
Asp
Scientific publications
watwatwat
Message 49379 - Posted: 2 May 2018 | 12:08:20 UTC

Me too:

http://www.gpugrid.net/result.php?resultid=17510462

ChristianVirtual
Send message
Joined: 16 Aug 14
Posts: 17
Credit: 378,257,954
RAC: 79
Level
Asp
Scientific publications
watwatwat
Message 49380 - Posted: 2 May 2018 | 12:08:32 UTC
Last modified: 2 May 2018 | 12:08:58 UTC

(SORRY, double post)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 1965
Credit: 12,939,183,544
RAC: 11,521,091
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49384 - Posted: 2 May 2018 | 16:55:58 UTC - in response to Message 49376.
Last modified: 2 May 2018 | 16:57:03 UTC

I am having a lot of CPU Quantum Chemistry WUs error out as soon as they start on my Ubuntu machine. Some work but most error out. Is this inherent with the CPU app or is it my machine?
There's a bug in the app which prevents more than 1 task from starting simultaneously. Once a task has started successfully, you can start another task manually. Since there's no automated way to do this, you should pause all but one task before you shut down your computer, then start them one by one. The other option is to limit the number of concurrently running QC tasks to 1. Since this app uses only 4 threads (cores), you should utilize your other CPU cores with a different project.
To do this, you should create or modify your app_config.xml file in the projects/www.gpugrid.net folder.

<app_config>
  <app>
    <name>QC</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>QC</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>4</avg_ncpus>
  </app_version>
</app_config>
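
The manual workaround described above (pause all but one task, then start the rest one by one) could in principle be scripted with the `boinccmd` tool that ships with the BOINC client. The following is only a sketch under assumptions: the project URL must match what your client reports, the task names have to be filled in by hand, and the 60-second gap is a guess at how long the start-up phase takes.

```python
import subprocess
import time

PROJECT_URL = "http://www.gpugrid.net/"  # assumed master URL; check your client


def resume_one_by_one(task_names, delay_s=60.0, run=subprocess.run, sleep=time.sleep):
    """Resume suspended tasks one at a time, pausing between starts.

    run and sleep are injectable so the function can be tried out
    without a running BOINC client.
    """
    for i, name in enumerate(task_names):
        # boinccmd --task <url> <task_name> resume
        run(["boinccmd", "--task", PROJECT_URL, name, "resume"], check=True)
        if i < len(task_names) - 1:
            sleep(delay_s)  # give the previous task time to get past start-up


if __name__ == "__main__":
    # Hypothetical task names; copy the real ones from the BOINC Manager.
    resume_one_by_one(["example_task_1", "example_task_2"])
```

This does not help when BOINC itself decides to start two fresh tasks at the same moment, so the `max_concurrent` limit remains the safer option.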

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49405 - Posted: 7 May 2018 | 11:22:37 UTC

So I understand why I receive "error while computing" with almost 0 seconds of compute time (multiple WUs starting at once) but I don't understand the fairly high number of failed WUs with multiple thousand second or even over ten thousand second run times. Linked below are my error rates:

http://www.gpugrid.net/results.php?hostid=424454&offset=0&show_names=0&state=5&appid=

Can anyone explain why the errors that are not caused by WUs starting at once are occurring?

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 883
Credit: 1,833,823,145
RAC: 1,220,321
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49406 - Posted: 7 May 2018 | 13:38:07 UTC - in response to Message 49405.

Did you stop the computer at any point, and let them all RE-start in unison - from whatever point they'd reached?

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49407 - Posted: 7 May 2018 | 14:04:06 UTC - in response to Message 49406.

Did you stop the computer at any point, and let them all RE-start in unison - from whatever point they'd reached?

To my knowledge, the computer only turned off once last week. It is on 24/7 otherwise. It looks like these errors are all over the place in terms of date.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49408 - Posted: 8 May 2018 | 11:00:59 UTC

We are losing CPU Volunteers, can anyone help?

Betting Slip
Send message
Joined: 5 Jan 09
Posts: 668
Credit: 2,498,095,550
RAC: 1
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49409 - Posted: 8 May 2018 | 12:01:32 UTC - in response to Message 49408.

We are losing CPU Volunteers, can anyone help?



It's up to the scientists to sort it out. They will need a Windows version to have any hope of getting a reasonable throughput.
____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

kain
Send message
Joined: 3 Sep 14
Posts: 140
Credit: 301,799,425
RAC: 1,124,040
Level
Asp
Scientific publications
watwatwatwatwat
Message 49414 - Posted: 8 May 2018 | 16:16:33 UTC - in response to Message 49409.

We are losing CPU Volunteers, can anyone help?



It's up to the scientists to sort it out. They will need a Windows version to have any hope of getting a reasonable throughput.


Or just a Linux app without the 'starting many at once' issue. I could add more than 100 cores, but I can't because of that.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 49417 - Posted: 9 May 2018 | 9:49:02 UTC

We will talk this week with the devs of the QM CPU software to see how we can make a Windows build. Once we know the status and how much work it is we will update. But believe me we are extremely keen on a Windows version...

tullio
Send message
Joined: 8 May 18
Posts: 141
Credit: 17,584,854
RAC: 118,098
Level
Pro
Scientific publications
wat
Message 49419 - Posted: 9 May 2018 | 11:19:36 UTC

Since you are using the wrapper from LHC@home, why don't you use VirtualBox? This would eliminate the need for a Windows version. All Windows users of LHC@home can run LHC@home tasks, written in Scientific Linux, using VirtualBox.
Tullio

Failboat
Send message
Joined: 24 Apr 16
Posts: 2
Credit: 17,557,751
RAC: 0
Level
Pro
Scientific publications
wat
Message 49420 - Posted: 9 May 2018 | 23:15:38 UTC

Changed my desktop settings to include CPU tasks again. (I'm in the pool, so it won't show on my account.) Hope my contribution, small as it is, helps!

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Scientific publications
wat
Message 49427 - Posted: 10 May 2018 | 7:24:21 UTC - in response to Message 49419.

Since you are using the wrapper from LHC@home why don't you use VirtualBox?


No, please.
Right now I'm using a Linux VM with VirtualBox for this project, and it's a nightmare.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49429 - Posted: 10 May 2018 | 10:42:27 UTC - in response to Message 49427.
Last modified: 10 May 2018 | 10:43:01 UTC

Since you are using the wrapper from LHC@home why don't you use VirtualBox?


No, please.
Right now I'm using a Linux VM with VirtualBox for this project, and it's a nightmare.

The LHC@home performance using Virtualbox vs direct from linux is a ridiculous loss in efficiency and performance, not to mention substantial ram usage. If we can avoid virtualbox outright I would consider that a win.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49430 - Posted: 10 May 2018 | 13:03:44 UTC - in response to Message 49429.

The LHC@home performance using Virtualbox vs direct from linux is a ridiculous loss in efficiency and performance, not to mention substantial ram usage. If we can avoid virtualbox outright I would consider that a win.

You keep saying that, which I think just shows that you have had problems with VirtualBox. Even if so, I don't know why that is a reason for GPUGrid not to use it. A lot of people can get it to work.

It is not a "ridiculous loss in efficiency", but about the same as using Windows on some projects optimized for Linux (I do both VirtualBox ATLAS and native ATLAS on LHC, and have a basis for comparison).

And the ram usage depends mainly on the project. It is reasonable enough (about 2 GB) when running 4 cores on Cosmology. If that is too much for you, then upgrade your hardware or stop hassling other people that do have it.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 49481 - Posted: 15 May 2018 | 15:50:04 UTC

We talked with the devs. The idea is to collaborate over summer for a Windows version. So I hope by the end of summer we should have a new release.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Scientific publications
wat
Message 49483 - Posted: 15 May 2018 | 16:12:02 UTC - in response to Message 49481.

We talked with the devs. The idea is to collaborate over summer for a Windows version. So I hope by the end of summer we should have a new release.


Great news!!
In the meantime, please fix the problems with Linux.

tullio
Send message
Joined: 8 May 18
Posts: 141
Credit: 17,584,854
RAC: 118,098
Level
Pro
Scientific publications
wat
Message 49493 - Posted: 17 May 2018 | 15:36:15 UTC

All CPU tasks fail on my SuSE Linux Leap 42.3. All GPU tasks complete and validate on my GTX 750 Ti giving me huge credits, more than Einstein@home and SETI@home GPU tasks.
Tullio

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49494 - Posted: 17 May 2018 | 16:42:19 UTC - in response to Message 49493.
Last modified: 17 May 2018 | 16:42:39 UTC

All CPU tasks fail on my SuSE Linux Leap 42.3. All GPU tasks complete and validate on my GTX 750 Ti giving me huge credits, more than Einstein@home and SETI@home GPU tasks.
Tullio

Hi, Credits cannot be compared from project to project. And even though GPU tasks give so much credit here at GPUGrid there is still substantial scientific weight and benefit to GPUGrid's CPU Work Units even though they don't give as much credit. Please keep this in mind.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 883
Credit: 1,833,823,145
RAC: 1,220,321
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49495 - Posted: 17 May 2018 | 18:17:15 UTC - in response to Message 49494.

All CPU tasks fail on my SuSE Linux Leap 42.3. All GPU tasks complete and validate on my GTX 750 Ti giving me huge credits, more than Einstein@home and SETI@home GPU tasks.
Tullio

Hi, Credits cannot be compared from project to project. And even though GPU tasks give so much credit here at GPUGrid there is still substantial scientific weight and benefit to GPUGrid's CPU Work Units even though they don't give as much credit. Please keep this in mind.

Since BOINC credits are defined as a certain number of 'cobblestones' (completed floating point operations), they should be comparable. But I agree, they're not.

That somewhat arcane point becomes more significant when statistics sites, and BOINC itself, back-calculate the floating point performance of a project from the number of credits awarded.
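
For readers unfamiliar with that back-calculation, here is a small sketch of it. It assumes the usual BOINC definition of the cobblestone, under which a machine sustaining 1 GFLOPS for a full day earns 200 credits; the function names are made up for illustration.

```python
COBBLESTONES_PER_GFLOPS_DAY = 200  # assumed BOINC definition: 1 GFLOPS for one day
SECONDS_PER_DAY = 86_400


def rac_to_gflops(rac: float) -> float:
    """Estimate sustained GFLOPS from recent average credit (credits per day)."""
    return rac / COBBLESTONES_PER_GFLOPS_DAY


def flops_per_credit() -> float:
    """Floating point operations nominally represented by one credit."""
    return SECONDS_PER_DAY * 1e9 / COBBLESTONES_PER_GFLOPS_DAY


if __name__ == "__main__":
    print(rac_to_gflops(1_000_000))  # a RAC of one million
    print(flops_per_credit())
```

When a project assigns fixed or inflated credit per task, the same formula still gets applied, which is why the derived performance figures stop being comparable between projects.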

tullio
Send message
Joined: 8 May 18
Posts: 141
Credit: 17,584,854
RAC: 118,098
Level
Pro
Scientific publications
wat
Message 49499 - Posted: 18 May 2018 | 3:58:40 UTC - in response to Message 49494.
Last modified: 18 May 2018 | 3:59:05 UTC


Hi, Credits cannot be compared from project to project. And even though GPU tasks give so much credit here at GPUGrid there is still substantial scientific weight and benefit to GPUGrid's CPU Work Units even though they don't give as much credit. Please keep this in mind.

I don't give a damn about credits. But all my CPU tasks fail miserably.
Tullio

mmonnin
Send message
Joined: 2 Jul 16
Posts: 184
Credit: 423,501,239
RAC: 2,245,528
Level
Gln
Scientific publications
wat
Message 49504 - Posted: 19 May 2018 | 3:07:28 UTC - in response to Message 49408.

We are losing CPU Volunteers, can anyone help?


If admins want CPU support they'll make an app w/o bugs. No effort by them, no effort from us. They get paid, we pay for the PCs and we'd rather not have it aimlessly wasted.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Scientific publications
wat
Message 49514 - Posted: 21 May 2018 | 12:17:04 UTC - in response to Message 49504.

We are losing CPU Volunteers, can anyone help?


If admins want CPU support they'll make an app w/o bugs. No effort by them, no effort from us.


+1

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49515 - Posted: 21 May 2018 | 13:57:21 UTC - in response to Message 49504.

I would put up with the bugs if I could. But I get only crashes on QC now, so I am out of business.

I hope they make a big announcement when it is fixed, since I won't know about it otherwise. Maybe they are spending their time on the Windows version?

tullio
Send message
Joined: 8 May 18
Posts: 141
Credit: 17,584,854
RAC: 118,098
Level
Pro
Scientific publications
wat
Message 49516 - Posted: 21 May 2018 | 14:03:43 UTC

Maybe they are more interested in the results of the CRUNCHATHLON competition. You can not compete with a dead horse.
Tullio

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49517 - Posted: 21 May 2018 | 14:10:03 UTC - in response to Message 49516.

You can not compete with a dead horse.
Tullio

Not at the Palio either.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Send message
Joined: 9 Dec 08
Posts: 748
Credit: 4,285,282
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 49521 - Posted: 22 May 2018 | 14:21:27 UTC - in response to Message 49517.

Please see the Number Crunching forum.

Ken_g6
Send message
Joined: 6 Aug 11
Posts: 8
Credit: 15,776,087
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwat
Message 49578 - Posted: 1 Jun 2018 | 0:37:05 UTC

Well, I gave QM a try, but I really didn't like it, for a few reasons.

First, the setup doesn't work when running more than one WU at a time. You need a "lock file" to ensure only one thread does the setup. If that thread gets aborted you might have problems, but it's better than the current situation.

Second, you know what I hate? I hate BOINC apps that download more data themselves. (Such as LHC@Home Atlas.) All necessary data should be coming from your BOINC server. At least your app only seems to do that downloading once.

Third, you know what else I hate? Bloated apps. Your "miniconda" includes such things as TK (not needed unless there's a screen saver?) and man pages. Maybe if you streamlined it you could fit whatever extra data it downloads in the initial download, and then you wouldn't need the networking libraries either.

But, I finally did get QM working. What made me give up on it was the credit. It started around 600 credits/WU, but it seemed to get cut in half every few sets of WUs. It was down around 50 when I gave up.
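
Regarding the lock file from the first point: one way to avoid the "aborted thread leaves the lock behind" problem on Linux is an advisory lock via `flock`, which the kernel drops automatically when the process exits or dies. This is a hedged sketch, not the app's actual code; `run_setup_once`, the lock path, and the `setup` callable are placeholders.

```python
import fcntl
import os


def run_setup_once(lock_path, setup):
    """Run setup() while holding an exclusive advisory lock on lock_path.

    Other processes calling this with the same path wait until the lock
    is free, so their setup steps never overlap.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks while another process holds it
        return setup()
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

The `setup` callable should itself check whether the environment is already unpacked and return early if so; the lock only serializes the callers, it does not remember that the work was done.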

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 49579 - Posted: 1 Jun 2018 | 0:43:40 UTC - in response to Message 49578.

But, I finally did get QM working. What made me give up on it was the credit. It started around 600 credits/WU, but it seemed to get cut in half every few sets of WUs. It was down around 50 when I gave up.

I don't care about credits themselves, but I do wonder about the science that is being accomplished. If it is the same per work unit, that is OK, but if it is being cut in half periodically also, then that is a problem. I wonder what causes it?

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Scientific publications
watwat
Message 49580 - Posted: 1 Jun 2018 | 11:27:36 UTC - in response to Message 49578.
Last modified: 1 Jun 2018 | 11:37:39 UTC

But, I finally did get QM working. What made me give up on it was the credit. It started around 600 credits/WU, but it seemed to get cut in half every few sets of WUs. It was down around 50 when I gave up.

The credits are down to around 50 right now because the Work Units are extremely short due to testing.

I'm not sure what everyone's fantasy is with credit. I personally couldn't care less as long as what I am doing is benefiting science. The credit itself is worth nothing so I'm not sure I get the point. I guess some people just need something more.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 49581 - Posted: 1 Jun 2018 | 14:11:22 UTC
Last modified: 1 Jun 2018 | 14:44:33 UTC

Indeed, as PappaLitto said, these WUs are muuuuuch shorter than the 600-credit ones. Credits scale with computation time automatically. I miscalculated a bit yesterday how fast they are, so we ran out, but I'm going to submit more in an hour or so.

The downloading stuff I understand, but unfortunately we depend on external software (psi4), so we cannot control everything. As you said, it only downloads once. And yes, conda is bloated in general, but we kept it down to the bare minimum of packages.

@Jim the WUs that I send contain varying molecules of different number of atoms. The ones I sent yesterday had very few atoms so they completed super fast. But "every molecule is sacred" as per Monty Python. I guess I could mix all molecules together but it would become an organizational chaos for me to keep track of what I have already calculated, so for now I will continue submitting them with increasing molecule size.

klepel
Send message
Joined: 23 Dec 09
Posts: 153
Credit: 2,361,165,663
RAC: 2,411,802
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 49582 - Posted: 1 Jun 2018 | 14:55:59 UTC

Stefan, can you reduce the free disk space requirement of 4768.37 MB to something like 4000 MB or less? Your QM WUs do not fit on my 16 GB USB stick anymore! (After the Lubuntu upgrade from 17.10 to 18.04)

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 49583 - Posted: 1 Jun 2018 | 15:13:14 UTC - in response to Message 49582.

Sorry klepel, I don't think I can :( Most of it is taken up by miniconda and the required software, not the workunits themselves.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Message 49584 - Posted: 1 Jun 2018 | 23:22:09 UTC - in response to Message 49581.

@Jim the WUs that I send contain molecules with varying numbers of atoms. The ones I sent yesterday had very few atoms so they completed super fast. But "every molecule is sacred" as per Monty Python. I guess I could mix all molecules together, but it would become organizational chaos for me to keep track of what I have already calculated, so for now I will continue submitting them with increasing molecule size.

No problem at all. Keep doing what you have to do.

PappaLitto
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Message 49585 - Posted: 3 Jun 2018 | 1:24:25 UTC

@Stefan can you quickly describe for us what the Quantum Chemistry Work Units are doing and what you are learning from these work units?

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49587 - Posted: 3 Jun 2018 | 9:30:46 UTC - in response to Message 49193.


When can we expect some info about scientific results created with this app?


Months away, we have just started.



Some months have gone by.
What are we crunching for? Cancer research? Are there any preliminary results?
Do you plan, together with the Windows app, a GPU version as well?

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49588 - Posted: 3 Jun 2018 | 9:31:53 UTC - in response to Message 49585.

@Stefan can you quickly describe for us what the Quantum Chemistry Work Units are doing and what you are learning from these work units?


+1

STARBASEn
Joined: 17 Feb 09
Posts: 58
Credit: 762,514,044
RAC: 1,813,935
Level
Glu
Message 49589 - Posted: 3 Jun 2018 | 17:41:19 UTC - in response to Message 49588.

+ another 1. I have 7+7+3 cores 100% dedicated to QC and 4+4+4 cores 50% dedicated to QC and WCG, all 24/7. Along with 3 GTX 1060's, makes for a very warm office, especially now in summer. Would be interesting to find out what the project goals are :).

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1965
Credit: 12,939,183,544
RAC: 11,521,091
Level
Trp
Message 49590 - Posted: 3 Jun 2018 | 21:11:31 UTC - in response to Message 49585.

@Stefan can you quickly describe for us what the Quantum Chemistry Work Units are doing and what you are learning from these work units?

See the "New Student and QMML Project" thread for a clue.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49594 - Posted: 4 Jun 2018 | 7:50:57 UTC - in response to Message 49590.

@Stefan can you quickly describe for us what the Quantum Chemistry Work Units are doing and what you are learning from these work units?

See the "New Student and QMML Project" thread for a clue.


Really a "clue".
"We are simulating molecules". Ok, very precise :-P

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Message 49595 - Posted: 4 Jun 2018 | 13:11:08 UTC
Last modified: 4 Jun 2018 | 13:23:27 UTC

Yeah sorry. I've developed a partially healthy paranoia over the last years due to some researchers being unhealthily competitive. In this case I'm trying to steer away from what others do to avoid problems but I'll be a bit vague anyway just to be safe.

Practically we are trying to teach a neural network to calculate molecular energies and forces. QM calculations are horribly slow and scale at least quadratically with the number of atoms. A network trained on QM data, however, is orders of magnitude faster, scales linearly with the number of atoms, and achieves decently good accuracy. We believe these networks are the future of molecular simulations, so we are working with them to see what problems we can apply them to. At the moment they are still slower than the usual MD simulations we used to do, but they should be much more accurate given enough training data. That training data is the critical part, and it is what we are currently trying to produce.
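
To get a feel for the scaling difference, here is a toy cost model. It is only a sketch: it counts work items and does not measure psi4 or any real network. The function names and the fixed neighbour count of 20 are made up for illustration, and the idea that the network looks only at a bounded local environment around each atom is an assumption about how linear scaling is reached.

```python
def pair_terms(n_atoms):
    """Number of distinct atom pairs; grows quadratically with n_atoms."""
    return n_atoms * (n_atoms - 1) // 2


def per_atom_terms(n_atoms, neighbors=20):
    """Work if each atom only sees a fixed number of neighbours; grows linearly.

    The default of 20 neighbours is an arbitrary illustrative value.
    """
    return n_atoms * neighbors


# Print both counts side by side for growing molecule sizes.
for n in (10, 100, 1000):
    print(n, pair_terms(n), per_atom_terms(n))
```

For tiny molecules the pairwise count is actually the smaller of the two; the linear model only pays off as the number of atoms grows.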

Currently there are already three or so groups working on such networks and they have shown great results so we try to mostly collaborate with them to avoid duplication of effort and clash of research topics.
If you want to read up on some great projects that inspired us, check out ANI1 (https://arxiv.org/abs/1610.08935), TensorMol (https://arxiv.org/abs/1711.06385) and DeepMD (https://arxiv.org/abs/1712.03641).

For applications to more biological research and the implications for disease-related research, you will have to wait for my publication :)

kain
Joined: 3 Sep 14
Posts: 140
Credit: 301,799,425
RAC: 1,124,040
Level
Asp
Message 49596 - Posted: 4 Jun 2018 | 14:57:41 UTC

Very interesting, thank you :)

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49597 - Posted: 4 Jun 2018 | 16:05:50 UTC - in response to Message 49595.

Thank you!!

biodoc
Joined: 26 Aug 08
Posts: 121
Credit: 847,311,950
RAC: 4,566
Level
Glu
Message 49598 - Posted: 4 Jun 2018 | 18:58:20 UTC

Thank you for sharing more info Stefan!

STARBASEn
Joined: 17 Feb 09
Posts: 58
Credit: 762,514,044
RAC: 1,813,935
Level
Glu
Message 49599 - Posted: 4 Jun 2018 | 22:02:49 UTC

Yes, thank you, much appreciated.

Jim1348
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Message 49600 - Posted: 4 Jun 2018 | 23:15:48 UTC - in response to Message 49595.

Practically we are trying to teach a neural network to calculate molecular energies and forces.

As you well know, the new Nvidia cards are said to be designed for "deep learning". Maybe that will be of some use to you someday.

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Message 49601 - Posted: 5 Jun 2018 | 7:44:18 UTC - in response to Message 49600.

Yes, these quantum chemistry potential networks are a deep learning application :) So we train them on our local NVIDIA GPUs. But for the moment I don't see a need to distribute the training of the networks to GPUGRID if that's what you meant. It trains fast enough locally.

Jim1348
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Message 49602 - Posted: 5 Jun 2018 | 10:35:45 UTC - in response to Message 49601.

OK, that gives me a better idea of how we fit into the overall scheme of things. I am glad we can do that work for you.

PappaLitto
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Message 49603 - Posted: 5 Jun 2018 | 11:16:50 UTC

Thank you for answering my questions Stefan.

mmonnin
Joined: 2 Jul 16
Posts: 184
Credit: 423,501,239
RAC: 2,245,528
Level
Gln
Message 49604 - Posted: 5 Jun 2018 | 11:47:24 UTC

I've been running the GPU tasks on a 3570K (alongside some other projects) on Ubuntu 14.04, with just 4 threads, so there is no chance of the simultaneous start issue. All good so far.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49605 - Posted: 5 Jun 2018 | 13:08:28 UTC - in response to Message 49601.
Last modified: 5 Jun 2018 | 13:08:41 UTC

So we train them on our local NVIDIA GPUs. But for the moment I don't see a need to distribute the training of the networks to GPUGRID if that's what you meant. It trains fast enough locally.


I don't understand.
Are we crunching GpuGrid QM to "prepare" data for your local/internal GPU?

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 335
Credit: 0
RAC: 0
Level

Message 49606 - Posted: 5 Jun 2018 | 14:01:16 UTC - in response to Message 49605.
Last modified: 5 Jun 2018 | 14:01:56 UTC

Hm ok, maybe you are not familiar with machine learning. Sorry if I glossed over it. In machine learning, and specifically in supervised learning as in this project, you "teach" a network to replicate some ground-truth calculations (in this case QM energy/force calculations).

This means that we take some molecules, position their atoms in 3D space and you guys calculate with QM the energy and forces of this conformation of this molecule.

Then, locally on my computer, I show a network only the positions of the atoms in space and ask it to predict the energy and forces that you calculated for us with QM. This might sound pointless: why predict something you already know? Well, the great thing about networks is that they are very good interpolators. If I now give the network a molecule (more or less) similar to the ones it was trained on and ask for its energy, it will give me an incredibly good estimate of the energy/forces in a few microseconds, while with QM I might need minutes to do the same.
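
The workflow can be sketched with a toy example. Everything in it is an illustrative stand-in: `slow_reference_energy` plays the role of the QM calculation done by volunteers (here just a Lennard-Jones-style curve for two atoms), and the "model" is a two-parameter least-squares fit rather than a neural network. The toy model also has the correct functional form built in, so it fits almost exactly; a real network has no such luxury and needs far more data.

```python
def slow_reference_energy(r):
    """Stand-in for the expensive QM calculation (toy two-atom energy curve)."""
    return 4.0 * (r ** -12 - r ** -6)


def features(r):
    """Describe a geometry (one interatomic distance) by two inverse powers."""
    return (r ** -12, r ** -6)


def fit(distances, energies):
    """Least-squares fit of E ~ a * r**-12 + b * r**-6 via the normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for r, e in zip(distances, energies):
        f1, f2 = features(r)
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        t1 += f1 * e
        t2 += f2 * e
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


def predict_energy(params, r):
    """Cheap prediction from the fitted parameters."""
    a, b = params
    f1, f2 = features(r)
    return a * f1 + b * f2


# "Volunteers" label a set of geometries with the slow reference method.
train_r = [0.95 + 0.05 * i for i in range(32)]
train_e = [slow_reference_energy(r) for r in train_r]

# "Local training" on the returned data.
params = fit(train_r, train_e)

# Ask the model about a geometry that was not in the training set.
r_new = 1.337
print(slow_reference_energy(r_new), predict_energy(params, r_new))
```

The point of the sketch is the division of labour: the expensive labels are computed once, and every later question about a new geometry is answered by the cheap fitted model.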

Does this clarify it?

PappaLitto
Joined: 21 Mar 16
Posts: 427
Credit: 3,144,121,066
RAC: 7,442,334
Level
Arg
Message 49607 - Posted: 5 Jun 2018 | 18:30:10 UTC - in response to Message 49606.

Thank you Stefan, this is a fantastic explanation!

Keith Myers
Joined: 13 Dec 17
Posts: 155
Credit: 135,281,463
RAC: 1,053,499
Level
Cys
Message 49608 - Posted: 6 Jun 2018 | 1:44:43 UTC - in response to Message 49607.

+1

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 110
Credit: 267,632
RAC: 222
Level

Message 49619 - Posted: 6 Jun 2018 | 13:18:37 UTC - in response to Message 49606.

Does this clarify it?


Very, very clear.
Thank you!!

Jim1348
Joined: 28 Jul 12
Posts: 642
Credit: 1,210,725,024
RAC: 66,314
Level
Met
Message 49729 - Posted: 27 Jun 2018 | 16:28:14 UTC - in response to Message 49606.

For some reason, that approach appeals to me a lot as an ideal distributed computing project. It allows you to offload the slow calculations, while you retain the flexibility of asking a lot of different questions on the returned data that you can investigate with your high-speed calculations. It does not appear to require excessive bandwidth to shuttle the data back and forth (though I have a lot if needed), nor does it require a lot of memory (though I have plenty of that too if you need it).

Also, it does not appear to be tied to a particular type of molecule or disease, but should lend itself to a wide range of subjects. And lastly, aside from a few startup glitches, it seems to be reliable on home computers, and we need not constantly fight to keep it running in the face of errors or crashes. I hope, and expect, that you will achieve significant success with it.

Tomas Brada
Joined: 3 Nov 15
Posts: 38
Credit: 2,015,431
RAC: 478
Level
Ala
Message 50855 - Posted: 11 Nov 2018 | 18:47:01 UTC

I am currently running QC on my box and there do not appear to be any startup glitches. It can run two tasks and it can start multiple at the same time. I see there is some usage of flock in the app, and that seems to solve the startup crashes that everyone was complaining about. So, good job.
Pity that there are no more AMD GPU tasks, now that I have acquired two such GPUs.
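
For anyone curious how a flock-based guard against simultaneous starts can work, here is a minimal Linux-only sketch. How the app really uses flock is not shown in this thread, so the file names `setup.lock` and `env_ready` and both helper functions are hypothetical; only `fcntl.flock` and the `os` calls are real standard-library APIs.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        # Blocks until no other process holds the lock on this file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def set_up_environment_once(workdir):
    """Run the one-time setup under the lock; return True only if this call did it."""
    marker = os.path.join(workdir, "env_ready")  # hypothetical marker file
    with exclusive_lock(os.path.join(workdir, "setup.lock")):
        if os.path.exists(marker):
            return False  # another task already finished the setup
        # The expensive one-time work (e.g. unpacking software) would go here.
        with open(marker, "w") as handle:
            handle.write("ok\n")
        return True


with tempfile.TemporaryDirectory() as demo_dir:
    print(set_up_environment_once(demo_dir))
    print(set_up_environment_once(demo_dir))
```

Tasks that start at the same moment queue up on the lock, so only the first one performs the setup and the rest see the marker and skip it.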