Message boards : News : All acemd3 apps updated (210)
Author | Message |
---|---|
Currently there should be no major *known* bugs. We should cover Win64 and Linux, with reasonably recent cards. | |
ID: 52852 | Rating: 0 | |
Yes, I have had several task failures today, when I had never had any before. I validated three test tasks with the new 2.10 app and one normal task. | |
ID: 52853 | Rating: 0 | |
The DHFR210 set was botched because old versions were still lurking around. I deprecated all the old apps now. The 210a set was created after this so should be ok. | |
ID: 52854 | Rating: 0 | |
"CUDA version supported by your driver, rather than other heuristics" Which version is recommended as the minimum now? MrS | |
ID: 52857 | Rating: 0 | |
Looking at the supported applications page - http://www.gpugrid.net/apps.php | |
ID: 52860 | Rating: 0 | |
"Which version is recommended as the minimum now?" As per the Nvidia deployment documentation (previously posted by Keith Myers), https://docs.nvidia.com/deploy/cuda-compatibility/index.html, the minimum drivers are: CUDA80: r367.48 or higher; CUDA92: r396.26 or higher; CUDA100: r410.48 or higher; CUDA101: r418.39 or higher. | |
ID: 52861 | Rating: 0 | |
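The minimum driver versions quoted in the post above can be turned into a quick check. This is only a sketch: the lowercase keys (`cuda80` and so on) and the function names are made up for illustration, the numbers are copied from the post rather than re-checked against Nvidia's page, and the project's scheduler may apply further rules (card generation, operating system) that are not modelled here.

```python
from typing import List, Tuple

# Minimum driver per CUDA version, as quoted in the thread.
# The dictionary keys are hypothetical labels, not confirmed plan-class names.
MIN_DRIVER = {
    "cuda80": (367, 48),
    "cuda92": (396, 26),
    "cuda100": (410, 48),
    "cuda101": (418, 39),
}


def parse_driver(version: str) -> Tuple[int, ...]:
    """Turn 'r418.39' or '430.26' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.lstrip("r").split("."))


def supported_cuda_versions(driver: str) -> List[str]:
    """Return the CUDA labels whose minimum driver is met by `driver`."""
    have = parse_driver(driver)
    return [label for label, need in MIN_DRIVER.items() if have >= need]


if __name__ == "__main__":
    for drv in ("396.26", "410.48", "430.26"):
        print(drv, supported_cuda_versions(drv))
```

Versions are compared as tuples, so `410.48` correctly sorts below `418.39` even though a plain string or float comparison could mislead for other pairs.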
More failures because the old app (206) is still being sent out despite being deprecated. | |
ID: 52863 | Rating: 0 | |
"Which version is recommended as the minimum now?" Exactly. Updated drivers are necessary for RTX users. They should go for r418.39 or higher. | |
ID: 52864 | Rating: 0 | |
Seems to be working. I added a FAQ item. Old WUs may still fail. | |
ID: 52866 | Rating: 0 | |
It also depends on what generation of card you have. My GTX 980 with the 430.26 drivers running under Ubuntu 18.04 is getting the CUDA 100 work units. It ran for a total of 43 minutes on an a81-TONI_TESTTESTLONG210 task. But it uses a whole CPU core, so I just reserve one for it. | |
ID: 52867 | Rating: 0 | |
Now we need 10,000 WUs loaded. | |
ID: 52869 | Rating: 0 | |
"Now we need 10,000 WUs loaded." +1 | |
ID: 52870 | Rating: 0 | |
Over the past several days I have received 4 ACEMD 210 WUs on two Linux machines with 3 GTX 1060s, and all validated fine (3 x 7,500 and 1 x 75,000 points). The Linux machines are ready for production WUs anytime. | |
ID: 52871 | Rating: 0 | |
Could you please answer these 2 questions... | |
ID: 52877 | Rating: 0 | |
As far as I know, you still can't resume a task on a different card type. I solved it by changing my preference for switching among apps to 360 minutes instead of the default 60 minutes. That way the task starts and finishes on the same card. I haven't seen any task take that long to finish yet, so this should be adequate until the app is declared to Main and we start getting Long tasks again with the new apps. | |
ID: 52879 | Rating: 0 | |
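For anyone who prefers to set the 360-minute switch interval locally rather than in the web preferences, the workaround above might be sketched as follows. This assumes the BOINC client reads a `global_prefs_override.xml` file from its data directory and that the relevant tag is `cpu_scheduling_period_minutes`; both should be checked against the BOINC client documentation before use. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(switch_minutes: float) -> str:
    """Build the text of a minimal global_prefs_override.xml.

    Assumption: <cpu_scheduling_period_minutes> is the tag behind the
    "switch between applications" preference mentioned in the post.
    """
    if switch_minutes <= 0:
        raise ValueError("switch_minutes must be positive")
    root = ET.Element("global_preferences")
    tag = ET.SubElement(root, "cpu_scheduling_period_minutes")
    tag.text = f"{switch_minutes:.6f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 360 minutes is the value used in the post; the default is 60.
    print(build_prefs_override(360))
```

The script only prints the XML. Writing it into the BOINC data directory and asking the client to re-read its local preferences is left to the user, since paths differ between installations.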
Sadly, one still can't restart on a different card type. Could you please answer these 2 questions... | |
ID: 52880 | Rating: 0 | |
Are there any plans to allow the app to resume on a different GPU? | |
ID: 52881 | Rating: 0 | |
We looked into it, but do not know if and when there will be progress on that front. For the time being, I've amended the FAQ with a pointer on GPU exclusion. | |
ID: 52882 | Rating: 0 | |
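The GPU exclusion mentioned above refers to the `exclude_gpu` option in the BOINC client's `cc_config.xml`. A minimal sketch of generating such a file is shown below. The project URL is taken from the links in this thread, while the device number and the app name `acemd3` are assumptions for illustration; the FAQ entry remains the authoritative source for the values to use.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Optional


def build_cc_config(project_url: str,
                    device_nums: Iterable[int],
                    app: Optional[str] = None) -> str:
    """Build cc_config.xml text with one <exclude_gpu> entry per device.

    Each entry tells the client not to use that GPU for the given project
    (optionally restricted to a single app).
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for num in device_nums:
        entry = ET.SubElement(options, "exclude_gpu")
        ET.SubElement(entry, "url").text = project_url
        ET.SubElement(entry, "device_num").text = str(num)
        if app is not None:
            ET.SubElement(entry, "app").text = app
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical host: keep this project's acemd3 tasks off device 1,
    # so every task starts and resumes on device 0.
    print(build_cc_config("http://www.gpugrid.net/", [1], app="acemd3"))
```

On a host with two different card types, excluding all but one device for the project means a suspended task can only resume on the card it started on, which sidesteps the restart limitation discussed in this thread.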
Thank you. That is suitable, and I plan on implementing that approach shortly for one of my systems that gets suspended/resumed a lot. Did you know I'm responsible for exclude_gpu existing? ;) | |
ID: 52883 | Rating: 0 | |
"Thank you. That is suitable, and I plan on implementing that approach shortly for one of my systems that gets suspended/resumed a lot. Did you know I'm responsible for exclude_gpu existing? ;)" No, I didn't, and let me add: well done :) | |
ID: 52884 | Rating: 0 | |
Thanks :) | |
ID: 52885 | Rating: 0 | |
Toni, a minor request for the next version: the previous one had some nice information in the Stderr output, i.e. GPU, driver, clocks and natoms. The latter was useful for determining small performance differences, as the "credits per time" always depended on the number of atoms in a simulation. So please include whatever you can port without too much trouble. | |
ID: 52897 | Rating: 0 | |
I would like to ask: as I understand it, a GTX 670 with the latest driver (440.97) should work with the acemd3 app, since this driver, as confirmed by BOINC, has the latest CUDA tools 10.2. Is this correct? | |
ID: 52912 | Rating: 0 | |
Are ANY of the 9.22 or 9.23 apps working? I thought I saw mention that none of them work anymore because their license expired again. | |
ID: 52913 | Rating: 0 | |
"latest driver: 440.97 should work with the acemd3 app" 440.97 is working fine for me, but I don't have a 670. | |
ID: 52914 | Rating: 0 | |
The 9.23 app works on this computer: http://www.gpugrid.net/show_host_detail.php?hostid=441816, but not on this: http://www.gpugrid.net/show_host_detail.php?hostid=486229 | |
ID: 52915 | Rating: 0 | |
Any reason why you don't want to move on to the acemd3 app? Other than there is no work for it again. | |
ID: 52916 | Rating: 0 | |
"The 9.23 app works on this computer: http://www.gpugrid.net/show_host_detail.php?hostid=441816, but not on this: http://www.gpugrid.net/show_host_detail.php?hostid=486229" The second one is 9.22 (CUDA 6.5). Its license has expired. | |
ID: 52917 | Rating: 0 | |
"The 9.23 app works on this computer: http://www.gpugrid.net/show_host_detail.php?hostid=441816, but not on this: http://www.gpugrid.net/show_host_detail.php?hostid=486229" "The second one is 9.22 (CUDA 6.5). Its license has expired." Thanks, Zoltan. I thought I saw somewhere that some (one) of the apps had an expired license again. | |
ID: 52918 | Rating: 0 | |
That is why I thought installing the newest driver would automatically make BOINC download the CUDA80 9.23 app until acemd3 is widely available. Otherwise I will just dismantle this particular computer, as its GPU is not very efficient anymore, and pass it to a user who still uses a GTX 5XX for another BOINC project. "The 9.23 app works on this computer: http://www.gpugrid.net/show_host_detail.php?hostid=441816, but not on this: http://www.gpugrid.net/show_host_detail.php?hostid=486229" "The second one is 9.22 (CUDA 6.5). Its license has expired." | |
ID: 52919 | Rating: 0 | |
Well the project should send out the available 9.23 CUDA80 app for applicable hardware . . . if they have configured the scheduler correctly for deprecating the CUDA65 9.22 app. Obviously that hasn't happened. | |
ID: 52920 | Rating: 0 | |
It's gotten too confusing keeping track of what runs on what. I'm setting my hosts to normal preferences, which for me is long runs and ACEMD3. The server shouldn't be sending the wrong apps to the wrong hosts. Any failures I have will be dealt with via the normal BOINC mechanisms. | |
ID: 52921 | Rating: 0 | |
"It's gotten too confusing keeping track of what runs on what. I'm setting my hosts to normal preferences, which for me is long runs and ACEMD3. The server shouldn't be sending the wrong apps to the wrong hosts. Any failures I have will be dealt with via the normal BOINC mechanisms." I think it's the sensible approach. t | |
ID: 52922 | Rating: 0 | |
"It's gotten too confusing keeping track of what runs on what. I'm setting my hosts to normal preferences, which for me is long runs and ACEMD3. The server shouldn't be sending the wrong apps to the wrong hosts. Any failures I have will be dealt with via the normal BOINC mechanisms." Is this policy of app assignment still in effect? This might be the reason for sending the 9.22 (CUDA6.5) app to hosts with CC3.0 cards. It's time to deprecate these apps if their license has in fact expired. I've saved a WU which failed 7 times before my host picked it up: 9.22 (CUDA65) app: 4 times; Turing card: 2 times; exit code -55: 1 time. It's time to release the new app and deprecate the non-working ones, as this project is spamming itself into the void. | |
ID: 52924 | Rating: 0 | |
Now that the queues run completely dry, it's time to deprecate the old app, release the new, and fill up the queue. | |
ID: 52995 | Rating: 0 | |
"Now that the queues run completely dry, it's time to deprecate the old app, release the new, and fill up the queue." You are a real optimist, Zoltan :-))) | |
ID: 52996 | Rating: 0 | |
"Now that the queues run completely dry, it's time to deprecate the old app, release the new, and fill up the queue." I'll second that. I don't have any Windows machines, so all my Linux hosts have been crunching only E@H and WCG since May. | |
ID: 53000 | Rating: 0 | |
They are dependent on only a few people for work. I am sure some of them have teaching obligations and other course work. | |
ID: 53001 | Rating: 0 | |