Message boards : News : App update 17 April 2017

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 46981 - Posted: 17 Apr 2017 | 19:49:15 UTC

Dear All,

If you've been following other threads on the forums, you'll know that the Windows versions of the science application have recently been updated. There are now two versions, 849 (CUDA 6.5) and 918 (CUDA 8.0), which will be assigned to hosts as follows:

* 849
- Windows XP (32 or 64 bit) and any GPU >= sm 3.0
- Any 64bit Windows with a Kepler 3.0 device

* 918
- Any 64bit Windows, Vista or later with any GPU > sm 3.0 and driver >= 370.0

The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.
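
The assignment rules above can be sketched as a small decision function. This is only an illustration of the rules as stated in this post, not the project's actual scheduler code: the function name `assign_app` and its parameters are hypothetical, any non-XP Windows is assumed to be Vista or later, and no minimum driver for 849 is modelled because none is stated here.

```python
from typing import Optional, Tuple


def assign_app(is_xp: bool, is_64bit: bool,
               sm: Tuple[int, int], driver: Tuple[int, int]) -> Optional[int]:
    """Return 849, 918, or None (no app) for a Windows host.

    Hypothetical helper that mirrors the rules listed in the post.
    sm is the GPU's hardware version, e.g. (3, 0); driver is e.g. (381, 65).
    Assumption: a non-XP host runs Vista or later.
    """
    if sm < (3, 0):
        return None   # below sm 3.0: neither version is listed for it
    if is_xp:
        return 849    # XP, 32 or 64 bit, any GPU >= sm 3.0
    if not is_64bit:
        return None   # non-XP hosts must be 64 bit
    if sm == (3, 0):
        return 849    # the sm 3.0 exception
    if driver >= (370, 0):
        return 918    # 64 bit, Vista or later, GPU > sm 3.0, driver >= 370.0
    return None       # driver too old for 918
```

Under these assumptions, `assign_app(False, True, (5, 2), (381, 65))` returns 918, while the same host with an sm 3.0 card returns 849.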

If you see any behaviour that deviates from this, please report it here.

If you think you should be getting work but aren't, please check that your system complies with the above, in particular the driver revision and that the OS is 64-bit.
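
One possible way to check those two points on your own host is sketched below. It rests on assumptions: that the `nvidia-smi` tool installed with the NVIDIA driver is on the PATH and accepts the query shown, and that `platform.machine()` reports `AMD64` on 64-bit Windows. The helper names are made up for illustration.

```python
import platform
import subprocess
from typing import Optional, Tuple

MIN_DRIVER_918 = (370, 0)  # driver requirement stated for the 918 app


def parse_driver(text: str) -> Tuple[int, int]:
    """Turn a version string such as '381.65' into (381, 65)."""
    parts = text.strip().split(".")
    return (int(parts[0]), int(parts[1]) if len(parts) > 1 else 0)


def is_64bit_os() -> bool:
    """Assumption: 64-bit Windows reports its machine type as 'AMD64'."""
    return platform.machine().lower() in {"amd64", "x86_64"}


def installed_driver() -> Optional[Tuple[int, int]]:
    """Ask nvidia-smi for the driver version; None if it cannot be read."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.splitlines()
    return parse_driver(lines[0]) if lines else None


if __name__ == "__main__":
    driver = installed_driver()
    print("64-bit OS:", is_64bit_os())
    print("Driver:", driver)
    print("Driver new enough for 918:",
          driver is not None and driver >= MIN_DRIVER_918)
```

Versions are compared as integer pairs rather than as floats so that a release such as 358.50 is not confused with 358.5.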

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1787
Credit: 9,544,192,994
RAC: 2,920,810
Message 46983 - Posted: 17 Apr 2017 | 21:02:47 UTC - in response to Message 46981.
Last modified: 17 Apr 2017 | 21:06:16 UTC

I've checked this on one of my Windows XP x64 hosts, and it has received the v9.18 (CUDA 8.0) client.
EDIT: you may see this host as a Windows 10 PC, as I'm currently updating the "spare" Windows 10 OS on my hosts. But it had received the CUDA 8.0 client before I began working on this host (i.e. under Windows XP x64, GTX 980 Ti, driver v358.50).

Bedrich Hajek
Joined: 28 Mar 09
Posts: 329
Credit: 3,718,485,509
RAC: 2,863,959
Message 46985 - Posted: 17 Apr 2017 | 21:51:05 UTC

I have received two 9.18 (CUDA 8.0) tasks on my XP computer; both errored out:

http://www.gpugrid.net/result.php?resultid=16246847
http://www.gpugrid.net/result.php?resultid=16246846

The computer has driver 355.82 (CUDA 7.5).





DrBob
Joined: 1 Sep 08
Posts: 3
Credit: 81,143,789
RAC: 464,856
Message 46989 - Posted: 17 Apr 2017 | 23:14:19 UTC
Last modified: 17 Apr 2017 | 23:36:15 UTC

I have 2 GTX460 cards (Hosts 317733 & 208443) that have been unable to get work since the changes over the weekend.
Both are running Driver ver. 378.92. According to GPU-Z these cards are sm 5.0.

The project will not send work, saying no tasks are available.

4/17/2017 6:07:42 PM | GPUGRID | update requested by user
4/17/2017 6:07:46 PM | GPUGRID | Sending scheduler request: Requested by user.
4/17/2017 6:07:46 PM | GPUGRID | Requesting new tasks for NVIDIA GPU
4/17/2017 6:07:49 PM | GPUGRID | Scheduler request completed: got 0 new tasks
4/17/2017 6:07:49 PM | GPUGRID | No tasks sent
4/17/2017 6:07:49 PM | GPUGRID | No tasks are available for Short runs (2-3 hours on fastest card)
4/17/2017 6:07:49 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

Other computers running GTX 750 Ti & GTX 1050 Ti cards are receiving work without any problems.

Carl
Joined: 2 May 13
Posts: 7
Credit: 786,560,489
RAC: 372,326
Message 46990 - Posted: 18 Apr 2017 | 0:01:56 UTC

I'm getting the same error of not receiving tasks for a GTX 570 with driver 381.65 on Windows 8.1. However, I have no problems with my other cards: a GTX 770 and a GTX 660 Ti.

Since the 570 and 460 cards are Fermi based, I am guessing that they might not be supported anymore. But that is speculation on my part as I am not an advanced computer user.

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,216,571,927
RAC: 2,954,108
Message 46998 - Posted: 18 Apr 2017 | 5:18:10 UTC
Last modified: 18 Apr 2017 | 5:20:56 UTC

Based on the recent information that GPUGRID will run on Windows XP for one more year, I tried to continue crunching on my XP machine (driver 361.75).
However, BOINC says

18/04/2017 07:14:02 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

although the Project Status Page shows 300+ unsent tasks.

What can I do to get this to work?

Erich56
Message 47015 - Posted: 18 Apr 2017 | 15:09:31 UTC

> What can I do to get this to work?

I tried again late afternoon, and could download new tasks :-)

Retvari Zoltan
Message 47023 - Posted: 18 Apr 2017 | 22:22:59 UTC

I can also confirm that my Windows XP x64 hosts have received the 8.49 app, and it's working.

Jacob Klein
Joined: 11 Oct 08
Posts: 1021
Credit: 972,419,814
RAC: 1,169,942
Message 47040 - Posted: 20 Apr 2017 | 0:58:53 UTC
Last modified: 20 Apr 2017 | 1:00:48 UTC

Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!

Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
- Who is responsible for fixing that (you, or NVIDIA?)
- What steps are being taken to get it fixed (have you contacted the right people?)

Bedrich Hajek
Message 47042 - Posted: 20 Apr 2017 | 1:34:33 UTC - in response to Message 47040.

Jacob Klein said:
> Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!
>
> Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
> - Who is responsible for fixing that (you, or NVIDIA?)
> - What steps are being taken to get it fixed (have you contacted the right people?)

I have the same issue on my Windows 10 computer, running GTX 980 Ti cards.



eXaPower
Joined: 25 Sep 13
Posts: 260
Credit: 654,256,992
RAC: 3,023,166
Message 47049 - Posted: 20 Apr 2017 | 12:11:10 UTC - in response to Message 47042.

Bedrich Hajek said:
>> Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!
>>
>> Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
>> - Who is responsible for fixing that (you, or NVIDIA?)
>> - What steps are being taken to get it fixed (have you contacted the right people?)
>
> I have the same issue on my Windows 10 computer, running GTX 980 Ti cards.

Suspending has taken up to 5 min on Win8.1.
Suspend/resume is an issue too: (15) WUs errored upon resume, on the same or a different GPU.
WUs need to run without interruption to validate properly.

Wiyosaya
Joined: 22 Nov 09
Posts: 101
Credit: 137,753,403
RAC: 141
Message 47066 - Posted: 22 Apr 2017 | 5:33:09 UTC - in response to Message 47049.

I upgraded both of my PCs to the 381.65 driver, and I am no longer getting work. It is a bit unclear to me whether I should be getting work. From what I understand, my 460 and 580 are Fermi-based, but from the announcement I am led to believe that Fermi is still supported. I do realize that others using similar cards are not receiving work either, so I am guessing that these cards are no longer supported.

Matt, is it official that Fermi is no longer supported for gpugrid?

Thanks.


Retvari Zoltan
Message 47067 - Posted: 22 Apr 2017 | 11:35:02 UTC - in response to Message 47066.
Last modified: 22 Apr 2017 | 11:36:17 UTC

Wiyosaya said:
> is it official that Fermi is no longer supported for gpugrid?

Yes.
Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.
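
As a quick reference, the lookup below lists the compute capability of the cards mentioned in this thread, taken from NVIDIA's published figures, and checks each against the CC3.0 minimum. It is a sketch only: the names `COMPUTE_CAPABILITY`, `MIN_CC` and `is_supported` are made up for illustration, and meeting the minimum says nothing about which app version a host is sent.

```python
# Compute capability (major, minor) of cards named in this thread.
COMPUTE_CAPABILITY = {
    "GT 610": (2, 1),       # Fermi
    "GTX 460": (2, 1),      # Fermi
    "GTX 570": (2, 0),      # Fermi
    "GTX 580": (2, 0),      # Fermi
    "GTX 660 Ti": (3, 0),   # Kepler
    "GTX 770": (3, 0),      # Kepler
    "GTX 750 Ti": (5, 0),   # Maxwell
    "GTX 970": (5, 2),      # Maxwell
    "GTX 980 Ti": (5, 2),   # Maxwell
    "GTX 1050 Ti": (6, 1),  # Pascal
    "GTX 1060": (6, 1),     # Pascal
    "GTX 1070": (6, 1),     # Pascal
    "GTX 1080": (6, 1),     # Pascal
}

MIN_CC = (3, 0)  # minimum required compute capability stated above


def is_supported(card: str) -> bool:
    """True if the card meets the CC3.0 minimum (hypothetical helper)."""
    return COMPUTE_CAPABILITY[card] >= MIN_CC


if __name__ == "__main__":
    for card, cc in sorted(COMPUTE_CAPABILITY.items(), key=lambda kv: kv[1]):
        status = "meets minimum" if is_supported(card) else "below minimum"
        print(f"{card:12s} CC {cc[0]}.{cc[1]}  {status}")
```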

Jacob Klein
Message 47129 - Posted: 28 Apr 2017 | 3:43:06 UTC - in response to Message 47067.

MJH said:

15 Apr 2017 | 21:43:26 UTC
http://www.gpugrid.net/forum_thread.php?id=4545&nowrap=true#46932
> For some reason the sm 3.0 support (and only that sm version) is broken.

17 Apr 2017 | 19:49:15 UTC
http://www.gpugrid.net/forum_thread.php?id=4551&nowrap=true#46981
> The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.

.....
But I don't know what that means!

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?
I feel like nobody is trying to fix it.

Wiyosaya
Message 47136 - Posted: 28 Apr 2017 | 23:01:43 UTC - in response to Message 47067.

Retvari Zoltan said:
>> is it official that Fermi is no longer supported for gpugrid?
>
> Yes.
> Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
> Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
> See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.

Thanks, Retvari. The link will help me pick a new card later this year. I'll probably get something off eBay, but I am not sure what yet. I will, however, be considering cards that have more recent SM implementations.

Carl
Message 47148 - Posted: 1 May 2017 | 4:38:19 UTC - in response to Message 47067.

Retvari Zoltan said:
>> is it official that Fermi is no longer supported for gpugrid?
>
> Yes.
> Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
> Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
> See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.

What is interesting is that my GTX 660 Ti is working just fine.

Name: e84s30_e74s25p0f47-PABLO_P04637_0_IDP-0-1-RND6466_1
Workunit: 12522600
Created: 27 Apr 2017 | 14:41:13 UTC
Sent: 27 Apr 2017 | 16:27:39 UTC
Received: 28 Apr 2017 | 21:56:18 UTC
Server state: Over
Outcome: Success
Client state: Done
Exit status: 0 (0x0)
Computer ID: 151018
Report deadline: 2 May 2017 | 16:27:39 UTC
Run time: 70,132.96
CPU time: 16,657.83
Validate state: Valid
Credit: 210,625.00
Application version: Long runs (8-12 hours on fastest card) v8.49 (cuda65)

Stderr output
<core_client_version>7.4.36</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r381_64 : 38165
# GPU 0 : 65C
# GPU 0 : 67C
# GPU 0 : 68C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r381_64 : 38165
# GPU 0 : 50C
# GPU 0 : 55C
# GPU 0 : 59C
# GPU 0 : 61C
# GPU 0 : 64C
# GPU 0 : 65C
# GPU 0 : 66C
# GPU 0 : 67C
# GPU 0 : 68C
# Time per step (avg over 12035000 steps): 5.614 ms
# Approximate elapsed time for entire WU: 70171.554 s
# PERFORMANCE: 44910 Natoms 5.614 ns/day 0.000 ms/step 0.000 us/step/atom
14:49:34 (5348): called boinc_finish

</stderr_txt>
]]>

Richard Haselgrove
Joined: 11 Jul 09
Posts: 748
Credit: 1,207,854,595
RAC: 1,599,410
Message 47149 - Posted: 1 May 2017 | 6:32:32 UTC - in response to Message 47148.
Last modified: 1 May 2017 | 6:48:39 UTC

Carl said:
> What is interesting is that my GTX 660TI is working just fine.

I think that's because you have a nice simple setup with just one GPU per machine. BOINC can see exactly what you've got, and GPUGrid have set things up properly to avoid sending that machine the app which is causing the problems. Instead, you got

> Application version Long runs (8-12 hours on fastest card) v8.49 (cuda65)

It seems to be the people who are running multiple cards with mixed compute capabilities in the same computer that are hitting snags. It's a long-standing and well known weakness in BOINC that the client only tells the server about the 'best' card in a system. If a CC 3.5 or higher card is present alongside the GTX 660Ti, BOINC will assign the application suitable for that newer card instead, and that's what causes the problem if the task is run on the secondary card.
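
The mismatch described above can be illustrated with a short sketch. It makes several simplifying assumptions that are not confirmed in this thread: the host is a 64-bit non-XP Windows machine with a recent driver, the 'best' card is simply the one with the highest compute capability, and the helper names `app_for` and `mismatched_cards` are hypothetical.

```python
from typing import List, Tuple

CC = Tuple[int, int]  # compute capability as (major, minor)


def app_for(cc: CC) -> str:
    """App that suits a single card of this CC (simplified assumption)."""
    if cc < (3, 0):
        return "none"
    return "8.49 (cuda65)" if cc == (3, 0) else "9.18 (cuda80)"


def mismatched_cards(cards: List[CC]) -> List[CC]:
    """Cards that would run an app chosen for the best card, not for them."""
    if not cards:
        return []
    chosen = app_for(max(cards))  # the server only learns about the best card
    return [cc for cc in cards if app_for(cc) != chosen]


if __name__ == "__main__":
    # One CC 5.2 card alongside two CC 3.0 cards
    print(mismatched_cards([(5, 2), (3, 0), (3, 0)]))
    # A single CC 3.0 card on its own
    print(mismatched_cards([(3, 0)]))
```

In this model, a host with only CC 3.0 cards has no mismatch, while adding one newer card causes every CC 3.0 card in the host to be handed the app intended for the newer one.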

Jacob Klein
Message 47167 - Posted: 4 May 2017 | 5:29:13 UTC
Last modified: 4 May 2017 | 5:37:45 UTC

MJH (et al.):

I have concluded my exhaustive CUDA 8.0 SDK testing. On my Win10 x64 Build 16184 PC (with GTX970, GTX660Ti, GTX660Ti), I installed VS2015 Community, installed the CUDA 8.0 Toolkit and samples, installed the DirectX SDK, then built all of the CUDA solutions.

There are 155 CUDA samples that I was able to compile and test with. I went through them twice:
1) 381.89 - GTX970, GTX660Ti, GTX660Ti
2) 381.89 - GTX660Ti, GTX660Ti (I pulled the GTX970 out of the system)

All 155 samples passed on both runs, except that VFlockingD3D10 did not look correct on my GTX660Ti but looked fine on my GTX970. All other calculations and samples worked fine, even on a GTX660Ti.

This leads me to believe that the GPUGrid problem with the "9.18 (cuda80)" app, where it errors out immediately on a system that has a CC3.0/SM3 GPU .... might not be an NVIDIA problem. It might be a problem with your app.

Is it possible you are calling some method or function, that isn't supported by CC3.0/SM3?

I desperately want you to provide more info. I'm spending considerable effort to help you solve this, yet my questions to you go unanswered. I hope you're making progress with a fix - please consider chiming in with your findings.

Jacob Klein

Jacob Klein
Message 47219 - Posted: 15 May 2017 | 3:04:02 UTC

Month 2.
The frustration continues.

eXaPower
Message 47272 - Posted: 18 May 2017 | 12:51:28 UTC - in response to Message 47049.

eXaPower said:
>>> Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!
>>>
>>> Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
>>> - Who is responsible for fixing that (you, or NVIDIA?)
>>> - What steps are being taken to get it fixed (have you contacted the right people?)
>>
>> I have the same issue on my Windows 10 computer, running GTX 980 Ti cards.
>
> Suspending has taken up to 5 min on Win8.1.
> Suspend/resume is an issue too: (15) WUs errored upon resume, on the same or a different GPU.
> WUs need to run without interruption to validate properly.

13 days ago I updated to 382.05 from 381.89 - today my 1080/1070/1060/970/970 (Z87 system) has 121 consecutive valid 9.18 tasks.
Suspend/resume is working without issue. WUs are taking 1~8 seconds to suspend.
Maybe the suspend problem has been resolved in the most recent 381.99.* branch driver?

382.05 is the best driver for Win8.1 in several months (369.00). I'm now able to leave my system unsupervised for extended periods of time.
(Just in time for humid summertime crunching.)
With the prior couple of branches I had a daily random reset. I previously thought it might have been a hardware issue or an OS problem, since it's only a bone-stock 2015 Win8.1 with no updates. IMHO: Win8.1 is faster than Win 7 or 10 when running (multi) GPU compute.

Erich56
Message 47273 - Posted: 18 May 2017 | 15:40:26 UTC

You can read here what happens to my two hosts on Windows 10 with acemd 918.80 and driver 381.65:

http://www.gpugrid.net/forum_thread.php?id=4571&nowrap=true#47261

NOT amusing at all :-( I cannot leave the two PCs unattended for any length of time.

Behemot
Joined: 16 Jul 15
Posts: 1
Credit: 686,150
RAC: 4
Message 47280 - Posted: 18 May 2017 | 20:47:19 UTC

OK, I finally know why I am not getting any work for my GT 610. So I'm deleting the project from my client; there's no more reason to have it there.

Jacob Klein
Message 47283 - Posted: 19 May 2017 | 11:04:44 UTC
Last modified: 19 May 2017 | 11:06:20 UTC

GPUGrid:

Regarding the problem of your app immediately crashing on CC3.0/SM3 GPUs (like my 2 "GTX 660 Ti" GPUs)...

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?

I feel like nobody is trying to fix it. And if it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I might be able to work with NVIDIA to fix it.

But you guys aren't forthcoming with details.

When can we expect some answers? My main question, above, is still unanswered.

Jacob Klein
Message 47315 - Posted: 23 May 2017 | 10:59:57 UTC

Hello?

Jacob Klein
Message 47350 - Posted: 1 Jun 2017 | 4:26:51 UTC

*tap tap* Is this thing on?
