Message boards : News : App update 17 April 2017

Author Message
Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 46981 - Posted: 17 Apr 2017 | 19:49:15 UTC

Dear All,

If you've been following other threads on the forums, you'll know that the Windows versions of the science application have recently been updated. There are now two versions, 849 (CUDA 6.5) and 918 (CUDA 8.0), which will be assigned to hosts as follows:

* 849
- Windows XP (32 or 64 bit) and any GPU >= sm 3.0
- Any 64bit Windows with a Kepler 3.0 device

* 918
- Any 64bit Windows, Vista or later with any GPU > sm 3.0 and driver >= 370.0

The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.
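The assignment rules above can be restated as a small decision function. This is only a sketch of the rules as written in this post, not the project's actual scheduler code; the name `assign_app` and its parameters are made up for illustration, and "non-XP" is assumed to mean Vista or later.

```python
from typing import Optional, Tuple


def assign_app(is_xp: bool, is_64bit: bool,
               sm: Tuple[int, int], driver: float) -> Optional[str]:
    """Return "849", "918", or None (no work) for a Windows host.

    Hypothetical helper: it only restates the rules listed in this post.
    """
    if sm < (3, 0):
        return None      # below sm 3.0: neither version is listed
    if is_xp:
        return "849"     # XP, 32 or 64 bit, any GPU >= sm 3.0
    if not is_64bit:
        return None      # non-XP Windows must be 64 bit
    if sm == (3, 0):
        return "849"     # the sm 3.0 exception described above
    if driver >= 370.0:
        return "918"     # Vista or later, sm > 3.0, driver >= 370.0
    return None          # sm > 3.0 but the driver is too old for 918


print(assign_app(False, True, (5, 2), 378.92))  # 64-bit Windows 10, sm 5.2
print(assign_app(False, True, (3, 0), 381.65))  # Kepler sm 3.0 device
print(assign_app(True, False, (3, 5), 355.82))  # 32-bit Windows XP
```

Under these rules, a 64-bit non-XP host with a GPU above sm 3.0 but a driver older than 370.0 matches neither list, which is one reason to check the driver revision first.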

If you see any behaviour that deviates from this, please report it here.

If you think you should be getting work but aren't, please check that your system complies with the above, in particular that the driver revision is recent enough and that the OS is 64 bit.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 46983 - Posted: 17 Apr 2017 | 21:02:47 UTC - in response to Message 46981.
Last modified: 17 Apr 2017 | 21:06:16 UTC

I've checked this on one of my Windows XP x64 hosts, and it has received the v9.18 (CUDA8.0) client.
Tasklist
Host details
EDIT: you may see this host as a Windows 10 PC, as I'm currently updating the "spare" Windows 10 OS on my hosts. But it had received the CUDA8.0 client before I began working on this host (i.e. under Windows XP x64, GTX 980 Ti, driver v358.50).

Bedrich Hajek
Send message
Joined: 28 Mar 09
Posts: 340
Credit: 3,819,818,009
RAC: 929,462
Level
Arg
Message 46985 - Posted: 17 Apr 2017 | 21:51:05 UTC

I have received two 9.18 CUDA 8.0 tasks on my XP computer, and both errored out:


http://www.gpugrid.net/result.php?resultid=16246847

http://www.gpugrid.net/result.php?resultid=16246846



The computer has driver 355.82 CUDA 7.5.





Profile DrBob
Send message
Joined: 1 Sep 08
Posts: 3
Credit: 100,377,839
RAC: 7,607
Level
Cys
Message 46989 - Posted: 17 Apr 2017 | 23:14:19 UTC
Last modified: 17 Apr 2017 | 23:36:15 UTC

I have 2 GTX460 cards (Hosts 317733 & 208443) that have been unable to get work since the changes over the weekend.
Both are running Driver ver. 378.92. According to GPU-Z these cards are sm 5.0.

The project will not send work, saying no tasks are available.

4/17/2017 6:07:42 PM | GPUGRID | update requested by user
4/17/2017 6:07:46 PM | GPUGRID | Sending scheduler request: Requested by user.
4/17/2017 6:07:46 PM | GPUGRID | Requesting new tasks for NVIDIA GPU
4/17/2017 6:07:49 PM | GPUGRID | Scheduler request completed: got 0 new tasks
4/17/2017 6:07:49 PM | GPUGRID | No tasks sent
4/17/2017 6:07:49 PM | GPUGRID | No tasks are available for Short runs (2-3 hours on fastest card)
4/17/2017 6:07:49 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

Other computers running GTX750Ti & GTX1050Ti are receiving work without any problems.

Carl
Send message
Joined: 2 May 13
Posts: 7
Credit: 839,018,214
RAC: 458,047
Level
Glu
Message 46990 - Posted: 18 Apr 2017 | 0:01:56 UTC

I'm getting the same error of not receiving tasks for a GTX 570 with driver 381.65 on Windows 8.1. However, I have no problems with my other cards: GTX 770 and GTX 660TI.

Since the 570 and 460 cards are Fermi based, I am guessing that they might not be supported anymore. But that is speculation on my part as I am not an advanced computer user.

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 46998 - Posted: 18 Apr 2017 | 5:18:10 UTC
Last modified: 18 Apr 2017 | 5:20:56 UTC

Based on the recent information that GPUGRID will run on Windows XP for one more year, I tried to continue crunching on my XP machine (driver 361.75).
However, BOINC says

18/04/2017 07:14:02 | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)

although the Project Status Page shows 300+ unsent tasks.

What can I do to get this thing to work?

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47015 - Posted: 18 Apr 2017 | 15:09:31 UTC

What can I do to get this thing to work?

I tried again late afternoon, and could download new tasks :-)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47023 - Posted: 18 Apr 2017 | 22:22:59 UTC

I can also confirm that my Windows XP x64 hosts have received the 8.49 app, and it's working.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47040 - Posted: 20 Apr 2017 | 0:58:53 UTC
Last modified: 20 Apr 2017 | 1:00:48 UTC

Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!

Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
- Who is responsible for fixing that (you, or NVIDIA?)
- What steps are being taken to get it fixed (have you contacted the right people?)

Bedrich Hajek
Send message
Joined: 28 Mar 09
Posts: 340
Credit: 3,819,818,009
RAC: 929,462
Level
Arg
Message 47042 - Posted: 20 Apr 2017 | 1:34:33 UTC - in response to Message 47040.

Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!

Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
- Who is responsible for fixing that (you, or NVIDIA?)
- What steps are being taken to get it fixed (have you contacted the right people?)


I have the same issue on my windows 10 computer, running GTX980ti cards.



eXaPower
Send message
Joined: 25 Sep 13
Posts: 265
Credit: 1,043,270,117
RAC: 1,772,186
Level
Met
Message 47049 - Posted: 20 Apr 2017 | 12:11:10 UTC - in response to Message 47042.

Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!

Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
- Who is responsible for fixing that (you, or NVIDIA?)
- What steps are being taken to get it fixed (have you contacted the right people?)


I have the same issue on my windows 10 computer, running GTX980ti cards.




Suspending has taken up to 5 min on Win8.1.
Resume/suspend is also an issue: 15 WUs have errored upon resume, on the same or a different GPU.
WUs need to run without interruption to validate properly.

Wiyosaya
Send message
Joined: 22 Nov 09
Posts: 111
Credit: 171,819,453
RAC: 478,544
Level
Ile
Message 47066 - Posted: 22 Apr 2017 | 5:33:09 UTC - in response to Message 47049.

I upgraded both of my PCs to the 381.65 driver, and I am no longer getting work. It is a bit unclear to me whether I should be getting work. From what I understand, my 460 and 580 are Fermi based, but the announcement led me to believe that Fermi is still supported. I do realize that others using similar cards are not receiving work either, so I am guessing that these cards are no longer supported.

Matt, is it official that Fermi is no longer supported for gpugrid?

Thanks.


Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47067 - Posted: 22 Apr 2017 | 11:35:02 UTC - in response to Message 47066.
Last modified: 22 Apr 2017 | 11:36:17 UTC

is it official that Fermi is no longer supported for gpugrid?
Yes.
Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.
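As a quick illustration of the CC 3.0 cut-off, here is a minimal sketch that checks the cards mentioned in this thread against the minimum. The compute capability values are the published ones for these models; the names `COMPUTE_CAPABILITY`, `MIN_CC` and `is_supported` are invented for this example and are not part of BOINC or the GPUGrid app.

```python
from typing import Dict, Tuple

# Compute capability (major, minor) of cards mentioned in this thread.
COMPUTE_CAPABILITY: Dict[str, Tuple[int, int]] = {
    "GT 610": (2, 1),       # Fermi
    "GTX 460": (2, 1),      # Fermi
    "GTX 570": (2, 0),      # Fermi
    "GTX 580": (2, 0),      # Fermi
    "GTX 660 Ti": (3, 0),   # Kepler
    "GTX 770": (3, 0),      # Kepler
    "GTX 750 Ti": (5, 0),   # Maxwell
    "GTX 970": (5, 2),      # Maxwell
    "GTX 980 Ti": (5, 2),   # Maxwell
    "GTX 1050 Ti": (6, 1),  # Pascal
}

MIN_CC = (3, 0)  # minimum compute capability stated in this post


def is_supported(card: str) -> bool:
    """True if the card meets the minimum compute capability."""
    return COMPUTE_CAPABILITY[card] >= MIN_CC


unsupported = sorted(c for c in COMPUTE_CAPABILITY if not is_supported(c))
print("Below CC 3.0:", ", ".join(unsupported))
```

Tuples compare element by element, so `(2, 1) >= (3, 0)` is false while `(3, 0) >= (3, 0)` is true, which matches the "CC 3.0 or higher" rule.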

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47129 - Posted: 28 Apr 2017 | 3:43:06 UTC - in response to Message 47067.

MJH said:

15 Apr 2017 | 21:43:26 UTC
http://www.gpugrid.net/forum_thread.php?id=4545&nowrap=true#46932

For some reason the sm 3.0 support (and only that sm version) is broken.


17 Apr 2017 | 19:49:15 UTC
http://www.gpugrid.net/forum_thread.php?id=4551&nowrap=true#46981
The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.


.....
But I don't know what that means!

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?
I feel like nobody is trying to fix it.

Wiyosaya
Send message
Joined: 22 Nov 09
Posts: 111
Credit: 171,819,453
RAC: 478,544
Level
Ile
Message 47136 - Posted: 28 Apr 2017 | 23:01:43 UTC - in response to Message 47067.

is it official that Fermi is no longer supported for gpugrid?
Yes.
Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.

Thanks, Retvari. The link will help me pick a new card later this year. I'll probably get something off of e-bay, but I am not sure what yet. I will, however, be considering cards that have more recent SM implementations.

Carl
Send message
Joined: 2 May 13
Posts: 7
Credit: 839,018,214
RAC: 458,047
Level
Glu
Message 47148 - Posted: 1 May 2017 | 4:38:19 UTC - in response to Message 47067.

is it official that Fermi is no longer supported for gpugrid?
Yes.
Fermi (GTX 4xx & GTX 5xx series) is CC2.0 and CC2.1, which is below the minimum required CC3.0, so these cards are no longer supported by GPUGrid.
Due to some undisclosed "compiler problems", the lesser Kepler-based cards (CC3.0, GTX 660Ti) will fail all tasks.
See the compute capability (CC, also known as Shader Model (SM)) table on Wikipedia.


What is interesting is that my GTX 660TI is working just fine.

Name e84s30_e74s25p0f47-PABLO_P04637_0_IDP-0-1-RND6466_1
Workunit 12522600
Created 27 Apr 2017 | 14:41:13 UTC
Sent 27 Apr 2017 | 16:27:39 UTC
Received 28 Apr 2017 | 21:56:18 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 151018
Report deadline 2 May 2017 | 16:27:39 UTC
Run time 70,132.96
CPU time 16,657.83
Validate state Valid
Credit 210,625.00
Application version Long runs (8-12 hours on fastest card) v8.49 (cuda65)

Stderr output
<core_client_version>7.4.36</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r381_64 : 38165
# GPU 0 : 65C
# GPU 0 : 67C
# GPU 0 : 68C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r381_64 : 38165
# GPU 0 : 50C
# GPU 0 : 55C
# GPU 0 : 59C
# GPU 0 : 61C
# GPU 0 : 64C
# GPU 0 : 65C
# GPU 0 : 66C
# GPU 0 : 67C
# GPU 0 : 68C
# Time per step (avg over 12035000 steps): 5.614 ms
# Approximate elapsed time for entire WU: 70171.554 s
# PERFORMANCE: 44910 Natoms 5.614 ns/day 0.000 ms/step 0.000 us/step/atom
14:49:34 (5348): called boinc_finish

</stderr_txt>
]]>

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47149 - Posted: 1 May 2017 | 6:32:32 UTC - in response to Message 47148.
Last modified: 1 May 2017 | 6:48:39 UTC

What is interesting is that my GTX 660TI is working just fine.

I think that's because you have a nice simple setup with just one GPU per machine. BOINC can see exactly what you've got, and GPUGrid have set things up properly to avoid sending that machine the app which is causing the problems. Instead, you got

Application version Long runs (8-12 hours on fastest card) v8.49 (cuda65)

It seems to be the people who are running multiple cards with mixed compute capabilities in the same computer that are hitting snags. It's a long-standing and well known weakness in BOINC that the client only tells the server about the 'best' card in a system. If a CC 3.5 or higher card is present alongside the GTX 660Ti, BOINC will assign the application suitable for that newer card instead, and that's what causes the problem if the task is run on the secondary card.
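The mixed-card situation can be sketched roughly as follows. This is a simplified model, not BOINC's real logic: the "best" card is taken to be the one with the highest compute capability, the host is assumed to be 64-bit non-XP Windows with a recent enough driver and only CC 3.0+ cards, and the function names are hypothetical.

```python
from typing import List, Tuple

CC = Tuple[int, int]  # compute capability as (major, minor)


def app_for_reported_cc(cc: CC) -> str:
    """Simplified rule from the announcement: sm 3.0 gets 849, newer gets 918."""
    return "849" if cc == (3, 0) else "918"


def devices_at_risk(gpus: List[CC]) -> List[CC]:
    """Return the GPUs that could be handed a 918 task they cannot run.

    Only the best card is reported, so the whole host gets one app version.
    """
    best = max(gpus)  # assumption: "best" means highest compute capability
    if app_for_reported_cc(best) != "918":
        return []
    return [cc for cc in gpus if cc == (3, 0)]


mixed_host = [(5, 2), (3, 0), (3, 0)]   # e.g. GTX 970 plus two GTX 660 Ti
single_host = [(3, 0)]                  # e.g. one GTX 660 Ti

print(devices_at_risk(mixed_host))
print(devices_at_risk(single_host))
```

In this model the single-card host is assigned 849 and has no devices at risk, while the mixed host is assigned 918 and both CC 3.0 cards may receive tasks they fail.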

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47167 - Posted: 4 May 2017 | 5:29:13 UTC
Last modified: 4 May 2017 | 5:37:45 UTC

MJH (et al.):

I have concluded my exhaustive Cuda 8.0 SDK testing. On my Win10 x64 Build 16184 PC (with GTX970, GTX660Ti, GTX660Ti), I installed VS2015 Community, installed the Cuda 8.0 Toolkit and samples, installed the DirectX SDK, then built all of the Cuda solutions.

There are 155 Cuda samples that I was able to compile and test with. And I went through them, 2 times:
1) 381.89 - GTX970, GTX660Ti, GTX660Ti
2) 381.89 - GTX660Ti, GTX660Ti (I pulled the GTX970 out of the system)

Out of the 155 samples, they all passed on both runs.. except VFlockingD3D10 did not look correct on my GTX660Ti but looked fine on my GTX970. All other calculations and samples worked fine, even on a GTX660Ti.

This leads me to believe that the GPUGrid problem with the "9.18 (cuda80)" app, where it errors out immediately on a system that has a CC3.0/SM3 GPU .... might not be an NVIDIA problem. It might be a problem with your app.

Is it possible you are calling some method or function, that isn't supported by CC3.0/SM3?

I'm desperately wanting you to provide more info. I'm spending considerable effort to help you solve this, yet my questions to you go on unanswered. I hope you're making progress with a fix - please consider chiming in with your findings.

Jacob Klein

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47219 - Posted: 15 May 2017 | 3:04:02 UTC

Month 2.
The frustration continues.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 265
Credit: 1,043,270,117
RAC: 1,772,186
Level
Met
Message 47272 - Posted: 18 May 2017 | 12:51:28 UTC - in response to Message 47049.

Suspending, with the 918 app, is not working correctly! It sometimes takes ~20+ seconds to see the app respond to a suspend request. It should be < 3 seconds!

Also, you mention "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80" ....... so:
- Who is responsible for fixing that (you, or NVIDIA?)
- What steps are being taken to get it fixed (have you contacted the right people?)


I have the same issue on my windows 10 computer, running GTX980ti cards.




Suspending has taken up to 5 min on Win8.1.
Resume/suspend is also an issue: 15 WUs have errored upon resume, on the same or a different GPU.
WUs need to run without interruption to validate properly.

13 days ago I updated to 382.05 from 381.89 - today my 1080/1070/1060/970/970 (z87 system) has 121 consecutive valid 9.18 tasks.
Suspend / resume working without issue. WU are taking 1~8 seconds to suspend.
Maybe the suspend problem has been resolved in the most recent 381.99.* branch driver?

382.05 is the best driver for Win8.1 in several months (369.00). I'm now able to leave my system unsupervised for extended periods of time.
(Just in time for humid summertime crunching.)
With the previous couple of branches I had a daily random reset. I had previously thought it might have been a hardware issue or an OS problem, since it's only a bone-stock 2015 Win8.1 with no updates. IMHO: Win8.1 is faster than Win 7 or 10 when running (multi) GPU compute.

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47273 - Posted: 18 May 2017 | 15:40:26 UTC

You can read here what happens to my two hosts on Windows 10 with acemd 918.80 and driver 381.65:

http://www.gpugrid.net/forum_thread.php?id=4571&nowrap=true#47261

NOT amusing at all :-( I cannot leave the two PCs unattended for lengthy time.

Behemot
Send message
Joined: 16 Jul 15
Posts: 1
Credit: 2,750,975
RAC: 120
Level
Ala
Message 47280 - Posted: 18 May 2017 | 20:47:19 UTC

OK, I finally know why I am not getting any work for my GT 610. So I'm deleting the project from my client; there is no more reason to have it there.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47283 - Posted: 19 May 2017 | 11:04:44 UTC
Last modified: 19 May 2017 | 11:06:20 UTC

GPUGrid:

Regarding the problem of your app immediately crashing on CC3.0/SM3 GPUs (like my 2 "GTX 660 Ti" GPUs)...

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?

I feel like nobody is trying to fix it. And if it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I might be able to work with NVIDIA to fix it.

But you guys aren't forthcoming with details.

When can we expect some answers? My main question, above, is still unanswered.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47315 - Posted: 23 May 2017 | 10:59:57 UTC

Hello?

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47350 - Posted: 1 Jun 2017 | 4:26:51 UTC

*tap tap* Is this thing on?

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47497 - Posted: 24 Jun 2017 | 5:24:46 UTC
Last modified: 24 Jun 2017 | 5:26:36 UTC

5 weeks have now gone by, with no response to my questions!
I hope I'm not being rude by asking admins to follow up, but needless to say I'm miffed about this lack of communication!
Especially since my GTX 660 Ti GPUs are still unusable due to bugs with your most recent 9.18 app version. :(

How may we help you resolve the bugs?

GPUGrid:

Regarding the problem of your app immediately crashing on CC3.0/SM3 GPUs (like my 2 "GTX 660 Ti" GPUs)...

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?

I feel like nobody is trying to fix it. And if it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I might be able to work with NVIDIA to fix it.

But you guys aren't forthcoming with details.

When can we expect some answers? My main question, above, is still unanswered.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47540 - Posted: 6 Jul 2017 | 3:06:04 UTC

I'd love to help, somehow.

kain
Send message
Joined: 3 Sep 14
Posts: 110
Credit: 132,491,966
RAC: 18,921
Level
Cys
Message 47543 - Posted: 6 Jul 2017 | 9:56:17 UTC

I think that the only possible way to help the team is to make a lot of donations. They surely need a lot of money to hire someone with good programming skills. Could we as a community afford that? Probably. Would we? I don't think so.
So it is time to accept the fact that the team is too small and too busy to do everything we want them to do. As long as there are new publications proving that the most important thing (science ftw!) is covered, I will support GPUGRID.


BTW it is a common problem: my girlfriend is a scientist, not working with BOINC but facing exactly the same problem.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 460
Credit: 1,130,761,180
RAC: 18,722
Level
Met
Message 47544 - Posted: 6 Jul 2017 | 12:12:49 UTC - in response to Message 47543.

And they appear to have enough support, even with the reduced output on Windows. The unsent tasks are down to zero, probably due to the summer lull. There is no point in increased crunching power for the moment.

Profile Logan Carr
Send message
Joined: 12 Aug 15
Posts: 207
Credit: 30,647,175
RAC: 32,853
Level
Val
Message 47580 - Posted: 11 Jul 2017 | 14:55:17 UTC - in response to Message 47543.

I think that the only possible way to help the team is to make a lot of donations. They surely need a lot of money to hire someone with good programming skills. Could we as a community afford that? Probably. Would we? I don't think so.
So it is time to accept the fact that the team is too small and too busy to do everything we want them to do. As long as there are new publications proving that the most important thing (science ftw!) is covered, I will support GPUGRID.


BTW it is a common problem: my girlfriend is a scientist, not working with BOINC but facing exactly the same problem.



Well put!
____________
Cruncher/Learner in progress.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Level

Message 47585 - Posted: 12 Jul 2017 | 13:19:39 UTC

Jacob, I'm really sorry about this. Matt has effectively left our team, so we have now hired a new person to work on Acemd and its replacement, Acemd3. I doubt it would be very efficient to solve this issue now, when we can hopefully transition to Acemd3, which will use a more stable code base. If something is urgent or really annoying, as in this case, feel free to PM me, where I will definitely see it sooner.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 271
Credit: 1,306,745,931
RAC: 5,318,809
Level
Met
Message 47587 - Posted: 12 Jul 2017 | 13:28:01 UTC - in response to Message 47585.

Jacob, I'm really sorry about this. Matt has effectively left our team, so we have now hired a new person to work on Acemd and its replacement, Acemd3. I doubt it would be very efficient to solve this issue now, when we can hopefully transition to Acemd3, which will use a more stable code base. If something is urgent or really annoying, as in this case, feel free to PM me, where I will definitely see it sooner.

I'm sorry you've had to take on the PR role as a scientist. I think having a meeting with the other scientists to allow them to more frequently check the forums and server status would mitigate many of these problems.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47588 - Posted: 12 Jul 2017 | 13:32:03 UTC - in response to Message 47585.
Last modified: 12 Jul 2017 | 13:55:14 UTC

Jacob, I'm really sorry about this. Matt has effectively left our team, so we have now hired a new person to work on Acemd and its replacement, Acemd3. I doubt it would be very efficient to solve this issue now, when we can hopefully transition to Acemd3, which will use a more stable code base. If something is urgent or really annoying, as in this case, feel free to PM me, where I will definitely see it sooner.


Thanks for the reply, Stefan. It was helpful to me!

If you decide to solve it, and would like help to reproduce a problem or test a fix, I'm an all-star at both, and would do my best to help. I'm also an active BOINC alpha tester, and work with the BOINC devs sometimes.

In the meantime, we eagerly await the promising Acemd3.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Level

Message 47589 - Posted: 12 Jul 2017 | 13:38:48 UTC

Yeah, I talked with Adria and Pablo today. They should be taking a more active role in the forums from now on.

ChristianVirtual
Send message
Joined: 16 Aug 14
Posts: 13
Credit: 281,790,075
RAC: 34,094
Level
Asn
Message 47590 - Posted: 12 Jul 2017 | 15:00:52 UTC

That would be great; nothing is worse than silence. It's OK to have problems, but sharing them will reduce noise and can activate support.

I'm glad it's working again, as I prefer this GPU project to others on BOINC.

ChristianVirtual
Send message
Joined: 16 Aug 14
Posts: 13
Credit: 281,790,075
RAC: 34,094
Level
Asn
Message 47591 - Posted: 12 Jul 2017 | 15:01:19 UTC

That would be great; nothing is worse than silence. It's OK to have problems, but sharing them will reduce noise and can activate support.

I'm glad it's working again, as I prefer this GPU project to others on BOINC.

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47592 - Posted: 13 Jul 2017 | 5:13:32 UTC - in response to Message 47585.

Matt has effectively left our team, so we have now hired a new person to work on Acemd and its replacement, Acemd3.

As you may remember, last April Matt announced that Windows XP support would definitely be stopped in April 2018.
I strongly had the feeling that this was a decision he made himself, as he probably did not plan to take the time to put together another crunching application for XP.

Now that Matt has left GPUGRID, I hope that the team and the new person see the situation differently, and that the crunchers who still use Windows XP for good reason will get a crunching software beyond April 2018.

Stefan
Volunteer moderator
Project developer
Project scientist
Send message
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Level

Message 47594 - Posted: 13 Jul 2017 | 9:30:25 UTC - in response to Message 47592.

If we manage to make the switch to Acemd3, it will be based on OpenMM, so it will depend only on where OpenMM can run.

(Ryle)
Send message
Joined: 7 Jun 09
Posts: 17
Credit: 671,880,212
RAC: 1,201,007
Level
Lys
Message 47595 - Posted: 13 Jul 2017 | 14:21:01 UTC - in response to Message 47594.

Sounds interesting; I hope you can do it, Stefan. I read briefly on OpenMM's site that it supports AMD, so maybe you will begin supporting that? Not that I have any AMD cards myself, but it could open the project up to a larger user base, which could give a shorter turnaround time for workunits in the end. A plus for your research, I take it.

As for Windows XP, I don't use it myself, but maybe the more tech-savvy XP users should consider switching to Linux eventually. I understand that some users struggle with the learning curve if they are not familiar with Linux, though.

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47596 - Posted: 13 Jul 2017 | 15:04:01 UTC - in response to Message 47595.

As for Windows XP, I don't use it myself, but maybe the more tech-savvy XP users should consider switching to Linux eventually. ...

As I wrote in another thread recently, Swan_Sync is apparently NOT possible on Linux (on Windows it is essential for GPUGRID crunching, as I have experienced myself). That's why some crunchers have not switched so far.
On the other hand, one of the replies I got to that recent posting was that Swan_Sync is NOT necessary on Linux.
So I have no idea what's correct and what's not.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 271
Credit: 1,306,745,931
RAC: 5,318,809
Level
Met
Message 47597 - Posted: 13 Jul 2017 | 16:28:29 UTC

The main concern with GPUGrid is GPU utilization. SWAN_SYNC is a way to mitigate this on Windows, but it definitely does not solve it. From what I've found, on Linux with the 9.14 application you will see 90%+ utilization on high-end GPUs with an appropriately fast CPU. On Windows it is hard to match this, even with these techniques. It almost doesn't matter that SWAN_SYNC isn't available on Linux, because the utilization is so high.

Erich56
Send message
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47598 - Posted: 13 Jul 2017 | 16:48:09 UTC

My experience with (or without) Swan_Sync on Windows is the following:
when I recently installed BOINC afresh on one of my PCs and ran GPUGRID, I wondered why the Windows Task Manager showed acemd.exe using only a few percent of the CPU.
I had forgotten the Swan_Sync setting.
After I had set Swan_Sync, acemd.exe was using nearly 100% of a CPU core.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47599 - Posted: 13 Jul 2017 | 16:53:13 UTC - in response to Message 47598.

After I had set Swan_Sync, acemd.exe was using nearly 100% of a CPU core.

That would be a show-stopper for me: I use my CPUs for other purposes. And I grumble at developers who wrote OpenCL apps for NVidia cards, for the same reason.

Not saying that either of us is right or wrong: it's just that different users have different priorities. It might be nice to have a definitive FAQ on Swan_Sync (if there isn't one already), but it should be left up to each user to apply it or not as they choose.

Jacob Klein
Send message
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47600 - Posted: 13 Jul 2017 | 17:18:27 UTC

Yup, Swan_Sync on Windows was also something I tried, verified worked, then immediately undid. I like my CPUs keeping as busy as possible doing other projects, rather than "spin-waiting" on GPU kernels to make them only slightly faster.
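The trade-off between "spin-waiting" and a blocking wait can be illustrated with a small CPU-only analogy. This thread does not say how Swan_Sync is implemented inside acemd, so the sketch below is not the app's code: a timer thread stands in for a GPU kernel finishing, and the function name `cpu_seconds_while_waiting` is made up for the example.

```python
import threading
import time


def cpu_seconds_while_waiting(spin: bool, delay: float = 0.2) -> float:
    """Wait `delay` seconds for a simulated kernel; return CPU seconds used."""
    done = threading.Event()
    timer = threading.Timer(delay, done.set)  # stands in for the GPU finishing
    start = time.process_time()               # CPU time of this process
    timer.start()
    if spin:
        while not done.is_set():              # busy-wait: poll continuously
            pass
    else:
        done.wait()                           # blocking wait: the thread sleeps
    cpu = time.process_time() - start
    timer.join()
    return cpu


print(f"spin-wait:     {cpu_seconds_while_waiting(True):.3f} s of CPU")
print(f"blocking wait: {cpu_seconds_while_waiting(False):.3f} s of CPU")
```

The spin-wait reacts as soon as the flag flips but burns roughly a full core for the whole wait, while the blocking wait leaves that core free for other projects at the cost of some wake-up latency.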

Felix_M_
Joined: 29 Jan 16
Posts: 2
Credit: 1,814,800
RAC: 788
Level
Ala
Message 47601 - Posted: 13 Jul 2017 | 18:00:16 UTC

Can we expect short-run tasks in the near future?

Erich56
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47602 - Posted: 13 Jul 2017 | 18:21:30 UTC - in response to Message 47599.

After I had set Swan_Sync, acemd.exe was using nearly 100% of a CPU core.

That would be a show-stopper for me: I use my CPUs for other purposes.

Well, on all my PCs (all having a different number of CPU cores) with which I do GPU crunching, I dedicate 1 CPU core to 1 GPU. All other CPU cores are free for other activities.

Jacob Klein
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47603 - Posted: 13 Jul 2017 | 20:05:23 UTC - in response to Message 47602.

After I had set Swan_Sync, acemd.exe was using nearly 100% of a CPU core.

That would be a show-stopper for me: I use my CPUs for other purposes.

Well, on all my PCs (all having a different number of CPU cores) with which I do GPU crunching, I dedicate 1 CPU core to 1 GPU. All other CPU cores are free for other activities.


Yes. And we're saying that you end up "spending" some CPU cycles, in order to get GPU gains... whereas we are on the opposite end of the spectrum, wanting to use as little CPU as possible to keep the GPU going, so that the CPU (even the unused cycles from the GPU jobs) can be utilized.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47604 - Posted: 13 Jul 2017 | 20:46:53 UTC - in response to Message 47597.
Last modified: 13 Jul 2017 | 20:52:27 UTC

The main concern with GPUGrid is GPU utilization.
What the main concern with GPUGrid is depends on the user's point of view, not on GPUGrid.

SWAN_SYNC is a way to mitigate this with windows but definitely not solve it.
GPU utilization without SWAN_SYNC has improved with newer CUDA drivers, but it depends on many other factors, such as WDDM (which can't be turned off, and which makes SWAN_SYNC less effective on modern Windows OSes) or the workunit itself (in the past, some workunits gained up to a 15% performance boost from SWAN_SYNC even on Linux, so I assume some future workunits could be similar).

From what I've found, using linux under the 9.14 application you will see 90%+ on high end GPUs with an appropriately fast cpu. On windows it is hard to match this, even with these techniques.
Wrong: I have 96% GPU utilization on Windows XP x64, i7-4770k, GTX 980Ti, PABLO_P01106_0_IDP workunit, 1 CPU task;
98% GPU utilization on Windows XP x64, i3-4360, GTX 980Ti, PABLO_all_data_goal_KIX_CMYB-0 workunit, no CPU tasks

It almost doesn't matter that SWAN_SYNC isn't on linux because the utilization is so high.
It's almost true, because the present workunits do not use that much CPU-GPU interaction. Since we can't turn on SWAN_SYNC under Linux, we can't measure the performance gain, but there could be some, even with the present workunits. The main reason for me to still use Windows XP for crunching is that I want to optimize all resources of my PCs to make the GPUGrid app run as fast as it could, so the lack of SWAN_SYNC under Linux is a show-stopper for me.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 28
Credit: 383,650,651
RAC: 541,126
Level
Asp
Message 47606 - Posted: 14 Jul 2017 | 12:56:46 UTC

I haven't seen much difference (in task execution speed) between using SWAN_SYNC and raising the priority of the acemd-918-80.exe process to High (13).
This approach eliminates the impact of any other programs on the GPU tasks.
IMG_01
I suggest testing the priority change.
It frees up an additional processor core (if you specify CPU < 1.0 for GPU tasks) and, in my experience, makes the system feel more responsive.
To automate raising the priority of GPUGRID tasks, use Process Hacker.
Use the context menu to raise the priority and save the configuration.
IMG_02
Since Process Hacker must be running when a new task starts, it is best to launch Process Hacker at system startup (minimized).
IMG_03

kain
Joined: 3 Sep 14
Posts: 110
Credit: 132,491,966
RAC: 18,921
Level
Cys
Message 47607 - Posted: 14 Jul 2017 | 13:10:58 UTC

Sorry to interrupt, I just want to say that I have made a small (10 euros) donation. When I am sure that everything goes well I will make another.

I would like to encourage everyone to make at least a small one. You won't even notice those 10 euros, but en masse it could make a big difference for the team.

mmonnin
Joined: 2 Jul 16
Posts: 38
Credit: 49,414,670
RAC: 1,398,720
Level
Val
Message 47608 - Posted: 14 Jul 2017 | 20:04:03 UTC

It's good to see more forum participation from project admins/scientists. Even a "We'll take a look at xx issue" comment here and there gives end users the confidence to keep donating processor cycles.

I still find it odd that a setting like swan_sync needs to be used at all. Other GPU apps do not need it. BOINC or FAH. Typically CPU apps get the Idle priority and GPU apps get Below Normal (1 step higher than idle) which will give CPU priority to GPU apps when both are running. I don't recall what it is set at and I'm not next to one of my PCs.

As long as the exes have different names, I pin the GPU exe to its own CPU core/thread using Process Lasso's affinity setting. That is usually enough in Windows. Linux works well enough to not really need a program like Process Lasso. I would think that if the GPU exe has a higher priority and/or an open thread, that would act like a virtual swan_sync setting.

Erich56
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47609 - Posted: 14 Jul 2017 | 20:18:58 UTC - in response to Message 47592.

Now that Matt has left GPUGRID, I hope that the team and the new person see the situation differently, and that the crunchers who still use Windows XP for good reason will get a crunching software beyond April 2018.

If we manage to make the switch to Acemd3 it will be based on OpenMM so it will only depend on where OpenMM can run on.

Do any of the experts here know whether OpenMM can be run on Windows XP?
I searched for this kind of information, but did not find anything.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47610 - Posted: 14 Jul 2017 | 21:04:53 UTC - in response to Message 47608.

I still find it odd that a setting like swan_sync needs to be used at all.

swan_sync isn't needed at all - I run GPUGrid under Windows without using it. But it is available for use, for those members who wish to arrange their processing priority that way.

I also use Process Lasso - there's one particular application, from one other BOINC project, which gains something like a six-fold increase in productivity from being run at real time priority, without having any ill-effects on the general use of the computer (Both GPUGrid and that other application are running on this computer as I type).

So I have nothing against using productivity tweaks where available: but they are choices, rather than requirements.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47611 - Posted: 14 Jul 2017 | 21:18:53 UTC - in response to Message 47608.
Last modified: 14 Jul 2017 | 21:24:02 UTC

I still find it odd that a setting like swan_sync needs to be used at all.
It doesn't need to be used. This is merely an option put in the hands of the user, to use it if they want to optimize their system to maximize the performance of GPUGrid (at the expense of a CPU task).

Other GPU apps do not need it. BOINC or FAH.
High GPU usage readings of other GPU apps are usually misleading. You should compare the power consumption of your GPU while running different GPU apps to get a more realistic measurement of the amount of operations done per second. Other GPU apps have lower power consumption while showing higher GPU utilization than GPUGrid.
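
One way to make that comparison concrete is to log utilization and power draw side by side. The following is a minimal sketch, assuming an NVIDIA driver with `nvidia-smi` on the PATH; the helper names `parse_samples` and `read_gpus` are made up for illustration, and the readings in the demo are invented numbers, not measurements.

```python
import subprocess

# nvidia-smi query for GPU utilization (%) and board power draw (W), one CSV line per GPU
QUERY = ["nvidia-smi", "--query-gpu=utilization.gpu,power.draw",
         "--format=csv,noheader,nounits"]

def parse_samples(text):
    """Turn CSV lines like '96, 231.45' into (utilization %, watts) tuples, one per GPU."""
    samples = []
    for line in text.strip().splitlines():
        util, watts = (field.strip() for field in line.split(","))
        samples.append((float(util), float(watts)))
    return samples

def read_gpus():
    """Query the driver once; fails if nvidia-smi is missing or a field is reported as N/A."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_samples(out)

if __name__ == "__main__":
    # Invented readings, only to show the shape of the comparison
    for name, text in (("app A", "99, 150.0"), ("app B", "90, 230.0")):
        for util, watts in parse_samples(text):
            print(f"{name}: {util:.0f}% utilization, {watts:.0f} W")
```

Sampling `read_gpus()` every few seconds while each app runs would show whether a higher utilization figure really comes with a higher power draw.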

Richard Haselgrove
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47612 - Posted: 14 Jul 2017 | 21:25:43 UTC - in response to Message 47611.

I think I read somewhere that the utilisation figures for NVidia GPUs are taken from the first shader multiplex (or render output unit, which seems to be the latest terminology): the rest of the card isn't measured or reported.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 40
Credit: 80,281
RAC: 0
Level

Message 47624 - Posted: 18 Jul 2017 | 7:31:15 UTC - in response to Message 47592.

I hope that the team and the new person see the situation differently, and that the crunchers who still use Windows XP for good reason will get a crunching software beyond April 2018.


Absolutely it does NOT make any sense to crunch with XP in 2017.
2018.....why not 2020 with XP?

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 40
Credit: 80,281
RAC: 0
Level

Message 47625 - Posted: 18 Jul 2017 | 7:37:35 UTC - in response to Message 47609.
Last modified: 18 Jul 2017 | 7:37:52 UTC

Do any of the experts here know whether OpenMM can be run on Windows XP?
I searched for this kind of information, but did not find anything.


From OpenMM site:
OpenMM is optimized for the latest generation of compute hardware, including AMD (via OpenCL) and NVIDIA (via CUDA) GPUs. We also heavily optimize for CPUs using intrinsics

The latest hardware has very poor support on XP.
Maybe it runs, but with very poor performance.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 40
Credit: 80,281
RAC: 0
Level

Message 47626 - Posted: 18 Jul 2017 | 7:40:48 UTC - in response to Message 47595.

Sounds interesting, I hope you can do it Stefan. I read briefly on OpenMM's site, it suggests a support for AMD, so maybe you will begin support for that? Not that I have any AMD cards myself, but it could open up for a larger userbase, which could give a shorter turnaround time for workunits in the end. A plus for your research I take it.


It's a long story :-(
And today there are a lot of tools that help with OpenCL development starting from CUDA.
For example: http://gpuopen.com/whats-new-hip-hcc-rocm-1-6/
But if the team has no developers....

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,306,745,931
RAC: 5,318,809
Level
Met
Message 47628 - Posted: 18 Jul 2017 | 10:57:42 UTC

You cannot install a Pascal GPU in an XP machine, and there is no way to hack it like with the 980 Ti. Since the future is 14 nm and below, I don't see a point in running these 28 nm GPUs for any longer than necessary.

mmonnin
Joined: 2 Jul 16
Posts: 38
Credit: 49,414,670
RAC: 1,398,720
Level
Val
Message 47629 - Posted: 18 Jul 2017 | 13:59:59 UTC - in response to Message 47610.

I still find it odd that a setting like swan_sync needs to be used at all.

swan_sync isn't needed at all - I run GPUGrid under Windows without using it. But it is available for use, for those members who wish to arrange their processing priority that way.

I also use Process Lasso - there's one particular application, from one other BOINC project, which gains something like a six-fold increase in productivity from being run at real time priority, without having any ill-effects on the general use of the computer (Both GPUGrid and that other application are running on this computer as I type).

So I have nothing against using productivity tweaks where available: but they are choices, rather than requirements.


Fine I'll word it differently. It shouldn't be needed to fully utilize a GPU. As I mentioned, other projects take what is needed before CPU apps. GPUGrid should be no different.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47630 - Posted: 18 Jul 2017 | 15:07:01 UTC - in response to Message 47629.

Fine I'll word it differently. It shouldn't be needed to fully utilize a GPU.
In an ideal world it would be that way. Unfortunately the real world is far from ideal in many respects, and the GPUGrid app is one of them. However, I still think that even though other projects show 99% GPU usage, the GPUGrid app does more FLOPS than the others while showing "only" 90% GPU usage. There are users who have to throttle the GPUGrid client to avoid overheating their GPU.

As I mentioned, other projects take what is needed before CPU apps. GPUGrid should be no different.
Why not? The science and the methods are different for each project.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47631 - Posted: 18 Jul 2017 | 15:16:11 UTC - in response to Message 47624.
Last modified: 18 Jul 2017 | 15:28:16 UTC

Absolutely it does NOT make any sense to crunch with XP in 2017.
2018.....why not 2020 with XP?
Well, if you take a look at the performance page, you'll be surprised. Check the rankings of the PABLO_P01106_2 batch, and you'll find my GTX 980Ti in 9th place, right above a GTX 1080 and a GTX TitanX (Pascal). If you check the rankings above my GTX 980Ti, you can see that a GTX 1080Ti (which costs 2.5 times as much as a second-hand GTX 980Ti) is only 10-15% faster. So much for WDDM (introduced in Windows Vista and present in all later versions). Oh, and that GTX 980Ti runs under the obsolete Windows XP. I would instantly switch to Linux if the Linux app had the SWAN_SYNC option. To be able to compare the performance of Windows XP with SWAN_SYNC and Linux without SWAN_SYNC, I will install Linux on one of my hosts and put in it the same make and model of GPU (MSI GTX980Ti Gaming 6G) as the one in my Windows XP host.
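
The price/performance argument above can be put into numbers. This is a minimal sketch using only the figures quoted in the post (2.5 times the price, 10-15% faster); it ignores electricity cost, and the function name `speed_per_cost` is made up for illustration.

```python
def speed_per_cost(relative_speed, relative_cost):
    """Throughput obtained per unit of purchase price, relative to the baseline card."""
    return relative_speed / relative_cost

if __name__ == "__main__":
    # Baseline: second-hand GTX 980Ti = speed 1.0 at cost 1.0
    for speedup in (1.10, 1.15):
        ratio = speed_per_cost(speedup, 2.5)
        print(f"GTX 1080Ti at {speedup:.2f}x speed, 2.5x price: {ratio:.2f}x speed per euro")
```

On those figures the newer card delivers a bit less than half the crunching speed per euro spent on hardware.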

John
Joined: 10 Jan 16
Posts: 2
Credit: 16,247,959
RAC: 7,293
Level
Pro
Message 47638 - Posted: 20 Jul 2017 | 20:35:35 UTC

I have noticed an issue where GPUGRID fails to resume properly after being suspended. When BOINC suspends all current tasks due to the CPU being busy for a few seconds (like when launching a large program), your task doesn't resume when the CPU is free again. It says it is running and the elapsed time changes correctly, but no progress is made. However, it will successfully resume after a GPU exclusive task has been run.

This is a serious issue, as it wastes a tremendous amount of time and results in the task being completed after the deadline, which renders it useless.

The task in question was a 918 version.

Are you aware of this issue? Is there any additional information I can provide?

John

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47639 - Posted: 20 Jul 2017 | 21:47:26 UTC - in response to Message 47638.

... GPUGRID fails to resume properly after being suspended. ... The task in question was a 918 version.

Are you aware of this issue?
It's a known bug; it seems to depend on the workunit batch.
A similar bug: when you (or the BOINC Manager) suspend the task, it can take a while before the task is actually suspended.

Is there any additional information I can provide?
No.

Erich56
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47641 - Posted: 21 Jul 2017 | 5:06:05 UTC

It's really too bad that in April Matt put together, in a hurry, rather buggy software, and a short time later left GPUGRID without eliminating all these bugs :-(

BTW, for this reason I had no other choice than abandoning crunching GPUGRID with one of my hosts :-(

mmonnin
Joined: 2 Jul 16
Posts: 38
Credit: 49,414,670
RAC: 1,398,720
Level
Val
Message 47709 - Posted: 29 Jul 2017 | 13:17:31 UTC - in response to Message 47630.

Fine I'll word it differently. It shouldn't be needed to fully utilize a GPU.
In an ideal world it would be that way. Unfortunately the real world is far from ideal in many respects, and the GPUGrid app is one of them. However, I still think that even though other projects show 99% GPU usage, the GPUGrid app does more FLOPS than the others while showing "only" 90% GPU usage. There are users who have to throttle the GPUGrid client to avoid overheating their GPU.

As I mentioned, other projects take what is needed before CPU apps. GPUGrid should be no different.
Why not? The science and the methods are different for each project.


The science and methods have nothing to do with the priority of the executable.

90% is a dream. Try in the 60s with a single task and low 80s with two tasks. Think of the flops that could be achieved if GPU util was above 95%.

Jim1348
Joined: 28 Jul 12
Posts: 460
Credit: 1,130,761,180
RAC: 18,722
Level
Met
Message 47710 - Posted: 29 Jul 2017 | 13:51:04 UTC - in response to Message 47709.

90% is a dream. Try in the 60s with a single task and low 80s with two tasks. Think of the flops that could be achieved if GPU util was above 95%.

I think it has long been recognized on some projects that the GPU% is just the value that some counter gives somewhere on the chip, but no one really knows what it is. If the software runs the desired science application as well as the compilers allow (about which I know nothing), then it works well enough. And the scientists, worthy as they are, are not compiler specialists insofar as I can see, and they probably have better ways of spending their time anyway.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47712 - Posted: 29 Jul 2017 | 14:51:22 UTC - in response to Message 47709.

Think of the flops that could be achieved if GPU util was above 95%.

I think the total flops of a GPU driving all multiplexes at 85-90% (probably running into thermal or power throttling) would be much greater than a different app driving the first multiplex at 95% unthrottled, but leaving the rest to idle.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47713 - Posted: 29 Jul 2017 | 15:38:02 UTC - in response to Message 47709.

The science and methods have nothing to do with the priority of the executable.
The GPUGrid app (just like any other GPU app) runs at "below normal" process priority, while CPU apps run at "low" process priority (which is one lower than "below normal"). Other processes usually run at "normal" priority.
SWAN_SYNC does not change the process priority level, it changes the behavior of the app.

With SWAN_SYNC off:
1. The app gives control back to the OS.
2. The GPU raises an interrupt (signaling that it has finished its tiny part of the calculation), and in response to this interrupt the OS gives control back to the GPUGrid app.
3. The app reads the result, does some calculations in double precision (if needed), then sends the next piece of the calculation to the GPU.
4. It starts all over again, until the number of iterations has been reached.

With SWAN_SYNC on:
1. The GPUGrid app continuously polls the GPU, waiting for it to finish the current piece of the calculation.
2. The app reads the result, does some calculations in double precision (if needed), then sends the next piece of the calculation to the GPU.
3. It starts all over again, until the number of iterations has been reached.

Now, with modern Windows OSes, extra time is needed in every CPU-GPU interaction due to the Windows Display Driver Model. That's why the GPUGrid app runs faster (has higher GPU utilization) under Windows XP and Linux.
This cycle has to run a couple of thousand times per second. If the given batch needs a lot of double precision calculations, the difference in overall processing speed between WDDM and non-WDDM OSes will be larger, and with SWAN_SYNC on this difference is even greater (due to the missing step in the processing cycle).
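
The two cycles can be imitated on the CPU alone to show where the extra CPU time goes. This is only a rough analogue, not the ACEMD code: the "GPU" is a timer, the kernel length of half a millisecond is an assumption, and `run_steps` is a made-up name.

```python
import time

def run_steps(n_steps, kernel_s, spin):
    """Run n_steps fake GPU steps; return (wall seconds, CPU seconds) spent by the host."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    for _ in range(n_steps):
        done_at = time.perf_counter() + kernel_s  # pretend a kernel was just launched
        if spin:
            # SWAN_SYNC-on analogue: keep polling until the "GPU" is finished
            while time.perf_counter() < done_at:
                pass
        else:
            # SWAN_SYNC-off analogue: hand the CPU back to the OS and get woken up later
            time.sleep(max(0.0, done_at - time.perf_counter()))
        # the real app would now read the result and queue the next piece of work
    return time.perf_counter() - wall0, time.process_time() - cpu0

if __name__ == "__main__":
    for label, spin in (("blocking", False), ("spinning", True)):
        wall, cpu = run_steps(200, 0.0005, spin)
        print(f"{label}: wall {wall:.3f}s, cpu {cpu:.3f}s")
```

The spinning variant burns CPU time for the whole duration of every wait, which matches the nearly 100% core usage reported earlier in the thread, while the blocking variant leaves the core free but depends on how quickly the OS wakes it up again.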

Other GPU projects (like SETI@home, or Einstein@home) do not interact that frequently with the GPU, hence they can reach higher GPU utilization.

90% is a dream. Try in the 60s with a single task and low 80s with two tasks. Think of the flops that could be achieved if GPU util was above 95%.
I have 95% GPU usage under Windows XP, so it's not the fault of the GPUGrid app alone.

Jacob Klein
Joined: 11 Oct 08
Posts: 1067
Credit: 1,146,403,839
RAC: 1,089,717
Level
Met
Message 47714 - Posted: 29 Jul 2017 | 15:51:48 UTC - in response to Message 47713.

Excellent detail there, especially the juicy innards of process priorities and the SWAN_SYNC variable!

Just one point of clarification - I believe BOINC CPU apps are set to run at "Idle" priority by default, instead of "Low".

Richard Haselgrove
Joined: 11 Jul 09
Posts: 788
Credit: 1,422,060,845
RAC: 1,410,932
Level
Met
Message 47715 - Posted: 29 Jul 2017 | 17:12:26 UTC - in response to Message 47714.

Excellent detail there, especially the juicy innards of process priorities and the SWAN_SYNC variable!

Just one point of clarification - I believe BOINC CPU apps are set to run at "Idle" priority by default, instead of "Low".

According to Process Explorer, my CPU apps are running at a priority of 1, on a scale that goes up to 16.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,630,096,894
RAC: 9,819,606
Level
Trp
Message 47717 - Posted: 29 Jul 2017 | 20:10:46 UTC - in response to Message 47714.

Excellent detail there, especially the juicy innards of process priorities and the SWAN_SYNC variable!

Just one point of clarification - I believe BOINC CPU apps are set to run at "Idle" priority by default, instead of "Low".
You are right.
This is my mistake, as I use a localized OS (in Hungarian) and it calls this priority level "alacsony" which I've translated back to English ("low"), while originally this priority level is called "Idle".
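
For reference, the priority class names used in this thread can be lined up against the Windows base priority numbers. A minimal sketch: the numbers are the documented base priorities for a thread at normal thread priority within each class, and the helper name `outranks` is made up for illustration.

```python
# Windows process priority classes and the base priority of a normal-priority thread in each
BASE_PRIORITY = {
    "Idle": 4,
    "Below Normal": 6,
    "Normal": 8,
    "Above Normal": 10,
    "High": 13,
    "Realtime": 24,
}

def outranks(a, b):
    """True if a thread in class a is scheduled ahead of one in class b."""
    return BASE_PRIORITY[a] > BASE_PRIORITY[b]

if __name__ == "__main__":
    # GPU apps (Below Normal) run ahead of CPU apps (Idle), but behind ordinary programs
    print(outranks("Below Normal", "Idle"), outranks("Below Normal", "Normal"))
```

Individual threads can sit lower than their class default, which would explain a CPU app showing a priority of 1 in Process Explorer.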

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,834,518,624
RAC: 292,156
Level
His
Message 47718 - Posted: 30 Jul 2017 | 11:52:00 UTC - in response to Message 47717.

Assuming an OpenMM based Acemd3 app turns up, all & any user side tweaks, modifications & optimizations will need to be reassessed, primarily to see if they work. Coolbits, nice, SWAN_SYNC, process lasso, priority, HT ON/OFF, freeing up a CPU/thread...
Which GPU models will work best/are the best value would need to be reassessed. Some may no longer work, some might be better...
The importance of PCIE & CPU rates/strengths/weaknesses might change as might the operating systems we can use...
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Erich56
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47719 - Posted: 30 Jul 2017 | 12:00:21 UTC - in response to Message 47718.

The importance of PCIE & CPU rates/strengths/weaknesses might change as might the operating systems we can use...

No idea whether WDDM would still play a role then and, consequently, whether Windows XP, due to its lack of WDDM, would still have an advantage over the newer systems.

Besides the question whether OpenMM would run on WindowsXP at all.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,834,518,624
RAC: 292,156
Level
His
Message 47920 - Posted: 26 Sep 2017 | 16:52:43 UTC - in response to Message 47719.

Don't know what the situation is regarding a new app but if an OpenMM app was compiled using VS 2010, in theory it might still work on XP. The latest VS version doesn't support XP IIRC. Other factors (CUDA Drivers, a mainstream Volta release) might still prevent it working and if the WDDM overhead isn't an issue with OpenMM then supporting XP might be unnecessary.

Erich56
Joined: 1 Jan 15
Posts: 371
Credit: 1,670,919,877
RAC: 2,920,998
Level
His
Message 47921 - Posted: 26 Sep 2017 | 18:47:45 UTC - in response to Message 47920.

... and if the WDDM overhead isn't an issue with OpenMM then supporting XP might be unnecessary.

Good point! The question, though, is whether WDDM overhead is indeed NOT an issue with OpenMM.
Is there anyone who can answer this question for sure?
