
Message boards : Number crunching : Kepler - Not fully using CPU?

Jacob Klein
Message 34890 - Posted: 4 Feb 2014 | 4:08:35 UTC
Last modified: 4 Feb 2014 | 4:11:11 UTC

I am currently running a Long-run task called "901x-SANTI_MARwtcap310-2-32-RND7131_0" using the "cuda55" plan class, using the 8.15 app version, on my Kepler GTX 660 Ti, using new drivers: 334.67 BETA, on Windows 8.1.

I expected Task Manager and Process Explorer to both show that this task would make the "acemd.815-55.exe" process fully utilize a virtual CPU core. However, it is only using a tiny portion (~25%) of the core. Furthermore, this GPU is sometimes not jumping to boost clock while this task is running, and is instead sometimes staying at standard 3D clock; utilization is at 85% with no other tasks running, and I expect the behavior is related to the CPU usage issue.

So... to my questions....
Did something break how SWAN SYNC is automatically selected? Why is the process not using a full virtual core on my Kepler GPU? Is the app broken, or is the driver broken, or is this behavior expected? Are certain GPUGrid tasks intentionally set up to not use a full core on Kepler GPUs?

Thanks,
Jacob

Jim1348
Message 34892 - Posted: 4 Feb 2014 | 19:02:26 UTC - in response to Message 34890.

Isn't it just the same issue we have been discussing in the other thread?
http://www.gpugrid.net/forum_thread.php?id=3561&nowrap=true#34709

Unless there is something very different about Win8.1, it is probably just due to the work unit (or portions thereof) being more difficult than the card can handle, so it down clocks. (I assume that affects the CPU too, but have never investigated that to any extent.)

The usual suspects to change are:

    Increase power limit
    Reduce GPU/Memory clocks
    Increase GPU core voltage



I have seen this problem a number of times on GTX 660s and 650 Ti, and have them all running perfectly now by doing the above.

Jacob Klein
Message 34893 - Posted: 4 Feb 2014 | 19:04:50 UTC - in response to Message 34892.
Last modified: 4 Feb 2014 | 19:06:07 UTC

Jim,

This is a different issue altogether. For some reason, tasks on Kepler GPUs are no longer utilizing a full CPU core.

My questions remain. I welcome answers.

ROBtheLIONHEART
Message 34894 - Posted: 4 Feb 2014 | 19:41:04 UTC - in response to Message 34890.

Hello, I only crunch here part time and am not a super tech. I did notice, though, that you and Tomba (the other user who noticed the same thing) use the same beta driver. I have not upgraded my 770 or 780 yet (still on 331.82) and still see the standard CPU usage on my recent tasks. So perhaps it's the driver?

Rob

Jacob Klein
Message 34895 - Posted: 4 Feb 2014 | 19:44:51 UTC - in response to Message 34894.
Last modified: 4 Feb 2014 | 19:45:24 UTC

I am thinking it might be a conflict between the driver and the 8.15 GPUGrid application. I'm going to do some testing tonight, by installing the prior driver version, to prove/verify that.

We still will need someone from GPUGrid to pinpoint the exact nature of the problem (if it is a problem?), though. I can't report an issue to NVIDIA unless someone at GPUGrid confirms that the driver has a problem.

GPUGrid admins? Any input?

skgiven (Volunteer moderator)
Message 34896 - Posted: 4 Feb 2014 | 21:20:52 UTC - in response to Message 34893.
Last modified: 4 Feb 2014 | 21:21:13 UTC

All my tasks have been using a full CPU core for some time; the only exception was a beta that ran 4 days ago.

Your system contains a GTX460 (which is a Fermi, not a Kepler). When the task runs on the GTX460 it does not use a full CPU core (because it's a Fermi), and when a task starts on a GTX660Ti then stops and restarts on the GTX460 it will stop using the full CPU. This is normal:

GTX460, I5-SANTI_baxbimSPW2-38-62-RND6735_4 5106678 24 Jan 2014 | 6:44:07 UTC 24 Jan 2014 | 18:24:22 UTC Completed and validated 21,501.92 4,728.77 20,550.00 Short runs (2-3 hours on fastest card) v8.15 (cuda55)

GTX 660 Ti, I949-SANTI_baxbimSPW2-59-62-RND9438_0 5108086 24 Jan 2014 | 6:44:07 UTC 24 Jan 2014 | 9:47:14 UTC Completed and validated 10,793.23 10,738.45 20,550.00 Short runs (2-3 hours on fastest card) v8.15 (cuda55)

Jacob Klein
Message 34897 - Posted: 4 Feb 2014 | 21:27:03 UTC - in response to Message 34896.
Last modified: 4 Feb 2014 | 21:55:31 UTC

---------------------------------
skgiven:

I fully understand that. The way we are used to seeing it work is that the GTX 660 Ti (a Kepler card) uses a full core (shown as 12.5% CPU in both Task Manager and Process Explorer, i.e. one of my eight virtual cores), whereas my GTX 460 does not (it uses about 3.25% CPU).

What I'm saying is that, with the 334.67 BETA drivers, the behavior has changed. With 334.67, all of the tasks use about 3.25% CPU, even those running on the Kepler.

I've verified that 332.21 WHQL drivers exhibit the "normal" behavior, whereas the 334.67 BETA drivers exhibit this "new" behavior. Can you verify the new behavior using the new drivers?

So, again, my questions remain:
Did something break how SWAN SYNC is automatically selected? Why is the process not using a full virtual core on my Kepler GPU? Is the app broken, or is the driver broken, or is this behavior expected? Are certain GPUGrid tasks intentionally set up to not use a full core on Kepler GPUs?

---------------------------------
MJH?
GPUGrid Admins?
Is something not working as expected?

Retvari Zoltan
Message 34898 - Posted: 4 Feb 2014 | 22:22:29 UTC
Last modified: 4 Feb 2014 | 22:24:04 UTC

I had this issue with older drivers on WinXPx64 (and on WinXPx86 too), but only with NOELIA_DIPEPT-0-2 workunits:
Task 7717507, 7718044, 7717962, 7717686, 7718047, 7717487.
The really strange part is that the two workunits following task 7717507 were ok without any intervention: task 7720966, 7722153
All of my hosts processed the subsequent NOELIA_DIPEPT-0-2 workunits normally.

Jacob Klein
Message 34899 - Posted: 4 Feb 2014 | 22:23:46 UTC - in response to Message 34898.

For me, my issue seems related to the new driver version, and not batches of tasks.

skgiven (Volunteer moderator)
Message 34900 - Posted: 4 Feb 2014 | 22:45:57 UTC - in response to Message 34897.

The 8.15 app was added to the long queue on 23 Jan 2014.

ROBtheLIONHEART
Message 34901 - Posted: 4 Feb 2014 | 22:56:57 UTC

Another odd thing I noticed in the runtimes: when I read Tomba's post on the server board(?) and checked his times, the post-driver-change CPU time is about 75% lower, but the GPU run times before and after the driver change are about the same on similar WUs.

It is odd that, with 75% less CPU usage, the completion time would stay about the same.

JK, are you seeing the same?

Jacob Klein
Message 34902 - Posted: 4 Feb 2014 | 23:00:38 UTC - in response to Message 34901.
Last modified: 4 Feb 2014 | 23:02:09 UTC

It's going to be hard for me to determine whether the task run time is affected. I stop and start tasks many times before they are completed, and I have 2 heterogeneous GPUs, the 660 Ti and the 460, so the same task usually has a portion of it run on the 660 Ti and a portion run on the 460. I will therefore likely be unable to determine whether the new behavior of "tasks not using a full CPU on Kepler GPUs with 334.67 drivers" affects my task run times.

But the behavior is new/different, and I'm still hoping that skgiven can confirm the behavior, and that someone can answer the questions, especially if it is expected behavior or not.

skgiven (Volunteer moderator)
Message 34903 - Posted: 4 Feb 2014 | 23:18:12 UTC - in response to Message 34902.
Last modified: 5 Feb 2014 | 11:15:14 UTC

The 'problem' is with the Beta driver; it does prevent a full CPU core from being used.

Manually applying SWAN_SYNC=0 does not work, even after a restart.

I'm using 7.2.33 (x64), so it's nothing to do with the Boinc Beta.
I'm using W7, so nothing to do with Win 8.1.
I have a GTX670 and a GTX770 in the test system, so nothing to do with a Fermi GPU.
The issue applies to both of my GPU's.

To test the times, go to BoincTasks, select a GPUGrid WU and click Properties. Note down the times, and after 10 minutes do the same. If the runtime and CPU time are not very close and there is a large gap, you can see that this is happening.
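If you'd rather check at the process level than in BoincTasks, the rough Python sketch below (my own illustration, not anything shipped with BOINC or GPUGrid; it assumes the third-party psutil package and matches processes whose name starts with "acemd", per the acemd.815-55.exe executable mentioned above) samples the acemd process's CPU time over an interval and reports how many cores' worth of CPU it actually consumed:

```python
import time
import psutil  # third-party package, assumed installed: pip install psutil

# How long to sample for; 600 s matches the "after 10 minutes" suggestion above.
INTERVAL = 600

def cpu_seconds(proc):
    # user + system CPU time consumed so far by this process
    t = proc.cpu_times()
    return t.user + t.system

# Find running ACEMD processes by name prefix (an assumption; adjust if needed).
acemd = [p for p in psutil.process_iter(["name"])
         if (p.info["name"] or "").lower().startswith("acemd")]

start = {p.pid: cpu_seconds(p) for p in acemd}
time.sleep(INTERVAL)

for p in acemd:
    try:
        used = cpu_seconds(p) - start[p.pid]
        # ~1.0 means a full virtual core is busy (polling); ~0.1-0.3 matches
        # the low-CPU behaviour reported with the 334.67 Beta driver.
        print(f"{p.info['name']} (PID {p.pid}): {used / INTERVAL:.2f} cores")
    except psutil.NoSuchProcess:
        print(f"PID {p.pid} finished before the sample window ended")
```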

Jacob Klein
Message 34904 - Posted: 4 Feb 2014 | 23:23:17 UTC - in response to Message 34903.
Last modified: 4 Feb 2014 | 23:24:40 UTC

Thanks for testing, skgiven. I'm glad you reproduced the behavior.

I still need the following questions answered:
1) Is it an actual problem, or is it correct behavior?
2) If it is a problem, is the problem in the driver, or is the problem in the application code that determines how the acemd process functions?

Can anyone close to the acemd 8.15 application answer those questions?

Thanks,
Jacob

skgiven (Volunteer moderator)
Message 34905 - Posted: 4 Feb 2014 | 23:35:43 UTC - in response to Message 34904.
Last modified: 4 Feb 2014 | 23:37:53 UTC

It is an actual problem in the sense that the application intends to force the use of a full CPU core (for many reasons). However, how this change impacts different cards and setups remains to be seen. I suspect that if you do not use your CPU for anything else it will not make much difference to WU performance, but it will take a couple of days of results to know for sure whether it even matters when you do crunch CPU tasks at other projects...

The app hasn't changed, but might need to (if this isn't a bug in the app).
The driver has changed. Nothing else on my system changed, other than how the ACEMD app performs.

Jacob Klein
Message 34906 - Posted: 4 Feb 2014 | 23:39:19 UTC - in response to Message 34905.
Last modified: 4 Feb 2014 | 23:39:52 UTC

I know the app hasn't changed, but the driver has. Even if this means a problem has surfaced, we don't know whether the problem is in the app or in the driver.

I'll monitor my results, to at least get a feel for whether it affects my GPUGrid performance on my GTX 660 Ti.

I'm hoping an admin can chime in, so that we can determine whether it's a bug, and if it is, where the bug is.

skgiven (Volunteer moderator)
Message 34907 - Posted: 4 Feb 2014 | 23:56:19 UTC - in response to Message 34906.
Last modified: 5 Feb 2014 | 0:14:30 UTC

If it is a problem, it's with the driver. The app calls the driver, and it's a one-way thing. The ACEMD-based 8.15 app can use CUDA 4.2 or CUDA 5.5, and the behavior is the same for both. To me this suggests that the driver is testing some feature of the forthcoming CUDA 6 which has changed the way something works in both 4.2 and 5.5.

Jacob Klein
Message 34908 - Posted: 4 Feb 2014 | 23:58:14 UTC - in response to Message 34907.
Last modified: 4 Feb 2014 | 23:58:28 UTC

Thanks for the information. I have put out a request for more info from NVIDIA, in their 334.67 driver feedback thread.

My post is here:
https://forums.geforce.com/default/topic/679611/geforce-drivers/official-nvidia-334-67-beta-display-driver-feedback-thread-released-1-27-14-/post/4112315/#4112315

skgiven (Volunteer moderator)
Message 34910 - Posted: 5 Feb 2014 | 11:12:32 UTC - in response to Message 34908.
Last modified: 5 Feb 2014 | 13:12:49 UTC

Just a first glance at tasks as they come in, but it appears that tasks run just as fast. To me this suggests that the driver 'fixes' something; tasks take as long but 'use' less CPU. That is obviously better if the freed-up processor power can be used for something else without any impact on the GPUGrid run times.

Both same type tasks on same system and same setup run only on the GTX770 (CUDA5.5):

Older driver only,
316x-SANTI_MARwtcap310-0-32-RND3599_0
Run time 29,309.21
CPU time 29,195.03

Older driver (for ~75%) + Beta driver (for ~25%),
824x-SANTI_MARwtcap310-3-32-RND7471_1
Run time 29,177.10
CPU time 22,138.07


Same type of tasks on same system and setup run only on the GTX770 :

Older driver only,
I78R4-NATHAN_KIDKIXc22_6-48-50-RND2248_0
Run time 28,945.29
CPU time 28,801.58

Beta driver only,
I71R6-NATHAN_KIDKIXc22_6-49-50-RND1077_0
Run time 29,050.67
CPU time 6,376.88


The estimated runtime/CPU time of a task I'm presently running is ~11h/3h.

This needs to be checked for other card types, on other systems and setups (with more and fewer CPU cores in use), and for the short queue, but for my system it appears the driver has reduced the need for CPU polling. It's likely a CUDA improvement.
I have been running 5 CPU tasks, 2 GPU tasks and some NCI tasks. Yesterday the CPU usage was around 92%, today it's around 72%.

At some stage I will increase the CPU usage for CPU projects, and see how it affects the GPUGrid runtimes.

Jacob Klein
Message 34912 - Posted: 5 Feb 2014 | 13:19:00 UTC - in response to Message 34910.
Last modified: 5 Feb 2014 | 13:20:44 UTC

Phenomenal information, skgiven, thanks for sharing. I love the angle of "perhaps it's a fix instead of a bug" :)

This has prompted me to change my app_config.xml cpu_usage values. I had been using 0.5, so that when a GPUGrid task was running on each GPU (and I knew a full core would be used because of the Kepler), the summed values were 0.5+0.5=1.0 CPU; BOINC therefore counted a CPU as used by the GPU tasks and did not over-commit my system. But now, with these 334.67 drivers, I'm using a cpu_usage value of 0.2, in order to keep my system fully committed. Essentially, this lets my system run an additional CPU task!
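For anyone wanting to make the same change, here is a minimal sketch of the kind of app_config.xml I'm describing. The app names (acemdlong / acemdshort) are my assumption - check the <name> values in your client_state.xml - and the file goes in the GPUGrid project folder, after which you tell the BOINC client to re-read its config files (or just restart it):

```xml
<!-- Sketch only: app names are assumptions; verify them in client_state.xml. -->
<app_config>
  <app>
    <name>acemdlong</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

With two GPUGrid tasks running, the reserved CPU budget is then 0.2+0.2=0.4 of a core instead of 1.0, which is why BOINC is willing to start one more CPU task than before.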

I'll chime back in if I hear anything from NVIDIA.

Carlesa25
Message 34915 - Posted: 5 Feb 2014 | 15:37:55 UTC - in response to Message 34912.

Hello: Possibly it is the same issue that has been happening on Linux for a month.

CPU usage is low (around 12%), but performance is almost equal to when the load was 100% of one CPU core per GPU.

See: http://www.gpugrid.net/forum_thread.php?id=3601

Jacob Klein
Message 34917 - Posted: 5 Feb 2014 | 16:00:24 UTC - in response to Message 34915.

Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers.

Dagorath
Message 34921 - Posted: 5 Feb 2014 | 17:12:23 UTC - in response to Message 34917.

Sorry for not providing input on this sooner but I've been swamped with other issues.

I have 2 Linux rigs crunching GPUgrid. One has a 670 and a 660Ti in it (shows as two GTX 660Ti on the website) with driver version 331.20 and it is showing 99 - 100% CPU usage on each of 2 tasks. Those are real cores, no HT on that CPU.

The other rig has one GTX 670 with driver 331.38 and it is showing 9 - 11% CPU usage. Again those are real cores, no HT on that CPU. I don't think 331.38 is a beta driver but I could be wrong.

Take this report with a bit of caution, because something weird that I don't understand is going on. Due to events I'm not going to bother explaining (they're long and complicated), the rig with the older driver should actually have the older driver, so something is fishy here, but I'm not sure what it is yet. I'm either confused, or NVIDIA released a driver and then retracted it, or something. I'll get back with more details when I know more, but I hope what I have provided so far sheds some light. I would be very happy to learn that we're getting the same production with less CPU time.


skgiven (Volunteer moderator)
Message 34924 - Posted: 5 Feb 2014 | 17:53:12 UTC - in response to Message 34912.
Last modified: 5 Feb 2014 | 20:39:25 UTC

I changed my CPU usage setting to 100%. As my two present GPUGrid tasks state that they require 0.701 and 0.778 CPUs, this means I can run 7 CPU tasks rather than 5 (as before the driver update), and the actual CPU usage is at ~97%. So in reality around 0.5 of a CPU core is being used to support each of my GTX 670 and GTX 770 cards. For lesser cards it would be less, and for bigger cards it's going to be more.

I have a 210x-SANTI_MAR420cap310-0-32-RND0577_0 WU (CUDA5.5) that was under 3% complete when I changed the CPU settings.
It has been running, with system-wide CPU usage at ~97%, for a further 15% of its run. Unfortunately it appears that the run time will rise to around 38,000 seconds, which is significantly (~20%) longer than with the previous settings and driver:

313x-SANTI_MARwtcap310-2-32-RND3990_0 5134323 3 Feb 2014 | 13:53:54 UTC 3 Feb 2014 | 23:37:50 UTC Completed and validated 31,446.83 30,515.01 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

848x-SANTI_MARwtcap310-1-32-RND6732_0 5133355 3 Feb 2014 | 3:52:21 UTC 3 Feb 2014 | 14:53:19 UTC Completed and validated 31,495.69 31,043.62 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda42)

So, despite freeing a CPU thread or two, the GPUGrid WU's are every bit as dependent on CPU availability/responsiveness as before, and the more CPU apps are running the slower the GPUGrid app will run.

Using MSI Afterburner I can see that GPU usage is more jagged; typical when there is resource contention. Dropping from 100% CPU usage to 95% (6 CPU tasks in my case instead of 7), the GPU usage rose from roughly 70% to 86% and 81% and became less jagged (but still a little). Dropping to 75% made it slightly more linear, and utilization rose by a further 1%. Dropping to 50% did the same: another ~1% gain in GPU usage and an almost perfectly linear line. With CPU usage set to 40%, GPU usage was linear at 89% and 86%. This is typical of the way it used to be.

The only other thing you could fiddle at is the GPUGrid WU priorities...

Carlesa25
Message 34925 - Posted: 5 Feb 2014 | 18:41:33 UTC - in response to Message 34917.

Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers.



Hi. The difference, as noted in that thread, is the version of Linux in use (Ubuntu 13.10 or 14.04); the NVIDIA driver is the same in both cases: 331.38.

I also use Windows 8.1 (same hardware) with driver 332.21, and operation there is normal, using almost 100% of a CPU core.

Dagorath
Message 34928 - Posted: 5 Feb 2014 | 22:01:06 UTC - in response to Message 34925.

Carlesa, did your change in GPUGrid task behavior... coincide with an updated driver version? Here in the Windows world, we only noticed the new behavior when we started using 334.67 BETA drivers.



Hi. The difference, as noted in that thread, is the version of Linux in use (Ubuntu 13.10 or 14.04); the NVIDIA driver is the same in both cases: 331.38.


I am running Ubuntu 12.04 on both machines. I downloaded the drivers directly from NVIDIA; I did not install drivers from the "Additional drivers" utility. Anyway, the fishy thing I referred to in my previous post is probably irrelevant; the relevant point is that the newer driver seems to use a lot less CPU.

On both rigs I am running other non-GPU projects. I'll experiment with turning those off while allowing GPUgrid tasks to run to see if that affects CPU usage and/or runtimes. I can experiment with process priority (niceness) too.


skgiven (Volunteer moderator)
Message 34930 - Posted: 6 Feb 2014 | 10:47:17 UTC - in response to Message 34928.
Last modified: 6 Feb 2014 | 11:33:39 UTC

... and fiddle at the GPUGrid WU priorities I did; I used Process Hacker to set them all to High, including the I/O priorities.
I ran a 715x-SANTI_MARwtcap310-6-32-RND2030_0 WU with CPU usage at 95% in BOINC (88% going by Task Manager).
In effect, while using 1 more CPU core to crunch with, the run time was less than with a similar WU I ran on the old drivers, 917x-SANTI_MARwtcap310-1-32-RND4455_0.

Beta drivers (GTX770),
Run time 28,202.99
CPU time 11,046.48

Old drivers (GTX770),
Run time 29,690.77
CPU time 29,412.32

So, that's a 5% improvement in the GPUGrid WU while using 1 more CPU core.

715x-SANTI_MARwtcap310-6-32-RND2030_0 5142448 5 Feb 2014 | 18:58:13 UTC 6 Feb 2014 | 4:11:30 UTC Completed and validated 28,202.99 11,046.48 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

917x-SANTI_MARwtcap310-1-32-RND4455_0 5132820 2 Feb 2014 | 22:00:26 UTC 3 Feb 2014 | 7:07:43 UTC Completed and validated 29,690.77 29,412.32 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)


Another comparison of the Beta against the old driver, this time for a GTX670. This time priorities were set some time into the run, and again overall 1 more CPU core was used to crunch CPU tasks:

313x-SANTI_MARwtcap310-2-32-RND3990_0 5134323 3 Feb 2014 | 13:53:54 UTC 3 Feb 2014 | 23:37:50 UTC Completed and validated 31,446.83 30,515.01 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

533x-SANTI_MAR420cap310-0-32-RND9530_0 5139521 5 Feb 2014 | 23:34:59 UTC 6 Feb 2014 | 9:31:07 UTC Completed and validated 31,439.30 11,441.35 115,650.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

Although Process Hacker does not allow the saving of thread priority settings, it does allow you to save Task Priority. Fortunately this seems to be sufficient:

old drivers,
gluilex4x4-NOELIA_DIPEPT1-0-2-RND2366_1 5127910 4 Feb 2014 | 0:18:57 UTC 4 Feb 2014 | 8:45:15 UTC Completed and validated 26,130.57 25,997.61 93,000.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

Beta drivers,
tyrglux6x44-NOELIA_DIPEPT1-1-2-RND3665_0 5143384 6 Feb 2014 | 3:16:34 UTC 6 Feb 2014 | 11:22:11 UTC Completed and validated 25,796.40 13,062.85 93,000.00 Long runs (8-12 hours on fastest card) v8.15 (cuda55)

This NOELIA WU comparison suggests that with just the app priority set to High (not threads or I/O) the WU is slightly faster (1.3%) than the last one I ran on the same card with the old drivers. This is good news, as it means this works on at least 2 WU types. Even if the tasks took the same length of time to complete, I'm getting a CPU thread out of it.

A few things to note:
Different GPU's use the CPU to different extents during the run, so expect some performance variation.
Different WU types use the CPU to different extents. For my W7 system it's presently from 29% to 46%. The actual CPU usage obviously depends on the GPU (smaller cards use less and a GTX 780 Ti the most) and on the CPU, which could be 2 GHz or 4 GHz. So, some tasks might be 5% faster, others only 1%.
You need to save the priority for both the cuda42 and cuda55 versions of the app.

What to do next is to test with app priority only more thoroughly and then without app priority to make sure these results are not just due to having more free CPU cycles, then test again at 100% CPU in Boinc and with priority on. That will take a few days though.

PS. Setting priority is old hat but it's an opportunity to have another look now that things have changed with this beta driver.
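If anyone wants to apply the same priority bump without clicking through Process Hacker every time a new task starts, here is a rough Python sketch (my own, using the third-party psutil package; the "acemd" name prefix is an assumption based on the executable names in this thread). Note that psutil can only raise the CPU priority class on Windows; for the I/O priority you would still need Process Hacker or similar.

```python
import psutil  # third-party package, assumed installed: pip install psutil

# Raise the Windows CPU priority class of running ACEMD processes to "High",
# roughly what was done by hand in Process Hacker above. Run from an
# elevated (administrator) prompt so the priority change is allowed.
for proc in psutil.process_iter(["name"]):
    name = (proc.info["name"] or "").lower()
    if name.startswith("acemd"):
        try:
            proc.nice(psutil.HIGH_PRIORITY_CLASS)  # Windows-only constant
            print(f"Set HIGH priority on PID {proc.pid} ({proc.info['name']})")
        except (psutil.AccessDenied, psutil.NoSuchProcess) as err:
            print(f"Could not change PID {proc.pid}: {err}")
```

You would have to re-run this (or schedule it) whenever a new task starts, since the priority applies to the running process, not to the executable.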

MJH (Project administrator, developer and scientist)
Message 34931 - Posted: 6 Feb 2014 | 17:28:39 UTC - in response to Message 34930.

The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334.

You should see lower CPU load, without diminished GPU performance.

MJH

Retvari Zoltan
Message 34933 - Posted: 6 Feb 2014 | 22:18:20 UTC - in response to Message 34931.

The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334.

You should see lower CPU load, without diminished GPU performance.

MJH

I gave the v334.67 driver a try on my WinXPx64 / Core i7-4770K / 2x GTX 780 Ti host, and the GPU usage on both cards dropped by 5% (the temperatures were also lower than before), so I've reverted to the v332.21 driver.

See task 7743473 & 7739626.

skgiven (Volunteer moderator)
Message 34934 - Posted: 7 Feb 2014 | 11:51:07 UTC - in response to Message 34933.
Last modified: 7 Feb 2014 | 11:56:21 UTC

The purpose of the "SWAN_SYNC" knob is to set the CUDA runtime to use a low-CPU mode. In practice, this hasn't worked very well for quite some time (in terms of driver releases). Looks rather like "correct" behaviour is restored with version 334.

You should see lower CPU load, without diminished GPU performance.

MJH

I gave the v334.67 driver a try on my WinXPx64 / Core i7-4770K / 2x GTX 780 Ti host, and the GPU usage on both cards dropped by 5% (the temperatures were also lower than before), so I've reverted to the v332.21 driver.

See task 7743473 & 7739626.


I expect XP is different to W7 because the GPU utilization is higher on XP.
On W7 my GTX770's GPU usage is around 79% running a long SANTI_MAR (cuda5.5).
Did you try changing the priority, or perhaps you were doing that with the old drivers?

Performance seems to depend on what your settings are/were, and likely your OS (XP vs W7); when I first tested the Beta driver, the runtimes were the same but there was more CPU available. I didn't use the extra CPU at first. When I used some of this freed-up CPU, the GPUGrid runtimes increased. The fact that runtime is the same but CPU usage is less is still a better situation from the user's point of view; the system is going to be more responsive as it has more CPU threads available, but if you can't use these free CPU cycles without a detrimental impact on the GPUGrid WUs' runtime then it's not great.

The possible fix is to change the app's priority to High using Process Hacker (or similar). That way you might be able to use one more CPU core/thread and get the same runtime performance for GPUGrid WUs (while utilizing more of your CPU at a CPU project).
Of course this undoes the app's default settings, so it might mean that system responsiveness isn't what it was. Again, this is likely to be different for different hardware; a system with a GTX 660 is more likely to suffer reduced responsiveness than one with a high-end card.

To test the suggested priority fix, I have just started running 2 SANTI_MAR WUs with priority set to Normal so I can compare these against WUs completed at High priority (again, 1 more CPU thread is being used than I was using with the non-Beta drivers)...

Jacob Klein
Message 34935 - Posted: 7 Feb 2014 | 11:55:16 UTC - in response to Message 34934.
Last modified: 7 Feb 2014 | 12:06:59 UTC

Thank you for performing these tests, skgiven. I consider them very useful, and look forward to your additional results. It would be difficult for me to perform such tests, since I'd have to disable work on a heterogeneous GPU in order to ensure tasks stay on the Kepler. But I will surely use your results to modify my configuration appropriately.

So far, my configuration has been changed a little. I still run GPUGrid-only on my GTX 660 Ti and my GTX 460. And I run Albert/Einstein/SETI/Beta (A/E/S/B) only, on my GTS 240.

Old config: Set GPUGrid tasks to 0.5 CPU, since when running 2, I for sure was using a core on the Kepler. Set A/E/S/B to 0.5 CPU for most of their apps, since they don't use a full core. Use 100% CPUs. Result: 7 CPU tasks, 2 GPUGrid tasks, 1 A/E/S/B task; CPU slightly overloaded.

Current config: Set GPUGrid tasks to 0.4 CPU, to better reflect values I'm seeing. Set A/E/S/B to 0.3 CPU for most of their apps. Note: 0.4+0.4+0.3=1.1. Use 100% CPUs. Result: 7 CPU tasks, 2 GPUGrid tasks, 1 A/E/S/B task; CPU very slightly underloaded.

Note: There is 1 A/E app that I set to 1 CPU, "Gamma-ray pulsar search #2". And there is 1 S/B app that I set to 1.0 CPU, "AstroPulse v6". Process Monitor shows that those apps actually use a full core, which is why I have them setup to use 1.0 CPU. When one of those tasks run, the result is: 7 CPU tasks, 2 GPUGrid tasks, 1 1-core A/E/S/B task; CPU moderately overloaded.

Regards,
Jacob

Jim1348
Message 34936 - Posted: 7 Feb 2014 | 13:13:16 UTC
Last modified: 7 Feb 2014 | 13:26:32 UTC

I have been running the 334.67beta on WinXP (32-bit) long enough to complete a Long on both a GTX 660 (SANTI_MARwtcap310) and a GTX 650 Ti (SANTI_MAR422cap310). It seems fine on GPU usage, being its usual 97% on the 660 and 98% on the 650 Ti. The run times also seem identical to the previous driver (332.21); that is, they are no faster.

The only real difference is the CPU usage. It is now down to 17% on the 660, and 12% on the 650 Ti (as measured by BoincTasks). That is very nice, since that motherboard is an older P45 with an E8400 Core2 Duo at 3.0 GHz. The next step will be to try to run a single WCG project also on the CPU. We will see about that.

Richard Haselgrove
Message 34937 - Posted: 7 Feb 2014 | 14:45:10 UTC - in response to Message 34934.

I expect XP is different to W7 because the GPU utilization is higher on XP.

I suspect GPU utilization is higher on XP because the Windows driver model is so very different from Vista/Win 7.

skgiven (Volunteer moderator)
Message 34939 - Posted: 7 Feb 2014 | 22:43:19 UTC - in response to Message 34937.

I expect XP is different to W7 because the GPU utilization is higher on XP.

I suspect GPU utilization is higher on XP because the Windows driver model is so very different from Vista/Win 7.

It's been long established (not least by you) that the introduction of the WDDM is the root of the performance difference between XP and more recent versions of Windows (Vista onwards). The WDDM introduced a large CPU overhead, hence the latency increase. For some 'non-GPUgrid' CUDA apps it's negligible, for other apps it's as high as 20%. Here it's around 12.5% now for a high-ish end GPU, but is obviously dependent on the CPU and supporting hardware. The latency also impacts upon some OpenCL apps for ATI cards, but not any Boinc apps that I'm aware of.

In the case of the Beta vs older drivers there appears to be a difference between how the tasks perform under XP and W7. With XP it's apparently worse, though I think I know why (below).

So far as I can tell, for W7, an 8-thread Intel CPU and two high-ish end GPUs, the run times are the same under the Beta driver if you don't change any BOINC settings and hadn't saturated the CPU to begin with. In my case I was able to configure BOINC to use an extra thread to crunch CPU tasks and set the app priority to High, keeping the GPUGrid run times the same or slightly better.

Zoltan, it occurred to me that the difference might actually be the CPU projects you were running; I'm just running WCG tasks, which set their app priority to Idle. Many projects use higher settings, such as Normal.

Retvari Zoltan
Message 34952 - Posted: 9 Feb 2014 | 10:04:50 UTC - in response to Message 34939.
Last modified: 9 Feb 2014 | 10:47:53 UTC

Zoltan, it occurred to me that the difference might actually be the CPU projects you were running; I'm just running WCG tasks, which set their app priority to Idle. Many projects use higher settings, such as Normal.

I'm running 5 SIMAP tasks on the CPU; they run at low priority (I didn't change anything other than the NVIDIA driver during this test).

GoodFodder
Message 34966 - Posted: 10 Feb 2014 | 12:56:20 UTC
Last modified: 10 Feb 2014 | 13:36:12 UTC

For those who may be interested, I deployed the 334.67 BETA drivers to my small machine with two GTX 650 Tis.

I agree SWAN_SYNC appears to be enabled now, as CPU usage dropped to 8%, yet the run time for a long task appears to be only 2% longer - a good trade-off for the reduced power usage, not to mention my GPU temps dropped a few degrees.

http://www.gpugrid.net/results.php?hostid=166813

I set BOINC to run with an 'above normal' priority using: http://www.efmer.eu/boinc/download.html

Incidentally, my machine is running headless; on this occasion I was able to install the drivers through an RDP session, though they do not take effect (as the cards cannot be seen). Hence, once rebooted, I just updated the cards directly through Device Manager.

GoodFodder
Message 34985 - Posted: 11 Feb 2014 | 22:50:13 UTC
Last modified: 11 Feb 2014 | 22:52:06 UTC

I measured the power draw at the wall with the 334.67 beta drivers on XP x86, and it's about 8% less (176 W) than with the previous WHQL drivers (192 W), whilst performance only dropped by about 2% - a good result in my opinion.

Jacob Klein
Message 35018 - Posted: 13 Feb 2014 | 16:20:20 UTC - in response to Message 34985.

Just wanted to chime in that, after repeated pressure by me, ManuelG is going to find more information on what has changed in the driver that has affected Kepler CPU usage. He hopes to have such information within 48 hours.

https://forums.geforce.com/default/topic/679611/geforce-drivers/official-nvidia-334-67-beta-display-driver-feedback-thread-released-1-27-14-/post/4119864/#4119864

Jacob Klein
Message 35202 - Posted: 21 Feb 2014 | 12:48:59 UTC
Last modified: 21 Feb 2014 | 12:50:03 UTC

ManuelG said he had to file a bug to have the devs even look at it.
https://forums.geforce.com/default/topic/690370/geforce-drivers/official-nvidia-334-89-whql-display-driver-feedback-thread-released-2-18-14-/post/4127820/#4127820

I tried to stress that it may not be a bug at all, and that we are simply looking for confirmation that something changed, and more information about that change:
https://forums.geforce.com/default/topic/690370/geforce-drivers/official-nvidia-334-89-whql-display-driver-feedback-thread-released-2-18-14-/post/4128112/#4128112

I'll let you guys know when I know more.

Carlesa25
Message 35204 - Posted: 21 Feb 2014 | 14:50:11 UTC - in response to Message 35202.

Hello: On Windows 8.1, since I installed the latest NVIDIA driver (334.89, a couple of days ago), it behaves like Linux has for some time now: the CPU load is reduced to 10-20% (varying with the task type) instead of the previous 100%.

The overall performance of the GPU is hardly altered, and the change improves the load/performance/power-consumption ratio.
