
Message boards : Number crunching : 1 Work Unit Per GPU

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46314 - Posted: 25 Jan 2017 | 17:05:18 UTC

I would like to suggest a 1 WU per GPU policy for this project in order to increase its efficient use of resources and speed up results.

This post is not about fine-tuning the project for the benefit of any particular user or group of users, although ultimately I believe it will benefit everyone. It is about fine-tuning the project for the project's benefit. Also, while there are many things that could be done server-side to improve efficiency, for the purpose of this discussion I would like you to concentrate on this one only.

As things stand, we have hosts caching WUs to crunch hours in the future while other hosts that can't get any work stand idle. This doesn't make a lot of sense to me. In addition, when a handful of WUs become available, they are often quickly gobbled up by hosts that already have one running, which add them to their cache and deny idle hosts the chance to download them because of the vagaries of the BOINC backoff. You only get a WU if it's available when BOINC asks for it.

All this is akin to having 2,000 workers and 1,500 jobs: instead of giving 1 job each to 1,500 workers and having 500 waiting for work, this project is giving 2 jobs each to 750 workers and having 1,250 waiting for work. That doesn't make for a speedy turnaround of work.

The above is only an example of what is taking place and the numbers are not definitive.
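
The arithmetic behind that comparison can be checked with a few lines of Python. This is only a sketch using the illustrative figures from the post (2,000 workers, 1,500 jobs); the function name `idle_workers` is made up for the example and nothing here is measured project data.

```python
def idle_workers(workers: int, jobs: int, jobs_per_worker: int) -> int:
    """Return how many workers are left without any job.

    Jobs are handed out `jobs_per_worker` at a time until they run out.
    """
    # Ceiling division: number of workers needed to absorb all jobs.
    workers_needed = -(-jobs // jobs_per_worker)
    busy = min(workers, workers_needed)
    return workers - busy


# Illustrative figures from the post, not real project statistics.
print(idle_workers(2000, 1500, 1))  # 500
print(idle_workers(2000, 1500, 2))  # 1250
```

With the same 1,500 jobs, letting each worker hold two jobs more than doubles the number of idle workers in this toy model.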

Apart from the unequal distribution of work, the system actually gets worse, because in most cases a completed unit of work generates another WU, thus supplying more work. It can't do that while it is sitting in a host's cache, so again there is less work available for idle hosts.

There is another problem. As we all know, there are hosts that continually "time out", so the WU is resent after 5 days, and there are other hosts that continually "error out" after long periods of time. Under the present policy these hosts can hold up the progression of 2 WUs at a time. If a 1 WU per GPU policy were implemented, they would only halt the progression of 1 WU at a time, reducing their impact on the project.

When BOINC is downloaded and installed, the cache size is set by default and it will download 2 WUs even if the card on that host is a slow one. If that card only got one WU, it might complete it within 24 hours, speeding up the project and earning a bonus; with 2 WUs downloaded it will do neither, and a lot of users will not even be aware that the setting can be changed, or of its implications. In addition, there will be plenty of users with a slow card running multiple projects on BOINC. They want a large cache on some projects but not on GPUGrid. They can't have that, because these settings are global and apply to all running projects.

I'm sure you're thinking that GPUGrid can't keep enough work in the queue to keep us all active now, so why should we be concerned with speed and efficiency?
Firstly, I suspect the scientists want their results back as quickly as possible so they can analyze them and maybe issue more work on the back of them. There is also another consideration: Stefan has mentioned collaboration with others which, if fruitful, could lead to more work and boost demand on GPUGrid. Scientists at GPUGrid may have cause in the future to release a huge number of WUs and need the project to be running at maximum capacity.

In any case, the time to fix a leaky roof is when the sun is shining, not when it begins to rain, and the roof is certainly leaking on this project: leaking resources.

I hope I have made my case clearly.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46316 - Posted: 25 Jan 2017 | 18:50:33 UTC

For people who want the TL;DR: in BOINC, set 0 days of work and 0 additional days of work. If you crunch 2 WUs at a time per GPU, now is probably not the best time to do that, especially because the WUs have such high utilization now. We just want to have more people crunching the WUs in parallel.
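
For anyone who prefers a file to the preferences dialog, here is a rough sketch of how those two cache settings could be written out as a local override. It assumes the BOINC client reads `work_buf_min_days` and `work_buf_additional_days` from a `global_prefs_override.xml` in its data directory; that is my understanding of the client, so please check it against the BOINC documentation for your version. The helper name `build_cache_override` is just for this example.

```python
import xml.etree.ElementTree as ET


def build_cache_override(min_days: float = 0.0, additional_days: float = 0.0) -> str:
    """Build the text of a preferences override with the two cache settings.

    Assumption: the element names below match what the BOINC client expects
    in global_prefs_override.xml. Verify before relying on this.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{additional_days:.2f}"
    return ET.tostring(root, encoding="unicode")


# Print the XML rather than writing it, so nothing on disk is touched.
print(build_cache_override())
```

As noted earlier in the thread, these settings are global, so a zero cache applies to every attached project, not just GPUGrid.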

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46317 - Posted: 25 Jan 2017 | 19:07:17 UTC

I also agree that the scientists should implement, at least temporarily, a 1 WU per GPU policy, under which a GPU gets another WU only once its current one finishes, errors out or times out. This would dramatically speed up the workflow.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Message 46318 - Posted: 25 Jan 2017 | 22:03:23 UTC - in response to Message 46314.

You've made it perfectly clear.
I agree with you, but this reduction of maximum workunits per GPU is effective only when there is a shortage of workunits (which is quite frequent lately). So perhaps this setting should be different for every host, based on the turnaround time of that host. Every host should start with 1 workunit per GPU, but if the turnaround time is below 48 hours, the maximum is set to 2. When the turnaround time is above 48 hours, the maximum is set to 1.
The drawbacks of 1 workunit per GPU are:
1. You can't run two workunits on a single GPU simultaneously (the hosts configured to do so will have an idle GPU, so they have to be reconfigured)
2. It will reduce the RAC of every host, and so the overall RAC of GPUGrid, because the host won't receive a new GPUGrid workunit until the finished one is uploaded. Hosts without backup projects will be idle during the upload/download period, while hosts with backup projects will process (at least) one workunit from the backup project between GPUGrid workunits, as the policy will force the host to get work from the backup project(s) until the finished GPUGrid workunit is uploaded. If the backup project has long workunits, or the host has a slow internet connection (mainly hosts outside Europe), this reduction could be significant for these hosts (and for the whole project also). This would make some volunteers disable their backup project(s), or manually manage them (which is pretty inconvenient).
However, we should try it to learn exactly what its impact is.
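
The turnaround-based rule proposed in this post can be written down as a small function. This is only a sketch of the proposal as worded here, not anything the GPUGrid server is known to implement; the name `max_wus_per_gpu` is hypothetical, and the post does not say what happens at exactly 48 hours, so the sketch treats that case conservatively as 1.

```python
from typing import Optional


def max_wus_per_gpu(turnaround_hours: Optional[float]) -> int:
    """Per-host limit on workunits per GPU, based on turnaround time.

    None means a new host with no turnaround history yet.
    """
    if turnaround_hours is None:
        return 1  # every host starts with 1 workunit per GPU
    if turnaround_hours < 48:
        return 2  # fast hosts may hold a second workunit
    return 1      # 48 hours or more (the boundary case is an assumption)


print(max_wus_per_gpu(None))  # 1
print(max_wus_per_gpu(20.0))  # 2
print(max_wus_per_gpu(60.0))  # 1
```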

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46319 - Posted: 25 Jan 2017 | 22:21:04 UTC - in response to Message 46318.

I actually think it will be just as effective when there are lots of WUs to be had, because it's likely that one or more types of work will have a higher priority than others, and you don't want those sitting in a cache if they are to be completed as fast as possible.

Thanks for your support Retvari and Pappa

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 46329 - Posted: 26 Jan 2017 | 7:54:22 UTC

I don't really second that, as reducing the number of WUs per GPU will lead to poor utilization of capable hardware (GTX 980, 980 Ti, 1070 and up) because of a bad CPU/GPU ratio. The 1 WU/GPU thing would only work well for mid-range Maxwell cards and most of the Kepler and Fermi cards. But it will surely reduce the efficiency of the fast machines.

Well, I guess it is all about tasks. Let's also consider some other ideas to fill the queue, especially for the short runs. Just my two Cents.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Erich56
Joined: 1 Jan 15
Posts: 372
Credit: 1,685,138,202
RAC: 2,964,912
Message 46331 - Posted: 26 Jan 2017 | 11:20:38 UTC - in response to Message 46318.
Last modified: 26 Jan 2017 | 11:21:11 UTC

The basic idea is not a bad one from various viewpoints; however, my reservations are mainly because of these points brought up by Zoltan:

... If the backup project has long workunits, or the host has a slow internet connection (mainly hosts outside Europe), this reduction could be significant for these hosts (and for the whole project also). This would make some volunteers disable their backup project(s), or manually manage them (which is pretty inconvenient).

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 46336 - Posted: 26 Jan 2017 | 12:14:05 UTC
Last modified: 26 Jan 2017 | 12:14:24 UTC

So perhaps this setting should be different for every host, based on the turnaround time of that host. Every host should start with 1 workunit per GPU, but if the turnaround time is below 48 hours, the maximum is set to 2. When the turnaround time is above 48 hours, the maximum is set to 1.


Zoltan, that is a good proposal, although I am not sure how to implement it technically, overriding the client config.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Message 46346 - Posted: 27 Jan 2017 | 11:38:51 UTC
Last modified: 27 Jan 2017 | 11:39:13 UTC

Thanks Betting Slip for the thread :) Very nicely put.

Hm, the points made by Zoltan are quite valid I think. It's a risk if people have backup projects with long WUs.

It could make sense if it was as simple as switching a button, so when there are fewer WUs than GPUs it forces 1 WU per GPU, otherwise it allows 2. I don't know how feasible it would be to implement such a feature.

Now that I think about it, on the short queue some people might even want to run 2 WU per GPU if they indeed fit. So it would be better if it was enforced on specific queues, not whole project.
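
To make the "button" idea concrete, here is a small sketch of the rule as described above: force 1 WU per GPU on the enforced queues while there are fewer WUs than GPUs, and allow 2 otherwise. The queue labels "long" and "short", the function name and the parameters are all placeholders for illustration; whether the BOINC server can be configured this way is exactly the open question.

```python
from typing import FrozenSet


def per_gpu_limit(
    queue: str,
    available_wus: int,
    active_gpus: int,
    enforced_queues: FrozenSet[str] = frozenset({"long"}),
) -> int:
    """Return the maximum number of WUs a GPU may hold on a given queue."""
    shortage = available_wus < active_gpus
    if queue in enforced_queues and shortage:
        return 1  # work is scarce: spread it over as many GPUs as possible
    return 2      # enough work, or a queue where the limit is not enforced


print(per_gpu_limit("long", available_wus=600, active_gpus=2000))   # 1
print(per_gpu_limit("long", available_wus=5000, active_gpus=2000))  # 2
print(per_gpu_limit("short", available_wus=600, active_gpus=2000))  # 2
```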

So the question becomes quantifying the damage. We could maybe do a short trial and see what the overall effect is.

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46347 - Posted: 27 Jan 2017 | 11:53:08 UTC - in response to Message 46346.
Last modified: 27 Jan 2017 | 11:54:57 UTC

Thanks Betting Slip for the thread :) Very nicely put.

Hm, the points made by Zoltan are quite valid I think. It's a risk if people have backup projects with long WUs.

It could make sense if it was as simple as switching a button, so when there are fewer WUs than GPUs it forces 1 WU per GPU, otherwise it allows 2. I don't know how feasible it would be to implement such a feature.

Now that I think about it, on the short queue some people might even want to run 2 WU per GPU if they indeed fit. So it would be better if it was enforced on specific queues, not whole project.

So the question becomes quantifying the damage. We could maybe do a short trial and see what the overall effect is.


Your comments on the short queue are valid, and I would have put that in my original post had I remembered: do not implement this on the short queue. I was so engrossed in making my case clearly that it slipped my mind.

I think a 4 to 6 week trial would give both you and us a very good idea of the impact on the project, which I personally think would be more positive than negative, and most users running backup projects would adapt their habits.

As I said above, I also think it will be positive when there is lots of work.

Be brave and give it a go. :-)

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 46350 - Posted: 27 Jan 2017 | 13:06:50 UTC

May I put in one more remark: the short run queue is intended for the slower cards, where it doesn't make sense to run 2 concurrent jobs anyway, whereas the fast Maxwell and Pascal cards need 2 tasks for good utilization; otherwise they might be bottlenecked by the CPU.

So, if at all, I would apply the 1 job/GPU rule to the short runs. As a positive side effect, the crunchers with high-end hardware will leave out the short runs to get more GPU utilization by multi-tasking, which leaves more short runs for the crunchers who really need them.

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Message 46353 - Posted: 27 Jan 2017 | 13:54:09 UTC
Last modified: 27 Jan 2017 | 13:54:17 UTC

Sorry, I am not very familiar with BOINC. What is considered a host? One machine or one GPU?
Which of these options would it be? https://boinc.berkeley.edu/trac/wiki/ProjectOptions#Joblimits

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46354 - Posted: 27 Jan 2017 | 14:00:09 UTC - in response to Message 46353.
Last modified: 27 Jan 2017 | 14:08:37 UTC

I would think it would be " Max WU in progress GPU"

Host = 1 machine

Would ask/PM Richard Hazelgrove or Jacob Klein https://gpugrid.net/show_user.php?userid=8048

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 46355 - Posted: 27 Jan 2017 | 15:08:57 UTC

I am not exactly a BOINC expert either ... but can't the BOINC client check the estimated GFLOPS capability of the GPU in use and send it to the server?

https://boinc.berkeley.edu/dev/forum_thread.php?id=10716

If less than 2000 GFLOPS I would send only one short run per GPU. Otherwise grant 1-2 long runs. IF that is at all possible to configure.
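
Written out as code, the idea would look something like the sketch below. The 2000 GFLOPS threshold comes from this post; the function name, the queue labels and the assumption that the server knows a peak GFLOPS figure for each GPU are all hypothetical.

```python
from typing import Tuple


def assign_work(peak_gflops: float, threshold: float = 2000.0) -> Tuple[str, int]:
    """Return (queue, max WUs per GPU) for a GPU of the given speed."""
    if peak_gflops < threshold:
        return ("short", 1)  # slow card: one short run at a time
    return ("long", 2)       # fast card: up to two long runs


print(assign_work(1300.0))  # ('short', 1)
print(assign_work(8800.0))  # ('long', 2)
```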
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 791
Credit: 1,427,385,720
RAC: 1,264,745
Message 46356 - Posted: 27 Jan 2017 | 15:10:48 UTC - in response to Message 46354.

I would think it would be " Max WU in progress GPU"

Host = 1 machine

Would ask/PM Richard Hazelgrove or Jacob Klein https://gpugrid.net/show_user.php?userid=8048

I'm not really an expert on practical server operations (just a volunteer user, like most people here), but I can find my way around well enough to point you to http://boinc.berkeley.edu/trac/wiki/ProjectOptions#Joblimits as a starting point.

You probably need to read through the next section - Job limits (advanced) - as well. I'm guessing that because you operate with GPU limits already, there's probably a pre-existing config_aux.xml file on your server, which might be a better example to follow than the somewhat garbled one given as a template in the trac wiki.

It appears that you can set limits both as totals for the project as a whole, and individually for each application (acemdlong / acemdshort).

As Betting Slip says, a 'host' in BOINC-speak is one computer in its entirety - any number of CPU cores plus any number of GPUs (possibly of multiple technologies). I would guess that '<per_proc/>' shown for the <gpu_limit> in the wiki should be interpreted as 'per GPU' (bearing in mind that some dual-circuit-board GPUs, like the Titan Z, will appear to BOINC as two distinct GPUs), but that's an area where the documentation seems a little hazy. If you can clarify the situation by experimentation, I think I have an account which allows me to edit for clarity.
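
If the 'per GPU' reading of '<per_proc/>' is right, the effect of a GPU job limit on one host could be modelled as below. This is a sketch of that interpretation only, which is a guess as stated above and not confirmed by the documentation; the function and parameter names are made up for the example, and the GPU count is whatever BOINC reports for the host.

```python
def max_gpu_jobs_in_progress(jobs: int, per_proc: bool, gpu_count: int) -> int:
    """Jobs in progress allowed on one host under a GPU limit.

    jobs      -- the number configured in the limit
    per_proc  -- True if the limit is flagged as per processor (per GPU here)
    gpu_count -- GPUs on the host as BOINC sees them
    """
    return jobs * gpu_count if per_proc else jobs


# A limit of 1 per GPU on a host with one dual-GPU card (seen as 2 GPUs).
print(max_gpu_jobs_in_progress(1, per_proc=True, gpu_count=2))   # 2
# The same limit without the per-processor flag caps the whole host.
print(max_gpu_jobs_in_progress(1, per_proc=False, gpu_count=2))  # 1
```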

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46357 - Posted: 27 Jan 2017 | 15:11:56 UTC
Last modified: 27 Jan 2017 | 15:12:52 UTC

If less than 2000 GFLOPS I would send only one short run per GPU. Otherwise grant 1-2 long runs. IF that is at all possible to configure.


That seems pretty in-depth, I'm not sure it's that granular, but it's worth investigating. If not, I think the overarching 1 WU per GPU is the best route, at least for an experimental trial period.

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46358 - Posted: 27 Jan 2017 | 15:23:05 UTC - in response to Message 46356.

Sorry for spelling your name wrong, Richard. I was going to go for the "s" but went with "z" instead. That does explain why I couldn't find you when doing a search; that should have given me a clue, but no...

Thanks for helping.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 791
Credit: 1,427,385,720
RAC: 1,264,745
Message 46359 - Posted: 27 Jan 2017 | 15:34:11 UTC - in response to Message 46358.

Sorry for spelling your name wrong, Richard. I was going to go for the "s" but went with "z" instead. That does explain why I couldn't find you when doing a search; that should have given me a clue, but no...

Thanks for helping.

LOL - no offence taken. I'm used to the 'z' spelling when I dictate my surname over the telephone, but I'm always intrigued when somebody who only knows me from the written word plays it back in the alternate form. I guess the human brain vocalises as you read, and remembers the sounds better than the glyphs when the time comes to use it again?

I'm sure there's a PhD in some form of cognitive psychology in there for someone...

eXaPower
Joined: 25 Sep 13
Posts: 265
Credit: 1,049,530,417
RAC: 1,568,352
Message 46362 - Posted: 27 Jan 2017 | 18:42:04 UTC
Last modified: 27 Jan 2017 | 18:43:06 UTC

Will any server policy that gets implemented include a fix for GPUs of multiple architectures running simultaneously?
For months the ACEMD (CUDA 8.0) Pascal app and the (CUDA 6.5) Kepler / Maxwell app have been unable to work together on the same host.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46363 - Posted: 27 Jan 2017 | 18:48:25 UTC
Last modified: 27 Jan 2017 | 18:49:22 UTC

I am running Kepler and Maxwell together so it must just be that they are different versions of CUDA, so I don't expect this to ever be fixed. But I could be wrong.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,836,565,599
RAC: 343,737
Message 46381 - Posted: 29 Jan 2017 | 14:23:15 UTC - in response to Message 46363.
Last modified: 29 Jan 2017 | 14:27:21 UTC

At present there are not enough tasks to go around, and most/all WUs have high GPU utilization, so the project doesn't benefit (overall) from even high-end GPUs running 2 tasks at a time. Considering the recent upload/server availability issues, it makes even less sense to run more than 1 task per GPU. I would even go so far as to suggest it's time we were running 1 task across multiple GPUs (for those who have them), something that would also benefit those with 2 or more smaller GPUs, assuming NVLink/SLI isn't required. By the way, GPU utilization isn't the be-all and end-all of GPU usage; apps that perform more complex simulations, which require more GDDR and more PCIe bandwidth, are just using the GPU differently, not more or less (one probably comes at the expense of the other).

When available tasks are exclusively low GPU-utilizing tasks, only then could the project benefit from high-end GPUs running 2 tasks simultaneously.

PS. The Pascal app (cuda80) is different from the previous app (cuda65), which can run mixed GPU setups; the latest app is exclusively for Pascals (GTX 1000 series).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Erich56
Joined: 1 Jan 15
Posts: 372
Credit: 1,685,138,202
RAC: 2,964,912
Message 46438 - Posted: 5 Feb 2017 | 17:23:55 UTC - in response to Message 46381.

At present there are not enough tasks to go around, and most/all WUs have high GPU utilization, so the project doesn't benefit (overall) from even high-end GPUs running 2 tasks at a time.

I think though that Betting Slip meant his suggestion in a different way: not talking about 2 tasks running at a time, but rather one task running, with another task downloading already (long time) before the running task gets finished.

However, I am asking whether the policy suggested by him has meanwhile been implemented:
On one of my PCs, when the BOINC manager tried an Update on GPUGRID tasks (i.e. trying to download the next WU) while one WU was in progress, it said "the computer has reached a limit on tasks in progress".

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,044,516,875
RAC: 1,490,618
Message 46439 - Posted: 5 Feb 2017 | 17:33:00 UTC - in response to Message 46438.
Last modified: 5 Feb 2017 | 17:40:19 UTC

At present there are not enough tasks to go around, and most/all WUs have high GPU utilization, so the project doesn't benefit (overall) from even high-end GPUs running 2 tasks at a time.

I think though that Betting Slip meant his suggestion in a different way: not talking about 2 tasks running at a time, but rather one task running, with another task downloading already (long time) before the running task gets finished.

However, I am asking whether the policy suggested by him has meanwhile been implemented:
On one of my PCs, when the BOINC manager tried an Update on GPUGRID tasks (i.e. trying to download the next WU) while one WU was in progress, it said "the computer has reached a limit on tasks in progress".


No, it has not been implemented, as evidenced by one of your computers with a single 750 Ti having 2 long WUs. How does that even begin to make any sense? http://www.gpugrid.net/results.php?hostid=372115

I'm not holding my breath on it being implemented either.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46440 - Posted: 5 Feb 2017 | 17:56:41 UTC - in response to Message 46439.

No, it has not been implemented, as evidenced by one of your computers with a single 750 Ti having 2 long WUs. How does that even begin to make any sense? http://www.gpugrid.net/results.php?hostid=372115

I'm not holding my breath on it being implemented either.

Do you mind aborting at least one of those WUs? My two 1070s and 980ti are idling, and I'm sure many others are like me.

Erich56
Joined: 1 Jan 15
Posts: 372
Credit: 1,685,138,202
RAC: 2,964,912
Message 46443 - Posted: 6 Feb 2017 | 6:25:53 UTC

Frankly, I don't think it's a good idea that crunchers are now starting to bite (not to say: attack) each other because there are WUs waiting at one cruncher's host while the GPU of another cruncher is running idle.
The policy in place so far allows for this, and it's up to GPUGRID to either continue it or to change it.

However, the situation of idle GPUs would of course not happen if there were enough WUs available all the time.
So, crunchers should not blame other crunchers for GPUs in idle status.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Message 46468 - Posted: 8 Feb 2017 | 0:51:40 UTC - in response to Message 46440.

Do you mind aborting at least one of those WUs? My two 1070s and 980ti are idling, and I'm sure many others are like me.
There's no point in aborting a task on a live and active host, as there's nothing that could make sure that this task gets assigned to an idle and active host. As the number of tasks in progress decreases, there are more hosts trying to get work (in vain), including those which never finish the work. Overwhelmed or inactive hosts impede the progress of a chain the most: there's a 5-day deadline, so such a host can cause that much delay in a single step. When there's no work, the chance of getting a previously timed-out task is higher, even tasks which have timed out several times (causing 10, 15, 20... days of delay). For example this one. It spent 10 days in vain before I received it.

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 46469 - Posted: 8 Feb 2017 | 8:53:13 UTC - in response to Message 46443.

Frankly, I don't think it's a good idea that crunchers are now starting to bite (not to say: attack) each other because there are WUs waiting at one cruncher's host while the GPU of another cruncher is running idle.
The policy in place so far allows for this, and it's up to GPUGRID to either continue it or to change it.

However, the situation of idle GPUs would of course not happen if there were enough WUs available all the time.
So, crunchers should not blame other crunchers for GPUs in idle status.


Right, squabbling will get us nowhere.

Well ... it seems that GPUGRID is self-regulating in terms of crunching power, like many things in nature. I try not to take that personally. If there is not enough work, quite a few crunchers will walk away and look for jobs elsewhere. I have already withdrawn my 1070 from GPUGRID, and my remaining 1080 is still idle very often for lack of tasks. Maybe I should withdraw this one as well. That leaves more work for others.

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46471 - Posted: 8 Feb 2017 | 13:07:39 UTC

Zoltan you are very right, there is a risk, but I have been fast enough to claim every aborted task so far. And JoergF, there's no sense in completely withdrawing GPUs from GPUGrid, just have a backup with less priority so it automatically switches to that other work load if there is no work in GPUGrid. This is the only way I can keep my house warm with this intermittent work, and with no babysitting.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Message 46487 - Posted: 10 Feb 2017 | 13:47:41 UTC - in response to Message 46471.
Last modified: 10 Feb 2017 | 13:48:05 UTC

Yes --- Leave GPUGrid attached!

I recommend setting GPUGrid at resource share 100 or higher, and attach to other GPU projects like SETI or Einstein or Asteroids, at 0 resource share, as backup projects, meaning they only get work when your non-backup projects don't have work.

Resource Share info:
http://boinc.berkeley.edu/wiki/Preferences#List_of_project_preferences
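
As a rough mental model of what a 0 resource share does, consider the sketch below. It is a deliberate simplification for illustration, not the BOINC client's actual scheduling algorithm; the function name and the example shares are made up, and the project names are the ones mentioned above.

```python
from typing import Dict, Optional


def pick_project(shares: Dict[str, float], has_work: Dict[str, bool]) -> Optional[str]:
    """Pick a project to fetch work from in this simplified model.

    Projects with a non-zero share are preferred; zero-share (backup)
    projects are used only when no primary project has work.
    """
    primaries = [p for p, s in shares.items() if s > 0 and has_work.get(p, False)]
    if primaries:
        return max(primaries, key=lambda p: shares[p])
    backups = [p for p, s in shares.items() if s == 0 and has_work.get(p, False)]
    return backups[0] if backups else None


shares = {"GPUGrid": 100, "Einstein": 0}
print(pick_project(shares, {"GPUGrid": True, "Einstein": True}))   # GPUGrid
print(pick_project(shares, {"GPUGrid": False, "Einstein": True}))  # Einstein
```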

Honestly, all this griping about "no work" is a bit sickening. You all sure do like to micromanage :) Just attach to multiple projects, set your resource shares correctly, and let BOINC do its job to keep your resources busy!

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 46488 - Posted: 10 Feb 2017 | 18:00:55 UTC

Somehow I managed not to get a single WU from the 600 new WUs released, even though I manually updated all my computers while they were still being claimed.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Message 46489 - Posted: 10 Feb 2017 | 20:34:16 UTC - in response to Message 46488.
Last modified: 10 Feb 2017 | 20:35:22 UTC

Somehow I managed not to get a single WU from the 600 new WUs released, even though I manually updated all my computers while they were still being claimed.
That's regrettable, but it's not unexpected, and I see it as confirmation of the observation I posted earlier:
"there's nothing that could make sure that this task gets assigned to an idle and active host."
BTW, I haven't received any of these workunits either.

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 47926 - Posted: 28 Sep 2017 | 10:05:30 UTC

Has the 1 WU per GPU rule been enforced? It seems so. I have just purchased a new 1080 Ti and have very poor utilization (~70%) with only 1 WU (although the client config is set to 2 WUs in parallel). My 6700K (not OC yet) isn't fast enough to feed the 1080 Ti with data, so this is the outcome. My old 1080 had >90% with 2 WUs even under Windows 10.

What a bother.

Any suggestions ... aside from switching to Folding@Home ... or cooling the 6700K with LN2 and massively overclocking it?
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 47927 - Posted: 28 Sep 2017 | 10:06:10 UTC - in response to Message 47926.
Last modified: 28 Sep 2017 | 10:07:32 UTC

Well.. at present there are not many WU in the Queue anyway... :-(
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,332,297,781
RAC: 5,255,441
Message 47928 - Posted: 28 Sep 2017 | 10:55:27 UTC - in response to Message 47926.

Has the 1 WU per GPU rule been enforced? It seems so. I have just purchased a new 1080 Ti and have very poor utilization (~70%) with only 1 WU (although the client config is set to 2 WUs in parallel). My 6700K (not OC yet) isn't fast enough to feed the 1080 Ti with data, so this is the outcome. My old 1080 had >90% with 2 WUs even under Windows 10.


One WU per GPU has not been enforced; rather, there just aren't enough WUs to go around, so everyone is lucky if they get just one. As for the utilization, if you look at the Server Status link at the top right of the web page, you'll see that they run a fleet of different WUs, all with different utilization. Some will have 90%+ even on Windows, and some 70-80% on any machine and any operating system.

There are ways to improve this. There is a variable called SWAN_SYNC that you can set in Windows' environment variables. You get there by searching for "variable" in the Windows search. Once there, you can click Environment Variables at the bottom right. Under "User variables" for your account, click New and add SWAN_SYNC with a variable value of 1. Restart your computer and you will gain a few % in GPU utilization.
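
After the restart, a quick way to confirm the variable is actually visible to newly started programs is a check like the one below. It only reports whether SWAN_SYNC is set to 1 in the environment of the process that runs it; whether the GPUGrid app picks it up, and how much utilization it gains, is as described above. The helper name is made up for the example.

```python
import os
from typing import Mapping


def swan_sync_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True if SWAN_SYNC is set to "1" in the given environment."""
    return env.get("SWAN_SYNC") == "1"


if swan_sync_enabled():
    print("SWAN_SYNC=1 is visible to this process")
else:
    print("SWAN_SYNC is not set to 1 for this process")
```

Note that a BOINC client installed as a service may run under a different account, in which case a user variable checked this way says nothing about what the client sees.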

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Message 47938 - Posted: 2 Oct 2017 | 20:50:56 UTC

I beg to differ. My app_config.xml is the same as always... but BOINC seems to ignore it.

<app_config>
  <app>
    <name>acemdlong</name>
    <max_concurrent>2</max_concurrent>
    <cpu_usage>1.0</cpu_usage>
    <gpu_usage>0.5</gpu_usage>
  </app>
</app_config>



____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

mesman21
Joined: 16 Apr 09
Posts: 4
Credit: 291,585,309
RAC: 2,069,584
Level
Asn
Message 47939 - Posted: 3 Oct 2017 | 13:13:52 UTC - in response to Message 47938.

Have you tried disabling Hyper Threading (HT)? I've been experimenting and have seen slightly higher utilization when HT is off on my i7-7700k. CPU usage jumps from 13% to 25% per WU, and GPU-Z shows a small gain in usage.

I wish there was a way to force the WU to use a full physical core without turning HT off, but I haven't been able to figure it out. I tried setting <cpu_usage>2.0</cpu_usage> with HT on, but that didn't help, still 13% usage.

My comments apply to Windows 10, cpu usage in Linux seems less, I'm not sure this would help there.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Level
Trp
Message 47940 - Posted: 3 Oct 2017 | 16:39:39 UTC - in response to Message 47939.

I've been experimenting and have seen slightly higher utilization when HT is off on my i7-7700k.
That's because there are only 3 CPU tasks running when HT is off, and 7 when HT is on.
You can gain a little further GPU utilization if you run only 1 CPU task, or no CPU tasks at all.

[when HT is off] CPU usage jumps from 13% to 25% per WU
That's because when HT is on your CPU can handle 8 threads (on the 4 physical cores), so one thread uses 100%/8=12.5% (~13%),
while your CPU can handle 4 threads on the 4 physical cores when HT is off, so one thread uses 100%/4=25%

GPU-Z shows a small gain in usage [when HT is off].
See my first explanation for its reason. (less CPU tasks = higher GPU usage)

I wish there was a way to force the WU to use a full physical core without turning HT off, but I haven't been able to figure it out. I tried setting <cpu_usage>2.0</cpu_usage> with HT on, but that didn't help, still 13% usage.
The <cpu_usage> option tells the BOINC manager how many cores (threads) to assign to the given application; it does not instruct the application to use that many cores (threads). You can use the SWAN_SYNC environment variable to instruct the GPUGrid application to use a full core (thread), but it won't use more than one.

My comments apply to Windows 10, cpu usage in Linux seems less, I'm not sure this would help there.
There's no WDDM in Linux, so GPU usage will be higher under Linux than on Windows 10 (Vista, 7, 8, 8.1).

mesman21
Joined: 16 Apr 09
Posts: 4
Credit: 291,585,309
RAC: 2,069,584
Level
Asn
Message 47941 - Posted: 3 Oct 2017 | 17:35:43 UTC - in response to Message 47940.

less CPU tasks = higher GPU usage

To clarify, are you saying that the only reason that disabling HT is showing an improvement is because it's reducing the number of CPU tasks from other projects? That would imply that the increase in CPU usage from 12.5% (HT) to 25% (HT off) does not in itself improve WU return times.
For example, if no other projects were running, I would still think that more CPU usage is better regardless, but perhaps it is not that simple.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Level
Trp
Message 47942 - Posted: 3 Oct 2017 | 18:31:47 UTC - in response to Message 47941.
Last modified: 3 Oct 2017 | 18:35:56 UTC

less CPU tasks = higher GPU usage
To clarify, are you saying that the only reason that disabling HT is showing an improvement is because it's reducing the number of CPU tasks from other projects?
Yes.

That would imply that the increase in CPU usage from 12.5% (HT) to 25% (HT off) does not in itself improve WU return times.
Yes.

For example, if no other projects were running, I would still think that more CPU usage is better regardless, but perhaps it is not that simple.
It's not that simple, as there are many factors which are reducing the GPU usage.
WDDM has the most impact, and it can't be turned off; to avoid it you have to use Linux or Windows XP (XP works only up to the GTX 980 Ti, so Linux is your only choice for a GTX 1080).
Running more than 1 CPU task usually causes a demonstrable decrease (depending on the task, as these usually work on large data sets that won't fit in the L3 cache of the CPU, so they flood the memory bus, which slows everything down a bit).
Using SWAN_SYNC (in combination with no CPU tasks) could increase the GPU usage on Windows XP by up to ~15%, and by up to ~5% on Windows 10.

BTW you can try it very easily by simply suspending all CPU projects in BOINC manager for a test period.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 28
Credit: 385,548,151
RAC: 503,395
Level
Asp
Message 47945 - Posted: 5 Oct 2017 | 0:25:26 UTC - in response to Message 47942.

Retvari Zoltan wrote:
Using SWAN_SYNC (in combination with no CPU tasks)...

..or just increase priority of GPU tasks.
I wrote earlier in this message how to automate the process of increasing priority of GPU tasks.
I see no reason to abandon computing CPU tasks if there is a quite simple way to minimize their effect (or that of other programs, if it's not a dedicated host for computing) on GPU tasks.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47946 - Posted: 5 Oct 2017 | 11:54:30 UTC - in response to Message 47945.
Last modified: 5 Oct 2017 | 12:07:34 UTC

Aleksey Belkov wrote:
Retvari Zoltan wrote:
Using SWAN_SYNC (in combination with no CPU tasks)...

..or just increase priority of GPU tasks.
I wrote earlier in this message how to automate the process of increasing priority of GPU tasks.
I see no reason to abandon computing CPU tasks if there is a quite simple way to minimize their effect (or that of other programs, if it's not a dedicated host for computing) on GPU tasks.


As people stated in your link...

GPU tasks already run, by default, at a higher process priority (Below Normal process priority = 6) than CPU tasks (Idle process priority = 4). You can inspect the "Base Prio" column in Task Manager, or the "Priority" column in Process Explorer, to confirm. The Windows Task scheduler does a good job of honoring these priorities.

So...

Are you seeing a speedup by hacking the priorities with Process Hacker? And, if so, isn't the real reason you see it that you're bumping priorities above the (Normal priority = 8) level that Windows uses for all your other non-BOINC processes?

BOINC is meant to run tasks without interfering with the normal operations of a device. So, I think our defaults work well to do that, and I think your hacking may work well to achieve more GPU throughput at the expense of potentially slowing down normal operations of the device which now run at a priority lower than your hacked processes.

Regards,
Jacob

mmonnin
Joined: 2 Jul 16
Posts: 47
Credit: 55,952,319
RAC: 1,365,176
Level
Thr
Message 47947 - Posted: 5 Oct 2017 | 14:28:16 UTC - in response to Message 47945.

Retvari Zoltan wrote:
Using SWAN_SYNC (in combination with no CPU tasks)...

..or just increase priority of GPU tasks.
I wrote earlier in this message how to automate the process of increasing priority of GPU tasks.
I see no reason to abandon computing CPU tasks if there is a quite simple way to minimize their effect (or that of other programs, if it's not a dedicated host for computing) on GPU tasks.


Or run #ofCPUThreads minus 2 to leave a full core for GPUGrid if you must. I sure wouldn't castrate CPU production from 7 to 3 tasks for a single GPU task. The 3 would run faster but production most likely would go down overall.

There just need to be more tasks. Plain and simple. There is a Formula BOINC competition right now for the next 3 days, and it's not going to be a competition of who has the most processing power but of who can get the most work. :(

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 28
Credit: 385,548,151
RAC: 503,395
Level
Asp
Message 47948 - Posted: 5 Oct 2017 | 16:54:41 UTC - in response to Message 47946.

Jacob Klein wrote:

Are you seeing a speedup by hacking the priorities with Process Hacker? And, if so, isn't the real reason you see it that you're bumping priorities above the (Normal priority = 8) level that Windows uses for all your other non-BOINC processes?

BOINC is meant to run tasks without interfering with the normal operations of a device. So, I think our defaults work well to do that, and I think your hacking may work well to achieve more GPU throughput at the expense of potentially slowing down normal operations of the device which now run at a priority lower than your hacked processes.

Probably I did not express myself correctly.

The described method ensures that GPU tasks are allocated as many resources as they request, while CPU tasks receive all remaining available resources (after other, higher-priority processes have run).
In my opinion, the SWAN_SYNC method wastes some CPU resources, because they are rigidly assigned to GPU tasks. In previous tests I did not see a significant difference between using SWAN_SYNC and increasing priority, which made me think that GPU tasks don't need that much CPU time. Therefore, it is sufficient to prioritise the execution of GPU tasks over other (not time-critical) processes to improve performance relative to the standard mode (without SWAN_SYNC and without raised priority).
In my experience, when the priority of GPU tasks is increased, I have not noticed any problems with system responsiveness (in my case it isn't a dedicated host) or any other negative effects (unless you try to play demanding 3D games).

Of course, on different systems and in different usage scenarios, the effect can vary greatly.
I suggest you conduct your own tests on a dedicated host or home/work host.
I believe that this method is particularly useful for those who run GPUGRID on home computers.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47949 - Posted: 5 Oct 2017 | 17:02:54 UTC

Thanks for the reply. I'm curious how much of a speedup your process hacking actually gets?

JoergF
Joined: 20 Apr 15
Posts: 207
Credit: 304,023,061
RAC: 1,228,868
Level
Asp
Message 47950 - Posted: 5 Oct 2017 | 22:17:21 UTC

well ..... back to my question and the topic ... is the 1 Work Unit Per GPU rule now set? If so, may I protest. That measure has an enormous impact on my GPU utilization.

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47951 - Posted: 5 Oct 2017 | 22:41:16 UTC
Last modified: 5 Oct 2017 | 22:41:38 UTC

No, I don't think it is set. What evidence do you have? I have a PC with 2 GPUs and 4 GPUGrid GPU tasks.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Level
Trp
Message 47953 - Posted: 6 Oct 2017 | 8:54:41 UTC - in response to Message 47950.
Last modified: 6 Oct 2017 | 8:55:24 UTC

is the 1 Work Unit Per GPU rule now set?
It's not set. There was a debate about this earlier when there was a shortage, with minimal response from the staff.
I think it would be much better if low-end cards were refused work from the long queue, as giving it to them is predictably futile considering the 5-day deadline.
See this workunit.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Level
Trp
Message 47954 - Posted: 6 Oct 2017 | 10:29:24 UTC - in response to Message 47949.

Thanks for the reply. I'm curious how much of a speedup your process hacking actually gets?
I was also experimenting with process priority and CPU affinity on my various CPUs (i7-870, i7-980x, i7-4770k, i3-4160, i3-4360, i7-4930k), but the GPU performance gain from these tricks was negligible compared to the gain from avoiding WDDM and from running fewer (or no) CPU tasks.
This led me to the conclusion that a single PC cannot excel simultaneously in GPU and CPU performance, so there's no need for a high-end CPU (though it should be state of the art) to maximize high-end GPU performance. It is very hard to resist the temptation of running 12 CPU tasks on a very expensive (6-core+HT) CPU, which will reduce GPU performance; thus it's better to build (at least) two separate PCs: one for CPU tasks and one for GPU tasks. This is the reason for my PCs with i3 CPUs: I'd rather spend more money on better GPUs than on better CPUs.
Regarding RAC (or PPD): the RAC a high-end GPU loses by running CPU tasks simultaneously on the same PC is much bigger than the RAC gained from those CPU tasks, so the overall RAC of the given PC will be lower if it runs CPU and GPU tasks simultaneously.
Of course, if someone can have only one PC, their possibilities for performance optimization are limited, and my advice is not worth applying (because of the huge loss in the user's overall CPU performance).
Sorry for writing the same thing in different ways many times, but my experience is confirmed by the fact that my GTX 980Ti's (driven by i3 CPUs) are competing with GTX 1080Ti's on the performance page. The days of my GTX 980Ti's (running under Windows XP x64) are numbered, as the next generation (Volta) GPUs will wash them away from the top list even with WDDM, but we could still use Linux to avoid WDDM, so my advice will still stand.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47955 - Posted: 6 Oct 2017 | 12:57:20 UTC
Last modified: 6 Oct 2017 | 12:57:44 UTC

Thanks Retvari.

You must understand that your goals of "maximized GPU performance" are radically different than my goal of "help out lots of projects with science, on my regular PC". I'm attached to all projects, do work for about 12 of them, I happily run lots of CPU tasks alongside the GPU tasks, and I use Windows 10 (Insider - Fast Ring - I love to beta test). I do not care at all about credit, but I do care about utility of helping the projects.

I adjust my settings to allow a little bit of a performance boost for GPUGrid.
My settings are:
- Use app_config to budget 0.500 CPU per GPUGrid task
- Main PC: Set BOINC to use (n-1) % CPUs
- Unattended PCs: Set BOINC to use 100% CPUs
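
The (n-1) setting above can also be written to a preferences override file rather than set in the BOINC Manager. A rough sketch, assuming an 8-thread CPU (so 7 of 8 threads, i.e. 87.5%); the file name global_prefs_override.xml and the <max_ncpus_pct> tag come from BOINC's preferences override mechanism, not from this post:

```xml
<!-- global_prefs_override.xml in the BOINC data directory -->
<global_preferences>
  <!-- use at most 7 of 8 logical CPUs: 7 / 8 = 87.5% -->
  <max_ncpus_pct>87.5</max_ncpus_pct>
</global_preferences>
```

On a CPU with a different thread count n, the value would be 100 * (n - 1) / n.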

One of the main reasons I use (n-1) there, also, is because there has always been a Windows 10 shell priority bug where right-clicking taskbar icons gets a delayed response - sometimes several seconds! That piece of shell code must be running at IDLE priority, and stalls if using 100% CPUs! Using (n-1) alleviates it.

Kind regards,
Jacob

mesman21
Joined: 16 Apr 09
Posts: 4
Credit: 291,585,309
RAC: 2,069,584
Level
Asn
Message 47956 - Posted: 6 Oct 2017 | 13:32:48 UTC - in response to Message 47955.

I can understand the different goals and mindsets with all the projects out there. I'd say my goals are more in line with Retvari's: the fastest GPU WU return times possible. With this in mind, I've detached from all CPU projects and have seen improvements. The faster your GPUs are, and the more of them you have in your system, the more this makes sense. An improvement of only 1% in the return times from my pair of GTX 1080s easily provides more RAC than my i7-7700k could by running CPU tasks alone.

Regardless of your goals, Aleksey's advice of running "Process Hacker" can be extremely beneficial for those of us running tasks on a non-dedicated machine. For example, I run Plex media server on the same machine, and my WUs would get decimated whenever Plex started transcoding video. The Plex transcoder ran at "Normal" priority and WUs at "Below Normal". With "Process Hacker" I was able to permanently adjust the Plex transcoder to a lower priority and WUs to a higher priority. Now the WUs hardly slow down at all during transcoding and I've seen no performance reduction in Plex. Thank you!

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47957 - Posted: 6 Oct 2017 | 13:50:23 UTC - in response to Message 47956.
Last modified: 6 Oct 2017 | 13:52:16 UTC

I would encourage you to do some testing, using the following BOINC client options (which you'd use in cc_config.xml), especially <process_priority_special>, which might allow you to do what you want without hacking.

https://boinc.berkeley.edu/wiki/Client_configuration

<no_priority_change>0|1</no_priority_change>

If 1, don't change priority of applications (run them at same priority as client).
NB: This option can, if activated, impact system responsiveness for the user. Default, all CPU science apps run at lowest (idle) priority Nice 15.


<process_priority>N</process_priority>
<process_priority_special>N</process_priority_special>

The OS process priority at which tasks are run. Values are 0 (lowest priority, the default), 1 (below normal), 2 (normal), 3 (above normal), 4 (high) and 5 (real-time - not recommended). 'special' process priority is used for coprocessor (GPU) applications, wrapper applications, and non-compute-intensive applications, 'process priority' for all others. The two options can be used independently. New in 7.6.14
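
As a sketch of how these options might look in cc_config.xml (the values are illustrative assumptions, not recommendations; both tags belong inside <options>):

```xml
<cc_config>
  <options>
    <!-- CPU science apps: 0 = lowest (idle), the default -->
    <process_priority>0</process_priority>
    <!-- GPU, wrapper and non-compute-intensive apps: 2 = normal -->
    <process_priority_special>2</process_priority_special>
  </options>
</cc_config>
```

Per the text above, these options require client version 7.6.14 or later.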


I bet they work for you!!
Try them and let us know :)

mesman21
Joined: 16 Apr 09
Posts: 4
Credit: 291,585,309
RAC: 2,069,584
Level
Asn
Message 47958 - Posted: 6 Oct 2017 | 17:30:01 UTC - in response to Message 47957.

Try them and let us know :)

I tried these changes to cc_config, and they worked for all of the other projects I tried, but not for GPUgrid. I was running CPU tasks on WCG and I run GPU tasks on Einstein when work is low here. I was able to manipulate the priority of each, setting a higher priority for the Einstein GPU tasks. No such luck on GPUgrid tasks; that's why I was so happy to hear about "Process Hacker".

Maybe it's just me, has anyone successfully changed the priority of GPUgrid tasks with cc_config?

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47959 - Posted: 6 Oct 2017 | 17:54:13 UTC - in response to Message 47958.

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.

mmonnin
Joined: 2 Jul 16
Posts: 47
Credit: 55,952,319
RAC: 1,365,176
Level
Thr
Message 47960 - Posted: 6 Oct 2017 | 21:06:55 UTC - in response to Message 47959.

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.


Would swan_sync play a role with this priority option?

mmonnin
Joined: 2 Jul 16
Posts: 47
Credit: 55,952,319
RAC: 1,365,176
Level
Thr
Message 47961 - Posted: 6 Oct 2017 | 21:06:57 UTC - in response to Message 47959.
Last modified: 6 Oct 2017 | 21:07:39 UTC

Edit: Double post :(

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47962 - Posted: 6 Oct 2017 | 21:15:45 UTC - in response to Message 47960.

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.


Would swan_sync play a role with this priority option?


No, I don't think so.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 28
Credit: 385,548,151
RAC: 503,395
Level
Asp
Message 47963 - Posted: 6 Oct 2017 | 22:14:46 UTC - in response to Message 47960.
Last modified: 6 Oct 2017 | 22:15:13 UTC

mmonnin wrote:

Would swan_sync play a role with this priority option?

This combination can lead to significant problems with responsiveness of system.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Level
Trp
Message 47964 - Posted: 7 Oct 2017 | 9:29:52 UTC - in response to Message 47959.
Last modified: 7 Oct 2017 | 10:12:41 UTC

I reproduced your problem - GPUGrid GPU tasks aren't honoring that <process_priority_special>. Other projects' GPU tasks do honor it.

I guess that's another bug to add to GPUGrid's list of brokenness. Dangit.
If I can recall it correctly, this behavior is intentional.
Originally, the GPUGrid app ran at the same process priority level ("Idle") as CPU tasks, but that turned out to hinder GPU performance. Back then <process_priority_special> did not exist (or the staff thought that it wouldn't be used by many participants), so running at the "below normal" priority level was hard-coded into the app.

EDIT: it was the result of "iterating" the optimal process priority level, as when it was hard coded to "above normal" some systems (Core2 Duo and Core2 Quad using SWAN_SYNC) became sluggish back then. While it's not explicitly stated, see this post by GDF (and the whole thread).
EDIT2: See this thread also.

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,150,252,564
RAC: 1,055,558
Level
Met
Message 47965 - Posted: 7 Oct 2017 | 11:47:07 UTC

Well, it should be fixed. If the user is going to use Swan_Sync via manual system variable, they can use manual cc_config to control the priority (if the app isn't rude like it is currently).

Fixable. Some staff required.
