Message boards : Number crunching : 2nd Card not used by GPUGrid

PappaLitto
Message 48272 - Posted: 3 Dec 2017 | 1:38:41 UTC

I have added app_config.xml with the use all GPUs command and in the event log it doesn't say (not used) next to the 2nd GPU. Some projects on this computer use both GPUs and some only use 1. This is the strangest problem I've had in a while. Anyone have any idea how to fix this?

Zalster
Message 48273 - Posted: 3 Dec 2017 | 2:27:46 UTC - in response to Message 48272.

You need to create a cc_config.xml and place it in the BOINC data folder, not the GPUGrid project folder.

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>


Z

PappaLitto
Message 48274 - Posted: 3 Dec 2017 | 2:54:04 UTC

I'm sorry, I meant cc_config.xml. I already have this set up, and BOINC doesn't say (not used) next to either GPU; below that it also says "use all coprocessors". I don't know what's going on here.

12/2/2017 8:22:45 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 1080 Ti (driver version 388.13, CUDA version 9.1, compute capability 6.1, 4096MB, 3550MB available, 11974 GFLOPS peak)
12/2/2017 8:22:45 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 1080 Ti (driver version 388.13, CUDA version 9.1, compute capability 6.1, 4096MB, 3550MB available, 11974 GFLOPS peak)
12/2/2017 8:22:45 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 1080 Ti (driver version 388.13, device version OpenCL 1.2 CUDA, 11264MB, 3550MB available, 11974 GFLOPS peak)
12/2/2017 8:22:45 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 1080 Ti (driver version 388.13, device version OpenCL 1.2 CUDA, 11264MB, 3550MB available, 11974 GFLOPS peak)
12/2/2017 8:22:45 PM | | Host name: DESKTOP-L0VCIE2
12/2/2017 8:22:45 PM | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz [Family 6 Model 94 Stepping 3]
12/2/2017 8:22:45 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx tm2 pbe fsgsbase bmi1 hle smep bmi2
12/2/2017 8:22:45 PM | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.16299.00)
12/2/2017 8:22:45 PM | | Memory: 15.94 GB physical, 18.32 GB virtual
12/2/2017 8:22:45 PM | | Disk: 118.75 GB total, 93.87 GB free
12/2/2017 8:22:45 PM | | Local time is UTC -5 hours
12/2/2017 8:22:45 PM | | Config: use all coprocessors

Zalster
Message 48275 - Posted: 3 Dec 2017 | 3:04:49 UTC - in response to Message 48274.
Last modified: 3 Dec 2017 | 3:06:01 UTC

When you installed the second card, did you remove the previous driver and do a clean install of the driver downloaded from the NVIDIA website? Sometimes, if you just add the card without reinstalling the driver, BOINC will fail to use the second card. Something to try at least.

Edit..

Both cards are the same model type and vendor?

PappaLitto
Message 48277 - Posted: 3 Dec 2017 | 3:48:52 UTC - in response to Message 48275.

Both cards are the same model type and vendor?

Yes they are the exact same card.

Bedrich Hajek
Message 48278 - Posted: 3 Dec 2017 | 5:25:48 UTC - in response to Message 48277.

Are you running CPU WUs? If so, try suspending them.

If you are running them, what is the resource share of the other projects that supply the CPU-only WUs?

In BOINC Manager, in Advanced view, under Options, Computing preferences, Computing tab, what are your usage limits?




PappaLitto
Message 48279 - Posted: 3 Dec 2017 | 13:48:57 UTC - in response to Message 48278.

I think you found the problem. I've had trouble before running CPU WUs on a system with more than one GPU. For some reason, even with "use at most 100% of the CPUs" and "use at most 100% of the CPU time", it still disables one GPU WU, even with the resource share of GPUGrid at 10000 and Drugdiscovery at 0. Does anyone have any ideas on how to prioritize GPU WUs so they always run no matter what?

mmonnin
Message 48280 - Posted: 3 Dec 2017 | 14:10:28 UTC - in response to Message 48279.
Last modified: 3 Dec 2017 | 15:06:05 UTC

Use an app_config.xml file to specify less than 1 CPU for GPU work. This only tells BOINC Manager how many CPU threads to hold back from CPU work for each GPU task. It doesn't control how much CPU the GPU task really uses.
<app_config>
  <app>
    <name>acemdlong</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>



This will stop BM from holding back GPU work in favor of CPU work. You may just have had too much CPU work to complete before its deadlines, so it stopped GPU work. That's more of a queue management issue.

I would still use something like Process Lasso to keep CPU threads separate from GPU threads, due to GPUGrid's high CPU usage on Windows.

Bedrich Hajek
Message 48281 - Posted: 3 Dec 2017 | 14:14:08 UTC - in response to Message 48279.

The only thing I can think of is to separate your projects into CPU-only and GPU-only ones, then give the GPU-only projects a resource share greater than 1 and the CPU-only projects a resource share of zero.



Jacob Klein
Message 48283 - Posted: 5 Dec 2017 | 3:53:39 UTC

If BOINC thinks that a task is at risk of missing its deadline, it will schedule it first (i.e., prioritize it), even if it is a CPU task.

My suggestion: Find out which projects are giving you CPU tasks that end up in deadline-risk high-priority mode. Then see if the deadlines are too short for them.

If the deadlines are reasonable, then perhaps your cache settings ("Store at least x days" and "Store up to an additional y days") are set way too high, such that you get a ton of work which BOINC thought you could finish in time when it fetched it, but the estimate wasn't perfect, and when it came time to run the tasks they went into deadline-risk high-priority mode.

My suggestion: Try cache settings of x = 0.8 days, and y = 0.5 days.

mmonnin
Message 48285 - Posted: 5 Dec 2017 | 21:32:35 UTC

If BOINC thinks that a task is at risk of missing its deadline, it will schedule it first (i.e., prioritize it), even if it is a CPU task.


^^ That applies if the GPU app is set to use a full CPU thread, counted across all running GPU tasks combined. If 4 GPU tasks are each set to 0.25 CPUs, then 1 CPU thread will not have a CPU task running, and BOINC can pause GPU tasks and run a CPU task there to meet deadlines. Set it so the GPU tasks add up to less than 1 CPU in total and BM will not do that.

GPUGrid on Windows needs basically a full CPU thread, so play with that as you wish.

Jacob Klein
Message 48286 - Posted: 5 Dec 2017 | 21:45:16 UTC - in response to Message 48285.
Last modified: 5 Dec 2017 | 21:46:35 UTC

It applies at all times.

As an example, let's say you have a GPU app that budgets (1 NVIDIA + 0.501 CPU) for each instance. Let's say you have 2 GPUs, and 8 CPUs, with plenty of tasks available, but the CPU ones all have immediate deadlines.

When the BOINC scheduler runs, it:
- Sees the "deadline miss" CPU tasks
- Schedules them first... 1, 2, 3, ..., 8.
- Allows "up to n+1 CPUs", for any GPU tasks... so, allows budgeting for up-to-9 CPUs.
- Will schedule *1* of your GPU tasks, bringing CPU budget up to 8.501
- Will NOT schedule the 2nd GPU, because that would put the budget to 9.002, which is over the 9.000 limit.

I have a few rigs with multiple GPUs, and also some projects, like GPUGrid, set to run multiple tasks per GPU. So, this exact scenario comes up from time to time.

If CPU tasks go into deadline-risk mode, it's possible that a GPU task won't get scheduled as a result, until the CPU task finishes... all because of this budgeting that you can count out yourself.

To work around the problem, check whether the CPU deadlines are too short (a problem that some projects have), or whether your caches are too large (which can result in deadline misses because task completion estimates aren't accurate).

Fun.
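
The counting above can be sketched in a few lines of Python. This is only a toy model of the rule as described in this post (deadline-risk CPU tasks are placed first, then GPU tasks are admitted while the CPU budget stays at or below n+1). The function name `schedule` and its parameters are made up for illustration; this is not BOINC's actual scheduler code.

```python
def schedule(n_cpus, n_urgent_cpu_tasks, n_gpu_tasks, cpu_per_gpu_task):
    """Toy model of the CPU budgeting described in the post (an assumption,
    not BOINC's real implementation).

    Returns (cpu_tasks_run, gpu_tasks_run, cpu_budget_used).
    """
    # Deadline-risk CPU tasks go first, one CPU each, up to the CPU count.
    cpu_tasks_run = min(n_urgent_cpu_tasks, n_cpus)
    budget_used = float(cpu_tasks_run)

    # GPU tasks may take the CPU budget up to n_cpus + 1, but not beyond it.
    budget_limit = n_cpus + 1

    gpu_tasks_run = 0
    for _ in range(n_gpu_tasks):
        if budget_used + cpu_per_gpu_task > budget_limit:
            break  # this GPU task would push the budget over the limit
        budget_used += cpu_per_gpu_task
        gpu_tasks_run += 1

    return cpu_tasks_run, gpu_tasks_run, budget_used


if __name__ == "__main__":
    # 8 CPUs, 8 urgent CPU tasks, 2 GPU tasks at 0.501 CPUs each
    print(schedule(8, 8, 2, 0.501))
    # Same host, but GPU tasks budgeted at 0.5 CPUs each
    print(schedule(8, 8, 2, 0.5))
```

Under this model, 0.501 CPUs per GPU task lets only one GPU task start, while 0.5 CPUs per GPU task lets both start, since 8 + 0.5 + 0.5 lands exactly on the limit of 9 rather than over it.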

mmonnin
Message 48290 - Posted: 7 Dec 2017 | 0:22:00 UTC - in response to Message 48286.
Last modified: 7 Dec 2017 | 0:22:57 UTC

No, it does not apply at ALL times. I said to keep the CPU usage for GPU tasks under 1 CPU thread.


^^ That applies if the GPU app is set to use a full CPU thread, counted across all running GPU tasks combined.


One can easily make BM go into high priority, but that does not mean GPU tasks are suspended for CPU work.

I upped my queue to 10 days on a 3770K that is running 8 CPU threads in high priority AND a GPU task. Until the combined CPU usage for GPU tasks is over 1, high priority will not affect GPU tasks.

https://imgur.com/a/WpQqR
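
Taking the "up to n+1 CPUs" allowance described earlier in the thread at face value, the threshold can be worked out per host: when every CPU is busy with high-priority CPU tasks, the GPU tasks together have about 1 CPU of budget left. The sketch below is a rough illustration only. The helper names are hypothetical, and treating a total of exactly 1 CPU as still fitting is an assumption.

```python
def max_cpu_usage_per_gpu_task(n_gpu_tasks, extra_cpus=1.0):
    """Largest cpu_usage value at which all GPU tasks still fit in the
    extra CPU allowance (hypothetical helper, assumes the n+1 rule)."""
    return extra_cpus / n_gpu_tasks


def all_gpu_tasks_fit(n_gpu_tasks, cpu_usage, extra_cpus=1.0):
    """True if the combined CPU budget of the GPU tasks stays within the
    extra allowance, so none is held back by urgent CPU tasks."""
    return n_gpu_tasks * cpu_usage <= extra_cpus


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "GPU task(s): cpu_usage up to", max_cpu_usage_per_gpu_task(n))
    print(all_gpu_tasks_fit(1, 1.0))    # one GPU task using a full thread
    print(all_gpu_tasks_fit(2, 0.5))    # two GPU tasks at 0.5 each
    print(all_gpu_tasks_fit(2, 0.501))  # two GPU tasks at 0.501 each
```

On these assumptions a single GPU task fits even at a full CPU thread, which is consistent with the 3770K test above, while two GPU tasks need 0.5 or less each.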
