Message boards : News : Changes to scheduling policy

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38145 - Posted: 29 Sep 2014 | 8:09:51 UTC
Last modified: 4 Oct 2014 | 18:16:43 UTC

Hi all,

In an attempt to rationalise the rules for assigning WUs to crunchers, I've made some changes to the underlying scheduler program. Here are the new rules:



* If you have driver >= 343.00 and sm >= 2.0, you'll get a CUDA 6.5 app

* If you have driver >= 334.21 and < 343.00 and sm >= 2.0, you'll get a CUDA 6.0 app

* If you have driver >= 295.30 and < 334.21 and sm >= 2.0 and < 5.0, you'll get a CUDA 4.2 app

* If you have driver >= 295.30 and sm == 1.3, you'll only get a CUDA 4.2 app
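
The four rules above can be read as a small decision function. What follows is only a sketch of that reading, not the actual scheduler code: the function names `parse_version` and `pick_cuda_app` are made up for illustration, the return labels (`cuda65`, `cuda60`, `cuda42`) follow the way app versions are named elsewhere in this thread, and driver minor versions are assumed to always be written with two digits (as in 343.00 or 334.21).

```python
from typing import Optional, Tuple

Version = Tuple[int, int]


def parse_version(text: str) -> Version:
    """Turn '334.21' into (334, 21) so versions compare without float rounding.

    Assumption: the minor part is always written with two digits (e.g. '343.00').
    """
    major, _, minor = text.partition(".")
    return int(major), int(minor or 0)


def pick_cuda_app(driver: str, sm: str) -> Optional[str]:
    """Hypothetical reading of the posted rules; returns None if no rule matches."""
    d, s = parse_version(driver), parse_version(sm)
    if s >= (2, 0):
        if d >= (343, 0):
            return "cuda65"
        if d >= (334, 21):
            return "cuda60"
        # CUDA 4.2 is only offered below sm 5.0
        if d >= (295, 30) and s < (5, 0):
            return "cuda42"
    elif s == (1, 3) and d >= (295, 30):
        # sm 1.3 cards only ever get the CUDA 4.2 app
        return "cuda42"
    return None


if __name__ == "__main__":
    print(pick_cuda_app("344.11", "3.0"))  # driver and sm of a GTX 690 as reported in this thread
```

Read this way, a host whose driver is older than 334.21 should never be offered a CUDA 6.0 task, and a host whose client does not report a driver version cannot be matched against any rule.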


Matt

Bedrich Hajek
Joined: 28 Mar 09
Posts: 397
Credit: 5,328,418,628
RAC: 2,019,184
Message 38151 - Posted: 29 Sep 2014 | 10:22:14 UTC - in response to Message 38145.

I have a GTX690 card and driver 344.11 cuda 6.5, and I am getting nothing.


9/29/2014 6:23:27 AM | GPUGRID | Requesting new tasks for NVIDIA
9/29/2014 6:23:31 AM | GPUGRID | Scheduler request completed: got 0 new tasks
9/29/2014 6:23:31 AM | GPUGRID | No tasks sent
9/29/2014 6:23:31 AM | GPUGRID | No tasks are available for ACEMD beta version
9/29/2014 6:23:31 AM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
9/29/2014 6:23:31 AM | GPUGRID | Tasks for CPU are available, but your preferences are set to not accept them



Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38155 - Posted: 29 Sep 2014 | 11:00:05 UTC - in response to Message 38151.

There was a tiny problem. You should be getting something now.

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 38165 - Posted: 29 Sep 2014 | 13:07:13 UTC

Not everything works correctly, if I may say so. My driver (331.00) is probably too old, but the newer one is a bit slower for my 780Tis on Win7.
So I could only do cuda 42 and 55. Now I get cuda 60 tasks, which I should not get if I read the rules correctly, and they all error out without starting to run.
According to the new scheduling I should not get these tasks?
____________
Greetings from TJ

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,082,528
RAC: 77,103
Message 38167 - Posted: 29 Sep 2014 | 13:24:27 UTC

I hope you're still able to feed my rig's 3 GPUs, both now and in the future:

Current setup:
GTX 660 Ti
GTX 660 Ti
GTX 460

Setup in 2 years:
GTX 980 Ti
GTX 660 Ti
GTX 660 Ti

:)

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38168 - Posted: 29 Sep 2014 | 13:29:44 UTC - in response to Message 38165.

Ok, looks like your old driver is incorrectly reporting CUDA 6 capability.
Is there no newer driver you are prepared to update to? 331 is touching a year old now.

Matt

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 38170 - Posted: 29 Sep 2014 | 13:47:46 UTC - in response to Message 38168.

Ok, looks like your old driver is incorrectly reporting CUDA 6 capability.
Is there no newer driver you are prepared to update to? 331 is touching a year old now.

Matt

Thanks Matt, yes, I have already downloaded the latest version and will update once the queue is empty. I know that is not necessary, but that is the way I always do it, and that way I don't lose any work, having a high electricity bill.
____________
Greetings from TJ

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2185
Credit: 15,844,028,804
RAC: 352,592
Message 38218 - Posted: 1 Oct 2014 | 8:52:05 UTC
Last modified: 1 Oct 2014 | 8:52:48 UTC

The scheduler is still doing its job in a strange manner.
I've set my host with two GTX 780Ti to receive short runs also. It has received 4 short units, 3 of them were CUDA6.5, the 4th was CUDA6.0.

Profile [AF>Amis des Lapins]CeDri...
Joined: 12 Sep 14
Posts: 5
Credit: 5,705,494
RAC: 4
Message 38250 - Posted: 2 Oct 2014 | 9:45:12 UTC
Last modified: 2 Oct 2014 | 10:24:06 UTC

Hi all

I have an Asus [2] NVIDIA GeForce GTX 295 (896MB), driver: 340.52, and I am getting nothing cuda60. Short runs (2-3 hours on fastest card) v8.41 (cuda60)
it worked before!

mighty-atom-PC

1799 GPUGRID 02/10/2014 12:16:34 No tasks are available for CPU only app
1798 GPUGRID 02/10/2014 12:16:34 No tasks are available for ACEMD beta version
1797 GPUGRID 02/10/2014 12:16:34 No tasks are available for Short runs (2-3 hours on fastest card)
1796 GPUGRID 02/10/2014 12:16:34 No tasks sent
1795 GPUGRID 02/10/2014 12:16:34 Scheduler request completed: got 0 new tasks
1794 GPUGRID 02/10/2014 12:16:32 Requesting new tasks for CPU and NVIDIA


Help me
Thank you

mikey
Joined: 2 Jan 09
Posts: 282
Credit: 535,558,191
RAC: 67
Message 38252 - Posted: 2 Oct 2014 | 11:54:59 UTC

My three NVIDIA GeForce GTX 760 (2048MB) driver: 337.88, each in its own machine, are getting 6.0 work and doing just fine.

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38254 - Posted: 2 Oct 2014 | 13:04:17 UTC - in response to Message 38218.

It's as specific as I can make it until I've a CUDA 6.5 app on each queue.

Matt

Profile [AF>Amis des Lapins]CeDri...
Joined: 12 Sep 14
Posts: 5
Credit: 5,705,494
RAC: 4
Message 38255 - Posted: 2 Oct 2014 | 13:07:29 UTC - in response to Message 38254.

I have an Asus [2] NVIDIA GeForce GTX 295 (896MB), driver: 340.52, and I am getting nothing cuda60. Short runs (2-3 hours on fastest card) v8.41 (cuda60)
it worked before!

Please help me

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38257 - Posted: 2 Oct 2014 | 13:12:40 UTC - in response to Message 38255.


it worked before!


Yes, quite so: sm 1.3 is re-enabled for the moment. I'll probably drop this support fairly soon, though. GTX270-295 account for less than 1% of our throughput now, and it's not worth maintaining an application for them.

Matt

Profile Carlesa25
Joined: 13 Nov 10
Posts: 324
Credit: 72,394,453
RAC: 0
Message 38258 - Posted: 2 Oct 2014 | 13:14:15 UTC - in response to Message 38254.

Hi, with my GTX 770 on Ubuntu 14.04, BOINC 7.4.22 and Nvidia 340.46 (cuda 6.5), I receive only 8.21 (cuda60) tasks.

Profile [AF>Amis des Lapins]CeDri...
Joined: 12 Sep 14
Posts: 5
Credit: 5,705,494
RAC: 4
Message 38259 - Posted: 2 Oct 2014 | 13:17:23 UTC - in response to Message 38257.

Ok. Thanks

Have a nice day

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 38260 - Posted: 2 Oct 2014 | 14:22:25 UTC - in response to Message 38258.
Last modified: 2 Oct 2014 | 14:22:55 UTC

Hi, with my GTX 770 on Ubuntu 14.04, BOINC 7.4.22 and Nvidia 340.46 (cuda 6.5), I receive only 8.21 (cuda60) tasks.

Hi Carlesa25, if I have read all the posts correctly, then at the moment cuda65 is only on the beta and short runs.
____________
Greetings from TJ

biodoc
Joined: 26 Aug 08
Posts: 167
Credit: 1,633,077,546
RAC: 15,524
Message 38283 - Posted: 3 Oct 2014 | 23:08:31 UTC

10/3/2014 3:45:28 PM | GPUGRID | Sending scheduler request: To fetch work.
10/3/2014 3:45:28 PM | GPUGRID | Requesting new tasks for NVIDIA GPU
10/3/2014 3:45:31 PM | GPUGRID | Scheduler request completed: got 0 new tasks
10/3/2014 3:45:31 PM | GPUGRID | No tasks sent
10/3/2014 3:45:31 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)


No tasks available for my GTX780Ti on linux with latest drivers (343.22) and boinc version 7.4.22 that reports Nvidia driver version

biodoc
Joined: 26 Aug 08
Posts: 167
Credit: 1,633,077,546
RAC: 15,524
Message 38287 - Posted: 4 Oct 2014 | 9:14:27 UTC - in response to Message 38283.

10/3/2014 3:45:28 PM | GPUGRID | Sending scheduler request: To fetch work.
10/3/2014 3:45:28 PM | GPUGRID | Requesting new tasks for NVIDIA GPU
10/3/2014 3:45:31 PM | GPUGRID | Scheduler request completed: got 0 new tasks
10/3/2014 3:45:31 PM | GPUGRID | No tasks sent
10/3/2014 3:45:31 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)


No tasks available for my GTX780Ti on linux with latest drivers (343.22) and boinc version 7.4.22 that reports Nvidia driver version


Boinc, using linux nvidia driver 343.22, reports cuda version 6.5 for my 780Ti: No work available.

If I roll back my nvidia driver to 337.25, boinc reports cuda 6.0 and now I can download work.

Problem solved for now.

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38289 - Posted: 4 Oct 2014 | 13:25:19 UTC - in response to Message 38287.

There'll be a CUDA 6.5 app for linux later today.

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38290 - Posted: 4 Oct 2014 | 13:50:57 UTC - in response to Message 38287.

biodoc - it's on acemdbeta and short now. Please test it!

Matt

biodoc
Joined: 26 Aug 08
Posts: 167
Credit: 1,633,077,546
RAC: 15,524
Message 38292 - Posted: 4 Oct 2014 | 14:21:28 UTC - in response to Message 38290.

biodoc - it's on acemdbeta and short now. Please test it!

Matt


Ok, will do. Finishing up a windows cuda 6.5 long WU in about an hour.

Thanks!

biodoc
Joined: 26 Aug 08
Posts: 167
Credit: 1,633,077,546
RAC: 15,524
Message 38293 - Posted: 4 Oct 2014 | 14:28:09 UTC - in response to Message 38289.

There'll be a CUDA 6.5 app for linux later today.


Will it work for my 780Ti or is it exclusive for the 980/970?

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38295 - Posted: 4 Oct 2014 | 18:10:43 UTC

I've revised the scheduling policy rules in the original post.

Betting Slip
Joined: 5 Jan 09
Posts: 669
Credit: 2,498,095,550
RAC: 0
Message 38300 - Posted: 5 Oct 2014 | 7:36:25 UTC - in response to Message 38295.

I have a GTX460 which has not had work for over a day (long)

Profile [VENETO] sabayonino
Joined: 4 Apr 10
Posts: 49
Credit: 626,128,537
RAC: 4,007
Message 38318 - Posted: 6 Oct 2014 | 12:27:50 UTC - in response to Message 38300.
Last modified: 6 Oct 2014 | 12:29:52 UTC

Hi

no more WUs (Short and Long)

GTX 750ti - Linux - nv-343.22
GTX 780 - Linux - nv-343.22
GTX 780ti - Linux - nv-343.22
GTX 660ti - Linux - nv-343.22
GTX 760 - Linux - nv-343.22

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38319 - Posted: 6 Oct 2014 | 12:32:37 UTC - in response to Message 38318.

Veneto,

Is your BOINC client new enough to be reporting the driver version number?

Matt

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38322 - Posted: 6 Oct 2014 | 14:46:05 UTC - in response to Message 38318.

sabayonino, I see that you are getting work now.

Matt

Profile [VENETO] sabayonino
Joined: 4 Apr 10
Posts: 49
Credit: 626,128,537
RAC: 4,007
Message 38326 - Posted: 6 Oct 2014 | 17:28:19 UTC - in response to Message 38322.
Last modified: 6 Oct 2014 | 17:29:12 UTC

sabayonino, I see that you are getting work now.

Matt


Now all my gpus are crunching :) (cuda65)

so my boinc client version is 7.2.42 for all hosts with gpu

maybe it was a temporary problem :)

tnx

valterc
Joined: 21 Jun 10
Posts: 20
Credit: 2,844,359,513
RAC: 0
Message 38337 - Posted: 7 Oct 2014 | 10:23:00 UTC - in response to Message 38326.

I have more or less the same issues on this host http://www.gpugrid.net/results.php?hostid=178360

Boinc 7.2.42
Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-36-generic x86_64)

mar 07 ott 2014 11:53:33 CEST | | CUDA: NVIDIA GPU 0: GeForce GTX 780 Ti (driver version unknown, CUDA version 6.5, compute capability 3.5, 3072MB, 2987MB available, 6022 GFLOPS peak)
mar 07 ott 2014 11:53:33 CEST | | OpenCL: NVIDIA GPU 0: GeForce GTX 780 Ti (driver version 343.13, device version OpenCL 1.1 CUDA, 3072MB, 2987MB available, 6022 GFLOPS peak)


I'm able to get short workunits *only*, no matter what I try
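
The two startup lines quoted above differ in one detail that matters for the new rules: the CUDA line says "driver version unknown", while the OpenCL line reports 343.13. A rough way to pull that field out of a BOINC event log is sketched here. The regular expression is written only against the two lines in this post, so other client versions may format the line differently, and the names `GpuLine` and `parse_gpu_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class GpuLine(NamedTuple):
    api: str                # "CUDA" or "OpenCL"
    model: str              # e.g. "GeForce GTX 780 Ti"
    driver: Optional[str]   # None when the client prints "unknown"


# Pattern derived from the two log lines quoted in this post (an assumption,
# not an official description of the BOINC log format).
_GPU_RE = re.compile(
    r"(?P<api>CUDA|OpenCL): NVIDIA GPU \d+: (?P<model>.+?) "
    r"\(driver version (?P<driver>unknown|[\d.]+)"
)


def parse_gpu_line(line: str) -> Optional[GpuLine]:
    """Return the API, card model and driver version found in one log line."""
    m = _GPU_RE.search(line)
    if m is None:
        return None
    driver = m.group("driver")
    return GpuLine(m.group("api"), m.group("model"),
                   None if driver == "unknown" else driver)


if __name__ == "__main__":
    cuda_line = ("CUDA: NVIDIA GPU 0: GeForce GTX 780 Ti (driver version unknown, "
                 "CUDA version 6.5, compute capability 3.5, 3072MB)")
    print(parse_gpu_line(cuda_line))
```

If the CUDA line yields no driver version, that may be the value the scheduler ends up working with, which would fit the question asked earlier in the thread about whether the BOINC client is new enough to report the driver version number.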

Profile [VENETO] sabayonino
Joined: 4 Apr 10
Posts: 49
Credit: 626,128,537
RAC: 4,007
Message 38339 - Posted: 7 Oct 2014 | 11:19:20 UTC - in response to Message 38337.
Last modified: 7 Oct 2014 | 11:21:13 UTC

Hi Valterc


as reported here

only shorts are available

Profile MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38340 - Posted: 7 Oct 2014 | 12:18:24 UTC - in response to Message 38339.

The Linux cuda65 app is on long now.

Matt

Hype
Joined: 21 Nov 11
Posts: 10
Credit: 7,741,968
RAC: 0
Message 38414 - Posted: 11 Oct 2014 | 21:37:27 UTC

Hello,

unfortunately I'm getting computation errors most of the time.
I've got two GTX 570 with 2.5 GB VRAM each, newest driver 344.11.
Doesn't matter if I'm in SLI or not.
Other GPU projects like SETI, Einstein or Asteroids run fine.
Is there anything I can do?

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 2185
Credit: 15,844,028,804
RAC: 352,592
Message 38415 - Posted: 11 Oct 2014 | 22:41:56 UTC - in response to Message 38414.
Last modified: 11 Oct 2014 | 22:45:41 UTC

unfortunately I'm getting computation errors most of the time.

If you take a look into your tasks details, you could see the reason for those errors:
# The simulation has become unstable. Terminating to avoid lock-up (1)

This error is a sign of an unstable GPU. The instability can have various causes:
- Too high GPU temperature (above 80°C - so this is not for you)
- Too low GPU voltage for the given GPU clock
- Too high GPU clock for the given GPU voltage (e.g. an aging GPU might not run stably even at factory settings)
- Too high GDDR5 frequency
- Insufficient, low quality or (nearly) broken PSU
- Too high transient resistance on the PCIe power connectors (usually caused by Molex->PCIe converters), or on the two 12V pins of the 24-pin MB power connector

I've got two GTX 570 with 2.5 GB VRAM each, newest driver 344.11.

This card has twice as many memory chips as a standard GTX570, so perhaps the GPU can't drive the memory data lanes that fast.

Doesn't matter if I'm in SLI or not.

SLI is usually a source of random errors.

Other GPU projects like SETI, Einstein or Asteroids run fine.

Other GPU projects have obsolete GPU applications built on older CUDA versions, while GPUGrid uses the latest (CUDA6.5 at the moment), so other projects don't stress the GPU as much as the GPUGrid client does.
The "GPU usage" measurement is misleading.

Is there anything I can do?

Check all power connectors in your PC for burnt ones.
Lower the GPU clock in 100MHz steps until it becomes stable; if that doesn't work, try again while lowering the GDDR5 frequency in 100MHz steps.
If your GPU becomes stable at some lower clock, you can try raising the GPU clock in 10-20MHz steps for as long as it doesn't cause these "simulation has become unstable" messages; then increase the GPU voltage by 12.5mV and keep raising the clock, as long as the GPU doesn't get hot.
Be aware that different GPUGrid batches stress the GPU differently, so if there's no stability headroom in your settings, some harder workunits could fail.
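
The clock part of this procedure (coarse steps down, then fine steps back up) can be written down as a small search loop. This is only a sketch of the search logic and does not touch any hardware: `find_stable_clock` and the `is_stable` callback are hypothetical names, with `is_stable` standing in for "set this clock by hand, run workunits, and see whether the unstable-simulation error appears". The 300 MHz floor is an arbitrary assumption, and the voltage and GDDR5 steps are left out.

```python
from typing import Callable, Optional


def find_stable_clock(start_mhz: int,
                      is_stable: Callable[[int], bool],
                      coarse_step: int = 100,
                      fine_step: int = 10,
                      min_mhz: int = 300) -> Optional[int]:
    """Return the highest clock found stable, or None if nothing above min_mhz works."""
    clock = start_mhz
    # Phase 1: drop in coarse steps until the card stops failing.
    while not is_stable(clock):
        clock -= coarse_step
        if clock < min_mhz:
            return None
    # Phase 2: creep back up in fine steps, never past the starting clock.
    while clock + fine_step <= start_mhz and is_stable(clock + fine_step):
        clock += fine_step
    return clock


if __name__ == "__main__":
    # Pretend card that is only stable at or below 665 MHz (made-up figure).
    print(find_stable_clock(732, lambda mhz: mhz <= 665))
```

Since different batches stress the GPU differently, a clock found this way is best treated as an upper bound; staying a step or two below it leaves some of the headroom mentioned above.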

Hype
Joined: 21 Nov 11
Posts: 10
Credit: 7,741,968
RAC: 0
Message 38510 - Posted: 14 Oct 2014 | 17:26:35 UTC

Thank you very much for the detailed information.
I checked the system and everything looks fine.
I lowered the clocks from 732 MHz to 650 MHz, but had 3 driver crashes while processing 2 short-run WUs.
However, no computation errors, both completed successfully.

Hype
Joined: 21 Nov 11
Posts: 10
Credit: 7,741,968
RAC: 0
Message 38513 - Posted: 14 Oct 2014 | 18:31:42 UTC

And the next WU crashed again at about 10% :-(
