Message boards : News : monitor suspend/resume bug in 295/296 drivers

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 593
Credit: 4,273,184
RAC: 0
Level
Ala
Message 23636 - Posted: 24 Feb 2012 | 21:49:03 UTC
Last modified: 25 Feb 2012 | 9:55:52 UTC

There are some reports of bugs in the latest NVIDIA drivers (failures when the monitor goes to sleep). GPUGRID may not be immune to the bug. If it happens to you, either

* roll back to the previous drivers
* or configure the monitor so that it does not turn off

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,147,372,614
RAC: 1,071,233
Level
Met
Message 23637 - Posted: 24 Feb 2012 | 22:26:00 UTC

The following driver sets are bugged for me:
- 295.73 WHQL
- 295.51 Beta

The last driver set that worked for me was:
- 290.53 Beta

For me, the bug affects all 3 of my GPU projects, most of the time making tasks error out immediately:
- GPUGRID.net
- Einstein@Home
- SETI@Home

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 2,974,144
RAC: 0
Level
Ala
Message 23638 - Posted: 24 Feb 2012 | 23:12:59 UTC - in response to Message 23636.

There are some reports of bugs concerning the latest NVIDIA drivers (failures when monitor goes to sleep). GPUGRID may be immune to the bug, but if it occurs to you, rollback to previous drivers.


I wrote the BOINC version of the GeneferCUDA app over at PrimeGrid, and the diagnostics it's spitting out indicate that the CUDA subsystem is completely unavailable when the 295 drivers put a monitor into sleep mode. As far as I can tell, no CUDA program at all, from any project, or even non-BOINC CUDA programs, will be able to work under these circumstances.

I don't know yet which platforms it affects (Windows/Linux/Mac), and I don't know if OpenCL is affected, but I'd be very surprised if the GPUGRID apps worked.

We're advising people to either use an earlier driver, or make sure they've configured their system to never turn the monitors off.
____________
My lucky number is 75898^524288+1

jjwhalen
Joined: 23 Nov 09
Posts: 29
Credit: 17,591,899
RAC: 0
Level
Pro
Message 23640 - Posted: 25 Feb 2012 | 1:05:51 UTC

Does anyone know if nVIDIA is aware of/working on this issue?

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,147,372,614
RAC: 1,071,233
Level
Met
Message 23642 - Posted: 25 Feb 2012 | 5:16:44 UTC - in response to Message 23640.
Last modified: 25 Feb 2012 | 5:20:23 UTC

nVidia has been informed, but there has been no response.

Claggy reported the issue formally to nVidia, on 2/1/2012, using Ref 120201-000013, as posted here:
http://forums.nvidia.com/index.php?showtopic=223426&view=findpost&p=1374585

I reported the issue in their 295.51 Beta drivers thread, on 2/16/2012, here:
http://forums.nvidia.com/index.php?showtopic=221985&view=findpost&p=1370579

I also reported the issue in their 295.73 WHQL drivers thread, on 2/24/2012, here:
http://forums.nvidia.com/index.php?showtopic=223426&view=findpost&p=1374645

We have not heard any response as of yet.
If you know of a more appropriate way to inform them of the problem, or get them to fix it, you're welcome to try it.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 593
Credit: 4,273,184
RAC: 0
Level
Ala
Message 23643 - Posted: 25 Feb 2012 | 9:51:09 UTC - in response to Message 23642.
Last modified: 25 Feb 2012 | 9:58:55 UTC

Thanks, Michael and Jacob, for the details.

Profile MarkJ
Volunteer moderator
Project tester
Volunteer tester
Joined: 24 Dec 08
Posts: 730
Credit: 189,243,545
RAC: 0
Level
Ile
Message 23645 - Posted: 25 Feb 2012 | 10:48:26 UTC

The message threads over at SETI seem to indicate it's the Windows driver that has the issue. It has been reported by people using a DVI-connected monitor; I'm not sure whether a VGA-connected monitor also has the problem. It depends on the card and on whether they are using a DVI-to-VGA adaptor.
____________
BOINC blog

coldFuSion
Joined: 22 May 10
Posts: 20
Credit: 85,355,427
RAC: 0
Level
Thr
Message 23681 - Posted: 28 Feb 2012 | 1:31:35 UTC - in response to Message 23645.

The message threads over at SETI seem to indicate it's the Windows driver that has the issue. It has been reported by people using a DVI-connected monitor; I'm not sure whether a VGA-connected monitor also has the problem. It depends on the card and on whether they are using a DVI-to-VGA adaptor.


I use the HDMI connector on my GTX 580s, and the issue affected me with both the 295.51 beta and 295.73 WHQL drivers.

I have configured power settings to never turn off the monitor and have since completed 4 tasks in a row successfully.

BDDave
Joined: 29 Jul 10
Posts: 6
Credit: 170,414,723
RAC: 253,488
Level
Ile
Message 23682 - Posted: 28 Feb 2012 | 3:45:32 UTC - in response to Message 23636.

I've rolled back to the previous drivers, thanks. Three days of nothing but errors on MilkyWay, SETI, GPUGRID and Einstein. What a mess!
Get crunchin!
BDDave


Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,147,372,614
RAC: 1,071,233
Level
Met
Message 23798 - Posted: 6 Mar 2012 | 13:21:38 UTC - in response to Message 23642.
Last modified: 6 Mar 2012 | 13:22:55 UTC

I thought I'd chime in with some more information.

If you want to use 295.73 WHQL or 295.51 Beta without CUDA failures:
The workaround for the bug is that you must set the Windows Power Options to "Turn off the display: Never". You may still use a screen saver, and you may still physically turn the monitor off, but you must not let the software power the monitor down... according to my testing.
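
For anyone who would rather script that setting than click through the control panel, here is a minimal sketch in Python. It assumes Windows Vista/7 or later, where the built-in `powercfg` tool accepts `/change monitor-timeout-ac` and `/change monitor-timeout-dc` with a value in minutes (0 meaning never). The two helper names are made up for illustration.

```python
import platform
import subprocess


def display_timeout_commands(minutes=0):
    """Build the powercfg commands that set the display timeout.

    A value of 0 corresponds to "Turn off the display: Never".
    """
    return [
        ["powercfg", "/change", "monitor-timeout-ac", str(minutes)],
        ["powercfg", "/change", "monitor-timeout-dc", str(minutes)],
    ]


def apply_display_timeout(minutes=0):
    """Run the commands on Windows; do nothing on any other system."""
    if platform.system() != "Windows":
        return False
    for command in display_timeout_commands(minutes):
        subprocess.run(command, check=True)
    return True
```

Calling `apply_display_timeout(0)` changes the currently active power plan only. A screen saver and the monitor's own power button are unaffected.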

Also, for anyone trying to reproduce the problem, I have found that the problem occurs when Windows powers off the monitor first, and then BOINC tries to start or resume a CUDA task while the monitor is off. This means that, if you try to reproduce it using tasks that are already running before Windows powers down the monitor, those tasks will not fail. But any tasks that try to start or resume, while the monitor is off, will fail... according to my testing.

Finally, the best news yet: I have been contacted privately by an nVidia employee who was having trouble recreating the problem. I assisted him, and he can now repro on demand (it's easiest to repro with Einstein@Home, and he didn't know that the monitor has to power down before BOINC begins CUDA processing). He will be presenting information to the developers.

I am now going to run 295.73 WHQL with a "Blank" screensaver and "Turn off the display: Never", and try to remember to physically turn the monitor off if I get up for an extended period of time.

Regards,
Jacob Klein

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,834,518,624
RAC: 292,156
Level
His
Message 23806 - Posted: 6 Mar 2012 | 16:44:10 UTC - in response to Message 23798.

Thanks Jacob, I amended a post in the FAQ - Best configurations for GPUGRID thread to reflect your findings.

Good work, should help many.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

kenlo
Joined: 25 Jan 11
Posts: 1
Credit: 682,342,105
RAC: 39
Level
Lys
Message 23931 - Posted: 13 Mar 2012 | 12:59:46 UTC

I rolled back to the 285.62 driver and still no work. What do I do now?

Jacob Klein
Joined: 11 Oct 08
Posts: 1068
Credit: 1,147,372,614
RAC: 1,071,233
Level
Met
Message 23948 - Posted: 14 Mar 2012 | 2:12:28 UTC - in response to Message 23931.

What does "still no work" mean?

Profile MarkJ
Volunteer moderator
Project tester
Volunteer tester
Joined: 24 Dec 08
Posts: 730
Credit: 189,243,545
RAC: 0
Level
Ile
Message 23954 - Posted: 14 Mar 2012 | 9:55:19 UTC

There is a 296.10 WHQL driver out. According to the SETI guys it still has the sleep mode bug.

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 593
Credit: 4,273,184
RAC: 0
Level
Ala
Message 23956 - Posted: 14 Mar 2012 | 11:44:12 UTC - in response to Message 23954.

Did not see anything CUDA-related in the changelog.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 790
Credit: 1,423,423,595
RAC: 1,364,343
Level
Met
Message 23963 - Posted: 14 Mar 2012 | 20:25:39 UTC - in response to Message 23956.

Did not see anything CUDA-related in the changelog.

We couldn't see anything either, though we had a good chuckle over some of them.

A new bug ticket has been raised by a SETI developer and acknowledged by a named NVidia staffer.

Einstein are also now in active engagement with NVidia:
http://einstein.phys.uwm.edu/forum_thread.php?id=9307&nowrap=true#116397

JLConawayII
Joined: 31 May 10
Posts: 48
Credit: 24,931,604
RAC: 0
Level
Pro
Message 23964 - Posted: 14 Mar 2012 | 20:45:06 UTC
Last modified: 14 Mar 2012 | 20:45:23 UTC

The 266.58 drivers are the last ones that seem to be problem-free: no downclocking bug and obviously no sleep mode bug. AFAIK they support everything up through the GTX 580. Unless you have a game or other software that requires the newer drivers, I would suggest rolling back to those. You will have to do a clean install, though, and be absolutely sure that no Nvidia software remains on your system before installing them. Otherwise certain core files will remain and you might still get the same issues.

Profile nenym
Joined: 31 Mar 09
Posts: 136
Credit: 589,129,600
RAC: 270,074
Level
Lys
Message 23965 - Posted: 14 Mar 2012 | 20:55:40 UTC - in response to Message 23964.

266.58 doesn't work well on Ubuntu with Albert&Einstein and DistrRTgen tasks.

Profile Matman
Joined: 3 Oct 10
Posts: 2
Credit: 34,005,977
RAC: 0
Level
Val
Message 23998 - Posted: 16 Mar 2012 | 20:28:40 UTC

I am running the 296.10 NVidia (WHQL) drivers. The screen saver is set never to turn the monitor off or "sleep" the system. GPUGRID tasks yield computation errors immediately. SETI and Einstein are functioning without errors. So what's up?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,834,518,624
RAC: 292,156
Level
His
Message 24001 - Posted: 16 Mar 2012 | 22:41:08 UTC - in response to Message 23998.

Again, avoid using the 295 and 296 drivers.
296.10 fixed nothing and, like 295, has been reported as causing errors on several GPU projects. 296.17 is just for the Win8 Preview, so there is no point updating to that either!
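
As a rough illustration of that advice, here is a small Python sketch that flags a driver version string as belonging to the 295 or 296 branch. The function name is made up, and branch membership is only a proxy: it says nothing about whether a particular build or a particular machine actually hits the bug.

```python
def in_affected_branch(driver_version):
    """Return True if the version string is in the 295 or 296 branch.

    `driver_version` is a string such as "295.73" or "296.10".
    """
    major = int(driver_version.split(".")[0])
    return major in (295, 296)
```

For example, `in_affected_branch("296.10")` is true, while `in_affected_branch("285.62")` is false.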

Betting Slip
Joined: 5 Jan 09
Posts: 589
Credit: 2,039,148,450
RAC: 1,508,450
Level
Phe
Message 24050 - Posted: 20 Mar 2012 | 8:01:00 UTC - in response to Message 23964.

The 266.58 are the last drivers that seem to be problem-free, no downclocking bug and obviously no sleep mode bug.


285.62 doesn't have problems.




Profile Matman
Joined: 3 Oct 10
Posts: 2
Credit: 34,005,977
RAC: 0
Level
Val
Message 24132 - Posted: 24 Mar 2012 | 16:17:16 UTC

OK, so I went back step by step to the 285.72 drivers. All CUDA tasks have performed without errors. I have not been able to test with GPUGRID, as I am waiting for a new WU.

Matman

coldFuSion
Joined: 22 May 10
Posts: 20
Credit: 85,355,427
RAC: 0
Level
Thr
Message 24138 - Posted: 24 Mar 2012 | 21:58:12 UTC

Win 7 64-bit (SP1)
Dual GTX 580s
Driver: 295.73
Power Control Panel -> Turn off the display: Never

Result: no errors

Profile Bob Harris
Joined: 10 Jun 11
Posts: 6
Credit: 70,330,451
RAC: 0
Level
Thr
Message 24200 - Posted: 1 Apr 2012 | 9:56:45 UTC

There are many, many people out there with the 295.73 driver, and they must be causing thousands of errors on the GPUGRID projects.

People in general will be reluctant to roll back drivers to earlier versions, because GPUGRID is not the primary reason to own or use a computer.


GPUGRID is wasting valuable data at this moment, because of many thousands of errors.

Therefore, shouldn't the GPUGRID team do something themselves, instead of asking every member to change drivers?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,834,518,624
RAC: 292,156
Level
His
Message 24201 - Posted: 1 Apr 2012 | 10:22:11 UTC - in response to Message 24200.

Do what?

GPUGrid has to rely on the system to deal with these issues. Users that continuously fail tasks will stop getting tasks.
If you ban users with specific drivers, unless the drivers universally fail, you will be banning users that complete tasks successfully too.

Profile Bob Harris
Joined: 10 Jun 11
Posts: 6
Credit: 70,330,451
RAC: 0
Level
Thr
Message 24202 - Posted: 1 Apr 2012 | 11:59:50 UTC - in response to Message 24201.

Do what?

GPUGrid has to rely on the system to deal with these issues. Users that continuously fail tasks will stop getting tasks.
If you ban users with specific drivers, unless the drivers universally fail, you will be banning users that complete tasks successfully too.



Has anyone investigated why the tasks fail with some drivers?

On a global scale, Nvidia will not change their drivers to pander to a relatively small group.

Therefore, shouldn't GPUGRID be looking at rewriting the code required to undertake the tasks under the newer drivers?

Profile Bob Harris
Joined: 10 Jun 11
Posts: 6
Credit: 70,330,451
RAC: 0
Level
Thr
Message 24203 - Posted: 1 Apr 2012 | 12:07:22 UTC - in response to Message 24202.

Do what?

GPUGrid has to rely on the system to deal with these issues. Users that continuously fail tasks will stop getting tasks.
If you ban users with specific drivers, unless the drivers universally fail, you will be banning users that complete tasks successfully too.



Has anyone investigated why the tasks fail with some drivers?

On a global scale, Nvidia will not change their drivers to pander to a relatively small group.

Therefore, shouldn't GPUGRID be looking at rewriting the code required to undertake the tasks under the newer drivers?



UPDATE:
I have just done a casual check on some of the top performers on GPUGRID, and note that the majority of them are experiencing multiple failures of tasks. Even some of those with 285.xx drivers.

Maybe there is something else wrong here?

I am also active with Seti@home, and, with one unrelated exception, have no failures on those tasks...

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 2,974,144
RAC: 0
Level
Ala
Message 24204 - Posted: 1 Apr 2012 | 12:09:46 UTC - in response to Message 24201.

Do what?


That is a question EVERY project is asking themselves. This is what I did with PrimeGrid's GeneferCUDA application. Ken had previously done something similar with PrimeGrid's other CUDA applications.

If a CUDA API call returns CUDA_ERROR_NO_DEVICE, GeneferCUDA prints a warning to stderr saying that under Windows, using RDP or using the 295/296 Nvidia driver causes the GPU to not work.

Stderr is visible in the BOINC task webpage, so there's a chance the user might read it.

After printing the message, Genefer goes to sleep for 10 minutes. It's still active, and doesn't return to BOINC, but it's not doing anything. This is intentionally tying up the GPU, since no other BOINC task is going to be able to run on the GPU.

After 10 minutes, it tries again. This continues until either the program can run successfully, or one hour elapses. After an hour, Genefer gives up, declares a computation error, and exits.

This approach has two benefits. First, this error is transient and may go away while Genefer is still waiting, if either the RDP session is closed, or the monitor comes out of sleep mode. Second, in the more likely event that the problem doesn't go away, we're only failing one WU per hour instead of several per minute.
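
To make the retry policy concrete, here is a minimal sketch in Python. It is not Genefer's actual code: `probe` is a hypothetical callable standing in for whatever CUDA call reports `CUDA_ERROR_NO_DEVICE`, and the `sleep` and `warn` parameters exist only so the logic can be exercised without waiting an hour. The 10-minute and one-hour figures are the ones described above.

```python
import sys
import time


def wait_for_gpu(probe, retry_seconds=600, give_up_seconds=3600,
                 sleep=time.sleep,
                 warn=lambda msg: print(msg, file=sys.stderr)):
    """Retry `probe` until the GPU is usable or the time budget is spent.

    `probe` (hypothetical) returns True when CUDA can see a device.
    Returns True if the GPU became available, or False if the caller
    should give up and report a computation error.
    """
    waited = 0
    while True:
        if probe():
            return True
        if waited >= give_up_seconds:
            return False
        # The warning goes to stderr so it shows up on the task web page.
        warn("No CUDA device found (RDP session, or 295/296 driver with a "
             "sleeping monitor?); retrying in %d seconds" % retry_seconds)
        sleep(retry_seconds)
        waited += retry_seconds
```

With the defaults, a host where the device never comes back is probed seven times over one hour and then fails a single work unit, instead of failing several per minute.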

This certainly doesn't solve the problem, but it does mitigate its effect on the project somewhat.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 790
Credit: 1,423,423,595
RAC: 1,364,343
Level
Met
Message 24205 - Posted: 1 Apr 2012 | 15:39:45 UTC - in response to Message 24202.

On a global scale, Nvidia will not change their drivers to pander to a relatively small group.

On the contrary, Nvidia told Einstein:

This bug is considered as release critical (show-stopper) for the next NVIDIA driver release that's due in 2-4 weeks. Thus a fix will be available by that time.

We are only a 'relatively small group' if we hide away in our separate corners and try to sort out problems like this 'one project at a time'. There are times when collective action is necessary, and if 'BOINC Central' isn't proactively co-ordinating it, then projects which have adopted the BOINC platform should go and bang on their doors until they do.

dirkmittler
Joined: 13 Mar 12
Posts: 19
Credit: 8,762,173
RAC: 0
Level
Ser
Message 24358 - Posted: 10 Apr 2012 | 19:46:01 UTC

The latest version of the BOINC client software + manager, version 7.0.25, has taken it upon itself not to recognize the GPU anymore if the user has one of those 'incompatible drivers'.

I think that this represents a mistake, because now that I've programmed my own Windows 7 Pro, x64 computer never to switch off the monitor, I am handing in work units successfully again, and I do think it's unrealistic thinking from BOINC, that users will downgrade their drivers, for the sake of BOINC.

I'm using driver version 296.10 successfully now.

Actually, one reason I had for upgrading from my outdated drivers, was my concern that 'the old method' of implementing "PhysX", would have used a discrete "PPU" (Physics Processing Unit) on my graphics card, and I wanted to /make sure/ that since this approach has been abandoned by nVidia, in favor of using the "GPGPU" itself, my own graphics card should also be using the GPGPU.

Especially since the instructions for downgrading, now tell us to remove ALL nVidia software from our computers, this has become a totally infeasible thing for me to do, with PhysX and "CUDA" SDKs all installed and working.

I have to add something to the advice, for how to prevent the monitor from sleeping though. Well enough, one would set the general Power Settings, accessible through Screensaver preferences. But then it can happen that some other process tries to give the command anyway, to put the monitor to sleep, especially since /some of us/ have sundry programs installed.

The stronger setting I would recommend would be (in addition to the standard setting):

Start Menu
Type in "Edit Group Policy" into the search field and hit Enter
Computer Configuration
Administrative Templates
System
Power Management
Video And Display Settings
Turn Off Display (Plugged In AND On Battery)
--> Disabled

What this does, on Windows 7 Pro at least, is take away the privilege to put the monitor to sleep from processes we might not have kept track of.

I think that by simply banning all up-to-date device drivers, the new version of BOINC client software will kill off one major source of contributed work for you. The main reason my own GC did crash at one point, was simply the fact that I had not researched the subject (in the forums), and it's not likely to happen to me again.

Dirk

5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24359 - Posted: 10 Apr 2012 | 20:03:45 UTC - in response to Message 24358.
Last modified: 10 Apr 2012 | 20:53:19 UTC

Sorry to double post, but it seems important:

(not my words)
You'll be happy to know that 301.24 fixes the sleeping monitor bug, although I haven't tried it on PrimeGrid yet, only Einstein & Seti so far,

Claggy (Some user on PG)

First post
I did 7 Setiathome offline benches last night and couldn't get it to fail (but I'm using a different monitor from the one I could get it to fail on with the 295.xx drivers),
Before I upgraded I grabbed some BRP4Cuda work and have done some of it this morning, no problem,
In a little while I'll downgrade to 295.73 and check that I can get offline benches to fail on this monitor.

Second:

I downgraded my i7-2600K/GTX460/HD5770 host to 295.73, ran a setiathome offline bench, proved that the cuda apps do fail with this monitor, then upgraded back up to 301.24,

EDIT: Decided to check for myself using 301.10, and after letting the monitor sleep for a while, I resumed and saw that GPU usage remained steady. Still have to wait for validation from my wingman.

dirkmittler
Joined: 13 Mar 12
Posts: 19
Credit: 8,762,173
RAC: 0
Level
Ser
Message 24363 - Posted: 10 Apr 2012 | 20:52:13 UTC

Doesn't it even seem to matter to you, that BOINC's new client and manager package, is overriding the individual projects' policies on the subject?

Dirk

5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24364 - Posted: 10 Apr 2012 | 20:56:28 UTC
Last modified: 10 Apr 2012 | 20:59:35 UTC

It does, but since many people rely on auto-update or always get the latest driver, this may become a moot point after a while. From what I can tell, only people who know what they're doing didn't use those drivers anyway, and since BOINC blacklisted them it won't matter when NVIDIA releases WHQL. I mean, it prevents failed WUs, and even Einstein quit allowing people to use those drivers (they blacklisted WUs from going to hosts with those drivers anyway, so even if you fixed it yourself it wouldn't work). It should actually help projects in the long run, even if it's rude to the users.

EDIT: If the blacklisting had been left to BOINC, I could have used my 680 on Einstein: even though my driver (301) is higher than their 290 limit, they would have known they could send me WUs. Arrogant, kinda, yeah, but MANY WUs are failing everywhere because of it. Especially here, I do believe.

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 2,974,144
RAC: 0
Level
Ala
Message 24367 - Posted: 10 Apr 2012 | 21:19:03 UTC - in response to Message 24358.
Last modified: 10 Apr 2012 | 21:43:30 UTC

The latest version of the BOINC client software + manager, version 7.0.25, has taken it upon itself not to recognize the GPU anymore if the user has one of those 'incompatible drivers'.


I probably didn't dig deep enough, but where in the release notes does it say this? I couldn't find mention of this.

Assuming it's true, I've got very mixed feelings about it. On the one hand, from the project side of things, this driver bug is a huge pain in the posterior. Thousands and thousands of errors, and I've got WU's over at PrimeGrid that are hitting the "too many tasks" limits because of this.

From a user's perspective, it's not so nice -- but the user has the option of upgrading the driver to 301, downgrading the driver to 285, or reverting BOINC to 6.12.34 after clearing their work queue.

All in all, the benefits probably outweigh the disadvantages.

dirkmittler
Joined: 13 Mar 12
Posts: 19
Credit: 8,762,173
RAC: 0
Level
Ser
Message 24368 - Posted: 10 Apr 2012 | 21:25:40 UTC - in response to Message 24364.
Last modified: 10 Apr 2012 | 21:33:41 UTC

@Michael Goetz:

It's possible that I misread the issues with the newer BOINC Manager and Client.
What they wrote, is that when we install BOINC as a Service, OR in Protected Execution, GPU detection won't work anymore.

http://boinc.berkeley.edu/wiki/Release_Notes#Protected_Application_Execution_.28Service.29_Installation.2C_GPU_detection_and_Windows_XP

I was under the impression that 'as a service' is the opposite of 'in protected execution'.

If in fact they are one and the same thing, then I got it wrong. In that case, BOINC installed 'in User Mode' will still recognize the GPUs (without problem)... If that's so, you might want to make the text just a tad more clear about it. How does it address the malfunctions?


It does, but since many people rely on auto-update or always get the latest driver, this may become a moot point after a while. From what I can tell, only people who know what they're doing didn't use those drivers anyway, and since BOINC blacklisted them it won't matter when NVIDIA releases WHQL.


In my opinion there are two errors here.

1) Updating your graphics driver and system software, is not a trivial task. When I asked Windows Update to do it first, Windows Update updated and left me with an improper install. I could no longer open my nVidia Control Panel from that. So I had to do a manual upgrade afterward, my icons were all displaced and so on...

I don't think that users who simply have their computers on auto-pilot experience that.

2) The other people who chose the 296.10 driver have other things to do with their computers than BOINC Work Units. We only run BOINC on the side. I'm into game development, PhysX, etc.

I'd say that ~BOINC is my screensaver~, but in fact mine is the 3D Text Screensaver, with BOINC running in the background.

You can't convince me to reinstall, and then re-reinstall my graphics drivers.

Dirk

Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 2,974,144
RAC: 0
Level
Ala
Message 24369 - Posted: 10 Apr 2012 | 21:42:39 UTC - in response to Message 24368.

@Michael Goetz:

It's possible that I misread the issues with the newer BOINC Manager and Client.
What they wrote, is that when we install BOINC as a Service, OR in Protected Execution, GPU detection won't work anymore.



I'm pretty sure that's a feature that's been in BOINC for many years now, and certainly isn't driver specific. It definitely was in the 6.12.34 client, and possibly in all of the 6.x.x clients.

5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24370 - Posted: 10 Apr 2012 | 21:45:58 UTC

I'm not trying to convince anyone of anything. My point is that if you are using the WHQL driver for game development and the like (I play games), then you would upgrade to the newest WHQL in order to have NVIDIA's latest software. In this case you would be upgrading to the 300 series whenever the WHQL is released, if I'm not mistaken. This was my point: if you're using the latest currently, then why not upgrade when the newest is released? I personally use the NVIDIA website so I can do a clean install. When I made my comment I was merely saying that if BOINC is currently being run on the side, then I would ASSUME you would want the latest. That being 300, which is a good thing for everyone all around.

dirkmittler
Joined: 13 Mar 12
Posts: 19
Credit: 8,762,173
RAC: 0
Level
Ser
Message 24371 - Posted: 10 Apr 2012 | 21:46:23 UTC - in response to Message 24369.
Last modified: 10 Apr 2012 | 22:07:23 UTC

My apologies. I thought that I was being urged to upgrade to the beta driver, etc. I can upgrade to the 300.xx driver as soon as it becomes WHQL, since my current setup will continue to work for now.

And I did just upgrade my client software to 7.0.25, as requested.

Dirk

5pot
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24372 - Posted: 10 Apr 2012 | 22:27:51 UTC

Quite all right. I've been trying to spread the word to various sites, because as Michael stated, it has been a HUGE problem from the projects' standpoint. MANY, MANY errors have been caused by this monitor sleep bug, and when I said "only the people that know what they're doing don't use it anyways" I should have been more clear. I meant the BOINC ONLY crowd. Since that can be a rather small percentage on some sites, many users who aren't BOINC ONLY (they attach a project and leave it) never check their results, and they just keep producing errors without knowing it.

No need to go to beta if you already prevent monitor from sleeping, but not everyone does this, and this was the problem. People who play games etc. in spare time want/need the latest drivers in order for their system to function properly (whether it's WHQL or not).

All in all, it's great news for the BOINC community as a whole, because now everyone's happy (or will be soon, when the WHQL is released). That goes for both the projects' side (valid WUs) and for users who want the latest and greatest PhysX etc.

As always Happy Crunching

Richard Haselgrove
Joined: 11 Jul 09
Posts: 790
Credit: 1,423,423,595
RAC: 1,364,343
Level
Met
Message 24380 - Posted: 11 Apr 2012 | 13:34:16 UTC - in response to Message 24369.

@Michael Goetz:

It's possible that I misread the issues with the newer BOINC Manager and Client.
What they wrote, is that when we install BOINC as a Service, OR in Protected Execution, GPU detection won't work anymore.

I'm pretty sure that's a feature that's been in BOINC for many years now, and certainly isn't driver specific. It definitely was in the 6.12.34 client, and possibly in all of the 6.x.x clients.

Trying to eliminate some confusion here:

'Service mode' and 'Protected Application Execution' are the same thing.

In Windows Vista and Windows 7 GPUs can NOT be used in Service/PAE mode - in any version of BOINC (it's an OS restriction).

In Windows XP, GPUs CAN be used in Service/PAE mode up to and including BOINC v6.12.34 - but not in the new BOINC v7.0.25
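
For scripting purposes, those three rules can be written down as a small Python lookup. This is only a sketch of the statements above. The function name is made up, and treating every XP client newer than 6.12.34 as unable to use the GPU is an assumption, since only v7.0.25 is named.

```python
def gpu_usable_in_service_mode(windows, boinc_version):
    """Return True if a Service/PAE install of BOINC can use the GPU.

    `windows` is one of "XP", "Vista" or "7"; `boinc_version` is a
    tuple such as (6, 12, 34).
    """
    if windows in ("Vista", "7"):
        return False  # OS restriction, whatever the BOINC version
    if windows == "XP":
        # Assumption: anything after 6.12.34 behaves like 7.0.25.
        return boinc_version <= (6, 12, 34)
    raise ValueError("no rule stated for Windows %r" % windows)
```

None of this applies to a normal user-mode install, which is not covered by these rules.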

candido
Joined: 12 Jun 11
Posts: 8
Credit: 28,200,703
RAC: 0
Level
Val
Message 26322 - Posted: 15 Jul 2012 | 10:45:33 UTC
Last modified: 15 Jul 2012 | 10:49:41 UTC

Toshiba has now released a new display driver for notebooks that have an NVIDIA card installed.
The driver seems to be the 296.31 version.
Has anyone tested this Toshiba driver?
I guess I had better stick with my older version, which has worked without a problem, and wait for a new Toshiba version based on the 300.xx series.

candido
Joined: 12 Jun 11
Posts: 8
Credit: 28,200,703
RAC: 0
Level
Val
Message 29317 - Posted: 4 Apr 2013 | 22:27:52 UTC - in response to Message 26322.
Last modified: 4 Apr 2013 | 22:29:58 UTC

Toshiba has now released a new display driver for notebooks with an NVIDIA card installed.
The driver seems to be the 296.31 version.
Has anyone tested this Toshiba driver?



To answer my own question, in case anyone might have the same problem...
I had to install the latest driver from the manufacturer of my laptop for reasons other than BOINCing, and since I had the driver installed I decided to test with two short WUs (1 Nathan and 1 Noelia).
I replicated the conditions under which WUs failed with 295.xx and 296.xx, as they were explained above by Jacob Klein:

for anyone trying to reproduce the problem, I have found that the problem occurs when Windows powers off the monitor first, and then BOINC tries to start or resume a CUDA task while the monitor is off. This means that, if you try to reproduce it using tasks that are already running before Windows powers down the monitor, those tasks will not fail. But any tasks that try to start or resume, while the monitor is off, will fail... according to my testing.


None of the WUs failed, which might mean that either NVIDIA solved the problem in the 296.31 version, or Toshiba did that for them.
