
Message boards : Number crunching : gtx1070 upgrade from 670 and getting too many errors

BeemerBiker
Joined: 31 Oct 08
Posts: 77
Credit: 603,504,478
RAC: 884,485
Message 46205 - Posted: 17 Jan 2017 | 5:08:27 UTC

I thought the problem was SLI but even after removing the SLI cable on this host I am still getting errors.

It seems I even get credit when an error occurs. Why?

I upgraded to driver 376.33 and BOINC Manager 7.6.33 but still got errors. The last valid credit was on 375.95, but that version also had errors.

Of the 10 valid results, every one has an error message:
-------

SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1965.
-------
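
To see how often this message shows up across tasks, the stderr text from each result can be scanned with a small script. The following is only a sketch: the pattern is modeled on the single line quoted above, the function name `count_swan_errors` is made up for illustration, and other SWAN messages may be worded differently.

```python
import re
from collections import Counter

# Pattern modeled on the one error line quoted in this post (an assumption:
# other SWAN messages may use a different layout).
SWAN_FATAL = re.compile(
    r"SWAN : FATAL : Cuda driver error (\d+) in file '([^']+)' in line (\d+)"
)


def count_swan_errors(stderr_text):
    """Return a Counter keyed by (error_code, file, line) for each fatal SWAN line."""
    hits = Counter()
    for match in SWAN_FATAL.finditer(stderr_text):
        code, filename, line = match.groups()
        hits[(int(code), filename, int(line))] += 1
    return hits


if __name__ == "__main__":
    # The sample is the line from this post; paste real stderr text instead.
    sample = (
        "SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' "
        "in line 1965.\n"
    )
    for (code, filename, line), n in count_swan_errors(sample).items():
        print(f"error {code} at {filename}:{line} seen {n} time(s)")
```

If every failing task reports the same code and source line, that points toward one underlying cause rather than several unrelated ones.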

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,641,103,144
RAC: 9,944,131
Message 46206 - Posted: 17 Jan 2017 | 9:46:48 UTC - in response to Message 46205.

It sounds like driver corruption. If a simple system restart does not help, then you should:
1. Download Display Driver Uninstaller
2. Suspend all projects in BOINC Manager
3. Uninstall the NVidia driver through "Programs and Features" (in Windows 10 you can right-click the Start button, and it's at the top of the menu)
4. Restart your PC in Safe Mode: hold Shift and click Restart. When the PC restarts to the "Choose an option" screen, select Troubleshoot -> Advanced options -> Startup Settings -> Restart. After the PC restarts again, you'll see a list of options. Press 4 or F4 for Safe Mode, or 5 or F5 for Safe Mode with Networking if you need the Internet.
5. Start Display Driver Uninstaller, and remove all NVidia drivers
6. Restart your PC in normal mode
7. Install the latest NVidia driver
8. Restart your PC
9. Resume all projects in BOINC Manager
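
After the reinstall, it can help to confirm which driver version each card actually reports before resuming work. A minimal sketch follows, assuming `nvidia-smi` is installed with the driver and reachable on the PATH (on some Windows installs it is not, and the full path has to be given). The helper names `parse_gpu_rows` and `query_gpus` are made up for this example.

```python
import subprocess


def parse_gpu_rows(csv_text):
    """Parse 'name, driver_version' rows into a list of (name, version) tuples."""
    rows = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so a comma inside the GPU name cannot break parsing.
        name, _, version = line.rpartition(",")
        rows.append((name.strip(), version.strip()))
    return rows


def query_gpus():
    """Ask nvidia-smi for each GPU's name and driver version."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_gpu_rows(out)


if __name__ == "__main__":
    for name, version in query_gpus():
        print(f"{name}: driver {version}")
```

If the version printed is not the one just installed, the clean install probably did not take and the steps above are worth repeating.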

BeemerBiker
Joined: 31 Oct 08
Posts: 77
Credit: 603,504,478
RAC: 884,485
Message 46207 - Posted: 17 Jan 2017 | 13:57:54 UTC - in response to Message 46206.

Going to run one board at a time. Possibly a bad board. I should have tested each board first.

JoergF
Joined: 20 Apr 15
Posts: 206
Credit: 298,111,561
RAC: 1,110,607
Message 46209 - Posted: 17 Jan 2017 | 15:43:35 UTC
Last modified: 17 Jan 2017 | 15:44:47 UTC

As Zoltan already wrote, a clean re-install of the drivers would be essential. Having said this ... and I don't know if it applies here as well ... I have had a lot of trouble with Folding@Home, my GTX 1070 and recent Nvidia drivers. The driver crashed all the time. Rolling back to driver 368.xx solved the problem.

Well, GPUGRID is not F@H so maybe it's comparing apples and oranges ... but worth a try even so.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

PappaLitto
Joined: 21 Mar 16
Posts: 271
Credit: 1,313,601,956
RAC: 5,444,153
Message 46210 - Posted: 17 Jan 2017 | 16:00:14 UTC - in response to Message 46207.

Going to run one board at a time. Possibly a bad board. I should have tested each board first.


Were you running SLI 1070s?

BeemerBiker
Joined: 31 Oct 08
Posts: 77
Credit: 603,504,478
RAC: 884,485
Message 46227 - Posted: 18 Jan 2017 | 17:11:18 UTC - in response to Message 46210.
Last modified: 18 Jan 2017 | 17:11:55 UTC

Yes, I had SLI enabled but I pulled the cable connecting the two boards after reading about the problem. Maybe I had to go to the control panel and make additional changes before pulling the cable and rebooting??

Anyway, I completed and validated two WUs and have just swapped boards to check out the other 1070.

I also had a problem with Einstein on the 1070s running that older driver, and they suggested upgrading to the latest.

This system, an EVGA 132-YW-E178-FTW, has two PCIe x16 "ver 2" slots (slots 1 and 3) and one PCIe x16 "ver 1" slot (slot 2), according to the documentation. The 1070 is a "ver 3" card but seems to work OK in the "ver 2" slots, and supposedly the mobo can run 3 boards in SLI. This system will not boot unless the two video boards are in the "ver 2" slots: I get the error message "SLI must be in slots 1 and 3" and the BIOS stays in POST. This occurs whether or not the SLI cable is connected. I do not use SLI. I do a lot of video conversions, and the programs I use require CUDA to perform them.
