
Message boards : Graphics cards (GPUs) : ACEMD2 6.05 issues

Author Message
Tom Philippart
Joined: 12 Feb 09
Posts: 57
Credit: 23,376,686
RAC: 0
Message 17433 - Posted: 29 May 2010 | 14:33:48 UTC

Hello,
I have a lot of errors with the new app, here are some examples:

http://www.gpugrid.net/result.php?resultid=2397710

http://www.gpugrid.net/result.php?resultid=2401224

what's wrong?

The same gpu worked without problems on the old version...

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17447 - Posted: 30 May 2010 | 1:31:00 UTC - in response to Message 17433.

Name D228-OTTO_HERGflip656-71-100-RND0730_0
Workunit 1521056
Created 27 May 2010 7:28:05 UTC
Sent 27 May 2010 7:46:43 UTC
Received 28 May 2010 9:16:06 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 255 (0xff)
Computer ID 26214
Report deadline 1 Jun 2010 7:46:43 UTC
Run time 90476.74818
CPU time 4931.781
stderr out

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
The extended attributes are inconsistent. (0xff) - exit code 255 (0xff)
</message>
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce 9600 GT"
# Clock rate: 1.90 GHz
# Total amount of global memory: 536870912 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"
SWAN : FATAL : Failure executing kernel sync [fft_data_swizzle_out] [999]
Assertion failed: 0, file swanlib_nv.cpp, line 121

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>

My first impression was that 1.9GHz was pushing it a bit, but you did make it to 25h!
The error should mean something to the techs.
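
For anyone sifting through several failed results, the fatal line can be pulled out of the stderr text with a short script. This is only a minimal Python sketch, not part of any project tooling: the function name `find_swan_fatal` is made up for illustration, and the pattern assumes nothing beyond the line format shown in the log above.

```python
import re

# Pattern for the fatal line as it appears in the stderr output quoted above:
# "SWAN : FATAL : Failure executing kernel sync [<kernel>] [<code>]"
SWAN_FATAL = re.compile(
    r"SWAN\s*:\s*FATAL\s*:\s*Failure executing kernel sync \[(\w+)\] \[(\d+)\]"
)


def find_swan_fatal(stderr_text):
    """Return (kernel_name, error_code) for the first SWAN fatal line, or None."""
    match = SWAN_FATAL.search(stderr_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


sample = '''MDIO ERROR: cannot open file "restart.coor"
SWAN : FATAL : Failure executing kernel sync [fft_data_swizzle_out] [999]
Assertion failed: 0, file swanlib_nv.cpp, line 121'''

print(find_swan_fatal(sample))  # ('fft_data_swizzle_out', 999)
```

Collecting the kernel name and code from each failed result makes it easier to see whether different machines are hitting the same failure.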

Tom Philippart
Joined: 12 Feb 09
Posts: 57
Credit: 23,376,686
RAC: 0
Message 17453 - Posted: 30 May 2010 | 8:26:19 UTC

I installed a newer driver and now it seems to be working! I returned my first successful 6.05 WU in 65000 seconds on the same card!
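
To put the run times in this thread side by side, here is a quick Python conversion from seconds to hours. It is a sketch only: the helper name is just for illustration, and the two figures are the 90476.74818 s run time from the failed result and the roughly 65000 s quoted for the successful one.

```python
def seconds_to_hours(seconds):
    """Convert a BOINC run time in seconds to hours."""
    return seconds / 3600.0


failed_run = 90476.74818   # run time of the failed task, from the result page
successful_run = 65000.0   # approximate run time of the first successful 6.05 WU

print(f"failed task:     {seconds_to_hours(failed_run):.1f} h")
print(f"successful task: {seconds_to_hours(successful_run):.1f} h")
```

That works out to about 25.1 hours for the failed task against about 18.1 hours for the successful one.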

robertmiles
Joined: 16 Apr 09
Posts: 301
Credit: 48,124,534
RAC: 44,167
Message 17559 - Posted: 10 Jun 2010 | 15:32:31 UTC
Last modified: 10 Jun 2010 | 15:46:14 UTC

Last night, I had an error with this application; one NOT properly reported to boinc.exe so it could go on to another workunit. It gave a Windows error message instead.


acemd2_6.05_windows_intelx86_cuda has stopped working

Windows can check online for a solution to the problem.

-> Check online for a solution and close the program

-> Close the program


One of the problem details:

Problem Event Name: APPCRASH


http://www.gpugrid.net/result.php?resultid=2482667


I've made no attempts to overclock the CPU or the GPU card. It will be a few days before I can do much about any driver issues.

The Brain QC
Joined: 27 Oct 08
Posts: 27
Credit: 3,211,916
RAC: 0
Message 17560 - Posted: 11 Jun 2010 | 0:11:31 UTC

WUs take a very long time to complete on 64-bit Windows 7, nearly double the time of the 6.06 app under 64-bit Linux for an equivalent amount of points.

Is there a way to make it run faster? (Is the Sync Swan switch useful or not with this new app?)

geraldrube
Joined: 6 Jun 09
Posts: 3
Credit: 39,073,765
RAC: 162
Message 17594 - Posted: 13 Jun 2010 | 15:15:16 UTC - in response to Message 17560.
Last modified: 13 Jun 2010 | 15:18:08 UTC

Could someone tell me why 32-bit Ubuntu with an Nvidia 275 will not crunch? Does it have to be 64-bit? I'm pissed off: it was working with XP, but I DON'T want to go back to Windows. Happy for any help! (marysduby, Phenom II X4 940)

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17597 - Posted: 13 Jun 2010 | 16:39:52 UTC - in response to Message 17594.
Last modified: 13 Jun 2010 | 16:40:05 UTC

There is only a 64bit compiled Linux application.
Why not use Ubuntu 64bit?
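
A quick way to check what an installation reports before switching: the Python sketch below looks at the machine string returned by the standard `platform.machine()` call. The helper name `is_x64` and the set of 64-bit identifiers are assumptions for illustration, not an exhaustive list.

```python
import platform

# Machine strings commonly reported by 64-bit x86 systems (assumed, not exhaustive).
X64_MACHINES = {"x86_64", "amd64"}


def is_x64(machine):
    """Return True if the machine string identifies a 64-bit x86 system."""
    return machine.lower() in X64_MACHINES


if __name__ == "__main__":
    machine = platform.machine()
    print(machine, "-> 64-bit x86" if is_x64(machine) else "-> not 64-bit x86")
```

A 32-bit Ubuntu install typically reports something like `i686` here even on a 64-bit capable CPU, since the value reflects the running kernel rather than the hardware.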

Wiyosaya
Joined: 22 Nov 09
Posts: 82
Credit: 64,725,953
RAC: 78,267
Message 17614 - Posted: 15 Jun 2010 | 0:08:22 UTC - in response to Message 17447.

I had three of these this past weekend.

<core_client_version>6.10.56</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce 8800 GT"
# Clock rate: 1.50 GHz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
SWAN : FATAL : Failure executing kernel sync [mshake_position_kernel_2] [700]
Assertion failed: 0, file swanlib_nv.cpp, line 121

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>

This was unusual for my machine. So far, with 6.05, I have had good results.

Here are links to the other two results:
http://www.gpugrid.net/result.php?resultid=2497691
http://www.gpugrid.net/result.php?resultid=2498741

robertmiles
Joined: 16 Apr 09
Posts: 301
Credit: 48,124,534
RAC: 44,167
Message 17640 - Posted: 16 Jun 2010 | 18:21:38 UTC
Last modified: 16 Jun 2010 | 18:23:33 UTC

acemd2_6.05_windows_intelx86__cuda has stopped working

Windows can check online for a solution to the problem.

-> Check online for a solution and close the program

-> Close the program


6/16/2010 12:44:51 PM GPUGRID Computation for task I36-OTTO_HERGflip656-91-100-RND5225_0 finished
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_1 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_2 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_3 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
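
When there are many such messages, the tasks with missing output files can be listed with a short script. The following is a minimal Python sketch that assumes only the message wording shown in the lines above; the function name `absent_outputs` is hypothetical.

```python
import re

# Matches BOINC message lines of the form quoted above:
# "... Output file <file> for task <task> absent"
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def absent_outputs(log_text):
    """Map each task name to the list of its output files reported absent."""
    missing = {}
    for file_name, task in ABSENT.findall(log_text):
        missing.setdefault(task, []).append(file_name)
    return missing


log = """\
6/16/2010 12:44:51 PM GPUGRID Computation for task I36-OTTO_HERGflip656-91-100-RND5225_0 finished
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_1 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_2 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
6/16/2010 12:44:51 PM GPUGRID Output file I36-OTTO_HERGflip656-91-100-RND5225_0_3 for task I36-OTTO_HERGflip656-91-100-RND5225_0 absent
"""

for task, files in absent_outputs(log).items():
    print(task, "is missing", len(files), "output files")
```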



May have something to do with the rather high total demand for main memory near the time of the problem - near 7 GB out of 8 GB.

Held on to the GPU until I told Windows to close the program.

I've noticed that when the amount of memory in use on my machine exceeds 50% of the total physical memory, 64-bit programs keep on running properly, but 32-bit programs (including one I suspect is responsible for the keyboard communications) tend to slow down drastically or have other problems.

You may find this an adequate reason for providing a 64-bit version of the workunit software, even if this does not increase the speed otherwise.


ACEMD2: GPU molecular dynamics 6.05 (cuda)

Vista Home Premium SP2
waiting for a very long The Lattice Project workunit to finish so I can install the latest Nvidia driver - 193 hours so far out of 320 expected, already 100 hours past the last usable checkpoint.
Intel 4-processor-core CPU
9800 GT GPU

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17642 - Posted: 16 Jun 2010 | 21:44:28 UTC - in response to Message 17640.

I have seen this error on my x64 Vista system too.
Perhaps you could add a spare hard drive and set the Virtual memory to use that space. Using your primary hard drive (system drive) for cache is slower, and the system might think it has crashed (timeout) when it is just trying to read or write to the drive:
- Right click Computer, Properties, Advanced System Settings, Advanced, Performance Settings, Advanced, Virtual Memory, Change, and Uncheck Automatically manage paging file size for all drives, select the other drive to use for cache. I’m guessing 12GB would be the default cache size.

The Lattice Project uses 1.2GB per long task, so it is probably the culprit here!
32bit software, as you say, may work on x64 systems, but there are problems after 3.2/3.4GB is used.
It would be nice to have 64bit apps for x64 Windows operating systems as well as Linux; then again, Linux has no x86 option at all, there is only an x64 app!
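
As rough background for the numbers in this exchange, the arithmetic can be written out in a few lines of Python. This is a sketch under simple assumptions: it only computes the theoretical ceiling of a 32-bit address and the memory-in-use fraction reported earlier in the thread. How much of that ceiling a given 32-bit program can actually use depends on the operating system and on how the program was built, which is not modeled here.

```python
GIB = 2 ** 30  # bytes in one gibibyte

# Upper bound on what a 32-bit pointer can address, regardless of installed RAM.
address_space_32bit = 2 ** 32
print(address_space_32bit / GIB, "GiB addressable with 32-bit pointers")

# Memory pressure reported earlier in the thread: about 7 GB in use out of 8 GB.
in_use_gb, total_gb = 7, 8
print(f"{in_use_gb / total_gb:.1%} of physical memory in use")
```

This gives a 4 GiB ceiling for 32-bit addressing and 87.5% of physical memory in use at the time of the crash.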

Wiyosaya
Joined: 22 Nov 09
Posts: 82
Credit: 64,725,953
RAC: 78,267
Message 17675 - Posted: 20 Jun 2010 | 17:13:00 UTC
Last modified: 20 Jun 2010 | 17:14:07 UTC

Two more sets of bad results this weekend virtually identical to what I received last weekend.

http://www.gpugrid.net/result.php?resultid=2537703

http://www.gpugrid.net/result.php?resultid=2539814

My 8800 has gone from being able to run work units that other cards cannot to not being able to run any work units. :(

Is there a fix for this yet?

Thanks.

Ivailo Bonev
Joined: 14 Jun 10
Posts: 5
Credit: 300,613
RAC: 0
Message 17686 - Posted: 22 Jun 2010 | 10:19:30 UTC - in response to Message 17675.
Last modified: 22 Jun 2010 | 10:21:04 UTC

Turn off power saving features on your computer, like in Display Properties -> Power Options Properties
-> Turn off monitor - Never,
-> Turn off hard disks - Never,
-> System standby - Never.
____________

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Wiyosaya
Joined: 22 Nov 09
Posts: 82
Credit: 64,725,953
RAC: 78,267
Message 17688 - Posted: 23 Jun 2010 | 3:04:14 UTC - in response to Message 17686.

Turn off power saving features on your computer, like in Display Properties -> Power Options Properties
-> Turn off monitor - Never,
-> Turn off hard disks - Never,
-> System standby - Never.

Thanks for the suggestion, however, my computer is already set as such.

I will try disabling the BOINC screen saver this weekend to see if that makes a difference.

Beyond
Joined: 23 Nov 08
Posts: 722
Credit: 624,332,167
RAC: 1,484,470
Message 17690 - Posted: 23 Jun 2010 | 15:38:48 UTC - in response to Message 17688.

I will try disabling the BOINC screen saver this weekend to see if that makes a difference.

The BOINC screensaver consistently crashes GPUGRID on my machines. For some reason it sets itself to on from time to time on a couple of the boxes, and I had to delete the BOINC.scr file in the Windows base directory to get it to stop.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17691 - Posted: 23 Jun 2010 | 21:55:17 UTC - in response to Message 17690.

It is best not to install the Boinc Screen Saver during setup.

Beyond
Joined: 23 Nov 08
Posts: 722
Credit: 624,332,167
RAC: 1,484,470
Message 17692 - Posted: 24 Jun 2010 | 9:07:54 UTC

AFAIK it always gets installed. BOINC setup just asks if you want to use it by default.

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17693 - Posted: 24 Jun 2010 | 13:40:52 UTC - in response to Message 17692.

You are right about that. It can still be selected as the screen saver at a later time. I never select to use it during Boinc setup, but it is still available as a screen saver.

Beyond
Joined: 23 Nov 08
Posts: 722
Credit: 624,332,167
RAC: 1,484,470
Message 17694 - Posted: 24 Jun 2010 | 14:34:01 UTC - in response to Message 17693.

You are right about that. It can still be selected as the screen saver at a later time. I never select to use it during Boinc setup, but it is still available as a screen saver.

The problem on 2 of my Win7 boxes so far is that for some unknown reason every now and then the settings change from "no screensaver" to "BOINC screensaver", then GPUGRID crashes. Deleting the BOINC.scr file fixes the problem. Unfortunately upgrading to a new BOINC version puts it back in...
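
Since the file comes back with every BOINC upgrade, the step could be scripted. Below is a minimal Python sketch, with several assumptions: it renames the file rather than deleting it so the change can be undone, the function name and the `.disabled` suffix are made up for illustration, and the Windows directory is taken from the `SystemRoot` environment variable. Where BOINC.scr actually lives may differ between Windows and BOINC versions, and writing there normally needs administrator rights.

```python
import os


def disable_screensaver(windows_dir, name="BOINC.scr"):
    """Rename <windows_dir>/<name> to <name>.disabled; return the new path or None."""
    source = os.path.join(windows_dir, name)
    if not os.path.isfile(source):
        return None  # nothing to do: the screensaver file is not there
    target = source + ".disabled"
    os.replace(source, target)  # overwrites an older .disabled copy if present
    return target


if __name__ == "__main__":
    # SystemRoot is normally set on Windows; fall back to the current directory.
    result = disable_screensaver(os.environ.get("SystemRoot", "."))
    print("renamed to " + result if result else "no BOINC.scr found")
```

Running it again after each BOINC upgrade would repeat the workaround without hunting for the file by hand.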

skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3429
Credit: 694,067,284
RAC: 1,502,613
Message 17695 - Posted: 24 Jun 2010 | 20:01:11 UTC - in response to Message 17694.

Perhaps Boinc is reading previous installation settings either incorrectly (not the way you want, because of a change in Boinc) or because at some time in the past you installed Boinc with the screen saver enabled. Unless Boinc kept an Installation History file you would be just guessing.

Beyond
Joined: 23 Nov 08
Posts: 722
Credit: 624,332,167
RAC: 1,484,470
Message 17696 - Posted: 24 Jun 2010 | 20:20:19 UTC - in response to Message 17695.

Perhaps Boinc is reading previous installation settings either incorrectly (not the way you want, because of a change in Boinc) or because at some time in the past you installed Boinc with the screen saver enabled. Unless Boinc kept an Installation History file you would be just guessing.

Definitely not the case.
