Message boards : Number crunching : No progress change during calculation?

Sutaru Tsureku
Joined: 1 May 09
Posts: 13
Credit: 3,655,193
RAC: 0
Message 19039 - Posted: 25 Oct 2010 | 16:16:29 UTC
Last modified: 25 Oct 2010 | 16:41:12 UTC

Hello community!


I looked in BOINC and the progress (%) of the GPUGRID WU didn't change. I watched for ~5 mins. The estimated time kept increasing.

I aborted the WU.

Three GPUGRID WUs had completed fine on this GPU before.

Factory-overclocked GTX260-216.
ACEMD2: GPU molecular dynamics v6.05 (cuda)
resultid=3186839

SWAN : FATAL : Failure executing kernel sync [nb_k_nt_p_tp_pme] [700]
Assertion failed: 0, file swanlib_nv.cpp, line 121

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

(I don't know whether this output is just a result of aborting.)

Is it 'normal' that the progress sometimes doesn't change?

Is this an error that can happen sometimes?
What should I do?
Restart BOINC, the PC, ..?
Abort?


Thanks!

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 19043 - Posted: 25 Oct 2010 | 18:02:18 UTC - in response to Message 19039.

That specific task seemed to fail after about 3h, but some of the KKi tasks are unusually long. For a GTX260, 5h to 8h is about right, depending on the task.

Sutaru Tsureku
Message 19045 - Posted: 25 Oct 2010 | 22:34:12 UTC - in response to Message 19043.

The other 'KKi' WUs ran for < 8 hours (~28,000 secs).

When I look in BOINC, I normally see the % progress change every few seconds.
But this time I watched for ~5 mins and the % progress didn't change. The elapsed time increased, and so did the estimated time.
Looking at the wingmen (wuid=2012518), the 1st (9400M) and the 3rd (GTX295) got a computation error.
My result, the 2nd, was 'strange'.
The 4th result will come from a GTX275.

Could it be that this WU is 'buggy'?


skgiven
Message 19050 - Posted: 26 Oct 2010 | 12:47:18 UTC - in response to Message 19045.

Yes, I would say the WU had a problem in this case, rather than your setup.
As it is a KKi task, and these have known issues that occasionally prevent some tasks from running (they fail very early), it is possible that this is one such task, but it managed to run on your system.

Sutaru Tsureku
Message 19134 - Posted: 31 Oct 2010 | 13:38:10 UTC
Last modified: 31 Oct 2010 | 14:15:46 UTC

The old WU in question has now been completed successfully by a Linux GTX275.
Maybe it's a problem with the Windows app?


This morning I again saw a (new) WU with no % progress change: wuid=2028574.
I suspended the calculation in BOINC, waited a few seconds, and after I resumed it the WU in question was immediately marked as a computation error.

What is the problem?
The Windows app?
My system?
The old NVIDIA driver 190.38? S@h and MW@h run fine.

I would like to crunch GPUGRID from time to time, but not if a WU has problems every few tasks..
What would happen if I didn't notice this in BOINC?
Would the WU stay in calculation mode, crunching for days and days in an idle loop and blocking the GPU?
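
A minimal sketch of how a stall like this could be spotted without watching the manager by hand, assuming you already have periodic (elapsed seconds, fraction done) readings for a task from some source. The function name `is_stalled`, the 300-second window (matching the ~5 mins I watched) and the sample readings are all made up for illustration; this does not talk to BOINC itself and is not how the client handles it.

```python
from typing import List, Tuple

# Each sample is (elapsed_seconds, fraction_done), ordered by time.
# How the samples are collected is left open (hypothetical input).
Sample = Tuple[float, float]


def is_stalled(samples: List[Sample],
               window_secs: float = 300.0,
               min_delta: float = 1e-6) -> bool:
    """Return True if progress has not advanced over the last window_secs."""
    if len(samples) < 2:
        return False  # nothing to compare against

    last_time, last_progress = samples[-1]
    cutoff = last_time - window_secs

    # Find the most recent reading that is at least window_secs old.
    baseline = None
    for t, progress in samples:
        if t <= cutoff:
            baseline = progress
        else:
            break

    if baseline is None:
        return False  # not enough history yet to judge

    return (last_progress - baseline) < min_delta


# Illustrative readings only, not taken from a real task.
healthy = [(0, 0.00), (100, 0.01), (200, 0.02), (300, 0.03), (400, 0.04)]
stuck = [(0, 0.10), (60, 0.42), (120, 0.42), (240, 0.42), (420, 0.42)]

print(is_stalled(healthy))  # progress still moving
print(is_stalled(stuck))    # no change for at least 300 seconds
```

A check like this only flags the task; whether to suspend, restart or abort it is still a judgment call, since some long WUs may simply update slowly.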

Sutaru Tsureku
Message 19138 - Posted: 31 Oct 2010 | 14:13:27 UTC - in response to Message 19134.

I see now that both WUs were calculated on GPU #2.
Maybe that GPU has a problem with GPUGRID? Other projects run fine on it.

