Message boards : Number crunching : Problems with Nathan's wu

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 22809 - Posted: 24 Dec 2011 | 13:42:39 UTC

Hi,

Most of Nathan's WUs abort after several hours of processing.
Can you have a look at this problem?

All my computers run Windows XP Pro with GTX 295 cards, etc.

Good luck
____________
Ton (ftpd) Netherlands

Carlesa25
Joined: 13 Nov 10
Posts: 316
Credit: 68,878,234
RAC: 29,806
Message 22843 - Posted: 27 Dec 2011 | 12:13:44 UTC - in response to Message 22809.

Hi, I also have problems with these tasks, on Linux Ubuntu 11.10 + GTX 295.

Once the ones I currently have running are done, I will abort any downloads of this type of task; they are a problem for my machine.

If you find or report a solution, it will be appreciated. Greetings.

Carlesa25
Joined: 13 Nov 10
Posts: 316
Credit: 68,878,234
RAC: 29,806
Message 22877 - Posted: 2 Jan 2012 | 23:45:36 UTC - in response to Message 22843.

Hello: Problems continue with these tasks, and it seems that no project manager cares. Any comment about it would be appreciated. Or are we the only ones affected?

Unfortunately, it is becoming impossible to work in GPUGRID. Greetings.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 22883 - Posted: 3 Jan 2012 | 23:41:16 UTC - in response to Message 22877.

Nathan tasks are crunching OK here on Ubuntu 11.10 and a GTX 570. I have crashed a few Nathan tasks lately, but that is due to some testing I am doing on my computer. When I look at the list of results for each WU, it appears many crunchers are having trouble with Nathan tasks.

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 23000 - Posted: 18 Jan 2012 | 18:06:37 UTC - in response to Message 22809.

Still no answer from the team.

This is not the right way to handle it.

I still cancel all Nathan WUs on every machine as soon as they are downloaded.

Can I expect an answer before the end of this month?
____________
Ton (ftpd) Netherlands

nate
Volunteer moderator
Project scientist
Joined: 6 Jun 11
Posts: 123
Credit: 2,928,865
RAC: 0
Message 23002 - Posted: 18 Jan 2012 | 19:11:38 UTC - in response to Message 23000.
Last modified: 18 Jan 2012 | 19:12:03 UTC

Gentlemen, I apologize. I had just left for a two-week personal vacation on Dec 22nd and did not see this thread when I returned. I will try to have some answer by the end of the week. Sorry for the delay.

nenym
Joined: 31 Mar 09
Posts: 125
Credit: 231,550,067
RAC: 518,289
Message 23008 - Posted: 18 Jan 2012 | 23:01:16 UTC

I had the same problem here.
I had to underclock the factory overclock from 800 to 790 (GTX 560 Ti); these tasks run with no problem now.

nenym
Joined: 31 Mar 09
Posts: 125
Credit: 231,550,067
RAC: 518,289
Message 23009 - Posted: 19 Jan 2012 | 3:58:42 UTC - in response to Message 23008.

I had to underclock factory overclock from 800 to 790...
Wrong numbers: it was from 900 to 890.

nate
Volunteer moderator
Project scientist
Joined: 6 Jun 11
Posts: 123
Credit: 2,928,865
RAC: 0
Message 23011 - Posted: 19 Jan 2012 | 13:41:43 UTC - in response to Message 23009.

After a little investigation, here's what I have so far:

1) Looking at failed work units for both ftpd and Carl, "NATHAN" work units are not failing any more than other work units. If you look through failed wu's for your account, you'll see that all types of jobs are failing with similar errors. These wu's are all very different, so if it were a problem with one type we would see some correlation there. We do not.

2) I have shown the errors to the technical members of the team, and their first thought was that overclocking of the cards is the cause, even if it is a factory overclock. nenym also reports that decreasing the clock rate by about 1% solved his problem with failures.

Considering all this, I suggest you attempt to decrease the clock frequency slightly and see if that helps with failed WUs. A 1% or 2% decrease is very small, and the time lost from slower computation will be made up from not spending hours computing for a WU that will fail. I will look at it some more, because I want to be sure I am not missing something. For now, it looks like it's caused by hardware, not a problem with the simulation itself. I will let you know if I find something that indicates otherwise. Please let me know if changing the clock speed helps the problem.

Nate
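
The trade-off described above can be sketched numerically. This is only a rough illustration: the run time and failure rates in the usage lines are hypothetical, the function names are made up for this example, the clock unit is assumed to be MHz, and it assumes run time scales inversely with clock frequency and that a failed WU wastes a full run.

```python
def reduced_clock(clock: float, percent: float) -> float:
    """Return the clock after lowering it by `percent` percent."""
    return clock * (1.0 - percent / 100.0)


def percent_decrease(old_clock: float, new_clock: float) -> float:
    """Return how many percent `new_clock` is below `old_clock`."""
    return (old_clock - new_clock) / old_clock * 100.0


def expected_hours_per_valid_wu(hours_per_wu: float, failure_rate: float) -> float:
    """Expected compute hours spent per successful WU.

    Assumes independent failures and, pessimistically, that each failed
    WU costs a full run, so the expected number of attempts per success
    is 1 / (1 - failure_rate).
    """
    if not 0.0 <= failure_rate < 1.0:
        raise ValueError("failure_rate must be in [0, 1)")
    return hours_per_wu / (1.0 - failure_rate)


if __name__ == "__main__":
    # nenym's change from the thread: 900 -> 890 (unit assumed to be MHz).
    print(f"decrease: {percent_decrease(900, 890):.2f}%")

    # Hypothetical numbers: a 10-hour WU, 20% failures when overclocked,
    # no failures after a 1% clock reduction (run time grows by 1/0.99).
    overclocked = expected_hours_per_valid_wu(10.0, 0.20)
    slowed = expected_hours_per_valid_wu(10.0 / 0.99, 0.0)
    print(f"overclocked: {overclocked:.2f} h per valid WU")
    print(f"1% slower:   {slowed:.2f} h per valid WU")
```

Under these assumptions, even a modest failure rate costs more time than the small slowdown from a lower clock, which is the point of the suggestion.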

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 23076 - Posted: 22 Jan 2012 | 13:19:48 UTC - in response to Message 23011.

Thanks for the explanation.

I have not changed anything on my GTX 295 in five years.

I will try again.
____________
Ton (ftpd) Netherlands
