Message boards : Number crunching : NOELIA WU

Author Message
Killersocke
Joined: 18 Oct 13
Posts: 52
Credit: 356,843,647
RAC: 463,626
Level
Asp
Message 36977 - Posted: 1 Jun 2014 | 9:13:17 UTC
Last modified: 1 Jun 2014 | 9:14:04 UTC

I got two tasks with errors today.


http://www.gpugrid.net/result.php?resultid=11080227

95-NOELIA_TRP188-0-1-RND5224_3

Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 760
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:01:00.0
# Device clock : 1071MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:42:52 (4752): called boinc_finish

</stderr_txt>
]]>

______________________________________________________________________________

http://www.gpugrid.net/result.php?resultid=11080200

26-NOELIA_TRP215-1-4-RND9753_0

Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 760
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:01:00.0
# Device clock : 1071MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
10:44:56 (6264): called boinc_finish

</stderr_txt>
]]>

GoodFodder
Joined: 4 Oct 12
Posts: 53
Credit: 333,467,496
RAC: 0
Level
Asp
Message 36978 - Posted: 1 Jun 2014 | 10:04:03 UTC
Last modified: 1 Jun 2014 | 10:17:21 UTC

Likewise, the new NOELIA_TRP tasks fail instantly on my GTX 750 Ti (Win XP x86) with the same error; I have received three so far. Good to know it's not just my machine or a Maxwell issue.

ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
01:51:38 (3128): called boinc_finish

Jeremy Zimmerman
Joined: 13 Apr 13
Posts: 61
Credit: 726,605,417
RAC: 0
Level
Lys
Message 36979 - Posted: 1 Jun 2014 | 11:12:46 UTC - in response to Message 36978.

Currently, all of the NOELIA_TRP WUs I have received (four so far) have also failed within a few seconds. All give the same error:

Exit status -98 (0xffffffffffffff9e) Unknown error number
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
02:07:23 (11844): called boinc_finish


Machine (149863) - GTX680
WinXP Driver 335.28 v8.41 (cuda60)
http://www.gpugrid.net/result.php?resultid=11078372
http://www.gpugrid.net/result.php?resultid=11077061

Machine (158717) - GTX680
WinXP Driver 335.28 v8.41 (cuda60)
http://www.gpugrid.net/result.php?resultid=11078799

Machine (165832) - GTX780Ti
Win7 Driver 331.82 v8.41 (cuda42)
http://www.gpugrid.net/result.php?resultid=11070971
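
The two hexadecimal values quoted in this thread, 0xffffff9e and 0xffffffffffffff9e, are the same exit code -98 written as a 32-bit and a 64-bit two's-complement number. A minimal Python sketch to check this; the helper names are made up for illustration and are not part of BOINC or the GPUGRID application:

```python
def to_twos_complement_hex(value: int, bits: int) -> str:
    """Return the two's-complement hex form of value at the given bit width."""
    mask = (1 << bits) - 1
    return hex(value & mask)


def from_twos_complement_hex(text: str, bits: int) -> int:
    """Interpret a hex string as a signed integer of the given bit width."""
    raw = int(text, 16)
    if raw >= 1 << (bits - 1):
        # The sign bit is set, so the value is negative.
        raw -= 1 << bits
    return raw


print(to_twos_complement_hex(-98, 32))
print(to_twos_complement_hex(-98, 64))
print(from_twos_complement_hex("0xffffff9e", 32))
```

So the different hex strings in the reports do not indicate different errors, only a different integer width in whatever printed them.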

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 36980 - Posted: 1 Jun 2014 | 11:54:43 UTC

I guess it's an error in the program. I had one too, and so did all my wing(wo)men.
It failed after just a few seconds, so no worries. We'll have to wait until Noelia makes a change.
____________
Greetings from TJ

Matt
Joined: 11 Jan 13
Posts: 216
Credit: 846,538,252
RAC: 0
Level
Glu
Message 36981 - Posted: 1 Jun 2014 | 14:50:01 UTC
Last modified: 1 Jun 2014 | 14:50:53 UTC

Same here - instant fail. It looks like this WU finally went through after the sixth send-out, however.

http://www.gpugrid.net/workunit.php?wuid=8215418

Matt
Joined: 11 Jan 13
Posts: 216
Credit: 846,538,252
RAC: 0
Level
Glu
Message 36983 - Posted: 1 Jun 2014 | 16:43:24 UTC

...and another

http://www.gpugrid.net/workunit.php?wuid=8215374

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1249
Credit: 3,361,241,474
RAC: 1,522,781
Level
Arg
Message 37000 - Posted: 5 Jun 2014 | 18:57:41 UTC

This problem seems to have been repeated with today's batch of NOELIA_SH2 tasks on the short queue.

Many errors on host 45218

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 37002 - Posted: 6 Jun 2014 | 0:20:33 UTC
Last modified: 6 Jun 2014 | 0:21:13 UTC

NOELIA_SH2 work units have been failing instantly throughout today (Exit status -98 (0xffffffffffffff9e) Unknown error number; ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified).

The majority of wingman hosts have also failed on NOELIA_SH2, prompting the "too many errors (may have bug)" message. All units currently requested by my host show many earlier wingman failures.

The failure is instant regardless of the prefix of the NOELIA_SH2 work unit name, for example argphex1, asvalx7, et al.

boinc127
Joined: 31 Aug 13
Posts: 11
Credit: 7,952,212
RAC: 0
Level
Ser
Message 37003 - Posted: 6 Jun 2014 | 5:26:57 UTC

Hello all,

I'm new to GPUGRID.net (obviously), but I've been using my NVidia card on other projects. From what I've been reading here, the NOELIA WUs have been failing at a fairly substantial rate for some crunchers. I have been having the same difficulties with the NOELIA units: they all fail instantly with exit code -98, ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified. I'm currently using an NVidia GTX 650 Ti Boost with the latest driver (337) from NVidia, trying to crunch the short workunits for now. I'm currently 0 for 5 on GPU tasks. I just thought I would post it here since this appears to be the correct thread for this issue. Hopefully the project administrators can correct the issue so I can get back to crunching!

Thank you all for your time. Happy crunching!

JHMarshall
Joined: 29 Nov 12
Posts: 2
Credit: 505,458,899
RAC: 0
Level
Lys
Message 37004 - Posted: 6 Jun 2014 | 5:53:47 UTC - in response to Message 37003.

I've noticed that with these WUs where I have a failure, all the wingmen that fail are running Windows, as I am. When I see a WU with 5 failures and a success, the success is always on a Linux system. The failures are on Windows systems. Has anybody else seen this pattern?
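
One way to check a pattern like this is to tally task outcomes per operating system. The following is only a rough sketch: it assumes the results for a work unit have been copied by hand into (platform, outcome) pairs, and the sample data is invented to mirror the "5 failures and a success" case rather than taken from any real work unit. The function names are hypothetical.

```python
from collections import Counter

# Hypothetical records for one work unit: (platform, outcome).
results = [
    ("Windows", "error"),
    ("Windows", "error"),
    ("Windows", "error"),
    ("Windows", "error"),
    ("Windows", "error"),
    ("Linux", "success"),
]


def tally_by_platform(records):
    """Count how often each (platform, outcome) pair occurs."""
    counts = Counter()
    for platform, outcome in records:
        counts[(platform, outcome)] += 1
    return counts


def failing_platforms(records):
    """Return platforms that have at least one error and no success."""
    counts = tally_by_platform(records)
    platforms = {platform for platform, _ in records}
    return sorted(
        p for p in platforms
        if counts[(p, "error")] > 0 and counts[(p, "success")] == 0
    )


print(tally_by_platform(results))
print(failing_platforms(results))
```

A platform that shows up in `failing_platforms` across many work units would support the Windows-only theory; a single mixed result would argue against it.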

John C MacAlister
Joined: 17 Feb 13
Posts: 181
Credit: 144,871,276
RAC: 0
Level
Cys
Message 37005 - Posted: 6 Jun 2014 | 11:11:31 UTC
Last modified: 6 Jun 2014 | 11:12:04 UTC

After a long run without problems, I encountered these failures yesterday. :(

Task; Work unit; Computer; Sent; Time reported; Status; Run time (sec); CPU time (sec); Credit; Application
11116386; 8236608; 158482; 6 Jun 2014 | 0:36:38 UTC; 6 Jun 2014 | 0:43:02 UTC; Error while computing; 3.03; 0.14; ---; Short runs (2-3 hours on fastest card) v8.41 (cuda60)
11114622; 8236718; 158482; 6 Jun 2014 | 0:36:38 UTC; 6 Jun 2014 | 0:43:02 UTC; Error while computing; 4.40; 0.16; ---; Short runs (2-3 hours on fastest card) v8.41 (cuda60)
11110452; 8236955; 158482; 6 Jun 2014 | 0:39:13 UTC; 6 Jun 2014 | 0:43:02 UTC; Error while computing; 0.12; 0.12; ---; Short runs (2-3 hours on fastest card) v8.41 (cuda60)
11110140; 8236969; 158482; 6 Jun 2014 | 0:39:13 UTC; 6 Jun 2014 | 0:43:02 UTC; Error while computing; 3.34; 0.14; ---; Short runs (2-3 hours on fastest card) v8.41 (cuda60)

nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 37006 - Posted: 6 Jun 2014 | 12:33:32 UTC

Hey guys,

We're looking into it. Unfortunately Noelia is away right now so we may have to cancel them for now. Thanks for catching it and pointing it out.

Nate

captainjack
Joined: 9 May 13
Posts: 160
Credit: 1,248,690,584
RAC: 103,939
Level
Met
Message 37009 - Posted: 6 Jun 2014 | 14:54:39 UTC

JHMarshall wrote: "I've noticed that with these WUs where I have a failure, all the wingmen that fail are running Windows, as I am. When I see a WU with 5 failures and a success, the success is always on a Linux system. The failures are on Windows systems. Has anybody else seen this pattern?"


I just completed 3 Noelia short tasks on an Ubuntu 14.04 rig and they have validated. On two of the tasks, there were several previous failures on Windows boxes.


[AF>Amis des Lapins] Phil...
Joined: 16 Jul 13
Posts: 56
Credit: 1,623,768,890
RAC: 75
Level
His
Message 37010 - Posted: 6 Jun 2014 | 14:58:42 UTC

Hello!

Have you stopped sending out the short WUs?

Same problem for me last night: 19 short NOELIA WUs ended in error after a few seconds.

Thank You

Kind Regards

Phil1966

ritterm
Joined: 31 Jul 09
Posts: 88
Credit: 244,413,897
RAC: 0
Level
Leu
Message 37012 - Posted: 7 Jun 2014 | 0:42:35 UTC
Last modified: 7 Jun 2014 | 0:45:26 UTC

Work unit argargx8-NOELIA_SH2eq-0-1-RND4849 failed on a mix of Win7 and Linux hosts.
____________

nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 37017 - Posted: 7 Jun 2014 | 10:03:39 UTC

Hello all.

The NOELIA_SH2eq workunits were all failing, so those have been cancelled.

There are some other groups that follow the naming pattern NOELIA_TRPXXX. Those continue to run because, from what I can tell, the ones being sent out now do not have a problem. The first post in this thread (http://www.gpugrid.net/forum_thread.php?id=3770&nowrap=true#36977) refers to a group that was cancelled, fixed and resent earlier in the week. However, if you continue to have problems with those, please let me know here.

Nate

Carlesa25
Joined: 13 Nov 10
Posts: 324
Credit: 72,394,453
RAC: 0
Level
Thr
Message 37019 - Posted: 7 Jun 2014 | 11:04:36 UTC - in response to Message 37017.

Hello: Six (6) units of the NOELIA_SH2eq-0-1-RND2630_3 type, downloaded a few minutes ago (13:00 local time), all failed on Windows 8.1.

On Ubuntu Linux 14.04, the same type of short work units runs without problems.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1249
Credit: 3,361,241,474
RAC: 1,522,781
Level
Arg
Message 37020 - Posted: 7 Jun 2014 | 12:07:29 UTC - in response to Message 37017.

Unfortunately, I'll have to wait for my daily quota to allow another fetch before I can test.

Carlesa25
Joined: 13 Nov 10
Posts: 324
Credit: 72,394,453
RAC: 0
Level
Thr
Message 37021 - Posted: 7 Jun 2014 | 15:54:07 UTC - in response to Message 37019.


Hello: The short NOELIA_SH2eq-0-1-RND2630_3 tasks still fail on Windows.

Dingo
Joined: 1 Nov 07
Posts: 20
Credit: 122,646,317
RAC: 0
Level
Cys
Message 37022 - Posted: 7 Jun 2014 | 16:33:29 UTC

I am getting the same error today when I started the first task on my GTX 750 Ti:
lystrpx8-NOELIA_SH2eq-0-1-RND2046_1
Workunit 8238384
Created 7 Jun 2014 | 14:50:16 UTC
Sent 7 Jun 2014 | 16:19:10 UTC
Received 7 Jun 2014 | 16:26:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -98 (0xffffffffffffff9e) Unknown error number
Computer ID 170120
Report deadline 12 Jun 2014 | 16:19:10 UTC
Run time 2.00
CPU time 0.14
Validate state Invalid
Credit 0.00
Application version Short runs (2-3 hours on fastest card) v8.41 (cuda60)


Stderr output
<core_client_version>7.3.19</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -98 (0xffffff9e)
</message>
<stderr_txt>
# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# SWAN Device 0 :
# Name : GeForce GTX 750 Ti
# ECC : Disabled
# Global mem : 2048MB
# Capability : 5.0
# PCI ID : 0000:01:00.0
# Device clock : 1137MHz
# Memory clock : 2700MHz
# Memory width : 128bit
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:27:22 (4164): called boinc_finish

</stderr_txt>
]]>
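
For anyone comparing many of these reports, the relevant fields can be pulled out of the stderr text with a couple of regular expressions. This is a minimal sketch based only on the line formats visible in the output above; the function name `summarize_stderr` and the dictionary keys are made up for illustration, and other application versions may print different lines.

```python
import re

# Patterns assume the exact line shapes seen in the stderr output above.
GPU_RE = re.compile(
    r"^# GPU \[(?P<gpu>[^\]]+)\] Platform \[(?P<platform>[^\]]+)\]",
    re.MULTILINE,
)
ERROR_RE = re.compile(
    r"^ERROR: file (?P<file>\S+) line (?P<line>\d+): (?P<message>.+)$",
    re.MULTILINE,
)


def summarize_stderr(text):
    """Extract GPU, platform and the first ERROR line from a stderr dump."""
    summary = {}
    gpu = GPU_RE.search(text)
    if gpu:
        summary.update(gpu.groupdict())
    err = ERROR_RE.search(text)
    if err:
        summary["file"] = err["file"]
        summary["line"] = int(err["line"])
        summary["message"] = err["message"].strip()
    return summary


sample = """# GPU [GeForce GTX 750 Ti] Platform [Windows] Rev [3301M] VERSION [60]
# Driver version : r337_00 : 33788
ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:27:22 (4164): called boinc_finish
"""

print(summarize_stderr(sample))
```

Grouping the resulting summaries by `platform` and `message` would make it easy to see whether every failing host reports the same mdioload.cpp error.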



Looks like my buddy on this had a different error:

http://www.gpugrid.net/workunit.php?wuid=8238384



Carlesa25
Joined: 13 Nov 10
Posts: 324
Credit: 72,394,453
RAC: 0
Level
Thr
Message 37025 - Posted: 8 Jun 2014 | 10:47:01 UTC - in response to Message 37022.

" ERROR: file mdioload.cpp line 162: No CHARMM parameter file specified
12:27:22 (4164): called boinc_finish "

Hello: This is the same error I have been getting for a few days on Windows 8.1, but as I said, these same tasks work perfectly for me on Linux.
