Message boards : Number crunching : Lot of errors

Trotador
Joined: 25 Mar 12
Posts: 103
Credit: 13,832,927,393
RAC: 5,845,643
Level
Trp
Message 55437 - Posted: 7 Oct 2020 | 12:50:09 UTC

Errors since today. Nobody else?

ncoded.com
Joined: 16 Aug 16
Posts: 20
Credit: 628,821,413
RAC: 2,111,319
Level
Lys
Message 55438 - Posted: 7 Oct 2020 | 12:52:46 UTC
Last modified: 7 Oct 2020 | 13:06:11 UTC

In the last half hour we have also started getting computation errors on five (Linux) hosts.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55439 - Posted: 7 Oct 2020 | 12:56:20 UTC

Same here, so far only on one Linux machine.

Trotador
Joined: 25 Mar 12
Posts: 103
Credit: 13,832,927,393
RAC: 5,845,643
Level
Trp
Message 55440 - Posted: 7 Oct 2020 | 13:05:15 UTC
Last modified: 7 Oct 2020 | 13:05:29 UTC

Yes, all my hosts are on Linux. Some license could have expired.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55441 - Posted: 7 Oct 2020 | 13:18:08 UTC

Second Linux machine has started failing all new tasks (was delayed by a stray ADRIA task which crept in under the wire). Windows machines are continuing as normal. I agree, feels like a license expiry.

csbyseti
Joined: 4 Oct 09
Posts: 6
Credit: 1,109,425,695
RAC: 0
Level
Met
Message 55442 - Posted: 7 Oct 2020 | 13:26:00 UTC

I also got errors on all Linux machines; WUs exit at startup.

Tried this Windows system: same error. Both WUs stopped right at startup with an error.

candido
Joined: 12 Jun 11
Posts: 12
Credit: 150,069,999
RAC: 0
Level
Ile
Message 55443 - Posted: 7 Oct 2020 | 13:46:00 UTC
Last modified: 7 Oct 2020 | 13:47:22 UTC

All my tasks on two computers are failing.

Edit: i7 9th and 10th generation, GTX 1660, Windows 10

Dayle Diamond
Joined: 5 Dec 12
Posts: 84
Credit: 1,663,883,415
RAC: 0
Level
His
Message 55444 - Posted: 7 Oct 2020 | 13:53:55 UTC

Must be everybody?

On a Windows 10 machine with a 1070 Ti that hasn't had any errors before, I've gotten 7 errors. Checking the logs, I see "exit code 195 (0xc3)".

Pausing work for the project until the errors stop.

STARBASEn
Joined: 17 Feb 09
Posts: 91
Credit: 1,603,303,394
RAC: 0
Level
His
Message 55445 - Posted: 7 Oct 2020 | 17:11:37 UTC
Last modified: 7 Oct 2020 | 17:15:31 UTC

Same here as of Oct 7 on two Linux machines with three GTX 1060s. No errors before that, so I'm temporarily suspending work until this is resolved.

Edit: E@H still running fine.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55446 - Posted: 7 Oct 2020 | 17:19:47 UTC - in response to Message 55441.

Second Linux machine has started failing all new tasks (was delayed by a stray ADRIA task which crept in under the wire). Windows machines are continuing as normal. I agree, feels like a license expiry.

ADRIA task has completed, uploaded, and reported. But my Windows machines have all failed as well.

This is the first time I've been able to reach the web server since I posted that, four hours ago. Had to use a Linux browser and knock off https to connect. I'll send Toni a PM while I'm here (but have a look round first).

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55447 - Posted: 7 Oct 2020 | 17:26:51 UTC

Hmmm. The new version of ACEMD was installed 16 Oct 2019 - you'd have thought we still had a week's grace left to run. Or did they license it for a beta run before full release?

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,914,107,676
RAC: 32,603,125
Level
Tyr
Message 55451 - Posted: 7 Oct 2020 | 18:14:22 UTC

I vaguely remember that it was said somewhere here in the forum that with the new version of ACEMD, these licences are no longer required.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55452 - Posted: 7 Oct 2020 | 18:33:18 UTC - in response to Message 55451.

I vaguely remember that it was said somewhere here in the forum that with the new version of ACEMD, these licences are no longer required.

The only reference I can find is at the very end of message 53251 - and that was posted by ServicEnginIC, rather than by any of the project staff.

ServicEnginIC
Joined: 24 Sep 10
Posts: 581
Credit: 9,532,762,024
RAC: 19,585,612
Level
Tyr
Message 55453 - Posted: 7 Oct 2020 | 19:20:19 UTC - in response to Message 55452.
Last modified: 7 Oct 2020 | 19:39:39 UTC

I vaguely remember that it was said somewhere here in the forum that with the new version of ACEMD, these licences are no longer required.

The only reference I can find is at the very end of message 53251 - and that was posted by ServicEnginIC, rather than by any of the project staff.

The facts seem to indicate otherwise...
It was replied by Retvari Zoltan this way, as a premonition:

from now it is not dependent on an eventually expiring license...

You wish, but actually there's no proof or indication of that.


Edit:
There is an indirect reference in Toni's message 52539.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1618
Credit: 8,605,794,351
RAC: 16,308,879
Level
Tyr
Message 55454 - Posted: 7 Oct 2020 | 21:02:40 UTC - in response to Message 55453.

It's still an ACEMD application, even if hidden behind a wrapper. I don't know what made people assume it was no longer subject to proprietary licensing.

(see stderr:

14:39:04 (6232): wrapper: running acemd3.exe (--boinc input --device 0)
14:39:05 (6232): acemd3.exe exited; CPU time 0.000000

Exit status 195 (0xc3) EXIT_CHILD_FAILED)

Greger
Joined: 6 Jan 15
Posts: 76
Credit: 24,002,802,249
RAC: 6,035,193
Level
Trp
Message 55455 - Posted: 7 Oct 2020 | 21:14:11 UTC

The application is close to a year old now (16 Oct 2019), so that could be it.

WR-HW95
Joined: 16 Dec 08
Posts: 7
Credit: 1,432,701,313
RAC: 22,568
Level
Met
Message 55456 - Posted: 7 Oct 2020 | 21:43:17 UTC

I'm getting these too on a Win10 machine.

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
17:31:23 (2248): wrapper (7.9.26016): starting
17:31:23 (2248): wrapper: running acemd3.exe (--boinc input --device 0)
17:31:24 (2248): acemd3.exe exited; CPU time 0.000000
17:31:24 (2248): app exit status: 0x1
17:31:24 (2248): called boinc_finish(195)

Exit status 195 (0xc3) EXIT_CHILD_FAILED
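
For anyone checking many hosts at once, the pattern in the stderr above (app exits with essentially zero CPU time, then the wrapper calls boinc_finish(195)) can be spotted mechanically. A minimal Python sketch, assuming the log lines look exactly like the ones posted in this thread. The function name `child_failed_at_startup` and the one-second CPU threshold are my own illustrative choices, not anything from BOINC or the project:

```python
import re

# Line formats copied from the stderr excerpts in this thread; other wrapper
# versions may log differently (assumption).
EXITED = re.compile(r"(\S+) exited; CPU time ([0-9.]+)")
FINISH = re.compile(r"called boinc_finish\((\d+)\)")

def child_failed_at_startup(stderr_text, max_cpu_seconds=1.0):
    """Return True if the wrapped app exited almost immediately and the
    wrapper then reported exit code 195 (EXIT_CHILD_FAILED)."""
    exited = EXITED.search(stderr_text)
    finish = FINISH.search(stderr_text)
    if exited is None or finish is None:
        return False
    cpu_time = float(exited.group(2))
    return cpu_time <= max_cpu_seconds and int(finish.group(1)) == 195

sample = """\
17:31:23 (2248): wrapper (7.9.26016): starting
17:31:23 (2248): wrapper: running acemd3.exe (--boinc input --device 0)
17:31:24 (2248): acemd3.exe exited; CPU time 0.000000
17:31:24 (2248): app exit status: 0x1
17:31:24 (2248): called boinc_finish(195)
"""
print(child_failed_at_startup(sample))
```

This only tells you that the child process died at startup. It says nothing about why, so it cannot confirm or rule out the license theory on its own.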

nemesis
Joined: 11 Dec 19
Posts: 2
Credit: 7,521,365
RAC: 0
Level
Ser
Message 55457 - Posted: 7 Oct 2020 | 22:06:25 UTC

this:

Stderr output

<core_client_version>7.9.3</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
13:35:35 (15075): wrapper (7.7.26016): starting
13:35:35 (15075): wrapper (7.7.26016): starting
13:35:35 (15075): wrapper: running acemd3 (--boinc input --device 0)
13:35:36 (15075): acemd3 exited; CPU time 0.000172
13:35:36 (15075): app exit status: 0x1
13:35:36 (15075): called boinc_finish(195)

</stderr_txt>
]]>
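
The three numbers in that message (195, 0xc3 and -61) are the same 8-bit value written three ways: decimal, hexadecimal, and read as a signed byte. A quick Python check; this is pure arithmetic and assumes nothing BOINC-specific:

```python
code = 195

# Hexadecimal form of the same value.
print(hex(code))

# Reinterpret the low 8 bits as a signed (two's complement) byte:
# values of 128 and above wrap around by subtracting 256.
signed = code - 256 if code >= 128 else code
print(signed)

# The standard library gives the same signed-byte reading.
print(int.from_bytes(bytes([code]), "big", signed=True))
```

So the Linux client's "195 (0xc3, -61)" and the Windows client's "195 (0xc3)" report the same exit code.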

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 55458 - Posted: 8 Oct 2020 | 2:21:13 UTC

10/7/2020 6:50:27 PM | GPUGRID | Project is temporarily shut down for maintenance


During GPUGRID shutdowns I let my GPUs work for Folding@home. My CPUs have been crunching there ever since the COVID Moonshot race began.

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Level
Cys
Message 55459 - Posted: 8 Oct 2020 | 8:09:44 UTC

World Community Grid says that IBM is working on a GPU version for their Open Pandemics against COVID-19.
Tullio

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,914,107,676
RAC: 32,603,125
Level
Tyr
Message 55460 - Posted: 8 Oct 2020 | 9:13:28 UTC - in response to Message 55459.

World Community Grid says that IBM is working on a GPU version for their Open Pandemics against COVID-19.
Tullio

Even more reason for GPUGRID to think about digging into that, I would hope.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 0
Level
Trp
Message 55465 - Posted: 8 Oct 2020 | 13:23:30 UTC - in response to Message 55453.

I vaguely remember that it was said somewhere here in the forum that with the new version of ACEMD, these licences are no longer required.

The only reference I can find is at the very end of message 53251 - and that was posted by ServicEnginIC, rather than by any of the project staff.

The facts seem to indicate otherwise...
It was replied by Retvari Zoltan this way, as a premonition:

from now it is not dependent on an eventually expiring license...

You wish, but actually there's no proof or indication of that.

It's because I'm a prophet.
Or it's because I do remember that the GPUGrid app is actually proprietary software owned by Acellera.
The transition to the BOINC wrapper method hasn't changed the proprietary status of the product itself (i.e. the GPUGrid app).
It's very unlikely that any owner would lend their proprietary software to anyone without some protection or limitation.

Erich56
Joined: 1 Jan 15
Posts: 1131
Credit: 9,914,107,676
RAC: 32,603,125
Level
Tyr
Message 55467 - Posted: 8 Oct 2020 | 14:36:04 UTC

What's just too bad is that the GPUGRID people simply don't make a note, in red, in their calendar, so that the licence can be renewed BEFORE it expires.
It would save some headaches for them and for us crunchers :-)

ncoded.com
Joined: 16 Aug 16
Posts: 20
Credit: 628,821,413
RAC: 2,111,319
Level
Lys
Message 55528 - Posted: 9 Oct 2020 | 20:31:57 UTC
Last modified: 9 Oct 2020 | 20:32:51 UTC

https://www.gpugrid.net/forum_thread.php?id=5183#55495
