Message boards : Server and website : Upload Problems

Profile [SG]bieboderbeste
Joined: 19 Mar 08
Posts: 1
Credit: 100,130,978
RAC: 0
Level
Cys
Message 37589 - Posted: 16 Aug 2014 | 13:08:16 UTC

Hi.

I can't upload my last long-run result. The big 40.44 MB file uploads to 99% and then the transfer breaks off...

Sorry for my bad English...

Robert Gammon
Joined: 28 May 12
Posts: 61
Credit: 619,183,642
RAC: 304,485
Level
Lys
Message 37590 - Posted: 16 Aug 2014 | 13:26:47 UTC

I too am having upload problems: 2 short runs get a tiny amount of data sent, then stop.

The server is OUT of disk space, so until new capacity is added or data is purged from the server, we are stuck.

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37591 - Posted: 16 Aug 2014 | 15:46:00 UTC

Same here. 5 machines, and none of them can upload.
____________
Regards, Josef

petebe
Joined: 19 Nov 12
Posts: 31
Credit: 1,549,545,867
RAC: 0
Level
His
Message 37592 - Posted: 16 Aug 2014 | 15:48:02 UTC

Same here - can't upload: "Server is out of disk space".

Killersocke
Joined: 18 Oct 13
Posts: 41
Credit: 134,973,970
RAC: 0
Level
Cys
Message 37593 - Posted: 16 Aug 2014 | 16:31:10 UTC


same here :-(

John C MacAlister
Joined: 17 Feb 13
Posts: 178
Credit: 132,357,411
RAC: 16,335
Level
Cys
Message 37594 - Posted: 16 Aug 2014 | 17:48:15 UTC

Three shorts awaiting upload......

Evil Penguin
Joined: 15 Jan 10
Posts: 42
Credit: 18,255,462
RAC: 0
Level
Pro
Message 37595 - Posted: 16 Aug 2014 | 19:59:09 UTC

Same issue here.

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37596 - Posted: 16 Aug 2014 | 20:25:04 UTC

Same here. 3 longs. All the smaller files uploaded, but the three 53 MB files are still waiting.

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1837
Credit: 10,436,349,394
RAC: 8,755,917
Level
Trp
Message 37597 - Posted: 16 Aug 2014 | 20:25:19 UTC

Someone please make some space on the server.

2014. 08. 16. 22:21:01 GPUGRID Started upload of e2s356_e1s863f335-SANTI_marsalWTbound2-30-32-RND8784_0_9
2014. 08. 16. 22:21:01 GPUGRID Started upload of I4R44-SDOERR_BARNA5-12-100-RND6142_0_9
2014. 08. 16. 22:21:03 GPUGRID [error] Error reported by file upload server: Server is out of disk space
2014. 08. 16. 22:21:03 GPUGRID [error] Error reported by file upload server: Server is out of disk space
2014. 08. 16. 22:21:03 GPUGRID Temporarily failed upload of e2s356_e1s863f335-SANTI_marsalWTbound2-30-32-RND8784_0_9: transient upload error
2014. 08. 16. 22:21:03 GPUGRID Backing off 40 min 29 sec on upload of e2s356_e1s863f335-SANTI_marsalWTbound2-30-32-RND8784_0_9
2014. 08. 16. 22:21:03 GPUGRID Temporarily failed upload of I4R44-SDOERR_BARNA5-12-100-RND6142_0_9: transient upload error
2014. 08. 16. 22:21:03 GPUGRID Backing off 2 min 15 sec on upload of I4R44-SDOERR_BARNA5-12-100-RND6142_0_9

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37598 - Posted: 16 Aug 2014 | 20:42:21 UTC

And of course I can't get new tasks.


8/16/2014 8:34:07 PM | GPUGRID | Sending scheduler request: Requested by user.
8/16/2014 8:34:07 PM | GPUGRID | Requesting new tasks for CPU and NVIDIA
8/16/2014 8:34:09 PM | GPUGRID | Scheduler request completed: got 0 new tasks
8/16/2014 8:34:09 PM | GPUGRID | No tasks sent
8/16/2014 8:34:09 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
8/16/2014 8:34:09 PM | GPUGRID | This computer has reached a limit on tasks in progress

Profile Mumak
Joined: 7 Dec 12
Posts: 92
Credit: 225,897,225
RAC: 0
Level
Leu
Message 37599 - Posted: 16 Aug 2014 | 20:44:13 UTC

The entire staff on vacation?

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37600 - Posted: 16 Aug 2014 | 20:52:29 UTC

Now 4 waiting. No new tasks. I am crunching E@H in the meantime.

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,469,215,105
RAC: 0
Level
Met
Message 37601 - Posted: 16 Aug 2014 | 21:23:03 UTC

I also have upload problems, but I got a new task, which is running now. The small files all uploaded; the bigger ones don't.
Two old errors on my list from 2013 can be removed from the server, Matt; that makes a tiny bit of room....:)

http://www.gpugrid.net/results.php?userid=29115&offset=0&show_names=1&state=5&appid=
____________
Greetings from TJ

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37602 - Posted: 16 Aug 2014 | 22:50:36 UTC - in response to Message 37601.

Hi TJ, yes, I see your computer ID 163838 got two uploaded at 21:19 & 21:56 UTC, hence the new task. I wonder how they squeezed in? Maybe the fix has started :)

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37603 - Posted: 16 Aug 2014 | 23:13:00 UTC

Yay -- I get to test BOINC's ability to have my 0-resource-share GPU-backup projects kick in again! (My GPU-backup projects: SETI, Einstein, SETIBETA, and Albert)

By the way, it's always good to have backup projects set up, so your GPUs can stay busy when things like this happen to your main project!

Until then, we just have to wait patiently. So, make sure your GPUs keep busy during the wait!

- Jacob

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37604 - Posted: 17 Aug 2014 | 0:31:17 UTC

Well, now this is affecting the <24hr return bonuses. Still no word from the project? :(

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37605 - Posted: 17 Aug 2014 | 1:46:02 UTC - in response to Message 37604.

Well, I was able to hit "retry" a couple of times on my oldest tasks and got them to upload, and got new ones before the 24hr mark.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37606 - Posted: 17 Aug 2014 | 2:32:39 UTC - in response to Message 37605.

I think that involves getting lucky enough to "shove your uploads" into the space that was freed up elsewhere. I too was able to do that to get some of mine to upload. But some are still failing with "[error] Error reported by file upload server: Server is out of disk space"

So... We just have to be patient. I hope somebody can fix it Monday, but MJH mentioned vacations, so, we'll see. Just be patient, and make sure your backup projects can keep the GPUs busy :)

Robert7NBI
Joined: 17 Jul 09
Posts: 1
Credit: 250,200,747
RAC: 0
Level
Asn
Message 37607 - Posted: 17 Aug 2014 | 9:31:16 UTC

9289 GPUGRID 17-08-2014 11:31:59 [error] Error reported by file upload server: can't write file /home/ps3grid/projects/PS3GRID/upload/288/I16R45-SDOERR_BARNA5-13-100-RND1176_1_9: No space left on server

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37608 - Posted: 17 Aug 2014 | 10:10:26 UTC

I installed Windows 8.1 this morning. First job was to install BOINC and connect to GPUGrid. Got a long and it's processing. But...

I have two GPUs, both seen by Device Manager, but I cannot get another WU after many "update" commands.

Is this due to the current problem or must I do something to get a second?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,827,795,914
RAC: 445,250
Level
His
Message 37609 - Posted: 17 Aug 2014 | 10:33:35 UTC - in response to Message 37607.
Last modified: 17 Aug 2014 | 10:34:29 UTC

I emailed Matt and Gianni regarding the disk space problems...
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1837
Credit: 10,436,349,394
RAC: 8,755,917
Level
Trp
Message 37610 - Posted: 17 Aug 2014 | 10:40:41 UTC - in response to Message 37608.

I installed Windows 8.1 this morning. First job was to install BOINC and connect to GPUGrid. Got a long and it's processing. But...

I have two GPUs, both seen by Device Manager, but I cannot get another WU after many "update" commands.

Is this due to the current problem or must I do something to get a second?

This is a different problem.
Possibly you can resolve this by creating a line containing <use_all_gpus>1</use_all_gpus> under the <options> section of the cc_config.xml file on your host.
See this post.
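
For anyone who would rather script that edit than do it by hand, here is a minimal Python sketch. It assumes the usual layout with an <options> element inside a <cc_config> root; the function name enable_all_gpus and the sample input are made up for illustration. It only transforms the XML text, so reading cc_config.xml from your BOINC data directory and writing it back is left to you, and the client will likely need to re-read its config files or be restarted afterwards.

```python
import xml.etree.ElementTree as ET


def enable_all_gpus(config_xml: str) -> str:
    """Return config_xml with <use_all_gpus>1</use_all_gpus> set under <options>.

    Hypothetical helper for illustration only; it is not part of BOINC.
    """
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")

    # Create the <options> section if the file does not have one yet.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    # Reuse an existing <use_all_gpus> entry instead of adding a duplicate.
    flag = options.find("use_all_gpus")
    if flag is None:
        flag = ET.SubElement(options, "use_all_gpus")
    flag.text = "1"

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_all_gpus("<cc_config><options></options></cc_config>"))
```

Any other settings already present under <options> are left untouched; only the one flag is added or overwritten.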

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37611 - Posted: 17 Aug 2014 | 10:55:14 UTC - in response to Message 37610.

Many thanks Retvari. That hit the spot!!

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37612 - Posted: 17 Aug 2014 | 11:48:05 UTC - in response to Message 37606.
Last modified: 17 Aug 2014 | 11:50:43 UTC


Just be patient, and make sure your backup projects can keep the GPUs busy :)


Hopefully Einstein has new enough applications and doesn't dislike some of my x4 PCIe 1.0 slots anymore ^^ These machines are GPUGrid-only crunchers, so they are built as cheaply as I could manage while still running at full speed here *hehe*. But we will see; as mentioned, we can only wait. One machine seems to have uploaded enough files overnight to get two new workunits by itself :)

Edit: no Sir, Einstein still doesn't like low PCIe bandwidth. BOINCTasks shows two 570s, and one of them is 50% slower than the other, noooo ^^
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 584
Credit: 4,273,184
RAC: 0
Level
Ala
Message 37613 - Posted: 17 Aug 2014 | 13:23:26 UTC - in response to Message 37612.

freed up some space this morning.

Aedazan
Joined: 8 Apr 11
Posts: 1
Credit: 84,476,851
RAC: 1,523
Level
Thr
Message 37614 - Posted: 17 Aug 2014 | 13:55:02 UTC - in response to Message 37613.
Last modified: 17 Aug 2014 | 13:59:21 UTC

Not enough, I am still getting the error. :P
Feel free to free some more ;)

In light of this issue I think I will switch to long runs on my server from now on :D

Robert Gammon
Joined: 28 May 12
Posts: 61
Credit: 619,183,642
RAC: 304,485
Level
Lys
Message 37615 - Posted: 17 Aug 2014 | 14:36:39 UTC - in response to Message 37613.

Still getting no luck on uploads, and server refuses to give me more tasks to run.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37618 - Posted: 17 Aug 2014 | 17:12:07 UTC - in response to Message 37613.

freed up some space this morning.


Thanks, but we are still unable to upload, due to no space being available.

opr
Joined: 24 May 11
Posts: 7
Credit: 51,058,276
RAC: 209,679
Level
Thr
Message 37619 - Posted: 17 Aug 2014 | 17:15:52 UTC

Hello!
So... some disk is full, right? I have the biggest part of a short-run WU waiting to upload here too. Boinc says the transfer is 100% but status remains "sending". I guess it will get sent some day?

Regards,opr

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37620 - Posted: 17 Aug 2014 | 17:30:49 UTC

And I see the server is fast running out of longs...

Nicolas_orleans
Joined: 25 Jun 14
Posts: 14
Credit: 446,219,525
RAC: 0
Level
Gln
Message 37621 - Posted: 17 Aug 2014 | 18:19:34 UTC - in response to Message 37619.

Boinc says the transfer is 100% but status remains "sending".

+1, retries every 4-5 hours fail. No space on disk; it reminds me of other poorly managed DC projects.

Profile robertmiles
Joined: 16 Apr 09
Posts: 402
Credit: 162,364,996
RAC: 495,019
Level
Ile
Message 37622 - Posted: 17 Aug 2014 | 18:25:47 UTC - in response to Message 37599.

The entire staff on vacation?


I'd expect most of them to be away from the project on Sundays. There might not be any there who know how to free any disk space without creating problems that are even worse.

An additional hard disk for the server is fairly cheap, but would they need to buy something else (such as a hard drive cabinet) in order to allow the server to use it? They might mention the price they would have to pay in order to add a terabyte or so of hard drive space to the server, and then ask for donations to pay for this.

They should, however, tell the server to extend the deadlines of all workunits still in progress by approximately the length of the upload outages.

Downloading workunits MIGHT help free some disk space, so you should probably keep on downloading and running workunits as long as you have enough hard drive space on your computers to store the output files of the workunits - unless the problem on the server gets bad enough to prevent any more downloads.

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37623 - Posted: 17 Aug 2014 | 19:06:59 UTC - in response to Message 37622.

An additional hard disk for the server is fairly cheap..

It is hard for me to imagine that no one seems to have been watching the available disk space for the entire project. Hmm..
____________
Regards, Josef

Profile Misfit
Joined: 23 May 08
Posts: 33
Credit: 543,059,481
RAC: 325,720
Level
Lys
Message 37624 - Posted: 17 Aug 2014 | 20:52:47 UTC

8/17/2014 1:53:41 PM | GPUGRID | [error] Error reported by file upload server: Server is out of disk space

Woohoo!

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37627 - Posted: 18 Aug 2014 | 2:05:38 UTC

I was able to upload my files now, and get new work. The issue might be resolved.

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37628 - Posted: 18 Aug 2014 | 5:51:15 UTC
Last modified: 18 Aug 2014 | 6:02:45 UTC

Thx god =)
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37629 - Posted: 18 Aug 2014 | 5:53:30 UTC - in response to Message 37627.

I was able to upload my files now, and get new work. The issue might be resolved.

Not for me! This morning I found an upload that had stalled at 100%. I did a retry and it is now at 48%.

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37630 - Posted: 18 Aug 2014 | 6:03:13 UTC
Last modified: 18 Aug 2014 | 6:08:06 UTC

Ehm, OK, not all units uploaded; it's full again now O.o

Hopefully something happens today, on Monday. The deadline is getting closer on some units, and there will be a massive resend because many units will be past 2 1/2 days before they can be reported once it is "repaired" O.o
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37631 - Posted: 18 Aug 2014 | 6:31:31 UTC - in response to Message 37627.

The issue might be resolved.

Unfortunately, not for me.

____________
Regards, Josef

Localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 11,478
Level
His
Message 37632 - Posted: 18 Aug 2014 | 6:33:00 UTC

.....Still have the issue on my two hosts; although overnight one host did manage to report 4 WUs, it is now stalled again.

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37633 - Posted: 18 Aug 2014 | 6:36:04 UTC - in response to Message 37629.

I was able to upload my files now, and get new work. The issue might be resolved.

Not for me! This morning I found an upload that had stalled at 100%. I did a retry and it is now at 48%.

Same WU stalled at 100% and is now back to 49%!

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37634 - Posted: 18 Aug 2014 | 6:58:17 UTC
Last modified: 18 Aug 2014 | 6:59:51 UTC

The issue is not resolved at this time.

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,469,215,105
RAC: 0
Level
Met
Message 37635 - Posted: 18 Aug 2014 | 9:10:38 UTC

Sometimes uploading works, sometimes not. I got new work, and some results were uploaded normally, others way after 24 hours. So there seems to be progress, but it is not resolved yet.
____________
Greetings from TJ

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37636 - Posted: 18 Aug 2014 | 9:39:39 UTC
Last modified: 18 Aug 2014 | 9:41:13 UTC

Yes, I think whenever the right workunits get through and complete a sequence, the server writes out the result and deletes the big files of, for example, three workunits, freeing up some space by itself. Then 100 users try to poke their uploads in, and hopefully the right ones get through so that some other sequences finish too ^^ I want my 500M milestone!! :P
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37637 - Posted: 18 Aug 2014 | 10:13:01 UTC

Well - it's Monday afternoon in Spain and still no reaction. Clearly no-one is minding the shop...

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37638 - Posted: 18 Aug 2014 | 10:30:07 UTC
Last modified: 18 Aug 2014 | 10:30:51 UTC

I think in America it could still be early morning? Don't know where the staff is located.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

mikey
Joined: 2 Jan 09
Posts: 269
Credit: 247,832,990
RAC: 377,122
Level
Leu
Message 37639 - Posted: 18 Aug 2014 | 10:38:16 UTC - in response to Message 37637.

Well - it's Monday afternoon in Spain and still no reaction. Clearly no-one is minding the shop...


Yeah, I am out of work and have units to send back. I set my PCs to no new tasks and attached to a different project.

John C MacAlister
Joined: 17 Feb 13
Posts: 178
Credit: 132,357,411
RAC: 16,335
Level
Cys
Message 37641 - Posted: 18 Aug 2014 | 10:44:11 UTC - in response to Message 37639.

August in Spain.....

Well - it's Monday afternoon in Spain and still no reaction. Clearly no-one is minding the shop...


Yeah, I am out of work and have units to send back. I set my PCs to no new tasks and attached to a different project.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37642 - Posted: 18 Aug 2014 | 10:50:37 UTC - in response to Message 37639.

Well - it's Monday afternoon in Spain and still no reaction. Clearly no-one is minding the shop...


Yeah, I am out of work and have units to send back. I set my PCs to no new tasks and attached to a different project.


There's no reason to set No New Tasks. Just let BOINC's transmit-retry-back-off do its job until GPUGrid can complete the transfers, and make sure you are attached to backup-GPU projects.
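
To illustrate why leaving the client alone is fine, a retry-with-back-off delay can be sketched roughly like this. It is a generic randomized exponential back-off, not BOINC's actual code; the name backoff_delay, the 60-second base and the 4-hour cap are all assumptions picked for illustration.

```python
import random
from typing import Optional


def backoff_delay(failures: int, base: float = 60.0, cap: float = 4 * 3600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before the next retry after `failures` failures in a row.

    Illustrative only: base and cap are assumed values, not BOINC's.
    """
    if failures < 1:
        return 0.0
    rng = rng or random.Random()
    # Double the ceiling with every failure, but never go past the cap.
    exponent = min(failures - 1, 30)
    ceiling = min(cap, base * 2 ** exponent)
    # Randomize within the upper half so many hosts do not retry in lockstep.
    return rng.uniform(ceiling / 2, ceiling)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, round(backoff_delay(n)), "seconds")
```

The point of the randomization is that thousands of hosts hitting a full server do not all come back at the same instant; each one simply tries again later without any action from the user.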

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 258
Credit: 0
RAC: 0
Level

Message 37643 - Posted: 18 Aug 2014 | 11:20:29 UTC

We are actually in the shop :) The problem is figuring out where all the disk space has vanished, which seems to take some time, and then finding the best course of action. But we are definitely working on it.
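
For readers curious what such a hunt can look like, here is a minimal sketch that totals file sizes per top-level subdirectory so the biggest consumers show up first. It is only an illustration and not the tooling the project uses; usage_by_subdir is a made-up name, and the directory you pass in is up to you.

```python
import os
from typing import Dict, List, Tuple


def usage_by_subdir(root: str) -> List[Tuple[str, int]]:
    """Return (subdirectory name, total bytes) pairs under root, largest first."""
    totals: Dict[str, int] = {}
    for entry in os.scandir(root):
        # Only real subdirectories are counted; symlinks are not followed.
        if not entry.is_dir(follow_symlinks=False):
            continue
        size = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                try:
                    size += os.lstat(os.path.join(dirpath, name)).st_size
                except OSError:
                    pass  # file vanished or is unreadable; skip it
        totals[entry.name] = size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, size in usage_by_subdir("."):
        print(f"{size:>15,d}  {name}")
```

On a volume holding a very large number of files, even a read-only walk like this can take a long time, which fits with the search taking a while.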

Profile Mumak
Joined: 7 Dec 12
Posts: 92
Credit: 225,897,225
RAC: 0
Level
Leu
Message 37644 - Posted: 18 Aug 2014 | 11:22:16 UTC

Will we lose credit for not reporting long tasks within 24h?

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37645 - Posted: 18 Aug 2014 | 11:54:30 UTC

The problem is figuring out where all the disk space has vanished

A mysterious disappearance of hard disk space. Most probably extraterrestrial forces were involved. Maybe we will find them with Seti@home ;-)
____________
Regards, Josef

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 584
Credit: 4,273,184
RAC: 0
Level
Ala
Message 37646 - Posted: 18 Aug 2014 | 12:42:43 UTC - in response to Message 37645.
Last modified: 18 Aug 2014 | 12:44:16 UTC

Believe it or not, dedicated people are working on the overworked server regardless of the time of the day. When the disk is full it takes time even to delete files.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37648 - Posted: 18 Aug 2014 | 13:32:12 UTC - in response to Message 37646.

Thank you Toni. Keep up the good work.

My backup GPU projects kicked in briefly, and everything worked as intended. Now as GPUGrid comes back, and can handle the transfers, BOINC will transition to using it again. All without user intervention.

Again, thank you, for continuing to make GPUGrid run relatively very smoothly!

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37649 - Posted: 18 Aug 2014 | 14:14:05 UTC - in response to Message 37648.

Thank you Toni. Keep up the good work.

Hear, hear!

Jacob,

I took your advice and decided to connect to einstein@home. It's now downloading ~50 files at 5 megs each! And given my BOINC settings they look like they're CPU tasks.

What project do you recommend to guarantee GPU tasks?

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37650 - Posted: 18 Aug 2014 | 14:18:02 UTC - in response to Message 37648.

My backup GPU projects kicked in briefly, and everything worked as intended. Now as GPUGrid comes back, and can handle the transfers, BOINC will transition to using it again. All without user intervention.


Hi Jacob,

can you please tell me how you manage this? Sounds like:

    If (GPUGrid fails) Then
      run Seti@home
    Else
      run GPUGrid
    End If


I would be very grateful for your solution.
____________
Regards, Josef

Localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 11,478
Level
His
Message 37651 - Posted: 18 Aug 2014 | 14:30:03 UTC - in response to Message 37650.

My backup GPU projects kicked in briefly, and everything worked as intended. Now as GPUGrid comes back, and can handle the transfers, BOINC will transition to using it again. All without user intervention.


Hi Jacob,

can you please tell me how you manage this? Sounds like:

    If (GPUGrid fails) Then
      run Seti@home
    Else
      run GPUGrid
    End If


I would be very grateful for your solution.



......... Pick another project & set the resource share to 0 in Boinc Manager - that way work will only be requested when projects with a greater than 0 resource share are unavailable.

I have Einstein set to a resource share of 0 all the time - so when GPUGrid developed this problem my system started requesting a single Einstein GPU task at a time - a resource share of 0 does not download more than one task at a time.

Make sure you edit your Einstein preferences to only use your nVidia GPU - or you may find CPU tasks appearing ........

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37652 - Posted: 18 Aug 2014 | 14:32:45 UTC - in response to Message 37638.

Don't know where the staff is located.

GPUGrid is based in Barcelona, Spain.

Betting Slip
Joined: 5 Jan 09
Posts: 584
Credit: 2,007,913,400
RAC: 1,652,527
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 37653 - Posted: 18 Aug 2014 | 14:49:25 UTC - in response to Message 37646.
Last modified: 18 Aug 2014 | 15:07:37 UTC

Believe it or not, dedicated people are working on the overworked server regardless of the time of the day. When the disk is full it takes time even to delete files.


Glad to hear the server is overworked; that's always good news for a project. Why not ask a big company to donate a new server in exchange for exposure? Maybe get someone to send begging letters to all the computer suppliers on a daily basis, like Andy Dufresne did in The Shawshank Redemption. Excellent film!

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 584
Credit: 4,273,184
RAC: 0
Level
Ala
Message 37654 - Posted: 18 Aug 2014 | 14:52:15 UTC - in response to Message 37652.

An update: we have been freeing up space since Sunday night. The resolution is not instantaneous because:

1. deleting a huge number of files is slow
2. as soon as files are deleted, the freed-up space is used by newly uploaded WUs.

So, please don't worry if things take some time to get back to normal.

The increase in required space is partly due to new WU types (e.g. the CPU experiments).

Adding HD space has of course been considered, but with the current server it is sadly not possible for hardware reasons.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37655 - Posted: 18 Aug 2014 | 14:52:52 UTC
Last modified: 18 Aug 2014 | 14:57:07 UTC

......... Pick another project & set the resource share to 0 in Boinc Manager - that way work will only be requested when projects with a greater than 0 resource share are unavailable.

I have Einstein set to a resource share of 0 all the time - so when GPUGrid developed this problem my system started requesting a single Einstein GPU task at a time - a resource share of 0 does not download more than one task at a time.

Make sure you edit your Einstein preferences to only use your nVidia GPU - or you may find CPU tasks appearing ........


That's right -- to use a "Backup project", make sure its Resource Share is 0. See, that's a special case -- "0 Resource Share" means "Only ask this project for work if all other projects couldn't give me any, and even then, just give me enough work to keep busy right now, don't build a cache of work."

So, I have Einstein/SETI/Albert/SETIBeta all set to 0 resource share as backup projects, and all allowed to download GPU tasks.

I have tons of CPU projects (so I don't have much chance of running out of work on CPU), but I only have GPUGrid/POEM for GPU projects. And so, basically, I end up never asking those backup projects for work, except when GPUGrid has an issue. And when that happens, my setup works flawlessly to keep GPUs busy with backup project GPU tasks, all without user intervention.
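
The backup behavior described above can be sketched roughly as follows. This is a simplified illustration of the idea, not BOINC's actual work-fetch algorithm; the Project class, the has_work flag and the pick_projects_to_ask function are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Project:
    name: str
    resource_share: float  # 0 marks a backup project
    has_work: bool         # whether the project can currently supply tasks


def pick_projects_to_ask(projects: List[Project]) -> List[str]:
    """Names of the projects to request work from, primaries before backups."""
    primaries = [p for p in projects if p.resource_share > 0 and p.has_work]
    if primaries:
        return [p.name for p in primaries]
    # No primary project can supply work: fall back to the zero-share backups.
    return [p.name for p in projects if p.resource_share == 0 and p.has_work]


if __name__ == "__main__":
    projects = [
        Project("GPUGrid", 100, has_work=False),
        Project("Einstein", 0, has_work=True),
    ]
    print(pick_projects_to_ask(projects))
```

With GPUGrid unable to supply work, only the backup project is returned; as soon as GPUGrid has work again, the backup drops out of the list without anyone touching the settings.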

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37656 - Posted: 18 Aug 2014 | 15:00:58 UTC - in response to Message 37651.

Pick another project & set the resource share to 0 in Boinc Manager

OK. I have three GPUGrid uploads @ 100%, and one GPUGrid and one Einstein GPU task running.

I've tried but I cannot find how to set the resource share by project in BOINC Manager. Help!!

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37657 - Posted: 18 Aug 2014 | 15:05:01 UTC - in response to Message 37656.
Last modified: 18 Aug 2014 | 15:12:10 UTC

"Resource Share" is a project setting.
http://boinc.berkeley.edu/wiki/preferences

It can be set either at:
- the project's website (and then issue an Update command in BOINC), or
- if you use an account manager (like BOINCStats), you can set it there (and then issue an Update command in BOINC).

It can even be set "per venue/location", but I don't use venues/locations to do my settings.

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37658 - Posted: 18 Aug 2014 | 15:22:37 UTC - in response to Message 37657.

"Resource Share" is a project setting.

Ah! It's not a BOINC Manager setting.

I have GPUGrid @ 100% and Einstein @ 1% (it does not like 0%), so I guess I'm OK for the eventual return to normal of GPUGrid!

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37659 - Posted: 18 Aug 2014 | 15:34:05 UTC - in response to Message 37658.
Last modified: 18 Aug 2014 | 15:35:14 UTC

...and Einstein @ 1% (it does not like 0%)

Can you please explain?

You should be able to put 0, click "Update preferences", and then it'll show "---" meaning "it is now set as a backup project".

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37660 - Posted: 18 Aug 2014 | 15:42:58 UTC - in response to Message 37659.

...and Einstein @ 1% (it does not like 0%)

Can you please explain?

You should be able to put 0, click "Update preferences", and then it'll show "---" meaning "it is now set as a backup project".

The first time I tried it, it would not let me enter a "0". Just tried it again and it did!! And it now shows ---

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37661 - Posted: 18 Aug 2014 | 16:28:43 UTC
Last modified: 18 Aug 2014 | 16:30:12 UTC

Yes, it's definitely running with 0%; it's working here too. I would prefer to run POEM as backup, but they have two problems.

1.) Not enough work units all the time for a 100% working backup project, because they need a backup project themselves :D
2.) It does not like dual-GPU machines. Yes, I know, exclude blabla, but then point 1 applies again for one of the GPUs ^^

So for science reasons, Einstein is the only one you can use as a backup that's really working as it should. I'm sure they are happy about some additional hardcore GPUs which are running on this project :D
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 37662 - Posted: 18 Aug 2014 | 16:59:19 UTC

Hmm

I was hoping with the beginning of the work week the disk space issues would be confirmed by a project message and some information regarding a lasting solution shared with people here.

I am still hopeful we'll see something along these lines.

I too would revert to POEM GPU's -- but they are exceedingly rare at the moment as they work through application issues.

Instead, I'm running work from Prime Grid or Collatz or Moo. My definite preference with Nvidia cards is GPU Grid, which has been running well over the past several months.

So here's hoping we get some message from the project letting folks know of a projected time frame for resolution.

____________

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37663 - Posted: 18 Aug 2014 | 17:01:16 UTC - in response to Message 37661.
Last modified: 18 Aug 2014 | 17:02:33 UTC

So for science reasons, Einstein is the only one you can use as a backup that's really working as it should.

I'm not so sure. I'm running my last GPUGrid and one Einstein. I set Einstein to "No CPU" and did an "update". So what's this enormous 500meg download?:



There's much more below...

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37664 - Posted: 18 Aug 2014 | 17:10:10 UTC
Last modified: 18 Aug 2014 | 17:12:18 UTC

Seems you're getting a lot of work units ;) Did you set it to 0% and update your client before you accepted new work?
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 37665 - Posted: 18 Aug 2014 | 17:13:57 UTC - in response to Message 37662.

OK -- I see I missed an earlier message from Toni describing the problem -- my apologies.

I am wondering if it is possible to connect an external drive not on the array and to use it to offload files for archive purposes and thus free up space. It is possible that the server hardware doesn't directly support an external USB or ESATA drive though -- in which case, it would require downing the server (never any fun) and installing a card to provide the interface. Just a thought since the message suggests this is going to be an ongoing issue.



Hmm

I was hoping with the beginning of the work week the disk space issues would be confirmed by a project message and some information regarding a lasting solution shared with people here.

I am still hopeful we'll see something along these lines.

I too would revert to POEM GPU's -- but they are exceedingly rare at the moment as they work through application issues.

Instead, I'm running work from Prime Grid or Collatz or Moo. My definite preference with Nvidia cards is GPU Grid, which has been running well over the past several months.

So here's hoping we get some message from the project letting folks know of a projected time frame for resolution.

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37666 - Posted: 18 Aug 2014 | 17:21:15 UTC

@Jacob: Thanks, Einstein now works perfectly as a backup project.



But as with Tomba, there are a lot of WU's now. And the GPU load varies between 0 and 50%, CPU-application is running as well, although I have GPU-only applications clicked in the settings. Einstein does not work the way I'm used to from GPUGrid. Hmm..
____________
Regards, Josef

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 37667 - Posted: 18 Aug 2014 | 17:43:28 UTC - in response to Message 37664.

Seems you're getting a lot of work units ;) Did you set it to 0% and update your client before you accepted new work?

Yep. I set it to 0% and did an update.

Those 50+ files are downloaded and they are NOT CPU WUs. Not sure what they are but I'm currently running my final GPUGrid WU and one Einstein.

Jacob Klein
Joined: 11 Oct 08
Posts: 1060
Credit: 1,124,857,939
RAC: 1,264,503
Level
Met
Message 37668 - Posted: 18 Aug 2014 | 17:59:26 UTC - in response to Message 37666.
Last modified: 18 Aug 2014 | 18:01:19 UTC

@Jacob: Thanks, Einstein now works perfectly as a backup project.



But as with Tomba, there are a lot of WU's now. And the GPU load varies between 0 and 50%, CPU-application is running as well, although I have GPU-only applications clicked in the settings. Einstein does not work the way I'm used to from GPUGrid. Hmm..


If you accidentally got tons of work units without having the Resource Share set to 0, but now have it correctly set to 0 Resource Share, you can feel free to abort any unstarted work units, without fear of wasting any work. You can abort started ones too, but it's preferable not to do that, since you'd be throwing away some work that has already been done.

Regarding GPU Usage, yeah, if it's an "OpenCL" task, those only use the GPU at certain times. So, you'll see usage fluctuate often, and maybe even be 0 most of the time, but at least you're trying to keep it busy. That's just the nature of their OpenCL app, which is indeed quite different than a GPUGrid CUDA app.

Just keep the GPUs busy, or at least try :)

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37669 - Posted: 18 Aug 2014 | 19:18:44 UTC - in response to Message 37668.
Last modified: 18 Aug 2014 | 20:13:27 UTC

you can feel free to abort any unstarted work units, without fear of wasting any work.

Done :)

Edit:
Got a much better GPU-utilization with the following app_config.xml in the Einstein-directory:

<app_config>

  <app>
    <name>einsteinbinary_BRP4G</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>hsgamma_FGRP3</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>

</app_config>

____________
Regards, Josef

enels
Joined: 16 Sep 08
Posts: 8
Credit: 665,552,044
RAC: 386,645
Level
Lys
Message 37670 - Posted: 18 Aug 2014 | 20:14:52 UTC

Seems to be working.

RaymondFO*
Joined: 22 Nov 12
Posts: 72
Credit: 10,191,216,989
RAC: 8,899,766
Level
Trp
Message 37671 - Posted: 18 Aug 2014 | 20:30:49 UTC - in response to Message 37669.
Last modified: 18 Aug 2014 | 20:39:14 UTC


Edit:
Got a much better GPU-utilization with the following app_config.xml in the Einstein-directory:

<app_config>

  <app>
    <name>einsteinbinary_BRP4G</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>hsgamma_FGRP3</name>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.2</cpu_usage>
    </gpu_versions>
  </app>

</app_config>


Generally, for most tasks, E@H GPU utilization is not as efficient as GPUGRID's. The app_config.xml above is very useful, and changing its "cpu usage" value may help to address the "CPU suffocation" issues that E@H GPU tasks can cause on certain computers or video cards. If you want a simpler option, follow these steps:

1) Open up your "Einstein@Home preferences" page;

2) Scroll down to the bottom of the "default/home/work/school" subsection where you see:
"GPU utilization factor of [Name of] apps DANGEROUS! Only touch this if you are absolutely sure of what you are doing! Wrong setting might even damage your computer! Use solely on your own risk! Min: -1.0 / Max: 1.0 / Default: 1.0, negative values will disable GPU tasks of this type"

3) The number "1" is your default setting of 1 GPU task per 1 GPU card. If you want a higher amount, then change the setting as follows:

0.5 means 2 GPU tasks per GPU card;
0.25 means 4 GPU tasks per GPU card, etc..

Please remember to save the changes. The default CPU usage values for E@H remain unaffected by this method.
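
The relationship in step 3 is just a reciprocal, which this small Python sketch spells out. The function name tasks_per_gpu is made up, and rounding down when the factor does not divide evenly is my assumption, not something the E@H preferences page states. Treating 0 like a negative value (GPU tasks disabled) is also an assumption.

```python
import math


def tasks_per_gpu(utilization_factor: float) -> int:
    """Concurrent GPU tasks per card for a given GPU utilization factor."""
    # The preferences page quoted above gives the allowed range as -1.0 to 1.0.
    if not -1.0 <= utilization_factor <= 1.0:
        raise ValueError("factor must be between -1.0 and 1.0")
    # Negative values disable GPU tasks of this type; 0 is treated the same
    # way here (assumption).
    if utilization_factor <= 0:
        return 0
    # Each task claims `utilization_factor` of a card, so the card fits
    # 1 / factor tasks; the tiny epsilon guards against float rounding.
    return math.floor(1.0 / utilization_factor + 1e-9)
```

So tasks_per_gpu(1.0) gives 1, tasks_per_gpu(0.5) gives 2 and tasks_per_gpu(0.25) gives 4, matching the list above.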

Hope this helps and good luck!

TJ
Joined: 26 Jun 09
Posts: 815
Credit: 1,469,215,105
RAC: 0
Level
Met
Message 37674 - Posted: 18 Aug 2014 | 22:16:58 UTC - in response to Message 37661.

So for science reasons, Einstein is the only one you can use as a backup that's really working as it should. I'm sure they are happy about some additional hardcore GPUs which are running on this project :D

Or MilkyWay@home; multiple GPUs work great there too.
____________
Greetings from TJ

MrJo
Joined: 18 Apr 14
Posts: 43
Credit: 1,192,135,172
RAC: 2,475
Level
Met
Message 37677 - Posted: 19 Aug 2014 | 6:14:07 UTC

Everything runs fine again. Thanks to everyone involved in solving the problem!
____________
Regards, Josef

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 456
Credit: 810,073,458
RAC: 0
Level
Glu
Message 37678 - Posted: 19 Aug 2014 | 6:16:40 UTC - in response to Message 37674.
Last modified: 19 Aug 2014 | 6:20:31 UTC

So for science reasons, Einstein is the only one you can use as a backup that's really working as it should. I'm sure they are happy about some additional hardcore GPUs which are running on this project :D

Or MilkyWay@home; multiple GPUs work great there too.


I'm really not using Nvidia GPUs where DP performance is needed ;) MW has a 7950 of mine running 24/7; that must be enough until POEM has enough ATI work units ^^

But yes, everything is running fine again now, great! :)
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Crunching for my deceased Dog who had "good" Braincancer..

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 37680 - Posted: 19 Aug 2014 | 8:56:07 UTC

There is now plenty of space on the server so no one should be having upload/download problems. Of course, please let us know if any of you still has problems. We still have work to do to ensure this doesn't happen again, but that's for us to worry about.

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 37681 - Posted: 19 Aug 2014 | 9:37:49 UTC - in response to Message 37680.
Last modified: 19 Aug 2014 | 9:40:48 UTC

Thanx to all the hard work you all did to sort it out !!!

mikey
Joined: 2 Jan 09
Posts: 269
Credit: 247,832,990
RAC: 377,122
Level
Leu
Message 37683 - Posted: 19 Aug 2014 | 11:46:04 UTC - in response to Message 37642.

Well - it's Monday afternoon in Spain and still no reaction. Clearly no-one is minding the shop...


Yeah I am out of work and have units to send back, I set my pc's to no new tasks and attached to a different project.


There's no reason to set No New Tasks. Just let BOINC's transmit-retry-back-off do its job until GPUGrid can complete the transfers, and make sure you are attached to backup-GPU projects.


I did that to avoid problems with the project that IS sending me work and its own deadlines. I don't like units hanging out in my cache just sitting there, so I keep it pretty short, about a day total. I have now finished the 'other' project's units and am back here again. I KNOW...I am a micro-manager, but it is who I am.

Profile Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,488,523,379
RAC: 398,078
Level
Arg
Message 38357 - Posted: 8 Oct 2014 | 3:50:29 UTC

Uploads slowed to a crawl a few hours ago and are getting slower and slower. Other projects are uploading normally. Checked my connection with Ookla Speedtest and the speed is normal. WU uploads are even starting to stall and lose connection:

2422 GPUGRID 10-07-14 22:39 Started upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9
2423 GPUGRID 10-07-14 22:39 [error] Error reported by file upload server: [2x2162-NOELIA_5MG-2-3-RND6808_0_9] locked by file_upload_handler PID=22012
2424 GPUGRID 10-07-14 22:39 Temporarily failed upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9: transient upload error
2425 GPUGRID 10-07-14 22:39 Backing off 00:02:50 on upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9
2426 GPUGRID 10-07-14 22:39 Started upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9
2427 GPUGRID 10-07-14 22:40 Temporarily failed upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9: transient HTTP error
2428 GPUGRID 10-07-14 22:40 Backing off 00:06:50 on upload of 2x2162-NOELIA_5MG-2-3-RND6808_0_9
2429 10-07-14 22:40 Project communication failed: attempting access to reference site
2430 10-07-14 22:40 Internet access OK - project servers may be temporarily down.

Profile Beyond
Joined: 23 Nov 08
Posts: 1073
Credit: 4,488,523,379
RAC: 398,078
Level
Arg
Message 38359 - Posted: 8 Oct 2014 | 5:15:43 UTC

Things seem to be normal again. Thanks!

Simba123
Joined: 5 Dec 11
Posts: 147
Credit: 69,970,684
RAC: 0
Level
Thr
Message 38972 - Posted: 20 Nov 2014 | 9:17:13 UTC

Yo Guys/Gals. Is there a problem with the server again?
Uploads are maxing at 8 KBps and then stalling.

My log shows this error:
20/11/2014 6:40:32 PM | GPUGRID | Temporarily failed upload of 20mgx465-NOELIA_20MG2-25-50-RND2603_0_9: transient HTTP error

No problems with the internet from this machine, and WCG tasks have uploaded at 50-80 KBps while GPUGrid has been stuck.

I have 2 tasks struggling to upload at the moment.

HELP!!!

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 39407 - Posted: 7 Jan 2015 | 4:58:19 UTC

Been like this for the past hour or so.

1/6/2015 9:45:10 PM | GPUGRID | Backing off 05:36:00 on upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9
1/6/2015 9:46:42 PM | GPUGRID | Started upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9
1/6/2015 9:46:44 PM | GPUGRID | [error] Error reported by file upload server: [1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9] locked by file_upload_handler PID=17978
1/6/2015 9:46:44 PM | GPUGRID | Temporarily failed upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9: transient upload error
1/6/2015 9:46:44 PM | GPUGRID | Backing off 04:24:32 on upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9
1/6/2015 9:58:25 PM | GPUGRID | Started upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9
1/6/2015 9:58:27 PM | GPUGRID | [error] Error reported by file upload server: [1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9] locked by file_upload_handler PID=17978
1/6/2015 9:58:27 PM | GPUGRID | Temporarily failed upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9: transient upload error
1/6/2015 9:58:27 PM | GPUGRID | Backing off 04:27:53 on upload of 1x81-GERARD_CXCL12_LIG10-4-5-RND5536_0_9

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,827,795,914
RAC: 445,250
Level
His
Message 39408 - Posted: 7 Jan 2015 | 12:41:55 UTC - in response to Message 39407.

Sounds similar to this communication/server problem described by mundayweb,
(03. Error code -121 to -130 explained).

    ERR_UPLOAD_TRANSIENT -127

    First an explanation what transient means. Transient refers to a module that, once loaded into main memory, is expected to remain in memory for a short time.

    This is a server error.
    The file you are trying to upload is locked on the server. The file_upload_handler put an advisory lock on the file, to prevent other file upload handlers to write to the file.

    This can only be fixed by the project.

    Extra messages in 5.8 branch of BOINC.
    can't open file - Advisory file locking is not guaranteed reliable when used with stream buffered IO.
    can't lock file - File Upload Handler can't put an advisory lock on the file to prevent it being overwritten by other FUHs.
    Maintenance underway: file uploads are temporarily disabled. - You can't upload as the server is down for maintenance.


http://boincfaq.mundayweb.com/index.php?viewCat=3

I can upload and report work, as can others.
Is this specific to one task and one system?
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 39425 - Posted: 9 Jan 2015 | 6:19:32 UTC - in response to Message 39408.

It was and it cleared about 2 hours later.

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40101 - Posted: 8 Feb 2015 | 5:21:53 UTC - in response to Message 39425.
Last modified: 8 Feb 2015 | 5:43:41 UTC

It appears this problem has resurfaced. It seems the AMD uploads are processing fine, but the Nvidia ones are not.

So far this is on two different systems:

2/7/2015 10:23:11 PM | GPUGRID | Started upload of e2s134_e1s85f79-GERARD_BENTRYP_GAAMPCGEN2-0-1-RND0868_0_9
2/7/2015 10:23:12 PM | GPUGRID | [error] Error reported by file upload server: [e2s134_e1s85f79-GERARD_BENTRYP_GAAMPCGEN2-0-1-RND0868_0_9] locked by file_upload_handler PID=27570
2/7/2015 10:23:12 PM | GPUGRID | Temporarily failed upload of e2s134_e1s85f79-GERARD_BENTRYP_GAAMPCGEN2-0-1-RND0868_0_9: transient upload error
2/7/2015 10:23:12 PM | GPUGRID | Backing off 00:21:54 on upload of e2s134_e1s85f79-GERARD_BENTRYP_GAAMPCGEN2-0-1-RND0868_0_9


When I've seen the transient upload error on other projects it seems it is a case of a lack of disk space.

It was and it cleared about 2 hours later.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 783
Credit: 1,393,314,595
RAC: 1,221,607
Level
Met
Message 40104 - Posted: 8 Feb 2015 | 9:26:06 UTC - in response to Message 40101.

When I've seen the transient upload error on other projects it seems it is a case of a lack of disk space.

When the problem is a lack of disk space, it says so in the event log message. We saw both of these at LHC recently:

Server is out of disk space
can't write file ... No space left on server

There are plenty of other failure modes which aren't reported explicitly, but the cause can be discovered by enabling <http_debug> logging for the duration.

In your case, the single file you wanted to upload was "locked by file_upload_handler PID=27570" - probably a comms glitch hadn't cleared properly. As you found, the process times out after a while, the lock is released, and the upload can proceed normally. It was an individual problem with that one single file, not an indication of server trouble in general.
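
For anyone unsure how to switch on the <http_debug> logging mentioned above: as far as I know it is a log flag in the client's cc_config.xml file in the BOINC data directory. The Python sketch below only builds that XML as a string; the function name build_cc_config is made up, and where you save the result (and how you tell the client to re-read its config files) depends on your installation.

```python
import xml.etree.ElementTree as ET


def build_cc_config(http_debug: bool = True) -> str:
    """Build a minimal cc_config.xml body with the http_debug log flag set."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    # 1 turns the flag on, 0 turns it off again once you are done debugging.
    ET.SubElement(log_flags, "http_debug").text = "1" if http_debug else "0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_cc_config())
```

If you already have a cc_config.xml with other options in it, merge the <http_debug> line into its existing <log_flags> section rather than overwriting the file, and turn the flag off again afterwards because the log gets very noisy.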

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40120 - Posted: 9 Feb 2015 | 0:12:59 UTC - in response to Message 40104.

Richard -- thanks for the reply -- the problem uploads on both computers did clear last night.

I did find it curious that I encountered it on two different computers at around the same time.

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 40512 - Posted: 19 Mar 2015 | 0:12:15 UTC

Upload issue.

3/18/2015 8:12:17 PM | GPUGRID | Started upload of e10s60_e8s18f109-NOELIA_27x3-1-2-RND6719_0_0
3/18/2015 8:12:19 PM | GPUGRID | Temporarily failed upload of e10s60_e8s18f109-NOELIA_27x3-1-2-RND6719_0_0: transient HTTP error
3/18/2015 8:12:19 PM | GPUGRID | Backing off 04:37:43 on upload of e10s60_e8s18f109-NOELIA_27x3-1-2-RND6719_0_0
3/18/2015 8:12:22 PM | | Project communication failed: attempting access to reference site
3/18/2015 8:12:23 PM | | Internet access OK - project servers may be temporarily down.

Just Me?

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40513 - Posted: 19 Mar 2015 | 1:31:10 UTC - in response to Message 40512.

I got that on a couple of systems -- not sure. For me to resolve it I went to the newest version of BOINC -- for some reason that resolved it on both systems.

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 40514 - Posted: 19 Mar 2015 | 1:44:24 UTC - in response to Message 40513.

THANX Barry. That did work. Not sure why, but it was time to upgrade anyway.

RaymondFO*
Joined: 22 Nov 12
Posts: 72
Credit: 10,191,216,989
RAC: 8,899,766
Level
Trp
Message 40515 - Posted: 19 Mar 2015 | 2:23:59 UTC

Had the same problem and when I upgraded from 7.0.28 to 7.4.36 everything worked again.

ROBtheLIONHEART
Joined: 21 Nov 13
Posts: 34
Credit: 636,026,131
RAC: 0
Level
Lys
Message 40516 - Posted: 19 Mar 2015 | 2:34:57 UTC
Last modified: 19 Mar 2015 | 2:36:09 UTC

While I was troubleshooting, I did a project update and noticed something in the log about a certificate error. Maybe it's related to the update to HTTPS?

megazoid
Joined: 4 Mar 15
Posts: 2
Credit: 8,626,175
RAC: 0
Level
Ser
Message 40517 - Posted: 19 Mar 2015 | 6:50:30 UTC
Last modified: 19 Mar 2015 | 7:11:19 UTC

I'm also having upload problems. I'm on the latest BOINC; the largest file gets to various percentages and then stops.

I rebooted the PC and managed to upload the large file, but I still have problems with the small ones.

19/03/2015 07:12:04 | GPUGRID | Temporarily failed upload of e16s55_e1s204f48-NOELIA_1mgx1-1-4-RND3607_0_1: transient HTTP error

JugNut
Joined: 27 Nov 11
Posts: 11
Credit: 1,003,741,897
RAC: 0
Level
Met
Message 40518 - Posted: 19 Mar 2015 | 7:29:23 UTC - in response to Message 40517.

Yea, same here. The log message I received says...

29741 GPUGRID 19/03/2015 6:30:15 PM Scheduler request failed: Peer certificate cannot be authenticated with given CA certificates

megazoid
Joined: 4 Mar 15
Posts: 2
Credit: 8,626,175
RAC: 0
Level
Ser
Message 40519 - Posted: 19 Mar 2015 | 7:50:12 UTC

Are you running Windows or Linux, JugNut? It sounds like you have a possible problem with your ca-bundle.crt certificate. Are you running the latest version?

JugNut
Joined: 27 Nov 11
Posts: 11
Credit: 1,003,741,897
RAC: 0
Level
Met
Message 40520 - Posted: 19 Mar 2015 | 8:02:25 UTC
Last modified: 19 Mar 2015 | 8:03:47 UTC

Yea, that was it. I updated to the latest BOINC version & it's now updating fine :)

(win 7 x64)

HA-SOFT, s.r.o.
Joined: 3 Oct 11
Posts: 100
Credit: 4,855,582,826
RAC: 0
Level
Arg
Message 40526 - Posted: 19 Mar 2015 | 15:45:41 UTC - in response to Message 40520.
Last modified: 19 Mar 2015 | 15:45:53 UTC

I can not upload any task. HTTP transient error...

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40527 - Posted: 19 Mar 2015 | 17:23:33 UTC - in response to Message 40514.

I've encountered it now four separate times -- once I lost a completed work unit -- due to a configuration issue with BOINC I suspect.


Perhaps the change to HTTPS is causing a bit of confusion with the uploads.

Again, each time I moved to the current BOINC client, the uploads completed.

Profile Crunch3r
Joined: 16 Mar 09
Posts: 3
Credit: 207,697,314
RAC: 0
Level
Leu
Message 40528 - Posted: 19 Mar 2015 | 17:45:36 UTC - in response to Message 40527.
Last modified: 19 Mar 2015 | 17:48:32 UTC

The only reason for the upload issues is an outdated ca-certificate.
There's no reason to upgrade to a new boinc version. Simply replace the ca-bundle.crt file in the boinc program directory with the one below (extracted from boinc 7.4.36), restart boinc and that's it.

download -> http://www.boincunited.org/ca-bundle.crt
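
For anyone who would rather script the certificate swap described above, here is a minimal Python sketch. It assumes you have already obtained a newer ca-bundle.crt from a source you trust; the function name replace_ca_bundle is made up, and the location of the BOINC program directory differs between installations, so pass in your own paths. It keeps a backup copy of the old bundle so the change can be undone.

```python
import shutil
from pathlib import Path
from typing import Optional


def replace_ca_bundle(new_bundle: Path, boinc_dir: Path) -> Optional[Path]:
    """Copy new_bundle over boinc_dir/ca-bundle.crt, backing up the old file.

    Returns the path of the backup, or None if there was no old bundle.
    """
    if not new_bundle.is_file():
        raise FileNotFoundError(f"new bundle not found: {new_bundle}")

    target = boinc_dir / "ca-bundle.crt"
    backup = None
    if target.exists():
        # Keep the old bundle next to the new one so the swap is reversible.
        backup = boinc_dir / "ca-bundle.crt.bak"
        shutil.copy2(target, backup)

    shutil.copy2(new_bundle, target)
    return backup
```

You will probably need administrator rights to write to the program directory, and the post above says to restart BOINC afterwards.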
____________

Join BOINC United now!

HA-SOFT, s.r.o.
Joined: 3 Oct 11
Posts: 100
Credit: 4,855,582,826
RAC: 0
Level
Arg
Message 40529 - Posted: 19 Mar 2015 | 18:04:50 UTC - in response to Message 40528.

My upload starts ok, but after some time (10, 20, 50% progress) it stops. I think it's not related to cert issue.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 783
Credit: 1,393,314,595
RAC: 1,221,607
Level
Met
Message 40530 - Posted: 19 Mar 2015 | 18:28:29 UTC - in response to Message 40528.

The only reason for the upload issues is an outdated ca-certificate.
There's no reason to upgrade to a new boinc version. Simply replace the ca-bundle.crt file in the boinc program directory with the one below (extracted from boinc 7.4.36), restart boinc and that's it.

download -> http://www.boincunited.org/ca-bundle.crt

Thanks - that did the trick for my BOINC v6.12.34 (although I pulled the v7.4.36 bundle from another machine on my own network). Didn't even need to restart the (service-mode) client.

But I think we're not out of the woods yet, because I'm getting multiple redirects (which may account for the stop/restart problem just reported):

19-Mar-2015 18:08:23 [---] [http] [ID#5324] Info: About to connect() to www.gpugrid.org port 80 (#3)
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Info: Trying 193.146.190.61...
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Info: Connected to www.gpugrid.org (193.146.190.61) port 80 (#3)
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: POST /PS3GRID_cgi/file_upload_handler HTTP/1.1
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.12.34)
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: Host: www.gpugrid.org
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: Accept: */*
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: Accept-Encoding: deflate, gzip
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: Content-Type: application/x-www-form-urlencoded
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server: Content-Length: 318
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Sent header to server:
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: HTTP/1.1 301 Moved Permanently
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Date: Thu, 19 Mar 2015 18:05:33 GMT
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Server: Apache/2.2.3 (CentOS)
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Info: the ioctl callback returned 0
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Location: https://www.gpugrid.net/PS3GRID_cgi/file_upload_handler
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Cache-Control: max-age=3600
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Expires: Thu, 19 Mar 2015 19:05:33 GMT
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Content-Length: 343
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server: Content-Type: text/html; charset=iso-8859-1
19-Mar-2015 18:08:23 [---] [http] [ID#5324] Received header from server:
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Ignoring the response-body
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Connection #3 to host www.gpugrid.org left intact
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Issue another request to this URL: 'https://www.gpugrid.net/PS3GRID_cgi/file_upload_handler'
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Re-using existing connection! (#1) with host www.gpugrid.net
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Connected to www.gpugrid.net (193.146.190.61) port 443 (#1)
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: POST /PS3GRID_cgi/file_upload_handler HTTP/1.1
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.12.34)
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Host: www.gpugrid.net
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Accept: */*
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Accept-Encoding: deflate, gzip
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Referer: http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Content-Type: application/x-www-form-urlencoded
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server: Content-Length: 318
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Sent header to server:
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: HTTP/1.1 200 OK
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Date: Thu, 19 Mar 2015 18:05:33 GMT
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Server: Apache/2.2.3 (CentOS)
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Cache-Control: max-age=300
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Expires: Thu, 19 Mar 2015 18:10:33 GMT
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Transfer-Encoding: chunked
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server: Content-Type: text/plain; charset=UTF-8
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Received header from server:
19-Mar-2015 18:08:24 [---] [http] [ID#5324] Info: Connection #1 to host www.gpugrid.net left intact
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Info: Re-using existing connection! (#3) with host www.gpugrid.org
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Info: Connected to www.gpugrid.org (193.146.190.61) port 80 (#3)
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: POST /PS3GRID_cgi/file_upload_handler HTTP/1.1
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.12.34)
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Host: www.gpugrid.org
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Accept: */*
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Accept-Encoding: deflate, gzip
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Content-Type: application/x-www-form-urlencoded
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Content-Length: 411278
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server: Expect: 100-continue
19-Mar-2015 18:08:25 [---] [http] [ID#5324] Sent header to server:
19-Mar-2015 18:08:26 [---] [http] [ID#5324] Info: Done waiting for 100-continue
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: HTTP/1.1 301 Moved Permanently
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Date: Thu, 19 Mar 2015 18:05:35 GMT
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Server: Apache/2.2.3 (CentOS)
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Info: the ioctl callback returned 0
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Location: https://www.gpugrid.net/PS3GRID_cgi/file_upload_handler
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Cache-Control: max-age=3600
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Expires: Thu, 19 Mar 2015 19:05:35 GMT
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Content-Length: 343
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Connection: close
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: Content-Type: text/html; charset=iso-8859-1
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server:
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Info: Closing connection #3
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Info: Issue another request to this URL: 'https://www.gpugrid.net/PS3GRID_cgi/file_upload_handler'
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Info: Re-using existing connection! (#1) with host www.gpugrid.net
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Info: Connected to www.gpugrid.net (193.146.190.61) port 443 (#1)
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: POST /PS3GRID_cgi/file_upload_handler HTTP/1.1
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: User-Agent: BOINC client (windows_intelx86 6.12.34)
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Host: www.gpugrid.net
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Accept: */*
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Accept-Encoding: deflate, gzip
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Referer: http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Content-Type: application/x-www-form-urlencoded
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Content-Length: 411278
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server: Expect: 100-continue
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Sent header to server:
19-Mar-2015 18:08:35 [---] [http] [ID#5324] Received header from server: HTTP/1.1 100 Continue
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: HTTP/1.1 200 OK
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Date: Thu, 19 Mar 2015 18:05:44 GMT
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Server: Apache/2.2.3 (CentOS)
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Cache-Control: max-age=300
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Expires: Thu, 19 Mar 2015 18:10:44 GMT
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Transfer-Encoding: chunked
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server: Content-Type: text/plain; charset=UTF-8
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Received header from server:
19-Mar-2015 18:08:45 [---] [http] [ID#5324] Info: Connection #1 to host www.gpugrid.net left intact

GPUGRID Role account
Joined: 15 Feb 07
Posts: 134
Credit: 1,349,535,983
RAC: 0
Level
Met
Message 40531 - Posted: 19 Mar 2015 | 18:58:40 UTC

1) Clients that are configured to use the HTTP presentation of GPUGRID should no longer be redirected to HTTPS (for a while last night they were, just to see what would break).

2) If your uploads are failing, this is probably because the receiving process on the server will only run for so long. I'll need to know not the percentage completed but the wall time that the upload had been running for.

3) 301 redirects are not a problem - the client knows to follow them.
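
To check from a client's http_debug log that a 301 really was followed through to a final 200, the status lines and Location headers can be pulled out in order. This is only a rough sketch: the function name `redirect_chain` and the two regular expressions are my own assumptions, written against the log format posted earlier in this thread, not anything shipped with BOINC.

```python
import re

# Assumed patterns, based on the "[http]" debug lines posted in this thread.
STATUS_RE = re.compile(r"Received header from server: HTTP/\d\.\d (\d{3})")
LOCATION_RE = re.compile(r"Received header from server: Location: (\S+)")


def redirect_chain(log_lines):
    """Return [(status, location), ...] in log order.

    location is None for responses that carried no Location header.
    """
    chain = []
    for line in log_lines:
        status = STATUS_RE.search(line)
        if status:
            # A new response starts with its status line.
            chain.append([int(status.group(1)), None])
            continue
        location = LOCATION_RE.search(line)
        if location and chain:
            # Attach the redirect target to the most recent response.
            chain[-1][1] = location.group(1)
    return [tuple(entry) for entry in chain]
```

A chain that ends in a 200 after one or more 301 entries suggests the redirect itself was handled and any failure lies elsewhere.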

Matt

HA-SOFT, s.r.o.
Joined: 3 Oct 11
Posts: 100
Credit: 4,855,582,826
RAC: 0
Level
Arg
Message 40532 - Posted: 19 Mar 2015 | 19:17:15 UTC - in response to Message 40531.

5518 GPUGRID 19.3.2015 20:17:53 Started upload of 754-NOELIA_POT-8-13-RND0792_0_0
5519 GPUGRID 19.3.2015 20:18:00 Finished upload of 754-NOELIA_POT-8-13-RND0792_0_0
5520 GPUGRID 19.3.2015 20:18:06 Started upload of 754-NOELIA_POT-8-13-RND0792_0_1
5521 GPUGRID 19.3.2015 20:18:15 Finished upload of 754-NOELIA_POT-8-13-RND0792_0_1
5522 GPUGRID 19.3.2015 20:18:34 Started upload of 754-NOELIA_POT-8-13-RND0792_0_9
5523 19.3.2015 20:19:09 Project communication failed: attempting access to reference site
5524 GPUGRID 19.3.2015 20:19:09 Temporarily failed upload of 754-NOELIA_POT-8-13-RND0792_0_9: transient HTTP error
5525 GPUGRID 19.3.2015 20:19:09 Backing off 00:04:41 on upload of 754-NOELIA_POT-8-13-RND0792_0_9
5526 19.3.2015 20:19:10 Internet access OK - project servers may be temporarily down.
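
Since the request was for wall time rather than percentage, here is a rough sketch of how the elapsed time per upload could be read off an event log like the one above. The function name `upload_durations` and the regular expression are assumptions of mine, and the timestamp pattern (`19.3.2015 20:18:34`) matches only this locale's log format; logs posted in other date formats would need a different pattern.

```python
import re
from datetime import datetime

# Assumed pattern for event-log lines in the "d.m.yyyy HH:MM:SS" format.
EVENT_RE = re.compile(
    r"(\d{1,2}\.\d{1,2}\.\d{4} \d{2}:\d{2}:\d{2}) "
    r"(Started|Finished|Temporarily failed) upload of ([^\s:]+)"
)


def upload_durations(log_lines):
    """Return [(file_name, outcome, seconds), ...] for each completed or failed upload."""
    started = {}
    results = []
    for line in log_lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%d.%m.%Y %H:%M:%S")
        event, name = match.group(2), match.group(3)
        if event == "Started":
            started[name] = when
        elif name in started:
            # Wall time between the start and the finish/failure message.
            elapsed = (when - started.pop(name)).total_seconds()
            results.append((name, event, elapsed))
    return results
```

Run over the lines above, this pairs each "Started upload" with its matching "Finished" or "Temporarily failed" line, so the failed file can be reported with the number of seconds it ran before the error.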

klepel
Joined: 23 Dec 09
Posts: 132
Credit: 1,803,402,716
RAC: 1,518,328
Level
His
Message 40533 - Posted: 19 Mar 2015 | 20:42:47 UTC

Hi, all 3 of my machines have shown the "transient HTTP error" since yesterday evening. I have not changed anything!

BUT THERE IS NOW A PARTICULAR ERROR MESSAGE: "Scheduler request failed: Peer certificate cannot be authenticated with known CA certificates" DOES ANYBODY HAVE THE SAME PROBLEM?

The log of one of them looks like this:

"19/03/2015 10:07:10 a.m. | GPUGRID | Sending scheduler request: Requested by project.
19/03/2015 10:07:10 a.m. | GPUGRID | Requesting new tasks for NVIDIA
19/03/2015 10:07:12 a.m. | GPUGRID | Scheduler request failed: Peer certificate cannot be authenticated with known CA certificates
19/03/2015 10:07:16 a.m. | | Project communication failed: attempting access to reference site
19/03/2015 10:07:19 a.m. | | Internet access OK - project servers may be temporarily down.
19/03/2015 01:25:04 p.m. | GPUGRID | Started upload of e18s8_e11s4f108-GERARD_CXCL12_LIG22_CGENFF2-0-2-RND4205_0_9
19/03/2015 01:25:04 p.m. | GPUGRID | Started upload of e16s6_e1s31f254-GERARD_CXCL12_Ctl11_GAAMPGAFF1-1-2-RND7394_0_0
19/03/2015 01:25:06 p.m. | GPUGRID | Temporarily failed upload of e18s8_e11s4f108-GERARD_CXCL12_LIG22_CGENFF2-0-2-RND4205_0_9: transient HTTP error
19/03/2015 01:25:06 p.m. | GPUGRID | Backing off 3 hr 29 min 2 sec on upload of e18s8_e11s4f108-GERARD_CXCL12_LIG22_CGENFF2-0-2-RND4205_0_9
19/03/2015 01:25:06 p.m. | GPUGRID | Temporarily failed upload of e16s6_e1s31f254-GERARD_CXCL12_Ctl11_GAAMPGAFF1-1-2-RND7394_0_0: transient HTTP error
19/03/2015 01:25:06 p.m. | GPUGRID | Backing off 23 min 0 sec on upload of e16s6_e1s31f254-GERARD_CXCL12_Ctl11_GAAMPGAFF1-1-2-RND7394_0_0
19/03/2015 01:25:09 p.m. | | Project communication failed: attempting access to reference site
19/03/2015 01:25:11 p.m. | | Internet access OK - project servers may be temporarily down.
19/03/2015 01:37:28 p.m. | GPUGRID | Sending scheduler request: Requested by project.
19/03/2015 01:37:28 p.m. | GPUGRID | Requesting new tasks for NVIDIA
19/03/2015 01:37:30 p.m. | GPUGRID | Scheduler request failed: Peer certificate cannot be authenticated with known CA certificates
19/03/2015 01:37:34 p.m. | | Project communication failed: attempting access to reference site
19/03/2015 01:37:36 p.m. | | Internet access OK - project servers may be temporarily down.
19/03/2015 03:17:52 p.m. | GPUGRID | update requested by user
19/03/2015 03:17:52 p.m. | GPUGRID | Sending scheduler request: Requested by user.
19/03/2015 03:17:52 p.m. | GPUGRID | Requesting new tasks for NVIDIA
19/03/2015 03:17:54 p.m. | GPUGRID | Scheduler request failed: Peer certificate cannot be authenticated with known CA certificates
19/03/2015 03:17:57 p.m. | | Project communication failed: attempting access to reference site
19/03/2015 03:18:01 p.m. | | Internet access OK - project servers may be temporarily down.
19/03/2015 03:19:39 p.m. | GPUGRID | Fetching scheduler list
19/03/2015 03:19:44 p.m. | | Project communication failed: attempting access to reference site
19/03/2015 03:19:46 p.m. | | Internet access OK - project servers may be temporarily down."

Please advise! Thanks!

Carl
Joined: 2 May 13
Posts: 7
Credit: 829,818,489
RAC: 977,261
Level
Glu
Message 40534 - Posted: 19 Mar 2015 | 21:29:20 UTC - in response to Message 40533.

Yes, I had the same message. I upgraded to the latest version of BOINC and now all is well.

Profile Blurf
Joined: 20 Dec 11
Posts: 9
Credit: 28,974,143
RAC: 0
Level
Val
Message 40537 - Posted: 20 Mar 2015 | 2:57:14 UTC

Files uploaded but now I can't report them

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40539 - Posted: 20 Mar 2015 | 4:31:46 UTC - in response to Message 40537.
Last modified: 20 Mar 2015 | 4:33:35 UTC

Blurf, similar problem -- now on the FIFTH of my computers, and ONLY with GPUGrid.

On the other four, updating to the newest version of the BOINC client resolved the GPUGrid-specific problem.

This fifth computer was already on that version. A repair of that version didn't resolve the problem, an uninstall and reinstall didn't resolve the problem, and a replacement of the credentials file didn't resolve the problem.

Since it is specific to GPUGrid (hoping the GPUGrid folks realize that), that particular computer is now running Collatz for GPU processing instead.

Oh, a reset of the project didn't solve the problem, and deleting the project and reattaching didn't solve the problem either.

Seems it is definitely a project specific issue.

Perhaps if it affects everyone it will move to the top of the 'what the heck is going on list'.

That's two reported work units I've lost in the past 24 hours -- and two that the folks at the project are not going to get, unfortunately.

I REALLY hope the folks at the project figure this one out.

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40541 - Posted: 20 Mar 2015 | 6:12:14 UTC

By the way, I took a ca file from another workstation which was (or at least still is) working, and brought it over to the workstation which doesn't believe in GPUGrid any more.

No joy: it said the server was offline (it isn't).

Something clearly is weird with GPUGrid and its cert handling or addressing -- in some cases it gets resolved by updating BOINC to the newest version -- though the cert file is identical.

The one 'I don't believe GPUGrid server is alive' workstation uses a GTX-650 -- so I guess it isn't that much of a loss.

Still I wonder what in blazes is going on here.

I'm glad others are reporting the problem as that likely moves it up the 'let's look at this' list.

klepel
Joined: 23 Dec 09
Posts: 132
Credit: 1,803,402,716
RAC: 1,518,328
Level
His
Message 40546 - Posted: 20 Mar 2015 | 14:57:13 UTC

The good news first: All my result files have now uploaded to the GPUGRID servers!

The bad news is: They are still listed on the web site as running / calculating (in Spanish it is “En progreso”) because there is no communication with the reference site on two of my three computers:
“20/03/2015 09:39:10 a.m. | GPUGRID | update requested by user
20/03/2015 09:39:12 a.m. | GPUGRID | Fetching scheduler list
20/03/2015 09:39:17 a.m. | | Project communication failed: attempting access to reference site
20/03/2015 09:39:21 a.m. | | Internet access OK - project servers may be temporarily down.”

It is very frustrating that the computers have finished the WUs and I will pay for them on the electric bill, but in the end the WUs will not count, as they are not reported as finished.

BarryAZ
Joined: 16 Apr 09
Posts: 159
Credit: 780,146,469
RAC: 1,265
Level
Glu
Message 40547 - Posted: 20 Mar 2015 | 16:57:37 UTC - in response to Message 40546.

I am guessing here in the absence of data, judging just from the timing, that something done regarding the shift back and forth between HTTP and HTTPS resulted in 'confusion' regarding the certification of the site (the CA certs).

Even though the cert file is the same as it was a year ago, at least *some* workstation installations see the GPUGrid site as no longer certified (out of date certs).

*Sometimes* when this has happened, particularly if you have a somewhat older version of the BOINC client, updating the BOINC client resolves this.

If you have the current version of the BOINC client and the 'can't connect' problem shows up, it seems like there is no resolution.

However, it might also be a timing thing.

Last evening, when I tried to add the GPUGrid project back in on what was my 5th workstation with the problem (and only one that already had the current BOINC client), I was not successful -- trying each and every approach discussed in this thread.

This morning on this same workstation, I was successful in connecting and downloading. I won't know if reporting works until tomorrow evening as the workstation is running a GTX 650 which has about a 34 hour process time for the large workunits.

It seems, though, since the problem has been reported by several users and is specific to GPUGrid, that there was some change at the project, which perhaps will clear over time. It isn't clear that folks at the project have the resources to evaluate this. (If it is a persisting problem for an increasing user population, it might get higher priority, I suspect.)
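
For anyone who wants to test the "site no longer certified" theory outside of BOINC, here is a rough sketch that attempts a TLS handshake against a chosen CA bundle and reports whether verification passes. The helper names `make_context` and `check_certificate` are my own, and the path to the client's CA file is something you would have to supply from your own installation. This uses Python's `ssl` module rather than the HTTP library inside the BOINC client, so a pass or fail here only approximates what the client sees.

```python
import socket
import ssl


def make_context(ca_bundle_path=None):
    """Build a verifying TLS context.

    With a path, only the CAs in that file are trusted; with None,
    the platform's default trust store is used.
    """
    return ssl.create_default_context(cafile=ca_bundle_path)


def check_certificate(host, port=443, ca_bundle_path=None, timeout=10.0):
    """Return (True, "") if the handshake verifies, else (False, reason)."""
    context = make_context(ca_bundle_path)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, ""
    except ssl.SSLCertVerificationError as exc:
        # The certificate chain or hostname did not verify against the bundle.
        return False, exc.verify_message
    except (ssl.SSLError, OSError) as exc:
        # Other TLS failures, DNS errors, refused connections, timeouts.
        return False, str(exc)
```

Calling `check_certificate("www.gpugrid.net", ca_bundle_path=...)` once with the CA file copied from a working machine and once with the one from the failing machine would show whether the bundle itself makes any difference.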
