Message boards : Multicore CPUs : "Hanging" WU?
Author | Message |
---|---|
Well, I have one of the new CELLGA_SHORT WUs running for over one day and 5 hours now, still showing 0% done. Is it safe to assume the CPU time is wasted and the WU will never complete? ;-) Should I abort it? It's this one - http://www.gpugrid.net/result.php?resultid=555615 | |
ID: 8708 | |
Yes, please, abort it. It was expected to run for roughly 2 hours. Thanks for reporting the problem! | |
ID: 8717 | |
Ok, thanks! Aborted it now after 1 day and 16 hours... My nice credits... ;-) | |
ID: 8725 | |
Maybe it wasn't a problem with the task but with my PS3. | |
ID: 8727 | |
Hmmm, after 20 minutes the next task was hanging, this time a TONI_CELLGA_MED - http://www.gpugrid.net/result.php?resultid=565176. I aborted it, and will try another task... | |
ID: 8732 | |
If a workunit hangs, you can just try restarting the machine. | |
ID: 8817 | |
Thanks Gianni, but I think I just found the problem... | |
ID: 8819 | |
Have same problem; | |
ID: 8822 | |
I'm trying to figure out the pattern behind those failures. | |
ID: 8823 | |
Thanks for looking into the problem Toni! | |
ID: 8824 | |
Thanks to you for reporting the symptoms: "runaway" processes may in fact be related to anomalous task suspend/resume, triggered by the CPU benchmarks. | |
ID: 8827 | |
Hi TG, | |
ID: 8908 | |
Could it be that this problem occurs when more than one project is using the PS3 (maybe it's related to task-switching)? I've been crunching on my PS3 for a couple of months without problems, but I hit this problem yesterday once I attached BOINC to yoyo@home. I aborted the WU and got another one, but this one seems to be hanging too. Here are the links to the WUs: | |
ID: 8938 | |
Yes, we believe that task switching is an issue: the accelerated processors are somehow not properly "freed" upon process termination, probably a shortcoming of the platform. :-( Do just the PS3GRID WUs hang, or also those from other projects? | |
ID: 8987 | |
I think all the WUs, from PS3Grid and others, hang; at least that's what happened to me. I was looking around and found a newer BOINC client optimized for the PS3 (http://www.dotsch.de/boinc/boinc6219_10.linux-ps3.tar.gz). Unfortunately it's a command-line version; I tried to get it to work with BOINC Manager, but it didn't work. Maybe an update to the current PS3Grid BOINC client (whether using the one mentioned above or otherwise) would fix the problem? | |
ID: 9037 | |
Just to be helpful to another BOINCer: at least on my machine, the hanging WUs seem to be specific to gpugrid. If you set the resource share for yoyo to, say, 3E+38 (a huge number) and gpugrid to 1, then your PS3 will happily crunch only yoyo WUs and will not hang. The problem only seems to occur when the PS3 starts working on a gpugrid WU. I am not sure, but the problem also seems more pronounced with the memory stick version (what I run) than with YDL, though I think it occurs with both at times. | |
ID: 9098 | |
just joined the gpugrid; first wu is hung 0.0% | |
ID: 9156 | |
I guess I picked a bad day to join. I booted; no worky. aborted. started another, no worky. then I see a message suggesting I joined the project with the wrong link, the one from the website. so I tried again, new wu: Thu 30 Apr 2009 05:20:10 PM EDT|GPUGRID|Starting task 638000-IBUCH_GRAUS-1-100-RND2278_1 using cellmd2 version 503 | |
ID: 9160 | |
Hi sam - the progress bar does not show the actual progression for PS3 WUs. All WUs last circa 12 hours each. | |
ID: 9174 | |
1 WU in the week I've crunched for gpugrid, and it hung: +16 hrs and 0% | |
ID: 9189 | |
I have promised to be political and will simply say the PS3 portion of this project is experiencing "growing" pains, and gpugrid WUs often hang. The problem is not with your setup (I wish that were made clearer, so people don't waste their time debugging a universal project problem). | |
ID: 9234 | |
After 7 finished WUs, this one went +39 hrs with 0% progress, so I canceled it. | |
ID: 9994 | |
Maybe the following can help with hanging WUs: | |
ID: 10111 | |
http://www.gpugrid.net/result.php?resultid=819492 | |
ID: 10691 | |
Could it be heat-related that my WUs often hang? | |
ID: 10904 | |
Since I last started ps3grid with yoyo suspended, I've had no problems... so far it looks good... but if it's multi-project related, it isn't nice of this project not to fix it. I'd be MORE satisfied if I could get yoyo back at 10%, so the PS3 could ask for work from that project if gpugrid runs out, the server hangs, ... | |
ID: 11771 | |