Hello community!
I checked BOINC and the progress (%) of the GPUGRID WU didn't change. I watched it for about 5 minutes. The estimated time just kept increasing.
I aborted the WU.
Three GPUGRID WUs completed fine on this GPU before.
The GPU is a factory-overclocked GTX260-216.
ACEMD2: GPU molecular dynamics v6.05 (cuda)
resultid=3186839
SWAN : FATAL : Failure executing kernel sync [nb_k_nt_p_tp_pme] [700]
Assertion failed: 0, file swanlib_nv.cpp, line 121
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
(I don't know whether this message was caused by the abort.)
Is it 'normal' that the progress sometimes doesn't change?
Is this an error that can just happen sometimes?
What should I do?
Restart BOINC, the PC, ..?
Abort the WU?
Thanks!
____________
|
|
|
skgiven (Volunteer moderator, Volunteer tester)
![Avatar](https://www.gravatar.com/avatar/77be8b04dc35f6033048abca3f3803c4?s=100&d=identicon) Send message
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
![Histidine - More than 1.5B credits His](img/badges/aa/badge_his.png) Scientific publications
![Top 100% (2761st/2932) contribution to Buch et al, J. Chem. Inf. Model. 2010 wat](img/badges/papers/badge_pub_white.png) ![Top 75% (1680th/2466) contribution to Sadiq et al, Proteins 2010 wat](img/badges/papers/badge_pub_silver.png) ![Top 10% (266th/3118) contribution to Selent et al, PLoS Comput Biol 2010 wat](img/badges/papers/badge_pub_emerald.png) ![Top 1% (15th/4410) contribution to Buch et al, PNAS 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (22nd/2450) contribution to Giorgino et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (15th/9662) contribution to Buch et al, J. Chem. Theory Comput. 2011 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (27th/3113) contribution to Giorgino et al, J. Chem. Theory Comput, 2012 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (14th/5798) contribution to Sadiq et al, PNAS 2012 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 25% (352nd/1995) contribution to Venken et al, JCTC 2013 wat](img/badges/papers/badge_pub_ruby.png) ![Top 1% (15th/3349) contribution to Buch et al, JCIM 2013 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 10% (49th/3864) contribution to Dainese et al, Biochem. J. 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (62nd/4477) contribution to Pérez-Hernández et al, JCP 2013 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (70th/2163) contribution to Bisignano et al. JCIM 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (14th/1283) contribution to Doerr et al. 
JCTC 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (45th/2838) contribution to Stanley et al, Nat Commun 2014 wat](img/badges/papers/badge_pub_emerald.png) ![Top 1% (18th/3183) contribution to Lauro et al., JCIM 2014 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (27th/3611) contribution to Ferruz et al., JCIM 2015 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (34th/4128) contribution to Ferruz et al., Sci Rep 2016 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 1% (49th/4815) contribution to Stanley et al., Sci Rep 2016 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 10% (105th/4730) contribution to Noe et al., Nat Chem 2017 wat](img/badges/papers/badge_pub_emerald.png) ![Top 100% (1222nd/1348) contribution to Doerr et al, JCTC 2017 wat](img/badges/papers/badge_pub_white.png) ![Top 1% (35th/4634) contribution to Martinez-Rosell et al, JCIM 2018 wat](img/badges/papers/badge_pub_sapphire.png) ![Top 50% (485th/1656) contribution to Kapoor et al., Sci Rep 2017 wat](img/badges/papers/badge_pub_gold.png) ![Top 10% (50th/1885) contribution to Ferruz et al., Sci Rep 2018 wat](img/badges/papers/badge_pub_emerald.png) ![Top 75% (551st/1022) contribution to Wang et al., ACS Cent. Sci. 2019 wat](img/badges/papers/badge_pub_silver.png) ![Top 25% (307th/1541) contribution to Rodriguez-Espigares et al., Nat Meth 2020 wat](img/badges/papers/badge_pub_ruby.png) ![Top 10% (29th/1450) contribution to Herrera-Nieto et al, Sci Rep 2020 wat](img/badges/papers/badge_pub_emerald.png) ![Top 10% (334th/6232) contribution to Herrera-Nieto et al, JCIM 2020 wat](img/badges/papers/badge_pub_emerald.png) |
That specific task seemed to fail after about 3h, but some of the KKi tasks are unusually long. For a GTX260 about 5h to 8h is about right, depending on the task.
|
|
|
|
The other 'KKi' WUs ran for less than 8 hours (~28,000 secs).
When I look in BOINC, I normally see the % progress change every few seconds.
But this time I watched for about 5 minutes and the % progress didn't change. The elapsed time increased, and so did the estimated time.
Looking at the wingmen (wuid=2012518), the 1st (9400M) and the 3rd (GTX295) got computation errors.
My result, the 2nd, was 'strange'.
The 4th result will come from a GTX275.
Could it be that this WU is 'buggy'?
____________
|
|
|
skgiven (Volunteer moderator, Volunteer tester)
Yes, I would say the WU had a problem in this case, rather than your setup.
KKi tasks have known issues that occasionally prevent some tasks from running (they fail very early). It is possible that this is one such task, but that it managed to start running on your system. |
|
|
|
The old WU in question has now been completed successfully by a GTX275 on Linux.
Maybe it's a problem with the Windows app?
This morning I again saw a (new) WU with no change in % progress: wuid=2028574.
I suspended the calculation in BOINC, waited a few seconds, and after I resumed it the WU was immediately marked as a computation error.
What is the problem?
The Windows app?
My system?
The old nVIDIA driver 190.38? S@h and MW@h run fine.
I would like to crunch GPUGRID from time to time, but not if every few WUs one has problems..
What would happen if I didn't notice this in BOINC?
Would the WU stay in calculation mode, spin in an idle loop for days and days, and block the GPU?
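To avoid having to watch the progress by hand, the "no % change for ~5 minutes" check could be automated. Below is only a minimal sketch under assumptions: the function name `is_stalled`, the 300-second window, and the idea of feeding it (timestamp, fraction done) samples are all hypothetical and not part of BOINC or GPUGRID. How the samples are collected (for example by polling the BOINC client's task listing) is left out.

```python
from typing import List, Tuple


def is_stalled(samples: List[Tuple[float, float]],
               window_secs: float = 300.0) -> bool:
    """Return True if progress did not advance over the last window_secs.

    samples: (timestamp_secs, fraction_done) pairs in chronological order.
    The names and the default window are illustrative assumptions.
    """
    if not samples:
        return False

    latest_time, latest_fraction = samples[-1]

    # Without history covering the whole window we cannot judge yet.
    if latest_time - samples[0][0] < window_secs:
        return False

    # Take the most recent sample at or before the start of the window
    # as the baseline to compare against.
    cutoff = latest_time - window_secs
    baseline = None
    for timestamp, fraction in samples:
        if timestamp <= cutoff:
            baseline = fraction
        else:
            break

    return baseline is not None and latest_fraction <= baseline
```

A task whose samples read 0.10, 0.10, 0.10 over five minutes would be flagged, while one moving from 0.10 to 0.12 would not. What to do with a flagged task (suspend, abort, restart) is a separate decision.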
____________
|
|
|
|
I see now that both WUs were calculated on GPU #2.
Maybe that GPU is buggy for GPUGRID? Other projects run fine on it.
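One way to check that suspicion is to tally failures per GPU over a larger number of tasks. This is just a sketch: the function name `failure_rate_by_device` and the record layout (work unit id, device index, success flag) are assumptions for illustration, and the records would have to be noted down by hand from the task pages.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def failure_rate_by_device(
    results: Iterable[Tuple[str, int, bool]]
) -> Dict[int, float]:
    """Return the fraction of failed results for each GPU device index.

    results: (wuid, device_index, succeeded) tuples; layout is hypothetical.
    """
    totals: Dict[int, int] = defaultdict(int)
    failures: Dict[int, int] = defaultdict(int)

    for _wuid, device, succeeded in results:
        totals[device] += 1
        if not succeeded:
            failures[device] += 1

    # Every device seen at least once gets a rate, even if it never failed.
    return {device: failures[device] / totals[device] for device in totals}
```

If one device shows a clearly higher failure rate than the others on the same project, that points at the card (or its clocks) rather than at the WUs, although two tasks alone are too few to say.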
____________
|
|
|