Message boards : Multicore CPUs : QC tests for windows
Author | Message |
---|---|
I'm preparing some infrastructure for the Windows 64bit CPU app. There are some beta WUs out. They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on. | |
ID: 49817 | Rating: 0 | |
Can't seem to DL anything for my windows machine. | |
ID: 49818 | Rating: 0 | |
Zalster wrote: Must have been a quick short run. Toni wrote: They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on. | |
ID: 49819 | Rating: 0 | |
There weren't many (approx 300), all of them in the "beta" queue. I need to fix simultaneous starts there too. | |
ID: 49821 | Rating: 0 | |
Waiting for a more consistent Windows queue.... | |
ID: 49844 | Rating: 0 | |
Hi, anything new? My 88-thread server is hungry for some native Win work. If you want help with testing, let me know. | |
ID: 49935 | Rating: 0 | |
I think the team is busy with Windows GPU application issues, see here for more details: https://www.gpugrid.net/forum_thread.php?id=4802 | |
ID: 49952 | Rating: 0 | |
https://www.gpugrid.net/forum_thread.php?id=4802 Also, several members of staff are on vacation at the moment, so progress will be slow until they return. | |
ID: 49953 | Rating: 0 | |
https://www.gpugrid.net/forum_thread.php?id=4802 Now gpu windows app seems to be resolved. Waiting for September for windows cpu app... | |
ID: 50146 | Rating: 0 | |
Are there any Windows CPU WUs? I set everything to be enabled but I get no tasks. | |
ID: 50436 | Rating: 0 | |
Coming soon (on the QC_beta app initially). | |
ID: 50437 | Rating: 0 | |
I got a few beta tasks but they failed; it looks like it was installing Python and miniconda while trying to run another WU at the same time. | |
ID: 50445 | Rating: 0 | |
Installing python (miniconda actually) is normal. It should be contained in their own boinc directories. Also multiple installations should wait for each other. There is something else I'm investigating. | |
ID: 50446 | Rating: 0 | |
I see the anaconda option in the start menu, but none of the tasks actually worked. | |
ID: 50450 | Rating: 0 | |
The QC Beta app now seems to be working (group name TST4). To do: work around the creation of the annoying shortcut. | |
ID: 50489 | Rating: 0 | |
Seems like they are working now. I see 100% cpu load, running 4x4cores. | |
ID: 50495 | Rating: 0 | |
Great. Did you see "black windows" like command prompts? | |
ID: 50496 | Rating: 0 | |
Didn't notice any :) | |
ID: 50497 | Rating: 0 | |
3 WUs validated on i7-3770 W10 host! :-) | |
ID: 50506 | Rating: 0 | |
Can someone keep an eye on the cpu% of windows tasks? In particular, if you run only one task, does it limit itself to 4 threads? | |
ID: 50507 | Rating: 0 | |
ID: 50509 | Rating: 0 | |
Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only. | |
ID: 50512 | Rating: 0 | |
Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only. Same for me, but they ran OK, so they must be for Linux. | |
ID: 50513 | Rating: 0 | |
Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated python as culprit, I doubt it but just checking. | |
ID: 50535 | Rating: 0 | |
I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread. Debug Error! Let me know if you need more information. [Edit] After 22 minutes, the second machine received the same error and the task aborted. | |
ID: 50540 | Rating: 0 | |
Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated python as culprit, I doubt it but just checking. There are no "device drivers" needed. The app is CPU only: it might crash, but shouldn't affect your PC. | |
ID: 50541 | Rating: 0 | |
I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread. The root error is here: PSIO_ERROR: unit = 97, errval = 10; PSIO_ERROR: 10 (lseek failed). Your machine appears to be the only one with this specific failure. Do you have free space on your HDD, or any other uncommon setup (e.g. FAT instead of NTFS)? | |
ID: 50542 | Rating: 0 | |
I think I have the same problem on my hostid 193594 | |
ID: 50543 | Rating: 0 | |
Toni asked: The root error is here The error happened on two different computers. Machine 476647 has 100 GB disk space available to BOINC. HDD is formatted as NTFS. I can't think of anything that is non-standard. Both machines have VirtualBox installed, but that is something that is encountered often in the volunteer computing world. Let me know if you have more questions. | |
ID: 50545 | Rating: 0 | |
I see the same on 2 computers, one has 238GB free with BOINC allowed to use up to 218GB. | |
ID: 50546 | Rating: 0 | |
These WUs are failing indeed. I have no obvious explanation but am investigating. Thanks. | |
ID: 50547 | Rating: 0 | |
I continue to download wu for linux (QC and Beta). Nothing for windows. | |
ID: 50555 | Rating: 0 | |
If you receive beta WUs for linux, then you will get them for windows too. I had to cancel all outstanding ones because of a nasty bug (which raised a debug dialog). | |
ID: 50559 | Rating: 0 | |
NOTE: This should be in the Linux section, not Windows, so you can move it, though I expect it would apply to either. Aborting task 1718_36_33_32_50_14f550fc_n00001-SDOERR_SELE6-0-1-RND8308_2: exceeded disk limit: 61716.49MB > 57220.46MB However, I retain 96 GB free in my root partition, and the BOINC startup message says: "max disk usage: 184.49 GB". So apparently that limit is coming from somewhere else(?). | |
ID: 50563 | Rating: 0 | |
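The abort message quoted above carries both numbers needed to see how far a task overshot its limit. A minimal sketch of pulling them out of such a log line follows; the function name `parse_disk_abort` is hypothetical, the pattern is modelled only on the message format quoted in this thread, and the conversion to bytes assumes that the client's "MB" means 2^20 bytes.

```python
import re

# Pattern modelled on the abort message quoted in this thread, e.g.
# "Aborting task <name>: exceeded disk limit: 61716.49MB > 57220.46MB"
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_disk_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line does not match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    return m.group("task"), float(m.group("used")), float(m.group("limit"))


line = ("Aborting task 1718_36_33_32_50_14f550fc_n00001-SDOERR_SELE6-0-1-RND8308_2: "
        "exceeded disk limit: 61716.49MB > 57220.46MB")
task, used_mb, limit_mb = parse_disk_abort(line)

# Assumption: "MB" here is 2**20 bytes, so the limit can be expressed in bytes.
limit_bytes = limit_mb * 1024 * 1024

print(f"{task}: used {used_mb:.2f} MB, limit {limit_mb:.2f} MB")
print(f"over by {used_mb - limit_mb:.2f} MB; limit is about {limit_bytes:.3e} bytes")
```

Under that assumption the limit works out to roughly 6.0e10 bytes, a round figure that looks like a per-task bound chosen by the project rather than anything derived from the local "max disk usage" preference.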
any progress with the planned QC app for Windows? | |
ID: 50638 | Rating: 0 | |
this shows a vast imbalance, the reason for which is that QC CPU tasks are still available for Linux users only. And not all distros are welcome. I tried some major distros and I had some problems (for example with Fedora). No problems with "old" Mint. | |
ID: 50639 | Rating: 0 | |
I am running SuSE Leap 42.3 on my main Linux box, and Leap 15.0 on my HP laptop. The kernel of Leap 15.0 is more advanced and is equal to that of the Enterprise version of SuSE Linux, SLES. It is also being updated regularly, while updates of 42.3 seem finished. But I have most of my tools on it, including Kaffeine, and it runs GPUGRID tasks, both CPU and GPU since the box has a GTX 750 Ti GPU board. | |
ID: 50640 | Rating: 0 | |
Do you have a task number to check? Thx | |
ID: 50641 | Rating: 0 | |
Do you have a task number to check? 18961112. I don't know if it is helpful. I killed this WU after 2h of crunching: the WU stopped at 10% and the time continued "to restart" from 0. If you need, I can try other WUs. | |
ID: 50644 | Rating: 0 | |
Do you have a task number to check? Thanks. What distro is it? Is it updated (i.e. security patches)? The other problem seems to be that the failure was not seen as such, and thus restarted. | |
ID: 50645 | Rating: 0 | |
this shows a vast imbalance, the reason for which is that QC CPU tasks are still available for Linux users only. I run Ubuntu , not had a problem | |
ID: 50646 | Rating: 0 | |
After installing nVidia driver 411.70 on my Windows 10 PC and a Windows 1809 upgrade, my first GPU task completed and validated on it. So far GPU tasks were validated only on my Linux box with SuSE Leap 42.3 and a GTX 750 Ti. | |
ID: 50647 | Rating: 0 | |
After installing nVidia driver 411.70 on my Windows 10 PC and a Windows 1809 upgrade ... your Win10 1809 Upgrade installed the NVIDIA 411.70 driver? Mine did NOT. I made the upgrade on one of my machines today, but there is still the 388... driver (on the other hand, the CPU usage values of the various apps in the Task Manager are no longer shown correctly; from what I was told: a known bug). | |
ID: 50648 | Rating: 0 | |
Thanks. What distro is it? Is it updated (i.e. security patches)? The other problem seems to be that the failure was not seen as such, and thus restarted. Fedora 28 with security update. Now I return to Mint.... | |
ID: 50649 | Rating: 0 | |
No, I installed it via Geforce. Now GPUGRID tasks run successfully and also Einstein@home GPU tasks. | |
ID: 50650 | Rating: 0 | |
any progress in developing and testing QC tasks for Windows ? | |
ID: 50654 | Rating: 0 | |
Fedora 28 with security update. I have 58 GB free for gpugrid and, again: 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED. Oh, my... | |
ID: 50655 | Rating: 0 | |
The same thing happens to me on SELE6 tasks. | |
ID: 50656 | Rating: 0 | |
any progress in developing and testing QC tasks for Windows ? +1 | |
ID: 50659 | Rating: 0 | |
No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon. | |
ID: 50666 | Rating: 0 | |
No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon. And also this week has gone.... | |
ID: 50740 | Rating: 0 | |
And also this week has gone.... Their priorities are most likely scientific. There is the side we don't see, the biology/chemistry side. The computation is only half of the puzzle. | |
ID: 50741 | Rating: 0 | |
Their priorities are most likely scientific. There is the side we don't see, the biology/chemistry side. The computation is only half of the puzzle. I know, I know, and I like that they work on the science side. But I don't know how much science they made with 74 PCs on their CPU project... | |
ID: 50755 | Rating: 0 | |
Maybe now that IBM has bought RedHat there will be more Linux users. | |
ID: 50758 | Rating: 0 | |
Stefan wrote on October 10: No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon. any progress? | |
ID: 50967 | Rating: 0 | |
No, sorry. Still working on the paper. I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays. | |
ID: 50972 | Rating: 0 | |
And yes I realize we are losing incredible computational power by not having the Windows App. Priorities are as they are though :/ | |
ID: 50973 | Rating: 0 | |
Stefan, thanks for the quick information :-) | |
ID: 50974 | Rating: 0 | |
I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays. We are ready! :-P | |
ID: 51201 | Rating: 0 | |
I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays. +1 | |
ID: 51202 | Rating: 0 | |
I received the second batch of Quantum Chemistry, beta test v3.33 (mt) for Windows today, except for 2 errors (so far), they are finishing successfully. | |
ID: 51259 | Rating: 0 | |
Got the following error message for a Windows QC_beta task: 16:34:31 (7484): wrapper: running .\qmml3\python.exe (run.py). Looks like it ran for ~61 minutes before it croaked. Please let me know if you need more information. | |
ID: 51260 | Rating: 0 | |
I received the second batch of Quantum Chemistry, beta test v3.33 (mt) for Windows today, except for 2 errors (so far), they are finishing successfully. I am still getting the 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED errors even with disc usage space for boincs set at 500 gigs. | |
ID: 51271 | Rating: 0 | |
My first Windows wu: 195 (0xc3) EXIT_CHILD_FAILED | |
ID: 51275 | Rating: 0 | |
My first Windows wu: Same here, I had a few of those errors. | |
ID: 51276 | Rating: 0 | |
I am still getting the 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED errors even with disc usage space for boincs set at 500 gigs. It is not your setting. The project needs to change it. But now they are fixing the problem by going back to the smaller ones; see Stefan's post on the subject. | |
ID: 51277 | Rating: 0 | |
Just now I tried to download QC tasks and/or QC beta tasks for windows. | |
ID: 51284 | Rating: 0 | |
Take 3 for test. | |
ID: 51285 | Rating: 0 | |
Erich56 wrote: So all the "unsent" tasks shown on the Server Status Page are still for Linux only? Only WU Quantum Chemistry, beta test 3.33 (mt) is currently available for Windows. I think on server status page they are Quantum Chemistry, beta test. You probably need to check your preferences: Use CPU - yes; Run test applications? - yes; Quantum Chemistry (CPU, beta) - yes. | |
ID: 51286 | Rating: 0 | |
Use CPU - yes Thanks for the hint, before I had forgotten to put "Run test applications" on "yes". I did that now, and one task was downloaded and got started. What makes me wonder though is that after a few minutes the progress percentage in the BOINC Manager got stuck at 1.098%, although the time is progressing, and the Windows task manager shows that Python is running (and using about 214MB RAM). What I just noticed is that whereas under "remaining time", before some 6 hours were shown; now, about 13 minutes later, the value for "remaining time" has gone up to about 18 hours. Is this okay, or is the task faulty? | |
ID: 51288 | Rating: 0 | |
What makes me wonder though is that after a few minutes the progress percentage in the BOINC Manager got stuck at 1.098%, although the time is progressing, and the Windows task manager shows that Python is running (and using about 214MB RAM). now, 20 minutes after start, the progress bar is still at 1.098%, the remaining time is shown as 1 day + 6 hours, and Python is using 3.194MB RAM (which is not a problem yet at this point). | |
ID: 51289 | Rating: 0 | |
the task failed with the error | |
ID: 51291 | Rating: 0 | |
Erich56 wrote: Is this okay, or is the task faulty? As far as I've noticed, all these WUs behave this way. I have finished a bunch of them and their computation time varies from 0.5 to 3 hours, 1.5 hours on average. Erich56 wrote: the task failed with the error I think it's "ok". 11 of 14 of these WUs completed successfully. Failed WUs have the same error. https://www.gpugrid.net/results.php?hostid=164968 | |
ID: 51295 | Rating: 0 | |
105 errors / failed on 188 calculated WU's | |
ID: 51299 | Rating: 0 | |
so far, all 3 tasks which got downloaded failed with this exit_disk_limit_exceeded error after about 3900 - 4800 seconds. | |
ID: 51300 | Rating: 0 | |
As a result, from 30 tasks taken only 14 successfully completed. | |
ID: 51304 | Rating: 0 | |
Since I had started with these tasks yesterday, 20 were processed so far, out of which only 2 succeeded. | |
ID: 51305 | Rating: 0 | |
Since I had started with these tasks yesterday, 20 were processed so far, out of which only 2 succeeded. This means your file size was 41204.15MB, and the limit is 28610.23MB, not the other way around. The trick to this, is to find which file contains the limit and change it to a larger number and this task would have finished successfully. | |
ID: 51310 | Rating: 0 | |
The trick to this, is to find which file contains the limit and change it to a larger number and this task would have finished successfully. any advice how to get this accomplished ? | |
ID: 51311 | Rating: 0 | |
The trick to this, is to find which file contains the limit and change it to a larger number and this task would have finished successfully. Honestly, I have no idea. I attempted to go to the …/BOINC/slots directory, where there are several directories named 0, 1, 2, 3 etc. Find the one that has the miniconda / wrapper files, and edit the boinc_task_state.xml file with notepad. It should open and look something like this: <active_task> <project_master_url>http://www.gpugrid.net/</project_master_url> <result_name>4534r1-TONI_TST10-0-1-RND0591_4</result_name> <checkpoint_cpu_time>30.765630</checkpoint_cpu_time> <checkpoint_elapsed_time>128.215471</checkpoint_elapsed_time> <fraction_done>0.010989</fraction_done> <peak_working_set_size>149794816</peak_working_set_size> <peak_swap_size>139321344</peak_swap_size> <peak_disk_usage>575110930</peak_disk_usage> </active_task> Edit the peak disk usage number to something larger and save the file. In both attempts I failed. WUs errored out. Maybe I should have bumped the numbers higher. I added a zero at the end. The last disk space usage before the crash was over 44 gigs. Give it a try, if you like. Something similar to this was done in the past for oversize output files. It worked. | |
ID: 51313 | Rating: 0 | |
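For anyone who wants to inspect such a file without editing it by hand, here is a small read-only sketch. The XML is copied from the post above and the helper name `read_task_state` is hypothetical. Judging by its name, `peak_disk_usage` looks like a value the client records rather than the limit it enforces, which may be why raising it had no effect; my understanding is that the bound itself is stored as `rsc_disk_bound` in the workunit entry of client_state.xml, but treat that as an assumption to verify.

```python
import xml.etree.ElementTree as ET

# Sample copied from the post above; a real file sits in a BOINC slot directory.
SAMPLE = """<active_task>
    <project_master_url>http://www.gpugrid.net/</project_master_url>
    <result_name>4534r1-TONI_TST10-0-1-RND0591_4</result_name>
    <checkpoint_cpu_time>30.765630</checkpoint_cpu_time>
    <checkpoint_elapsed_time>128.215471</checkpoint_elapsed_time>
    <fraction_done>0.010989</fraction_done>
    <peak_working_set_size>149794816</peak_working_set_size>
    <peak_swap_size>139321344</peak_swap_size>
    <peak_disk_usage>575110930</peak_disk_usage>
</active_task>"""


def read_task_state(xml_text):
    """Map each child tag of <active_task> to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


state = read_task_state(SAMPLE)

# Assumption: the value is in bytes; convert to MB of 2**20 bytes for readability.
peak_mb = int(state["peak_disk_usage"]) / (1024 * 1024)
print(state["result_name"], f"peak disk usage: {peak_mb:.2f} MB")
```

To read a real file, pass the contents of boinc_task_state.xml from the relevant slot directory to `read_task_state` instead of `SAMPLE`.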
Give it a try, if you like. Thanks for the hints. I remember a similar procedure for LHC/ATLAS tasks sometime last year, when some of them had an "exit_disk_limit_exceeded" problem. However, getting this done was somewhat tricky; from what I remember, one had to do something like stopping the task, even shutting down BOINC, and then increasing the figure (by quite an amount). However, at this point I cannot try anything, since no betas are available. | |
ID: 51315 | Rating: 0 | |
Give it a try, if you like. I don't like to use tricks. I prefer a bugfixed version of the app. | |
ID: 51317 | Rating: 0 | |
I prefer a bugfixed version of the app. me, too ! | |
ID: 51318 | Rating: 0 | |
Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail. | |
ID: 51319 | Rating: 0 | |
I will raise the limit for the next batch ... Toni, any rough idea when the next batch will be sent out? | |
ID: 51321 | Rating: 0 | |
half an hour ago, I got a few tasks downloaded. So let's see whether they will succeed this time, or fail again. | |
ID: 51322 | Rating: 0 | |
the first task from today got finished successfully after some 4,250 secs, which is very nice :-) | |
ID: 51323 | Rating: 0 | |
Toni wrote: Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail. 1 of 4 tasks which I downloaded and crunched this afternoon showed the "196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error again. How come? http://gpugrid.net/result.php?resultid=20388212 | |
ID: 51327 | Rating: 0 | |
Toni wrote: Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail. That's because the WUs are from the current batch, not the next batch. You lucked out and got 3 good ones. | |
ID: 51333 | Rating: 0 | |
Toni wrote: Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail. Yesterday, another 3 tasks were downloaded, all of them failed. Total wasted CPU time about 20,000 secs. Obviously, all these tasks are still from the current faulty batch. Question for Toni: Why are these tasks not withdrawn? And when will the next batch with the increased disk limit be sent out? I hate to waste that much of my CPU work. | |
ID: 51342 | Rating: 0 | |
Toni wrote: Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail. Actually, it is better not to cancel this batch, and let it run its course, until the WUs all become "Too many errors (may have bug)" WUs, because they will in time disappear from being posted. If you cancel them before that, the WUs will stay posted forever. It is a fault in the system. You may have noticed errors from years ago, still posted. | |
ID: 51344 | Rating: 0 | |
okay, I understand. | |
ID: 51345 | Rating: 0 | |
Although no "unsent" tasks were shown in the Project Status Page this morning, my PC downloaded 3 WUs. | |
ID: 51359 | Rating: 0 | |
a few minutes ago, the fourth WU within short time failed. | |
ID: 51360 | Rating: 0 | |
Perhaps Toni or someone else from the GPUGRID team could post a short message when the new batch which is supposed not to contain this disk limit error will be available. It seems there is no hurry... | |
ID: 51372 | Rating: 0 | |
Perhaps Toni or someone else from the GPUGRID team could post a short message when the new batch which is supposed not to contain this disk limit error will be available. I find it a pity that the GPUGRID people obviously don't invest more effort in developing a well functioning Windows app for QC. By this, they forego huge computational power. A look at the project status page shows well what I am talking about: presently, there are some 175,000 unsent tasks, which only 100 Linux users are taking care of. From what can be seen when enough GPU tasks are available, there are about 1000 users crunching them, mostly on Windows of course. I imagine that crunching CPU tasks on Windows would attract even more people, perhaps double that number or even more (even if the number were only 1000, this would mean 10 times what it is now under Linux). So, instead of having 100 users crunching QC on Linux, there would maybe be 2000 people crunching QC on Windows. Hence, I repeat my statement from above: they forego huge computational power. Why so? | |
ID: 51377 | Rating: 0 | |
We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off. | |
ID: 51381 | Rating: 0 | |
... Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off. Thanks for the thorough information, this helps us to understand what's going on. So we'll keep our fingers crossed that a QC app for Windows will be available soon :-) My guess is that many crunchers will use it. | |
ID: 51386 | Rating: 0 | |
My guess is that many crunchers will use it. Please, see if you can run their servers dry. Then I can put my Linux machine on something else. I like to use them where they are needed most. | |
ID: 51388 | Rating: 0 | |
I think the Quantum Chem app should be a subproject and then there should be badges. If you offer badges, they will come. | |
ID: 51390 | Rating: 0 | |
thanks for the thorough information, this helps us to understand what's going on. +1. I didn't know the QC development situation. Another solution is to open the QC code (if it is possible) to the community. TN-Grid, for example, had a lot of help from two volunteers that helped the team to introduce the sse/avx extensions | |
ID: 51395 | Rating: 0 | |
TN-Grid, for example, had a lot of help from two volunteers that helped the team to introduce the sse/avx extensions I think on every project where I have seen sse/avx, etc. it has been the volunteers that do it. It takes special expertise. Also, there has been at least one project (POEM, if you remember it) where a volunteer did an outstanding job with an OpenCL app for GPUs, though I don't have his name at the moment. And a very capable developer has done both CUDA and OpenCL apps for XANSONS for COD. If you ever visit Rosetta, you will quickly find out that rjs5 knows a whole lot about extensions and parallelism; I have seen him at Einstein too. If you put out a call, they will come. | |
ID: 51396 | Rating: 0 | |
If you ever visit Rosetta, you will quickly find out that rjs5 knows a whole lot about extensions and parallelism; I know Rjs5 (I chatted privately with him some time ago). He has a deep knowledge of C, C++ and optimizations. If you put out a call, they will come. I don't know. For example, in Rosetta, the developers seem not to be interested in optimizations (and, also, they have restrictions on opening the code). | |
ID: 51401 | Rating: 0 | |
Actually the current beta QC version should not be too different from the final one. One common reason for failure is that the computations need huge temporary files. We can't make them need less space, but the disk limit will be raised (this occurred for the linux app too). | |
ID: 51402 | Rating: 0 | |
I just got another test WU - and again, it failed with the | |
ID: 51404 | Rating: 0 | |
Actually the current beta QC version should not be too different from the final one. One common reason for failure is that the computations need huge temporary files. We can't make them need less space, but the disk limit will be raised (this occurred for the linux app too). Will these be released again for Linux? What was the limit raised to? I installed 4TB HDD into 3 machines in anticipation of these tasks. Would like to see how they run. | |
ID: 51405 | Rating: 0 | |
I installed 4TB HDD into 3 machines in anticipation of these tasks. Would like to see how they run. Me too. I am putting 500 GB on all my new machines (and some old ones too). If that isn't enough, I can do more. | |
ID: 51406 | Rating: 0 | |
How much disk space do the QC tasks actually need, on the average? | |
ID: 51407 | Rating: 0 | |
@[VENETO] boboviz: The code is open source https://github.com/psi4/psi4/ | |
ID: 51409 | Rating: 0 | |
I've got 2 linux machines running QC tasks. | |
ID: 51410 | Rating: 0 | |
@[VENETO] boboviz: The code is open source https://github.com/psi4/psi4/ Great!!! | |
ID: 51411 | Rating: 0 | |
About a month ago, Stefan wrote: We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off. My posting now is not to push or pester, just asking for information on the current status of the project. Many thanks in advance - | |
ID: 51572 | Rating: 0 | |
I highly doubt anything has changed as they still don't have a C++ developer. I would just lay low and let it happen when it happens. | |
ID: 51573 | Rating: 0 | |
I highly doubt anything has changed as they still don't have a C++ developer. Is there no student/graduate student/PhD/etc. in the University able to develop in C++? I'll wait for the Windows app; in the meantime I crunch with my Linux VM. | |
ID: 51578 | Rating: 0 | |
Got this error message on the current batch of QC test for Windows: | |
ID: 51653 | Rating: 0 | |
And another: 3/28/2019 9:07:38 AM | GPUGRID | Aborting task 1445r7-TONI_TST11-0-1-RND1466_0: exceeded disk limit: 44800.51MB > 28610.23MB Please let us know when this is fixed and I will start up some more test tasks. | |
ID: 51658 | Rating: 0 | |
For now it does not seem a widespread problem. Can you reset the project? | |
ID: 51659 | Rating: 0 | |
Did a project reset and started three more tasks. Two task finished successfully and the other aborted with: 3/28/2019 10:37:48 AM | GPUGRID | Aborting task 170r5-TONI_TST11-0-1-RND6393_2: exceeded disk limit: 34955.30MB > 28610.23MB As a separate test, I tried the tasks on a different Windows PC number 489422. All four tasks aborted with the following message: <core_client_version>7.14.2</core_client_version> I did a project reset and got the same result. Am I missing something? | |
ID: 51661 | Rating: 0 | |
Can you reset the project? I try on a new machine (Intel Xeon and Windows 7) All errors: 17:51:36 (6372): install_miniconda.bat exited; CPU time 40.357459 | |
ID: 51662 | Rating: 0 | |
I try on a new machine (Intel Xeon and Windows 7) That's strange. No errors on my other machine (other Xeon) with Win10 (1809 version) 64 bit. | |
ID: 51665 | Rating: 0 | |
On my windows 7 computer, I am getting "195 (0xc3) EXIT_CHILD_FAILED" on all the beta WUs. | |
ID: 51666 | Rating: 0 | |
The "ImportError" messages (Windows) likely mean that your processor does not support AVX2. This is something we'll try to sort out. | |
ID: 51670 | Rating: 0 | |
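For anyone who wants to check the AVX2 point before fetching more beta tasks, here is a minimal sketch. It assumes a Linux environment (for example the Linux VM mentioned earlier in this thread), where the kernel lists CPU feature flags in `/proc/cpuinfo`; it does not work natively on Windows, where a separate CPU information tool would be needed. The function names and the sample strings are made up for illustration and are not part of the GPUGRID app.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 Linux kernels report features on lines whose key is "flags".
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_avx2(cpuinfo_text):
    """True if the text lists the avx2 flag as a whole token."""
    return "avx2" in cpu_flags(cpuinfo_text)


# Hypothetical sample inputs, only for demonstration.
SAMPLE_WITH = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2\n"
SAMPLE_WITHOUT = "processor\t: 0\nflags\t\t: fpu sse sse2 avx\n"

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            print("AVX2 supported:", has_avx2(fh.read()))
    except OSError:
        print("No /proc/cpuinfo on this system; use another tool to check.")
```

The flags are compared as whole tokens, so a CPU that only reports `avx` is not mistaken for one with `avx2`.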
Another common error is disk space. It's because some molecules we process are indeed too large, and we don't have prior indication of when this occurs. Which means that this kind of error cannot be prevented to begin with? | |
ID: 51671 | Rating: 0 | |
which means that this kind of error cannot be precluded to begin with? We'll refine the task sizes with time. | |
ID: 51672 | Rating: 0 | |
"Conda timeouts" are also seen. These should be transient, due to the failed connection between your machine and the conda software repository (where we host the packages). These seem to be the errors on all my QC tasks. No problems with HD space since I replaced my drives with 4 TB HDDs... | |
ID: 51673 | Rating: 0 | |
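Since the conda timeouts are described as transient connection failures, the usual remedy is to retry with an increasing wait between attempts. The sketch below shows that general pattern only; it is not how the GPUGRID installer script behaves, and the function name, the default number of attempts, and the choice of which exceptions count as transient are all assumptions.

```python
import time


def retry(operation, attempts=4, base_delay=1.0, sleep=time.sleep,
          transient=(TimeoutError, ConnectionError)):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, ...
    between attempts (exponential backoff). Only exceptions listed in
    `transient` are retried; anything else propagates immediately.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except transient:
            if attempt == attempts - 1:
                raise  # out of attempts: give up and report the error
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter exists only so the waiting can be replaced in a test; in normal use the default `time.sleep` applies.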
Having the Disk Limit Exceeded error too. I'm using a shucked WD 8TB Helium-filled drive with plenty of space available, so disk space is not an issue here either. | |
ID: 51676 | Rating: 0 | |
Why are most of my Quantum WUs erroring out except one? When I read the file it points towards Miniconda. I updated Anaconda, but the errors persist. | |
ID: 51679 | Rating: 0 | |
Why are most of my Quantum WUs erroring out except one? When I read the file it points towards Miniconda. I updated Anaconda, but the errors persist. | |
ID: 51680 | Rating: 0 | |
Having the Disk Limit Exceeded error too. I'm using a shucked WD 8TB Helium-filled drive with plenty of space available, so disk space is not an issue here either. The failure of the task does not necessarily have to do with the size of the drive. Definitely not in your case (unless you did not allocate enough disk space for BOINC in the BOINC settings). The task itself includes a disk limit in its parameters, which are set by the project people. And obviously there are (many) cases where that's not enough. | |
ID: 51681 | Rating: 0 | |
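To see how far a task overshot its own limit, the abort lines posted in this thread can be parsed directly. The sketch below assumes the message format stays exactly as quoted above ("exceeded disk limit: used > limit"); the function name and the returned dictionary keys are made up for illustration. As far as I know, the limit on the right-hand side is the per-workunit disk bound set by the project (`rsc_disk_bound` in BOINC), not the free space on the drive.

```python
import re

# Matches the BOINC event-log line quoted in this thread, e.g.
# "... Aborting task NAME: exceeded disk limit: 44800.51MB > 28610.23MB"
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_disk_abort(line):
    """Return task name, usage, limit and overshoot in MB, or None."""
    match = ABORT_RE.search(line)
    if match is None:
        return None
    used = float(match.group("used"))
    limit = float(match.group("limit"))
    return {
        "task": match.group("task"),
        "used_mb": used,
        "limit_mb": limit,
        "over_mb": used - limit,  # how far the task exceeded its bound
    }


if __name__ == "__main__":
    line = ("3/28/2019 9:07:38 AM | GPUGRID | Aborting task "
            "1445r7-TONI_TST11-0-1-RND1466_0: exceeded disk limit: "
            "44800.51MB > 28610.23MB")
    print(parse_disk_abort(line))
```

Both aborts reported here show the same 28610.23MB limit, which is consistent with the limit coming from the task parameters rather than from the size of anyone's drive.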
Both of my tasks reached 1.098% progress in about 3 minutes and then stopped advancing. After about 25 minutes of idling I aborted the tasks because I realized that continuing made no sense. | |
ID: 51686 | Rating: 0 | |
On January 26, Stefan wrote: We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off. Any new timeline at this point? | |
ID: 51902 | Rating: 0 | |
On January 26, Stefan wrote: Still no answer after one week. To me it seems that the project has been abandoned, all the more so as there have not been any QC tasks available at all for quite a while, not for Linux either. It would just be great if the GPUGRID people would keep us posted about what's going on. | |
ID: 51950 | Rating: 0 | |
To me it seems that the project has been abandoned... Would just be great if the GPUGRID people would keep us posted about what's going on. +1 | |
ID: 52109 | Rating: 0 | |
I assume the project is dead, isn't it? | |
ID: 52171 | Rating: 0 | |
I assume the project is dead, isn't it? It would be nice to get at least some kind of reply, whatever it says. | |
ID: 52320 | Rating: 0 | |
I assume the project is dead, isn't it? It's amazing that the GPUGRID team is not willing to give us any information regarding this project. Why is that so? What's the problem with telling us what's going on? At any rate, from what it looks like, the project is dead. It would just be nice if you people let us volunteers know. | |
ID: 52382 | Rating: 0 | |
At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know. And it would be nice to know whether we volunteers have produced scientific results with our crunching on this app... | |
ID: 52558 | Rating: 0 | |
At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know. + 1 | |
ID: 52559 | Rating: 0 | |
At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know. + 1 | |
ID: 52776 | Rating: 0 | |
I created a dedicated VM for this sub-project and I crunched. | |
ID: 53380 | Rating: 0 | |
If you are looking for QC for CPU only, there is a new project on BOINC called | |
ID: 53386 | Rating: 0 | |