Message boards : News : Experimental QMML WUs

Author Message
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 48432 - Posted: 19 Dec 2017 | 15:19:45 UTC

We are experimenting with CPU workunits. Right now they are Linux only. Please note that you may need to install "gcc" on your machine for them to work. More details in the Multicore forum!

Keith Myers
Joined: 13 Dec 17
Posts: 1284
Credit: 4,926,706,959
RAC: 6,516,431
Message 48435 - Posted: 19 Dec 2017 | 17:37:15 UTC
Last modified: 19 Dec 2017 | 18:09:51 UTC

I have not been able to get any of the QC tasks for a week now. I have GCC installed. A post said the most recent tasks have had their priority lowered to allow "low reliability" hosts to get work. I am a new member with no credits since I have been unable to get any work. Is my no credit status still preventing me from getting work?
[Edit]
I guess this place is just like SETI, make a complaint in the forums and the servers read it and make adjustments. I have a Long GPU and a QC task now.

mmonnin
Joined: 2 Jul 16
Posts: 332
Credit: 3,772,896,065
RAC: 4,765,302
Message 48443 - Posted: 20 Dec 2017 | 3:38:26 UTC - in response to Message 48435.

I have not been able to get any of the QC tasks for a week now. I have GCC installed. A post said the most recent tasks have had their priority lowered to allow "low reliability" hosts to get work. I am a new member with no credits since I have been unable to get any work. Is my no credit status still preventing me from getting work?
[Edit]
I guess this place is just like SETI, make a complaint in the forums and the servers read it and make adjustments. I have a Long GPU and a QC task now.


Well, GPU work is scarce. Nearly every day I run out. And CPU work is still under testing. ATM there is no work for anyone.

http://www.gpugrid.net/server_status.php

gianni
Joined: 11 Jul 08
Posts: 18
Credit: 105,098
RAC: 0
Message 48447 - Posted: 20 Dec 2017 | 9:38:51 UTC - in response to Message 48443.

As soon as it is working properly, there will be lots of work.

Mathieu
Joined: 16 Nov 16
Posts: 2
Credit: 65,554,798
RAC: 0
Message 48450 - Posted: 20 Dec 2017 | 11:34:21 UTC

Well, I am running gpugrid/milkyway@home on Windows to heat my bedroom in a less wasteful way than a stupid electric resistance heater. Unfortunately, since the beginning of this experiment gpugrid doesn't seem to use much GPU on Windows ^^
So this morning, and until further notice, I am switching to full milkyway@home. I am freezing here!

PappaLitto
Joined: 21 Mar 16
Posts: 511
Credit: 4,617,042,755
RAC: 0
Message 48452 - Posted: 20 Dec 2017 | 12:03:24 UTC - in response to Message 48450.

Well, I am running gpugrid/milkyway@home on Windows to heat my bedroom in a less wasteful way than a stupid electric resistance heater. Unfortunately, since the beginning of this experiment gpugrid doesn't seem to use much GPU on Windows ^^
So this morning, and until further notice, I am switching to full milkyway@home. I am freezing here!

I've been heating my home with BOINC for years and I've actually found GPUGrid to have a higher wattage output than most projects. There are multiple ways to increase GPU utilization, which raises not only GPU power consumption but CPU usage a bit as well. One is enabling SWAN_SYNC: search for "environment variables" in Windows search, click Environment Variables at the bottom right, add SWAN_SYNC, and change the 0 to a 1. This can yield somewhat better results.
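
For anyone who wants to confirm the variable actually took effect, here is a minimal Python sketch. It only reads the environment. The helper name swan_sync_status is made up for illustration, and treating "1" as the enabled value is an assumption taken from the instructions above, not from GPUGrid documentation.

```python
import os


def swan_sync_status(env=None):
    """Describe the SWAN_SYNC setting found in an environment mapping.

    Hypothetical helper: the meaning of the value ("1" = enabled) is an
    assumption based on the post above, not on project documentation.
    """
    env = os.environ if env is None else env
    value = env.get("SWAN_SYNC")
    if value is None:
        return "SWAN_SYNC is not set"
    if value.strip() == "1":
        return "SWAN_SYNC is set to 1 (enabled, per the post above)"
    return f"SWAN_SYNC is set to {value!r}"


# Report the setting seen by this Python process.
print(swan_sync_status())
```

Run it from a freshly opened terminal, since programs that were already running do not pick up newly added variables.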

Milkyway@home uses the GPU in a very efficient way, taking data only from the GPU cache rather than the GDDR5 memory. The memory draws 15+ watts and is basically unused by the Milkyway application. If you want more heat from a backup project, consider switching to einstein@home. Einstein also uses quite a bit of CPU to support the GPU computing, so this will help with heating.

Another option is to buy another NVIDIA GPU if your motherboard is large enough to support another card. This will definitely be enough to heat your room.

mmonnin
Joined: 2 Jul 16
Posts: 332
Credit: 3,772,896,065
RAC: 4,765,302
Message 48457 - Posted: 20 Dec 2017 | 13:46:58 UTC

PrimeGrid will heat up a GPU. The GPU boost on my Pascal cards is the lowest there, meaning it runs the card hard and hits thermal limits. Math projects tend to be the best, as they can easily fork to utilize many cores.

Linux will utilize a processor better in many instances. And if the GPU is fast enough, 2x GPUGrid tasks can be run in parallel. The longest GPUGrid task ATM lasts only 16-17 hours at 2x on a 1070.

MW is double precision, so some of the best cards for MW are still older 7970/280x/etc. AMD cards. Those are heaters for sure. I wonder how well the V100 will run DP projects.

Mathieu
Joined: 16 Nov 16
Posts: 2
Credit: 65,554,798
RAC: 0
Message 48470 - Posted: 21 Dec 2017 | 10:09:32 UTC
Last modified: 21 Dec 2017 | 10:13:18 UTC

Thanks for the responses, PappaLitto and mmonnin. Everything seems to work again this morning for no particular reason.

I ran some tests, and gpugrid + milkyway@home seems a good mix for me. gpugrid draws more GPU watts but uses little CPU (~10%), and its GPU usage doesn't get over 80% (maybe caused by memory fetch delays during computing?). milkyway@home uses nearly 100% CPU and GPU but draws fewer GPU watts, for the reason PappaLitto pointed out.
It may be a hardware problem with my GPU: last winter I had no problem running BOINC for weeks, but now from time to time I get screen freezes and automatic restarts without any error message.



+-----------------------------------------------------------------------------+
| NVIDIA-SMI 388.59                 Driver Version: 388.59                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970    WDDM  | 00000000:01:00.0  On |                  N/A |
| 75%   80C    P2   135W / 180W |   1345MiB /  4096MiB |     91%      Default |
+-------------------------------+----------------------+----------------------+
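
To compare projects on utilization, power draw and temperature without reading the table by eye, the same figures can be requested from nvidia-smi in CSV form. The following is a rough Python sketch, assuming nvidia-smi is on the PATH. The function names are invented for the example, and some cards report power as not available, which the parser turns into None.

```python
import subprocess

# Fields requested from nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["utilization.gpu", "power.draw", "temperature.gpu"]


def _to_number(text):
    """Convert a CSV cell to float, or None for values such as '[N/A]'."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_gpu_sample(line):
    """Parse one CSV line such as '91, 135.00, 80' into a dict."""
    values = [_to_number(part.strip()) for part in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def read_gpu_samples():
    """Return one dict per GPU; needs the NVIDIA driver to be installed."""
    output = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_sample(line) for line in output.splitlines() if line.strip()]
```

Calling read_gpu_samples() once a minute while each project runs would give a rough per-project picture of load and wattage.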

Failboat
Joined: 24 Apr 16
Posts: 2
Credit: 35,317,258
RAC: 0
Message 48479 - Posted: 21 Dec 2017 | 23:58:30 UTC

Had an 8-core QC unit named c53-TONI_QMML314rst-0-1-RND8160.

After 2 hours elapsed and 15.5 hours of CPU time it was at 50% complete, but the time remaining incremented upwards by 1 with every elapsed second, rather than decreasing. Aborted it.
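
One possible explanation for the counter climbing by one per second: if the remaining time is extrapolated linearly from elapsed time and fraction done, then with progress frozen at 50% the estimate simply equals the elapsed time and grows in step with the clock. That linear rule is an assumption made for this sketch (the real BOINC estimate is more involved), and the function names are invented. The same numbers also suggest the task was keeping nearly all 8 cores busy.

```python
def linear_remaining(elapsed_s, fraction_done):
    """Remaining seconds under a plain linear extrapolation (an assumption)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_s * (1 - fraction_done) / fraction_done


def average_busy_cores(cpu_time_s, elapsed_s):
    """Average number of cores kept busy over the run."""
    return cpu_time_s / elapsed_s


two_hours = 2 * 3600
print(linear_remaining(two_hours, 0.5))            # 7200.0
print(linear_remaining(two_hours + 1, 0.5))        # 7201.0
print(average_busy_cores(15.5 * 3600, two_hours))  # 7.75
```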

talister
Joined: 4 Aug 09
Posts: 1
Credit: 625,371,576
RAC: 530,646
Message 48481 - Posted: 22 Dec 2017 | 18:34:19 UTC - in response to Message 48479.

Had two units (c71-TONI_QMML314rst-0-1-RND2036_2 and c95-TONI_QMML314rst-0-1-RND7455_2) which reached 69.568% and then stalled there, for total runtimes of 26 and 21 hours respectively. This is on an 8-thread i7-6700K running CentOS (effectively RHEL) 7.4.

Have aborted them.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 48482 - Posted: 22 Dec 2017 | 20:02:44 UTC - in response to Message 48481.

Had two units (c71-TONI_QMML314rst-0-1-RND2036_2 and c95-TONI_QMML314rst-0-1-RND7455_2) which reached 69.568% and then stalled there, for total runtimes of 26 and 21 hours respectively. This is on an 8-thread i7-6700K running CentOS (effectively RHEL) 7.4.

Have aborted them.

I have found that I have to run only one work unit at a time to prevent them from stalling.
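
One way to enforce this without babysitting the client is BOINC's app_config.xml, whose max_concurrent element caps how many tasks of an application run at once. Below is a minimal Python sketch that generates such a file. The app name "example_qc_app" is a placeholder, since the real short name of the QC application is not given in this thread and would have to be looked up, for instance in client_state.xml.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Return app_config.xml content limiting concurrent tasks for one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


# "example_qc_app" is a placeholder, not the project's real app name.
print(build_app_config("example_qc_app", 1))
```

The output would be saved as app_config.xml in the project's folder inside the BOINC data directory, after which the client needs to re-read its config files.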

Trotador
Joined: 25 Mar 12
Posts: 103
Credit: 9,769,314,893
RAC: 32,536
Message 48490 - Posted: 24 Dec 2017 | 6:42:50 UTC

Tasks seem to return to 0% completion after a BOINC restart on Ubuntu.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 48491 - Posted: 24 Dec 2017 | 7:52:21 UTC - in response to Message 48490.

Tasks seem to return to 0% completion after a BOINC restart on Ubuntu.

They return to 1% complete after a reboot for me on Ubuntu. But then, after about 1/2 hour, they go back to where they left off, more or less.
I think it is just a startup period that they have to get through first.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 48492 - Posted: 24 Dec 2017 | 8:05:09 UTC - in response to Message 48491.
Last modified: 24 Dec 2017 | 8:05:40 UTC

@jim thanks! I think I understand now.

There is a startup phase (actually two), during which the latest version of the software is checked online. This should be relatively fast, but occurs at low %.

Then the loop over the molecules contained in the task is resumed. The progress bar is currently updated only at the end of each completed molecule.

So, I can confirm that, apart from the progress bar, the behaviour is correct and does not imply that the task is repeating work already done.
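
To illustrate the behaviour described above, here is a rough Python sketch, not the actual task code. Progress is reported only when a molecule finishes, so a task resumed partway through a molecule shows nothing new until that molecule is done. The function name and the already_done argument are invented for the example.

```python
def run_molecules(molecules, already_done=0):
    """Yield the progress fraction after each completed molecule.

    already_done is the number of molecules finished before the restart;
    those are skipped rather than recomputed.
    """
    total = len(molecules)
    for index in range(already_done, total):
        # The expensive per-molecule calculation would happen here;
        # no progress is reported while it runs.
        yield (index + 1) / total


# Resuming a 4-molecule task after 2 molecules were already completed.
print(list(run_molecules(["m1", "m2", "m3", "m4"], already_done=2)))  # [0.75, 1.0]
```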

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 48495 - Posted: 24 Dec 2017 | 12:38:07 UTC - in response to Message 48492.

By the way, I have found that while running two work units at once seems to always cause problems with stuck work units and errors, running only one at a time does not solve all problems. They often still get stuck, but a reboot fixes it. So running one at a time seems to be a necessary but not sufficient condition for my machines to work properly.

Trotador
Joined: 25 Mar 12
Posts: 103
Credit: 9,769,314,893
RAC: 32,536
Message 48505 - Posted: 25 Dec 2017 | 6:51:35 UTC

Units stuck either at 78.698% or 69.568% with over 1 day processed time and remaining processing time increasing.

Should I let them continue or abort?

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 48507 - Posted: 25 Dec 2017 | 13:36:50 UTC - in response to Message 48505.
Last modified: 25 Dec 2017 | 13:53:04 UTC

Units stuck either at 78.698% or 69.568% with over 1 day processed time and remaining processing time increasing.

Should I let them continue or abort?

A reboot usually fixes them, but as noted above they will return to a low value before you see any progress past the point where you left them.

EDIT: But I am wondering whether that is "elapsed time", in which case one day is probably too long, or else "CPU time" (shown in parentheses in BoincTasks). One day of CPU time is not unreasonably long for those percentages, and I would just wait a couple of hours to see if you make progress past those points. They do get stuck there for a while, but temporarily.

(That is the problem with running this project. You never know if it is working normally or not.)

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 48508 - Posted: 25 Dec 2017 | 16:14:11 UTC - in response to Message 48507.

I just rebooted one myself that was stuck at 78.698% too long, and it failed. You never know what you will get.
http://www.gpugrid.net/result.php?resultid=16790478

Tiger
Joined: 30 Jan 15
Posts: 7
Credit: 402,017,837
RAC: 0
Message 48566 - Posted: 31 Dec 2017 | 16:38:01 UTC - in response to Message 48450.

:D
