Message boards : Number crunching : GPUGRID active users are falling dramatically!

Author Message
Betting Slip
Joined: 5 Jan 09
Posts: 566
Credit: 1,757,497,625
RAC: 1,773,912
Level
His
Message 47358 - Posted: 3 Jun 2017 | 9:24:46 UTC
Last modified: 3 Jun 2017 | 10:13:46 UTC

Why are active users falling on this project?

Lost over 33% of active users since April. That's bad and doesn't bode well for the future.

Only faster cards are holding flops/s up.

I think it is time for this project to get proactive about attracting and retaining high-end users, to communicate more, and to stop taking users for granted.



Source: https://boincstats.com/en/stats/45/project/detail/user

I posted this over a year ago: https://gpugrid.net/forum_thread.php?id=4304#43468

Jim1348
Joined: 28 Jul 12
Posts: 429
Credit: 944,247,802
RAC: 2,583,371
Level
Glu
Message 47359 - Posted: 3 Jun 2017 | 15:55:19 UTC - in response to Message 47358.

Maybe the occasional very long work units are discouraging some people? I don't think the bonuses are that important, but it is a psychological thing. Also, the reduced output of the app under Windows may be a problem, though I am mainly on Linux now and am glad to have the work.

kain
Joined: 3 Sep 14
Posts: 96
Credit: 113,125,141
RAC: 103,861
Level
Cys
Message 47361 - Posted: 3 Jun 2017 | 20:58:02 UTC

It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for a better future...

Betting Slip
Joined: 5 Jan 09
Posts: 566
Credit: 1,757,497,625
RAC: 1,773,912
Level
His
Message 47362 - Posted: 3 Jun 2017 | 21:45:22 UTC - in response to Message 47361.

It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for a better future...


I have had ambient temperatures of up to 32°C in summer and still run computers without air conditioning. Are you trying to tell me that's responsible for a 33% drop in users that started in April?

I would like to believe you but I can't just yet.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1787
Credit: 9,544,984,844
RAC: 2,798,536
Level
Tyr
Message 47363 - Posted: 4 Jun 2017 | 0:49:26 UTC - in response to Message 47362.
Last modified: 4 Jun 2017 | 0:50:59 UTC

What happened between 21st & 22nd of April? (actually a month before this date)
21. April: 3133 users
22. April: 2840 users
That's 293 fewer (~9%) in a day.
There's a "thing" called charityengine.
It actually installs the BOINC manager and connects it to different projects (at least it did the last time I encountered it; I did not have the guts to install it on my computers, as they sell the computing power their users contribute). I think every project has a lot of users thanks to charityengine, but I think these users could quit after a short period if they don't win a fortune.
Maybe it's not them, but another similar "thing".
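
For anyone who wants to check the one-day drop, here is a minimal Python sketch using only the two user counts quoted in this post. The function name `percent_drop` is made up for illustration and is not part of any BOINC or boincstats tooling.

```python
def percent_drop(before: int, after: int) -> float:
    """Return the relative drop from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Active-user counts quoted in the post for 21 and 22 April.
users_21_april = 3133
users_22_april = 2840

lost = users_21_april - users_22_april
drop = percent_drop(users_21_april, users_22_april)
print(f"{lost} fewer users, a drop of {drop:.1f}%")
```

The same function could be applied to any two daily counts from the boincstats page to see whether other days show a similar step.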

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47364 - Posted: 4 Jun 2017 | 6:20:05 UTC - in response to Message 47358.

Why are active users falling on this project?

Lost over 33% of active users since April. That's bad and doesn't bode well for the future.

Only faster cards are holding flops/s up.

I think it is time for this project to get proactive about attracting and retaining high-end users, to communicate more, and to stop taking users for granted.


Really surprised? I am not!

It all started out around mid-April when all of a sudden the crunching software became invalid and stopped working (which should never ever happen that way).

So new software had to be put together, and since this had to be done in a hurry, it was rather buggy. Here are just a few examples of what one could read in various threads of the forum, and what I myself have been experiencing since:

- the new software is around 30% slower :-(

- GPU overclocking is much more limited than before (at least for Maxwell cards; no idea how it is with Pascals).

- tasks stop for unknown reasons, and only continue if they are suspended and resumed manually (so, if a cruncher does not notice such a stop for, say, 10 hours, the system sits idle for 10 hours).

- the new software does not work well with BOINC: when pushing the "Suspend" button in the BOINC manager (either in "Tasks" or in "Projects"), it takes several minutes until the task reacts and stops.

In the recent past, GPUGRID tasks have become even more GPU-straining and long-lasting; for example "ADRIA_FOLDGREED10_crystal_ss_contacts_100_ubiquitin" (also the _50_ubiquitin): on a GTX750Ti (with some involuntary stops in between, as mentioned above), it can take 3 days or more until this task finishes.
That's why I had suggested that on the Project Preference page, besides "short runs" (which virtually don't exist any more) and "long runs", a third category like "extra long runs" (or whatever wording suits) be implemented, so that the many GTX750Ti crunchers can exclude such long tasks from download.

And here we are at the next problem:
back at GPUGRID, no one really seems to care what problems the crunchers have and what suggestions they make.
Reading a lot in the forum, I can think of so many other people writing about all kinds of problems, making useful suggestions and also asking questions now and then. However: NO REACTION AT ALL!

So, coming back to the beginning of this posting: I am NOT surprised that people are turning away from this project. Sorry to say this :-(

By chance, yesterday I read a statement from the project people in the forum of another BOINC project I participate in:
"Of course having happy volunteers is very important for the health of a project; so it is something that should be addressed ..."
Why is this different with GPUGRID?

randi
Joined: 9 Nov 16
Posts: 2
Credit: 1,843,000
RAC: 14
Level
Ala
Message 47367 - Posted: 5 Jun 2017 | 16:10:18 UTC
Last modified: 5 Jun 2017 | 16:11:07 UTC

I have been waiting a long time to get a task.

Normally I say no to long tasks because they take a VERY long time on my computer.
Recently I changed that to yes, but I am still not getting any tasks.

6/5/2017 12:07:49 | GPUGRID | update requested by user
6/5/2017 12:07:54 | GPUGRID | Sending scheduler request: Requested by user.
6/5/2017 12:07:54 | GPUGRID | Requesting new tasks for NVIDIA GPU
6/5/2017 12:07:56 | GPUGRID | Scheduler request completed: got 0 new tasks
6/5/2017 12:07:56 | GPUGRID | No tasks sent
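
For anyone wanting to check many such event-log lines at once, here is a rough Python sketch that parses lines in the format pasted above and reports how many tasks each scheduler request returned. It assumes the `date time | project | message` layout and the month/day/year date order seen in this excerpt; other BOINC installations may format the log differently, and the helper names `parse_line` and `tasks_received` are made up for illustration.

```python
import re
from datetime import datetime

# The event-log excerpt pasted in the post above.
LOG = """\
6/5/2017 12:07:49 | GPUGRID | update requested by user
6/5/2017 12:07:54 | GPUGRID | Sending scheduler request: Requested by user.
6/5/2017 12:07:54 | GPUGRID | Requesting new tasks for NVIDIA GPU
6/5/2017 12:07:56 | GPUGRID | Scheduler request completed: got 0 new tasks
6/5/2017 12:07:56 | GPUGRID | No tasks sent
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    ts, project, message = (part.strip() for part in line.split("|", 2))
    # Assumption: dates are month/day/year, as in the excerpt above.
    return datetime.strptime(ts, "%m/%d/%Y %H:%M:%S"), project, message


def tasks_received(log_text):
    """Return (project, task_count) for every completed scheduler request."""
    counts = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        _, project, message = parse_line(line)
        match = re.search(r"got (\d+) new tasks", message)
        if match:
            counts.append((project, int(match.group(1))))
    return counts


for project, count in tasks_received(LOG):
    print(f"{project}: scheduler returned {count} new task(s)")
```

A count of 0 only shows that the server sent nothing; it does not say why, so the thread linked in the reply below is still the place to look for causes.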

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47368 - Posted: 5 Jun 2017 | 16:30:02 UTC - in response to Message 47367.

I have been waiting a long time to get a task.

I guess this is not quite the right thread for your problem.

A lot of statements and opinions about the problem of not getting tasks are contained in this thread here:

http://gpugrid.net/forum_thread.php?id=4574

You may want to look it up; perhaps you will get an idea of what's wrong.

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47404 - Posted: 12 Jun 2017 | 9:45:50 UTC - in response to Message 47364.

...
It all started out around mid-April when all of a sudden the crunching software became invalid and stopped working (which should never ever happen that way).

So new software had to be put together, and since this had to be done in a hurry, it was rather buggy. Here are just a few examples of what one could read in various threads of the forum, and what I myself have been experiencing since:

- the new software is around 30% slower :-(

- GPU overclocking is much more limited than before (at least for Maxwell cards; no idea how it is with Pascals).

- tasks stop for unknown reasons, and only continue if they are suspended and resumed manually (so, if a cruncher does not notice such a stop for, say, 10 hours, the system sits idle for 10 hours).

- the new software does not work well with BOINC: when pushing the "Suspend" button in the BOINC manager (either in "Tasks" or in "Projects"), it takes several minutes until the task reacts and stops.

I am curious how much longer it will take the GPUGRID people to acknowledge that the current software is buggy and needs to be repaired!
More or less every day, I get annoyed by these bugs cited above :-(

Stefan
Volunteer moderator
Project developer
Project scientist
Joined: 5 Mar 13
Posts: 250
Credit: 0
RAC: 0
Level

Message 47407 - Posted: 12 Jun 2017 | 10:07:17 UTC - in response to Message 47404.

Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are rather considering other options, like moving away from it, but don't ask when or how, as it's more an idea than a scheduled plan.

I am sorry for the inconvenience this causes.

The reason we cannot address technical issues with BOINC is that we don't have anyone in the lab anymore who knows his way around it, and that getting scientific work done has higher priority. Of course you have a point that this will eventually bite us in the ass, since we won't be able to do scientific work without crunchers, but it's a tricky thing to manage.

Jacob Klein
Joined: 11 Oct 08
Posts: 1021
Credit: 972,630,439
RAC: 1,138,026
Level
Glu
Message 47411 - Posted: 12 Jun 2017 | 13:39:13 UTC - in response to Message 47407.

Thanks for replying.

It is unclear to me which bugs are actual BOINC bugs (if any?), which bugs are NVIDIA bugs (if any?), and which bugs are GPUGrid app bugs.

I hope you guys can get the resources you need to triage and solve the bugs. If you think any are BOINC or NVIDIA bugs, please let the community know, so we can continue to offer help in solving those. I have been asking for a while, with no response.

Jacob

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47433 - Posted: 14 Jun 2017 | 13:50:53 UTC - in response to Message 47407.

Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are rather considering other options, like moving away from it, but don't ask when or how, as it's more an idea than a scheduled plan.

I am sorry for the inconvenience this causes.

The reason we cannot address technical issues with BOINC is that we don't have anyone in the lab anymore who knows his way around it, and that getting scientific work done has higher priority. Of course you have a point that this will eventually bite us in the ass, since we won't be able to do scientific work without crunchers, but it's a tricky thing to manage.

Sorry, Stefan, for contradicting you.
I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong.

As said before, this software was obviously compiled in a hurry, overnight so to speak, without much (thorough) testing.
None of these bugs existed with the previous software.

The content of the second paragraph of your posting makes me worry even more.
Again, as I said in another posting, a project of the magnitude of GPUGRID definitely needs a certain amount of infrastructure expertise. Just having the scientists there is not enough.

If, for example, no one at GPUGRID is able to reply to my posting
http://gpugrid.net/forum_thread.php?id=4561&nowrap=true#47204
from a month ago, then something needs to be improved. Definitely so.
Otherwise, GPUGRID really risks losing more and more crunchers. Which would be too bad - I personally feel that GPUGRID is a fantastic project! And that's why I am participating :-)
So, please put your heads together to come up with a solution!

Jim1348
Joined: 28 Jul 12
Posts: 429
Credit: 944,247,802
RAC: 2,583,371
Level
Glu
Message 47434 - Posted: 14 Jun 2017 | 14:20:21 UTC - in response to Message 47433.

I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong.

FWIW, I have always advocated optimizing the apps for the latest hardware, since I think you get more bang for the crunching buck that way. If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

I usually have fairly new cards, and you will get a lot of complaints from people with older cards that they are being abandoned, or that they are being "forced" to buy new cards (I love that one).

So you have a choice. Make the one that is best for the science.

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47435 - Posted: 14 Jun 2017 | 15:13:16 UTC - in response to Message 47434.
Last modified: 14 Jun 2017 | 15:50:18 UTC

If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

One thing that's interesting:

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess there won't be many crunchers using WindowsXP; so, many of the crunchers using their GTX750Ti with Windows10 might have problems now. And I also guess that there are many crunchers with a GTX750Ti. What can be done: throw the GTX750Ti's away? :-(

Last year, I bought two GTX780Ti just for GPUGRID crunching, Euro 700 each. So far, they work perfectly with WindowsXP. When GPUGRID support for WindowsXP ends in April of next year, I'll need to change to Windows10. And then all the problems will begin.
However, I don't think that I will exchange them for two new Pascals. Paying some 1400 Euros every two years just to have the latest generation of cards in order to have GPUGRID running smoothly?

Jim1348
Joined: 28 Jul 12
Posts: 429
Credit: 944,247,802
RAC: 2,583,371
Level
Glu
Message 47436 - Posted: 14 Jun 2017 | 16:02:42 UTC - in response to Message 47435.
Last modified: 14 Jun 2017 | 16:22:56 UTC

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess that says something about WDDM, but I don't know what. It would be fun to trace it down, but GPUGrid just does not have the staff it seems. That is why they have to avoid unnecessary risks if they can. It is not a perfect solution, but seems to be the best under the circumstances.

I was planning to wait for Volta, but that will be a long time, so I migrated out of the lower-end cards into a few Pascals for higher efficiency in the warmer months, though it is still a mix. The prices are much more reasonable in the U.S., especially on sales. But everything has gone through the roof now, apparently with high demand for AMD cards even spilling over into Nvidia.

mikey
Joined: 2 Jan 09
Posts: 268
Credit: 190,419,115
RAC: 0
Level
Ile
Message 47470 - Posted: 18 Jun 2017 | 11:17:29 UTC - in response to Message 47435.

If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

One thing that's interesting:

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess there won't be many crunchers using WindowsXP; so, many of the crunchers using their GTX750Ti with Windows10 might have problems now. And I also guess that there are many crunchers with a GTX750Ti. What can be done: throw the GTX750Ti's away? :-(

Last year, I bought two GTX780Ti just for GPUGRID crunching, Euro 700 each. So far, they work perfectly with WindowsXP. When GPUGRID support for WindowsXP ends in April of next year, I'll need to change to Windows10. And then all the problems will begin.
However, I don't think that I will exchange them for two new Pascals. Paying some 1400 Euros every two years just to have the latest generation of cards in order to have GPUGRID running smoothly?


On other projects some people have gone back to older drivers for their older GPUs, and that got the GPUs working under Win10 again. In short, try older drivers and see if your Win10 machine can crunch again; it may just work for you too.

ChristianVirtual
Joined: 16 Aug 14
Posts: 5
Credit: 176,451,625
RAC: 3,193,193
Level
Ile
Message 47471 - Posted: 18 Jun 2017 | 11:33:31 UTC
Last modified: 18 Jun 2017 | 11:55:59 UTC

For what it is worth: no issues with Linux / CentOS on my 980Ti, 1080 and 1080Ti ... come over to the bright side of life ;-)

Update: oops, sorry, just saw the version difference ... kind of still learning the technical details here ...

Erich56
Joined: 1 Jan 15
Posts: 311
Credit: 1,217,077,427
RAC: 2,906,193
Level
Met
Message 47473 - Posted: 18 Jun 2017 | 16:13:12 UTC - in response to Message 47470.

On other projects some people have gone back to older drivers for their older GPUs, and that got the GPUs working under Win10 again. In short, try older drivers and see if your Win10 machine can crunch again; it may just work for you too.

The new crunching software acemd 918.80 only works with the latest drivers.
My two Windows 10 machines had run with 376.53 before, and with the new crunching software I had to update to 381.65 to get GPUGRID to run.
Furthermore, Matt pointed out clearly that the new software requires the newest drivers.

In other words: installing older drivers to get the problems solved is not an option :-(
