Message boards : Number crunching : nv 9800 using full cpu core (linux)

Author Message
Profile trigggl
Joined: 6 Mar 09
Posts: 25
Credit: 102,324,681
RAC: 0
Message 18248 - Posted: 5 Aug 2010 | 16:26:43 UTC

I was wondering why a task looked like it would take 12 days on my 9800 when others have had success. I finally noticed an acemd2 task that showed up at the top of my process list only once in a blue moon. I set the niceness of the task to 0, and it then stayed at the top of the process list and started making some progress. I realized that the task needs a full core, not just the 0.27 that BOINC claims it is using. So I am now babysitting my BOINC client to make sure only one yoyo task is enabled at a time, so that GPUGrid can do its thing.

Does anyone else running Linux have this problem, or has anyone had it and found a fix? I do know that the GPU is being used now because the temperature goes up. It's at 160F instead of 130F.
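For anyone wanting to repeat the niceness change, here is a minimal sketch. It assumes the process is really named acemd2 (the name seen in the process list above) and that `pgrep` and `renice` from the usual Linux tools are installed; `renice_by_name` is a made-up helper name, not a BOINC command.

```shell
#!/bin/sh
# Hypothetical helper: set the nice value of every process with an exact name.
# Usage: renice_by_name <process-name> [nice-value]
renice_by_name() {
    name=$1
    prio=${2:-0}
    # pgrep -x matches the process name exactly and prints one PID per line.
    pids=$(pgrep -x "$name") || { echo "no process named $name" >&2; return 1; }
    # $pids is deliberately unquoted so that several PIDs become several arguments.
    # Moving a process to a lower nice value usually requires root.
    renice -n "$prio" -p $pids
}

# Example (commented out because it changes a running process):
# renice_by_name acemd2 0
```

The setting only lasts for the life of that process, so it probably has to be repeated for each new task.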
____________
Gentoo is not just a penguin.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 18257 - Posted: 6 Aug 2010 | 0:03:04 UTC - in response to Message 18248.

This is normal for Linux and significantly improves runtime for the GPUGrid task.
Let this task complete to see how it gets on.

Profile trigggl
Message 18258 - Posted: 6 Aug 2010 | 1:46:14 UTC - in response to Message 18257.

Quote (skgiven, Message 18257): This is normal for Linux and significantly improves runtime for the GPUGrid task. Let this task complete, to see how it gets on.

Looks like I got 6,800 for 12 hours.

http://www.gpugrid.net/result.php?resultid=2778086

Profile skgiven
Volunteer moderator
Volunteer tester
Message 18261 - Posted: 6 Aug 2010 | 11:40:13 UTC - in response to Message 18258.

That is spot on; just what you would expect for that card.

Profile Saenger
Joined: 20 Jul 08
Posts: 134
Credit: 23,657,183
RAC: 0
Message 18262 - Posted: 6 Aug 2010 | 18:27:42 UTC - in response to Message 18248.
Last modified: 6 Aug 2010 | 19:25:28 UTC

Quote (trigggl, Message 18248): Anyone else running linux have this problem or had it and know of a fix? I do know that the GPU is being used now because the temp goes up. It's at 160F instead of 130F.

Others have this problem as well; it's been discussed in this thread.
I'm trying it again now with my new graphics card, and it looks a bit better, though not good: Xorg is using considerably less CPU (~5-10%), and I have yet to see any values above 100% for the WU.

The main problem still exists: it still pretends to use only 0.15 CPU and thus lets 4 other projects crunch in parallel. The project should definitely acknowledge this behaviour officially, which should be no problem, and obviously it only has to do so for penguins. Just put "1 CPU + 1 GPU" in the relevant field of the WU instead of "0.15 CPU + 1 GPU", and everything would be fine.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile Saenger
Message 18269 - Posted: 7 Aug 2010 | 12:19:54 UTC

I'm just crunching an Einstein CUDA task, also to test my new GT240, and it behaves considerably better.
I know that it's not as good at utilising the GPU (I can see that from the card's low temperatures), but it states quite clearly: "Aktiv (1.00 CPUs + 1.00 NVIDIA GPUs)" [Aktiv = active].

So in terms of programming and utilisation Einstein's bad; in terms of behaviour towards other projects Einstein's fine.
Here it's the opposite: in terms of programming and utilisation GPUGrid's good; in terms of behaviour towards other projects GPUGrid's bad.

I think it should be very easy to correct the latter one.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile skgiven
Volunteer moderator
Volunteer tester
Message 18272 - Posted: 7 Aug 2010 | 17:04:21 UTC - in response to Message 18269.

Einstein barely uses the GPU so their 1.00 CPUs + 1.00 NVIDIA GPUs is also wrong. On Linux GPUGrid basically allocates one CPU core/thread to facilitate the GPU. This 'bug' may be fixed after Sept but it does speed up the tasks. You might want to consider leaving a core free in Boinc, running GPUGrid, 3 normal CPU tasks, and a light FreeHAL task (they only use about 3 or 4% CPU).
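One way to leave a core free is BOINC's "use at most N% of the processors" computing preference. A rough sketch of the arithmetic, assuming a four-core host as the numbers above suggest; `free_core_pct` is a hypothetical helper, not part of BOINC:

```shell
#!/bin/sh
# Hypothetical helper: percentage of CPUs BOINC may use so that one core stays free.
# Usage: free_core_pct <number-of-cores>
free_core_pct() {
    cores=$1
    if [ "$cores" -le 1 ]; then
        echo 100    # a single-core host has no core to spare
    else
        echo $(( (cores - 1) * 100 / cores ))    # integer division rounds down
    fi
}

# Example for a 4-core host; on Linux, "$(nproc)" could supply the core count.
free_core_pct 4
```

The result can be entered in the computing preferences, or set as `<max_ncpus_pct>` in global_prefs_override.xml and reloaded with `boinccmd --read_global_prefs_override`.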

Profile trigggl
Message 18273 - Posted: 7 Aug 2010 | 17:40:05 UTC - in response to Message 18272.
Last modified: 7 Aug 2010 | 17:45:00 UTC

Well, I have found a way to keep my yoyo work units to one at a time for most of the time. I just have to estimate the length of time for each work unit and write an 'at' command for each one. There's probably a way I could write a bash script to handle this automatically, but who's got the time?
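A sketch of what one of those 'at' jobs could look like. The `boinccmd --result URL name resume` form is the one used elsewhere in this thread; `resume_cmd`, the task name and the three-hour delay are placeholders for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: print the boinccmd call that resumes one task.
# Usage: resume_cmd <project-url> <task-name>
resume_cmd() {
    url=$1
    task=$2
    printf 'boinccmd --result %s %s resume\n' "$url" "$task"
}

# Example (commented out because it schedules a real job):
# resume_cmd http://www.rechenkraft.net/yoyo/ SOME_TASK_NAME | at now + 3 hours
```

Printing the command first makes it easy to check before handing it to 'at', which reads the job from standard input.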

Perhaps it's possible to find the minimum of CPU % required and fix it with cc_config or app_info on a case by case basis.

Profile trigggl
Message 18276 - Posted: 7 Aug 2010 | 23:55:01 UTC - in response to Message 18273.
Last modified: 7 Aug 2010 | 23:56:18 UTC

I think I have a script that will work for yoyo. Modify it for your own secondary project (change "yoyo" to part of the name of the one you're working on).

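#!/bin/bash
# Needs bash rather than plain sh because of the ${TASK:9} substring expansion.

# Keep looping for as long as any line of the results mentions yoyo.
until [ "$(boinccmd --get_results | grep -c yoyo)" -eq 0 ] ; do

    # While two or more yoyo tasks are active, suspend the last unsuspended one.
    until [ "$(boinccmd --get_results | grep yoyo -A 10 | grep -c task.state..1)" -lt 2 ] ; do
        TASK=$(boinccmd --get_results | grep yoyo -B 2 -A 10 | grep GUI..no -B 11 | grep " name" | tail -n 1)
        task=$(echo ${TASK:9})    # drop the first 9 characters (the name label)
        boinccmd --result http://www.rechenkraft.net/yoyo/ "$task" suspend
    done

    # If no yoyo task is active, resume the first suspended one.
    if [ "$(boinccmd --get_results | grep yoyo -A 10 | grep -c task.state..1)" -eq 0 ] ; then
        TASK=$(boinccmd --get_results | grep yoyo -B 2 -A 10 | grep GUI..yes -B 11 | head -n 1)
        task=$(echo ${TASK:9})
        boinccmd --result http://www.rechenkraft.net/yoyo/ "$task" resume
    fi

    sleep 60

done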


If more than one task is active, it suspends the last ones. If the number of active tasks drops to 0, it resumes the top one. I think this will even work when allowing new tasks.

Profile Saenger
Message 18281 - Posted: 8 Aug 2010 | 12:34:58 UTC - in response to Message 18272.

Quote (skgiven, Message 18272): Einstein barely uses the GPU so their 1.00 CPUs + 1.00 NVIDIA GPUs is also wrong. On Linux GPUGrid basically allocates one CPU core/thread to facilitate the GPU. This 'bug' may be fixed after Sept but it does speed up the tasks. You might want to consider leaving a core free in Boinc, running GPUGrid, 3 normal CPU tasks, and a light FreeHAL task (they only use about 3 or 4% CPU).

As the Linux-App (6.04 or 6.06) is not the same as the Windows-App (6.05 or 6.11), why not put the right numbers in there? Why cheat the other projects? I don't think it's that hard to say to BOINC that you use 1 full core, and it only needs to be done for Linux.

It's imho just laziness and disregard for the crunchers that prevents this.
____________
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki

Profile trigggl
Message 18285 - Posted: 8 Aug 2010 | 14:24:36 UTC

My script above didn't work properly. I fixed it below.

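#!/bin/bash
# Needs bash rather than plain sh because of the ${TASK:9} substring expansion.

# Keep looping for as long as any line of the results mentions yoyo.
until [ "$(boinccmd --get_results | grep -c yoyo)" -eq 0 ] ; do

    # While two or more unreported yoyo tasks are unsuspended, suspend the last one.
    until [ "$(boinccmd --get_results | grep yoyo -A 10 | grep report..no -A 7 | grep -c GUI..no)" -lt 2 ] ; do
        TASK=$(boinccmd --get_results | grep yoyo -B 2 -A 10 | grep GUI..no -B 11 | grep " name" | tail -n 1)
        task=$(echo ${TASK:9})    # drop the first 9 characters (the name label)
        boinccmd --result http://www.rechenkraft.net/yoyo/ "$task" suspend
    done

    # If no yoyo task is active, resume the first suspended one.
    if [ "$(boinccmd --get_results | grep yoyo -A 10 | grep -c task.state..1)" -eq 0 ] ; then
        TASK=$(boinccmd --get_results | grep yoyo -B 2 -A 10 | grep GUI..yes -B 11 | head -n 1)
        task=$(echo ${TASK:9})
        boinccmd --result http://www.rechenkraft.net/yoyo/ "$task" resume
    fi

    sleep 60    # avoid hammering boinccmd
done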

Basically what this does:

Until I run out of work,

until non-finished active tasks is less than 2
suspend bottom non-suspended tasks

if the one running task completes
resume the top suspended one.

wait a minute so I'm not hammering the boinc command.
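The chained grep -A/-B calls depend on exact line offsets in the boinccmd output. A possibly sturdier alternative, offered only as a sketch, is to let awk track one task record at a time. It assumes each record contains lines labelled "name:", "project URL:" and "suspended via GUI:", in that order; that layout is inferred from the grep patterns above, not checked against any particular BOINC version. `list_tasks` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print "<task name> <yes|no>" (suspended or not) for every
# task whose project URL contains the given pattern.
# Reads boinccmd --get_results style text on standard input.
list_tasks() {
    awk -v pat="$1" '
        # "name: foo" starts a new record; "WU name:" lines have $1 == "WU" and are skipped.
        $1 == "name:" { name = $2; url = "" }
        $1 == "project" && $2 == "URL:" { url = $3 }
        $1 == "suspended" && $2 == "via" && $3 == "GUI:" {
            if (index(url, pat) > 0) print name, $4
        }
    '
}

# Example (needs a running BOINC client):
# boinccmd --get_results | list_tasks yoyo
```

Matching on the project URL rather than on any line containing "yoyo" avoids counting the same task several times.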

Profile trigggl
Message 18311 - Posted: 9 Aug 2010 | 21:00:55 UTC - in response to Message 18285.

Also, I created an app_info.xml that seemed to work. By claiming a full core, it stayed active and stopped other processes from interrupting. It seemed like it was a little slower, though, so I disabled it.
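The file itself isn't reproduced here, so what follows is only a sketch of the general shape such an app_info.xml might take, written out by a small shell function. APP_NAME, APP_EXECUTABLE and VERSION_NUM are placeholders that would have to match the files in the project directory, and a real file would most likely need more file entries (libraries, for example). The part that claims a full core is `<avg_ncpus>1.0</avg_ncpus>`. `write_app_info` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: write a skeleton app_info.xml that claims one full CPU
# core plus one CUDA GPU. All upper-case names are placeholders.
# Usage: write_app_info <output-file>
write_app_info() {
    cat > "$1" <<'EOF'
<app_info>
  <app>
    <name>APP_NAME</name>
  </app>
  <file_info>
    <name>APP_EXECUTABLE</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>APP_NAME</app_name>
    <version_num>VERSION_NUM</version_num>
    <avg_ncpus>1.0</avg_ncpus>
    <max_ncpus>1.0</max_ncpus>
    <coproc>
      <type>CUDA</type>
      <count>1</count>
    </coproc>
    <file_ref>
      <file_name>APP_EXECUTABLE</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
EOF
}

# Example: write the skeleton to the current directory for editing.
# write_app_info ./app_info.xml
```

The skeleton is meant to be edited by hand before it is copied into the project directory.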

My script works well enough (almost too well) as long as boinc isn't interrupted.
