Message boards : News : CUDA 6.5 app for Linux now available on acemdbeta and acemdshort

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38291 - Posted: 4 Oct 2014 | 13:58:56 UTC

Experiences here, please.

Nicolas_orleans
Joined: 25 Jun 14
Posts: 14
Credit: 446,219,525
RAC: 0
Message 38294 - Posted: 4 Oct 2014 | 16:02:33 UTC

Ubuntu 14.04.1 - Nvidia driver 343.22 - BOINC Client 7.3.15 - 1 GTX 770 + 1 GTX 750 Ti - Both working on acemdshort 8.46 (CUDA 6.5)

Scheduling questions:
1/ My account is set up to receive only acemdlong (no acemdshort or acemdbeta), so why did I get an acemdshort task?
2/ My system is GK104 + GM107, so based on the rules quoted below from http://www.gpugrid.net/forum_thread.php?id=3874, shouldn't I have received the CUDA 6.0 application?

* If you have a GM204 you will get a CUDA65 application, or nothing if your driver is too old
* If you have a GM107 you will get a CUDA60 application, or nothing if your driver is too old
* If you have a Fermi or Kepler you will get a CUDA60 application, or CUDA42 if your driver is too old


Regarding performance, nothing special right now. GM107 and GK104 appear to crunch well with CUDA 6.5 :)
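
For reference, the three rules quoted above can be written down as a small lookup. This is only a sketch of the rules as stated in the older thread, not the scheduler's real code: the function name pick_app and the driver_new_enough flag are made up for illustration, and the rules do not give the actual minimum driver versions. The 8.46 release evidently behaves differently, since this GK104 + GM107 host received the CUDA 6.5 app.

```python
from typing import Optional

# Architecture names used in the quoted rules.
OLDER_ARCHS = {"Fermi", "Kepler"}


def pick_app(arch: str, driver_new_enough: bool) -> Optional[str]:
    """Return the app the quoted rules imply, or None for "nothing".

    driver_new_enough is a hypothetical flag: the rules only say
    "if your driver is too old" without naming a version.
    """
    if arch == "GM204":
        return "cuda65" if driver_new_enough else None
    if arch == "GM107":
        return "cuda60" if driver_new_enough else None
    if arch in OLDER_ARCHS:
        # Fermi/Kepler fall back to the older app instead of getting nothing.
        return "cuda60" if driver_new_enough else "cuda42"
    raise ValueError(f"architecture not covered by the quoted rules: {arch}")


if __name__ == "__main__":
    for arch in ("GM204", "GM107", "Kepler"):
        print(arch, pick_app(arch, True), pick_app(arch, False))
```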

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38296 - Posted: 4 Oct 2014 | 18:28:19 UTC

64-bit Linux Mint 17, GTX 980, Nvidia driver 343.22

Finished a NOELIA_SH2 WU:

http://www.gpugrid.net/result.php?resultid=13168490

Everything ok so far.

captainjack
Joined: 9 May 13
Posts: 112
Credit: 820,718,399
RAC: 1,200,373
Message 38297 - Posted: 4 Oct 2014 | 18:47:29 UTC

64-bit Linux Ubuntu, Driver 340.24.

Finished 2 on a GTX 770 and they validated.

One more under way on a GTX 660ti and one more on a GTX 770.

Everything looks normal so far.

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38298 - Posted: 4 Oct 2014 | 22:19:44 UTC - in response to Message 38294.

Nicolas,

You got a short because there's no long app yet.

Matt

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38303 - Posted: 5 Oct 2014 | 11:24:04 UTC

Some data. On Windows, setting SWAN_SYNC=0 results in 100% usage of one CPU core. It doesn't seem to have any effect on Linux (24-50% CPU usage with or without the environment variable).

The 980 tends to run slower on Linux than on Win 8.1.

I may try rolling back the driver on the machine with the 780Ti to see the performance of the CUDA 6.0 app.

I couldn't get the formatting quite right below but you should be able to figure it out I think.

OS Nvidia driver App WU GPU Task Run time CPU time
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 780Ti 13169477 12687.47 7820.49
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 780Ti 13163102 11840.67 6655.4
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 780Ti 13162843 12907.36 7793.96
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 780Ti 13161646 12550.37 7635.82

Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13168672 10164.36 4482.03
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13168490 9709.92 4853.28
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13163548 10188.12 3704.3
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13162688 10997.22 4198.59
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13162663 10776.91 3976.07
Linux Mint 17 LTS 343.22 v8.46 (cuda65) NOELIA_SH2 980 13162086 11160.55 5360.76

Win 8.1 344.16 v8.46 (cuda65) NOELIA_SH2 980 13162099 8725.56 8700.52
Win 8.1 344.16 v8.46 (cuda65) NOELIA_SH2 980 13161299 9600.29 9566.91
Win 8.1 344.16 v8.46 (cuda65) NOELIA_SH2 980 13159641 9297.7 9262.39
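
To put numbers on the trend, here is a minimal sketch that averages the rows above per OS/GPU group. The values are copied from the table; the times are presumably seconds, as on the result pages, and the summarize helper is just an illustrative name. The samples are small (4, 6 and 3 tasks), so treat the averages as a rough indication only.

```python
from collections import defaultdict
from statistics import mean

# (OS, GPU, run time, CPU time), copied from the table above.
ROWS = [
    ("Linux", "780Ti", 12687.47, 7820.49),
    ("Linux", "780Ti", 11840.67, 6655.4),
    ("Linux", "780Ti", 12907.36, 7793.96),
    ("Linux", "780Ti", 12550.37, 7635.82),
    ("Linux", "980", 10164.36, 4482.03),
    ("Linux", "980", 9709.92, 4853.28),
    ("Linux", "980", 10188.12, 3704.3),
    ("Linux", "980", 10997.22, 4198.59),
    ("Linux", "980", 10776.91, 3976.07),
    ("Linux", "980", 11160.55, 5360.76),
    ("Win 8.1", "980", 8725.56, 8700.52),
    ("Win 8.1", "980", 9600.29, 9566.91),
    ("Win 8.1", "980", 9297.7, 9262.39),
]


def summarize(rows):
    """Return {(os, gpu): (mean run time, mean CPU time, CPU/run ratio)}."""
    groups = defaultdict(list)
    for os_name, gpu, run_time, cpu_time in rows:
        groups[(os_name, gpu)].append((run_time, cpu_time))
    summary = {}
    for key, samples in groups.items():
        run = mean(s[0] for s in samples)
        cpu = mean(s[1] for s in samples)
        summary[key] = (run, cpu, cpu / run)
    return summary


if __name__ == "__main__":
    for (os_name, gpu), (run, cpu, ratio) in summarize(ROWS).items():
        print(f"{os_name:8s} {gpu:6s} run={run:9.2f} cpu={cpu:9.2f} cpu/run={ratio:.2f}")
```

With these rows the mean run time of the 980 on Linux comes out roughly 14% above the Win 8.1 mean, while the CPU/run ratio is near 1.0 on Windows (one full core, as expected with SWAN_SYNC=0) and well below that on Linux.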

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38306 - Posted: 5 Oct 2014 | 14:12:07 UTC
Last modified: 5 Oct 2014 | 14:12:27 UTC

This error was caused by my rolling back the Nvidia driver on the 780Ti machine.

http://www.gpugrid.net/result.php?resultid=13164376

I'm now testing NOELIA_SH2 WUs using the CUDA 6.0 app.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,629,417,844
RAC: 9,838,275
Message 38308 - Posted: 5 Oct 2014 | 14:56:46 UTC - in response to Message 38303.

Some data. On Windows, setting SWAN_SYNC=0 results in 100% usage of one CPU core. It doesn't seem to have any effect on Linux (24-50% CPU usage with or without the environment variable).

The 980 tends to run slower on Linux than on Win 8.1.

I may try rolling back the driver on the machine with the 780Ti to see the performance of the CUDA 6.0 app.

I couldn't get the formatting quite right below but you should be able to figure it out I think.


You should use the "pre" tag instead of "code", and spaces instead of tabs.
It is also practical to abbreviate the OS name, just as you did with Windows.
Linux = Linux Mint 17 LTS

OS       Nvidia driver  Application     WorkUnit    GPU    Task      Run time  CPU time
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  780Ti  13169477  12687.47   7820.49
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  780Ti  13163102  11840.67    6655.4
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  780Ti  13162843  12907.36   7793.96
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  780Ti  13161646  12550.37   7635.82
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13168672  10164.36   4482.03
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13168490   9709.92   4853.28
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13163548  10188.12    3704.3
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13162688  10997.22   4198.59
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13162663  10776.91   3976.07
Linux    343.22         v8.46 (cuda65)  NOELIA_SH2  980    13162086  11160.55   5360.76
Win 8.1  344.16         v8.46 (cuda65)  NOELIA_SH2  980    13162099   8725.56   8700.52
Win 8.1  344.16         v8.46 (cuda65)  NOELIA_SH2  980    13161299   9600.29   9566.91
Win 8.1  344.16         v8.46 (cuda65)  NOELIA_SH2  980    13159641    9297.7   9262.39

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38309 - Posted: 5 Oct 2014 | 15:34:22 UTC

Thanks for reformatting the table and the format suggestions. I'll give it a go next time.

[VENETO] sabayonino
Joined: 4 Apr 10
Posts: 47
Credit: 545,196,862
RAC: 424,804
Message 38327 - Posted: 6 Oct 2014 | 19:10:12 UTC
Last modified: 6 Oct 2014 | 19:22:36 UTC

Gentoo Linux
Driver 343.22
BOINC 7.2.42

GTX760 : ok
http://www.gpugrid.net/result.php?resultid=13174830

GTX780ti : ok
http://www.gpugrid.net/result.php?resultid=13175351

GTX780Ti : WU became unstable
http://www.gpugrid.net/result.php?resultid=13170156

GTX780Ti :

SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1968.
A problem occurred in the CUDA runtime and ACEMD must terminate.
Your simulation can be restarted from the last checkpoint.

If the output of 'dmesg' contains 'NVRM Xid' messages
the GPUs may be in an inconsistent state and the machine
should be rebooted before trying again.

If the problem recurs please report it to Acellera

http://www.gpugrid.net/result.php?resultid=13169219


dmesg | grep NVRM
[ 3.005434] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 343.22 Thu Sep 11 16:27:43 PDT 2014
[18264.924884] NVRM: GPU at PCI:0000:01:00: GPU-bccee218-a355-bb82-82ae-36a976f3e841
[18264.924892] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[19295.343778] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[32561.926905] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
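
A minimal sketch for pulling the Xid events out of dmesg output like the above, so the codes and timestamps can be compared with the times the WUs failed. The regular expression is an assumption based only on the line format shown here, and XidEvent and parse_xid_events are illustrative names. NVIDIA's Xid list describes code 31 as a GPU memory page fault, but check the current documentation for your driver.

```python
import re
from typing import List, NamedTuple


class XidEvent(NamedTuple):
    uptime_s: float  # kernel timestamp, seconds since boot
    pci_id: str      # PCI address reported by the driver
    code: int        # Xid number


# Assumed line format, taken from the output above:
# [18264.924892] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, ...
XID_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+NVRM: Xid \((PCI:[0-9A-Fa-f:.]+)\): (\d+)"
)


def parse_xid_events(dmesg_text: str) -> List[XidEvent]:
    """Return one XidEvent per 'NVRM: Xid' line; other lines are ignored."""
    events = []
    for line in dmesg_text.splitlines():
        match = XID_RE.search(line)
        if match:
            events.append(
                XidEvent(float(match.group(1)), match.group(2), int(match.group(3)))
            )
    return events


SAMPLE = """\
[ 3.005434] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 343.22 Thu Sep 11 16:27:43 PDT 2014
[18264.924884] NVRM: GPU at PCI:0000:01:00: GPU-bccee218-a355-bb82-82ae-36a976f3e841
[18264.924892] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[19295.343778] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
[32561.926905] NVRM: Xid (PCI:0000:01:00): 31, Ch 00000001, engmask 00000101, intr 10000000
"""

if __name__ == "__main__":
    for event in parse_xid_events(SAMPLE):
        print(event)
```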

Nicolas_orleans
Joined: 25 Jun 14
Posts: 14
Credit: 446,219,525
RAC: 0
Message 38330 - Posted: 6 Oct 2014 | 19:38:51 UTC

Ubuntu 14.04.1 - Nvidia driver 343.22 - BOINC Client 7.3.15 - 1 GTX 770 + 1 GTX 750 Ti - acemdshort 8.46 (CUDA 6.5)

21 tasks completed, no issues
http://www.gpugrid.net/results.php?hostid=177839&offset=0&show_names=0&state=0&appid=17

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38341 - Posted: 7 Oct 2014 | 12:18:34 UTC

Finished 24 "short" tasks on my GTX980 successfully. No errors.

http://www.gpugrid.net/results.php?hostid=176602&offset=0&show_names=0&state=0&appid=17

Are we ready to move to the "long" WUs?

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38354 - Posted: 7 Oct 2014 | 19:23:54 UTC

Is SWAN_SYNC functional in the new CUDA 6.5 Linux app?

I added SWAN_SYNC=0 to /etc/environment, and printenv SWAN_SYNC returns 0.

CPU usage is still quite low, though.
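
One thing worth ruling out: printenv only shows the environment of the current shell. /etc/environment is normally applied at login through PAM, so a BOINC client started as a system service may not have inherited the variable. A minimal sketch, assuming Linux and read permission on the process, that reads the environment a given process was actually started with from /proc/<pid>/environ (parse_environ and process_env are illustrative names):

```python
import os
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Parse the NUL-separated KEY=VALUE blob from /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_env(pid: int) -> Dict[str, str]:
    """Read the start-up environment of a process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as handle:
        return parse_environ(handle.read())


if __name__ == "__main__":
    # This script's own PID is used as a stand-in; substitute the PID of
    # the running science app or of the BOINC client.
    pid = os.getpid()
    if os.path.exists(f"/proc/{pid}/environ"):
        env = process_env(pid)
        print("SWAN_SYNC =", env.get("SWAN_SYNC", "<not set>"))
```

If the variable is missing there, it has to be set wherever the client is actually launched from.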

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38355 - Posted: 7 Oct 2014 | 21:23:41 UTC - in response to Message 38354.

On Linux you need SWAN_SYNC=1

Matt

biodoc
Joined: 26 Aug 08
Posts: 89
Credit: 656,130,328
RAC: 0
Message 38364 - Posted: 8 Oct 2014 | 9:41:30 UTC - in response to Message 38355.

On Linux you need SWAN_SYNC=1

Matt


I'm seeing similar CPU time per WU whether I set SWAN_SYNC to "0" or "1", on both the 780Ti and the 980 (different machines).

On Win 8.1, I saw a dramatic increase in CPU time per WU when I set SWAN_SYNC to "0".

BTW, for Linux users on recent Nvidia drivers: setting Coolbits to 12 gives you more monitoring info (GPU utilization, PCIe bandwidth utilization, memory usage), manual fan control, and overclocking/underclocking options.

http://www.phoronix.com/scan.php?px=MTY1OTM&page=news_item

Here's the device section of my xorg.conf file:

Section "Device"
    Identifier "Device0"
    Driver "nvidia"
    VendorName "NVIDIA Corporation"
    Option "Coolbits" "12"
EndSection
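
Coolbits is a bitmask, so 12 is just 4 + 8. A small sketch that decodes a value into its parts; the bit meanings below are as I recall them from the NVIDIA Linux driver README and are an assumption here, so verify them against the README for your driver version (decode_coolbits is an illustrative name):

```python
from typing import List

# Assumed bit meanings (from memory of the NVIDIA driver README); some bits
# only exist in certain driver versions, so check your own README.
COOLBITS_FLAGS = {
    1: "overclocking of older (pre-Fermi) GPUs",
    2: "SLI with mismatched video memory",
    4: "manual GPU fan control",
    8: "clock offsets (overclock/underclock)",
    16: "overvoltage",
}


def decode_coolbits(value: int) -> List[str]:
    """Return the descriptions of all bits set in a Coolbits value."""
    return [name for bit, name in sorted(COOLBITS_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    for feature in decode_coolbits(12):
        print(feature)
```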

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 38369 - Posted: 8 Oct 2014 | 19:29:03 UTC - in response to Message 38364.

Well, don't worry about it too much. Provided the CPU isn't over-subscribed, there's negligible performance degradation.
