Message boards : Graphics cards (GPUs) : CUDA 3.1 has been released

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,836,565,599
RAC: 343,737
Message 17634 - Posted: 16 Jun 2010 | 9:57:24 UTC

CUDA 3.1 has been released to the General Public by NVidia.
This does not mean GPUGrid is releasing CUDA 3.1 compiled apps/tasks just yet, but the techs said they have compiled and tested 3.1 apps and get some improvement in performance. I expect they will move when the existing 3.0 compiled applications are completed, and following a few runs with the new apps (Betas/tests). They have to give Fermi users time to install the new driver as well as finish the existing CUDA 3.0 compiled tasks.
So, if you have a Fermi, install this latest driver.

Note: CUDA 3.1 will only improve Fermi cards!

Note: the beta driver was 257.15!
New in Version 257.21

* Adds support for Blu-ray 3D with NVIDIA 3D Vision technology. Learn more about the hardware and software requirements here.
* Increases performance for GeForce GTX 400 Series GPUs in several PC games. The following are examples of some of the most significant improvements measured with GeForce GTX 480. Results will vary depending on your GPU and system configuration:

o Up to 14% in Aliens vs. Predator (1920x1200 noAA/AF – Tessellation on)
o Up to 4% in Batman: Arkham Asylum (1920x1200 4xAA/16xAF PhysX=High)
o Up to 5% in BattleForge (1920x1200 4xAA/16xAF – Very High settings)
o Up to 5% in Call of Duty: Modern Warfare 2 (1920x1200 4xAA/16xAF)
o Up to 4% in Crysis: Warhead (1920x1200 4xAA/16xAF – Enthusiast setting)
o Up to 24% in Enemy Territory: Quake Wars (1920x1200 no AA/AF)
o Up to 9% in Far Cry 2 (2560x1600 8xAA/16xAF)
o Up to 25% in Just Cause 2 (2560x1600 no AA/AF - Concrete Jungle)
o Up to 7% in Metro 2033 (1920x1200 no AA/16xAF – Tessellation on)
o Up to 40% in Metro 2033 with SLI (1920x1200 4xAA/16xAF – Tessellation on)
o Up to 8% in S.T.A.L.K.E.R.: Call of Pripyat (1920x1200 no AA/AF – Day)
o Up to 110% in Stone Giant with SLI (2560x1600 – Tessellation on, DoF on)
o Up to 6% in The Chronicles of Riddick: Dark Athena (2560x1600 no AA/AF)
o Up to 9% in Unigine: Tropics (2560x1600 no AA/AF – OpenGL)
o Up to 5% in 3DMark Vantage (Performance and Extreme Presets)
o Up to 19% with Transparency AA (1920x1200 4xTrSS – measured in Crysis)

* Upgrades PhysX System Software to version 9.10.0223.
* Adds support for OpenGL 4.0 for GeForce GTX 400 Series GPUs.
* Adds support for CUDA Toolkit 3.1 which includes significant performance increases for double precision math operations. See CUDA Zone for more details.
* Adds support for new extreme Antialiasing modes for 3-way SLI PCs, including up to SLI48x AA for GeForce 200 series GPUs and up to SLI96x AA for GeForce GTX 400 series GPUs.
* Adds support for a new ‘Quality’ mode for NVIDIA’s Ambient Occlusion control panel feature.
* Adds a new NVIDIA Control Panel setup page for SLI and PhysX for ultimate control over multi-gpu configurations.
* Adds a new NVIDIA Control Panel feature for ultimate control over CUDA GPUs, allowing the user to effectively choose which GPU will power each CUDA application.
* 3D Vision customers can download the v257.21 3D Vision drivers here.
* Includes numerous bug fixes. Refer to the release notes on the documentation tab for information about the key bug fixes in this release.

Additional Information:

* Installs HD Audio driver version 1.0.9.1 (for supported GPUs).
* Supports the new GPU-accelerated features in Adobe CS5.
* Supports GPU-acceleration for smoother online HD videos with Adobe Flash 10.1. Learn more here.
* Supports the new version of MotionDSP's video enhancement software, vReveal, which adds support for HD output. NVIDIA customers can download a free version of vReveal that supports up to SD output here.
* Supports DirectCompute with Windows 7 and GeForce 8-series and later GPUs.
* Supports OpenCL 1.0 (Open Computing Language) for all GeForce 8-series and later GPUs.
* Supports OpenGL 3.3 for GeForce 8-series and later GPUs.
* Supports single GPU and NVIDIA SLI technology on DirectX 9, DirectX 10, DirectX 11, and OpenGL, including 3-way SLI, Quad SLI, and SLI support on SLI-certified Intel X58-based motherboards.
* Supports GPU overclocking and temperature monitoring by installing NVIDIA System Tools software.

NVidia

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1895
Credit: 629,356
RAC: 0
Message 17648 - Posted: 17 Jun 2010 | 10:39:56 UTC - in response to Message 17634.

I can't actually find a CUDA 3.1 release.
http://developer.nvidia.com/object/cuda_archive.html

GDF

Profile skgiven
Message 17654 - Posted: 17 Jun 2010 | 15:42:02 UTC - in response to Message 17648.

Sorry, that was a bad title choice.
By "public" I meant end-user (cruncher) support rather than the toolkit, which is only needed to compile apps, not to crunch.

CUDA 3.1 support was released with the latest 257.21 driver.
However, the CUDA 3.1 Developer Tools, as you point out, are still in beta!

It appears that NVidia have one leg that wants to go for a run while the other is still asleep. Or is the saying that the left hand does not know what the right hand is doing?

Still, you can at least test with the Beta, if you need to.

Profile GDF
Message 17655 - Posted: 17 Jun 2010 | 16:29:09 UTC - in response to Message 17654.

What makes it faster are the DLLs distributed with the 3.1 toolkit.
Only when the toolkit is publicly available can we distribute them.

gdf

Profile Retvari Zoltan
Joined: 20 Jan 09
Posts: 1844
Credit: 10,679,584,594
RAC: 9,991,696
Message 17754 - Posted: 29 Jun 2010 | 11:14:51 UTC

CUDA 3.1 has now really been released by NVidia.
GDF started a new thread about it in the news section.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 17756 - Posted: 29 Jun 2010 | 13:11:25 UTC

Driver updated on my machine ... I am ready to rock whenever GPUGrid is.

GDF - will the new version be tested in beta first?
Should I check "Run test applications" in my preferences?

Thank you,
____________
Thanks - Steve

Profile GDF
Message 17757 - Posted: 29 Jun 2010 | 14:35:01 UTC - in response to Message 17756.

Yes,
first in beta.

gdf

Profile GDF
Message 17774 - Posted: 30 Jun 2010 | 19:30:39 UTC - in response to Message 17757.

There are 20 beta units out with CUDA3.1.

gdf

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 17776 - Posted: 30 Jun 2010 | 20:52:18 UTC - in response to Message 17774.
Last modified: 30 Jun 2010 | 20:55:29 UTC

I received 3 of them (application 6.28) on a GTX295.

I thought CUDA 3.1 was specifically for Fermi cards?

I will give you the times for these WUs.

Time is 10 min 37 secs for two WUs on Windows XP with driver 257.21.

Good result?
____________
Ton (ftpd) Netherlands

Profile skgiven
Message 17777 - Posted: 30 Jun 2010 | 21:37:59 UTC - in response to Message 17776.

Not good, I would say; I think they were meant to be tested on Fermi cards.

One of ftpd's runs on his GTX295

Tried to configure my settings to only pick up test apps, but we can no longer do that!

Profile Retvari Zoltan
Message 17778 - Posted: 30 Jun 2010 | 21:47:53 UTC - in response to Message 17777.

I just missed those 3.1 WUs. But I think they should take much longer than that, especially on a GTX295. Anyway, how could a GTX295 get a fermi WU?

Profile GDF
Message 17779 - Posted: 30 Jun 2010 | 21:55:46 UTC - in response to Message 17778.

CUDA 3.1 also works well on all other cards, so any card can take it.
Some Fermi cards have picked it up and the performance is good.

gdf

Snow Crash
Message 17781 - Posted: 30 Jun 2010 | 22:54:32 UTC

I just had one finish on a GTX285 clocked up to 1656 on Win7 x64
Time to complete = 10 minutes 1 sec.
Slightly heavier in CPU usage but not too bad (.23 normal, .29 beta)
Utilization is still low but we know that is an NVidia / Microsoft issue on Vista/ Win7 and not GPUGrid's app.
____________
Thanks - Steve

Snow Crash
Message 17784 - Posted: 1 Jul 2010 | 1:24:07 UTC
Last modified: 1 Jul 2010 | 1:33:21 UTC

Another one ... this time 480GTX clocked at 1688 on WinXP 32
runtime = 3 minutes 39 seconds.

and another showed up as I was posting, same system
runtime = 3 minutes 29 seconds

both with SWAN_SYNC = 0

I have not tried without SWAN_SYNC to see if it is still necessary. My guess is that these run so fast I'll not be able to catch one before it runs so we might need to wait until they are full size and widely distributed before we can answer.
____________
Thanks - Steve
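
For anyone wanting to confirm which SWAN_SYNC setting a run used, here is a minimal Python sketch that reports whether the variable is visible in the current environment. The helper name `swan_sync_status` is made up for illustration and is not part of BOINC or the GPUGrid app; the sketch only reads the environment of the process that runs it and does not verify what the GPUGrid app itself sees.

```python
import os


def swan_sync_status(env=None):
    """Describe whether SWAN_SYNC is set in the given environment mapping.

    `env` defaults to the environment of the current process; pass a plain
    dict to check a hypothetical environment instead.
    """
    env = os.environ if env is None else env
    value = env.get("SWAN_SYNC")
    if value is None:
        return "SWAN_SYNC is not set"
    return f"SWAN_SYNC = {value}"


if __name__ == "__main__":
    # Prints the status for the shell or account this script is run from.
    print(swan_sync_status())
```

Run it from the same shell or account that starts the BOINC client; a variable set elsewhere may not show up.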

Profile nenym
Joined: 31 Mar 09
Posts: 136
Credit: 589,133,518
RAC: 182,079
Message 17785 - Posted: 1 Jul 2010 | 5:16:29 UTC

65nm GTX 260 (CUDA2.2/2.3 on GPUGRID unusable) drivers 257.21 on XP x_64:

factory OC 1.4GHz: 8 minutes 1 second
slightly OCed with Riva to 1.48GHz: 7 minutes 46 seconds
both with SWAN_SYNC = 0

P.S. Sorry for deleting standard tasks; there is no way to run only test tasks.

Profile nenym
Message 17786 - Posted: 1 Jul 2010 | 7:06:12 UTC
Last modified: 1 Jul 2010 | 7:13:42 UTC

More test tasks received.
Results can be seen at host ID 31329; I am trying some different overclocks.

Profile GDF
Message 17787 - Posted: 1 Jul 2010 | 7:29:24 UTC - in response to Message 17784.
Last modified: 1 Jul 2010 | 7:29:48 UTC

It takes 4.351 ms/step. This is the fastest result I have ever seen (due to the overclock). Windows XP is as fast as Linux it seems.

gdf

ftpd
Message 17789 - Posted: 1 Jul 2010 | 8:32:10 UTC

Another one on gtx480 takes 4 min 06 secs.

OK?
____________
Ton (ftpd) Netherlands

Profile GDF
Message 17790 - Posted: 1 Jul 2010 | 8:45:03 UTC - in response to Message 17789.

Yes. 4.870 ms/step is about right.
I think that the beta works well.

gdf

coldFuSion
Joined: 22 May 10
Posts: 20
Credit: 85,355,427
RAC: 0
Message 17806 - Posted: 1 Jul 2010 | 18:20:20 UTC

GTX470 OC 700/1400/1850
no SWAN_SYNC environment variable

# Time per step (avg over 50000 steps): 6.745 ms
# Approximate elapsed time for entire WU: 337.234 s
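
As a rough cross-check of the two log lines above, the elapsed time is just the average time per step multiplied by the number of steps. A minimal Python sketch (the function names are hypothetical; the step count of 50000 is taken from the log line):

```python
def elapsed_seconds(ms_per_step, steps):
    """Total run time in seconds for `steps` steps at `ms_per_step` each."""
    return ms_per_step * steps / 1000.0


def format_min_sec(seconds):
    """Render a duration as minutes and whole seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs:02d} s"


if __name__ == "__main__":
    total = elapsed_seconds(6.745, 50000)
    print(f"{total:.2f} s, i.e. {format_min_sec(total)}")
```

With 6.745 ms per step and 50000 steps this gives about 337.25 s, in line with the 337.234 s the log reports; the small gap is presumably rounding of the displayed average.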

Profile skgiven
Message 17807 - Posted: 1 Jul 2010 | 19:25:49 UTC - in response to Message 17806.

This cuda3.1 WU ran OK on Win XP using a GTX470 OC'd to 707MHz; Time per step 5.371 ms
Approximate elapsed time for entire WU: 268.547 s

Got a cuda30 client failure just before picking this WU up!
Full dump here, http://www.gpugrid.net/result.php?resultid=2597382
Looks like something else tried to use the RAM.

Would be nice to see normal sized Cuda 3.1 Fermi Work Units. These are too small to judge performance gain from.
