Well, today at 9:00 AM EST I bought the new NVIDIA Pascal Titan X. According to wccftech, it may at times hit 12 TFlops, due to something similar to Intel's Turbo Boost. It should arrive tomorrow, so we'll see what it does. Will the new Titan work out of the box with GPUGRID, or will I have to wait for CUDA 8?
|ID: 44082 | Rating: 0|
You (like all of us with lesser Pascal GPUs) have to wait for the public release of CUDA 8.
|ID: 44084 | Rating: 0|
I was afraid of as much; I was hoping someone had found a workaround :)
|ID: 44085 | Rating: 0|
You can expect 25-30% more performance from a new Titan X versus a GTX1080, for a ~70% higher price. That's actually "kind of reasonable" in the Titan world, as the previous ones were even more expensive relative to the best high-end GPU.
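To put that in price/performance terms, a quick Python sketch using the rough figures above:

    # Value check: Titan X (Pascal) vs GTX1080, using the rough
    # 25-30% performance and ~70% price premiums quoted above.
    price_ratio = 1.70
    for perf_ratio in (1.25, 1.30):
        print(f"perf/price vs GTX1080: {perf_ratio / price_ratio:.2f}")
    # ~0.74-0.76: you pay ~70% more for 25-30% more throughput.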
|ID: 44089 | Rating: 0|
The GTX980Ti didn't scale well here (nor, arguably, did the GTX980), and the GTX1080 (and possibly the GTX1070) outperforms the GTX980Ti in most trials, so it's likely that the GTX1080 won't scale well here either, never mind the Titan X (Pascal).
GPU        ns/day   %      Watts   Perf/W %   GFlops (SP, ref boost)   GFlops %
Titan X    ?        ?      250     ?          10974                    124
GTX1080    135.6    100    180     100        8873                     100
GTX1070    125.8    92.8   150     111.4      6463                     72.8
GTX1060    84.8     62.5   120     93.75      4372                     49.3
GTX980Ti   119.4    88.1   250     63.4       6060                     68.3
GTX980     94.1     69.4   165     75.7       4981                     56.1
GTX970     76.7     56.6   145     70.3       3920                     44.2
GTX960     47.0     34.7   120     51.6       2413                     27.2
Observed performance per Watt as a relative percentage, worked for the GTX1070 vs the GTX1080:
Observed relative performance = 125.8/135.6 = 92.8%
Relative power usage = 150W/180W = 83.3%
Performance/Watt = 92.8/83.3 = 111.4%
GFlops (SP) = 2 × shaders × clock speed (GHz). Reference boost frequencies used!
Note that both series boost higher than reference values, but boost varies by model and conditions, and can be controlled and constrained.
While this is based on actual observed performance, it's still somewhat theoretical. To be accurate you would need actual observed power usage and actual boosted GFlops (rather than figures calculated from reference clocks). That said, it's still a good indicator.
Numbers taken from AnandTech’s GPU 2016 Benchmarks, http://www.anandtech.com/bench/product/1715?vs=1714
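If anyone wants to reproduce the table, here's the calculation as a small Python sketch. The shader counts and reference boost clocks are NVIDIA's published figures; the ns/day and Watt numbers are the observed values above.

    # SP GFlops = 2 * shaders * reference boost clock (GHz).
    # Relative perf/W = (ns/day ratio) / (Watts ratio), GTX1080 = 100%.
    gpus = {
        # name: (shaders, ref boost MHz, observed ns/day, TDP Watts)
        "Titan X (Pascal)": (3584, 1531, None,  250),
        "GTX1080":          (2560, 1733, 135.6, 180),
        "GTX1070":          (1920, 1683, 125.8, 150),
    }
    base_ns_day, base_watts = gpus["GTX1080"][2], gpus["GTX1080"][3]

    for name, (shaders, mhz, ns_day, watts) in gpus.items():
        gflops = 2 * shaders * mhz / 1000      # 10974, 8873, 6463
        line = f"{name}: {gflops:.0f} GFlops"
        if ns_day is not None:
            rel_perf = ns_day / base_ns_day * 100   # 92.8% for the 1070
            rel_power = watts / base_watts * 100    # 83.3% for the 1070
            # Prints ~111.3% for the 1070; the table's 111.4% comes
            # from dividing the already-rounded 92.8 by 83.333.
            line += f", perf/W {rel_perf / rel_power * 100:.1f}%"
        print(line)

As an aside, hitting wccftech's 12 TFlops claim from the opening post would need the Titan X to boost to roughly 12000/(2 × 3584) ≈ 1.67 GHz, well above its 1531 MHz reference boost, which is the sort of thing GPU Boost can do on its own given thermal headroom.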
Although the primary observation is that the GTX1070 offers the best performance/Watt, it's likely that both it and the GTX1080 could be tweaked significantly for performance/Watt by capping power and/or temperatures, and it's also possible to run two apps on one big GPU to improve overall throughput (when there is an abundance of tasks).
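For the two-apps-on-one-GPU idea, BOINC handles that through an app_config.xml in the project directory. Here's a minimal sketch that writes one out from Python; the app name "acemdlong" is an assumption on my part, so check against the app names your BOINC client actually reports.

    # Sketch: write a BOINC app_config.xml that runs two tasks per GPU.
    # gpu_usage 0.5 tells the BOINC scheduler each task needs half a
    # GPU, so it will schedule two at once. The app name "acemdlong"
    # is assumed; adjust it to whatever your client lists.
    APP_CONFIG = """<app_config>
      <app>
        <name>acemdlong</name>
        <gpu_versions>
          <gpu_usage>0.5</gpu_usage>
          <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
      </app>
    </app_config>
    """

    # Place the file in the GPUGrid project directory, then re-read
    # config files from the BOINC manager.
    with open("app_config.xml", "w") as f:
        f.write(APP_CONFIG)

Power capping is simpler still: on cards and drivers that allow it, nvidia-smi's --power-limit option (run with admin rights) will hold the card at a chosen Wattage.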
With more basic apps such as Einstein (CUDA 3.2 & 5.5) and MilkyWay you may see performance scale more linearly with the Pascals' GFlops (as these apps don't utilize the more complex CUDA instruction sets).
GPUGrid is more similar to Folding, but the app is different, so it may bottleneck in different places. For that reason the performance chart will likely look similar, but the choicest card(s) might differ...
Other hardware factors: the Titan X has 3MB of L2 cache whereas the GTX1080 has 2MB, and the Titan X's bus width (and ROP count) relative to its shader count is slightly (7%) higher, so there are fewer potential hardware bottlenecks. That should help it scale, but it still has 24% more raw GFlops to feed than a GTX1080.
Note that the GTX1060's (GP106) cache is only 1.5MB, which might explain its slightly poorer showing at Folding. While 1.5MB is likely to be a factor at GPUGrid too, how significant it is remains to be seen.
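To put a rough number on the bus-width point, here's bus width per shader relative to the GTX1080 (a crude bandwidth-per-core proxy; memory type and clocks also matter, so treat it as illustrative):

    # Memory bus width per shader, relative to the GTX1080.
    gpus = {
        "Titan X (Pascal)": (384, 3584),  # bus bits, shaders
        "GTX1080":          (256, 2560),
        "GTX1060":          (192, 1280),
    }
    base = 256 / 2560  # GTX1080 = 100%

    for name, (bus, shaders) in gpus.items():
        rel = (bus / shaders) / base * 100
        print(f"{name}: {rel:.0f}%")  # Titan X ~107%, i.e. the 7% above

Interestingly the GTX1060 comes out at 150%, proportionally more bus per shader than either big card, which fits the suggestion that its smaller 1.5MB L2, not raw bandwidth, is the suspect at Folding.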
PS: the Titan X (Pascal) isn't the full-fat GP102; the Quadro P6000 has two more SMs, for 3840 CUDA cores (not that I recommend either card for here – both are far too costly).
|ID: 44090 | Rating: 0|
That is a great comparison. Thanks.
|ID: 44091 | Rating: 0|
With more basic apps such as Einstein (CUDA 3.2 & 5.5)
Don't let the CUDA version fool you: Einstein uses complex and carefully optimized code. They're not using advanced library functions from NVIDIA; instead they do the complicated stuff on their own or with other libraries.
And currently they stream through their complete array(s) with each operation, the way a GPU is classically supposed to work. This makes their code significantly dependent on GPU memory bandwidth (my GTX970 runs at >80% memory controller utilization at 3500 MHz), which means any bigger GPU doesn't scale as well as its GFlops suggest; it's slowed down according to its memory bandwidth. There are other factors too: e.g. the AMD Fury is not the home run at Einstein one would expect given its massive bandwidth, because a driver bug prevents it from running more than one task concurrently, which is not enough to saturate a fast GPU.
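You can turn that into a crude roofline-style bound: a streaming kernel scales with the smaller of the compute ratio and the bandwidth ratio relative to some reference card. A sketch with spec-sheet bandwidths; where any given Einstein kernel actually sits between the two limits is an assumption, not a measurement:

    # Roofline-style scaling bound for a memory-streaming workload,
    # relative to a GTX970. GFlops are the reference-boost values
    # from the table above; bandwidths are spec-sheet figures.
    gpus = {
        "GTX970":  (3920, 224),   # SP GFlops, GB/s
        "GTX1080": (8873, 320),
    }
    ref_flops, ref_bw = gpus["GTX970"]

    for name, (gflops, bw) in gpus.items():
        compute = gflops / ref_flops   # 1080: 2.26x the compute
        bandwidth = bw / ref_bw        # 1080: only 1.43x the bandwidth
        print(f"{name}: scaling bound ~{min(compute, bandwidth):.2f}x")

So on a fully bandwidth-bound code a GTX1080 buys you roughly 1.4x a GTX970, nowhere near the 2.3x its GFlops suggest.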
Pascals are OK at Einstein, especially with eco tuning, but they are not the home runs their raw GFlops suggest.
Scanning for our furry friends since Jan 2002
|ID: 44097 | Rating: 0|
So far I have found scaling to be pretty good in Folding@home when using their newer Core 21: I get about 1.07 million points per day (PPD), about double what I got with my 980Ti. However, on their older Core 18 the scaling isn't nearly as good. Although no one at F@H has ever confirmed this, I have always found that, oddly enough, PPD divided by 100k comes out very close to the number of teraflops the card should be delivering.
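As a quick check on that rule of thumb, using the reference-boost TFlops from the table earlier in the thread (the 980Ti PPD below is inferred from "about double", so treat it as an estimate):

    # PPD per TFlop check, Folding@home Core 21.
    # TFlops are reference-boost values from the table above; the
    # 980Ti PPD is an estimate (about half the Titan X's 1.07M).
    cards = {
        "Titan X (Pascal)": (1_070_000, 10.974),
        "GTX980Ti":         (535_000,    6.060),
    }

    for name, (ppd, tflops) in cards.items():
        print(f"{name}: ~{ppd / tflops / 1000:.0f}k PPD per TFlop")

Both land in the ballpark of 100k PPD per TFlop, consistent with the observation above.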
|ID: 44100 | Rating: 0|