
Message boards : Graphics cards (GPUs) : Fermi

skgiven
Message 15301 - Posted: 18 Feb 2010 | 17:00:20 UTC

http://www.nvidia.com/object/fermi_architecture.html

Snow Crash
Message 15305 - Posted: 18 Feb 2010 | 17:25:13 UTC - in response to Message 15301.

I'm getting at least one and maybe two. We just need to have them available so we can see how quickly they process WUs!

____________
Thanks - Steve

ExtraTerrestrial Apes
Message 15316 - Posted: 18 Feb 2010 | 21:42:08 UTC

Nice link with that video explaining the basics! Just be sure to check its price and power supply requirements before you buy a couple of them ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Snow Crash
Message 15366 - Posted: 22 Feb 2010 | 17:12:46 UTC

http://www.nvidia.com/object/paxeast.html
____________
Thanks - Steve

Zydor
Message 15382 - Posted: 23 Feb 2010 | 15:59:09 UTC - in response to Message 15366.
Last modified: 23 Feb 2010 | 15:59:28 UTC

Don't waste your money - Fermi is a dead duck.

It will be released late March or early April, but in very limited quantities, each one being fielded at a huge loss and each knocking hard on the 300W PCIe certification limit. It will be a PR stunt to keep them going; by May/June you will not get one for love nor money, because NVidia cannot afford to subsidise more. The current design is grossly uneconomical, and can be wiped out in a performance sense by current ATI cards, let alone the ATI cards due for release this summer.

NVidia will not fix its silicon engineering problems until it can go to 28nm production, and that will not be until 2011. On top of that, a viable production-level NVidia DX11 facility is just a marketing dream at present; NVidia is not capable of producing production-standard DX11 for sustained mass deployment until it goes 28nm. Even if it took the suicidal route of keeping to a "fix" for the current Fermi design, it would take a minimum of 6 months to get to market, and the final fix would be uneconomic - it would break an already theoretically insolvent NVidia.

275, 280, 285 are already formally EOL, and the 295 all but EOL - OEMs will not get any further 295s - there are no shots left in the NVidia armoury.

I have been an NVidia fan for many years, but the party's over ...... I will be changing to ATI for my next PC build.

NVidia did not listen to warnings given about engineering architecture issues in 2007, and will now pay the price for that.

Regards
Zy

Snow Crash
Message 15383 - Posted: 23 Feb 2010 | 16:18:26 UTC - in response to Message 15382.

Can you reveal your sources?

NVidia is still profitable: http://www.istockanalyst.com/article/viewarticle/articleid/3838318
____________
Thanks - Steve

Zydor
Message 15384 - Posted: 23 Feb 2010 | 16:49:55 UTC - in response to Message 15383.
Last modified: 23 Feb 2010 | 17:07:40 UTC

That's based on the first three quarters of 2009, when they still had four mainstream cards to sell - they don't now ... and will not until Fermi is out. NVidia is not only about consumer-level graphics cards, and there is no suggestion from me that it will go out of business, but it will retreat from the classic PC graphics card market - it has nothing left. It will go into specialist graphics applications in 2011; it can't compete with ATI.

Fermi is a disaster. The returns from the final runs of silicon at the end of January had a mere 2-5% yield; the silicon engineering is wrong. To fix it without a total redesign they have to increase core voltage, which pushes up the watts to get the power they need, and that in turn butts them up against the PCIe certification limit. They must, must, come in under 300W, or with no PCIe certification OEMs will not touch them. To come in under 300W they will have to disable parts of the GPU when going at full speed.

Source - my opinion from searching the Web. Sit down one day and google really hard along the lines of:

GTX480, GF100 design, GF100 silicon, Fermi silicon, Fermi engineering, NVidia architecture, TSMC Silicon NVidia etc.

Keep away from "card reviews" and the like, they just regurgitate NVidia specs. Delve into comments at a pure engineering level. The story is not a happy one.

Regards
David

GDF
Message 15385 - Posted: 23 Feb 2010 | 17:21:50 UTC - in response to Message 15384.

Fermi launch date:
http://www.theinquirer.net/inquirer/news/1593186/fermi-finally-appear

gdf

cenit
Message 15386 - Posted: 23 Feb 2010 | 17:42:05 UTC - in response to Message 15385.

there is some really harsh news about Fermi...

look at this one:
The chip is unworkable, unmanufacturable, and unfixable.

it also contains a lot of links and technical data to support the statement... ugh!

Zydor
Message 15387 - Posted: 23 Feb 2010 | 17:54:02 UTC - in response to Message 15385.

They will be out of stock by 1 Jul, maybe earlier depending on how fast take-up is from existing users. After that they can't produce more - they don't have the cash to subsidise the fix they need to apply to the silicon. It's the 250 spin saga all over again, but this time there is nothing to make up for it like the 295 producing a cash flow. The current cards are just a PR stunt pre-Fermi; they can't stand up to ATI on their own performance.

There are just too many indicators. Another one: why are ATI/AMD dragging their heels on an FFT fix? They say they have one and that it will be deployed in the next OpenCL release "in a few months". A few months?? When both ATI and NVidia are (at least publicly) shouting the virtues of OpenCL. Look at it from the other angle: with NVidia's engineering problems, why should ATI accelerate OpenCL dev/fixes? Without NVidia in its current form there is no real need for ATI to push OpenCL.

All speculative, for sure - it's only an opinion. However there are far too many indicators supporting this and, equally significant, no retraction of the main issues from NVidia or their partners/supporters in response to some very strong reviews and statements made re the silicon issue.

Time will tell I guess, but certainly there is enough to say "hold for now". If the Fermi production reviews pick this up, and they will, or the cards suddenly become scarce "due to extraordinary demand" at the end of June, then we will know. However I suspect the industry experts and analysts will have blown this open before then, once the production card is out there and its engineering can be picked apart. Then the next phase will begin, with spin covering the delay over Fermi2, but I don't think they will get away with it this time.

Regards
Zy

Zydor
Message 15388 - Posted: 23 Feb 2010 | 17:59:42 UTC - in response to Message 15386.
Last modified: 23 Feb 2010 | 18:03:45 UTC

there is some really harsh news about Fermi...

look at this one:
The chip is unworkable, unmanufacturable, and unfixable.

it also contains a lot of links and technical data to support the statement... ugh!


Charlie is a known Nvidia hater - he would spit on their grave given the chance, so we have to take some of his ire with a pinch of salt. However even if only half of that report is true, NVidia are sunk without trace.

In fairness to him, he has been right with his predictions on NVidia for the last 2 years, especially on the engineering aspects.

I'm trying to keep an open mind ... really am ... but it's really hard to with what's out there.

Regards
Zy

GDF
Message 15389 - Posted: 23 Feb 2010 | 18:25:15 UTC - in response to Message 15388.

Luckily we need to wait only 1 month now to know what is true.

gdf

liveonc
Message 15410 - Posted: 24 Feb 2010 | 20:49:02 UTC - in response to Message 15384.
Last modified: 24 Feb 2010 | 20:49:55 UTC

That's based on the first three quarters of 2009, when they still had four mainstream cards to sell - they don't now ... and will not until Fermi is out. NVidia is not only about consumer-level graphics cards, and there is no suggestion from me that it will go out of business, but it will retreat from the classic PC graphics card market - it has nothing left. It will go into specialist graphics applications in 2011; it can't compete with ATI.

Fermi is a disaster. The returns from the final runs of silicon at the end of January had a mere 2-5% yield; the silicon engineering is wrong. To fix it without a total redesign they have to increase core voltage, which pushes up the watts to get the power they need, and that in turn butts them up against the PCIe certification limit. They must, must, come in under 300W, or with no PCIe certification OEMs will not touch them. To come in under 300W they will have to disable parts of the GPU when going at full speed.

Source - my opinion from searching the Web. Sit down one day and google really hard along the lines of:

GTX480, GF100 design, GF100 silicon, Fermi silicon, Fermi engineering, NVidia architecture, TSMC Silicon NVidia etc.

Keep away from "card reviews" and the like, they just regurgitate NVidia specs. Delve into comments at a pure engineering level. The story is not a happy one.

Regards
David


Stupid question, but I'll ask anyway. If the GTX470/480 comes out clocked at 676MHz to meet the PCIe certification limits, and Nvidia gave the option to go beyond the 300W (at your own risk), would they get that "edge" people are looking for? OCing always voids warranties anyway. So why not give the option (at your own risk) and package that extra PCIe 12V connector in a way that voids all warranty if unsealed...
____________

ExtraTerrestrial Apes
Message 15413 - Posted: 24 Feb 2010 | 22:32:02 UTC - in response to Message 15410.

You will probably be able to OC the Fermi cards, just like the previous cards. However, almost any attempt at doing so will get you past the 300 W. So nVidia don't have to do much to allow this ;)

And it's not like your system suddenly fails if you draw more than that. At least in quality hardware there are always some safety margins built in. A nice example of this is ATI's 2-chip 5970: it draws almost exactly 300 W under normal conditions, but when OCed goes up to ~400 W. It works, but the heat and noise are unpleasant to say the least. And the power supply circuitry is quite challenged by such a load and can become a limiting factor (this could be helped by a more expensive board design).

Is that the "edge"? It depends.. I wouldn't like to draw about twice the power of an ATI for a few tens of percent more performance. For me it would have to be at least ~80% more performance for 100% more power. Oh, and don't forget that you can OC the ATIs as well ;)

MrS
____________
Scanning for our furry friends since Jan 2002

robertmiles
Message 15419 - Posted: 25 Feb 2010 | 2:46:27 UTC - in response to Message 15413.

Would you like to order some Fermi cards before they are built, and before the power requirements and other specs are released? You can.

http://techreport.com/discussions.x/18515#jazz

I'll let you decide if it looks worth it.

robertmiles
Message 15422 - Posted: 25 Feb 2010 | 6:37:35 UTC

Seems that my last post in this thread has turned out questionable - when I found the site that lists them for sale again, it now shows both GTX4xx models as out of stock.

Also I found an article saying that the amount of memory shown on that web page is questionable for the Fermi design.

http://www.brightsideofnews.com/news/2010/2/21/nvidia-geforce-gtx-480-shows-up-for-preorder-and-the-specs-look-fishy.aspx

Also an article showing Nvidia's announcement plans for that series:

http://www.nvidia.com/object/paxeast.html

skgiven
Message 15428 - Posted: 25 Feb 2010 | 13:22:46 UTC - in response to Message 15413.
Last modified: 25 Feb 2010 | 14:17:49 UTC

NVidia has taken huge steps over the last year towards the low and mid range card market for many reasons, none of them being ATI. This should not be seen as a migration from the top end market, but rather a necessary defence and improvement of their low to mid range market. This is where most GPUs are sold, so it's the main battlefield, and NVidia have a bigger threat than ATI in this war.

NVidias mid range cards such as the GT220 and GT240 offer much lower power usage than previous generations, making these cards very attractive to the occasional gamer, the home media centre users (with their HDMI interface), the office environment, and of course to GPUgrid. The GT240 offers up a similar performance to a 8800GS or 9600GT in out and out GPU processing, but incorporates new features and technologies. Yet, where the 8800GS and 9800GT use around 69W when idle and 105W under high usage, the GT240 uses about 10W idle and up to 69W (typically 60 to 65W) when in high use. As these cards do not require any special power connectors, are quiet and not oversized they fit into more systems and are more ergonomically friendly.

WRT crunching GPUGrid, the GT240 will significantly outperform these and many more powerful, and power hungry, CC1.1 cards (such as a GTS 250, 150W TDP) because the GT240 is CC1.2, and is much, much more reliable (GT215 core rather than G92)!

NVidia has also entered the extra-PC market, creating chips for things such as TVs and mobiles. However, this is not just an attempt to stave off Intel’s offensive on the low end GPU market. NVidia have found allies in the form of ARM; Intel also started to take the low end CPU fight to ARM, with the release of Atoms and the continued development of small CPUs of limited power. However ARM are not an easy target, not by a long way. ARM sell over 1 billion CPU chips every year, for various devices including just about every mobile on the planet. So NVidia have found very solid support from ARM (especially with their Cortex-A9-based computer-on-a-chip product, Tegra 2, which could be making its way to a TV, game console, PDA or netbook near you any time soon).

So Intel’s recent move towards low end CPUs and GPUs, and more importantly their development of single-chip CPU-plus-GPU parts, naturally makes them the common enemy of NVidia and ARM. If NVidia failed to respond to this challenge they would have their market place threatened not just by ATI & AMD but by Intel on the main battlefield (where most cards are).


A few things to note about PCIe technologies:
PCIe 1.x signals at 2.5GT/s per lane
PCIe 2 signals at 5GT/s
PCIe 2.1 signals at 5GT/s
PCIe 3 will signal at 8GT/s, but this has not yet been ratified – rather, it has been delayed until the second quarter of 2010, and perhaps then some.
It is no surprise that it has not been finalised; Fermi has not been released!

This highlights the secretive nature of GPU manufacturers, too scared to reveal what they are up to even if it hampers their own sales. It lends weight to the argument that the Fermi cards will be an NVidia loss leader; however, we are not privy to the internal cost analysis, just the speculation, and you could argue that Fermi is a trail-blazing technology: a flagship for newly developed technology which will lend itself to new mid range cards over the next year or so and set a new bar for performance.

It is worth noting that Intel developed PCIe, so they might want to try to scupper the technology to nobble both ATI and NVidia, seeing as Intel don’t have a top end GPU; they might want to deny other GPU manufacturers the opportunity to compete on something they cannot control.

It strikes me as likely that people who buy a Fermi in a month’s time might have to buy a new motherboard sometime later in the year to fully appreciate the card. Today it would have to work on a PCIe2 (or PCIe2.1) based motherboard, as these are the best there are, but I would not be surprised if better performance is brought with a future PCIe3-slotted motherboard:
PCIe2 uses an 8b/10b encoding scheme (at the data transmission layer), which results in a 20% overhead. PCIe3 will use a 128b/130b encoding scheme, resulting in a <2% overhead, but this is only one area of improvement.
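
As a rough sketch of what those encoding schemes mean in practice (a back-of-the-envelope calculation; the transfer rates are the ones listed above, and the x16 figure simply multiplies the per-lane number by 16):

    # Usable bandwidth per PCIe lane for each generation (a sketch).
    generations = {
        # name: (transfer rate in GT/s, payload bits, line bits)
        "PCIe 1.x": (2.5, 8, 10),     # 8b/10b    -> 20% encoding overhead
        "PCIe 2.x": (5.0, 8, 10),     # 8b/10b    -> 20% encoding overhead
        "PCIe 3.0": (8.0, 128, 130),  # 128b/130b -> ~1.5% encoding overhead
    }

    for name, (rate_gt_s, payload, line) in generations.items():
        efficiency = payload / line
        mb_per_s = rate_gt_s * efficiency * 1000 / 8   # usable MB/s per lane
        print(f"{name}: {efficiency:.1%} efficient, ~{mb_per_s:.0f} MB/s per lane, "
              f"~{mb_per_s * 16 / 1000:.1f} GB/s over x16")

So an x16 slot goes from ~8 GB/s on PCIe2 to roughly ~15.8 GB/s on PCIe3, partly from the higher signalling rate and partly from the lower encoding overhead.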

[AF>Libristes] Dudumomo
Message 15429 - Posted: 25 Feb 2010 | 13:40:35 UTC - in response to Message 15428.

Thanks SKGiven.
Interesting post.

Beyond
Message 15430 - Posted: 25 Feb 2010 | 14:45:49 UTC - in response to Message 15428.

NVidia has taken huge steps over the last year towards the low and mid range card market for many reasons, none of them being ATI. This should not be seen as a migration from the top end market, but rather a necessary defence and improvement of their low to mid range market. This is where most GPUs are sold, so it's the main battlefield, and NVidia have a bigger threat than ATI in this war.

The highest profit area however is the high end and the truth is that NVidia has made numerous missteps in the high end market. AMD/ATI is executing much better at the moment. NVidia spent much money developing the G200 series and now they're becoming more expensive and harder to find as they reach EOL status. Fermi sounds like it might be in trouble. It's too bad as we need strong competition to spur advances and hold down prices.

It strikes me as likely that people who buy a Fermi in a month’s time might have to buy a new motherboard sometime later in the year to fully appreciate the card. Today it would have to work on a PCIe2 (or PCIe2.1) based motherboard, as these are the best there are, but I would not be surprised if better performance is brought with a future PCIe3-slotted motherboard:
PCIe2 uses an 8b/10b encoding scheme (at the data transmission layer), which results in a 20% overhead. PCIe3 will use a 128b/130b encoding scheme, resulting in a <2% overhead, but this is only one area of improvement.

Interesting but not very relevant to our needs. Even the slowest versions of PCIe presently are overkill as far as interfacing the GPU and CPU for the purpose of GPU computing. Maybe gaming is a different matter, but NVidia seems to be moving away from the high end gaming market anyway.

As for Intel, over the years they've made several attempts to take over the graphics card market and have miserably failed every time. It seems that some of their much-ballyhooed technologies have been recently scrapped this time too. Still, Intel is huge, has many resources, and has never been one to let the law stand in its way when attempting to dominate a market. As always, time will tell.

skgiven
Message 15439 - Posted: 25 Feb 2010 | 18:35:40 UTC - in response to Message 15430.

Interesting but not very relevant to our needs.

Probably, but I was talking generally because I wanted to make a point about the politics behind the situation; if preventing PCIe3 means another company can't release a significantly better product that could be used here, some will see that as a battle won.

I expect gaming would improve in some ways with a faster PCIe interface; otherwise there would be no need for PCIe3. I doubt that all the holdups are due to the fab problems.

NVidia seems to be moving away from the high end gaming market anyway.


Well, since the GTX295s that is true, because the battle was on a different front, but Fermi will be a competitive top end card. We will see in a month or two, hopefully!

Intel have been trying desperately for years to be all things to all users, but their flood-the-market strategy often leaves the biggest competitor to a new product being one of their own existing products. I don't see Intel as a graphics chip developer, and I think most people would feel the same.
I think most of their attempts to take over the graphics card market have been through ownership and control tactics rather than engineering, so no wonder those attempts failed miserably; they never actually made a good GPU.

skgiven
Message 15442 - Posted: 25 Feb 2010 | 19:46:41 UTC - in response to Message 15439.

SemiAccurate reported the following, with some caution.
GTX480: 512 Shaders @600MHz or 1200MHz high clock (hot clock)
GTX470: 448 Shaders @625MHz or 1250MHz high clock (hot clock)
If correct, the GTX470 will be about 10% slower than the GTX480 but cost 40% less (going by another speculative report)!
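
A quick check of that figure from the rumoured numbers (a sketch only; it assumes raw shader throughput scales simply with shader count times hot clock):

    # Relative raw shader throughput from the rumoured specs above.
    gtx480 = 512 * 1200   # shaders x hot clock (MHz)
    gtx470 = 448 * 1250
    print(f"GTX470 relative to GTX480: {gtx470 / gtx480:.1%}")   # ~91%, i.e. roughly 9-10% slower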

I am not interested in how it compares to ATI cards playing games, just the cost and that they don’t crash tasks!

Don't like the sound of only 5000 to 8000 cards being made. You could sell that many in one country, and our wee country might be down the list a bit ;)

ExtraTerrestrial Apes
Message 15446 - Posted: 25 Feb 2010 | 22:27:16 UTC

We started another discussion on Fermi in the ATI thread, so I'm going to reply here:

me wrote:
They desperately wanted to become the clear number 1 again, but currently it looks more like "one bird in the hand would have been better than two on the roof".

SK wrote:
If a bird in the hand is worth two in the bush, NVidia are releasing two cards, the GTX 480 and GTX 470, costing $679.99 and $479.99, how much is the bush worth?


I was thinking along the lines of
- Fermi is just too large for TSMCs 40 nm process, so even if they can get the yield up and get the via reliability under control, they can't fix the power consumption. So the chip will always be expensive and power constrained, i.e. it will be slowed down by low clock speeds due to the power limit.
- Charlie reports 448 shaders at 1.2 GHz for the current top bin of the chip.
- Had they gone for ~2.2 billion transistors instead of 3.2 billion they'd get ~352 shaders and a much smaller chip, which is (a) easier to get out of the door at all and (b) yields considerably better. Furthermore they wouldn't be as limited by power consumption (they'd end up at ~200 W at the same voltage and clock speed as the current Fermi), so they could drive clock speed and voltage up a bit. At a very realistic 1.5 GHz they'd achieve 98% of the performance of the current Fermi chips (see the sketch after this list).
- However, at this performance level they'd probably trade blows with ATI instead of dominating them, like a full 512 shader Fermi at 1.5 GHz probably would have been able to.
- The full 512 shader Fermi are the two birds on the roof. A 2.2 billion transistor chip is the one in the hand.. or at least in catching range.
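
Putting rough numbers on the list above (a sketch; the shader counts and clocks are the ones stated there, and the 280 W board power for the big chip is an assumed figure, not an announced spec):

    # Hypothetical ~2.2 billion transistor Fermi vs. the rumoured 448-shader GF100.
    current = 448 * 1.2      # shaders x hot clock (GHz), rumoured top bin
    smaller = 352 * 1.5      # fewer shaders, clocked higher
    print(f"relative throughput: {smaller / current:.2f}")           # ~0.98

    # Power scales roughly with transistor count at fixed clock and voltage.
    assumed_gf100_watts = 280                                        # assumption, not a spec
    print(f"~{assumed_gf100_watts * 2.2 / 3.2:.0f} W for the smaller chip")   # ~190 W, i.e. the ~200 W ballpark above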

I didn't read the recent posts and didn't have the time yet to reply to the ones I did read. Will probably find the time over the week end :)

MrS
____________
Scanning for our furry friends since Jan 2002

robertmiles
Message 15453 - Posted: 26 Feb 2010 | 3:00:53 UTC

Looks like Nvidia has planned another type of Fermi card - the Tesla 20-series.

http://www.nvidia.com/object/cuda_home_new.html

ExtraTerrestrial Apes
Message 15457 - Posted: 26 Feb 2010 | 9:04:48 UTC - in response to Message 15453.

You've just got to love their consistent naming schemes. But there's still hope the actual products will be called 20x0, isn't there?

MrS
____________
Scanning for our furry friends since Jan 2002

liveonc
Message 15463 - Posted: 26 Feb 2010 | 18:12:54 UTC - in response to Message 15428.

Thanks for the interesting post on the politics of chip manufacturers, SKGiven. Not to make any claims of better knowledge - just reading my other posts will reveal that I'm just a n00b with his own POV on things.

Some might point out that the largest profit comes from high end GPU sales, but even a slight profit across the vast low to mid range would result in an overall higher gross profit, and niche markets can even be expanded, as Apple has done. What the future brings is what the future brings.

AMD & Intel sell off their less perfect chips as cheaper models, and the close-to-perfect ones as extreme editions. Even if Nvidia does turn out crappier-than-desired Fermis, surely they can still find a use for them...

I'm an Nvidia fan and don't feel the need to apologize for being one, but I'm thankful that ATI is around to deal out blows on behalf of Nvidia consumers.
____________

skgiven
Message 15469 - Posted: 27 Feb 2010 | 9:26:13 UTC - in response to Message 15463.

NVidia stopped producing their 55nm GTX285 chips. Let's hope they simply use existing improvements in technology, tweak the designs a bit, and move them to 40nm with GDDR5. If Asus can stick two GTX285 chips onto one card, I think everyone would be able to if the chips were 40nm instead of 55nm. The cards would run cooler and clock better. A 20% improvement on two 1081GFlops (MADD+MUL, not Boinc) might not make it the fastest graphics card out there, but it would still be a top end card and, more importantly, sellable and therefore profitable.

Fermi is too fat, and NVidia made the mistake of thinking they could redesign and rebuild fabrication equipment as if it were a GPU core design. It was bad management to let the project continue as it did. Fermi should have been shelved and NVidia should have been trying to make competitive and profitable cards, rather than an implausible monster card.

The concept of expanding the die to incorporate 3.2 billion transistors is in itself daft. Perhaps for 28nm technology, but not 40nm. It is about scalability. The wider the chip the greater the heat, and the more difficult it is to manufacture. Hence the manufacturing problems, and low clocks (600-625MHz). Let's face it, one of my GT240s is factory clocked to 600MHz and it is just about a mid-range card. A GTX280 built on 65nm technology clocks in at 602MHz, from June 2008! A GTS250 clocks at 738MHz. If you go forward in one way but backwards in another, there is no overall progress.

NVidia should have gone for something less bulky, and scale could have been achieved through the number of GPU chips on the card. You don’t see Intel or AMD trying to make such jumps between two levels of chip fab technology. Multi-chip is the way of the future, not fat chips.

Why not 3 or 4 GTX285 chips at 40nm on one card with GDDR5? It's not as if they would have to rebuild the fabrication works.
Fermi won't be anything until it can be made at 28nm.

Snow Crash
Message 15471 - Posted: 27 Feb 2010 | 13:29:06 UTC - in response to Message 15469.

I also want faster cards to crunch GPUGrid with, but just because Nvidia has had major delays doesn't mean the sky is falling :-)

Nvidia does not own, nor is it the one making changes to, the fab equipment. TSMC, the fab, is currently producing wafers for ATI on 40 nm, so while on one hand I agree that making a very large die is more difficult, the fab plant and equipment are the same.

Was Nvidia aggressive in pursuing their vision? Yes.
Did they encounter delays because they were so aggressive? Yes.
They already shrunk the 200 arch and they already produced a 2-chip card; I think it is time for a fresh start.

Your comparison of core speed leaves me scratching my head, because I know that the GTX 285 at 648 crunches faster than a GTS250 at 702 MHz.

Which manufacturers are pursuing a multi-chip approach?

Are you suggesting that no one buy an Nvidia card based on the Fermi arch until they shrink it to 28 nm?
____________
Thanks - Steve

Beyond
Message 15472 - Posted: 27 Feb 2010 | 13:55:32 UTC - in response to Message 15469.

You don’t see Intel or AMD trying to make such jumps between two levels of chip fab technology.

Intel tried it with the Itanium and ended up losing technology leadership to AMD for a number of years. If they hadn't been able to manipulate and in some cases buy the benchmark companies in order to make the P4 look decent, they would have lost much more.

Why not 3 or 4 GTX285 chips at 40nm on one card with GDDR5? It's not as if they would have to rebuild the fabrication works.
Fermi won't be anything until it can be made at 28nm.

The 285 shrink would still need a major redesign to support new technologies, and even at 40nm you'd probably be hard pressed to get more than 2 on a card without pushing power and heat limits. Then would it even be competitive with the ATI 5970 for most apps? By the time NVidia got this shrink and update done, where would AMD/ATI be? I think you have a good idea, but the time has probably passed for it. It will be interesting to see if Fermi is as bad as SemiAccurate says. We should know within the next few months about speed, power, heat and availability.

skgiven
Message 15473 - Posted: 27 Feb 2010 | 14:25:56 UTC - in response to Message 15471.
Last modified: 27 Feb 2010 | 14:28:44 UTC

I read that the fabrication plant's equipment is actually different; they had to redesign some of it.
NVidia has produced at least 2 dual GPU cards, so why stop or leave it at that? The CPU manufacturers have demonstrated the way forward by moving to quad, hex and even 8 core CPUs. This is scaling done correctly! Trying to stick everything on the one massive chip is backwards, more so for GPUs.
http://www.pcgameshardware.com/aid,705523/Intel-Core-i7-980X-Extreme-Edition-6-core-Gulftown-priced/News/
http://www.pcgameshardware.com/aid,705032/AMD-12-core-CPUs-already-on-sale/News/
I was not comparing a GTX 285 to a GTS 250. I was highlighting that a card almost two years old has the same clock speed as Fermi, to show that they have not made any progress on this front; and compared to a GTS250 at 702 MHz, you could argue that Fermi has lost ground in this area, which deflates the other advantages of the Fermi cards.

multi-chip approach?

    ATI:
    Gecube X1650 XT Gemini, HD 3850 X2, HD 3870 X2, HD 2850 X2, 4870 X2...
    NVidia:
    Geforce 6800 GT Dual, Geforce 6600 dual versions, Geforce 7800 GT Dual, Geforce 7900 GX2, Geforce 9800 GX2, GTX295 and Asus’s limited ed. variant with 2 GTX 285 cores... None?!?
    Quantum also made a single board Sli 3dfx Voodoo2.



Are you suggesting that no one buy an Nvidia card based on the Fermi arch until they shrink it to 28 nm?

No, but I would suggest that the best cannot be achieved from Fermi technology if made at 40nm; but it could (and perhaps will) be seen at 28nm (as 28nm research is in progress).

skgiven
Message 15474 - Posted: 27 Feb 2010 | 14:47:31 UTC - in response to Message 15472.

Nvidia could have at least produced 40nm GTX 285-type chips for a dual GPU board, but they chose to concentrate on low to mid range cards for mass production and keep researching Fermi technologies. To some extent this all makes sense. There were no issues with the lesser cards, so they wasted no time in pushing them out, especially to OEMs. The GT200 and GT200b cards on the other hand did have manufacturing issues, enough to deter them from expanding the range. It would have been risky to go to 40nm. Who would really be buying a slightly tweaked 40nm version of a GTX 260 (i.e. a GTX 285 with shaders switched off due to production issues)?

But perhaps now that they have familiarised themselves with 40nm chip production (GT 210, 220 and 240), more powerful versions will start to appear, if only to fill the future mid range card line-up for NVidia. Let's face it, a lot of their technologies are reproductions of older ones. So the better-late-than-never, if-it's-profitable, approach may be applied.

The last thing gamers and crunchers need is for a big GPU manufacturer to diminish into insignificance. Even to move completely out of one arena would be bad for competition.

ExtraTerrestrial Apes
Message 15475 - Posted: 27 Feb 2010 | 14:58:19 UTC

Guys.. you need to differentiate between the GF100 chip and the Fermi architecture!

The GF100 chip is clearly:
- too large for 40 nm
- thereby very difficult to manufacture
- thereby eats so much power that its clock speed has to be throttled, making it slower than expected

However, the design itself is a huge improvement over GT200! Besides DX 11 and various smaller or larger tweaks it brings along a huge increase in geometry power, a vastly improved cache system and it doesn't need to waste extra transistors for special double precision hardware at 1/8th the single precision speed. Instead it uses the current execution units in a clever way and thereby achieves 1/2 the single precision performance in DP, even better than ATI at 2/5th. Fermi even supports IEEE standard DP, the enhanced programming models (in C) etc.

All of this is very handy, and you don't need 512 shader processors to use it! All they need to do is use these features in a chip of a sane size. And since they already worked all of this out (since the first batch of GF100 chips returned last autumn, the design must have been ready since last summer) it would be downright stupid not to use it in new chips, and to waste engineering resources adding features to GT200 and shifting it to 40 nm (which was the starting point of the Fermi design anyway).

Like ATI, they must first have been informed about the problems with TSMC's 40 nm process back in the beginning of 2008. At that time they could have scaled Fermi down a little.. just remove a few shader clusters, set the transistor budget to not much more than 2 billion and get a chip which would actually have been manufacturable. But apparently they chose to ignore the warning, whereas ATI chose to be careful.

But now that "the child has already fallen into the well", as the Germans say, it's moot discussing these issues. What nVidia really has to do is get a mainstream version of Fermi out of the door.

Oh, and forget about 28 nm: if TSMC has got that much trouble at 40 nm, why would they fare any better at 28, which is a much more demanding target? Sure, they'll try hard not to make the same mistakes again.. but that's far more complicated than rocket science.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Message 15477 - Posted: 27 Feb 2010 | 15:34:05 UTC

A few more words on why GF100 is too large:

Ever since G80, nVidia designed the comparatively simple shader processors to run at high clock speeds. That is a smart move, as it gives you more performance out of the same number of transistors. The G92 chip is a very good example of this: on the 65 nm process it reached 1.83 GHz at the high end and ~150 W, whereas it could also be used in the 9800GT Green at 1.38 GHz at 75 W. These numbers are actually from the later 55 nm versions, but those provided just a slight improvement.

This was a good chip, because its power requirements allowed the same silicon to be run as a high performance chip and as an energy efficient one (=lower voltages and clocks).

G92 featured "just" 750 million transistors. Then nVidia went from there to 1.4 billion transistors for the GT200 chip at practically the same process node. The chip became so big and power hungry (*) that its clock speeds had to be scaled back considerably. nVidia no longer had the option to build a high performance version. The initial GTX260 with 192 cores cost them at least twice as much as a G92 (double the number of transistors, and actual cost is higher as yield drops with roughly the square of the area), yet provided just about the same single precision performance and similar performance in games. 128 * 1.83 = 192 * 1.25, simple as that. And even worse: the new chip was not even consuming less power under load, as would have been expected for a "low clock & voltage" efficient version. That's due to the leakage from twice the number of transistors.

(*) Cooling a GPU with stock coolers becomes difficult (=loud and expensive) at about 150 W and definitely painful at 200 W.
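
Spelling out the "128 * 1.83 = 192 * 1.25" point (a sketch; throughput here is just shader count times hot clock, ignoring everything else):

    # Raw shader throughput of the old high-clocked G92 vs. the initial GTX260.
    g92_high_bin = 128 * 1.83   # shaders x hot clock (GHz)
    gtx260_192sp = 192 * 1.25   # more shaders, clock held back by power
    print(f"{g92_high_bin:.1f} vs {gtx260_192sp:.1f}")   # 234.2 vs 240.0 -> essentially a wash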

At 55 nm the GT200 eventually reached up to 1.5 GHz, but that's still considerably lower than the previous generation - due to the chip being very big and running into the power limit. ATI on the other hand could drive their 1 billion transistor chip quite hard (1.3 V compared to 1.1 V for nVidia), extract high clock speeds and thereby get more performance from cheaper silicon. They lose power efficiency this way, but proved to be more competitive in the end.

That's why nVidia stopped making GT200 - because it's too expensive to produce for what you can sell it for. Not because they wouldn't be interested in the high end market any more - that's total BS. Actually the opposite is true: GF100 was designed very aggressively to banish ATI from the high end market.

From a full process shrink (130-90-65-45-32 nm etc.) you could traditionally expect double the number of transistors at comparable power consumption. However, recent shrinks fell somewhat short of this mark. With GF100 nVidia took an already power-limited design (GT200 at 1.4 billion transistors) and stretched this rule quite a bit (GF100 at 3.2 billion transistors). And they had to deal with a process which performs much worse than the already-not-applicable-any-more rule. Put it all together and you run into real trouble. I.e. even if you can fix the yield issues, your design will continue to be very power limited. That means you can't use the clock speed potential of your design at all, you get less performance and have to sell at a lower price. At the same time you have to deal with the high production cost of a huge chip. That's why I thought they had gone mad when I first heard about the 3.2 billion transistors back in September..

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Message 15478 - Posted: 27 Feb 2010 | 15:48:52 UTC

SK wrote:
The CPU manufacturers have demonstrated the way forward by moving to quad, hex and even 8 core CPUs. This is scaling done correctly! Trying to stick everything on the one massive chip is backwards, more so for GPUs.


Sorry, that's just not true.

One CPU core is a hand crafted, highly optimized piece of hardware. Expanding it is difficult and requires lots of time, effort and ultimately money. Furthermore we've run into the region of diminishing returns here: there's only so much you can do to speed up the execution of one thread. That's why they started to put more cores onto single chips. Note that they try to avoid using multiple chips, as it adds power consumption and a performance penalty (added latency for off-chip communication, even if you can get the same bandwidth) if you have to signal from chip to chip. The upside is lower manufacturing cost, which is why e.g. AMD's upcoming 12-core chips will be 2 chips with 6 cores each.

For a GPU the situation is quite different: each shader processor (512 for Fermi, 320 for Cypress) is (at least I hope) a highly optimized and possibly hand crafted piece of hardware. But all the other shader processors are identical. So you could refer to these chips as 512- or 320-core processors. That wouldn't be quite true, as these single "cores" can only be used in larger groups, but I hope you get the point: GPUs are already much more "multi core" than CPUs. Spreading these cores over several chips increases yield, but at the same time reduces performance for the same reasons it hurts CPUs (power, latency, bandwidth). That's why scaling in SLI or Crossfire is not perfect.

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Message 15480 - Posted: 27 Feb 2010 | 16:25:41 UTC

SK wrote:
NVidias mid range cards such as the GT220 and GT240 offer much lower power usage than previous generations, making these cards very attractive to the occasional gamer, the home media centre users (with their HDMI interface), the office environment, and of course to GPUgrid. The GT240 offers up a similar performance to a 8800GS or 9600GT in out and out GPU processing, but incorporates new features and technologies. Yet, where the 8800GS and 9800GT use around 69W when idle and 105W under high usage, the GT240 uses about 10W idle and up to 69W (typically 60 to 65W) when in high use. As these cards do not require any special power connectors, are quiet and not oversized they fit into more systems and are more ergonomically friendly.


IMO you're being a little too optimistic here ;)

First off: the new compute capability 1.2 cards are great for GPU-Grid, no question. However, the applications the performance of these cards is normally judged by are games. And those don't really benefit from the 1.2 shaders and DX 10.1. Performance-wise it trades blows with the 9600GT. I'd say overall it's a little slower, but nothing to worry about. And performance-wise it can't touch either a 9800GT or its Green edition. Power consumption is quite good, but please note that the idle power of the 9600GT and 9800GT lies between 20 and 30 W, not 70 W! BTW: the numbers XBit-Labs measures are lower than what you'll get from the usual total system power measurements, as they measure the card directly, so power conversion inefficiency in the PSU is eliminated.

If the 9600GT were the only alternative to the GT240 it would be a nice improvement. However, the 9800GT Green is its real competitor. In Germany you can get either this one or a GT240 GDDR5 for 75€. The 9800GT Green is faster and doesn't have a PCIe connector either, so both are around 70 W max under load. The GT240 wins by 10 - 15 W at idle, whereas the 9800GT Green wins on performance outside of GPU-Grid.

I'm not saying the GT240 is a bad card. It's just not as much of an improvement over the older cards as you made it sound.

I've got a nice suggestion for nVidia to make better use of these chips: produce a GT250. Take the same chip, give the PCB an extra PCIe 6-pin connector and let it draw 80 - 90 W under load. Current clock speeds of these cards already reach 1.7 GHz easily, so with a slight voltage bump they'd get to ~1.8 GHz at the power levels I mentioned. That would give them 34% more performance, would take care of the performance debate compared to the older generation, and would still be more power efficient than the older cards. nVidia would need a slight board modification, a slightly larger cooler and possibly some more MHz on the memory. Overall cost might increase by 10€. Sell it for 20€ more and everyone's happy.
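
Where the 34% comes from (a sketch; the ~1.34 GHz stock shader clock of the GT240 is my assumption here, the 1.8 GHz target is the one mentioned above):

    # Clock headroom behind the hypothetical "GT250" suggestion.
    stock_shader_clock  = 1.34   # GHz, assumed GT240 stock shader clock
    target_shader_clock = 1.80   # GHz, the suggested target
    print(f"clock gain: {target_shader_clock / stock_shader_clock - 1:.0%}")   # ~34%
    # Power rises roughly with frequency x voltage^2, hence the extra
    # 6-pin connector and slightly larger cooler in the suggestion.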

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Message 15482 - Posted: 27 Feb 2010 | 17:24:09 UTC

Regarding the Article from SemiAccurate: the basic facts Charlie states are probably true, but I really disagree with his analysis and conclusions.

He states that:
(1) GF100 went from revision A1 to A2 and is now at A3
(2) A-revisions are usually to fix logic errors in the metal interconnect network
(3) GF100 has huge yield issues
(4) and draws too much power, resulting in low clock speeds, resulting in lower performance
(5) nVidia should use a B revision, i.e. changes to the silicon, to combat (4)
(6) since they didn't, they used the wrong tool to address the problem and are therefore stupid

(1-5) are probably true. But he also states (correctly, according to Anand, AMD and nVidia) that there were serious problems with the reliability of the vias. These are the little metal wires used to connect the transistors. These are part of the metal interconnects. Could it be that nVidia used the A-revisions to fix or at least relieve those problems in order to improve yields? Without further knowledge I'd rather give nVidia the benefit of the doubt here and assume they're not total idiots rather than claiming (6).

(7) nvidia designed GF100 to draw more amperes upon voltage increases compared to Cypress

That's just a complicated way of saying: every transistor consumes power. Increasing the average power draw of every transistor by, say, a factor of 1.2 (e.g. due to a voltage increase) takes a 100 W chip to 120 W, whereas a 200 W chip faces a larger increase of 40 W, to 240 W.

(8) Fermi groups shader processors into groups of 32
(9) this is bad because you can only disable them in full groups (to improve yields from bad chips)

G80 and G92 used groups of 16 shaders, whereas GT200 went to 24 shader clusters. Fermi just takes this natural evolution to more shading power per texture units and fixed function hardware one step further. Going with a smaller granularity also means more control logic (shared by all shaders of one group), which means you need more transistors, get a larger and more expensive chip, which runs hotter but may extract more performance from the same number of shaders for certain code.
Charlie's words are "This level of granularity is bad, and you have to question why that choice was made in light of the known huge die size." I'd say the choice was made because of the known large die size, not in spite of it. Besides, arguing this way would paint AMD in an even worse light, as they can "only" disable shaders in groups of 80 in the 5000 series.

(10) nVidia should redesign the silicon layer to reduce the impact of the transistor variances

I'm not aware of any "magic" design tricks one can use to combat this. Except changing the actual process, but that would be TSMC's business. One could obviously use fewer transistors (at least one of the measures AMD chose), but that's a chip redesign, not a revision. One could also combat transistor gate leakage by using a thicker gate oxide (resulting in lower clock speeds), but that addresses neither the variances nor the other leakage mechanisms.
If anyone can tell me what else could be done I'd happily listen :)

(11) The only way to rescue Fermi is to switch to 28 nm

While I agree that GF100 is just too large for TSMC's 40 nm and will always be very power constrained, I wouldn't count on 28 nm as the magic fairy (as I said somewhere above). They'd rather get the high-end to mainstream variants of the Fermi design out of the door as soon as possible. If they haven't been pushing this option for a couple of months already, the term "idiots" would be quite a euphemism..

MrS
____________
Scanning for our furry friends since Jan 2002

skgiven
Message 15483 - Posted: 27 Feb 2010 | 17:35:58 UTC - in response to Message 15480.

AMD did not suddenly jump to 2x6 cores. They moved from single to dual (on one die, unlike Intel who preferred to use glue), then moved to quads, hex and now 12 cores. Along the way they moved from 90nm to 40nm, step by step. In fact Intel will be selling a 6core 32nm CPU soon.

Unfortunately, NVidia tried to cram too much onto one core too early.
As you say Mr.S, an increase in transistor count results in energy loss (leakage in the form of heat), and it probably rises with the square too. The more transistors you have, the higher the heat loss and the less you can clock the GPU.

There are only two ways round this; thinner wafers, or to break up the total number of transistors into cool manufacturable sizes. I know this may not scale well, but at least it scales.

The 65nm G200 cards had issues with heat, and thus did not clock well - the first GTX260 sp192 (65nm) clocked to just 576MHz, poor compared to some G92 cores. The GTX 285 managed to clock to 648MHz, about 12.5% faster in itself. An excellent result given the 240:80:32 core config compared to that of the GTX 260s, 192:64:28. This demonstrated that with design improvements and dropping from 65 to 55nm, excess leakage could be overcome. But to try to move from 1.4 billion transistors to 3 billion, even on a 40nm process, was madness; not least because 40nm was untested technology and a huge leap from 55nm.

They "bit off more than they could chew"!

If they had designed a 2 billion transistor GPU that could be paired with another on one board, and also dropped to 40nm, they could have made a very fast card, and one we could have been using for some time - it would have removed some manufacturing problems and vastly increased yields at 40nm, making the project financially viable in itself.

I think they could have simply reproduced a GTX 285 core at 40nm that could have been built into a dual GPU card, called it a GTX 298 or something, and it would have yielded at least a 30% performance gain compared to a GTX 295. An opportunity missed.

I still see a quad GPU card as being a possibility, even more so with Quad Sli emerging. If it is not on any blueprint drafts I would be surprised.

ExtraTerrestrial Apes
Message 15485 - Posted: 27 Feb 2010 | 19:12:07 UTC - in response to Message 15483.

AMD did not suddenly jump to 2x6 cores. They moved from single to dual (on one die, unlike Intel who preferred to use glue), then moved to quads, hex and now 12 cores. Along the way they moved from 90nm to 40nm, step by step. In fact Intel will be selling a 6core 32nm CPU soon.


I know, but what's the point? What I was trying to say is that the CPU guys are trying to avoid multi chip packages. They lead to larger chips (additional circuitry needed for communication), higher power consumption (from the additional circuitry) and lower performance (due to higher latency) and higher costs due to the more complex packaging. The only upsides are improved yields on the smaller chips and faster time to market if you already have the smaller ones. So they only use it when either (i) the yields on one large chip would be too low and offset all the penalties from going multi chip or (ii) the cost for a separate design, mask set etc. for the large chip would be too high compared to the expected profit from the chip, so in the end using 2 existing chips is more profitable and good enough.

There are only two ways round this; thinner wafers, or to break up the total number of transistors into cool manufacturable sizes. I know this may not scale well, but at least it scales.


Thinner wafers?

I think they could have simply reproduced a GTX 285 core at 40nm that could have been built into a dual GPU card, called it a GTX 298 or something, and it would have yielded at least a 30% performance gain compared to a GTX 295. An opportunity missed.


And I'd rather see this chip based on the Fermi design ;)

I still see a quad GPU card as being a possibility, even more so with Quad Sli emerging.


Quad SLI: possibly, just difficult to make it scale without microstutter.

Quad chips: a resounding no. Not as an add-in card for ATX or BTX cases. Why? Simple: a chip like Cypress at ~300 mm² generally gets good enough yield and die harvesting works well. However, push a full Cypress a little and you easily approach 200 W. Use 2 of them and you're at 400 W, or just short of 300 W and downclocked as in the case of the 5970. What could you gain by trying to put 3 Cypress onto one board / card? You'd have to downclock further and thereby reduce the performance benefit, while at the same time struggling to stay within power and space limits. The major point is that producing a Cypress-class chip is not too hard, and you can't put more than 2 of these onto a card. And using 4 Junipers is going to be slower and more power hungry than 2 Cypress due to the same reasons AMD is not composing their 12 core CPU from 12 single dies.
I could imagine an external box with its own PSU, massive fans and 800+ W of power draw though. But I'm not sure there's an immediate large market for this ;)
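Putting the power arithmetic above into a few lines of Python (the ~200 W per pushed Cypress and the 300 W card ceiling come from the discussion; assuming power scales roughly linearly with clock, which is optimistic, and ignoring board overhead):

# How much would N Cypress-class chips have to be downclocked to stay
# under the 300 W add-in card ceiling? Assumes ~200 W per chip at full
# clocks and power scaling linearly with clock (optimistic).
CARD_LIMIT_W = 300.0
CHIP_FULL_W  = 200.0

for n in (1, 2, 3, 4):
    clock_frac = min(1.0, CARD_LIMIT_W / (n * CHIP_FULL_W))
    print("%d chips: ~%3.0f%% clocks within %d W" % (n, 100 * clock_frac, CARD_LIMIT_W))
# 2 chips -> ~75% clocks (roughly the HD 5970 situation); 3 chips -> ~50%,
# at which point the third chip buys very little extra performance.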

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15486 - Posted: 27 Feb 2010 | 19:28:25 UTC - in response to Message 15483.

I still see a quad GPU card as being a possibility, even more so with Quad Sli emerging. If it is not on any blueprint drafts I would be surprised.


Sounds reasonable, IF they can reduce the amount of heat per GPU enough that the new boards don't produce any more heat than the current ones. How soon do you expect that?

Or do you mean boards that occupy four slots instead of just one or two?

If they decide to do it by cutting the clock rate in half instead, would you buy the result?

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15487 - Posted: 27 Feb 2010 | 21:14:23 UTC - in response to Message 15486.
Last modified: 27 Feb 2010 | 21:54:26 UTC

Yeah, I was technically inaccurate to say that the 9800 GT uses 69W when idle; I looked it up and that is what I found! It would be more accurate, albeit excessive, to say that system usage with the GPU installed and sitting idle can increase by up to 69W, with the GPU itself perhaps using about 23W (the accurate bit); the additional losses come from the GPU's power-dependent components, the PSU and the motherboard (the vague and variable bit). Of course if I say this for the 9800 GT I would need to take the same approach with the GT240 (6W idle), and mention that in reality idle power consumption also depends on your motherboard's ability to turn power off, your PSU, your chipset and the motherboard's general efficiency. Oh, and the operating system's power settings, but again that is excessive for a generalisation.

- XbitLabs compared a GT240 to a 9800GT (p4) and then replaced the 9800GT with a GTS250 (p6)! Crunching on GPUGrid, and overclocked by between 12 and 15%, my temps on 3 different types of GT240 stay around 49 degrees C. The only exception is one card that sits a bit too close to a chipset heatsink in an overclocked i7 system. How they got 75 degrees at stock settings is beyond me. Even my hot (58 deg) card's fan is at 34%.

Anyway, I think the GT240 is a good all-round midrange card for its many features, and an excellent card for GPUGrid. For the occasional light gamer who does not use the system as a media centre or for GPUGrid, you are correct, there are other better options, but those cards are of less use here, and I don’t think gamers who don’t crunch are too interested in this GPUGrid forum.

I've got a nice suggestion for nVidia to make better use of these chips: produce a GT250. Take the same chip, give the PCB an extra PCIe 6 pin connector and let it draw 80 - 90 W under load. Current clock speeds of these cards already reach 1.7 GHz easily, so with a slight voltage bump they'd get to ~1.8 GHz at the power levels I mentioned. That would give them 34% more performance, would take care of the performance debate compared to the older generation and would still be more power efficient than the older cards. nVidia would need a slight board modification, a slightly larger cooler and possibly some more MHz on the memory. Overall cost might increase by 10€. Sell it for 20€ more and everyone's happy.


Your suggested card would actually be a decent gaming card, competitive with ATI’s HD 5750. NVidia has no modern cards (just G92s) to fill the vast gap between the GT240 and the GTX 260 sp216 55nm. A GTX 260 is about two to two and a half times as fast on GPUGrid. Perhaps something is on its way? A GTS 240 or GTS 250 could find its way onto a 2xx 40nm die and into the mainstream market (it's presently an OEM card and likely to stay that way as is). Anything higher would make it a top end card.

I could see two or four 40nm GPUs on one board, if they had a smaller transistor count. Certainly not Fermi (with 3 billion), at least not until 28nm fabrication is feasible, and that's probably 2+ years away, by which time there will be other GPU designs. I think the lack of limitations for a GPU card offers up more possibility than 2 or 4 separate cards, and even though the circuitry would increase, it would be less than the circuitry of 4 separate cards together. I expect there will be another dual card within the next 2 years.

Fermi could do with fewer transistors, even if that means a chip redesign!
A design that allows NVidia to use more imperfect chips, by switching areas off, might make them more financially viable. We know AMD can disable cores if they are not up to standard. Could NVidia take this to a new level? If it means 4 or 5 cards with fewer and fewer ROPs & shaders, surely that's better for the consumer too? GTX 480, 470, 460, 450, 440, 430. More choice, better range, better prices.

NVidia and AMD want to cram more and more transistors into a chip, but this increase in density also increases the heat produced per unit area. Could chips be designed to include spaces, like fire-breakers?

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15488 - Posted: 27 Feb 2010 | 23:41:05 UTC - in response to Message 15485.
Last modified: 27 Feb 2010 | 23:55:45 UTC

Quad chips: a resounding no. Not as an add-in card for ATX or BTX cases. Why? Simple: a chip like Cypress at ~300 mm² generally gets good enough yield and die harvesting works well. However, push a full Cypress a little and you easily approach 200 W. Use 2 of them and you're at 400 W, or just short of 300 W and downclocked as in the case of the 5970. What could you gain by trying to put 3 Cypress onto one board / card? You'd have to downclock further and thereby reduce the performance benefit, while at the same time struggling to stay within power and space limits. The major point is that producing a Cypress-class chip is not too hard, and you can't put more than 2 of these onto a card. And using 4 Junipers is going to be slower and more power hungry than 2 Cypress due to the same reasons AMD is not composing their 12 core CPU from 12 single dies.
I could imagine an external box with its own PSU, massive fans and 800+ W of power draw though. But I'm not sure there's an immediate large market for this ;)

MrS

Sounds fascinating with an external option. I'm thinking of the ExpressBox that was tried as an external GPU to boost laptops through the ExpressCard slot. The problem was that it was a x1 PCIe link & the idea was too costly & brought too little benefit.

But what about a x16 PCIe 2.0 external extender riser card connected to an external GPU? GPUs are already making so much heat & taking up so much space, so why not just separate them once and for all? No need for a redesign of the motherboard... If there was an option to stack external GPUs in a way that could use SLI, an external GPU option could give enthusiasts the opportunity to spend insane amounts of money on external GPUs & stack them on top of each other w/o running into heat or power issues. Even if a Fermi requires 800W, if it's external & has its own dedicated PSU, you'd still be able to stack 3 on top of each other to get that Fermi tri-SLI.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15489 - Posted: 27 Feb 2010 | 23:45:23 UTC - in response to Message 15487.

nVidia can already disable shader clusters as they wish. They could sell a GF100 with only 32 shaders active, if that made any sense. If I remember correctly the first cards to employ these techniques were ATI's X800 Pro and nVidia's 6800GS. On newer cards they may also have more flexibility regarding the ROPs, memory controllers etc., so I think that's quite good already.

What do you mean by "lack of limitations for a GPU card"? I think it does have severe limitations: the maximum length is defined by ATX cases (we're already pushing the limits here), the height is also limited by the regular cases and power draw and cooling are severely limited. Sure, you could just ask people to plug even more 6 and 8 pin PCIe connectors into the cards, but somehow you also have to remove all that heat.

I could see two or four 40nm GPUs on one board,


2 chips definitely (HD5970), but 4? What kind of chips would you use? Since we don't know enough about Fermi I'd prefer to discuss this based on ATI chips. As I said, even going for 3 Cypress (HD58x0) is almost impossible, or at least unfeasible. You could use 2 x 2 Juniper (HD 57x0), each of which is exactly half a Cypress. Performance in Crossfire is surprisingly good compared to the full configuration, but you need an additional 30 - 40 W for the CF tandems compared to the single chip high end cards (see next page). That would turn a 300 W HD5970 into a 360 - 380 W card, if you used 4 Junipers. It would have to be noticeably cheaper to justify that.

Could chips be designed to include spaces, like fire-breakers?


They could, but the manufacturing cost is the same regardless of whether you use the area for transistors or whether you leave it blank. So you reduce power density somewhat and reduce the number of chips per wafer considerably. And power density at the chip level is not the problem of GPUs, as the chips are large enough to transfer the heat to the cooler. Their main problem is that the form factor forbids the use of larger fans, so removing the heat from the heatsink is the actual problem.

BTW: thanks for discussing :)

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15490 - Posted: 28 Feb 2010 | 0:00:33 UTC - in response to Message 15488.

@liveonc:

We'd need at least one PCIe 2.0 16x connection routed to the external box. That would mean it would have to be very close to the PC, but that's no show stopper.

What I could imagine: use a common PCB for 4 (or "x" if you prefer) GPU chips, or separate PCBs for each of them (cheaper, more flexible, but you'd have to connect them). Make them approximately square and cover each of them with a large heatsink. There'd probably have to be 2 or 3 heatpipes for the GPU core, whereas memory and power circuitry could be cooled by the heatsink itself. Then put a 120 or 140 mm fan on top of each of these "GPU blocks". Leave the sides of the device free enough so that the hot air can flow out. You'd end up with a rather flat device which covers some surface, so ideally you'd place it like a tower parallel to your.. ehm, PC tower. It could also be snapped onto the front or back, so you could easily use 2 of these modules for one PC, and both PCIe links would only have to cover ~10 cm to the motherboard I/O panel. For BOINC this would absolutely rock, whereas for games we might need faster and / or wider PCIe. One could also replace the individual fans with a single really large one.. but then there'd be a large dead spot beneath its axis. And if only selected GPUs are loaded one couldn't speed fans up selectively. BTW: the fans would have easily accessible dust filters.

Oh wait.. did I just write all of this? Let me call my patent attorney first ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15492 - Posted: 28 Feb 2010 | 0:07:48 UTC - in response to Message 15490.
Last modified: 28 Feb 2010 | 1:01:26 UTC

e-SATA has been around for a long time; e-PCIe should be the next thing, why not? nVidia wanted to go nuts, so let them - if space, 300W & cooling are no longer a hindrance, nVidia could go into the external GPU market, the PSU market & the VGA cooling market. Plus they'd get their Fermi.

BTW, do you know if a signal repeater is possible, in case the external GPU has to sit a bit further away from the PC case? And would there be any sense in making x1, x4, x8 & x16 PCIe cables, given that e.g. PCIe 2.0 runs at 5.0 GT/s while the first version operates at 2.5 GT/s, if e.g. the card is only PCIe 1.0?
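For what it's worth, the per-lane numbers are easy to work out; a quick Python sketch (PCIe 1.x runs at 2.5 GT/s and 2.0 at 5 GT/s per lane, both with 8b/10b encoding, so 250 / 500 MB/s per lane per direction):

# Peak PCIe bandwidth per direction for different widths and generations.
PER_LANE_MB_S = {"PCIe 1.x": 250, "PCIe 2.0": 500}

for gen, per_lane in PER_LANE_MB_S.items():
    for lanes in (1, 4, 8, 16):
        print("%s x%-2d: %5d MB/s" % (gen, lanes, lanes * per_lane))
# An external x1 link (ExpressCard style) tops out at 250-500 MB/s, while
# x16 PCIe 2.0 offers 8 GB/s; a PCIe 1.0 card on a 2.0 link simply
# negotiates down to the slower rate.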

This is what I'm getting at, but much simpler, & only meant to house a single Fermi using x16 PCIe 2.0 with an oversized GPU cooler & a dedicated PSU: http://www.magma.com/expressbox7x8.html
____________

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15494 - Posted: 28 Feb 2010 | 1:45:04 UTC - in response to Message 15490.
Last modified: 28 Feb 2010 | 1:50:00 UTC

Here's another one: http://www.pcper.com/comments.php?nid=8222 but it's, as I "suspect", much like the Asus XG Station: http://www.engadget.com/2007/01/08/hands-on-with-the-asus-xg-station-external-gpu, & if it is, it's limited to x1 PCIe.

Magma has the right idea with their ExpressBox7x8: http://www.magma.com/expressbox7x8.html - the iPass cable used looks as if it's at least 5m in length. But unlike Magma, I was thinking of just 1 Fermi, in its own box, with its own PSU, lying on its back on the bottom of the box (like a motherboard lies in a PC casing), with a giant heatsink & fan. If nVidia wants their monster Fermi to use 800W it can do it outside with a monster GPU cooler & a dedicated PSU. If they want enthusiasts to be able to afford multiple cards, they could make it possible (& attractive) if there is a way to stack one Fermi box on top of another to get that external SLI.

Everybody's talking as if nVidia went too far, but if they don't have to think about space limitations, power consumption or heat dissipation, they can make their monster (& get away with it). I remember all the critics of Intel laughing when Intel's original Pentium overheated, only to have Intel come back (with 40 belly dancers) when they placed a heatsink & fan on top of their Pentium.
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15495 - Posted: 28 Feb 2010 | 3:11:26 UTC
Last modified: 28 Feb 2010 | 3:12:24 UTC

Read the ads with links at the bottoms of threads - some of them offer rack-mounted GPU boxes (mostly the Nvidia type with a CPU in the box as well, but often with 4 GPUs). I've even seen one where a single Xeon CPU is expected to control up to 8 GPUs (GTX 295 type, if you prefer). Rack-mounted systems should be able to get taller than a stack with no frame to support the higher units.

No Fermi boards yet, though.

Jeremy
Send message
Joined: 15 Feb 09
Posts: 55
Credit: 3,542,733
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15496 - Posted: 28 Feb 2010 | 3:34:50 UTC - in response to Message 15494.

Disclaimer: I do not know as much about chip design as I'd like, and I'm holding off any commentary on Fermi until it's actually in reviewers' and end users' hands. Anything else is pure gossip and conjecture as far as I'm concerned.

That said, much has been made about Intel's vs AMD's vs nVidia's design decisions.

AMD's vs Intel's approach in the CPU world intrigues me. For their initial quad core offerings, Intel were content to package two dies together in a single package. It worked quite well.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?
____________
C2Q, GTX 660ti

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 169
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15497 - Posted: 28 Feb 2010 | 6:08:29 UTC - in response to Message 15496.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?

Even at 40nm two GT200b cores would be huge and most likely wouldn't yield well. Huge dies and poor yields = no profit. In addition, GPUs are very different from CPUs: the shaders are basically cores. For instance, in Collatz and MW a 1600-shader 5870 does twice the work of an 800-shader 4870.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15498 - Posted: 28 Feb 2010 | 12:11:31 UTC

@Robert: the problems with the current rack mounted devices are numerous

- they're meant for professionals and thus feature expensive Teslas
- I don't have a 19" rack at home (and am not planning to buy one)
- the air flow is front to back, with the units designed to be stacked
-> the cooling can't be quiet this way
- a separate Xeon to drive them? I want to crunch with GPUs so that I don't have to buy expensive CPUs!

@liveonc: that cable in the Magma-box looks really long, so it seems like PCIe extension is no big deal :)

@Jeremy: you have to feed data to these chips. 256 bit memory interfaces are fine for single chips, where the entire space around them can be used. 384 and 512 bit quickly get inconvenient and expensive. In order to place two GT200 class chips closer together you'd essentially have to create a 768 bit memory interface. The longer and the more curved the wires to the memory chips are, the more problematic it becomes to clock them high. If your wire layout (and some other stuff) is not good you cannot even clock the memory as high as the chips themselves are specified for.
So without any serious redesign you create quite some trouble by placing two of these complex chips closer together - for zero benefit. You could get some benefit, if there was some high bandwidth, low latency communication interface between the chips. Similar to what AMD planned to include in Cypress as Sideport, but had to scrap to keep the chip "small". But that's a serious redesign. And I suppose they'd rather route this communication interface over a longer distance than to cram the memory chips into an even smaller area.
A single Cypress already pushes the limits of what can be done with 256 bit GDDR5 quite hard: the HD4890 was not memory bandwidth limited, yet the HD5870 increased raw power by a factor of 2 while memory bandwidth grew only by a factor of 1.3, so it could have used even faster memory. They had to be very careful about the bus to the memory chips, its timing and error detection and correction to reach even these speeds.
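Just to show where the factor of ~1.3 comes from, the bandwidth sum in Python (bus width times effective data rate; the memory clocks below are approximate retail values, so take them as assumptions):

# Bandwidth (GB/s) = bus width in bytes * effective data rate in Gbps.
# GDDR5 transfers 4 bits per pin per memory clock.
def bandwidth_gb_s(bus_bits, mem_clock_mhz, transfers=4):
    return (bus_bits / 8.0) * (mem_clock_mhz * transfers / 1000.0)

hd4890 = bandwidth_gb_s(256, 975)    # ~124.8 GB/s
hd5870 = bandwidth_gb_s(256, 1200)   # ~153.6 GB/s
print("HD 4890 ~%.0f GB/s, HD 5870 ~%.0f GB/s, ratio ~%.2f"
      % (hd4890, hd5870, hd5870 / hd4890))
# The shader power roughly doubled while the bandwidth grew only ~1.2-1.3x,
# which is why the 256 bit bus is considered close to its practical limit.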

MrS
____________
Scanning for our furry friends since Jan 2002

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15500 - Posted: 28 Feb 2010 | 15:59:51 UTC - in response to Message 15496.

Disclaimer: I do not know as much about chip design as I'd like, and I'm holding off any commentary on Fermi until it's actually in reviewers' and end users' hands. Anything else is pure gossip and conjecture as far as I'm concerned.

That said, much has been made about Intel's vs AMD's vs nVidia's design decisions.

AMD vs Intel's approach in the CPU world intrigue me. For Intel, they were content to package two chips onto a single die to make their initial quad core offerings. It worked quite well.

What prevents GPU makers from doing the same? Two GT200b(c?) dies on a single GPU package. It'd be a GTX295 that wouldn't require doubling up the vRAM. Make it at 40nm. It seems to make so much sense (at least to me), but nobody is doing it and there has to be a reason. Does anybody know what that reason is?


I have years of experience in some of the early stages of chip design, mostly logic simulation, but not the stages that would address putting multiple chips in one package. I'd expect two such GPU chips in the same package to generate too much heat, unless the chips were first redesigned to generate perhaps only half as much heat, or the clock rate for the pair was reduced by about 50% - leaving them doing about as much work as just one chip at a normal clock rate.

Also, I suspect that the chips are not already designed to let two of them share the same memory, even if both chips are in the same package, so you'd need about twice as many pins for the package to allow it to use two separate sets of memory.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15501 - Posted: 28 Feb 2010 | 17:24:01 UTC - in response to Message 15498.

@Robert: the problems with the current rack mounted devices are numerous

- they're meant for professionals and thus feature expensive Teslas
- I don't have a 19" rack at home (and am not planning to buy one)
- the air flow is front to back, with the units designed to be stacked
-> the cooling can't be quiet this way
- a separate Xeon to drive them? I want to crunch with GPUs so that I don't have to buy expensive CPUs!

MrS


As best I can tell, the current application designs don't work with just a GPU portion of the program; they need a CPU portion to allow them to reach BOINC and then the server. The more GPUs you use, the faster the CPU you need to do this interfacing for them. Feel free to ask the rack-mounted equipment companies to allow the use of a fast but cheap CPU, if you can find one. Also, it looks worth checking which BOINC versions allow the CPU programs interfacing to separate GPUs to run on separate CPU cores.

However, I'd expect you to know more than me about just how fast this interfacing needs to be in order to avoid slowing down the GPUs. Putting both the GPUs and the CPU on the same board, or at least in the same cabinet, tends to make the interface faster than putting them in separate cabinets.

The rack mounted devices I've looked at put the GPUs and the CPU on the same board, and therefore don't use Teslas. I haven't checked whether they also use enough memory to drive up the price to Tesla levels.

I agree with your other objections, though.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15502 - Posted: 28 Feb 2010 | 17:34:57 UTC

Jeremy, if the heat generated and the package size weren't problems, you could take your idea even further by putting the video memory chips in the same package as the GPU, and therefore speed up the GPU to memory interface.

As it is, consider how much more you'd need to slow down the GPU clocks to avoid overheating the memory chips with this approach.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15503 - Posted: 28 Feb 2010 | 18:29:07 UTC - in response to Message 15501.
Last modified: 28 Feb 2010 | 18:30:57 UTC

However, I'd expect you to know more than me about just how fast this interfacing needs to be in order to avoid slowing down the GPUs. Putting both the GPUs and the CPU on the same board, or at least in the same cabinet, tends to make the interface faster than putting them in separate cabinets.


But Intel has already taken the first step on nVidia's foot, by not allowing them to build boards for the Core i7

nVidia already wanted to promote supercomputing via Fermi. If they start selling external GPU kits, even an ITX board with a PCIe slot should "theoretically" be able to use a monster GPU. Taking it one step further, if both nVidia & ATI start selling external GPU kits with oversized GPUs unable to fit in any traditional casing, they could start making GPU boards & offer to sell the GPU chip & memory separately. PSU, GPU cooling & RAM manufacturers would be delighted, wouldn't they? But of course, Intel will probably object...
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15504 - Posted: 28 Feb 2010 | 19:54:05 UTC - in response to Message 15503.

But Intel has already taken the first step on nVidia's foot, by not allowing them to build boards for the Core i7


What are you talking about?


____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15505 - Posted: 28 Feb 2010 | 20:12:26 UTC - in response to Message 15504.

Only Intel makes chipsets for the LGA1156 & LGA1366 sockets; I heard a "rumor" that it wasn't because nVidia couldn't.
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15508 - Posted: 28 Feb 2010 | 21:54:39 UTC

I was confused because you said boards.
The 1366 socket chips integrated the functionality of the nvidia chipset into the cpu directly and I think nvidia announced they were getting out of the chipset market entirely (not making them for amd motherboards either) before 1156 was launched.
____________
Thanks - Steve

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15511 - Posted: 1 Mar 2010 | 0:41:04 UTC - in response to Message 15505.
Last modified: 1 Mar 2010 | 1:10:15 UTC

Only Intel makes chipsets for the LGA1156 & LGA1366 sockets; I heard a "rumor" that it wasn't because nVidia couldn't.


Shouldn't matter unless the CPU and GPU share the same socket; the socket type is not critical otherwise. Intel is moving in the direction of putting both the GPU and the CPU on the same chip, and therefore into the same socket, but I've seen no sign that Nvidia has designed a CPU to put on the same chip as the GPU. A few problems with Intel's approach: I've seen no sign of a BOINC version that can make any use of a GPU from Intel yet. There are no monster Intel GPUs yet, and I expect both heat problems and VRAM pin count problems from trying to put both a monster GPU and any of Intel's current CPU designs into the same package. I suppose that the GPU could share the same memory as the CPU, IF you're willing to give up the usual speed of the GPU to VRAM interface, but is that fast enough to bother making any? Alternatively, you could wait a few years for chip shrinkage and power requirement shrinkage to reach the point where a CPU, a monster GPU, and the entire VRAM all fit into one package without heat or size problems.
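To put a rough number on the shared-memory question, a quick Python comparison (DDR3-1333 in dual channel is just a typical desktop assumption, and the GDDR5 line assumes a 256-bit card at ~4.8 Gbps):

# Compare what a GPU would get from sharing system memory with the CPU
# versus having its own VRAM. Both figures are peak theoretical numbers.
def gb_s(bus_bits, effective_gbps):
    return (bus_bits / 8.0) * effective_gbps

shared_ddr3 = gb_s(128, 1.333)    # 2 x 64-bit channels at 1333 MT/s ~ 21 GB/s
dedicated_gddr5 = gb_s(256, 4.8)  # ~154 GB/s on a 256-bit card
print("shared DDR3: ~%.0f GB/s, dedicated GDDR5: ~%.0f GB/s (~%.0fx)"
      % (shared_ddr3, dedicated_gddr5, dedicated_gddr5 / shared_ddr3))
# And the shared pool would still have to serve the CPU at the same time,
# so the real gap would be even larger.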

As for Nvidia giving up the chipset market, isn't that most of their product line now? Why would they be willing to risk making such a large change unless the alternative was bankruptcy? Or do you, for example, mean the part of the chipset market for chips not related to graphics?

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15512 - Posted: 1 Mar 2010 | 1:06:01 UTC - in response to Message 15511.

Robert,

I've seen no sign of a BOINC version that can make any use of a GPU from Intel yet.


All current Intel GPUs are low-performance, fixed-function units, quite useless for computation.


As for Nvidia giving up the chipset market, isn't that most of their product line now? Why would they be willing to risk making such a large change unless the alternative was bankruptcy?


Nvidia do not have a license for QPI, only the FSB used by older processors (and Atom).

MJH

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15513 - Posted: 1 Mar 2010 | 3:05:20 UTC - in response to Message 15508.

I was confused because you said boards.
The 1366 socket chips integrated the functionality of the nvidia chipset into the cpu directly and I think nvidia announced they were getting out of the chipset market entirely (not making them for amd motherboards either) before 1156 was launched.


Looks to me more like Nvidia has decided that they'll have to fight some legal battles with Intel, start some new chip designs, and then wait until those designs are finished before they can offer any products compatible with the new types of board slots designed to make use of the new features of the Core i5 and i7 CPUs; for now they will continue making products for computers using the older FSB interface. See here for a report that looks somewhat biased, but at least agrees that Nvidia is not planning a full exit from the chipset market:

http://www.brightsideofnews.com/news/2009/10/9/nvidia-temporarily-stops-development-of-chipsets-to-intels-i5-and-i7.aspx

Intel appears to have designed the new QPI interface to do two things:

1. Give speedups by separating the main memory interface from the peripherals interface. This means, for example, that benchmark tests using programs which exercise memory heavily but not the peripherals may have trouble seeing much difference between FSB and QPI results.

2. I'd be surprised if they aren't also trying to restrict the ability of either Nvidia or AMD/ATI to enter the graphics card market for machines using the new features of the Core i5 and i7 by refusing them QPI licenses.

http://www.intel.com/technology/quickpath/index.htm?iid=tech_arch_nextgen+body_quickpath_bullet

AMD has developed a comparable interface, HyperTransport, for their newer CPUs; I expect them to try to restrict the ability of either Nvidia or Intel to enter the graphics card market for any computers using that feature by also denying licenses.

I suspect that as a result, Nvidia will have problems offering products for any new machines designed to make full use of the new Intel or AMD features, and for now will have to concentrate on offering products for older types of motherboard slots. Somewhat later, they can offer products that put a CPU on the same board as the GPU, if they can find a CPU company wanting to increase sales of a suitable CPU but not offering any competing graphics products; hopefully one that already has a BOINC version available. I'd expect some of these products to be designed for cases that don't try to fit the ATX guidelines.

This is likely to give crunchers problems choosing computers that give the highest performance for both GPU projects and CPU projects, and are still produced in large enough quantities to avoid driving up the price, for at least some time. Not as many problems for crunchers interested mainly in GPU projects, though.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15521 - Posted: 1 Mar 2010 | 6:50:25 UTC

I've found an article about efforts to design a dual-Fermi board, although with no clear sign that a significant number of them will ever be offered for sale or that it won't overheat with the normal spacing between cards.

http://www.brightsideofnews.com/news/2010/2/26/asus-to-introduce-dual-fermi-on-computex-2010.aspx

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 169
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15525 - Posted: 1 Mar 2010 | 9:09:53 UTC - in response to Message 15513.

2. I'd be surprised if they aren't also trying to restrict the ability of either Nvidia or AMD/ATI to enter the graphics card market for machines using the new features of the Core i5 and i7 by refusing them QPI licenses.

Intel partnered with NVidia and then cut them out - copied from the MS playbook. When Intel settled the lawsuit with AMD, part of the settlement was cross-licensing of technologies, so AMD should be OK in this regard. Various governments still have litigation against Intel, so it might be in their best interest to play a bit nicer than they're accustomed to.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15532 - Posted: 1 Mar 2010 | 15:00:46 UTC - in response to Message 15525.

The GPUs that NVidia produces are an entirely different product from their system chipsets.

There is nothing that Intel is doing to stop dedicated GPU cards created by Nvidia or any other company from working in our PCs.

While some Intel CPUs do have integrated graphics, so you don't need a dedicated graphics card, this really doesn't matter to us.

Larrabee failed and currently Intel has no public plans to create a dedicated graphics card solution.
____________
Thanks - Steve

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15537 - Posted: 1 Mar 2010 | 16:46:02 UTC

That was an interesting week's discussion ..... :)

Putting aside the technical possibilities that may or may not present themselves to NVidia for averting the impending Fermi disaster - and that's still my personal view - there are the very real and more pertinent commercial realities that have to be taken into account. ATI is already one step ahead of NVidia; they were mid last year, let alone now. By the end of this year they will be two generations ahead, and the Fermi hassles will not be resolved until early 2011 (aka Fermi2).

If we assume they have found a hidden bag of gold to fund their way out of this, and we assume they will not repeat the crass technical directions they have taken for the last three years in terms of architecture - and that's a huge leap of faith given their track record over those three years - then by mid 2011 ATI will be bringing out cards three generations ahead. It is only the 28nm process that will give NVidia breathing space, not only architecturally, but also in terms of implementing DX11. At present they are nowhere near DX11 capability, and will not be for most of 2010.

So come 2011, why should anyone buy a product that is 3 steps behind the competition, has immature new DX11 abilities, and still has no prospect - at least none we know of - of adjusting its core architecture direction fast enough, and in a big enough leap, to be competitive? To cap it all, NVidia will be strapped for cash; 2010 will be a disaster year for income from GPU cards.

Technically there may be an outside chance of them playing catch-up - personally I doubt it - but that aside, they are in a mega investment black hole, and I can't see them reversing out of it. With such a huge ATI/NVidia technology gap about to unfold, I really can't see mass take-up of a poorly performing Fermi card, even if they could afford to subsidise them all year.

There is the age-old saying, "When you are in a hole, stop digging". NVidia is not in a hole, it's in a commercial ravine. Technology possibilities are one thing, especially given their abysmal track record of recent years; commercial reality is quite another.

Regards
Zy

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15538 - Posted: 1 Mar 2010 | 17:14:30 UTC - in response to Message 15537.
Last modified: 1 Mar 2010 | 17:23:44 UTC

Indeed it's been interesting. But most of the things discussed were things that nVidia can choose to go with, in the future.

Fermi is in trouble because it requires lots of power, producing lots of heat, & only 4% of their 40nm wafers are usable with such high requirements. That makes me think that maybe the 4% is of "extreme edition" grade, & that the real problem might be the power requirement & therefore also the heat produced.

The only workaround is placing the GPU outside, so that space won't be an issue, & therefore neither will power or heat dissipation. If nVidia can only use & sell their purest chips (at a loss), they will empty out their coffers, & unless they find a leprechaun at the end of the rainbow with his pot of gold, they will be in trouble.

If nVidia takes these 4% & uses them for workstation GPUs, & salvages as much of the rest as possible to make external mainstream GPUs, they might be able to cut losses until they get to that 28nm.

I like nVidia, but I also like Ati pressuring nVidia.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15539 - Posted: 1 Mar 2010 | 18:58:10 UTC - in response to Message 15538.

The anticipated supply limitations for Fermis will mean that the cards will not be available in Europe for some time.
I expect the cards will first be available in the USA, Canada, Japan and a few hotspots in Asia. They may well make their way to Australia before Europe. No doubt they will arrive in the more affluent parts of Europe first too. So Germany, France and the UK may also see them a few weeks before Spain, Portugal and the more Easterly countries, if they even make it that far. As the numbers are expected to be so few, 5K to 8K, very few cards will make it to GPUGrid crunchers in the near future. So if anyone in the USA or Japan gets one early on, please post the specs, performance and observations as it will help other crunchers decide whether to get one or not (should they ever become available in other areas). If they do turn up, and you want one, grab it quickly (as long as it gets a good review)!

Let us all hope they get really bad reviews, or the gamers will buy them up and very few will make it to GPUGrid!

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 169
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15545 - Posted: 1 Mar 2010 | 20:47:06 UTC - in response to Message 15539.

Let us all hope they get really bad reviews, or the gamers will buy them up and very few will make it to GPUGrid!

While I'm sure you're just kidding, if this does happen it will simply be another big nail in NVidia's coffin.
We need 2 viable GPU companies to keep advances coming and prices down.

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15547 - Posted: 1 Mar 2010 | 21:20:05 UTC - in response to Message 15545.
Last modified: 1 Mar 2010 | 21:28:17 UTC

We need 2 high end competitors! Only that way will new technologies advance progressively. For the time being ATI do not have a competitor! So will an average review change anything? A bad review might be just what they need to jolt them back on track. Let's face it, Fermi is a loss leader that will depend on a gimmick sales pitch that it outperforms ATI in some irrelevant areas of gaming.
The problem is that anyone can compete at the low end, even Intel! But if there are several low end manufacturers eating up the profit margins of the high end GPU manufacturers, they start to look over their shoulder and perhaps panic, rather than looking to the future.

By the way, nice link RobertMiles.
http://www.brightsideofnews.com/news/2010/2/26/asus-to-introduce-dual-fermi-on-computex-2010.aspx
Asus are going to limit their super Republic of Gamers HD5970 Ares dual GPU card to 1000 units. If they try a similar trick with Fermis they may find it difficult to get their hands on enough GPUs!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15549 - Posted: 1 Mar 2010 | 22:14:21 UTC

Guys, I don't get it: why are (some of) you talking as if shrinking GF100 to 28 nm was nVidia's only option? Just because Charlie said so? Like I said before: IMO all they need is a Fermi design of a sane size, a little over 2 billion transistors and with high speed 256 bit GDDR5.

Take a look at the GT200 on the GTX285: it's one full technology node behind Cypress and features 50% fewer transistors, so it should be at least 50% slower. How often do you see Cypress with more than a 50% advantage? It happens, but it's not the average. I'd say that's a good starting point for a Fermi-based mainstream chip of about Cypress' size. nVidia said they'd have something mainstream in summer. I'm still confident they won't screw this one up.

BTW: the 4% yield is probably for fully functional chips. And is probably already outdated. Yields change weekly, especially when the process is not yet mature.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15550 - Posted: 1 Mar 2010 | 23:38:19 UTC - in response to Message 15549.

Some time ago I read 9% yield.
Shrink the transistors and the yield should go up by the square. Fermi's low yield is a result of 3 main issues: a change to a larger die design, the jump to 40nm, and 3 billion transistors.
If they cast smaller dies and reduced the transistor count, mainstream cards would flow out of the factories.
My argument was that if they had done this, even with the GTX 285 chip design (6 months ago), they would have been producing something competitive. Obviously they should not do this now, because they have Fermi technologies, but I agree they have to shrink the transistor count, now! In 18 months or 2 years they may be able to use 28nm, but by then even Fermi will be dated (mind you, that never stopped NVidia from using dated designs in the past).
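To illustrate the "yield goes up by the square" point, here is a first-order Python sketch using the simple Poisson yield model (yield = exp(-defect density × die area)); the defect density and die areas are illustrative guesses, not real TSMC figures:

# Die size hurts twice: fewer candidate dice per wafer AND a smaller
# fraction of them defect-free. Defect density and areas are assumptions.
import math

WAFER_AREA_MM2 = math.pi * (300 / 2.0) ** 2   # 300 mm wafer, edge losses ignored
DEFECT_DENSITY = 0.004                        # defects per mm^2 (assumed)

def good_dice(die_area_mm2):
    candidates = WAFER_AREA_MM2 / die_area_mm2
    yield_frac = math.exp(-DEFECT_DENSITY * die_area_mm2)
    return candidates, yield_frac, candidates * yield_frac

for name, area in (("~550 mm^2 (Fermi-class)", 550), ("~330 mm^2 (Cypress-class)", 330)):
    cand, y, good = good_dice(area)
    print("%-26s ~%3.0f candidates, ~%2.0f%% yield, ~%3.0f good dice" % (name, cand, 100 * y, good))
# Shrinking the die raises the candidate count AND the defect-free share,
# so the number of good dice improves much faster than linearly.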

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15551 - Posted: 1 Mar 2010 | 23:49:20 UTC - in response to Message 15549.
Last modified: 2 Mar 2010 | 0:19:18 UTC

Guys, I don't get it: why are (some of) you talking as if shrinking GF100 to 28 nm was nVidia's only option? Just because Charlie said so? Like I said before: IMO all they need is a Fermi design of a sane size, a little over 2 billion transistors and with high speed 256 bit GDDR5.

Take a look at the GT200 on the GTX285: it's one full technology node behind Cypress and features 50% fewer transistors, so it should be at least 50% slower. How often do you see Cypress with more than a 50% advantage? It happens, but it's not the average. I'd say that's a good starting point for a Fermi-based mainstream chip of about Cypress' size. nVidia said they'd have something mainstream in summer. I'm still confident they won't screw this one up.

BTW: the 4% yield is probably for fully functional chips. And is probably already outdated. Yields change weekly, especially when the process is not yet mature.

MrS


But if they've already gone into the production phase, it's too late, isn't it, to modify the design one way or the other? The only thing they can do now is make chips & put them on bigger boards. 28nm is the future, but isn't changing the design of Fermi also in the future? I don't have the knowledge that you guys have & you can read in my wording that I don't. But I do know that when a product goes into the production phase, you can't go back to the drawing board & redesign everything. The only thing you can do is find a way to make what you've got work. They're already doing the same thing with Fermi as they did with the GTX 260, a cut-down GTX 280. If nVidia wants to launch the low-mid range before they can make a workable high end, it's just stalling, & they can't stall for too long. But if redesigning the PCB is easier & faster, then IMO it's the only way to get the original Fermi they wanted from the very start.

Here I'm thinking about 3 things. The first is to place the GPU outside the PC casing. The second is to go with water & bundle Fermi cards with "easy" water cooling kits that function like the Corsair Hydro: http://www.corsair.com/products/h50/default.aspx Or thirdly, make a huge 3-4 slot GPU card with 80mm fans blowing cool air from the front of the casing out the back, instead of the blower they're using now, & using a tower heatsink (but it's upside down in most casings, isn't it?).
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15553 - Posted: 2 Mar 2010 | 0:13:27 UTC - in response to Message 15551.
Last modified: 2 Mar 2010 | 0:25:13 UTC

NVidia design chips, it's what they do, and they are good at it.
Unfortunately they overstretched themselves and started to pursue factory mods to manufacture larger numbers of 40nm GPUs. They tried to kill 2 birds with one stone, and missed lol
NVidia can, and I think will, produce a modified version of Fermi - with a reduced transistor count (it's what they do) - but in my opinion they should have done this earlier.
NVidia have already produced low to mid range products: GT210 & Ion (low and very low), GT 220 (low to mid), GT240 (mid range), but then there is a gap before you reach the end-of-line GTX 260s (excluding any older G92 based GPUs that are still being pushed out). By the way, I have tried a GTS 240; it uses G92b (like the GTS 250) and basically does not work well on GPUGrid, ditto for several other mid-high end G92 based cards. I'm not a gamer.

When you see Asus redesign PCBs and add a bigger, and in this case better, heatsink and fan, it is no wonder they get much better results. So there is more to be had! They have made 2 vastly more powerful GPUs and both create less noise! Well done, twice.
Tackle Fermi too, please. NVidia need that sort of help.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15554 - Posted: 2 Mar 2010 | 4:57:44 UTC - in response to Message 15537.

That was an interesting week's discussion ..... :)

Putting aside the technical possibilities that may or may not present themselves to NVidia for averting the impending Fermi disaster - and that's still my personal view - there are the very real and more pertinent commercial realities that have to be taken into account. ATI is already one step ahead of NVidia; they were mid last year, let alone now. By the end of this year they will be two generations ahead, and the Fermi hassles will not be resolved until early 2011 (aka Fermi2).

Regards
Zy


I'd say it's more like ATI is one step ahead in making GPU chips for graphics applications, but at least one step behind in making GPU chips for computation purposes and in offering the software needed to use them well for computation. This could lead to a splitting of the GPU market, with commercial consequences almost as bad as you predict, though.

It could also mean a major setback for GPU BOINC projects that aren't already using ATI graphics cards at least as well as Nvidia graphics cards, or well on the path to getting there; and the same for all commercial operations depending on the move to GPU computing.

Zy, I suppose that you'd automatically agree with all of ATI's rather biased opinions of the Fermi, if you haven't done this already:

http://www.brightsideofnews.com/news/2010/2/10/amd-community-has-29-tough-gf100-questions-for-nvidia.aspx

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15555 - Posted: 2 Mar 2010 | 5:51:28 UTC - in response to Message 15551.

But if they've already gone into the production phase, it's too late, isn't it, to modify the design one way or the other? The only thing they can do now is make chips & put them on bigger boards. 28nm is the future, but isn't changing the design of Fermi also in the future? I don't have the knowledge that you guys have & you can read in my wording that I don't. But I do know that when a product goes into the production phase, you can't go back to the drawing board & redesign everything. The only thing you can do is find a way to make what you've got work. They're already doing the same thing with Fermi as they did with the GTX 260, a cut-down GTX 280. If nVidia wants to launch the low-mid range before they can make a workable high end, it's just stalling, & they can't stall for too long. But if redesigning the PCB is easier & faster, then IMO it's the only way to get the original Fermi they wanted from the very start.

Here I'm thinking about 3 things. The first is to place the GPU outside the PC casing. The second is to go with water & bundle Fermi cards with "easy" water cooling kits that function like the Corsair Hydro: http://www.corsair.com/products/h50/default.aspx Or thirdly, make a huge 3-4 slot GPU card with 80mm fans blowing cool air from the front of the casing out the back, instead of the blower they're using now, & using a tower heatsink (but it's upside down in most casings, isn't it?).


Production phase of THAT CHIP and too late to change the design of THAT CHIP in time to help.

But are you sure that they aren't already working on another chip design that cuts down the number of GPU cores enough to use perhaps half as many transistors, but is similar otherwise?

I'd think more along the lines of, for many products, putting the GPU outside the computer's main cabinet, but also adding a second CPU and second main memory close to it to provide a fast way of interfacing to it. This second cabinet should not even try to follow the ATX guidelines.

For some others, don't even try to make the card plug into all the card slots it blocks other cards from using; this should leave more space for a fan and a heatsink. Or, alternatively, make some of the cards small enough that they don't try to use the card slots they plug into for anything more than a way to get extra power from the backplane. These products could still plug in to normal motherboards, even if they don't fully follow the guidelines for cards for those motherboards. Your plans sound good otherwise.

As for using Fermi chips that don't have all the GPU cores usable, are you sure they don't have a plan to disable use of the unusable cores in a way that will stop them from producing heat, then use them in a board family similar to the GTX 260 boards that also used chips with many of the GPU cores disabled? Likely to mean a problem with some software written assuming that all GPU cores are usable, though, and therefore difficult to use in creating programs for GPU BOINC projects.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15556 - Posted: 2 Mar 2010 | 6:16:57 UTC

An idea for Nvidia to try for any newer boards meant to use chips with some GPU cores disabled: add a ROM to the board design that holds information about which GPU cores are disabled (if the Fermi chip design doesn't already include such a ROM), then make their software read this ROM just before making a last-minute decision about which GPU cores to distribute the GPU program among on that board. Require any other providers of software for the new boards to mention, in their public offerings of the software, whether their software can also handle a last-minute decision of which GPU cores to use.

This could mean a new Nvidia application program specifically for taking GPU programs compiled in a way that does not specify which GPU cores are usable, then distributing them among the parts of the GPU chip that actually are usable.
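Purely to illustrate the idea (this is a hypothetical Python sketch, not how any real NVIDIA driver or board works), the scheme boils down to something like:

# Hypothetical: a per-board ROM exposes a bitmask of usable GPU cores and
# the runtime spreads work only across the cores whose bit is set.
def usable_cores(rom_mask, total_cores):
    return [i for i in range(total_cores) if rom_mask & (1 << i)]

def distribute(work_items, rom_mask, total_cores=16):
    cores = usable_cores(rom_mask, total_cores)
    plan = {c: [] for c in cores}
    for n, item in enumerate(work_items):
        plan[cores[n % len(cores)]].append(item)   # simple round-robin
    return plan

# Example: a board whose ROM reports cores 3 and 7 as disabled.
mask = 0xFFFF & ~((1 << 3) | (1 << 7))
plan = distribute(list(range(32)), mask)
print("items per usable core:", {c: len(v) for c, v in plan.items()})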

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15557 - Posted: 2 Mar 2010 | 6:20:23 UTC

Do you suppose that Intel's efforts against Nvidia are at least partly because Intel plans to buy Nvidia and convert them into a GPU division of Intel, but wants to drive down the price first?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15560 - Posted: 2 Mar 2010 | 9:20:29 UTC

Of course they're disabling non-functional shaders, just as in all previous high end chips. The GTX 470 is rumored to have 448 shaders instead of 512 (2 of 16 blocks disabled).
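Die harvesting is exactly why that helps so much; a small Python sketch with a made-up 20% per-cluster defect probability (the 14-of-16 clusters figure is from the rumour above, the defect rate is purely an assumption):

# Chance of a die being fully working vs usable with up to 2 of its 16
# shader clusters disabled, assuming each cluster independently has a 20%
# chance of containing a defect (an illustrative number only).
from math import comb

P_BAD, N = 0.20, 16

def prob_at_most(bad):
    return sum(comb(N, k) * P_BAD**k * (1 - P_BAD)**(N - k) for k in range(bad + 1))

print("all 16 clusters good (512 sp): %4.1f%%" % (100 * prob_at_most(0)))
print("<= 2 bad clusters (448 sp)   : %4.1f%%" % (100 * prob_at_most(2)))
# With these numbers only ~3% of dice are perfect, but ~35% are sellable
# as a 448-shader part - a huge difference in usable output.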

And as mighty as Intel is, I don't think they've got anything to do with the current Fermi problems. Denying the QPI license yes, but nothing else.

MrS
____________
Scanning for our furry friends since Jan 2002

Michael Karlinsky
Avatar
Send message
Joined: 30 Jul 09
Posts: 21
Credit: 7,081,544
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwatwatwat
Message 15566 - Posted: 2 Mar 2010 | 15:54:27 UTC

News from CeBIT:

heise.de (in German)

In short:

- one 6 and one 8 pin power connector
- GTX 480 might have less than 512 shaders
- problems with driver
- availability: end of April

Michael
____________
Team Linux Users Everywhere

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15567 - Posted: 2 Mar 2010 | 19:37:49 UTC - in response to Message 15566.

Thanks! That means:

- 225 - 300 W power consumption, as 2 x 6 pin isn't enough (not very surprising; see the quick sum below)
- the early performance numbers are to be taken with a grain of salt, as the drivers are not yet where they should be
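The quick sum behind that range (PCIe spec: up to 75 W from the slot, 75 W per 6-pin, 150 W per 8-pin):

# Power ceilings implied by the connector combination.
SLOT_W, SIX_PIN_W, EIGHT_PIN_W = 75, 75, 150

print("2 x 6-pin    :", SLOT_W + 2 * SIX_PIN_W, "W max")            # 225 W
print("6-pin + 8-pin:", SLOT_W + SIX_PIN_W + EIGHT_PIN_W, "W max")  # 300 W
# Needing the 6+8 pin combination therefore points to a card drawing more
# than 225 W but intended to stay within the 300 W limit.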

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15576 - Posted: 2 Mar 2010 | 23:42:25 UTC - in response to Message 15554.
Last modified: 2 Mar 2010 | 23:45:33 UTC

...... Zy, I suppose that you'd automatically agree with all of ATI's rather biased opinions of the Fermi ......


I would expect ATI marketing to go on a feeding frenzy over this; it would be surprising if they did not.

No, I don't automatically go with the ATI viewpoint, mainly because I have been an NVidia fan for years - since they first broke into the market. I am the last one to want them to back out now. However, NVidia fan or not, they have to come up with the goods and be competitive; it's not a charity out there.

As for ATI articles etc, I treat those the same as NVidia articles - all pre-cleared by marketing and not worth the paper they are printed on until reality appears in the marketplace. Corporate press releases and articles these days are comparable to political statements released by mainstream political parties - the truth is only coincidental to the main objective of misleading the less well-informed.

In other words, I now view both genres as automatic lies until corroborated by other sources or methods. As GDF wisely stated above, we will know in a month or so what's true .....

If NVidia come up with a good design on general release (not a short-term subsidised PR stunt), they get my £s; if they don't, ATI will, for the first time since NVidia appeared in the consumer/gamer end of the GPU market.

Regards
Zy

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15586 - Posted: 3 Mar 2010 | 19:43:34 UTC

For all those curious, here's a look at the goodies: http://www.theinquirer.net/inquirer/blog-post/1594686/nvidia-nda-broken-topless-gtx480-pics
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15588 - Posted: 3 Mar 2010 | 22:23:17 UTC - in response to Message 15586.

They should be ashamed of stripping this poor little GTX naked and revealing its private parts to the public!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15591 - Posted: 3 Mar 2010 | 23:11:15 UTC - in response to Message 15588.

The Internet was meant to show everybody naked, including nVidia. Don't blame people for being curious, or those with sight for not being blind. A German site I checked out yesterday used inches instead of cm & really confused me - either that or Google Translate didn't work. Isn't that worse? I sat in silence when I read that the Fermi would be 4.2 x 4.2 inches!
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15593 - Posted: 3 Mar 2010 | 23:21:25 UTC - in response to Message 15591.

Is that 12 x 256MB RAM I see?

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15596 - Posted: 4 Mar 2010 | 1:20:18 UTC - in response to Message 15593.

I think it is 12 * 128 = 1536 (480) and 10 * 128 = 1280 (470)
____________
Thanks - Steve

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15598 - Posted: 4 Mar 2010 | 8:59:27 UTC - in response to Message 15591.

Well, that's true.. but didn't anyone consider the feelings of this poor chip? If he's not an exhibitionist he'll be seriously ashamed by now and will probably ask himself the same question over and over again: "why, oh why did they do it?" Like the Rumpelwichte in Ronja Räubertochter. No wonder he's too distracted to set any world records now!

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15599 - Posted: 4 Mar 2010 | 9:15:59 UTC - in response to Message 15598.

Did I see what I think I saw? On YouTube?! Somehow I don't think it was a coincidence that the scene ended where it ended. I'm more a fan of stupid humor than sick humor. This is more me: http://www.youtube.com/watch?v=VpZXhR1ibj8 Just pretend not to speak German, & it'll be like reading an Arabic translation on CNN ;-)
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15607 - Posted: 4 Mar 2010 | 19:17:14 UTC - in response to Message 15596.
Last modified: 4 Mar 2010 | 19:39:57 UTC

I think it is 12 * 128 = 1536 (480) and 10 * 128 = 1280 (470)


You may be talking about the shaders, which are on the main GPU (under the tin), or saying that the GeForce versions will have 1.5GB of DDR?
Anyway, I was referring to the 12 small dark chips surrounding the core; I think they are GDDR5 ECC RAM chips.

I just checked and Fermi will have up to 6GB GDDR5, so the 12 RAM chips could each be up to 512MB. If there are 3GB & 1.5GB versions as well, then they could use 256MB (or 128MB, as you said) chips. Mind you, if they are going to have 3GB or 1.5GB versions, perhaps they could leave chips off (6x512MB, or 3x512MB). The PCB would have to change a bit for that, but in turn it would reduce power consumption slightly.


http://www.reghardware.co.uk/2009/10/01/nvidia_intros_fermi/
Explains why it will be good for double-precision.

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15609 - Posted: 4 Mar 2010 | 20:12:23 UTC - in response to Message 15607.

I was referring to memory ...
http://www.xtremesystems.org/forums/showthread.php?t=244211&page=66
http://en.wikipedia.org/wiki/Comparison_of_NVIDIA_graphics_processors#GeForce_400_Series
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15611 - Posted: 4 Mar 2010 | 22:12:02 UTC - in response to Message 15609.

Thanks Steve,
So the GTX 470 will have 448 shaders @ perhaps 1296MHz(?), the GPU will be at 625MHz and the GDDR RAM @ 1600MHZ (3200MHZ). It will use 2555MB RAM total and because it uses 1280MB onboard, that tells us that the shaders depend directly on the RAM (and therefore bus width too) – disable a shader and you disable some RAM as well. I expect this means it has to use 12 RAM chips or disable accordingly?

Asus - an opportunity beckons!

The 220W TDP is a lot more attractive than the 300W!

Not too many will be able to accommodate two 512-shader cards, but two 448-shader cards are do-able (a good 750W PSU should do the trick).

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15625 - Posted: 5 Mar 2010 | 23:59:19 UTC - in response to Message 15611.

So the GTX 470 will have 448 shaders @ perhaps 1296MHz(?), the GPU will be at 625MHz


The "slow core" will run at 1/2 the shader clock.

It will use 2555MB RAM total and because it uses 1280MB onboard


Where is the other half of that 2.5 GB if it's not on board?

that tells us that the shaders depend directly on the RAM (and therefore bus width too) – disable a shader and you disable some RAM as well.


No - correlation does not imply causality ;)

I expect this means it has to use 12 RAM chips or disable accordingly?


For a full 384 bit bus - yes. Unless someone makes memory chips with a 64 bit interface rather than 32 bit.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15639 - Posted: 7 Mar 2010 | 15:31:12 UTC - in response to Message 15625.

If a GTX480 will ship with 1.5GB RAM onboard the card, and can use 2.5GB, then it would need to be using 1GB system RAM to get to 2.5GB.

OK, so for a full 384 bit bus all 12 RAM chips are needed.
This would suggest that the bus for a GTX470 will actually be 320bit, as two RAM chips will be missing (unless they are going to ship cards with RAM attached for no good reason).
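
To spell the relationship out, a back-of-the-envelope sketch in C++ (assuming 32-bit-wide GDDR5 devices of 128MB each, as the rumoured specs imply):

#include <cstdio>

int main() {
    const int bits_per_chip = 32;   // each GDDR5 device has a 32-bit interface
    const int mb_per_chip   = 128;  // 1 Gbit parts assumed
    const int chip_counts[] = {12, 10};

    for (int chips : chip_counts) {
        // bus width and capacity both scale directly with the chip count
        printf("%2d chips -> %3d-bit bus, %4d MB\n",
               chips, chips * bits_per_chip, chips * mb_per_chip);
    }
    return 0;
}

That gives 384-bit / 1536MB with all 12 chips populated and 320-bit / 1280MB with 10, which is exactly the GTX480 / GTX470 split being rumoured.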

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15644 - Posted: 7 Mar 2010 | 19:13:58 UTC - in response to Message 15639.

If a GTX480 will ship with 1.5GB RAM onboard the card, and can use 2.5GB, then it would need to be using 1GB system RAM to get to 2.5GB.


Adding the system RAM available to the card to the actual GPU memory is a bad habit of the OEMs to post bigger numbers. Any modern card can use lots of system memory, but it does not really matter how much as it's too slow anyway. It's like saying your PC has 2 TB of RAM because you just plugged in that 2 TB Green HDD.

This would suggest that the bus for a GTX470 will actually be 320bit, as two RAM chips will be missing.


Yes. Technically it's the other way around: it has 10 chips because of the 320 bit bus, but never mind. (disclaimer: if the specs are correct)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15645 - Posted: 8 Mar 2010 | 9:00:49 UTC - in response to Message 15644.



Adding the system RAM available to the card to the actual GPU memory is a bad habit of the OEMs to post bigger numbers. Any modern card can use lots of system memory, but it does not really matter how much as it's too slow anyway. It's like saying your PC has 2 TB of RAM because you just plugged in that 2 TB Green HDD.

MrS


I know it is more of an advertisement scam than a reality, but I think games still tend to get loaded into system RAM (not that I play games), rather than sit on a DVD or the hard drive.

The Green HDDs are better as a second drive ;) or if your system stays on most of the time. The last one I installed had 64MB of onboard cache.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15650 - Posted: 8 Mar 2010 | 22:34:20 UTC - in response to Message 15645.

Sure, the game gets loaded into system mem - otherwise the CPU couldn't process anything. However, if the GPU has to swap to system memory, that's entirely different. It's a mapping of the private address space of the GPU into system memory, so logically it can access this memory just as if it were local. What it stores there is different from what's on the HDD and different from what the CPU processes, though. And as soon as the GPU has to use system mem, your frame rate / performance takes a huge hit on everything but the lowest of the low-end cards.. that's why using system mem for the GPU is "practically forbidden". Consider this: on high-end GPUs we've got 100 - 150 GB/s of bandwidth, whereas an i7 with 3 DDR3 channels can deliver ~15 GB/s, which is itself more than would be available over PCIe x16.
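
The gap is easy to put numbers on. A rough sketch of the theoretical peaks (the clock figures are typical examples rather than measurements, and sustained real-world rates are lower - hence the ~15 GB/s above):

#include <cstdio>

// peak bandwidth in GB/s = bus width in bytes * transfer rate in GT/s
double peak_gbps(double bus_bits, double gtransfers_per_s) {
    return bus_bits / 8.0 * gtransfers_per_s;
}

int main() {
    printf("GTX 285, 512-bit GDDR3 @ 2.48 GT/s : %5.0f GB/s\n", peak_gbps(512, 2.484));
    printf("i7, 3 x 64-bit DDR3-1066           : %5.1f GB/s\n", peak_gbps(3 * 64, 1.066));
    printf("PCIe 2.0 x16, one direction        : %5.0f GB/s\n", 16 * 0.5);
    return 0;
}

So even the theoretical best case for system memory is several times slower than the card's local memory, and the data has to cross PCIe to get there in the first place.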

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15651 - Posted: 8 Mar 2010 | 23:42:52 UTC - in response to Message 15650.

I see your point - that's about 3 times present PCIe (2.0 & 2.1). I don't think PCIe 3 would make too much difference either!

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15749 - Posted: 14 Mar 2010 | 4:38:19 UTC
Last modified: 14 Mar 2010 | 4:52:12 UTC

GALAXY GTX470 http://www.tcmagazine.com/comments.php?shownews=33185&catid=2
Supermicro, to kill for! http://www.computerbase.de/bildstrecke/28637/1/
On your marks, get GTX470/480, March 26th, GO! http://www.fudzilla.com/content/view/17952/1/
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15753 - Posted: 14 Mar 2010 | 11:35:19 UTC

Looks like the fan on the 470 could use a diameter boost.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15841 - Posted: 19 Mar 2010 | 22:25:11 UTC
Last modified: 19 Mar 2010 | 22:26:31 UTC

Might not be such a flop. http://www.fudzilla.com/content/view/18147/65/ The GeForce GTX 480 has 480 stream processors, runs at 700MHz for the core and 1401MHz for the shaders, and features 1536MB of memory that works at 1848MHz, paired with a 384-bit memory interface.
If they can manage an approximately 80% improvement in performance, as with the 9800GTX+ vs the GTX280 (see the Sum of FPS Benchmarks 1920x1200 at http://www.tomshardware.com/charts/gaming-graphics-cards-charts-2009-high-quality-update-3/Sum-of-FPS-Benchmarks-1920x1200,1702.html),
then the GTX480 "could" score 315FPS, which is what the ATI Radeon HD 5970 scores. It's too rich for me though; I'm going to wait for a GTX460-SP432 (whenever that comes out)...
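
Taking those rumoured numbers at face value, the raw figures work out roughly as follows (just a sketch; the one-FMA-per-shader-per-clock and I/O-clock assumptions are mine, not Fudzilla's):

#include <cstdio>

int main() {
    // rumoured GTX 480 figures from the link above
    const double shaders    = 480;
    const double shader_mhz = 1401;
    const double bus_bits   = 384;
    const double mem_mhz    = 1848;   // assumed to be the GDDR5 I/O clock

    // one fused multiply-add per shader per clock = 2 FLOP
    double gflops = shaders * shader_mhz * 2.0 / 1000.0;
    // GDDR5 transfers data on both edges of the I/O clock
    double gbps   = bus_bits / 8.0 * (mem_mhz * 2.0) / 1000.0;

    printf("~%.0f GFLOPS single precision, ~%.0f GB/s memory bandwidth\n",
           gflops, gbps);   // roughly 1345 GFLOPS and 177 GB/s
    return 0;
}

How much of that raw capability turns into frame rates or GPUGRID throughput is, of course, a separate question.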
____________

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15851 - Posted: 20 Mar 2010 | 10:09:41 UTC - in response to Message 15841.
Last modified: 20 Mar 2010 | 10:13:47 UTC

It's already clear that a 5970 will outperform a 480 in its current incarnation, and that's without letting the 5970 stretch its legs - the latter has another 25-30% performance in it above stock. One of the 480's problems is power draw, and it could be interesting to see how they get round that issue when going for a Fermi x 2 next year.

For now though, it's looking more like a question of whether or not they have produced a card that gives sufficient real-world advantage over a 295 to be commercially viable. The pricing will be constrained by the clear real-world lead of the 5970 (a lead it will have for at least 12 months, and that's without thinking about a 5970 successor), the de facto 480 floor price set by 295s and 5870s, and Fermi's voracious power needs. The real battle will be next year, with a re-engineered Fermi2 and whatever the 5970 successor looks like.

Meanwhile I personally find myself caught waiting; I am in the market to replace my stalwart 9800GTX+. The only reason I am still waiting to see what the reality of Fermi turns out to be is the need to run CUDA apps. In my case it's the new GPUGrid project about to start up; for that I need CUDA, and I am reluctant to go down the road of a 295. I tend to buy a card and use it for years, jumping generations. Without that need for CUDA, that box would be ATI already. So in a sense, another saving grace for NVidia is the CUDA dimension. Many people/organisations out there in the real commercial world will have far more serious CUDA needs than me - outside of benchmarks and ego trips - but the motivation will be similar: the need to run CUDA (for now).

NVidia have no chance of overtaking ATI on this release, but I hope they have got a sufficiently enticing real-world performance/price improvement to make it worthwhile for Fermi1 to be seen as a worthy 295 successor.

Regards
Zy

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1925
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 15852 - Posted: 20 Mar 2010 | 10:45:07 UTC - in response to Message 15851.
Last modified: 20 Mar 2010 | 10:45:34 UTC

I don't think it is as clear as you depict it right now. It is likely that Fermi will be a factor of two faster than a GTX 285 for what matters to GPUGRID. For gaming, I would think that the 480 will be the fastest single-GPU card out there. Let's wait another week and see.

gdf


ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15853 - Posted: 20 Mar 2010 | 10:55:13 UTC - in response to Message 15841.

That's certainly an improvement compared to the earlier rumors! And especially the price of the GTX470. That could bring the HD5870 down to the $300 it was supposed to cost half a year ago :p

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15857 - Posted: 20 Mar 2010 | 13:49:29 UTC - in response to Message 15853.

I'm looking forward to the possibility of consolidating several systems into one with either a GTX470 or a GTX480, but only if the more recent prices and performance figures are close to correct. £300 to £400 is reasonable for such a card. However, I would like to see a real review first!

Profile Zydor
Send message
Joined: 8 Feb 09
Posts: 252
Credit: 1,309,451
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 15861 - Posted: 20 Mar 2010 | 18:19:04 UTC - in response to Message 15852.

I don't think it is as clear as you depict it right now. It is likely that Fermi will be a factor of two faster than a GTX 285 for what matters to GPUGRID. For gaming, I would think that the 480 will be the fastest single-GPU card out there. Let's wait another week and see.


As a single GPU - it will be the fastest, for now. It's the real-world incarnation of that compared to ATI's top card that will decide the day for 2010, and it will be significantly slower than a 5970, by an order of magnitude. The 5970 is of course - essentially - two 5870s, but the "average" consumer couldn't care less; a card is a card etc etc, and a Fermi2 will not be upon us until 2011. Let's hope Fermi2 continues to play catch-up, because ATI will not stand still; I have no doubt ATI will release a "Fermi2 killer" single-GPU card this year.

I hope NVidia sort out their engineering and production troubles by 2011; they need to, because at present ATI have a huge lead in a real-world commercial sense. I have my fingers crossed that NVidia price Fermi1 to alleviate the huge power draw, and can sustain production - we all need competition out there - else ATI will do a re-run of 15 years ago, when they got complacent and let NVidia in the back door to wipe their face. ATI will not make that mistake again; it's just a case of whether or not NVidia can keep up. It's a funny ol' world - this time it's NVidia that's made the big strategic mistake (2007).

Probably going to be a couple of weeks before we really know, as the first reviews will be from "selected" reviewers friendly to NVidia marketing. If it is significantly better than a 295, I'm in, and will try to get one - it's likely the last one I will get from NVidia for a long time, and even then it's only because I want to support the new GPUGrid project; I would not even be considering it otherwise - and that is painful to me, as I have been an NVidia fan since they hit the consumer graphics market.

Regards
Zy

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15862 - Posted: 20 Mar 2010 | 21:07:10 UTC - in response to Message 15861.

ATI cards have no use here. Leave it alone. This is a Fermi thread. There are ATI threads elsewhere.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15863 - Posted: 20 Mar 2010 | 21:26:37 UTC

In my world an order of magnitude is a factor of 10 ;)

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15864 - Posted: 20 Mar 2010 | 23:51:19 UTC

ATI will issue new cards in Q3 this year. Basically it's the same 58xx and 59xx series, but on the 28nm process available from GlobalFoundries later this summer. I'm considering buying a 5990 or 5970 (if I'm not patient enough :-) ) this summer for the milkyway@home project. Q3 2011 will bring a new generation of ATI cards.

If the GTX470 is enough of an improvement over the GTX275 I'll take it ($350 is OK for me), but if not... I was an nvidia fan for years, but maybe it's time to say "good luck, guys"
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15865 - Posted: 21 Mar 2010 | 0:03:17 UTC - in response to Message 15864.

I dare say the GTX470 will be twice as fast as a GTX275!
The GTX480 will be twice as fast as a GTX285, and that's before any consideration of architectural improvements over and above the raw counts. You will know for sure next week ;)

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15866 - Posted: 21 Mar 2010 | 1:52:42 UTC

I won't be the first; I hope others are willing. I'm going to wait until they work out the drivers, which usually aren't the best at launch. I'll wait until stocks refill, prices drop, a revision 2 comes out, & Fermis are sold at a discount. So, all that said, I expect to get my first Fermi in 2011 ;-)
____________

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15868 - Posted: 21 Mar 2010 | 3:58:03 UTC - in response to Message 15865.

I dare say the GTX470 will be twice as fast as a GTX275!

Hm... Hope you're right :-) If it's twice as fast as my GTX275, priced around $350 and good at OCing, allowing it to reach at least stock GTX480 speeds - I'll take it :-) Or I'll wait for the 495 till early summer.

____________

Hans-Ulrich Hugi
Send message
Joined: 1 Jul 09
Posts: 2
Credit: 598,355,471
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15880 - Posted: 21 Mar 2010 | 16:20:53 UTC
Last modified: 21 Mar 2010 | 16:24:12 UTC

It seems that the final specs and prices of both Fermi cards have gone public - and it seems that most of the (negative) rumours are true. The cards are slower than many hoped and their price indicates that the performance isn't great.
Source:
http://www.heise.de/newsticker/meldung/Preise-und-Spezifikationen-der-Fermi-Grafikkarten-959845.html (in German)
Linked to:
http://vr-zone.com/articles/nvidia-geforce-gtx-480-final-specs--pricing-revealed/8635.html

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15882 - Posted: 21 Mar 2010 | 16:54:58 UTC - in response to Message 15880.

If they say that the performance of the GTX480 is on par with the HD5870 & the GTX470 is like the HD5850, then looking at the Sum of FPS Benchmarks 1920x1200 on Tom's Hardware (http://www.tomshardware.com/charts/gaming-graphics-cards-charts-2009-high-quality-update-3/Sum-of-FPS-Benchmarks-1920x1200,1702.html), that "could" mean that the GTX480 gains roughly 40% against the GTX285 & the GTX470 gains 25%. If that chart translates to compute performance, that "could" mean a 25-40% gain there too. And that's before they make improvements to their drivers...
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15883 - Posted: 21 Mar 2010 | 17:26:41 UTC

Just a couple more days :)

I suppose the performance in games will be approximately what they're currently showing.. but the architecture itself should be capable of much more. Just consider GT200: initially it was just as fast as the compute capability 1.1 cards at GPU-Grid (at similar theoretical FLOPS). After some time we got an improvement of 40%, at the beginning of the year we got 30 - 40% more and now it's another 30 - 40% faster.
I'd expect something approximately similar with Fermi (the initial performance could be higher).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15884 - Posted: 21 Mar 2010 | 18:04:50 UTC - in response to Message 15883.
Last modified: 21 Mar 2010 | 18:28:21 UTC

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.


If Nvidia concentrate on Compute with Fermi & redo Mainstream/Workstation separately - they're already ahead with Compute...

I'm also "curious" to when a GPU can do without the CPU. Is an Nvidia a RISC or a CISC? Would Linux be able to run a GPU only PC?
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15885 - Posted: 21 Mar 2010 | 18:07:24 UTC - in response to Message 15883.
Last modified: 21 Mar 2010 | 18:16:52 UTC

February's increase was 60% for CC1.3 cards.
I expect Fermi to be CC1.4 (or some new compute capability scheme to be started). My concern would be that by moving to CC1.4, people with CC1.3 and lesser cards will eventually start seeing task failures, as has recently happened following the application updates; several people report only being able to crunch successfully using the older application. Mind you, it will be a while before any new apps are written for Fermi. I think the Fermi driver is still beta, and no doubt there will be a few revisions there!

I am also concerned that these new cards may not actually be especially good at crunching on GPUGrid. The first GTX 200 cards used 65nm cores and were much more error-prone. The cores then shrank to 55nm and the GT200 was revised to B1 and so on. Perhaps these Fermi cards will follow that same path - so there could be lots of task errors with these early cards, and the cards may not really start to perform until there have been several revisions, perhaps not until some 28nm version makes it to the shops, and that is a long way off.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15886 - Posted: 21 Mar 2010 | 20:02:57 UTC
Last modified: 21 Mar 2010 | 20:13:13 UTC

I finally found what I was looking for! http://www.lucidlogix.com/product-adventure2000.html One of these babies can put that nasty Fermi outside the case & give me 2 PCIe x16 or 4 PCIe x8 for every one PCIe slot on my mobo. If a cheap Atom is enough, one of these on the mobo would make it possible to use 2-4 high-end CUDA GPUs to play crunchbox: http://www.lucidlogix.com/products_hydra200.html But I don't make PCs, I buy them. So maybe if the price ain't bad, & I can find out where I can get an Adventure 2000, I'd be able to run an external multi-Fermi GPU box...

Maybe if VIA supplies the CPU, Nvidia can do the rest, with or w/o Lucid.
____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15892 - Posted: 21 Mar 2010 | 22:12:48 UTC

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...


They do: more memory, stronger cooling, more testing and a higher price. Otherwise the chip is just the same; anything else would be ridiculously expensive now and in the coming couple of years.

Developing a GPU architecture is so expensive that you don't just redesign it for more compute or texture power. What you can do is create a modular design, where you can easily choose the number of shader clusters, memory controllers etc. And that's just what they've been doing for years, and that's how the mainstream chips will be built: from the same fundamental blocks, just fewer of them. And since each shader cluster contains the shaders (compute, and shaders in games) as well as the texture and other fixed-function units (gaming only), there won't be any shift in performance between "compute" and "gaming".

Would Linux be able to run a GPU-only PC?


That would probably require starting from scratch and would end up being quite slow. Not impossible, though ;)

@SK: compatibility is going to be a problem for GPUs in the long run. Currently they're still changing so much, so quickly, that supporting old hardware is going to become a pain. Ideally the driver should take care of this - but just how long are nVidia (and ATI) going to support old cards which are not sold any more? Drivers are not perfect and need expensive debugging as well. Which, btw, is the cause of the "current" problems with initial GT200 chips: a driver bug which nVidia can't or doesn't want to fix. This should calm your concerns regarding the first Fermis, though: all nVidia has to do is get the driver done properly..

MrS
____________
Scanning for our furry friends since Jan 2002

M J Harvey
Send message
Joined: 12 Feb 07
Posts: 9
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 15893 - Posted: 22 Mar 2010 | 0:06:15 UTC - in response to Message 15886.

liveonc

I finally found what I was looking for!


http://www.colfax-intl.com/ms_tesla.asp?M=102 would involve less faffing about with sheet metal. Ain't cheap, though.

MJH

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15898 - Posted: 22 Mar 2010 | 5:15:57 UTC - in response to Message 15893.
Last modified: 22 Mar 2010 | 5:36:51 UTC

Too damn rich for me! Besides, it runs Workstation GPUs. I'm thinking about cost reduction. I bet that I could get an i7 mobo with 3 PCIe x16 running 3 Lucid Adventure 2000 boxes running 2-4 GTX295 each (depending on if you want 2 x16 PCIe or 4 x8 PCIe), costing less than a single Colfax CXT8000. That would be 1 CPU & 12-24 GPUs vs 2 CPUs & 8 GPUs.

It's all about "affordable" Supercomputing, for even the most unpopular, ill-funded, mad scientist out there (or helping one out)...
____________

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15900 - Posted: 22 Mar 2010 | 10:45:16 UTC - in response to Message 15898.

bet that I could get an i7 mobo with 3 PCIe x16 running 3 Lucid Adventure 2000 boxes running 2-4 GTX295 each (depending on if you want 2 x16 PCIe or 4 x8 PCIe), costing less than a single Colfax CXT8000.


If you can, do let us know! I've not seen the Adventure board productised anywhere...

MJH

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15903 - Posted: 22 Mar 2010 | 10:56:13 UTC - in response to Message 15900.

I haven't either yet, but for now there is this: http://eu.msi.com/index.php?func=proddesc&maincat_no=1&cat2_no=170&prod_no=1979 It isn't what I had in mind, but it's there...

Powered by Hydra Engine, Big Bang Fuzion can offer the most flexible upgradability in 3D performance, allowing users to install cross-vendor GPUs in a single system. The technology can perform scalable rendering to deliver near-linear gaming performance by load-balancing graphics processing tasks.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15904 - Posted: 22 Mar 2010 | 10:56:29 UTC - in response to Message 15898.



The above ships with 4 Teslas and costs as much as a new car.

You would be better off just buying a good PSU and getting a couple of Fermi cards (or a new system, or two, or three...).
Two GTX480 cards will probably do as much work as 4 or 5 GTX295s, i.e. more than 4 Teslas!

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15905 - Posted: 22 Mar 2010 | 11:33:20 UTC - in response to Message 15904.
Last modified: 22 Mar 2010 | 12:01:31 UTC



The Adventure is a PCIe expansion board based on a derivative of the HYDRA 100 ASIC (LT12102). The system allows easy, high speed connection to any PC or Workstation platform using the PCIe bus. The system is optimized for GPUs and provides superb connectivity to different PCIe devices such as SSDs, HDs and other peripherals. The board has four PCIe slots and is powered by a standard PC power supply. The Adventure is best for your multi displays, broadcasting, digital signage and storage solution. The Adventure board is typically deployed within a 4U rack mount case together with a standard power supply and up to four slots.

That's why I'm interested in a Lucid Adventure! A 1000W PSU & x2 GTX480 for every PCIe slot I've got in my PC. GPUGRID runs WUs that use 1 core per GPU; that would require a Core i9, then we're crunching!
____________

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 15906 - Posted: 22 Mar 2010 | 11:43:00 UTC - in response to Message 15904.

It's actually a rebranded Tyan FT72B7015 http://www.tyan.com/product_SKU_spec.aspx?ProductType=BB&pid=412&SKU=600000150. Difficult to source, though.

For you keen GPUGRID crunchers, it might be better to find the minimum-cost host system for a GPU. CPU performance isn't an issue for our apps, so a very cheap motherboard-processor combination would do, perhaps in a wee box like this one: http://www.scan.co.uk/Product.aspx?WebProductId=982116

MJH

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15907 - Posted: 22 Mar 2010 | 12:21:09 UTC - in response to Message 15884.

I "guess", that it won't just be Workstation & Mainstream GPU's, Compute deserves it's own line. Not now, but soon...

And if ATI tried to build a Fermi-1-killer (i.e. be significantly faster, not "just" more economic at a comparable performance level) they'd run into the same problems Fermi faces. They'd have more experience with the 40 nm process, but they couldn't avoid the cost / heat / clock speed problems. The trick is not to try to build the biggest chip.


If Nvidia concentrate on Compute with Fermi & redo Mainstream/Workstation separately - they're already ahead with Compute...

I'm also "curious" to when a GPU can do without the CPU. Is an Nvidia a RISC or a CISC? Would Linux be able to run a GPU only PC?


Is the Compute market big enough for Nvidia to earn enough on it without a very large reduction in their sales?

If I understand the GPU architectures correctly, they do NOT include the capability of reaching memory or peripherals beyond the graphics board. Therefore, on their own they could not reach any BOINC projects.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15909 - Posted: 22 Mar 2010 | 12:33:47 UTC - in response to Message 15886.

I finally found what I was looking for! http://www.lucidlogix.com/product-adventure2000.html One of these babies can put that nasty Fermi outside the case & give me 2 PCIe x16 or 4 PCIe x8 for every one PCIe slot on my mobo. If a cheap Atom is enough, one of these on the mobo would make it possible to use 2-4 high-end CUDA GPUs to play crunchbox: http://www.lucidlogix.com/products_hydra200.html But I don't make PCs, I buy them. So maybe if the price ain't bad, & I can find out where I can get an Adventure 2000, I'd be able to run an external multi-Fermi GPU box...

Maybe if VIA supplies the CPU, Nvidia can do the rest, with or w/o Lucid.


Looks like a good idea, if GPUGRID decides to rewrite their application to require less communication between the CPU and the GPU.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15911 - Posted: 22 Mar 2010 | 13:50:57 UTC - in response to Message 15907.
Last modified: 22 Mar 2010 | 14:45:17 UTC

Are you 100% sure about this? Lucid claims that:

The Adventure is a powerful PCIe Gen2.0 expansion board for heavy graphic computing environment. The platform allows connection of multiple PCIe based devices to a standard PC or server rack. The Adventure 2000 series is based on the different derivative of the HYDRA 200 series targeting a wide range of performance market segments. It provides a solution for wide range of applications such as gaming (driver required), GPGPU, high performance computing, mass storage, multi-display, digital and medical imaging.

Adventure 2500
The Adventure 2500 platform is based on Lucid’s LT24102 SoC, connecting up to 4 heterogeneous PCIe devices to a standard PC or Workstation using standard PCIe extension cable. The Adventure 2500 overcomes the limitations of hosting high-end PCIe devices inside their Workstation/PC systems. Those limitations are usually associated with space, power, number of PCIe ports and cooling.
The solution is seamless to the application and GPU vendor in order to meet the needs of various computing and storage markets.


That said, what would be required is:

x1 Lucid Adventure 2000 http://www.lucidlogix.com/product-adventure2000.html

x1 4U rack mount

x1 850W-1000W PSU

x2 GTX480

With that in place, you're supposed to just connect the thing to one of your PCIe slots within your PC. That's "maybe" $2000 a pop...

I'm also thinking that if x8 PCIe 2.0 doesn't affect performance, x4 GTX480 & maybe a 1500W PSU might be possible for "maybe" $3000 a pop...
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15917 - Posted: 22 Mar 2010 | 15:26:22 UTC - in response to Message 15905.
Last modified: 22 Mar 2010 | 15:27:25 UTC

GPUGRID runs WUs that use 1 core per GPU; that would require a Core i9, then we're crunching!


That’s not the case:
GPUGrid tasks do not require the CPU component of a process to be explicitly associated with one CPU core (or logical core in the case of the i7). So a low-cost single-core CPU could support two (present) high-end GPUs!

As quad-core i7s use Hyperthreading, they have 8 logical cores. So even if GPUGrid tasks were each tied to one core, an i7 could support 8 GPUs! Remember, the faster the CPU, the less CPU time required!
I normally crunch CPU tasks on my i7-920, leaving one logical core free for my two GT240s. Overall my system only uses about 91% of the CPU, so bigger cards would be fine!

As my two GT240s (equivalent to one GTX260 sp216) only use 3.5% of my CPU, a GTX 295 would use about 7% and two GTX 295s would use about 14%.
Therefore two GTX480s would use about 33% - so an i7-920 could support 6 Fermi cards crunching on GPUGrid!
Most people would be better off with a highly clocked dual-core CPU (3.33GHz) than, say, a Q6600 (at only 2.4GHz), or they could just overclock it and leave a core or two free.
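
Putting that estimate in one place (a rough sketch: the 3.5% figure is measured above, but the proportional-scaling assumption and the GTX480-vs-GTX260 factor are guesses):

#include <cstdio>

int main() {
    const double pct_per_gtx260 = 3.5;  // two GT240s ~ one GTX 260 sp216, measured above
    const double gtx480_vs_260  = 4.7;  // assumed: GTX 480 ~ 2.35x a GTX 295 ~ 4.7x a GTX 260

    // assume CPU overhead scales roughly with GPU throughput
    double per_card = pct_per_gtx260 * gtx480_vs_260;   // ~16.5% of the CPU per GTX 480
    printf("one GTX 480 ~ %.1f%% of an i7-920 -> about %d cards before the CPU is saturated\n",
           per_card, (int)(100.0 / per_card));
    return 0;
}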

PS. There is no i9; Intel ended up calling it the i7-980X. But fortunately you don’t need it, as it costs £855. Better to build a dual GTX470 based system for that.

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15918 - Posted: 22 Mar 2010 | 15:38:28 UTC - in response to Message 15917.
Last modified: 22 Mar 2010 | 15:47:36 UTC

But can you answer whether it's robertmiles or Lucid who's right? Is it possible to build an "affordable" supercomputer consisting of 4x 4U rack mounts - one for a single- or dual-CPU i7 setup & 3 of them holding 2-4 GTX GPUs each? If my "guess" is right, such a system would cost $8,000-12,000, & nobody says that you have to use it for crunching GPUGRID.net projects...

Do you know if there's still prize money going to whoever finds the largest prime number? If so, could CUDA help find it, assuming that "affordable" supercomputer can actually be put together?
____________

Snow Crash
Send message
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15919 - Posted: 22 Mar 2010 | 16:01:44 UTC - in response to Message 15918.

On the list of things to investigate to get your supercomputer project off the ground, I would like to suggest that you look into how many GPUs in one system the NVidia drivers will support properly. Are there OS limits? How about motherboard BIOS limits?

Talk to the people with multiple GTX295 cards and you will see they had to do unconventional things regarding drivers and BIOS.

skgiven ... I think that the current Linux version of the GPUGrid app is using up a full CPU core per GPU. That's probably the single biggest reason I have not tried Linux yet. I have seen some fast runtimes, which always interest me, but I am just not willing to take that much away from WCG.
____________
Thanks - Steve

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15920 - Posted: 22 Mar 2010 | 16:34:46 UTC - in response to Message 15919.
Last modified: 22 Mar 2010 | 16:36:43 UTC

It's still pretty raw, but this was an article I found from PC Perspective: http://www.pcper.com/article.php?aid=815&type=expert&pid=1

They showed & tested the Hydra 200 & the Adventure. There was a mention of folding@home.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15923 - Posted: 22 Mar 2010 | 20:14:11 UTC - in response to Message 15919.

Snow Crash, you're right. 6.04 is still running at over 90% CPU time on Linux, but it is saving 50min per task - which means 5h rather than 5h 50min, or about 16.5% better in terms of results/points.
My GTX 260 brings back about 24500 points per day, so under Linux I would get about 28500 per day (~4000 more). The quad-core CPU on my system only gets around 1300 BOINC points (my GTX 260 does almost 20 times the work). So using Linux I would get about 2600 more points per day on that one system, assuming I could not run any other CPU tasks.
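
For the record, the arithmetic behind those figures (numbers as quoted above, so treat the output as an estimate):

#include <cstdio>

int main() {
    const double win_hours   = 5.0 + 50.0 / 60.0;  // 5 h 50 min per task on Windows
    const double linux_hours = 5.0;                // 5 h per task on Linux
    const double gpu_ppd     = 24500;              // GTX 260 points per day now
    const double cpu_ppd     = 1300;               // quad-core CPU points given up

    double gain = win_hours / linux_hours - 1.0;   // throughput gain, ~16.7%
    double net  = gpu_ppd * gain - cpu_ppd;        // net daily gain if the CPU tasks stop
    printf("throughput gain %.1f%%, net gain ~%.0f points/day\n", gain * 100.0, net);
    return 0;
}

That lands close to the ~4000 and ~2600 figures above, give or take rounding.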

Liveonc, I would say that you would really have to talk to someone who has one to know what it can do - if the Lucid Hydra is even available yet? You would also, as Snow Crash said, need to look into the GPU drivers, other software (the application), and especially the motherboard's limitations. You would also need to be very sure what you want it for: crunching, gaming, rendering or display.

I guess that a direct PCIe cable would allow the whole GPU box to function 'basically' like a single GPU device, and the Hydra is essentially an unmanaged GPU Layer 1&2 switch. The techs here might be able to tell you if it could at least theoretically work.
Although I design and build bespoke systems and servers, this is all new and rather expensive kit. It is a bit of a niche technology, so it will be expensive and of limited use. That said, if it could be used for GPUGrid, I am sure the techs would be very interested in it, as it would allow them to run their own experiments internally and develop new research techniques.

For most, one Fermi in a system will be plenty! Two will be for the real hard-core enthusiast or gamer. For the rare motherboards that might actually support 3 Fermis, you really are looking at a 1200W PSU (or 2 PSUs) in a very well ventilated tower system (£2000+).

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15924 - Posted: 22 Mar 2010 | 20:50:03 UTC - in response to Message 15923.
Last modified: 22 Mar 2010 | 21:02:15 UTC

Sorry for asking too many questions that can't really be answered unless you actually have the thing. The possible potential of the Hydra just fascinates me - and not just for use with GPUGRID.net; I myself "sometimes" like to play games on my PC. The fact that most of the GPUs used on GPUGRID.net are mainstream GPUs, & that Hydra allows mixing Nvidia with ATI cards, and Nvidia & ATI GPUs of different types, brought to mind someone in another thread who looked at a GTX275 with a GTS250 physics card. Hydra "might" allow mixing different GPU chips on the same card, and "might" also enable dual-GPU-chip cards with, e.g., a Cypress & a Fermi, instead of putting the Hydra on the mainboard or going external.

Also, Nvidia abandoned the idea of Hybrid SLI GeForce® Boost & decided to go with NVIDIA Optimus instead. If notebooks had a Hydra, they "might" be able to use both integrated & discrete graphics, instead of just switching between the two.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15926 - Posted: 22 Mar 2010 | 21:50:24 UTC - in response to Message 15924.
Last modified: 22 Mar 2010 | 21:53:55 UTC

Hydra is very interesting - it has unknown potential ;) I am sure many event organisers (DJs) would love to see it in a laptop, or in a box that could be attached to a laptop, for performances! It could even become a must-have for ATI and NVidia alike when demonstrating new GPUs to audiences.


Back to Fermi.

I was thinking about the difference between DDR3 and GDDR5. In itself it should mean about a 20 to 25% difference. So with improved Fermi architecture (20%) and slightly better clocks (~10%) I think the GTX480 will be at least 2.6 times as fast as a GTX295 but perhaps 2.8 times as fast.

I think they are keeping any chips with 512 working shaders aside, for monster dual cards. Although the clocks are not likely to be so high, a card that could do 4.5 times the work of a GTX295 would be a turn up for the books.

Should any lesser versions of Fermi turn up with DDR3, for whatever reason, avoid them at all costs!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15927 - Posted: 22 Mar 2010 | 22:36:53 UTC

Hydra: it's not there yet for games; they still have to sort out the software side. Load-balancing similar GPUs is difficult enough (see SLI and Crossfire), but with very different GPUs it becomes even more challenging. They'd also have to teach their software how crunching with Hydra should work.

Back to Fermi: there won't be any DDR3 GF100 chips - that would be totally insane. Really bad publicity for what would still be a very expensive board, which almost nobody would buy. Manufacturers play these tricks a lot with low-end cards (which are mostly bought by people who don't know / care about performance), and sometimes mid-range cards.

I was thinking about the difference between DDR3 and GDDR5. In itself it should mean about a 20 to 25% difference.


No, it doesn't. If you take a well-balanced chip (anything else doesn't get out of the door in the mid- to high-end range anyway) and increase its raw power by a factor of 2, you'll get anything between 0 and 100% as a speedup, depending on the application. In games probably more like 30 - 70%.
If you double raw power and double memory bandwidth (and keep relative latency constant) then you'll see a 100% speedup across all applications. The point is: higher memory speed doesn't make you faster by definition, because the memory bandwidth requirements scale with performance.
So if, for example, Fermi got 3 times as much raw crunching power as GT200 and only 2 times the bandwidth, that alone is not going to speed things up threefold, regardless of the memory being GDDR5 or whatever.

However, Fermi is more than just an increase in raw power: there's also the new caching system, which should alleviate the need for memory bandwidth somewhat, but doesn't change the fundamental issue.
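
A toy model makes the point (just a sketch: it assumes the compute-limited and bandwidth-limited parts of a task don't overlap, which real kernels and the new caches will blur):

#include <cstdio>

// Split the runtime into a compute-limited and a bandwidth-limited share,
// then scale each share by how much that resource improved.
double speedup(double compute_fraction, double compute_ratio, double bandwidth_ratio) {
    double memory_fraction = 1.0 - compute_fraction;
    return 1.0 / (compute_fraction / compute_ratio + memory_fraction / bandwidth_ratio);
}

int main() {
    // hypothetical Fermi-vs-GT200 ratios from the example above: 3x FLOPS, 2x bandwidth
    const double fractions[] = {1.0, 0.8, 0.5, 0.0};
    for (double f : fractions) {
        printf("%3.0f%% compute-limited -> %.2fx faster\n",
               f * 100.0, speedup(f, 3.0, 2.0));
    }
    return 0;
}

In other words anything from 2x to 3x, depending on how bandwidth-hungry the application is - and never the full 3x unless it hardly touches memory.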

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15928 - Posted: 23 Mar 2010 | 1:54:38 UTC - in response to Message 15926.
Last modified: 23 Mar 2010 | 1:59:42 UTC

I think the GTX480 will be at least 2.6 times as fast as a GTX295 but perhaps 2.8 times as fast.

I think they are keeping any chips with 512 working shaders aside, for monster dual cards. Although the clocks are not likely to be so high, a card that could do 4.5 times the work of a GTX295 would be a turn up for the books.


Where did you get that from? AFAIK from various forums, here are some "facts":
- GTX480 and GTX470 are faster than the GTX285 by 40% and 25% respectively
- GTX480 is weaker than the 5870 in Vantage
- OCing is really poor and at the same time makes the card "critically hot"
- VERY noisy
- if there are enough good 512-core chips, they will be for Tesla only
- Fermi2 will be no earlier than summer 2011, most probably fall 2011, even if they start redesigning today
- there are plans for a GTX495. BUT it will be based on the 275 chip (retail name GTX470, with 448 cores), not on the 375 (GTX480, 480 cores). Another problem is PCIe certification (300W), which will be really hard to meet. Power consumption is going to be "mind-blowing".

Sure, let's wait a week and see if all this is true or not
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 484
Credit: 554,588,959
RAC: 1,045
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15929 - Posted: 23 Mar 2010 | 7:07:24 UTC - in response to Message 15911.
Last modified: 23 Mar 2010 | 7:18:31 UTC

Are you 100% sure about this? Lucid claims that:

The Adventure is a powerful PCIe Gen2.0 expansion board for heavy graphic computing environment. The platform allows connection of multiple PCIe based devices to a standard PC or server rack.The Adventure 2000 series is based on the different derivative of the HYDRA 200 series targeting a wide range of performance market segments. It provides a solution for wide range of applications such as gaming (driver required), GPGPU, high performance computing, mass storage, multi-display, digital and medical imaging.

Adventure 2500
The Adventure 2500 platform is based on Lucid’s LT24102 SoC, connecting up to 4 heterogeneous PCIe devices to a standard PC or workstation using a standard PCIe extension cable. The Adventure 2500 overcomes the limitations of hosting high-end PCIe devices inside Workstation/PC systems. Those limitations are usually associated with space, power, number of PCIe ports and cooling.
The solution is seamless to the application and GPU vendor in order to meet the needs of various computing and storage markets.


That said, what would be required is:

x1 Lucid Adventure 2000 http://www.lucidlogix.com/product-adventure2000.html

x1 4U rack mount

x1 850W-1000W PSU
x2 GTX480

With that in place, you're supposed to just connect the thing to one of your PCIe slots within your PC. That's "maybe" $2000 a pop...

I'm also thinking that, if x8 PCIe 2.0 doesn't affect performance, x4 GTX480s and maybe a 1500W PSU might be possible for "maybe" $3000 a pop...


Is who 100% sure about what?

Looks like the GPUGRID project scientists and programmers need to say more about just how fast the CPU-GPU communication needs to be, and how powerful a CPU is needed to keep up with two or four GTX480s.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1925
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 15930 - Posted: 23 Mar 2010 | 8:43:29 UTC - in response to Message 15929.



Are you 100% sure about this? Lucid claims that:


Looks like the GPUGRID project scientists and programmers need to say more about just how fast the CPU-GPU communication needs to be, and how powerful a CPU is needed to keep up with two or four GTX480s.


It does not matter at all. The code always runs entirely on the GPU, apart from I/O.
gdf

Profile liveonc
Avatar
Send message
Joined: 1 Jan 10
Posts: 292
Credit: 41,567,650
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwat
Message 15932 - Posted: 23 Mar 2010 | 8:56:15 UTC - in response to Message 15930.
Last modified: 23 Mar 2010 | 9:07:50 UTC

Well, I "guess" that maybe after Lucid sorts itself out, it "might" just be the One Chip to rule them all, One Chip to find them, One Chip to bring them all and in the darkness bind them In the Land of Mordor where the Shadows lie. ;-) With that said, I hope the Orcs get it before the Hobits destroys it. BTW, I read somewhere that Intel has a stake in Lucid.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15935 - Posted: 23 Mar 2010 | 11:12:06 UTC - in response to Message 15932.

I bet it's a pointy stake with a lawyer behind it.
ATI and NVidia will hardly be bending over backwards to make their kit work with Hydra then!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15948 - Posted: 23 Mar 2010 | 21:53:33 UTC - in response to Message 15928.

AFAIK, from various forums, here are some "facts":
- the GTX480 and GTX470 are faster than the GTX285 by 40% and 25% respectively
...
- if there are enough good 512-core chips at all, they will go to Tesla only


I second that: reserving fully functional chips for ultra expensive Teslas is a smart move (if you don't have many of them ;)

Regarding the performance estimate: I don't doubt someone measured this with some immature hardware and driver, but I don't think it's going to be the final result. Consider this:

When ATI went from DX10 to DX11 they required 1.10 times as many transistors per shader (2200/1600 : 1000/800) and got 11% less performance per shader per clock (see 4870 1 GB vs. 5830).

If the 40% for GTX480 versus GTX285 were true, that would mean nVidia spent 1.07 times as many transistors per shader (3200/512 : 1400/240) and got only 0.74 times as much performance per shader per clock [140%/(480*1400) : 100%/(240*1476)] - i.e. about 26% less, or put the other way round, GT200 would be doing ~35% more per shader and clock. nVidia's strength so far has been design, so I don't think they'd screw this one up so badly, especially since they already screwed up manufacturing and dimensioning.
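
Spelling that last step out with the leaked clocks (240 shaders at 1476 MHz for the GTX285, 480 shaders at ~1400 MHz for the GTX480; the figures are only as good as the leaks):
100% / (240 x 1476) = 2.82e-4 relative performance per shader-MHz for the GTX285
140% / (480 x 1400) = 2.08e-4 relative performance per shader-MHz for a GTX480 that is 40% faster overall
2.08e-4 / 2.82e-4 = 0.74, i.e. roughly a quarter less work per shader per clock.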

MrS
____________
Scanning for our furry friends since Jan 2002

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15957 - Posted: 24 Mar 2010 | 20:22:14 UTC - in response to Message 15948.


Regarding the performance estimate: I don't doubt someone measured this with some immature hardware and driver, but I don't think it's going to be the final result.
MrS


I understand what you are talking about. But remember: this time nVidia changed the "dispatching scheme" and is now sending out cards and drivers directly itself, not through the board manufacturers as before. In fact, nVidia has already sent cards and the latest driver to some reporters close to them, and these results came from one of them.

Frankly, I personally expected - and nVidia promised - much better performance and frequencies, so if this is true I am going to stay with my GTX275; maybe I'll try to get a second one for cheap. But no Fermi for me until Fermi 2 is available, sorry…


____________

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2690
Credit: 1,254,799,048
RAC: 385,488
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15962 - Posted: 24 Mar 2010 | 21:05:10 UTC - in response to Message 15957.

Patience is a virtue ;)

Generally I'm not very keen on advising people to spend their money on something which is a hobby for us, after all. BOINC is meant to use spare computation time, not to make people invest a fortune into new hardware ;)
However, if someone wants to buy I'll gladly try to tell them what makes sense from my point of view. Which occasionally is "don't buy".

MrS
____________
Scanning for our furry friends since Jan 2002

MarkJ
Volunteer moderator
Project tester
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 732
Credit: 198,597,349
RAC: 192
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 15970 - Posted: 25 Mar 2010 | 11:59:33 UTC

Leaked specs and pricing from Tom's Hardware:

GeForce GTX 480 : 480 SP, 700/1401/1848MHz core/shader/mem, 384-bit, 1536MB, 295W TDP, US$499

GeForce GTX 470 : 448 SP, 607/1215/1674MHz core/shader/mem, 320-bit, 1280MB, 225W TDP, US$349

Looks to me like the GTX470 is the better bang for the buck. Presumably the Teslas get the fully functional chips (i.e. the full 512 SP), but no idea on pricing for them.
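
A quick back-of-the-envelope check of that, based purely on the leaked numbers above: the GTX480 works out to $499 / 480 SP = roughly $1.04 per shader, the GTX470 to $349 / 448 SP = roughly $0.78 per shader. Even weighting by shader clock (480 x 1401 MHz versus 448 x 1215 MHz), the GTX470 still comes out around 14% cheaper per unit of theoretical shader throughput.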

There is some mention of a GF104 chip coming out in Q2 or Q3 2010 that is meant to be a lower-cost (and presumably lower-specced) version of the GF100 chip. It may be worthwhile waiting for these.
____________
BOINC blog

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 15971 - Posted: 25 Mar 2010 | 12:28:09 UTC - in response to Message 15962.

ExtraTerrestrial Apes,

Patience is a virtue ;)

Generally I'm not very keen on advising people to spend their money on something which is a hobby for us, after all. BOINC is meant to use spare computation time, not to make people invest a fortune into new hardware ;)
However, if someone wants to buy I'll gladly try to tell them what makes sense from my point of view. Which occasionally is "don't buy".

MrS

100% agree, man :-) The only thing is, I built my rigs for BOINC :-) Otherwise I wouldn't need such powerful PCs :-)

MarkJ
agree about the GTX470. It's just a little bit less powerful, but pricewise it's OK.

But personally (if Fermi turns out to be really good) I will wait for the GTX495 to appear sometime in Q2 or early Q3 this year.
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16040 - Posted: 29 Mar 2010 | 8:57:54 UTC - in response to Message 15971.

Name p14-IBUCH_chall_pYEEI_100301-26-40-RND0782_0
Workunit 1302269
Created 29 Mar 2010 1:18:10 UTC
Sent 29 Mar 2010 1:18:53 UTC
Received 29 Mar 2010 7:17:30 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 67855
Report deadline 3 Apr 2010 1:18:53 UTC
Run time 19101.678711
CPU time 4336.531
stderr out

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
# Time per step: 30.239 ms
# Approximate elapsed time for entire WU: 18899.089 s
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 3977.21064814815
Granted credit 5965.81597222223
application version Full-atom molecular dynamics v6.71 (cuda23)

MarkJ
Volunteer moderator
Project tester
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 732
Credit: 198,597,349
RAC: 192
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16042 - Posted: 29 Mar 2010 | 10:32:23 UTC - in response to Message 16040.

Name p14-IBUCH_chall_pYEEI_100301-26-40-RND0782_0
Workunit 1302269
...
# Device 0: "GeForce GTX 480"
# Clock rate: 0.81 GHz
# Total amount of global memory: 1576468480 bytes
# Number of multiprocessors: 15
# Number of cores: 120
...
# Time per step: 30.239 ms
# Approximate elapsed time for entire WU: 18899.089 s
...
application version Full-atom molecular dynamics v6.71 (cuda23)

Where are the 480 CUDA cores?
____________
BOINC blog

TomaszPawel
Send message
Joined: 18 Aug 08
Posts: 121
Credit: 59,836,411
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16043 - Posted: 29 Mar 2010 | 10:35:44 UTC - in response to Message 16042.

OMG LOL
____________
POLISH NATIONAL TEAM - Join! Crunch! Win!

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1003
Credit: 2,650,332,431
RAC: 2,817,349
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16045 - Posted: 29 Mar 2010 | 10:48:11 UTC - in response to Message 16043.

OMG LOL

And it's the only task they've got to run at all: All tasks for host 67855.

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16046 - Posted: 29 Mar 2010 | 10:57:43 UTC - in response to Message 16045.
Last modified: 29 Mar 2010 | 11:30:12 UTC

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!

Either the app can't see all the cores, or it is just not reporting them.

MarkJ
Volunteer moderator
Project tester
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 732
Credit: 198,597,349
RAC: 192
Level
Ile
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16048 - Posted: 29 Mar 2010 | 11:42:10 UTC - in response to Message 16046.

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!

Either the app can't see all the cores, or it is just not reporting them.


You have the 197.33 driver - and what version DLLs did you use? Perhaps the developers need to compile a CUDA 3.0 app for you to see all 480 CUDA cores. But still, if it only used a quarter of them and managed 30 ms/step, that's good.
____________
BOINC blog

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 16049 - Posted: 29 Mar 2010 | 11:42:28 UTC - in response to Message 16042.

Where are the 480 CUDA cores?


It's just a reporting artifact. The device query code assumes 8 cores per SM (15 x 8 = 120). Fermi SMs actually have 32 (15 x 32 = 480).
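
For the curious, here is a minimal stand-alone sketch of that effect. This is not the actual application's device-query code, and the coresPerSM helper below is just an illustrative assumption; it only shows why a query that hard-codes 8 cores per multiprocessor prints 120 on a 15-SM Fermi rather than 480.

#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: CUDA cores per multiprocessor by compute capability.
// CC 1.x parts (G80/GT200) have 8 cores per SM; CC 2.0 (GF100/Fermi) has 32.
static int coresPerSM(int major)
{
    return (major >= 2) ? 32 : 8;
}

int main()
{
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);

    // What a query that assumes 8 cores/SM reports: 15 * 8 = 120 on a GTX480.
    int oldCount = prop.multiProcessorCount * 8;
    // What you get once compute capability is taken into account: 15 * 32 = 480.
    int newCount = prop.multiProcessorCount * coresPerSM(prop.major);

    printf("Device 0: \"%s\" (CC %d.%d)\n", prop.name, prop.major, prop.minor);
    printf("Multiprocessors: %d\n", prop.multiProcessorCount);
    printf("Cores assuming 8 per SM: %d\n", oldCount);
    printf("Cores using cores-per-SM by CC: %d\n", newCount);
    return 0;
}

Built with nvcc, this would print 120 under the old assumption and 480 for a 15-SM CC 2.0 card.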

MJH

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 16050 - Posted: 29 Mar 2010 | 11:44:17 UTC - in response to Message 16046.

It cannot run ACEMD V2 tasks because of the way that app was compiled – for G200 cards, not Fermi cards!


That's right. We need to rebuild the ACEMD V2 and beta apps to include kernel code compiled for Fermi. The older app will work because it's built in a different way.

MJH

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 16052 - Posted: 29 Mar 2010 | 12:17:25 UTC - in response to Message 16050.
Last modified: 29 Mar 2010 | 12:53:53 UTC

After comparing the IBUCH task that ran on the GTX480 to a similar task that ran on a GTX275 (240 shaders), it really looks like the GTX480 was only running on 120 shaders. That would mean the GTX480 is about 9% faster than the GTX275 while only using 120 shaders! (Can anyone confirm?)

So, if it could use all 480 shaders it could be 4.36 times as fast as a GTX275. This makes more sense than a pure reporting error, given the card's architecture, and would tie in well with other CUDA findings, such as Folding@home (non-BOINC).

http://www.gpugrid.net/workunit.php?wuid=1298681

p14-IBUCH_chall_pYEEI_100301-26-40-RND0782 completed in 19,101sec (5h18min) claimed credit was 3,977, granted credit was 5,965.

p20-IBUCH_5Ans_pYEEI_100216-35-40-RND4473 completed on a GTX275 in 20868sec (5h48min) claimed 3,977 credit and was granted 5,965 credit.
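
Working that through (and assuming the 120-shader reading is real, the two tasks are comparable, and performance scales roughly linearly with shader count): 20,868 s / 19,101 s = about 1.09, so roughly 9% faster than the GTX275. If only a quarter of the 480 shaders were actually in use, scaling to all of them suggests around 4 x 1.09 = 4.37 times a GTX275, which is where the figure above comes from.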

BOINC (6.10.18) is not reading the card correctly, so some problems stem from that.

By the way, it is CC 2.0.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 16060 - Posted: 29 Mar 2010 | 16:52:19 UTC

skgiven,
not a bad result with the GTX480. I'm curious to see results for the recompiled apps. If it really is that fast on 480 cores, I'm ready to change my mind and get a GTX470 :-)
____________

Profile skgiven
Volunteer moderator
Project tester
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat