GPU variants, performance envelopes, and being an informed consumer

Overall, performance is only as good as the weakest link in the chain. So if you skimp on the GPU and get a great CPU, the GPU becomes the bottleneck.


There is a reason the high-end cards cost so much. They provide more. Really.


Understanding how the variants differ is key.


The IHVs (independent hardware vendors) make 4 lines of parts:


1) Super high-end for the enthusiast at super high prices

2) High-end at high prices

3) Mid-range at mid prices

4) Low-end at low prices.


For nVidia in the 8 series that translates to:

1) 8800 Ultra, the overclocked 768MB monster

2) 8800 GTX/GTS 512MB, 8800 GT (2nd gen)




The manufacturing process is not perfect, and there are defects in almost every wafer burnt by the FAB. Where in the wafer the defects are is critical. The IHVs have come up with a clever way to avoid having to throw as many chips out due to defects.


One design approach lays the chip components out in quadrants, with the shader pipes and stream processors arrayed around a central memory controller. With this approach there is a single die design for all variants in the "family".


When the wafers come out of the FAB, the parts are speed binned. By that I mean they are tested at the maximum rated clock to see if the part works. If it doesn’t work at that clock, they reduce the clock until it does. If it does not work at all, the chip is discarded. If it does work, the clock at which it passes determines what variant the part can be sold as.


If the part doesn’t speed bin out for Ultra, they try it for High. Typically High has the same number of shader units and stream processors as Ultra and the same memory width, but with lower clocks (so less memory bandwidth) and less memory.


If it doesn’t speed bin out for High, they turn 1 quadrant off (now 3/4 of the top end) and try to bin it as a Mid-range. That part has fewer shader units and stream processors, runs at a lower clock, and has less memory and a narrower memory bus, which means less memory bandwidth.


If that doesn’t work, they turn 2 quadrants off (now 1/2 of the top end) and try at Low-end. That part has fewer shader units and stream processors still, runs at a lower clock, and has less memory and sometimes a narrower memory bus, which means less memory bandwidth. Much less.
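The binning cascade described above can be sketched in a few lines of Python. This is only an illustration of the flow as I described it, not how any IHV's test floor really works: the `Die` fields, the `BINS` table, and every clock threshold in it are made up for the example, and "turning a quadrant off" is simplified to "this variant needs fewer working quadrants".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Die:
    # Hypothetical test results for one chip off the wafer.
    max_stable_clock_mhz: int  # highest clock at which the part passed testing
    good_quadrants: int        # quadrants (out of 4) that can be left enabled


# (variant, quadrants needed, clock needed), from the top of the family down.
# All numbers are illustrative assumptions, not real binning targets.
BINS = [
    ("Ultra", 4, 612),
    ("High", 4, 575),
    ("Mid-range", 3, 500),
    ("Low-end", 2, 450),
]


def bin_die(die: Die) -> Optional[str]:
    """Return the best variant this die can be sold as, or None if discarded."""
    for variant, quadrants_needed, clock_needed in BINS:
        if (die.good_quadrants >= quadrants_needed
                and die.max_stable_clock_mhz >= clock_needed):
            return variant
    return None  # fails every bin, so the chip is thrown out


for die in (Die(620, 4), Die(590, 4), Die(620, 3), Die(460, 2), Die(300, 4)):
    print(die, "->", bin_die(die))
```

The point of the sketch is that one die design feeds every tier: a chip that misses the top bin is not wasted, it just drops down the list until it passes somewhere.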


Low-end is really low-end; this is no joke. Really. So how does this help with deciding what to purchase?


If you can afford it, do not buy below mid-range (x5xx or greater is usually how mid-range is labeled), and understand that the 2nd place in the high end (GT, not GTS) is often the price/performance sweet spot.


So by that metric, there is no way a 7300 is better than a 7600, and both are substantially less than a 7800 or 7900. And an 8400 is substantially less than an 8600 or 8800.
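As a rough sketch, that model-number rule of thumb can be written down in Python. The function name `tier_from_model` is hypothetical, and the cut-offs are assumptions: x5xx and up as mid-range comes from the text, while treating x8xx and up as high-end is only inferred from the examples here and is not a vendor rule.

```python
def tier_from_model(model: int) -> str:
    """Guess the market tier from a 4-digit model number such as 8600.

    Rule of thumb only: the second digit is read as the tier.
    x5xx to x7xx is treated as mid-range, x8xx and up as high-end
    (assumed thresholds, not an official naming scheme).
    """
    tier_digit = (model // 100) % 10
    if tier_digit >= 8:
        return "high-end"
    if tier_digit >= 5:
        return "mid-range"
    return "low-end"


for model in (7300, 7600, 7800, 7900, 8400, 8600, 8800):
    print(model, tier_from_model(model))
```

Naming schemes change between generations and vendors, so treat this as a first filter and then check the actual spec chart.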


Laptop parts are almost always cut down, even at the high end. This is for space, heat, and power. So a top-end laptop chip does not give the same performance as the chip in a top-end add-in card. Caveat Emptor there.


If you cannot afford the top end, then you can’t. And then it is good there are mid-range and low-end variants. But realize you are buying a lesser product. In some cases significantly lesser.


So here is some practical advice, gleaned by examining the nVidia site and the 8 series comparo chart.


Let us take an 8400 compared to an 8800, and do some percentage math:


Part________StreamProcs__CoreClock(MHz)__ShaderClock(MHz)__MemClock(MHz)__Memory__MemInterface__MemBandwidth(GB/sec)__TextureFillRate(billion/sec)

8800 Ultra__128__________612_____________1500______________1080___________768MB___384-bit_______103.7_________________39.2

8400 GS_____16___________450_____________900_______________400____________256MB___64-bit________6.4___________________3.6

Percent_____12.5%________73.5%___________60%_______________37%____________33.3%___16.7%_________6.2%__________________9.2%

Note the underbars are there to make the tables line up, because my blog software eats spaces. Sorry about that :)


So for key areas like memory bandwidth and fill rate, the 8400 is under 10% of what an 8800 offers.
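If you want to redo this percentage math for other parts, here is a minimal Python sketch. The spec values are copied from the 8800 Ultra and 8400 GS rows above; the dictionary keys and the `percent_of` helper are just names I picked for the example.

```python
# Specs copied from the comparison above (clocks in MHz, memory in MB,
# interface in bits, bandwidth in GB/sec, texture fill in billion/sec).
ultra_8800 = {
    "stream_procs": 128, "core_clock": 612, "shader_clock": 1500,
    "mem_clock": 1080, "memory": 768, "mem_interface": 384,
    "mem_bandwidth": 103.7, "texture_fill": 39.2,
}
gs_8400 = {
    "stream_procs": 16, "core_clock": 450, "shader_clock": 900,
    "mem_clock": 400, "memory": 256, "mem_interface": 64,
    "mem_bandwidth": 6.4, "texture_fill": 3.6,
}


def percent_of(part: dict, reference: dict) -> dict:
    """Express each spec of `part` as a percentage of `reference`."""
    return {key: round(100.0 * part[key] / reference[key], 1)
            for key in reference}


for spec, pct in percent_of(gs_8400, ultra_8800).items():
    print(f"{spec:14s} {pct:5.1f}%")
```

Swap in the numbers for any two parts from a vendor comparison chart to see the spread between them at a glance.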


Note the "new" 8800 GT has 112 stream processors, so that is 87.5% of the top end. Let’s look at the rest of that part:


Part________StreamProcs__CoreClock(MHz)__ShaderClock(MHz)__MemClock(MHz)__Memory__MemInterface__MemBandwidth(GB/sec)__TextureFillRate(billion/sec)

8800 Ultra__128__________612_____________1500______________1080___________768MB___384-bit_______103.7_________________39.2

8800 GT_____112__________600_____________1500______________900____________512MB___256-bit_______57.6__________________33.6

Percent_____87.5%________98%_____________100%______________83.3%__________66.7%___66.7%_________55.5%_________________85.7%


So the 8800 GT has only 55% of the memory bandwidth of the Ultra, and on an app like FSX that crunches through a lot of memory, that means even this part might have issues with some slider settings depending on the scene.


The 8800 GTX is the closest to the top if you cannot afford the Ultra. The GTS 640/320 has a 320-bit memory interface, while the GTS 512 and GT have only a 256-bit one.


Before anyone thinks I am picking on nVidia, I can do the same with the ATI specs for the 2xxx family and for the 3xxx family.


From this you can see the 2400 and 2600 are quite a bit cut down in terms of memory bus while the 3850 is much closer to the 3870. Generations can be quite different in how the spread is between the low and high.


This discussion is not about being negative. This is about being a smart shopper, and the differences between these parts can and do have a direct impact on performance.


In terms of which of these specs matter most for FSX, at least 3 are critical:

· Overall memory size (don’t go below 256MB if you can help it)

· Overall memory bandwidth (the max you can afford)

· Core clock (that helps with handling the amount of drawing FSX does)

Sure, the memory and shader clocks are important, but if the core clock is too low, then those don’t matter as much. Sure, texture fill is important but without memory bandwidth that won’t matter as you cannot get the textures and geometry across the bus. Note memory interface width and memory bandwidth have a direct relationship.
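To make that direct relationship concrete, here is a small Python sketch of the usual back-of-the-envelope formula: bytes per transfer (bus width / 8) times transfers per second. The factor of 2 transfers per clock is my assumption that these cards use double-data-rate memory; it is not stated in the chart, but with it the formula reproduces the bandwidth numbers quoted above. The function name is hypothetical.

```python
def mem_bandwidth_gb_s(bus_width_bits: int, mem_clock_mhz: float,
                       transfers_per_clock: int = 2) -> float:
    """Estimate peak memory bandwidth in GB/sec.

    transfers_per_clock=2 assumes double-data-rate memory (an assumption,
    not something taken from the spec chart).
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_sec = mem_clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_sec / 1e9


# Interface width and memory clock taken from the tables above.
for name, width, clock in (("8800 Ultra", 384, 1080),
                           ("8800 GT", 256, 900),
                           ("8400 GS", 64, 400)):
    print(name, round(mem_bandwidth_gb_s(width, clock), 1), "GB/sec")
```

Halve the bus width at the same clock and the bandwidth halves too, which is why the cut-down parts fall so far behind on this spec.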


I hope this was useful.