ASUS GeForce GTX-465 Video Card
Reviews - Featured Reviews: Video Cards
Written by Olin Coles   
Monday, 21 June 2010
Table of Contents: Page Index
ASUS GeForce GTX-465 Video Card
Features and Specifications
NVIDIA GF100 GPU Fermi Architecture
Closer Look: ASUS GeForce GTX-465
Video Card Testing Methodology
DX10: 3DMark Vantage
DX10: Crysis Warhead
DX10: Far Cry 2
DX10: Resident Evil 5
DX11: Aliens vs Predator
DX11: Battlefield Bad Company 2
DX11: BattleForge
DX11: Metro 2033
DX11: Unigine Heaven 2.1
NVIDIA APEX PhysX Enhancements
NVIDIA 3D-Vision Effects
GeForce GTX465 Temperatures
VGA Power Consumption
ASUS SmartDoctor and GamerOSD
Editors Opinion: Fermi GF100
ASUS ENGTX465 Conclusion

NVIDIA GF100 GPU Fermi Architecture

NVIDIA's latest GPU is codenamed GF100, and is the first graphics processor based on the Fermi architecture. In this article, Benchmark Reviews explains the technical architecture behind NVIDIA's GF100 graphics processor and offers insight into upcoming Fermi-based GeForce video cards. For those who are not familiar, GF100 is NVIDIA's first graphics processor to support DirectX-11 hardware features such as tessellation and DirectCompute, while also adding heavy particle and turbulence effects. The GF100 GPU is also the successor to the GT200 graphics processor, which launched in the GeForce GTX 280 video card back in June 2008. NVIDIA has since redefined its focus, and GF100 demonstrates a dedication to next-generation gaming effects such as raytracing, order-independent transparency, and fluid simulations. Rest assured, the new GF100 GPU is more powerful than the GT200 could ever be, and early results indicate that a Fermi-based video card delivers far more than twice the gaming performance of a GeForce GTX-280.

GF100 is not another incremental GPU step-up like we had going from G80 to GT200. While processor cores grew from 128 (G80) to 240 (GT200), they now reach 512 and earn the title of NVIDIA CUDA (Compute Unified Device Architecture) cores. The key here is not only the name, but that the name now implies an emphasis on something more than just graphics. Each Fermi CUDA processor core has a fully pipelined integer arithmetic logic unit (ALU) and floating point unit (FPU). GF100 implements the new IEEE 754-2008 floating-point standard, providing the fused multiply-add (FMA) instruction for both single and double precision arithmetic. FMA improves over a multiply-add (MAD) instruction by performing the multiplication and addition with a single final rounding step, with no loss of precision in the addition. FMA minimizes rendering errors in closely overlapping triangles.
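The difference between two roundings and one can be illustrated on a CPU. The following is a minimal Python sketch, not GPU code: it emulates a fused multiply-add in double precision by computing the exact result with rational arithmetic and rounding once. The function names and the input values are assumptions chosen for illustration.

```python
from fractions import Fraction


def mad(a: float, b: float, c: float) -> float:
    """Multiply and round to double, then add and round again (two roundings)."""
    return a * b + c


def fma_emulated(a: float, b: float, c: float) -> float:
    """Compute a*b + c exactly with rationals, then round to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # Fraction -> float is a single correctly rounded step


# The exact product is 1 - 2**-60, which does not fit in a double
# and rounds to 1.0 when the multiplication is rounded on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

mad_result = mad(a, b, c)
fma_result = fma_emulated(a, b, c)
print("separate multiply and add:", mad_result)
print("single final rounding:    ", fma_result)
```

With these inputs the separately rounded version returns zero, while the single-rounding version keeps the small residual of -2^-60 that the intermediate rounding discarded.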

[Figure: NVIDIA Fermi GF100 Block Diagram]

Based on Fermi's third-generation Streaming Multiprocessor (SM) architecture, GF100 more than doubles the number of CUDA cores over the previous architecture (512 versus 240 in GT200). NVIDIA GeForce GF100 Fermi GPUs are based on a scalable array of Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), and memory controllers. The NVIDIA GF100 implements four GPCs, sixteen SMs, and six memory controllers. Expect NVIDIA to launch GF100 products with different configurations of GPCs, SMs, and memory controllers to address different price points.

CPU commands are read by the GPU via the Host Interface. The GigaThread Engine fetches the specified data from system memory and copies it to the frame buffer. GF100 implements six 64-bit GDDR5 memory controllers (384-bit total) to facilitate high bandwidth access to the frame buffer. The GigaThread Engine then creates and dispatches thread blocks to various SMs. Individual SMs in turn schedule warps (groups of 32 threads) to CUDA cores and other execution units. The GigaThread Engine also redistributes work to the SMs when work expansion occurs in the graphics pipeline, such as after the tessellation and rasterization stages.

GF100 implements 512 CUDA cores, organized as 16 SMs of 32 cores each. Each SM is a highly parallel multiprocessor supporting up to 48 warps at any given time. Each CUDA core is a unified processor core that executes vertex, pixel, geometry, and compute kernels. A unified L2 cache architecture services load, store, and texture operations. GF100 has 48 ROP units for pixel blending, antialiasing, and atomic memory operations. The ROP units are organized in six groups of eight. Each group is serviced by a 64-bit memory controller. The memory controller, L2 cache, and ROP group are closely coupled: scaling one unit automatically scales the others.
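The unit counts quoted above follow from a few multiplications, and because each ROP group is tied to a memory controller, the bus width and ROP count move together. The sketch below tallies them in Python. The class and field names are hypothetical, and only the full GF100 figures stated in this article are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FermiConfig:
    """Unit counts for one GPU configuration (names are illustrative only)."""
    sm_count: int
    cores_per_sm: int
    memory_controllers: int
    controller_width_bits: int
    rops_per_group: int

    @property
    def cuda_cores(self) -> int:
        return self.sm_count * self.cores_per_sm

    @property
    def memory_bus_bits(self) -> int:
        return self.memory_controllers * self.controller_width_bits

    @property
    def rop_units(self) -> int:
        # One ROP group is paired with each memory controller.
        return self.memory_controllers * self.rops_per_group


# Full GF100 as described in the text: 16 SMs of 32 cores,
# six 64-bit memory controllers, six ROP groups of eight.
GF100_FULL = FermiConfig(
    sm_count=16,
    cores_per_sm=32,
    memory_controllers=6,
    controller_width_bits=64,
    rops_per_group=8,
)

print("CUDA cores:", GF100_FULL.cuda_cores)
print("Memory bus (bits):", GF100_FULL.memory_bus_bits)
print("ROP units:", GF100_FULL.rop_units)
```

Changing `sm_count` or `memory_controllers` shows how a cut-down product would scale under the same assumptions.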

NVIDIA GigaThread Thread Scheduler

One of the most important technologies of the Fermi architecture is its two-level, distributed thread scheduler. At the chip level, a global work distribution engine schedules thread blocks to various SMs, while at the SM level, each warp scheduler distributes warps of 32 threads to its execution units. The first generation GigaThread engine introduced in G80 managed up to 12,288 threads in real time. The Fermi architecture improves on this foundation by providing not only greater thread throughput, but also dramatically faster context switching, concurrent kernel execution, and improved thread block scheduling.
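Combining the figures in this article (32 threads per warp, up to 48 warps per SM, 16 SMs) gives a rough ceiling on the number of threads GF100 can keep resident at once. The Python sketch below works through that arithmetic. The constant and function names are assumptions, and the result is a peak figure derived from the quoted numbers, not a measurement.

```python
WARP_SIZE = 32            # threads per warp
MAX_WARPS_PER_SM = 48     # warps an SM can hold at any given time
SM_COUNT = 16             # SMs in a full GF100
G80_MAX_THREADS = 12_288  # first-generation GigaThread figure quoted above


def warps_for_block(threads_per_block: int) -> int:
    """Number of warps needed to cover a thread block (rounded up)."""
    return -(-threads_per_block // WARP_SIZE)


threads_per_sm = MAX_WARPS_PER_SM * WARP_SIZE
gf100_max_threads = threads_per_sm * SM_COUNT
ratio_over_g80 = gf100_max_threads / G80_MAX_THREADS

print("Threads per SM:", threads_per_sm)
print("Threads across GF100:", gf100_max_threads)
print("Ratio over G80:", ratio_over_g80)
print("Warps in a 256-thread block:", warps_for_block(256))
```

A block whose size is not a multiple of 32 still occupies whole warps, which is why `warps_for_block` rounds up.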

What's new in Fermi?

With any new technology, consumers want to know what's new in the product. The goal of this article is to share in-depth information surrounding the Fermi architecture, as well as the new functionality unlocked in GF100. For clarity, the 'GF' letters used in the GF100 GPU name are not an abbreviation for 'GeForce'; they actually denote that this GPU is a Graphics solution based on the Fermi architecture. The next generation of NVIDIA GeForce-series desktop video cards will use the GF100 to promote the following new features:

  • Third Generation Streaming Multiprocessor (SM)
    o 32 CUDA cores per SM, 4x over GT200
    o 8x the peak double precision floating point performance over GT200
    o Dual Warp Scheduler simultaneously schedules and dispatches instructions from two independent warps
    o 64 KB of RAM with a configurable partitioning of shared memory and L1 cache
  • Second Generation Parallel Thread Execution ISA
    o Unified Address Space with Full C++ Support
    o Optimized for OpenCL and DirectCompute
    o Full IEEE 754-2008 32-bit and 64-bit precision
    o Full 32-bit integer path with 64-bit extensions
    o Memory access instructions to support transition to 64-bit addressing
    o Improved Performance through Predication
  • Improved Memory Subsystem
    o NVIDIA Parallel DataCache hierarchy with Configurable L1 and Unified L2 Caches
    o First GPU with ECC memory support
    o Greatly improved atomic memory operation performance
  • NVIDIA GigaThread Engine
    o 10x faster application context switching
    o Concurrent kernel execution
    o Out of Order thread block execution
    o Dual overlapped memory transfer engines

Benchmark Reviews also offers more detail in our full-length NVIDIA GF100 GPU Fermi Graphics Architecture guide.



 

Comments 

 
# Little mistake... - BETA911 2010-06-21 23:33
At Battleforge, how can a non-DX11 card (9800GTX+) be in the charts when DX11 is tested? Same with the HD490.
Then, the HD5770 is not 256-bit but 128-bit!
 
# RE: Little mistake... - Olin Coles 2010-06-22 06:07
Thanks for finding that typo - it's been fixed. I'll update the chart, too, since those products shouldn't be included. Even though the game allows them to benchmark with the same settings, they're not compliant and likely ignore the DX11 extensions.
 
# A Strange review pt1 - The Crouch 2010-06-22 11:50
I'm really sorry, but this review does not make much sense to me. Not compared to other reviews mind you, but in itself!

I count 5 clear wins for the 5850, 3 for the 465 and one wash (Resident Evil 5). From the 465's point of view, that's a staggering 67% more wins for the 5850!!
 
# A Strange review pt2 - The Crouch 2010-06-22 11:52
When it comes to the value numbers you provide I count 5 wins for the 5850 and 4 for the 465 (RE5 is clearly a 465 win).

And by the way, I don't count the two parts of 3D vantage as separate tests.

So not only is the 5850 the faster card with over half the tests won, more importantly, it also offers the most bang for your buck! All according to your own figures!

At least to me, this would count as a clear win for the 5850, but that is hardly what I see in the summary.

Also worth mentioning i think: Having been on Newegg on a few occasions, $305 seemed a bit steep for a 5850, and for aspiring customers for a graphics card, I can tell a 5850 can be found for $285. Only $5 more expensive than the price for the 465 you are quoting, and with that small difference I think the value numbers throughout the test would look a bit different.
 
# RE: A Strange review pt2 - Olin Coles 2010-06-22 16:03
Based on NewEgg prices today, nearly every single Radeon HD 5850 is priced above $305 with an average price of $325 (I did the math). Conversely, several models of the GTX-465 sell for as little as $250, with an average price of $260. That makes the Radeon HD 5850 22~25% more expensive... but does it perform 22~25% better? No, it doesn't. It doesn't even perform better than the GTX-465 all of the time; only 'some' of the time... slightly more than half (as you point out). So should a card that costs $55-75 more than GTX-465 be considered the best value when it doesn't even offer a relative boost to performance? I don't think so.
You should also check your math on the cost per FPS, because the GTX-465 beats the Radeon 5850 in nearly all of them.
 
# Thank you! - SiliconDoc 2010-06-27 17:10
I came here to see just how much red raging rooster ATI bias was here on the gtx465.
I thank you and congratulate you for your response to the commenter.
I sit here absolutely STUNNED. I can't believe that somebody didn't just "take it" and nearly agree with the ati fan fraud.
THANK YOU SO MUCH.
My faith in humanity has been renewed.
Believe me, I really, really appreciate it.
Sincerely sick of the rampant red bias,
SiliconDoc
 
# Is a 1~2 FPS lead really a win? - Olin Coles 2010-06-22 17:54
Is a 1~2 FPS lead really a win? You might see it that way, but I don't. Especially when the Radeon HD 5850 costs $55 more.
 
# RE: ASUS GeForce GTX-465 Video Card - Stephen E 2010-06-22 16:48
About the VGA Power Comparison that you did, can you provide a sample calculation on how you came up with your data?

Did you just report the AC power difference between no graphics card in the system and with the graphics card installed? Did you try to take into account the PSU efficiency?
 
# RE: RE: ASUS GeForce GTX-465 Video Card - Olin Coles 2010-06-22 16:53
From the power consumption section: "A baseline test is taken without a video card installed inside our test computer system, which is allowed to boot into Windows-7 and rest idle at the login screen before power consumption is recorded. Once the baseline reading has been taken, the graphics card is installed and the system is again booted into Windows and left idle at the login screen. Our final loaded power consumption reading is taken with the video card running a stress test using FurMark. Below is a chart with the isolated video card power consumption (not system total) displayed in Watts for each specified test product."

Power supply efficiency is not taken into consideration for any of our reported results. Only the motherboard, processor, memory, SSD, and video card are drawing power. The math is simply idle/load result minus baseline.
 
# Weird... - xtremesv 2010-06-22 18:04
Why do reviewers still benchmark FarCry 2? Is it a requirement recommended (imposed) by Nvidia?

And I don't get your pricing figures. I found a 5850 for $285 and another for $305 in Newegg... the ones you mention beyond $325 include special cooling designs.
 
# nooneoverclockyourkeyboard 2012-02-11 03:10
Hey, do you know that I got my Zotac GTX 465 at just 7250, which is $147.17 (converted to USD), and the 5850 costs 14950, which is $303.48. At this price I can SLI a GTX 465, and when you SLI a GTX 465 against a 5850, clearly the 465 is the winner. I dunno why the prices aren't coming down for the 5850.
