ASUS GeForce GT 430 Fermi GF108 Video Card
Written by Hank Tolman   
Monday, 11 October 2010

NVIDIA GF108 GPU Fermi Architecture

Based on the Fermi architecture, NVIDIA's latest GPU is codenamed GF108 and powers the GeForce GT 430. In this article, Benchmark Reviews explains the technical architecture behind NVIDIA's GF108 graphics processor and offers insight into upcoming Fermi-based GeForce video cards. For those who are not familiar, GF100 was NVIDIA's first graphics processor to support DirectX-11 hardware features such as tessellation and DirectCompute, while also adding heavy particle and turbulence effects. GF100 is also the successor to the GT200 graphics processor, which launched in the GeForce GTX 280 video card back in June 2008. NVIDIA has since redefined its focus, using the subsequent GF100, GF104, GF106, and now GF108 GPUs to demonstrate its dedication to next-generation gaming effects such as raytracing, order-independent transparency, and fluid simulations.

Processor cores have grown from 128 (G80) and 240 (GT200) to 512 in the GF100, where they earn the title of NVIDIA CUDA (Compute Unified Device Architecture) cores. GF100 was not another incremental GPU step-up like the move from G80 to GT200. GF100 featured 512 CUDA cores, while GF104 launched with 336 cores enabled (seven of its eight SMs) and GF106 had 192. Effectively cutting the four SMs on GF106 in half, NVIDIA's GF108 is good for 96 CUDA cores from just two SMs. The key here is not only the name, but that the name now implies an emphasis on something more than just graphics. Each Fermi CUDA processor core has a fully pipelined integer arithmetic logic unit (ALU) and floating point unit (FPU). GF108 implements the IEEE 754-2008 floating-point standard, providing the fused multiply-add (FMA) instruction for both single and double precision arithmetic. FMA improves over a multiply-add (MAD) instruction by doing the multiplication and addition with a single final rounding step, with no loss of precision in the addition. FMA minimizes rendering errors in closely overlapping triangles.
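To make the single-rounding point concrete, here is a minimal Python sketch. It does not run on the GPU and does not use NVIDIA's hardware instruction: the `mad` and `fma` helpers are hypothetical names chosen for illustration, the values are double-precision Python floats, and the fused result is emulated with exact rational arithmetic that is rounded only once at the end.

```python
from fractions import Fraction


def mad(a: float, b: float, c: float) -> float:
    """Multiply-add with two roundings: the product is rounded, then the sum."""
    return a * b + c


def fma(a: float, b: float, c: float) -> float:
    """Emulated fused multiply-add (an illustration, not the hardware path).

    The product and sum are formed exactly as rationals, and the result is
    rounded to a float a single time at the end.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so the exact product, 1 - 2**-60, does not fit in a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

mad_result = mad(a, b, c)  # product rounds to 1.0 first, so the sum is 0.0
fma_result = fma(a, b, c)  # exact value -2**-60 survives the single rounding

print("MAD:", mad_result)
print("FMA:", fma_result)
```

With these inputs the two-step version loses the small term entirely, while the single-rounding version keeps it. The same kind of cancellation is what shows up as artifacts when two nearly coincident triangles are compared.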

GF108 implements 96 CUDA cores, organized as 2 SMs of 48 cores each. Each SM is a highly parallel multiprocessor supporting up to 48 warps at any given time (four Dispatch Units per SM deliver two dispatched instructions per warp for four total instructions per clock per SM). Each CUDA core is a unified processor core that executes vertex, pixel, geometry, and compute kernels. A unified L2 cache architecture (512KB on 1GB cards) services load, store, and texture operations. GF108 is designed to offer a total of 4 ROP units for pixel blending, antialiasing, and atomic memory operations. The ROP units are organized in two groups of two, and each group is serviced by a 64-bit memory controller. The memory controller, L2 cache, and ROP group are closely coupled: scaling one unit automatically scales the others.

NVIDIA Fermi GF108 Block Diagram

Based on Fermi's third-generation Streaming Multiprocessor (SM) architecture, GF108 could be considered a divided GF106. NVIDIA's GF100-series Fermi GPUs are based on a scalable array of Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), and memory controllers. GF100 implemented four GPCs, sixteen SMs, and six memory controllers. GF104 implements two GPCs, eight SMs, and four memory controllers. By comparison, GF108 houses one GPC, two SMs, and two memory controllers. Where each SM contained 32 CUDA cores in the GF100, NVIDIA configured GF104 with 48 cores per SM, a layout repeated for GF106 and GF108. As expected, NVIDIA Fermi-series products are launching with different configurations of GPCs, SMs, and memory controllers to address different price points.
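The scaling described above reduces to simple arithmetic, sketched here in Python. The `FermiChip` class and its field names are hypothetical, introduced only for this illustration. The 64-bit controller width is stated in this article for GF108; applying the same width to GF100 and GF104 is an assumption.

```python
from dataclasses import dataclass

# Assumption: every Fermi memory controller is 64 bits wide
# (the article states this for GF108 only).
MEMORY_CONTROLLER_WIDTH_BITS = 64


@dataclass(frozen=True)
class FermiChip:
    """Full-die configuration of a Fermi GPU (hypothetical helper type)."""
    name: str
    gpcs: int
    sms: int
    cores_per_sm: int
    memory_controllers: int

    @property
    def cuda_cores(self) -> int:
        # Total cores = number of SMs x cores in each SM.
        return self.sms * self.cores_per_sm

    @property
    def memory_bus_bits(self) -> int:
        # Total bus width = number of controllers x width of each controller.
        return self.memory_controllers * MEMORY_CONTROLLER_WIDTH_BITS


# Counts taken from the paragraph above.
CHIPS = [
    FermiChip("GF100", gpcs=4, sms=16, cores_per_sm=32, memory_controllers=6),
    FermiChip("GF104", gpcs=2, sms=8, cores_per_sm=48, memory_controllers=4),
    FermiChip("GF108", gpcs=1, sms=2, cores_per_sm=48, memory_controllers=2),
]

for chip in CHIPS:
    print(f"{chip.name}: {chip.cuda_cores} CUDA cores, "
          f"{chip.memory_bus_bits}-bit memory interface")
```

These are full-die figures. Retail cards can ship with some SMs disabled, which is why the GeForce GTX 460 (7 SMs) and GTX 480 (15 SMs) in the specifications chart list 336 and 480 cores rather than 384 and 512.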

CPU commands are read by the GPU via the Host Interface. The GigaThread Engine fetches the specified data from system memory and copies it to the frame buffer. GF108 implements two 64-bit GDDR3 memory controllers (128-bit total) to facilitate high-bandwidth access to the frame buffer. The GigaThread Engine then creates and dispatches thread blocks to the various SMs. Individual SMs in turn schedule warps (groups of 32 threads) to CUDA cores and other execution units. The GigaThread Engine also redistributes work to the SMs when work expansion occurs in the graphics pipeline, such as after the tessellation and rasterization stages.
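As a rough illustration of how a thread block breaks down into warps, the Python sketch below assumes the CUDA warp size of 32 threads. The function names are hypothetical, and this models only the bookkeeping, not the hardware scheduler.

```python
import math

# Assumption: CUDA warp size of 32 threads, as on Fermi-class GPUs.
WARP_SIZE = 32


def warps_per_block(threads_per_block: int) -> int:
    """Number of warps needed to cover a thread block (last one may be partial)."""
    return math.ceil(threads_per_block / WARP_SIZE)


def split_into_warps(threads_per_block: int) -> list[range]:
    """Group consecutive thread indices of one block into warps."""
    return [
        range(start, min(start + WARP_SIZE, threads_per_block))
        for start in range(0, threads_per_block, WARP_SIZE)
    ]


for block_size in (96, 100, 256):
    warps = split_into_warps(block_size)
    print(f"{block_size} threads -> {len(warps)} warps, "
          f"last warp holds {len(warps[-1])} threads")
```

A block whose size is not a multiple of 32 still occupies a full warp slot for its final, partially filled warp, which is one reason block sizes are usually chosen as multiples of the warp size.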

GF108 Specifications

  • 96 CUDA Cores
  • 16 Texture Units
  • 4 ROP Units
  • 128-bit GDDR3
  • DirectX-11 API Support

GeForce 400-Series Specifications

| Graphics Card | GeForce GT 430 | GeForce GTS 450 | GeForce GTX 460 | GeForce GTX 465 | GeForce GTX 470 | GeForce GTX 480 |
|---|---|---|---|---|---|---|
| GPU Transistors | 585 Million | 1.17 Billion | 1.95 Billion | 3.2 Billion | 3.2 Billion | 3.2 Billion |
| Graphics Processing Clusters | 1 | 1 | 2 | 4 | 4 | 4 |
| Streaming Multiprocessors | 2 | 4 | 7 | 11 | 14 | 15 |
| CUDA Cores | 96 | 192 | 336 | 352 | 448 | 480 |
| Texture Units | 16 | 32 | 56 | 44 | 56 | 60 |
| ROP Units | 4 | 16 | 768MB=24 / 1GB=32 | 32 | 40 | 48 |
| Graphics Clock (Fixed Function Units) | 700 MHz | 783 MHz | 675 MHz | 607 MHz | 607 MHz | 700 MHz |
| Processor Clock (CUDA Cores) | 1400 MHz | 1566 MHz | 1350 MHz | 1215 MHz | 1215 MHz | 1401 MHz |
| Memory Clock (Clock Rate/Data Rate) | 900/1800 MHz | 902/3608 MHz | 900/3600 MHz | 837/3348 MHz | 837/3348 MHz | 924/3696 MHz |
| Total Video Memory | 1024MB GDDR3 | 1024MB GDDR5 | 768MB / 1024MB GDDR5 | 1024MB GDDR5 | 1280MB GDDR5 | 1536MB GDDR5 |
| Memory Interface | 128-Bit | 128-Bit | 768MB=192 / 1GB=256-Bit | 256-Bit | 320-Bit | 384-Bit |
| Total Memory Bandwidth | 28.8 GB/s | 57.7 GB/s | 86.4 / 115.2 GB/s | 102.6 GB/s | 133.9 GB/s | 177.4 GB/s |
| Texture Filtering Rate (Bilinear) | 11.2 GigaTexels/s | 25.1 GigaTexels/s | 37.8 GigaTexels/s | 26.7 GigaTexels/s | 34.0 GigaTexels/s | 42.0 GigaTexels/s |
| GPU Fabrication Process | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm | 40 nm |
| Output Connections | 1x Dual-Link DVI-I, 1x HDMI, 1x VGA | 2x Dual-Link DVI-I, 1x Mini HDMI | 2x Dual-Link DVI-I, 1x Mini HDMI | 2x Dual-Link DVI-I, 1x Mini HDMI | 2x Dual-Link DVI-I, 1x Mini HDMI | 2x Dual-Link DVI-I, 1x Mini HDMI |
| Form Factor | Dual-Slot | Dual-Slot | Dual-Slot | Dual-Slot | Dual-Slot | Dual-Slot |
| Power Input | None | 6-Pin | 2x 6-Pin | 2x 6-Pin | 2x 6-Pin | 6-Pin + 8-Pin |
| Thermal Design Power (TDP) | 49 Watts | 106 Watts | 768MB=150W / 1GB=160W | 200 Watts | 215 Watts | 250 Watts |
| Recommended PSU | 300 Watts | 400 Watts | 450 Watts | 550 Watts | 550 Watts | 600 Watts |

GPU Thermal Threshold: 95°C, 104°C, 105°C, 105°C, 105°C (five values are listed for the six cards)

GeForce Fermi Chart Courtesy of Benchmark Reviews
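Two rows of the chart can be derived from the others: memory bandwidth follows from the data rate and bus width, and the bilinear texture rate follows from the texture unit count and graphics clock. The Python sketch below checks that arithmetic for four of the cards using the chart's own figures. The function and variable names are hypothetical, and the texture formula assumes one bilinear texel per texture unit per clock.

```python
def memory_bandwidth_gbs(data_rate_mhz: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return data_rate_mhz * bytes_per_transfer / 1000.0


def texel_rate_gtexels(texture_units: int, graphics_clock_mhz: float) -> float:
    """Bilinear fill rate in GigaTexels/s, assuming one texel per unit per clock."""
    return texture_units * graphics_clock_mhz / 1000.0


# (data rate MHz, bus width bits, texture units, graphics clock MHz),
# copied from the GeForce 400-Series chart.
CARDS = {
    "GeForce GT 430": (1800, 128, 16, 700),
    "GeForce GTS 450": (3608, 128, 32, 783),
    "GeForce GTX 470": (3348, 320, 56, 607),
    "GeForce GTX 480": (3696, 384, 60, 700),
}

for name, (rate, width, tmus, clock) in CARDS.items():
    bandwidth = memory_bandwidth_gbs(rate, width)
    texels = texel_rate_gtexels(tmus, clock)
    print(f"{name}: {bandwidth:.1f} GB/s, {texels:.1f} GigaTexels/s")
```

The GeForce GTX 465 is left out because its listed 3348 MHz data rate on a 256-bit interface works out to about 107.1 GB/s rather than the 102.6 GB/s shown in the chart.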



 
