Difference between revisions of "Sega Dreamcast/Hardware comparison"
From Sega Retro
Revision as of 01:27, 8 December 2016
Vs. Arcade
The Sega Dreamcast's arcade counterpart, the Sega NAOMI, has the same CPU, the Hitachi SH-4, at the same clock rate, but is more powerful in other ways, including an updated PowerVR2 GPU with faster performance, additional RAM and VRAM, higher bandwidth, and faster ROM cartridge storage. The NAOMI launched at $1,995, ten times the price of the Dreamcast and more expensive than a high-end PC at the time, but cheaper than the Sega Model 3 arcade system (which debuted at $20,000 in 1996).
The NAOMI was, in turn, the basis for two significantly more powerful arcade systems, the Hikaru (debuted 1999) and NAOMI 2 (debuted 2000). Sega later packaged the Dreamcast into an arcade board as the Atomiswave. While the Dreamcast is not as powerful as 1997–1999 Sega arcade hardware, including the Model 3 Step 2 (debuted 1997), NAOMI, and Hikaru, the Dreamcast rivalled the Model 3 Step 1 (debuted 1996) in performance.
Vs. PC
Neon 250
The Dreamcast's PowerVR CLX2 GPU was the basis for the PowerVR PMX1, a PC GPU released with the Neon 250 graphics card in 1999. However, the Neon 250 lacks many of the tiled rendering features of the CLX2: the tile size is halved (halving the fillrate), it lacks the CLX2's internal Z-buffering and alpha test capability with hardware front-to-back translucency sorting (further reducing the fillrate and performance, as well as requiring the Neon 250 to render a Z-buffer externally), and the tiling is partially handled by software (the CLX2 handles the tiling entirely in hardware). The Neon 250 also lacks the CLX2's latency buffering and palettized texture support while VQ texture compression performance is halved, and it has bus contention due to having a single data bus (whereas the CLX2 has two data buses).
The PowerVR2 was also optimized for the Hitachi SH-4's geometry processing capabilities (rather than for a Pentium II or III), while PC drivers and software were not optimized for the Neon 250's tiled rendering architecture (whereas Dreamcast games were optimized for the CLX2's tiled rendering architecture). The Neon 250 thus had only a fraction of the Dreamcast CLX2's fillrate and rendering performance. The reduction in performance from the Dreamcast's CLX2 to the Neon 250 was comparable to the reduction in performance from the Sega Model 3's Real3D Pro-1000 to the Intel740.
Pentium, GeForce, Voodoo
The Dreamcast was generally the most powerful home system during 1998–1999, outperforming high-end PC hardware at the time.[1] The Dreamcast's Hitachi SH-4 CPU calculates 3D graphics several times faster than a Pentium II from 1998,[1] and faster than a Pentium III and NVIDIA GeForce 256 from 1999. The Dreamcast's PowerVR CLX2 GPU, due to its tiled rendering architecture, also has a higher fillrate and faster polygon rendering throughput than a Voodoo3 and GeForce 256 from 1999.
The Dreamcast's CPU–GPU transmission bus is faster than the Voodoo3's and has a higher effective bandwidth than the GeForce 256's, owing to the Dreamcast's efficient bandwidth usage. This includes its lack of CPU overhead from the operating system and the CLX2's tiled rendering architecture: textures loaded directly to VRAM (freeing up the CPU–GPU transmission bus for polygons), higher texture compression, an on-chip tile buffer with internal Z-buffering, and deferred rendering (no need to draw, shade or texture overdrawn polygons). The CLX2 is also capable of order-independent transparency (which the Voodoo3 and GeForce 256 lacked) and Dot3 normal mapping (which the Voodoo3 lacked).[2]
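The benefit of deferred rendering described above can be illustrated with a toy model. The sketch below is not a model of the CLX2 itself: it assumes a simple software Z-buffer, opaque axis-aligned rectangles, and a made-up three-layer scene, and only counts how many pixels would be shaded when polygons are shaded as they arrive versus shaded once after visibility is resolved.

```python
def shaded_pixel_counts(rects, width, height):
    """Return (immediate, deferred) counts of pixel-shading operations.

    rects: opaque rectangles (x0, y0, x1, y1, depth) in submission order;
    a smaller depth value is nearer to the viewer.
    """
    far = float("inf")
    zbuf = [[far] * width for _ in range(height)]
    immediate = 0
    for x0, y0, x1, y1, depth in rects:
        for y in range(y0, y1):
            for x in range(x0, x1):
                # Immediate-mode: shade every pixel that passes the depth test,
                # even if a later polygon will cover it.
                if depth < zbuf[y][x]:
                    zbuf[y][x] = depth
                    immediate += 1
    # Deferred: visibility is resolved first, so each covered pixel is shaded once.
    deferred = sum(1 for row in zbuf for z in row if z != far)
    return immediate, deferred


# Hypothetical scene: three overlapping layers submitted back to front.
scene = [
    (0, 0, 8, 8, 3.0),
    (2, 2, 8, 8, 2.0),
    (4, 4, 8, 8, 1.0),
]
immediate, deferred = shaded_pixel_counts(scene, 8, 8)
print(immediate, deferred)
```

In this toy scene the immediate-mode count exceeds the deferred count because the two nearer layers overdraw pixels that were already shaded; submitting the same layers front to back would remove that difference, which is why immediate-mode renderers benefit from sorting while a deferred tiler does not depend on it.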
In terms of game engine performance, the CLX2 peaks at 5 million polygons/sec,[3] compared to the GeForce 256 which peaks at 2.9 million polygons/sec.[4] Dreamcast game engines rendered 50,000–160,000 polygons per scene (3–5 million polygons/sec),[3] while PC game engines of 1999 rendered up to 10,000 polygons per scene[5][6] (1–1.6 million polygons/sec).[7] Character models in particular were significantly more detailed in Dreamcast games than in PC games during 1998–1999.[8]
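The per-scene and per-second polygon figures above are related by the frame rate. As a rough sketch, assuming frame rates of 60 and 30 frames per second (the frame rates are an assumption, not something stated in the sources cited here), the Dreamcast range works out as follows.

```python
def polygons_per_second(polygons_per_scene, frames_per_second):
    """Polygon throughput implied by a per-scene count and a frame rate."""
    return polygons_per_scene * frames_per_second


# Assumed frame rates: 60 fps for the lighter scene, 30 fps for the heavier one.
low = polygons_per_second(50_000, 60)
high = polygons_per_second(160_000, 30)
print(f"{low / 1e6:.1f} to {high / 1e6:.1f} million polygons/sec")
```

Under those assumed frame rates the result lands close to the 3–5 million polygons/sec range quoted for Dreamcast game engines.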
Vs. Consoles
PlayStation 2
Compared to the rival PlayStation 2, the Dreamcast is better at textures, anti-aliasing, and image quality, while the PS2 is better at polygon geometry, particles, and lighting. The PS2 has a more powerful CPU geometry engine, higher translucent fillrate, and more main RAM (32 MB, compared to Dreamcast's 16 MB), while the DC has more VRAM (8 MB, compared to PS2's 4 MB), higher opaque fillrate, and more GPU hardware features, with CLX2 capabilities like tiled rendering, super-sample anti-aliasing, Dot3 normal mapping, order-independent transparency, and texture compression, which the PS2's GPU lacks.
With larger VRAM and tiled rendering, the DC can render a larger framebuffer at higher native resolution (with an on-chip Z-buffer), and with texture compression, its VRAM can hold the equivalent of around 20–60 MB of uncompressed texture data. Because the PS2 has only 4 MB VRAM, it relies on the main RAM to store textures. While the PS2's CPU–GPU transmission bus for transferring polygons and textures is 50% faster than the Dreamcast's CPU–GPU transmission bus, the DC has textures loaded directly to VRAM (freeing up the CPU–GPU transmission bus for polygons) and texture compression gives it higher effective texture bandwidth.
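As a rough check on the upper end of that range, the sketch below multiplies the Dreamcast's 8 MB of VRAM by the 7.98:1 VQ compression ratio listed in the comparison table. It assumes, unrealistically, that all of VRAM holds VQ-compressed textures; in practice the framebuffer and other data take a share, so the usable figure is lower.

```python
def effective_texture_mb(memory_mb, compression_ratio):
    """Uncompressed-equivalent texture data that fits in memory_mb."""
    return memory_mb * compression_ratio


# Upper bound only: assumes every byte of the 8 MB VRAM stores VQ textures.
dreamcast_upper_bound = effective_texture_mb(8, 7.98)
print(f"{dreamcast_upper_bound:.1f} MB")
```

The same multiplication reproduces several "Texture compression" entries in the table's external memory rows, for example 24 MB × 6:1 = 144 MB for the GameCube and 32 MB × 3:1 = 96 MB for the PlayStation 2.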
Dreamcast games were effectively using 20–30 MB of texture data[9] (compressed to around 5–6 MB),[10] while PS2 games up until 2003 peaked at 5.5 MB of texture data (average 1.5 MB). PS2 games up until 2003 rendered up to 7.5 million polygons/sec (145,000 polygons per scene), with most rendering 2–5 million polygons/sec (average 52,000 polygons per scene);[11] in comparison, Dreamcast game engines rendered up to 5 million polygons/sec (160,000 polygons per scene),[3] with most games rendering 2–4 million polygons/sec (average 50,000 polygons per scene).
The Dreamcast is easier to develop for than the PS2; this is the reverse of the 32-bit era, when the PlayStation was the more developer-friendly system and the Saturn the more difficult one.
GameCube and Xbox
The GameCube and Xbox are both generally more powerful than the Dreamcast, but the Dreamcast has several hardware advantages. The T&L geometry performance of the Dreamcast's SH-4 CPU is faster than the Xbox's Pentium III CPU but slower than the GameCube's PowerPC CPU; however, the GameCube and Xbox both have GPUs with hardware T&L, each with faster geometry performance than the Dreamcast.
The Dreamcast has an on-chip Z-buffer, which the GameCube also has but the Xbox lacks. The Dreamcast has a higher Z-buffer bandwidth than both, giving it a higher opaque fillrate, but it has a lower translucent fillrate. The higher opaque fillrate allows the Dreamcast to draw a higher number of large opaque polygons, whereas the GameCube and Xbox can draw a higher number of small polygons and/or translucent polygons.
Graphics comparison table
- See Sega Dreamcast technical specifications for more technical details on Dreamcast hardware
System | Dreamcast (1998) | NAOMI (1998) | PC (1998) | PC (1999) | PlayStation 2 (2000) | GameCube (2001) | Xbox (2001) | |||
---|---|---|---|---|---|---|---|---|---|---|
Geometry processors | Hitachi SH-4 (200 MHz) | Hitachi SH-4 (200 MHz) | Intel Pentium II (450 MHz) | Intel Pentium III 800EB (800 MHz), NVIDIA GeForce 256 (120 MHz) | Emotion Engine (294.912 MHz) | Gekko (485 MHz), Flipper (162 MHz) | Pentium III (733 MHz), NV2A (233 MHz) | |||
Matrix transform [n 1] |
FLOPS | 1.4 GFLOPS[n 2] | 1.4 GFLOPS | 230 MFLOPS[n 3] | 720 MFLOPS[n 4] | 5.5 GFLOPS[n 5] | 7.5 GFLOPS[n 6] | 5.8 GFLOPS[n 7] | ||
MACs/sec | 800 million[n 8] | 800 million | 100 million[n 9] | 300 million[n 10] | 2 billion[n 11] | 3 billion[n 12] | 2 billion[n 13] | |||
Vertices | 50 MVertices/s[n 14] | 50 MVertices/s | 8.4 MVertices/s[n 15] | 25 MVertices/s[n 16] | 140 MVertices/s[n 17] | 162 MVertices/s[n 18] | 116 MVertices/s[n 19] | |||
Perspective transform | 16 MVertices/s[n 20] | 16 MVertices/s | 2.6 MVertices/s[n 21] | 9.3 MVertices/s[n 22] | 80 MVertices/s | 160 MVertices/s[n 23] | 110 MVertices/s[n 24] | |||
Lighting | 1 light source | 14 MPolygons/s[n 25] | 14 MPolygons/s | 2 MPolygons/s[n 26] | 7.2 MPolygons/s[n 27] | 39 MPolygons/s[n 28] | 90 MPolygons/s[n 29] | 46 MPolygons/s[n 30] | ||
4 light sources | 6.8 MPolygons/s | 6.8 MPolygons/s | 1.1 MPolygons/s[n 31] | 5.8 MPolygons/s[n 32] | 9.8 MPolygons/s[n 33] | 20 MPolygons/s[n 34] | 16 MPolygons/s[n 35] | |||
Rendering processors | PowerVR CLX2 (100 MHz) | PowerVR2 (100 MHz)[n 36] | 2x Voodoo2 (SLI) (90 MHz)[n 37] | Neon 250 (125 MHz) | Voodoo3 SE (200 MHz)[n 38] | GeForce 256 (120 MHz) | Graphics Synthesizer (147.456 MHz) | Flipper (162 MHz) | NV2A (233 MHz) |
Tiled rendering |
Tiling FPU | 720 MFLOPS | 1 GFLOPS | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
Tile size | 32×32 pixels | 32×32 pixels | N/A | 32×16 pixels | ||||||
Pixel fillrate |
Opaque | 3.2 GPixels/s[n 39] | 6 GPixels/s[n 40] | 100 MPixels/s[n 41] | 500 MPixels/s | 200 MPixels/s | 430 MPixels/s[n 42] | 2.3 GPixels/s[n 43] | 648 MPixels/s | 870 MPixels/s[n 44] |
Opaque/Translucent | 500 MPixels/s | 1 GPixel/s | 100 MPixels/s | 250 MPixels/s | ||||||
Texture fillrate |
Opaque | 3.2 GTexels/s | 6 GTexels/s | 100 MTexels/s | 500 MTexels/s | 380 MTexels/s[n 45] | 320 MTexels/s[n 46] | 1.1 GTexels/s | 648 MTexels/s | 650 MTexels/s[n 47] |
Opaque/Translucent | 500 MTexels/s | 1 GTexel/s | 100 MTexels/s | 250 MTexels/s | ||||||
Multi-texture fillrate |
Opaque | 1.6 GTexels/s | 3 GTexels/s | 100 MTexels/s | 250 MTexels/s | 310 MTexels/s[n 48] | 250 MTexels/s[n 49] | 580 MTexels/s | 648 MTexels/s | 520 MTexels/s[n 50] |
Opaque/Translucent | 250 MTexels/s | 500 MTexels/s | 100 MTexels/s | 120 MTexels/s | ||||||
Textured polygons |
32-pixel | 7.1 MPolygons/s | 12 MPolygons/s | 2 MPolygons/s | 4 MPolygons/s[28] | 6 MPolygons/s | 7 MPolygons/s | 30 MPolygons/s | 20 MPolygons/s | 20 MPolygons/s |
100-pixel (opaque) | 7.1 MPolygons/s | 12 MPolygons/s | 1 MPolygons/s | 4 MPolygons/s | 2 MPolygons/s | 4.3 MPolygons/s | 10 MPolygons/s | 6.4 MPolygons/s | 8 MPolygons/s | |
100-pixel (opaque/translucent) | 5 MPolygons/s | 10 MPolygons/s | 1 MPolygons/s | 2.5 MPolygons/s | ||||||
Multi-texture polygons |
32-pixel | 7.1 MPolygons/s | 12 MPolygons/s | 2 MPolygons/s | 4 MPolygons/s | 5 MPolygons/s | 7 MPolygons/s | 18 MPolygons/s | 20 MPolygons/s | 16 MPolygons/s |
100-pixel (opaque) | 7.1 MPolygons/s | 12 MPolygons/s | 1 MPolygons/s | 2.5 MPolygons/s | 2 MPolygons/s | 2.5 MPolygons/s | 5 MPolygons/s | 6.4 MPolygons/s | 5 MPolygons/s | |
100-pixel (opaque/translucent) | 2.5 MPolygons/s | 5 MPolygons/s | 1 MPolygons/s | 1.2 MPolygons/s | ||||||
Texture compression ratio | 7.98:1 (VQ) | 7.98:1 (VQ) | 3:1 (palette)[n 51] | 4:1 (VQ) | 4:1 (FXT1) | 6:1 (S3TC) | 3:1 (palette)[n 52] | 6:1 (S3TC) | 6:1 (S3TC) | |
CPU–GPU transfer bus[n 53] |
Bandwidth | 800 MB/s[13] | 800 MB/s | 260 MB/s[n 54] | 530 MB/s[n 55] | 530 MB/s[n 56] | 1 GB/s[n 57] | 1.2 GB/s[27] | 1.3 GB/s[n 58] | 1.064 GB/s[n 59] |
Texture compression | 6.3 GB/s | 6.3 GB/s | 800 MB/s | 2.1 GB/s | 2.1 GB/s | 6 GB/s | 3.6 GB/s | 7.7 GB/s | 6.3 GB/s | |
Internal GPU cache |
Cache memory | 33 KB[n 60] | 46 KB[n 61] | N/A | 16 KB[n 62] | N/A | N/A | 4 MB | 3 MB | N/A |
Bandwidth | 15 GB/s[n 63] | 28 GB/s[n 64] | N/A | 1 GB/s | N/A | N/A | 48 GB/s[n 65] | 20 GB/s[n 66] | ||
External video memory |
External memory | 24 MB (SDRAM)[n 67] | 48 MB (SDRAM), 100 MB (VROM)[9] | 16 MB (SDRAM)[n 68] | 32 MB (SDRAM) | 16 MB (SDRAM) | 32 MB (SDRAM) | 32 MB (RDRAM)[n 69] | 24 MB (1T-SRAM) | 64 MB (DDR SDRAM) |
Texture compression | 190 MB | 300 MB (SDRAM), 700 MB (VROM) | 32 MB[n 70] | 120 MB | 64 MB | 190 MB | 96 MB | 144 MB | 300 MB | |
Bandwidth | 1.6 GB/s[n 71] | 1.8 GB/s[n 72] | 2.8 GB/s[n 73] | 1 GB/s | 3.1 GB/s | 2.6 GB/s | 3.2 GB/s[n 74] | 2.6 GB/s[n 75] | 5.3 GB/s[n 76] | |
Buffering bandwidth |
Framebuffer | 800 MB/s (tiled 6.4 GB/s)[n 77] | 1 GB/s (tiled 12 GB/s)[n 78] | 720 MB/s[n 73] | 1 GB/s | 3.1 GB/s | 2.6 GB/s | 38 GB/s[n 65] | 9.6 GB/s[n 66] | 5.3 GB/s |
Z-buffer | 12 GB/s[n 63] | 25 GB/s[n 64] | ||||||||
Texture buffer | 800 MB/s (compress 6 GB/s) | 1 GB/s (compress 7 GB/s) | 720 MB/s[n 73] (compress 2.1 GB/s) | 9.6 GB/s[n 65] | 10 GB/s[n 66] | |||||
System | Dreamcast (1998) | NAOMI (1998) | PC (1998) | PC (1999) | PlayStation 2 (2000) | GameCube (2001) | Xbox (2001) |
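The matrix transform rows for the SH-4, Pentium II and Pentium III appear to follow from the clock rates in the table and the cycle counts given in the notes. The sketch below reproduces that arithmetic; the dictionary layout and function name are illustrative, and the MAC counts per transform (16 for the SH-4 at 4 MACs per cycle over 4 cycles, 12 for the Pentiums) are taken as the notes state them.

```python
FLOPS_PER_TRANSFORM = 28  # 4x4 matrix times 4x1 vector: 16 multiplies + 12 adds

# name: (clock in MHz, cycles per matrix transform, MACs counted per transform)
CPUS = {
    "SH-4": (200, 4, 16),
    "Pentium II": (450, 53, 12),
    "Pentium III": (800, 31, 12),
}


def matrix_transform_rates(clock_mhz, cycles, macs):
    """Return (MFLOPS, million MACs/s, MVertices/s) for one matrix transform per vertex."""
    transforms = clock_mhz / cycles  # million transforms per second
    return (transforms * FLOPS_PER_TRANSFORM, transforms * macs, transforms)


rates = {name: matrix_transform_rates(*spec) for name, spec in CPUS.items()}
for name, (mflops, mmacs, mverts) in rates.items():
    print(f"{name}: {mflops:.0f} MFLOPS, {mmacs:.0f} million MACs/s, {mverts:.1f} MVertices/s")
```

The SH-4 results match the table exactly (1.4 GFLOPS, 800 million MACs/sec, 50 MVertices/s), while the Pentium results come out slightly above the table entries, which look like they were rounded down.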
Notes
- ↑ Matrix transformation (4×4 matrix × 4×1 vector) - 28 computations (16 multiplies, 12 adds)[12]
- ↑ 1.4 GFLOPS,[13][14] 7 floating-point operations per cycle (28 computations per 4 cycles)[15][16]
- ↑ 28 floating-point operations per 53 cycles[17]
- ↑ Pentium III: 28 floating-point operations per 31 cycles[17]; GeForce 256: T&L unit outperformed by Pentium III (742 MHz)[18]
- ↑ Emotion Engine FPU: 0.64 GFLOPS; Emotion Engine VU0/VU1: 5.52 GFLOPS
- ↑ Gekko: 1.94 GFLOPS (4 floating-point operations per cycle); Flipper: 7.533 GFLOPS (46.5 floating-point operations per cycle)[19]
- ↑ Pentium III: 662 MFLOPS (28 floating-point operations per 31 cycles); NV2A: 5.8 GFLOPS (24 floating-point operations per cycle)
- ↑ 4 MAC operations per cycle[16]
- ↑ 3.3125 cycles per MAC operation: 53 cycles per 12 MACs[17]
- ↑ 31 cycles per 12 MAC operations[17]
- ↑ 8 MAC operations per cycle (4 MAC operations per VU)[20]
- ↑ 19 MAC operations (12 MACs matrix transform, 7 MACs perspective transform) per cycle (8 cycles per 8 vertices)[19]
- ↑ Pentium III: 280 million MAC operations per second (31 cycles per 12 MAC operations); NV2A: 2.2 billion MAC operations per second (116.5 million vertices/sec, 19 MACs each; 9.5 MAC operations per cycle)
- ↑ 4 cycles per matrix transformation[21]
- ↑ 53 cycles per matrix transformation[17]
- ↑ 31 cycles per matrix transformation[17]
- ↑ 2 matrix transformations (1 transformation per VU) per 4 cycles[22]
- ↑ 1 matrix transformation per cycle[19]
- ↑ Pentium III: 23 million vertices per second (31 cycles per matrix transformation); NV2A: 116 million vertices per second
- ↑ MVertices/s = Million vertices per second
- ↑ 170 cycles per perspective transformation: 53 cycles matrix transformation, 117 cycles projection (2 multiplies, 2 adds, 3 divides)[23] (2 cycles per multiply, 1 cycle per add, 37 cycles per divide)[24]
- ↑ 86 cycles per perspective transformation: 31 cycles matrix transformation, 55 cycles projection (2 multiplies, 2 adds, 3 divides)[23] (1 cycle per multiply, 1 cycle per add, 17 cycles per divide)[25]
- ↑ 8 cycles per 8 perspective transformations in T&L pipeline[19]
- ↑ Pentium III: 8.5 million vertices per second (86 cycles per perspective transformation); NV2A: 116.5 million vertices per second (2 cycles per vertex)
- ↑ MPolygons/s = Million polygons per second
- ↑ 223 cycles per vertex: 170 cycles perspective transformation, 53 cycles lighting (21 multiplies, 11 adds),[23] 2 cycles per multiply, 1 cycle per add, 37 cycles per divide[24]
- ↑ 110 cycles per vertex: Pentium III (742 MHz) calculates 6,752,000 triangle strips per second, with 1 light, faster than GeForce 256's T&L unit[18]
- ↑ 15 cycles/vertex[26] per VU
- ↑ 14 cycles per 8 vertices[19]
- ↑ Pentium III: 6.6 million vertices per second (110 cycles per vertex); NV2A: 46 million vertices per second, 5 cycles per vertex (2 cycles transform, 21 MACs lighting)[23]
- ↑ 382 cycles per vertex: 170 cycles perspective transformation, 212 cycles lighting (84 multiplies, 44 adds),[23] 2 cycles per multiply, 1 cycle per add, 37 cycles per divide
- ↑ 136 cycles per vertex: Pentium III (742 MHz) calculates 5,453,000 triangle strips per second, with 4 lights, faster than GeForce 256's T&L unit[18]
- ↑ 60 cycles/vertex per VU: 4 light sources,[27] 15 cycles/vertex per light source[26]
- ↑ 63 cycles per 8 vertices[19]
- ↑ Pentium III: 5.3 million vertices per second (136 cycles per vertex); NV2A: 13 million vertices per second, 14 cycles per vertex (2 cycles transform, 3 cycles per light source)[23]
- ↑ High-end arcade revision of PowerVR2 with twice the performance of the Dreamcast's PowerVR CLX2
- ↑ 2x framebuffer (twice Voodoo2), 2x TMU (texture mapping units) (same as Voodoo2)
- ↑ Falcon Voodoo3 3500 TV Special Edition
- ↑ ISP unit's PE Array of 32 processor elements process 32 pixels per cycle
- ↑ 2 ISP units, PE Arrays of 64 processor elements process 64 pixels per cycle
- ↑ 720 MB/s framebuffer bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 646 MB/s available bandwidth, 6 bytes per pixel (16-bit color, 32-bit Z-buffer precision)
- ↑ 2.656 GB/s VRAM bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 2.582 GB/s available bandwidth, 6 bytes per pixel (16-bit color, 32-bit Z-buffer precision)
- ↑ 16 pixel pipelines
- ↑ 5.336 GB/s video RAM bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 5.262 GB/s available bandwidth, 6 bytes per pixel (double-buffered 16-bit color, 32-bit Z-buffer precision)
- ↑ 3.19 GB/s bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 3.116 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
- ↑ 2.582 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
- ↑ 5.262 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
- ↑ 3.116 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
- ↑ 2.582 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
- ↑ 5.262 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
- ↑ Low-quality compression[29]
- ↑ Low-quality compression[29]
- ↑ Bus interface transfers polygons and textures from CPU's main RAM to GPU's VRAM
- ↑ 1x AGP bus[30]
- ↑ 2x AGP bus[28][30]
- ↑ 2x AGP bus[30]
- ↑ Transmission bus from Pentium III 800EB (133 MHz FSB, 1 GB/s) to GeForce 256 (4x AGP)[30]
- ↑ 162 MHz (64-bit) CPU FSB
- ↑ 133 MHz (64-bit) CPU FSB
- ↑ 8.25 KB register memory, 12.25 KB ISP cache, 13 KB TSP cache, 256 bytes FIFO buffer
- ↑ 8.25 KB register memory, 24.5 KB ISP cache, 13 KB TSP cache, 256 bytes FIFO buffer
- ↑ 12 KB ISP Parameter Cache, 4 KB TSP Parameter Cache[31]
- ↑ 63.0 63.1 1.2 GB/s register memory, 12.8 GB/s ISP PE Array, 1.6 GB/s TSP cache
- ↑ 64.0 64.1 1.6 GB/s register memory, 25.6 GB/s ISP PE Array, 1.6 GB/s TSP cache
- ↑ 65.0 65.1 65.2 38.4 GB/s framebuffer, 9.6 GB/s texture cache
- ↑ 66.0 66.1 66.2 10.368 GB/s texture cache (512-bit, 162 MHz), 9.632 GB/s framebuffer
- ↑ 16 MB main RAM (accessible by SH-4 and CLX2), 8 MB VRAM (accessible by CLX2)
- ↑ 8 MB texture RAM, 8 MB (2x 4 MB) framebuffer RAM
- ↑ Accessible by Emotion Engine and Graphics Synthesizer
- ↑ 24 MB texture RAM compression, 8 MB framebuffer RAM
- ↑ 800 MB/s main RAM, 800 MB/s VRAM
- ↑ 800 MB/s main RAM, 1 GB/s VRAM, 612 MB/s VROM
- ↑ 73.0 73.1 73.2 90 MHz, 2x 64-bit TMU, 2x 64-bit framebuffer
- ↑ Accessible by Emotion Engine at 3.2 GB/s, accessible by Graphics Synthesizer through 1.2 GB/s transmission bus
- ↑ 162 MHz (128-bit) bus
- ↑ 6.4 GB/s RAM bandwidth - 1.064 GB/s CPU FSB bandwidth[32]
- ↑ 3.2 gigapixels/sec effective fillrate, 16-bit (2 bytes) per pixel
- ↑ 6 gigapixels/sec effective fillrate, 16-bit (2 bytes) per pixel
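Several of the notes above estimate the fillrate of immediate-mode GPUs from memory bandwidth: the cost of displaying a 640×480, 16-bit, double-buffered framebuffer at 60 FPS is subtracted from the total bandwidth, and the remainder is divided by the bytes touched per pixel. The following sketch reproduces that arithmetic for three of the pixel fillrate notes; the function names are illustrative, and the bandwidth figures are the ones quoted in the notes.

```python
def framebuffer_mb_per_s(width=640, height=480, bytes_per_pixel=2, buffers=2, fps=60):
    """Bandwidth consumed by the displayed framebuffer, in MB/s."""
    return width * height * bytes_per_pixel * buffers * fps / 1e6


def fillrate_mpixels(total_mb_per_s, bytes_per_pixel=6):
    """Bandwidth-limited fillrate in MPixels/s.

    6 bytes per pixel = 16-bit color write + 32-bit Z-buffer access, as in the notes.
    """
    return (total_mb_per_s - framebuffer_mb_per_s()) / bytes_per_pixel


scanout = framebuffer_mb_per_s()            # about 74 MB/s
voodoo2_sli = fillrate_mpixels(720)         # 720 MB/s framebuffer bandwidth
geforce_256 = fillrate_mpixels(2656)        # 2.656 GB/s VRAM bandwidth
xbox_nv2a = fillrate_mpixels(5336)          # 5.336 GB/s video RAM bandwidth
print(f"{scanout:.1f} MB/s, {voodoo2_sli:.0f}, {geforce_256:.0f}, {xbox_nv2a:.0f} MPixels/s")
```

The results are close to the table's 100, 430 and 870 MPixels/s entries, which appear to be rounded down. The texture fillrate notes use the same method with 8 or 10 bytes per pixel in place of 6.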
References
- ↑ 1.0 1.1 File:GamersRepublic US 03.pdf, page 29
- ↑ PC Magazine, December 1999, page 193
- ↑ 3.0 3.1 3.2 Test Drive: Le Mans (IGN)
- ↑ Actual HW T&L perfomance of NVIDIA GeForce/GeForce2 chips (IXBT Labs)
- ↑ PC Magazine, December 1999, page 203
- ↑ Unreal Modeling Guide (Unreal Developer Network)
- ↑ QIII Arena High Polygon Count
- ↑ DF Retro: Shenmue - A Game Ahead Of Its Time (Digital Foundry)
- ↑ 9.0 9.1 Hideki Sato Sega Interview (Edge)
- ↑ How Many Polygons Can the Dreamcast Render?
- ↑ File:HowFarHaveWeGot.pdf
- ↑ Design of Digital Systems and Devices (page 95)
- ↑ 13.0 13.1 Sega Dreamcast: Implementation (IEEE) (Wayback Machine: 2000-08-23 20:47)
- ↑ File:SH-4 Next-Generation DSP Architecture.pdf, page 5
- ↑ File:Entertainment Systems and High-Performance Processor SH-4.pdf, page 4
- ↑ 16.0 16.1 File:SH-4 Next-Generation DSP Architecture.pdf, page 31
- ↑ 17.0 17.1 17.2 17.3 17.4 17.5 File:Streaming SIMD Extensions - Matrix Multiplication.pdf, page 7
- ↑ 18.0 18.1 18.2 Benchmarking T&L in 3DMark 2000
- ↑ 19.0 19.1 19.2 19.3 19.4 19.5 Nikkei Electronics (2000/10/9)
- ↑ File:ThePowerOfPS2.pdf, page 6
- ↑ File:SH-4 Next-Generation DSP Architecture.pdf, page 12
- ↑ File:ThePowerOfPS2.pdf, page 12
- ↑ 23.0 23.1 23.2 23.3 23.4 23.5 Design of Digital Systems and Devices (pages 95-97)
- ↑ 24.0 24.1 File:Instruction Tables.pdf, page 107
- ↑ File:Instruction Tables.pdf, page 110
- ↑ 26.0 26.1 Procedural Rendering on Playstation 2 (page 4) (Gamasutra)
- ↑ 27.0 27.1 File:ThePowerOfPS2.pdf, page 4
- ↑ 28.0 28.1 Neon 250 Specs & Features (Wayback Machine: 2007-08-07 15:12)
- ↑ 29.0 29.1 Texture Limitations (Version 1.5 - Nov. 23, 1998)
- ↑ 30.0 30.1 30.2 30.3 AGP Peak Speeds
- ↑ PC 3D Graphics Accelerators FAQ: VideoLogic PowerVR
- ↑ Hardware Behind the Consoles - Part I: Microsoft's Xbox (Understanding the Hardware – The X-CPU) (AnandTech)
See also