Difference between revisions of "Sega Dreamcast/Hardware comparison"

From Sega Retro

(Added rewrite and clean up tags.)
(16 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
{{otherPage|desc=technical details on the Sega Dreamcast|page=Sega Dreamcast/Technical specifications}}
 
{{otherPage|desc=technical details on the Sega Dreamcast|page=Sega Dreamcast/Technical specifications}}
 
+
{{rewrite}}
 +
{{cleanup}}
 
This article presents a hardware comparison between the [[Sega Dreamcast]] and other rival systems in its time. It compares the technical specifications and hardware advantages/disadvantages between the systems.
 
This article presents a hardware comparison between the [[Sega Dreamcast]] and other rival systems in its time. It compares the technical specifications and hardware advantages/disadvantages between the systems.
  
Line 15: Line 16:
  
 
===Pentium, GeForce, Voodoo===
 
===Pentium, GeForce, Voodoo===
In most ways, the Dreamcast was generally the most powerful home system during 1998–1999, outperforming high-end PC hardware in most ways during that era.{{fileref|GamersRepublic US 03.pdf|page=29}} The Dreamcast's [[Hitachi]] [[SuperH|SH-4]] CPU calculates 3D graphics several times faster than a [[wikipedia:Pentium II|Pentium II]] from 1998,{{fileref|GamersRepublic US 03.pdf|page=29}} and faster than a [[wikipedia:Pentium III|Pentium III]] and [[NVIDIA]] [[wikipedia:GeForce256|GeForce 256]] from 1999. The Dreamcast's PowerVR CLX2 GPU, due to its tiled rendering architecture, also has has a higher [[fillrate]] and faster polygon rendering throughput than a [[wikipedia:Voodoo3|Voodoo3]] and GeForce 256 from 1999. On the other hand, the GeForce 256 has a higher fillrate for translucent polygons, whereas the Dreamcast's CLX2 has a higher fillrate for opaque polygons.
+
In most ways, the Dreamcast was generally the most powerful home system during 1998–1999, outperforming high-end PC hardware in most ways during that era.{{magref|gr|3|29}} The Dreamcast's [[Hitachi]] [[SuperH|SH-4]] CPU calculates 3D graphics several times faster than a [[wikipedia:Pentium II|Pentium II]] from 1998,{{magref|gr|3|29}} and faster than a [[wikipedia:Pentium III|Pentium III]] and [[NVIDIA]] [[wikipedia:GeForce256|GeForce 256]] from 1999. The Dreamcast's PowerVR CLX2 GPU, due to its tiled rendering architecture, also has has a higher [[fillrate]] and faster polygon rendering throughput than a [[wikipedia:Voodoo3|Voodoo3]] and GeForce 256 from 1999. On the other hand, the GeForce 256 has a higher fillrate for translucent polygons, whereas the Dreamcast's CLX2 has a higher fillrate for opaque polygons and an overall higher average fillrate (for scenes with both opaque and translucent polygons).
  
The Dreamcast's CPU–GPU transmission bus is faster than the Voodoo3 and has a higher effective bandwidth than the GeForce 256 due to the Dreamcast's efficient bandwidth usage, including its lack of CPU overhead from the [[wikipedia:Operating system|operating system]] and the CLX2's tiled rendering architecture: [[wikipedia:Texture mapping|textures]] loaded directly to VRAM (freeing up CPU–GPU transmission bus for polygons), higher [[wikipedia:Texture compression|texture compression]], on-chip tile buffer with internal [[wikipedia:Z-buffer|Z-buffering]], and [[wikipedia:Deferred shading|deferred rendering]] (no need to draw, [[wikipedia:Shading|shade]] or texture overdrawn polygons). The CLX2 is also capable of [[wikipedia:Order-independent transparency|order-independent transparency]] (which the Voodoo3 and GeForce 256 lacked) and [[wikipedia:Normal mapping|Dot3 normal mapping]] (which the Voodoo3 lacked).{{ref|1=''[[wikipedia:PC Magazine|PC Magazine]]'', [https://books.google.co.uk/books?id=90OvoBUqQoIC&pg=PA193 December 1999, page 193]}}
+
The Dreamcast's CPU–GPU transmission bus is faster than the Voodoo3 and has a higher effective bandwidth than the GeForce 256 due to the Dreamcast's efficient bandwidth usage, including its lack of CPU overhead from the [[wikipedia:Operating system|operating system]], [[wikipedia:Texture mapping|textures]] loaded directly to VRAM (freeing up CPU–GPU transmission bus for polygons), higher [[wikipedia:Texture compression|texture compression]], and the CLX2's tiled rendering architecture: on-chip tile buffer with internal [[wikipedia:Z-buffer|Z-buffering]], and [[wikipedia:Deferred shading|deferred rendering]] (no need to draw, [[wikipedia:Shading|shade]] or texture overdrawn polygons). The CLX2 is also capable of [[wikipedia:Order-independent transparency|order-independent transparency]] (which the Voodoo3 and GeForce 256 lacked) and [[wikipedia:Normal mapping|Dot3 normal mapping]] (which the Voodoo3 lacked).{{ref|1=''[[wikipedia:PC Magazine|PC Magazine]]'', [https://books.google.co.uk/books?id=90OvoBUqQoIC&pg=PA193 December 1999, page 193]}}
  
 
In terms of game engine performance, the CLX2 peaks at 5 million polygons/sec,{{ref|[http://planetdc.segaretro.org/games/reviews/testdrivelemans/index.html Test Drive: Le Mans] ([[wikipedia:IGN|IGN]])}} compared to the GeForce 256 which peaks at 2.9 million polygons/sec.{{ref|[http://ixbtlabs.com/articles/gf2hwtl/ Actual HW T&L perfomance of NVIDIA GeForce/GeForce2 chips (IXBT Labs)]}} Dreamcast game engines rendered 50,000–160,000 polygons per scene (3–5 million polygons/sec),{{ref|[http://planetdc.segaretro.org/games/reviews/testdrivelemans/index.html Test Drive: Le Mans] ([[wikipedia:IGN|IGN]])}} while PC game engines of 1999 rendered up to 10,000 polygons per scene{{ref|1=''[[wikipedia:PC Magazine|PC Magazine]]'', [https://books.google.co.uk/books?id=90OvoBUqQoIC&pg=PA203 December 1999, page 203]}}{{ref|[https://udn.epicgames.com/Two/UnrealModeling.html Unreal Modeling Guide (Unreal Developer Network)]}} (1–1.6 million polygons/sec).{{ref|[http://gamepilgrimage.com/sites/default/files/32-bitCompare/6thGen/QuakeIIIArena/QIIIPolygonCountsEstimated.png QIII Arena High Polygon Count]}} Character models in particular were significantly more detailed in Dreamcast games than in PC games during 1998–1999.{{ref|1=[https://www.youtube.com/watch?v=c0blSBgpRUg DF Retro: Shenmue - A Game Ahead Of Its Time] ([[wikipedia:Eurogamer|Digital Foundry]])}}
 
In terms of game engine performance, the CLX2 peaks at 5 million polygons/sec,{{ref|[http://planetdc.segaretro.org/games/reviews/testdrivelemans/index.html Test Drive: Le Mans] ([[wikipedia:IGN|IGN]])}} compared to the GeForce 256 which peaks at 2.9 million polygons/sec.{{ref|[http://ixbtlabs.com/articles/gf2hwtl/ Actual HW T&L perfomance of NVIDIA GeForce/GeForce2 chips (IXBT Labs)]}} Dreamcast game engines rendered 50,000–160,000 polygons per scene (3–5 million polygons/sec),{{ref|[http://planetdc.segaretro.org/games/reviews/testdrivelemans/index.html Test Drive: Le Mans] ([[wikipedia:IGN|IGN]])}} while PC game engines of 1999 rendered up to 10,000 polygons per scene{{ref|1=''[[wikipedia:PC Magazine|PC Magazine]]'', [https://books.google.co.uk/books?id=90OvoBUqQoIC&pg=PA203 December 1999, page 203]}}{{ref|[https://udn.epicgames.com/Two/UnrealModeling.html Unreal Modeling Guide (Unreal Developer Network)]}} (1–1.6 million polygons/sec).{{ref|[http://gamepilgrimage.com/sites/default/files/32-bitCompare/6thGen/QuakeIIIArena/QIIIPolygonCountsEstimated.png QIII Arena High Polygon Count]}} Character models in particular were significantly more detailed in Dreamcast games than in PC games during 1998–1999.{{ref|1=[https://www.youtube.com/watch?v=c0blSBgpRUg DF Retro: Shenmue - A Game Ahead Of Its Time] ([[wikipedia:Eurogamer|Digital Foundry]])}}
Line 34: Line 35:
 
The [[GameCube]] and [[Xbox]] are both generally more powerful than the Dreamcast, but the Dreamcast has several hardware advantages. The [[wikipedia:Transform and lighting|T&L]] geometry performance of the Dreamcast's SH-4 CPU is faster than the Xbox's Pentium III CPU but slower than the GameCube's PowerPC CPU; however, the GameCube and Xbox have T&L GPU, each with faster geometry performance than the Dreamcast.
 
The [[GameCube]] and [[Xbox]] are both generally more powerful than the Dreamcast, but the Dreamcast has several hardware advantages. The [[wikipedia:Transform and lighting|T&L]] geometry performance of the Dreamcast's SH-4 CPU is faster than the Xbox's Pentium III CPU but slower than the GameCube's PowerPC CPU; however, the GameCube and Xbox have T&L GPU, each with faster geometry performance than the Dreamcast.
  
The Dreamcast has an on-chip Z-buffer, which the GameCube also has but the Xbox lacks. The Dreamcast has a faster Z-buffer bandwidth than both, giving it a higher opaque fillrate, but with lower translucent fillrate. The higher opaque fillrate allows the Dreamcast to draw a higher number of large opaque polygons, whereas the GameCube and Xbox can draw a higher number of small polygons and/or translucent polygons.
+
The Dreamcast has an on-chip Z-buffer, which the GameCube also has but the Xbox lacks. The Dreamcast has a faster Z-buffer bandwidth than both. Its tiled rendering also gives it a higher opaque fillrate, but with lower translucent fillrate. The higher opaque fillrate allows the Dreamcast to draw a higher number of large opaque polygons, whereas the GameCube and Xbox can draw a higher number of small polygons and/or translucent polygons.
  
 
==Graphics comparison table==
 
==Graphics comparison table==
Line 50: Line 51:
 
! scope="col" | [[Xbox]] (2001)
 
! scope="col" | [[Xbox]] (2001)
 
|-
 
|-
! colspan="2" | [[wikipedia:Geometry pipelines|Geometry processor(s)]]
+
! colspan="2" | [[wikipedia:Geometry pipelines|Geometry processor]]
 
! [[Hitachi]] [[SuperH|SH-4]] <br> (200 MHz)
 
! [[Hitachi]] [[SuperH|SH-4]] <br> (200 MHz)
 
! Hitachi SH-4 <br> (200 MHz)
 
! Hitachi SH-4 <br> (200 MHz)
 
! [[wikipedia:Pentium II|Intel Pentium II]] <br> (450 MHz)
 
! [[wikipedia:Pentium II|Intel Pentium II]] <br> (450 MHz)
! colspan="3" style="text-align:center;" | [[wikipedia:Pentium III|Intel Pentium III 800EB]] (800 MHz), <br> [[NVIDIA]] [[wikipedia:GeForce 256|GeForce 256]] (120 MHz)
+
! colspan="3" style="text-align:center;" | [[wikipedia:Pentium III|Intel Pentium III 800EB]] <br> (800 MHz){{ref|GeForce 256 T&L unit outperformed by Pentium III (742 MHz){{ref|[https://www.beyond3d.com/content/articles/50/ Benchmarking T&L in 3DMark 2000]}}|group=n}}
 
! [[wikipedia:Emotion Engine|Emotion Engine]] <br> (294.912 MHz)
 
! [[wikipedia:Emotion Engine|Emotion Engine]] <br> (294.912 MHz)
! [[wikipedia:Gekko (microprocessor)|Gekko]] (485 MHz), <br> [[wikipedia:Nintendo GameCube technical specifications|Flipper]] (162 MHz)
+
! [[wikipedia:Nintendo GameCube technical specifications|ATI Flipper]] <br> (162 MHz){{ref|[[wikipedia:Gekko (microprocessor)|Gekko]] (485 MHz) CPU could be used as an alternative geometry processor.|group=n}}
! Pentium III (733 MHz), <br> [[wikipedia:Xbox technical specifications|NV2A]] (233 MHz)
+
! [[Nvidia]] [[wikipedia:Xbox technical specifications|NV2A]] <br> (233 MHz){{ref|Pentium III (733 MHz) CPU could be used as an alternative geometry processor.|group=n}}
 
|-
 
|-
! rowspan="3" | [[wikipedia:Transformation matrix|Matrix <br> transformations]] <br> (4×4){{ref|Matrix transformation (4×4 matrix × 4×1 vector) - 28 computations (16 multiplies, 12 adds){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (page 95)]}}|group=n}}
+
! rowspan="3" | [[wikipedia:Transformation matrix|Matrix <br> transformations]] <br> {{ref|Matrix transformation (4×4 matrix × 4×1 vector) - 28 computations (16 multiplies, 12 adds){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (page 95)]}}|group=n}}
 
! Matrix [[wikipedia:FLOPS|FLOPS]]
 
! Matrix [[wikipedia:FLOPS|FLOPS]]
 
| 1.4 [[wikipedia:GFLOPS|GFLOPS]]{{ref|1.4 GFLOPS,{{ref|http://web.archive.org/web/20000823204755/computer.org/micro/articles/dreamcast_2.htm}}{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=5}} 7 floating-point operations per cycle (28 computations per 4 cycles){{fileref|Entertainment Systems and High-Performance Processor SH-4.pdf|page=4}}{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=31}}|group=n}}
 
| 1.4 [[wikipedia:GFLOPS|GFLOPS]]{{ref|1.4 GFLOPS,{{ref|http://web.archive.org/web/20000823204755/computer.org/micro/articles/dreamcast_2.htm}}{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=5}} 7 floating-point operations per cycle (28 computations per 4 cycles){{fileref|Entertainment Systems and High-Performance Processor SH-4.pdf|page=4}}{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=31}}|group=n}}
 
| 1.4 GFLOPS
 
| 1.4 GFLOPS
 
| 230 [[wikipedia:MFLOPS|MFLOPS]]{{ref|28 floating-point operations per 53 cycles{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
 
| 230 [[wikipedia:MFLOPS|MFLOPS]]{{ref|28 floating-point operations per 53 cycles{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
| colspan="3" style="text-align:center;" | 720 MFLOPS{{ref|Pentium III: 28 floating-point operations per 31 cycles for 4×4 matrix transformation{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}} <br> GeForce 256: T&L unit outperformed by Pentium III (742 MHz){{ref|[https://www.beyond3d.com/content/articles/50/ Benchmarking T&L in 3DMark 2000]}}|group=n}}
+
| colspan="3" style="text-align:center;" | 1.1 GFLOPS{{ref|24 floating-point operations per 17 cycles{{ref|[http://www.cortstratton.org/articles/OptimizingForSSE.php Optimizing for SSE: A Case Study]}}|group=n}}
 
| 5.5 GFLOPS{{ref|Emotion Engine FPU: 0.64 GFLOPS <br> Emotion Engine VU0/VU1: 5.52 GFLOPS|group=n}}
 
| 5.5 GFLOPS{{ref|Emotion Engine FPU: 0.64 GFLOPS <br> Emotion Engine VU0/VU1: 5.52 GFLOPS|group=n}}
| 7.5 GFLOPS{{ref|Gekko: 1.94 GFLOPS (4 floating-point operations per cycle) <br> Flipper: 7.533 GFLOPS (46.5 floating-point operations per cycle){{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
+
| 7.5 GFLOPS{{ref|Flipper: 7.533 GFLOPS (46.5 floating-point operations per cycle){{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}} <br> Gekko: 1.94 GFLOPS (4 floating-point operations per cycle)|group=n}}
| 5.8 GFLOPS{{ref|Pentium III: 662 MFLOPS (28 floating-point operations per 31 cycles) <br> NV2A: 5.8 GFLOPS (24 floating-point operations per cycle)|group=n}}
+
| 5.8 GFLOPS{{ref|NV2A: 5.8 GFLOPS (24 floating-point operations per cycle) <br> Pentium III: 1 GFLOPS (24 floating-point operations per 17 cycles)|group=n}}
 
|-
 
|-
 
! [[wikipedia:Multiply–accumulate operation|MACs]]/sec
 
! [[wikipedia:Multiply–accumulate operation|MACs]]/sec
 
| 800 million{{ref|4 MAC operations per cycle{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=31}}|group=n}}
 
| 800 million{{ref|4 MAC operations per cycle{{fileref|SH-4 Next-Generation DSP Architecture.pdf|page=31}}|group=n}}
 
| 800 million
 
| 800 million
| 100 million{{ref|3.3125 cycles per MAC operation: 53 cycles per 12 MACs{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
+
| 130 million{{ref|3.3125 cycles per MAC operation: 53 cycles per 12 MACs{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
| colspan="3" style="text-align:center;" | 300 million{{ref|31 cycles per 12 MAC operations{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
+
| colspan="3" style="text-align:center;" | 420 million{{ref|17 cycles per 9 MAC operations{{ref|[http://www.cortstratton.org/articles/OptimizingForSSE.php Optimizing for SSE: A Case Study]}}|group=n}}
 
| 2 billion{{ref|8 MAC operations per cycle (4 MAC operations per VU){{fileref|ThePowerOfPS2.pdf|page=6}}|group=n}}
 
| 2 billion{{ref|8 MAC operations per cycle (4 MAC operations per VU){{fileref|ThePowerOfPS2.pdf|page=6}}|group=n}}
 
| 3 billion{{ref|19 MAC operations (12 MACs matrix transform, 7 MACs perspective transform) per cycle (8 cycles per 8 vertices){{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
 
| 3 billion{{ref|19 MAC operations (12 MACs matrix transform, 7 MACs perspective transform) per cycle (8 cycles per 8 vertices){{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
| 2 billion{{ref|Pentium III: 280 million MAC operations per second (31 cycles per 12 MAC operations) <br> NV2A: 2.2 billion MAC operations per second (116.5 million vertices/sec, 19 MACs each) per second (9.5 MAC operations per cycle)|group=n}}
+
| 2 billion{{ref|NV2A: 2.2 billion MAC operations per second (116.5 million vertices/sec, 19 MACs each) per second (9.5 MAC operations per cycle) <br> Pentium III: 380 million MAC operations per second (17 cycles per 9 MAC operations)|group=n}}
 
|-
 
|-
 
! [[wikipedia:Vertex (computer graphics)|Vertices]]
 
! [[wikipedia:Vertex (computer graphics)|Vertices]]
Line 82: Line 83:
 
| 50 MVertices/s
 
| 50 MVertices/s
 
| 8.4 MVertices/s{{ref|53 cycles per matrix transformation{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
 
| 8.4 MVertices/s{{ref|53 cycles per matrix transformation{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
| colspan="3" style="text-align:center;" | 25 MVertices/s{{ref|31 cycles per matrix transformation{{fileref|Streaming SIMD Extensions - Matrix Multiplication.pdf|page=7}}|group=n}}
+
| colspan="3" style="text-align:center;" | 47 MVertices/s{{ref|17 cycles per matrix transformation{{ref|[http://www.cortstratton.org/articles/OptimizingForSSE.php Optimizing for SSE: A Case Study]}}|group=n}}
 
| 140 MVertices/s{{ref|2 matrix transformations (1 transformation per VU) per 4 cycles{{fileref|ThePowerOfPS2.pdf|page=12}}|group=n}}
 
| 140 MVertices/s{{ref|2 matrix transformations (1 transformation per VU) per 4 cycles{{fileref|ThePowerOfPS2.pdf|page=12}}|group=n}}
 
| 162 MVertices/s{{ref|1 matrix transformation per cycle{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
 
| 162 MVertices/s{{ref|1 matrix transformation per cycle{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
| 116 MVertices/s{{ref|Pentium III: 23 million vertices per second (31 cycles per matrix transformation) <br> NV2A: 116 million vertices per second|group=n}}
+
| 116 MVertices/s{{ref|NV2A: 116 million vertices per second <br> Pentium III: 43 million vertices per second (17 cycles per matrix transformation)|group=n}}
 
|-
 
|-
 
! colspan="2" | [[wikipedia:3D projection|Perspective transformations]]
 
! colspan="2" | [[wikipedia:3D projection|Perspective transformations]]
Line 91: Line 92:
 
| 16 MVertices/s
 
| 16 MVertices/s
 
| 2.6 MVertices/s{{ref|170 cycles per perspective transformation: 53 cycles matrix transformation, 117 cycles projection (2 multiplies, 2 adds, 3 divides){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} (2 cycles per multiply, 1 cycle per add, 37 cycles per divide){{fileref|Instruction Tables.pdf|page=107}}|group=n}}
 
| 2.6 MVertices/s{{ref|170 cycles per perspective transformation: 53 cycles matrix transformation, 117 cycles projection (2 multiplies, 2 adds, 3 divides){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} (2 cycles per multiply, 1 cycle per add, 37 cycles per divide){{fileref|Instruction Tables.pdf|page=107}}|group=n}}
| colspan="3" style="text-align:center;" | 9.3 MVertices/s{{ref|86 cycles per perspective transformation: 31 cycles matrix transformation, 55 cycles projection (2 multiplies, 2 adds, 3 divides){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} (1 cycle per multiply, 1 cycle per add, 17 cycles per divide){{fileref|Instruction Tables.pdf|page=110}}|group=n}}
+
| colspan="3" style="text-align:center;" | 11 MVertices/s{{ref|72 cycles per perspective transformation: 17 cycles matrix transformation, 55 cycles projection (2 multiplies, 2 adds, 3 divides){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} (1 cycle per multiply, 1 cycle per add, 17 cycles per divide){{fileref|Instruction Tables.pdf|page=110}}|group=n}}
 
| 80 MVertices/s
 
| 80 MVertices/s
 
| 160 MVertices/s{{ref|8 cycles per 8 perspective transformations in T&L pipeline{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
 
| 160 MVertices/s{{ref|8 cycles per 8 perspective transformations in T&L pipeline{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
| 110 MVertices/s{{ref|Pentium III: 8.5 million vertices per second (86 cycles per perspective transformation) <br> NV2A: 116.5 million vertices per second (2 cycles per vertex)|group=n}}
+
| 110 MVertices/s{{ref|NV2A: 116.5 million vertices per second (2 cycles per vertex) <br> Pentium III: 10 million vertices per second (72 cycles per perspective transformation)|group=n}}
 
|-
 
|-
 
! rowspan="2" | [[wikipedia:Transform, clipping, and lighting|Lighting]]
 
! rowspan="2" | [[wikipedia:Transform, clipping, and lighting|Lighting]]
Line 104: Line 105:
 
| 39 MPolygons/s{{ref|15 cycles/vertex{{ref|1=[http://www.gamasutra.com/view/feature/131444/procedural_rendering_on_.php?page=4 Procedural Rendering on Playstation 2 (page 4)] ([[wikipedia:Gamasutra|Gamasutra]])}} per VU|group=n}}
 
| 39 MPolygons/s{{ref|15 cycles/vertex{{ref|1=[http://www.gamasutra.com/view/feature/131444/procedural_rendering_on_.php?page=4 Procedural Rendering on Playstation 2 (page 4)] ([[wikipedia:Gamasutra|Gamasutra]])}} per VU|group=n}}
 
| 90 MPolygons/s{{ref|14 cycles per 8 vertices{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
 
| 90 MPolygons/s{{ref|14 cycles per 8 vertices{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
| 46 MPolygons/s{{ref|Pentium III: 6.6 million vertices per second (110 cycles per vertex) <br> NV2A: 46 million vertices per second, 5 cycles per vertex (2 cycles transform, 21 MACs lighting){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}}|group=n}}
+
| 46 MPolygons/s{{ref|NV2A: 46 million vertices per second, 5 cycles per vertex (2 cycles transform, 21 MACs lighting){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} <br> Pentium III: 6.6 million vertices per second (110 cycles per vertex)|group=n}}
 
|-
 
|-
 
! 4 light sources
 
! 4 light sources
Line 113: Line 114:
 
| 9.8 MPolygons/s{{ref|60 cycles/vertex per VU: 4 light sources,{{fileref|ThePowerOfPS2.pdf|page=4}} 15 cycles/vertex per light source{{ref|1=[http://www.gamasutra.com/view/feature/131444/procedural_rendering_on_.php?page=4 Procedural Rendering on Playstation 2 (page 4)] ([[wikipedia:Gamasutra|Gamasutra]])}}|group=n}}
 
| 9.8 MPolygons/s{{ref|60 cycles/vertex per VU: 4 light sources,{{fileref|ThePowerOfPS2.pdf|page=4}} 15 cycles/vertex per light source{{ref|1=[http://www.gamasutra.com/view/feature/131444/procedural_rendering_on_.php?page=4 Procedural Rendering on Playstation 2 (page 4)] ([[wikipedia:Gamasutra|Gamasutra]])}}|group=n}}
 
| 20 MPolygons/s{{ref|63 cycles per 8 vertices{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
 
| 20 MPolygons/s{{ref|63 cycles per 8 vertices{{ref|''[[wikipedia:The Nikkei|Nikkei Electronics]]'' (2000/10/9)}}|group=n}}
| 16 MPolygons/s{{ref|Pentium III: 5.3 million vertices per second (136 cycles per vertex) <br> NV2A: 13 million vertices per second, 14 cycles per vertex (2 cycles transform, 3 cycles per light source){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}}|group=n}}
+
| 16 MPolygons/s{{ref|NV2A: 13 million vertices per second, 14 cycles per vertex (2 cycles transform, 3 cycles per light source){{ref|1=[https://books.google.co.uk/books?id=iAvHt5RCHbMC&pg=PA95 ''Design of Digital Systems and Devices'' (pages 95-97)]}} <br> Pentium III: 5.3 million vertices per second (136 cycles per vertex)|group=n}}
 
|-
 
|-
 
! colspan="2" | [[wikipedia:Rendering pipeline|Rendering processors]]
 
! colspan="2" | [[wikipedia:Rendering pipeline|Rendering processors]]
Line 123: Line 124:
 
! GeForce 256 <br> (120 MHz)
 
! GeForce 256 <br> (120 MHz)
 
! [[wikipedia:Graphics Synthesizer|Graphics Synthesizer]] <br> (147.456 MHz)
 
! [[wikipedia:Graphics Synthesizer|Graphics Synthesizer]] <br> (147.456 MHz)
! Flipper (162 MHz)
+
! Flipper <br> (162 MHz)
! NV2A (233 MHz)
+
! NV2A <br> (233 MHz)
 
|-
 
|-
 
! rowspan="2" | [[wikipedia:Tiled rendering|Tiled <br> rendering]]
 
! rowspan="2" | [[wikipedia:Tiled rendering|Tiled <br> rendering]]
Line 273: Line 274:
 
| 6:1 (S3TC)
 
| 6:1 (S3TC)
 
|-
 
|-
! rowspan="2" | CPU–GPU <br> transfer <br> bus{{ref|Bus interface transfers polygons and textures from CPU's main [[RAM]] to GPU's [[VRAM]]|group=n}}
+
! rowspan="2" | CPU–GPU <br> transfer bus <br> {{ref|Bus interface transfers polygons and/or textures from the CPU to the GPU's [[VRAM]]|group=n}}
 
! [[Byte|Bandwidth]]
 
! [[Byte|Bandwidth]]
 
| 800 [[Byte|MB/s]]{{ref|http://web.archive.org/web/20000823204755/computer.org/micro/articles/dreamcast_2.htm}}
 
| 800 [[Byte|MB/s]]{{ref|http://web.archive.org/web/20000823204755/computer.org/micro/articles/dreamcast_2.htm}}
Line 369: Line 370:
 
|-
 
|-
 
! Texture buffer
 
! Texture buffer
| 800 MB/s <br> (compress 6 GB/s)
+
| 800 MB/s <br> (compressed 6 GB/s)
| 1 GB/s <br> (compress 7 GB/s)
+
| 1 GB/s <br> (compressed 7 GB/s)
| 720 MB/s{{ref|90 MHz, 2x 64-bit TMU, 2x 64-bit framebuffer|group=n}} <br> (compress 2.1 GB/s)
+
| 720 MB/s{{ref|90 MHz, 2x 64-bit TMU, 2x 64-bit framebuffer|group=n}}
 
| 9.6 GB/s{{ref|38.4 GB/s framebuffer, 9.6 GB/s texture cache|group=n}}
 
| 9.6 GB/s{{ref|38.4 GB/s framebuffer, 9.6 GB/s texture cache|group=n}}
 
| 10 GB/s{{ref|10.368 GB/s texture cache (512-bit, 162 MHz), 9.632 GB/s framebuffer|group=n}}
 
| 10 GB/s{{ref|10.368 GB/s texture cache (512-bit, 162 MHz), 9.632 GB/s framebuffer|group=n}}

Revision as of 18:01, 24 October 2022

For technical details on the Sega Dreamcast, see Sega Dreamcast/Technical specifications.


This article needs to be rewritten.
This article needs to be rewritten to conform to a higher standard of article quality. After the article has been rewritten, you may remove this message. For help, see the How to Edit a Page article.
This article needs cleanup.
This article needs to be edited to conform to a higher standard of article quality. After the article has been cleaned up, you may remove this message. For help, see the How to Edit a Page article.

This article compares the hardware of the Sega Dreamcast with that of rival systems of its time, covering technical specifications and each system's hardware advantages and disadvantages.

Vs. Arcade

The Sega Dreamcast's arcade counterpart, the Sega NAOMI, has the same CPU, the Hitachi SH-4, at the same clock rate, but is more powerful in other ways, including an updated PowerVR2 GPU with faster performance, additional RAM and VRAM, higher bandwidth, and faster ROM cartridge storage. The NAOMI was released at $1995, ten times the price of the Dreamcast and more expensive than a high-end PC at the time, but cheaper than the Sega Model 3 arcade system (which debuted at $20,000 in 1996).

The NAOMI was, in turn, the basis for two significantly more powerful arcade systems, the Hikaru (debuted 1999) and NAOMI 2 (debuted 2000). Sega later packaged the Dreamcast into an arcade board as the Atomiswave. While the Dreamcast is not as powerful as 1997–1999 Sega arcade hardware, including the Model 3 Step 2 (debuted 1997), NAOMI, and Hikaru, the Dreamcast surpassed the Model 3 Step 1 (debuted 1996) in performance.[1]

Vs. PC

Neon 250

The Dreamcast's PowerVR CLX2 GPU was the basis for the PowerVR PMX1, a PC GPU released with the Neon 250 graphics card in 1999. However, the Neon 250 lacks many of the tiled rendering features of the CLX2: the tile size is halved (halving the fillrate), it lacks the CLX2's internal Z-buffering and alpha test capability with hardware front-to-back translucency sorting (further reducing the fillrate and performance, as well as requiring the Neon 250 to render a Z-buffer externally), and the tiling is partially handled by software (the CLX2 handles the tiling entirely in hardware). The Neon 250 also lacks the CLX2's latency buffering and palettized texture support while VQ texture compression performance is halved, and it has bus contention due to having a single data bus (whereas the CLX2 has two data buses).

The PowerVR2 was also optimized for the Hitachi SH-4's geometry processing capabilities (rather than for a Pentium II or III), while PC drivers and software were not optimized for the Neon 250's tiled rendering architecture (compared to Dreamcast games which were optimized for the CLX2's tiled rendering architecture). The Neon 250 thus had only a fraction of the Dreamcast CLX2's fillrate and rendering performance. The reduction in performance from the Dreamcast's CLX2 to the Neon 250 was comparable to the reduction in performance from the Sega Model 3's Real3D Pro-1000 to the Intel740.

Pentium, GeForce, Voodoo

The Dreamcast was generally the most powerful home system during 1998–1999, outperforming high-end PC hardware of that era in most ways.[2] The Dreamcast's Hitachi SH-4 CPU calculates 3D graphics several times faster than a Pentium II from 1998,[2] and faster than a Pentium III and NVIDIA GeForce 256 from 1999. The Dreamcast's PowerVR CLX2 GPU, due to its tiled rendering architecture, also has a higher fillrate and faster polygon rendering throughput than a Voodoo3 and GeForce 256 from 1999. On the other hand, the GeForce 256 has a higher fillrate for translucent polygons, whereas the Dreamcast's CLX2 has a higher fillrate for opaque polygons and a higher average fillrate overall (for scenes with both opaque and translucent polygons).

The Dreamcast's CPU–GPU transmission bus is faster than the Voodoo3's and has a higher effective bandwidth than the GeForce 256's due to the Dreamcast's efficient bandwidth usage, including its lack of CPU overhead from the operating system, textures loaded directly to VRAM (freeing up the CPU–GPU transmission bus for polygons), higher texture compression, and the CLX2's tiled rendering architecture: an on-chip tile buffer with internal Z-buffering, and deferred rendering (no need to draw, shade or texture overdrawn polygons). The CLX2 is also capable of order-independent transparency (which the Voodoo3 and GeForce 256 lacked) and Dot3 normal mapping (which the Voodoo3 lacked).[3]

In terms of game engine performance, the CLX2 peaks at 5 million polygons/sec,[4] compared to the GeForce 256 which peaks at 2.9 million polygons/sec.[5] Dreamcast game engines rendered 50,000–160,000 polygons per scene (3–5 million polygons/sec),[4] while PC game engines of 1999 rendered up to 10,000 polygons per scene[6][7] (1–1.6 million polygons/sec).[8] Character models in particular were significantly more detailed in Dreamcast games than in PC games during 1998–1999.[9]
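The per-scene and per-second polygon figures above are linked by frame rate. The following is a minimal Python sketch of that relationship, not a measurement: the function name is hypothetical, and pairing the low per-scene count with the low throughput figure (and the high with the high) is an assumption, since the cited sources do not state which frame rate goes with which figure.

```python
def implied_frame_rate(polygons_per_second: float, polygons_per_scene: float) -> float:
    """Frames per second implied by a throughput figure and a per-scene polygon count."""
    if polygons_per_scene <= 0:
        raise ValueError("polygons_per_scene must be positive")
    return polygons_per_second / polygons_per_scene


# Dreamcast figures quoted above: 50,000-160,000 polygons per scene,
# 3-5 million polygons/sec. The pairing of endpoints is an assumption.
low_end = implied_frame_rate(3_000_000, 50_000)    # 60.0
high_end = implied_frame_rate(5_000_000, 160_000)  # 31.25

print(f"50,000 polygons/scene at 3M polygons/sec  -> {low_end} frames/sec")
print(f"160,000 polygons/scene at 5M polygons/sec -> {high_end} frames/sec")
```

Under that pairing, the quoted range is consistent with lighter scenes running at about 60 frames per second and the heaviest scenes at about 30.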

Vs. Consoles

PlayStation 2

Compared to the rival PlayStation 2, the Dreamcast is more effective at textures, anti-aliasing, and image quality, while the PS2 is more effective at polygon geometry, physics, particles, and lighting. The PS2 has a more powerful CPU geometry engine, higher translucent fillrate, and more main RAM (32 MB, compared to Dreamcast's 16 MB), while the DC has more VRAM (8 MB, compared to PS2's 4 MB), higher opaque fillrate, and more GPU hardware features, with CLX2 capabilities like tiled rendering, super-sample anti-aliasing, Dot3 normal mapping, order-independent transparency, and texture compression, which the PS2's GPU lacks.

With larger VRAM and tiled rendering, the DC can render a larger framebuffer at a higher native resolution (with an on-chip Z-buffer), and with texture compression it can fit around 20–60 MB of texture data in its VRAM. Because the PS2 has only 4 MB of VRAM, it relies on main RAM to store textures. While the PS2's CPU–GPU transmission bus for transferring polygons and textures is 50% faster than the Dreamcast's, the DC loads textures directly to VRAM (freeing up the CPU–GPU transmission bus for polygons), and its texture compression gives it a higher effective texture bandwidth.
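
The effective texture bandwidth argument can be illustrated with simple arithmetic. This is a rough sketch, assuming that effective bandwidth is the raw bus bandwidth multiplied by the texture compression ratio; the bus and ratio figures are the ones listed in this article's graphics comparison table, and the function name is hypothetical.

```python
def effective_texture_bandwidth(bus_mb_per_s: float, compression_ratio: float) -> float:
    """Uncompressed-equivalent texture data moved per second, in MB/s.

    Assumes every byte on the bus carries compressed texture data.
    """
    return bus_mb_per_s * compression_ratio


# Bus bandwidth and compression ratio as listed in the comparison table.
dreamcast = effective_texture_bandwidth(800, 7.98)  # 800 MB/s bus, 7.98:1 VQ
ps2 = effective_texture_bandwidth(1200, 3.0)        # 1.2 GB/s bus, 3:1 palette

print(f"Dreamcast: {dreamcast:.0f} MB/s")
print(f"PlayStation 2: {ps2:.0f} MB/s")
```

The results are close to the "Texture compression" bandwidth entries in the table (6.3 GB/s and 3.6 GB/s).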

Dreamcast games were effectively using 20–30 MB of texture data[1] (compressed to around 5–6 MB),[10] while PS2 games up until 2003 peaked at 5.5 MB of texture data (average 1.5 MB). PS2 games up until 2003 rendered up to 7.5 million polygons/sec (145,000 polygons per scene), with most rendering 2–5 million polygons/sec (average 52,000 polygons per scene);[11] in comparison, Dreamcast game engines rendered up to 5 million polygons/sec (160,000 polygons per scene),[4] with most games rendering 2–4 million polygons/sec (average 50,000 polygons per scene).

The Dreamcast is easier to develop for than the PS2; this is the reverse of the 32-bit era, when the PlayStation was the more developer-friendly system and the Saturn the more difficult one.

GameCube and Xbox

The GameCube and Xbox are both generally more powerful than the Dreamcast, but the Dreamcast has several hardware advantages. The T&L geometry performance of the Dreamcast's SH-4 CPU is faster than that of the Xbox's Pentium III CPU but slower than that of the GameCube's PowerPC CPU; however, the GameCube and Xbox each have a GPU with hardware T&L, with faster geometry performance than the Dreamcast.

The Dreamcast has an on-chip Z-buffer, which the GameCube also has but the Xbox lacks. The Dreamcast has a higher Z-buffer bandwidth than both. Its tiled rendering also gives it a higher opaque fillrate, but a lower translucent fillrate. The higher opaque fillrate allows the Dreamcast to draw a higher number of large opaque polygons, whereas the GameCube and Xbox can draw a higher number of small polygons and/or translucent polygons.
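
The trade-off between large and small polygons can be sketched with a simple bound. This is an illustrative model only, not a description of how any of these GPUs schedules work: it assumes polygon throughput is the lower of a fixed per-polygon limit and the opaque fillrate divided by polygon size, and it takes the table's 32-pixel polygon rate as that per-polygon limit. The function name is hypothetical.

```python
def polygon_throughput(per_polygon_limit: float, fillrate: float, pixels_per_polygon: int) -> float:
    """Polygons per second under a simple min(setup limit, fill limit) model."""
    fill_limited = fillrate / pixels_per_polygon
    return min(per_polygon_limit, fill_limited)


# Opaque fillrate and 32-pixel polygon rate as listed in the comparison table.
dreamcast_100px = polygon_throughput(7.1e6, 3.2e9, 100)  # limited per polygon
gamecube_100px = polygon_throughput(20e6, 648e6, 100)    # limited by fillrate
gamecube_32px = polygon_throughput(20e6, 648e6, 32)      # limited per polygon

print(f"Dreamcast, 100-pixel polygons: {dreamcast_100px / 1e6:.1f} million/sec")
print(f"GameCube, 100-pixel polygons: {gamecube_100px / 1e6:.1f} million/sec")
print(f"GameCube, 32-pixel polygons: {gamecube_32px / 1e6:.1f} million/sec")
```

In this model the Dreamcast's rate does not drop as opaque polygons grow from 32 to 100 pixels, while the GameCube's rate falls once fillrate becomes the limit.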

Graphics comparison table

See Sega Dreamcast technical specifications for more technical details on Dreamcast hardware
System Dreamcast (1998)[12] NAOMI (1998)[13] PC (1998) PC (1999) PlayStation 2 (2000) GameCube (2001) Xbox (2001)
Geometry processor Hitachi SH-4 (200 MHz) Hitachi SH-4 (200 MHz) Intel Pentium II (450 MHz) Intel Pentium III 800EB (800 MHz)[n 1] Emotion Engine (294.912 MHz) ATI Flipper (162 MHz)[n 2] Nvidia NV2A (233 MHz)[n 3]
Matrix transformations[n 4]
Matrix FLOPS 1.4 GFLOPS[n 5] 1.4 GFLOPS 230 MFLOPS[n 6] 1.1 GFLOPS[n 7] 5.5 GFLOPS[n 8] 7.5 GFLOPS[n 9] 5.8 GFLOPS[n 10]
MACs/sec 800 million[n 11] 800 million 130 million[n 12] 420 million[n 13] 2 billion[n 14] 3 billion[n 15] 2 billion[n 16]
Vertices 50 MVertices/s[n 17] 50 MVertices/s 8.4 MVertices/s[n 18] 47 MVertices/s[n 19] 140 MVertices/s[n 20] 162 MVertices/s[n 21] 116 MVertices/s[n 22]
Perspective transformations 16 MVertices/s[n 23] 16 MVertices/s 2.6 MVertices/s[n 24] 11 MVertices/s[n 25] 80 MVertices/s 160 MVertices/s[n 26] 110 MVertices/s[n 27]
Lighting
1 light source 14 MPolygons/s[n 28] 14 MPolygons/s 2 MPolygons/s[n 29] 7.2 MPolygons/s[n 30] 39 MPolygons/s[n 31] 90 MPolygons/s[n 32] 46 MPolygons/s[n 33]
4 light sources 6.8 MPolygons/s 6.8 MPolygons/s 1.1 MPolygons/s[n 34] 5.8 MPolygons/s[n 35] 9.8 MPolygons/s[n 36] 20 MPolygons/s[n 37] 16 MPolygons/s[n 38]
Rendering processors PowerVR CLX2 (100 MHz) PowerVR2 (100 MHz)[n 39] 2x Voodoo2 (SLI) (90 MHz)[n 40] Neon 250 (125 MHz) Voodoo3 SE (200 MHz)[n 41] GeForce 256 (120 MHz) Graphics Synthesizer (147.456 MHz) Flipper (162 MHz) NV2A (233 MHz)
Tiled rendering
Tiling FPU 720 MFLOPS 1 GFLOPS N/A N/A N/A N/A N/A N/A N/A
Tile size 32×32 pixels 32×32 pixels N/A 32×16 pixels
Pixel fillrate
Opaque 3.2 GPixels/s[n 42] 6 GPixels/s[n 43] 100 MPixels/s[n 44] 500 MPixels/s 200 MPixels/s 430 MPixels/s[n 45] 2.3 GPixels/s[n 46] 648 MPixels/s 870 MPixels/s[n 47]
Opaque/Translucent 500 MPixels/s 1 GPixel/s 100 MPixels/s 250 MPixels/s
Translucent 200 MPixels/s 400 MPixels/s 100 MPixels/s 125 MPixels/s
Texture fillrate
Opaque 3.2 GTexels/s 6 GTexels/s 100 MTexels/s 500 MTexels/s 380 MTexels/s[n 48] 320 MTexels/s[n 49] 1.1 GTexels/s 648 MTexels/s 650 MTexels/s[n 50]
Opaque/Translucent 500 MTexels/s 1 GTexel/s 100 MTexels/s 250 MTexels/s
Multi-texture fillrate
Opaque 1.6 GTexels/s 3 GTexels/s 100 MTexels/s 250 MTexels/s 310 MTexels/s[n 51] 250 MTexels/s[n 52] 580 MTexels/s 648 MTexels/s 520 MTexels/s[n 53]
Opaque/Translucent 250 MTexels/s 500 MTexels/s 100 MTexels/s 120 MTexels/s
Textured polygons
32-pixel 7.1 MPolygons/s 12 MPolygons/s 2 MPolygons/s 4 MPolygons/s[31] 6 MPolygons/s 7 MPolygons/s 30 MPolygons/s 20 MPolygons/s 20 MPolygons/s
100-pixel (opaque) 7.1 MPolygons/s 12 MPolygons/s 1 MPolygons/s 4 MPolygons/s 2 MPolygons/s 4.3 MPolygons/s 10 MPolygons/s 6.4 MPolygons/s 8 MPolygons/s
100-pixel (opaque/translucent) 5 MPolygons/s 10 MPolygons/s 1 MPolygons/s 2.5 MPolygons/s
Multi-texture polygons
32-pixel 7.1 MPolygons/s 12 MPolygons/s 2 MPolygons/s 4 MPolygons/s 5 MPolygons/s 7 MPolygons/s 18 MPolygons/s 20 MPolygons/s 16 MPolygons/s
100-pixel (opaque) 7.1 MPolygons/s 12 MPolygons/s 1 MPolygons/s 2.5 MPolygons/s 2 MPolygons/s 2.5 MPolygons/s 5 MPolygons/s 6.4 MPolygons/s 5 MPolygons/s
100-pixel (opaque/translucent) 2.5 MPolygons/s 5 MPolygons/s 1 MPolygons/s 1.2 MPolygons/s
Texture compression ratio 7.98:1 (VQ) 7.98:1 (VQ) 3:1 (palette)[n 54] 4:1 (VQ) 4:1 (FXT1) 6:1 (S3TC) 3:1 (palette)[n 55] 6:1 (S3TC) 6:1 (S3TC)
CPU–GPU transfer bus[n 56]
Bandwidth 800 MB/s[16] 800 MB/s 260 MB/s[n 57] 530 MB/s[n 58] 530 MB/s[n 59] 1 GB/s[n 60] 1.2 GB/s[30] 1.3 GB/s[n 61] 1.064 GB/s[n 62]
Texture compression 6.3 GB/s 6.3 GB/s 800 MB/s 2.1 GB/s 2.1 GB/s 6 GB/s 3.6 GB/s 7.7 GB/s 6.3 GB/s
Internal GPU cache
Cache memory 33 KB[n 63] 46 KB[n 64] N/A 16 KB[n 65] N/A N/A 4 MB 3 MB N/A
Bandwidth 15 GB/s[n 66] 28 GB/s[n 67] N/A 1 GB/s N/A N/A 48 GB/s[n 68] 20 GB/s[n 69]
External video memory
External memory 24 MB (SDRAM)[n 70] 48 MB (SDRAM), 100 MB (VROM)[1] 16 MB (SDRAM)[n 71] 32 MB (SDRAM) 16 MB (SDRAM) 32 MB (SDRAM) 32 MB (RDRAM)[n 72] 24 MB (1T-SRAM) 64 MB (DDR SDRAM)
Texture compression 190 MB 300 MB (SDRAM), 700 MB (VROM) 32 MB[n 73] 120 MB 64 MB 190 MB 96 MB 144 MB 300 MB
Bandwidth 1.6 GB/s[n 74] 1.8 GB/s[n 75] 2.8 GB/s[n 76] 1 GB/s 3.1 GB/s 2.6 GB/s 3.2 GB/s[n 77] 2.6 GB/s[n 78] 5.3 GB/s[n 79]
Buffering bandwidth
Framebuffer 800 MB/s (tiled 6.4 GB/s)[n 80] 1 GB/s (tiled 12 GB/s)[n 81] 720 MB/s[n 76] 1 GB/s 3.1 GB/s 2.6 GB/s 38 GB/s[n 68] 9.6 GB/s[n 69] 5.3 GB/s
Z-buffer 12 GB/s[n 66] 25 GB/s[n 67]
Texture buffer 800 MB/s (compressed 6 GB/s) 1 GB/s (compressed 7 GB/s) 720 MB/s[n 76] 9.6 GB/s[n 68] 10 GB/s[n 69]
System Dreamcast (1998) NAOMI (1998) PC (1998) PC (1999) PlayStation 2 (2000) GameCube (2001) Xbox (2001)
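
The CPU geometry figures in the table follow from clock speed and cycle counts. The sketch below reproduces that arithmetic for three of the processors, using the clock speeds from the table and the cycle and operation counts given in the table's notes; the helper name is hypothetical, and the model assumes the processor sustains one transformation every listed number of cycles.

```python
def rate_per_second(clock_hz: float, operations: int, cycles: int) -> float:
    """Operations completed per second when `operations` take `cycles` clock cycles."""
    return clock_hz * operations / cycles


# name: (clock in Hz, floating-point operations per transformation, cycles per transformation)
processors = {
    "SH-4 (200 MHz)": (200e6, 28, 4),
    "Pentium II (450 MHz)": (450e6, 28, 53),
    "Pentium III (800 MHz)": (800e6, 24, 17),
}

for name, (clock_hz, flops, cycles) in processors.items():
    vertices = rate_per_second(clock_hz, 1, cycles)    # matrix transformations per second
    flops_rate = rate_per_second(clock_hz, flops, cycles)
    print(f"{name}: {vertices / 1e6:.1f} MVertices/s, {flops_rate / 1e9:.2f} GFLOPS")
```

The computed values agree with the table's "Vertices" and "Matrix FLOPS" rows to within rounding.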

Notes

  1. GeForce 256 T&L unit outperformed by Pentium III (742 MHz)[14]
  2. Gekko (485 MHz) CPU could be used as an alternative geometry processor.
  3. Pentium III (733 MHz) CPU could be used as an alternative geometry processor.
  4. Matrix transformation (4×4 matrix × 4×1 vector) - 28 computations (16 multiplies, 12 adds)[15]
  5. 1.4 GFLOPS,[16][17] 7 floating-point operations per cycle (28 computations per 4 cycles)[18][19] (Wayback Machine: 2000-08-23 20:47)
  6. 28 floating-point operations per 53 cycles[20]
  7. 24 floating-point operations per 17 cycles[21]
  8. Emotion Engine FPU: 0.64 GFLOPS
    Emotion Engine VU0/VU1: 5.52 GFLOPS
  9. Flipper: 7.533 GFLOPS (46.5 floating-point operations per cycle)[22]
    Gekko: 1.94 GFLOPS (4 floating-point operations per cycle)
  10. NV2A: 5.8 GFLOPS (24 floating-point operations per cycle)
    Pentium III: 1 GFLOPS (24 floating-point operations per 17 cycles)
  11. 4 MAC operations per cycle[19]
  12. 3.3125 cycles per MAC operation: 53 cycles per 12 MACs[20]
  13. 17 cycles per 9 MAC operations[21]
  14. 8 MAC operations per cycle (4 MAC operations per VU)[23]
  15. 19 MAC operations (12 MACs matrix transform, 7 MACs perspective transform) per cycle (8 cycles per 8 vertices)[22]
  16. NV2A: 2.2 billion MAC operations per second (116.5 million vertices/sec, 19 MACs each; 9.5 MAC operations per cycle)
    Pentium III: 380 million MAC operations per second (17 cycles per 9 MAC operations)
  17. 4 cycles per matrix transformation[24]
  18. 53 cycles per matrix transformation[20]
  19. 17 cycles per matrix transformation[21]
  20. 2 matrix transformations (1 transformation per VU) per 4 cycles[25]
  21. 1 matrix transformation per cycle[22]
  22. NV2A: 116 million vertices per second
    Pentium III: 43 million vertices per second (17 cycles per matrix transformation)
  23. MVertices/s = Million vertices per second
  24. 170 cycles per perspective transformation: 53 cycles matrix transformation, 117 cycles projection (2 multiplies, 2 adds, 3 divides)[26] (2 cycles per multiply, 1 cycle per add, 37 cycles per divide)[27]
  25. 72 cycles per perspective transformation: 17 cycles matrix transformation, 55 cycles projection (2 multiplies, 2 adds, 3 divides)[26] (1 cycle per multiply, 1 cycle per add, 17 cycles per divide)[28]
  26. 8 cycles per 8 perspective transformations in T&L pipeline[22]
  27. NV2A: 116.5 million vertices per second (2 cycles per vertex)
    Pentium III: 10 million vertices per second (72 cycles per perspective transformation)
  28. MPolygons/s = Million polygons per second
  29. 223 cycles per vertex: 170 cycles perspective transformation, 53 cycles lighting (21 multiplies, 11 adds),[26] 2 cycles per multiply, 1 cycle per add, 37 cycles per divide[27]
  30. 110 cycles per vertex: Pentium III (742 MHz) calculates 6,752,000 triangle strips per second, with 1 light, faster than GeForce 256's T&L unit[14]
  31. 15 cycles/vertex[29] per VU
  32. 14 cycles per 8 vertices[22]
  33. NV2A: 46 million vertices per second, 5 cycles per vertex (2 cycles transform, 21 MACs lighting)[26]
    Pentium III: 6.6 million vertices per second (110 cycles per vertex)
  34. 382 cycles per vertex: 170 cycles perspective transformation, 212 cycles lighting (84 multiplies, 44 adds),[26] 2 cycles per multiply, 1 cycle per add, 37 cycles per divide
  35. 136 cycles per vertex: Pentium III (742 MHz) calculates 5,453,000 triangle strips per second, with 4 lights, faster than GeForce 256's T&L unit[14]
  36. 60 cycles/vertex per VU: 4 light sources,[30] 15 cycles/vertex per light source[29]
  37. 63 cycles per 8 vertices[22]
  38. NV2A: 13 million vertices per second, 14 cycles per vertex (2 cycles transform, 3 cycles per light source)[26]
    Pentium III: 5.3 million vertices per second (136 cycles per vertex)
  39. High-end arcade revision of PowerVR2 with twice the performance of the Dreamcast's PowerVR CLX2
  40. 2x framebuffer (twice Voodoo2), 2x TMU (texture mapping units) (same as Voodoo2)
  41. Falcon Voodoo3 3500 TV Special Edition
  42. ISP unit's PE Array of 32 processor elements process 32 pixels per cycle
  43. 2 ISP units, PE Arrays of 64 processor elements process 64 pixels per cycle
  44. 720 MB/s framebuffer bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 646 MB/s available bandwidth, 6 bytes per pixel (16-bit color, 32-bit Z-buffer precision)
  45. 2.656 GB/s VRAM bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 2.582 GB/s available bandwidth, 6 bytes per pixel (16-bit color, 32-bit Z-buffer precision)
  46. 16 pixel pipelines
  47. 5.336 GB/s video RAM bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 5.262 GB/s available bandwidth, 6 bytes per pixel (double-buffered 16-bit color, 32-bit Z-buffer precision)
  48. 3.19 GB/s bandwidth, 74 MB/s framebuffer (640×480, 16-bit color, double-buffered, 60 FPS), 3.116 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
  49. 2.582 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
  50. 5.262 GB/s available bandwidth, 8 bytes per pixel (16-bit pixel, 16-bit texel, 32-bit Z-buffer)
  51. 3.116 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
  52. 2.582 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
  53. 5.262 GB/s available bandwidth, 10 bytes per pixel (16-bit pixel, dual 16-bit texels, 32-bit Z-buffer)
  54. Low-quality compression[32] (Wayback Machine: 1999-04-28 17:17)
  55. Low-quality compression[32] (Wayback Machine: 1999-04-28 17:17)
  56. Bus interface transfers polygons and/or textures from the CPU to the GPU's VRAM
  57. 1x AGP bus[33]
  58. 2x AGP bus[31][33]
  59. 2x AGP bus[33]
  60. Transmission bus from Pentium III 800EB (133 MHz FSB, 1 GB/s) to GeForce 256 (4x AGP)[33]
  61. 162 MHz (64-bit) CPU FSB
  62. 133 MHz (64-bit) CPU FSB
  63. 8.25 KB register memory, 12.25 KB ISP cache, 13 KB TSP cache, 256 bytes FIFO buffer
  64. 8.25 KB register memory, 24.5 KB ISP cache, 13 KB TSP cache, 256 bytes FIFO buffer
  65. 12 KB ISP Parameter Cache, 4 KB TSP Parameter Cache[34]
  66. 1.2 GB/s register memory, 12.8 GB/s ISP PE Array, 1.6 GB/s TSP cache
  67. 1.6 GB/s register memory, 25.6 GB/s ISP PE Array, 1.6 GB/s TSP cache
  68. 38.4 GB/s framebuffer, 9.6 GB/s texture cache
  69. 10.368 GB/s texture cache (512-bit, 162 MHz), 9.632 GB/s framebuffer
  70. 16 MB main RAM (accessible by SH-4 and CLX2), 8 MB VRAM (accessible by CLX2)
  71. 8 MB texture RAM, 8 MB (2x 4 MB) framebuffer RAM
  72. Accessible by Emotion Engine and Graphics Synthesizer
  73. 24 MB texture RAM compression, 8 MB framebuffer RAM
  74. 800 MB/s main RAM, 800 MB/s VRAM
  75. 800 MB/s main RAM, 1 GB/s VRAM, 612 MB/s VROM
  76. 90 MHz, 2x 64-bit TMU, 2x 64-bit framebuffer
  77. Accessible by Emotion Engine at 3.2 GB/s, accessible by Graphics Synthesizer through 1.2 GB/s transmission bus
  78. 162 MHz (128-bit) bus
  79. 6.4 GB/s RAM bandwidth - 1.064 GB/s CPU FSB bandwidth[35]
  80. 3.2 gigapixels/sec effective fillrate, 16-bit (2 bytes) per pixel
  81. 6 gigapixels/sec effective fillrate, 16-bit (2 bytes) per pixel
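
Several of the fillrate notes above derive a bandwidth-limited fillrate from video memory bandwidth. The sketch below reproduces that arithmetic for the GeForce 256 entry, using the figures stated in the notes; the function names are hypothetical, and 1 MB is assumed to mean 10^6 bytes.

```python
def framebuffer_mb_per_s(width: int, height: int, bytes_per_pixel: int,
                         buffers: int, fps: int) -> float:
    """Bandwidth consumed by the displayed framebuffer, in MB/s (1 MB = 10^6 bytes, assumed)."""
    return width * height * bytes_per_pixel * buffers * fps / 1e6


def bandwidth_limited_fillrate(vram_mb_per_s: float, framebuffer_mb: float,
                               bytes_per_pixel: int) -> float:
    """Fillrate in MPixels/s once framebuffer traffic is subtracted from VRAM bandwidth."""
    return (vram_mb_per_s - framebuffer_mb) / bytes_per_pixel


# 640x480, 16-bit color (2 bytes), double-buffered, 60 FPS, as stated in the notes.
display = framebuffer_mb_per_s(640, 480, 2, 2, 60)

# GeForce 256: 2.656 GB/s VRAM bandwidth, 6 bytes per pixel (16-bit color + 32-bit Z).
geforce_256 = bandwidth_limited_fillrate(2656, display, 6)

print(f"Framebuffer traffic: {display:.1f} MB/s")
print(f"GeForce 256 opaque fillrate: {geforce_256:.0f} MPixels/s")
```

The notes round the framebuffer traffic to 74 MB/s, and the result matches the 430 MPixels/s listed for the GeForce 256 in the table.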

References

  1. Hideki Sato Sega Interview (Edge)
  2. Gamers' Republic, "August 1998" (US; 1998-07-21), page 29
  3. PC Magazine, December 1999, page 193
  4. Test Drive: Le Mans (IGN)
  5. Actual HW T&L perfomance of NVIDIA GeForce/GeForce2 chips (IXBT Labs)
  6. PC Magazine, December 1999, page 203
  7. Unreal Modeling Guide (Unreal Developer Network)
  8. QIII Arena High Polygon Count
  9. DF Retro: Shenmue - A Game Ahead Of Its Time (Digital Foundry)
  10. htt (Wayback Machine: 2001-03-06 00:59)
  11. File:HowFarHaveWeGot.pdf
  12. Sega Dreamcast/Technical specifications
  13. Sega NAOMI
  14. Benchmarking T&L in 3DMark 2000
  15. Design of Digital Systems and Devices (page 95)
  16. htt (Wayback Machine: 2000-08-23 20:47)
  17. File:SH-4 Next-Generation DSP Architecture.pdf, page 5
  18. File:Entertainment Systems and High-Performance Processor SH-4.pdf, page 4
  19. File:SH-4 Next-Generation DSP Architecture.pdf, page 31
  20. File:Streaming SIMD Extensions - Matrix Multiplication.pdf, page 7
  21. Optimizing for SSE: A Case Study
  22. Nikkei Electronics (2000/10/9)
  23. File:ThePowerOfPS2.pdf, page 6
  24. File:SH-4 Next-Generation DSP Architecture.pdf, page 12
  25. File:ThePowerOfPS2.pdf, page 12
  26. Design of Digital Systems and Devices (pages 95-97)
  27. File:Instruction Tables.pdf, page 107
  28. File:Instruction Tables.pdf, page 110
  29. Procedural Rendering on Playstation 2 (page 4) (Gamasutra)
  30. File:ThePowerOfPS2.pdf, page 4
  31. htt (Wayback Machine: 2007-08-07 15:12)
  32. htt (Wayback Machine: 1999-04-28 17:17)
  33. AGP Peak Speeds
  34. PC 3D Graphics Accelerators FAQ: VideoLogic PowerVR
  35. Hardware Behind the Consoles - Part I: Microsoft's Xbox (Understanding the Hardware – The X-CPU) (AnandTech)

