Blast processing
From Sega Retro
Blast Processing was a marketing term coined by Sega of America to advertise the faster processing performance of the Sega Mega Drive (Sega Genesis in that region).
Sonic the Hedgehog 2 was the poster child for this campaign, being faster than any other platform game at the time. The ad campaign featured commercials showing races between two vehicles, one with a SNES strapped to it and the other with a Mega Drive.
Sega originally coined the term to refer to the high-speed bandwidth and fillrate of the Mega Drive VDP graphics processor's DMA controller. It was a reference to how the DMA controller could "blast" data into the VDP graphics processor and DAC at high speeds. However, many later assumed it was referring to the 68000 CPU's higher clock rate.
Origins of term
According to Sega staff involved in its development and marketing, the term actually referred to the high-speed DMA controller rather than the CPU clock rate. According to Sega of America's former technical director Scot Bayless:[1]
“the PR guys interviewed me about what made the platform interesting from a technical standpoint and somewhere in there I mentioned the fact that you could just "blast data into the DAC's". Well they loved the word 'blast' and the next thing I knew Blast Processing was born.” (Scot Bayless)
One of the specific DMA programming techniques he was referring to was the mid-frame palette swap, where the color could be changed every scanline, increasing the colors displayed on screen, a technique that was used in Sonic 2:[2]
“Marty Franz [Sega technical director] discovered that you could do this nifty trick with the display system by hooking the scan line interrupt and firing off a DMA at just the right time. The result was that you could effectively jam data onto the graphics chip while the scan line was being drawn – which meant you could drive the DAC's with 8 bits per pixel. Assuming you could get the timing just right you could draw 256 color static images. There were all kinds of subtleties to the timing and the trick didn't work reliably on all iterations of the hardware but you could do it and it was cool as heck.” (Scot Bayless)
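The effect of a mid-frame palette swap on the on-screen color count can be illustrated with a toy model. The sketch below is not VDP code: the 9-bit color packing is simplified, and `make_palette`, `swap_palette`, and `render` are hypothetical names introduced only for illustration. It assumes a single 16-entry palette that a per-scanline handler is allowed to reload before each line is drawn.

```python
LINES, WIDTH, PALETTE_SIZE = 224, 320, 16

def rgb333(r, g, b):
    """Pack three 3-bit channels into one 9-bit value (simplified layout, an assumption)."""
    return (b << 6) | (g << 3) | r

def make_palette(blue=0):
    """Build 16 entries that differ in red/green and share one blue level."""
    return [rgb333(i % 8, i // 8, blue) for i in range(PALETTE_SIZE)]

def swap_palette(line, palette):
    """Hypothetical scanline handler: reload the palette with a new blue level."""
    palette[:] = make_palette(blue=line * 8 // LINES)

def render(line_hook=None):
    """Draw a frame line by line, calling the hook (if any) before each scanline."""
    palette = make_palette()
    frame = []
    for line in range(LINES):
        if line_hook is not None:
            line_hook(line, palette)
        # Every pixel on the line cycles through the 16 palette entries.
        frame.append([palette[x % PALETTE_SIZE] for x in range(WIDTH)])
    return frame

def count_colors(frame):
    """Count the distinct color values that appear anywhere in the frame."""
    return len({color for row in frame for color in row})

static_colors = count_colors(render())
swapped_colors = count_colors(render(swap_palette))
print(static_colors, swapped_colors)
```

In this model the static frame shows only the 16 palette entries, while reloading the palette per scanline lets the same 16 entries cover 128 distinct colors over the frame. Real hardware adds the timing constraints described in the quote above, which this sketch ignores.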
Many of these DMA-programmable techniques were intended from the outset by the Mega Drive's original product designer, Masami Ishikawa:[3]
“the sprite size could be changed to fill the whole display. It could also display the background screen behind the scrolling window and could change the color of each line. The number of available colors was limited compared to comparable arcade systems, but it could create shadows that matched each character's shape and was also capable of semi-transparency.”
Technical details
- For more technical details on Mega Drive, see Mega Drive: Technical specifications and Mega Drive: Blast Processing
The term has been used to refer either to the faster CPU or to the faster DMA controller of the VDP graphics processor.
CPU
The main CPU was clocked more than twice as fast as the one in its rival, the Super NES. Sega's Motorola 68000 processor was clocked at 7.67 MHz, compared to the 3.58 MHz clock speed of Nintendo's Ricoh 5A22 processor. However, the idea of simply comparing CPU clock rates to determine performance, regardless of other characteristics, is commonly known as the megahertz myth. While Nintendo's 5A22 ran at a lower clock rate, it executed more instructions per clock cycle, giving it a similar MIPS (million instructions per second) performance to Sega's 68000.
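The megahertz-myth point can be checked with a short calculation. The sketch below uses the NTSC clock and MIPS figures quoted in this article's CPU comparison table (taking the 5A22 at its fastest speed). The dictionary layout and the function name `instructions_per_cycle` are illustrative choices, not part of any source.

```python
# NTSC figures as quoted in this article's CPU comparison table.
M68000 = {"clock_mhz": 7.670453, "mips": 1.342329}
R5A22_FAST = {"clock_mhz": 3.579545, "mips": 1.5}

def instructions_per_cycle(cpu):
    """Average instructions completed per clock cycle (MIPS divided by MHz)."""
    return cpu["mips"] / cpu["clock_mhz"]

ipc_68000 = instructions_per_cycle(M68000)
ipc_5a22 = instructions_per_cycle(R5A22_FAST)

# How the two chips compare on raw clock rate versus instruction throughput.
clock_ratio = M68000["clock_mhz"] / R5A22_FAST["clock_mhz"]
mips_ratio = M68000["mips"] / R5A22_FAST["mips"]

print(f"68000: {ipc_68000:.3f} instructions per cycle")
print(f"5A22:  {ipc_5a22:.3f} instructions per cycle")
print(f"clock ratio {clock_ratio:.2f}, MIPS ratio {mips_ratio:.2f}")
```

On these figures the 68000 has a little over twice the clock rate but roughly the same instruction throughput, so the clock advantage alone does not account for the performance difference.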
The 68000's faster performance came from other advantages, such as a wider 32-bit internal data bus and 16-bit external data bus (the SNES CPU had a 16-bit internal data bus and 8-bit external data bus), faster external data bus clock rate and bandwidth, more registers, 32-bit instructions,[4] and shared codebase with arcade games (where the 68000 saw widespread use).
Console | | Sega Mega Drive | Super Nintendo Entertainment System[5][6][7]
---|---|---|---
Main CPU | | Motorola 68000 | Ricoh 5A22
Clock rate | Internal | 7.670453 MHz (NTSC), 7.600489 MHz (PAL) | 2.684658–3.579545 MHz (NTSC), 2.660171–3.546895 MHz (PAL)
Clock rate | External | 5 MHz (RAM/ROM) | 2.660171–2.684658 MHz (RAM), 2.660171–3.579545 MHz (ROM)
Bits | Data bus width | 32-bit internal, 16-bit external | 16-bit internal, 8-bit external
Bits | Arithmetic logic units | 16-bit data ALU, 32-bit address ALU (2x 16-bit ALU) | 16-bit ALU
Bits | Word length | 16-bit | 16-bit
Internal instructions | Registers | 16x 32-bit registers | 4x 16-bit registers, 4x 8-bit registers
Internal instructions | Instruction set | 16-bit, 32-bit | 8-bit, 16-bit
Internal instructions | Instructions per second | 1.342329 MIPS (NTSC), 1.330085 MIPS (PAL) | 1.125–1.5 MIPS (NTSC), 1.114738–1.486318 MIPS (PAL)
Work RAM | Memory | 64 KB PSRAM (16-bit, 5.263157 MHz) | 128 KB DRAM (8-bit, 2.660171–2.684658 MHz)
Work RAM | Bandwidth | 10.526314 MB/s (5 MB/s CPU access); CPU access per frame: 81 KB (NTSC), 98 KB (PAL) | 2.684658 MB/s (NTSC), 2.660171 MB/s (PAL); CPU access per frame: 43 KB (NTSC), 51 KB (PAL)
Work RAM | Transfer rate | 3.835226 MB/s (NTSC), 3.800244 MB/s (PAL); CPU write per frame: 62 KB (NTSC), 73 KB (PAL) | 2.684658 MB/s (NTSC), 2.660171 MB/s (PAL); CPU write per frame: 43 KB (NTSC), 51 KB (PAL)
Cartridge ROM | Memory | 128 KB to 8 MB | 128 KB to 6 MB
Cartridge ROM | Bandwidth | 10–15.340906 MB/s (5 MB/s CPU access) | 2.5–3.579545 MB/s
DMA
The Sega Mega Drive's Yamaha YM7101 VDP graphics processor had a powerful DMA (direct memory access) controller that could handle DMA operations at much faster speeds than the Super Nintendo.[4] The Mega Drive could write to VRAM during both active display and VBlank,[8] and had a higher memory bandwidth than the SNES. The quicker DMA transfer rates and bandwidth gave the Mega Drive faster performance than the SNES[6] and contributed to its higher fillrate, higher gameplay resolution, faster parallax scrolling, fast data blitting, and high frame rates with many moving objects on screen. They also allowed it to display more unique tiles (background and sprite tiles) and large sprites (32×32 and higher) on screen, and to quickly transfer more unique tiles and large sprites (16×16 and higher) to the screen.
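The inactive-display DMA figures in this article's DMA table pair each transfer rate with a bytes-per-scanline value, and the two are related by the number of scanlines drawn per second. The sketch below recovers that implied line rate from the quoted figures and then reuses it. The function names are illustrative, and the line rate is inferred from the table rather than taken from a hardware manual.

```python
def line_rate_hz(bytes_per_second, bytes_per_scanline):
    """Scanlines per second implied by a transfer rate and a per-scanline budget."""
    return bytes_per_second / bytes_per_scanline

def dma_rate(bytes_per_scanline, rate_hz):
    """Bytes per second moved when a fixed budget is transferred on every scanline."""
    return bytes_per_scanline * rate_hz

# Inactive-display figures quoted in the DMA table.
md_line_rate = line_rate_hz(3_218_450, 205)        # Mega Drive VRAM: 3.21845 MB/s, 205 bytes
snes_line_rate = line_rate_hz(2_684_658, 170.5)    # SNES (NTSC): 2.684658 MB/s, 170.5 bytes

# The Mega Drive's VDP cache figure (410 bytes per scanline) at the same line rate.
md_cache_rate = dma_rate(410, md_line_rate)

print(f"Mega Drive line rate: {md_line_rate:.1f} Hz")
print(f"SNES line rate:       {snes_line_rate:.1f} Hz")
print(f"Mega Drive cache DMA: {md_cache_rate / 1e6:.4f} MB/s")
```

Both consoles come out at a line rate of roughly 15.7 kHz, so the difference in the quoted rates comes almost entirely from the per-scanline budget (205 or 410 bytes against 170.5 bytes).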
The Mega Drive's DMA capabilities also gave it more flexibility, allowing the hardware to be programmed in various ways. With DMA programming, it could replicate some of the Super Nintendo's hardware features, such as larger 64×64 sprites (combining 32×32 sprites), background scaling and rotation (like the Sega X Board and Mode 7), and direct color (increasing colors on screen). Other DMA-programmable capabilities of the Mega Drive include mid-frame palette swaps (increasing colors per scanline), sprite scaling and rotation, ray casting, bitmap framebuffers, and 3D polygon graphics. The base Mega Drive hardware (without any cartridge enhancement chips) could render 3D polygons with performance comparable to the Super Nintendo's optional Super FX (SFX) enhancement chip,[9][10][11][12] which is itself significantly outperformed by the Mega Drive's optional Sega Virtua Processor enhancement chip.
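The sprite counts quoted in the DMA table can be reproduced with two small budget calculations: how many composite sprites can be built from a fixed pool of hardware sprites, and how many sprites fit on one scanline given a sprite limit and a texel budget. This is a sketch under those assumptions, using the limits listed in the table (80 hardware sprites of up to 32×32, 20 sprites and 320 texels per scanline for the Mega Drive; 32 sprites and 272 texels per scanline for the SNES). The function names are illustrative.

```python
def composite_sprites(hw_sprites, hw_size, composite_size):
    """Number of composite_size-square sprites buildable from hw_size-square hardware sprites."""
    per_side = composite_size // hw_size
    return hw_sprites // (per_side * per_side)

def sprites_per_scanline(max_sprites, texels_per_line, width):
    """Sprites of a given width that fit on one scanline under both limits."""
    return min(max_sprites, texels_per_line // width)

# Mega Drive: 80 hardware sprites of up to 32x32 pixels.
md_64 = composite_sprites(80, 32, 64)      # each 64x64 sprite uses four 32x32 sprites
md_128 = composite_sprites(80, 32, 128)    # each 128x128 sprite uses sixteen 32x32 sprites

# Per-scanline limits for several sprite widths.
md_per_line = [sprites_per_scanline(20, 320, w) for w in (8, 16, 24, 32, 64)]
snes_per_line = [sprites_per_scanline(32, 272, w) for w in (8, 16, 32, 64)]

print(md_64, md_128)
print(md_per_line)
print(snes_per_line)
```

The results match the table's "Sprites on screen" and "Sprites per scanline" rows: small sprites are capped by the sprite-count limit, while larger ones are capped by the texel budget.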
Console | | Sega Mega Drive | Super Nintendo Entertainment System[5][6][7][13]
---|---|---|---
DMA controller | | Sega 315‑5313 VDP (Yamaha YM7101) | Ricoh 5A22
Clock rate | Internal | 13.423294 MHz (NTSC), 13.300856 MHz (PAL) | 2.684658–3.579545 MHz (NTSC), 2.660171–3.546895 MHz (PAL)
Clock rate | External | RAM: 8–11.764705 MHz (NTSC), 8–8.333333 MHz (PAL); ROM: 5–7.670453 MHz (NTSC), 5–7.600489 MHz (PAL) | RAM: 2.684658 MHz (NTSC), 2.660171 MHz (PAL); ROM: 2.660171–3.579545 MHz
Video RAM | Memory | 64 KB VRAM (64 KB FPM DRAM, 256 bytes SAM), 232 bytes VDP cache (CRAM, VSRAM, sprite buffer) | 64 KB SRAM, 1056 bytes PPU cache (CGRAM, OAM)
Video RAM | Bandwidth | 8–11.764705 MB/s (VRAM), 26.846588 MB/s (VDP cache) | 7.15909 MB/s (NTSC), 7.09379 MB/s (PAL)
DMA blitting transfer rate | Inactive display | VRAM: 3.21845 MB/s, 205 bytes per scanline; VDP cache: 6.4369 MB/s, 410 bytes per scanline | NTSC: 2.684658 MB/s, 170.5 bytes per scanline; PAL: 2.660171 MB/s, 170.5 bytes per scanline
DMA blitting transfer rate | Active display (VRAM) | 320×224: 708.406 KB/s (NTSC), 1.09701 MB/s (PAL); 320×160: 1.437846 MB/s (NTSC), 1.702026 MB/s (PAL) | 256×224: 443.228 KB/s (NTSC), 795.11 KB/s (PAL); 256×192: 763.435 KB/s (NTSC), 1.061548 MB/s (PAL)
DMA blitting transfer rate | Active display (cache) | 320×224: 1.416813 MB/s (NTSC), 2.194021 MB/s (PAL); 320×160: 2.875692 MB/s (NTSC), 3.404052 MB/s (PAL) |
Fillrate | Video clock rate | 13.300856–13.423294 MHz (VDP) | 5.320342–5.369317 MHz (PPU)
Fillrate | Read fillrate | 6.650428–6.934358 MPixels/s | 5.320342–5.369317 MPixels/s
Fillrate | Write fillrate (inactive display) | 6.4369 MPixels/s, 410 pixels per scanline | 5.320342–5.369317 MPixels/s, 341 pixels per scanline
Fillrate | Write fillrate (active display) | 1.416813–2.875692 MPixels/s (NTSC), 2.194021–3.404052 MPixels/s (PAL) | 886,457 pixels/s (NTSC), 1.59022 MPixels/s (PAL)
Fillrate | Tiles on screen (active display) | Display: 1808 tiles; Blit per frame: 369 tiles (NTSC), 1070 tiles (PAL) | Display: 1395 tiles (NTSC), 1536 tiles (PAL); Blit per frame: 230 tiles (NTSC), 496 tiles (PAL)
Sprites | Sprite fillrate | 4.90887 MTexels/s, 320 texels per scanline | 4.282881 MTexels/s, 272 texels per scanline
Sprites | Sprite tiles | 1280 sprite tiles on screen | 512 sprite tiles on screen
Sprites | Sprites on screen | 80 sprites (8×8 to 32×32), 20 sprites (64×64), 5 sprites (128×128) | 128 sprites (8×8, 16×16), 69 sprites (32×32), 17 sprites (64×64), 4 sprites (128×128)
Sprites | Unique sprites on screen | 80 sprites (8×8 to 32×32), 20 sprites (64×64), 5 sprites (128×128) | 128 sprites (8×8, 16×16), 32 sprites (32×32), 8 sprites (64×64), 2 sprites (128×128)
Sprites | Blit per frame (NTSC) | 80 sprites (8×8 to 16×16), 41 sprites (24×24), 23 sprites (32×32), 5 sprites (64×64) | 128 sprites (8×8), 57 sprites (16×16), 14 sprites (32×32), 3 sprites (64×64)
Sprites | Blit per frame (PAL) | 80 sprites (8×8 to 24×24), 66 sprites (32×32), 16 sprites (64×64), 4 sprites (128×128) | 128 sprites (8×8), 124 sprites (16×16), 31 sprites (32×32), 7 sprites (64×64)
Sprites | Sprites per scanline | 20 sprites (8×8 to 16×16), 13 sprites (24×24), 10 sprites (32×32), 5 sprites (64×64) | 32 sprites (8×8), 17 sprites (16×16), 8 sprites (32×32), 4 sprites (64×64)
Background planes | Background tiles on screen | 1344–1808 background tiles | 256–1024 background tiles
Background planes | Tilemap planes | 2 scrolling planes (1344–1808 tiles), 1 static window plane, 40–64 overlapping scrolling layers (20–32 layers per plane) | 1–4 planes (256–1024 tiles)
Background planes | Tilemap resolution | 256×256 to 512×512 (2 planes, 1344–1808 tiles), 1024×256 (2 planes, 1344–1424 tiles) | 256×256 to 512×512 (1–4 planes, 256–1024 tiles), 1024×1024 (1 plane, 256 tiles)
Background planes | Scrolling capabilities | Parallax scrolling, line scrolling, tile scrolling, row/column scrolling, overlapping scrolling layers | Parallax scrolling, line scrolling, tile scrolling
Resolution | Overscan | 427×262 (NTSC), 423×312 (PAL) | 341×262 (NTSC), 341×312 (PAL)
Resolution | Display resolution | Gameplay: 256×224 to 320×480 (default 320×224); Custom: 128×160 to 320×160, 128×224 to 160×224 | Gameplay: 256×224 to 256×239 (default 256×224); Pseudo-hires text: 512×448, 512×478 (half-pixels)
Colors | Color palettes | 512 colors (default), 1536 colors (Shadow/Highlight) | 32,768 colors (default), 256–4096 colors (direct)
Colors | Colors on screen | 61–64 (default), 114–1536 (Shadow/Highlight), 1536 (scrolling background demo) | 128–256 (1–2 planes), 128–160 (3 planes), 128 (4 planes), 2723 (static image demo)
Colors | Colors per tile | 16 colors (2 planes), 16–256 colors (palette swap), 256–512 colors (direct) | 16 colors (1–2 planes), 8 colors (3 planes), 4 colors (4 planes), 256 colors (direct)
3D polygon graphics | Base hardware | T&L geometry calculations: 3333 polys/s; Render: 1800 polys/s (flat), 1300 polys/s (textured) | T&L geometry calculations: 110 polys/s[14]; Render: 80 polys/s (flat),[15] 60 polys/s (textured)[16]
3D polygon graphics | Enhancement chips | Sega Virtua Processor (23.01136 MHz): T&L geometry calculations: 100,000 polys/s; Render: 9000 polys/s (flat), ~3000 polys/s (textured) | Super FX (10.738635 MHz)[17]: T&L geometry calculations: 3000 polys/s[18]; Render: 1600 polys/s (flat),[19] 900 polys/s (textured)[20]. Super FX 2 (21.47727 MHz)[17]: T&L geometry calculations: 6000 polys/s; Render: 3200 polys/s (flat), 1800 polys/s (textured)
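The SNES and Super FX geometry figures are derived in notes [14] and [18] from a per-polygon cycle budget of 134 additions and 183 multiplications. The sketch below reproduces that arithmetic using the cycle costs and clock rates quoted in this article. The function name `geometry_cycles` is illustrative, and the polygons-per-second values are simple upper bounds that ignore any other per-frame work.

```python
ADDS_PER_POLYGON = 134
MULTIPLIES_PER_POLYGON = 183

def geometry_cycles(cycles_per_add, cycles_per_multiply):
    """Clock cycles spent on geometry calculations for one polygon."""
    return (ADDS_PER_POLYGON * cycles_per_add
            + MULTIPLIES_PER_POLYGON * cycles_per_multiply)

# Cycle costs quoted in notes [14] (SNES CPU) and [18] (Super FX).
snes_cycles = geometry_cycles(cycles_per_add=7, cycles_per_multiply=162)
sfx_cycles = geometry_cycles(cycles_per_add=6, cycles_per_multiply=14)

# Upper bound on polygons per second if every cycle went to geometry.
snes_polys = 3_579_545 / snes_cycles     # 5A22 at 3.579545 MHz
sfx_polys = 10_738_635 / sfx_cycles      # Super FX at 10.738635 MHz

print(snes_cycles, sfx_cycles)
print(f"{snes_polys:.0f} polys/s (SNES CPU), {sfx_polys:.0f} polys/s (Super FX)")
```

The cycle totals agree with the 30.584 and 3.366 thousand-cycle figures in the notes. The resulting rates of about 117 and 3190 polygons per second are slightly above the 110 and 3000 listed in the table, which appears to round them down.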
References
[1] Damien McFerran, "Retroinspection: Mega-CD", Retro Gamer, issue 61, page 84
[2] The Man Responsible For Sega's Blast Processing (Nintendo Life)
[3] How Sega Built the Genesis: Masami Ishikawa Interview (Polygon)
[4] Blast Processing 101
[5] SNES hardware specifications
[6] Sega Genesis vs Super Nintendo
[7] Anomie's Register Doc
[8] File:GenesisTechnicalOverview.pdf
[9] 3D math engine (SGDK)
[10] Interview: Lee Actor (Sterling Software Programmer)
[11] Star Fox 3D Tech Demo on Sega Genesis
[12] Star Fox 3D Tech Demo on Sega Genesis: Version 2 Using DMA
[13] SNES Developer Manual (Nintendo)
[14] SNES CPU geometry calculations: 30.584 kHz (938 Hz adds, 29.646 kHz multiplies) per polygon (134 adds, 183 multiplies), 7 cycles per add, 162 cycles per 16×16 multiply (2x 16×8 multiplies), 81 cycles per 16×8 multiply (3 cycles SEP, 7 cycles STA, 4 cycles STY, 12 cycles NOP, 7 cycles LDA, 8 cycles LDY, 3 cycles XBA, 7 cycles STA, 2 cycles TYA, 2 cycles CLC, 7 cycles ADC, 2 cycles BCC, 2 cycles INY, 2 cycles BCC, 1 cycle carry_bit, 3 cycles XBA, 3 cycles REP, 6 cycles RTS)
[15] SNES CPU rendering:
- Framebuffer rendering: 256×160 framebuffer (double-buffered, 40 KB), 15 FPS (614.4 KB/s), 819.69 kHz framebuffer DMA (1.334 kHz per KB, 80 cycles setup), 80 cycles per DMA setup (12 cycles LDX, 12 cycles STX, 28 cycles LDA, 28 cycles STA)
- Polygon rendering: 2.759855 MHz (15 FPS), 31.426 kHz per 8×8 pixel polygon
- 30.584 kHz geometry per polygon
- 410 Hz polygon rendering per polygon: 48 comparison cycles (12 comparisons, 4 cycles per CPY comparison), 7 assignments (6 rasterization assignments, 1 flat shading assignment), 162 multiply cycles (2 multiplies), 28 add cycles (4 adds), 5 broadcasts, 160 cycles DMA access (40 bytes per polygon, 2 cycles per byte, 80 cycles setup)
- 432 Hz pixel rendering per 8×8 pixel polygon: 224 add cycles (1 add per pixel, 7 cycles per add), 208 cycles DMA (1 byte per pixel, 2 cycles per pixel, 80 cycles setup)
[16] SNES CPU texture mapping: 40.091 kHz per 8×8 texel polygon (8.665 kHz texture mapping per 8×8 texel polygon)
- 416 cycles DMA per 8×8 texel texture: 2 block moves, 2 cycles per texel (1 byte per texel), 80 cycles setup
- 8249 divide cycles per 8×8 texel polygon: 73 divides per 8×8 texel polygon, 1017 vertex divide cycles per polygon (9 divides per polygon), 7232 texel divide cycles per 8×8 texel polygon (64 divides, 1 divide per texel), 113 cycles per divide (5 cycles STZ, 4 cycles LDY, 7 cycles ASL, 2 cycles BCS, 2 cycles INY, 4 cycles CPY, 4 cycles BNE, 7 cycles ROR, 3 cycles PHA, 2 cycles TXA, 2 cycles SEC, 7 cycles SBC, 4 cycles BCC, 2 cycles TAX, 7 cycles ROL, 4 cycles PLA, 7 cycles LSR, 2 cycles DEY, 6 cycles RTS, 32 cycles NOP)
[17] Super NES Programming: Super FX tutorial
[18] Super FX geometry calculations: 3.366 kHz (804 Hz adds, 2.562 kHz multiplies) per polygon (134 adds, 183 multiplies), 6 cycles per addition, 14 cycles per multiply
[19] Super FX rendering:
- Framebuffer rendering: 256×192 framebuffer (double-buffered, 40 KB), 15 FPS (737.28 KB/s), 983.612 kHz CPU framebuffer DMA (1.334 kHz per KB, 80 cycles setup), 2.950836 MHz Super FX cycles
- CPU polygon rendering: 9.716388 MHz (15 FPS), 5.892 kHz per 8×8 pixel polygon
- Geometry per polygon: 3366 cycles
- Polygon rendering per polygon: 1230 Super FX cycles (410 CPU cycles)
- Pixel rendering per 8×8 pixel polygon: 1296 Super FX cycles (432 CPU cycles)
[20] Super FX texture mapping: 10.266 kHz per 8×8 texel polygon (4.374 kHz texture mapping per 8×8 texel polygon)
- 1248 Super FX cycles (416 CPU cycles) DMA per 8×8 texel texture
- 3126 divide cycles per 8×8 texel polygon: 73 divides per 8×8 texel polygon, 54 vertex divide cycles per polygon (9 divides per polygon), 3072 texel divide cycles per 8×8 texel polygon (64 divides, 1 divide per texel), 48 cycles per 16-bit divide (6 cycles per 2-bit divide)