Rob Northen compression

From Sega Retro

Rob Northen compression (RNC) is a multi-platform compression format created by Rob Northen in 1991. It combines LZSS with Huffman coding. Decompression libraries have been written for the PC, Mega Drive, Game Boy, SNES and Atari Lynx. RNC is used in a number of games by UK developers (notably Bullfrog and Traveller's Tales), including Sonic 3D: Flickies' Island, Dungeon Keeper 2, Magic Carpet, Syndicate and Syndicate Wars.

Header Format

Most RNC-compressed files begin with a standard 18-byte header:

| Data Type | Description | Size (in bytes) |
|---|---|---|
| Signature | ASCII "RNC" | 3 |
| Compression Method | 1 or 2 (as a binary value) | 1 |
| Uncompressed Size | Size of the original file | 4 |
| Compressed Size | Size of the RNC data (excluding header) | 4 |
| Uncompressed CRC | Checksum of the original file | 2 |
| Compressed CRC | Checksum of the RNC data | 2 |
| Leeway | Difference between compressed and uncompressed data in largest pack chunk (if larger than decompressed data) | 1 |
| Pack Chunks | Number of pack chunks | 1 |

Each multi-byte field in the header is big-endian.
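The header layout above can be read with a single fixed-size unpack. The following is a minimal Python sketch, not a reference implementation: the names `RncHeader`, `parse_rnc_header` and the field names are hypothetical, and no checksum verification is attempted because the checksum algorithm is not described here.

```python
import struct
from typing import NamedTuple


class RncHeader(NamedTuple):
    """Hypothetical container for the header fields (signature excluded)."""
    method: int
    unpacked_size: int
    packed_size: int
    unpacked_crc: int
    packed_crc: int
    leeway: int
    chunk_count: int


# ">" selects big-endian with no padding: 3 + 1 + 4 + 4 + 2 + 2 + 1 + 1 bytes.
HEADER_FORMAT = ">3sBIIHHBB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def parse_rnc_header(data: bytes) -> RncHeader:
    """Parse the fixed-size header at the start of `data`."""
    if len(data) < HEADER_SIZE:
        raise ValueError("data is shorter than the RNC header")
    signature, *fields = struct.unpack_from(HEADER_FORMAT, data)
    if signature != b"RNC":
        raise ValueError("missing RNC signature")
    header = RncHeader(*fields)
    if header.method not in (1, 2):
        raise ValueError(f"unknown compression method {header.method}")
    return header
```

Since the compressed size excludes the header, a caller would presumably expect the compressed data to occupy `data[HEADER_SIZE:HEADER_SIZE + header.packed_size]`.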

Method 1

RNC Method 1 consists of a bit stream and a byte stream. The bit stream is read as little-endian 16-bit words, with the bits of each word consumed from right to left (least significant bit first). The first two bits of the bit stream are ignored.
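The bit order described above can be sketched as a small reader. This is a simplified illustration: the class name `Lsb16BitReader` is hypothetical, the reader treats the bit stream as a standalone buffer (how the bit and byte streams are laid out relative to each other in a real file is not covered here), and it assumes that the first bit read becomes the least significant bit of a multi-bit value.

```python
class Lsb16BitReader:
    """Reads bits LSB-first from consecutive little-endian 16-bit words."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0      # index of the next unread byte
        self.buffer = 0   # unread bits; the next bit to deliver is bit 0
        self.count = 0    # number of valid bits currently in the buffer

    def read(self, nbits: int) -> int:
        """Return the next `nbits` bits as an integer."""
        while self.count < nbits:
            if self.pos + 2 > len(self.data):
                raise EOFError("bit stream exhausted")
            word = int.from_bytes(self.data[self.pos:self.pos + 2], "little")
            self.pos += 2
            self.buffer |= word << self.count  # append above the unread bits
            self.count += 16
        value = self.buffer & ((1 << nbits) - 1)
        self.buffer >>= nbits
        self.count -= nbits
        return value
```

A decoder built on this sketch would begin with `reader.read(2)` to discard the two ignored bits.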

Each pack chunk contains the following:

  • Three Huffman trees in the bit stream (one for literal data sizes, one for distance values, and one for length values). Each tree consists of:
    • A 5-bit value giving the number of leaf nodes in the tree.
    • A 4-bit value for each leaf node giving its bit depth.
  • One 16-bit value in the bit stream giving the number of subchunks in the pack chunk.
  • The subchunk data, which contains for each subchunk:
    • A Huffman code from the bit stream, decoded with the first tree, giving the number of literals to take from the byte stream.
    • The literals themselves, from the byte stream.
    • A Huffman code from the bit stream that represents the distance - 1 of a distance/length pair.
    • A Huffman code from the bit stream that represents the length - 2 of a distance/length pair.
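The tree description above stores only bit depths, so a decoder has to rebuild the actual codes. The sketch below reads the depths and assigns codes in the canonical way (shorter codes first, in symbol order within each length). That assignment rule is an assumption, since the text above does not state it, and a decoder that reads bits least significant first may also need to reverse each code. The function names and the `read_bits` callable are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def read_bit_depths(read_bits: Callable[[int], int]) -> List[int]:
    """Read one tree: a 5-bit leaf count, then a 4-bit depth per leaf."""
    leaf_count = read_bits(5)
    return [read_bits(4) for _ in range(leaf_count)]


def assign_canonical_codes(depths: List[int]) -> Dict[Tuple[int, int], int]:
    """Map (code length, code value) to symbol index.

    Assumption: codes are canonical. A depth of 0 marks an unused symbol.
    """
    codes: Dict[Tuple[int, int], int] = {}
    code = 0
    for length in range(1, 16):          # a 4-bit depth is at most 15
        for symbol, depth in enumerate(depths):
            if depth == length:
                codes[(length, code)] = symbol
                code += 1
        code <<= 1                       # move on to the next code length
    return codes
```

Under this assumption, the decoded symbol for the distance tree would be increased by 1 and the symbol for the length tree by 2 to obtain the actual distance and length.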

Method 2

Method 2 also consists of a bit stream and a byte stream, and it also ignores the first two bits of the bit stream. However, the bit stream is read one byte at a time, from left to right (most significant bit first). It contains no Huffman trees or subchunks. Instead, it uses a fixed prefix code in the following format:

| Bit Stream | Byte Stream | Description |
|---|---|---|
| 0 | X | Literal byte X |
| 10 + Length + Distance | X | Copy Length bytes from (Distance * 256) + X + 1 |
| 10111XXXX | XXXX * 4 + 12 bytes | Get XXXX * 4 + 12 literal bytes |
| 110 | X | Copy 2 bytes from X + 1 |
| 1110 + Distance | X | Copy 3 bytes from (Distance * 256) + X + 1 |
| 1111 + Distance | X Y | If X != 0, copy Y + 8 bytes from (Distance * 256) + X + 1 |
| 11110 | 0 | End of file |
| 11111 | 0 | End of pack chunk |
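The match rows in the table above all compute a source position of the form (Distance * 256) + X + 1. A minimal sketch of such a copy follows, assuming (as is usual for LZSS-style formats, though not stated above) that the position is counted backwards from the current end of the output. The function name `copy_match` and its parameters are hypothetical.

```python
def copy_match(output: bytearray, distance_code: int, low_byte: int,
               length: int) -> None:
    """Append `length` bytes copied from earlier in `output`.

    Assumption: the offset is measured back from the end of the output.
    """
    offset = distance_code * 256 + low_byte + 1
    if offset > len(output):
        raise ValueError("match reaches before the start of the output")
    # Copy one byte at a time so that a match may overlap the bytes it is
    # producing (an offset of 1 repeats the last byte `length` times).
    for _ in range(length):
        output.append(output[-offset])
```

Copying byte by byte rather than slicing is a deliberate choice here: it keeps overlapping matches correct, which is how short offsets can encode long runs.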

The Length and Distance codes are as follows:

Length:

Code Value
00 4
10 5
010 6
011 7
110 8

Distance:

Code Value
0 0
110 1
1000 2
1001 3
10101 4
10111 5
11101 6
11111 7
101000 8
101001 9
101100 10
101101 11
111000 12
111001 13
111100 14
111101 15
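Because both code sets above are prefix-free, a decoder can read one bit at a time until the bits collected so far match an entry. The sketch below shows this with a reader that takes bits from each byte most significant bit first. The names `MsbBitReader` and `decode_symbol` are hypothetical, and the reader is simplified: it treats the bit stream as a standalone buffer and does not skip the two ignored bits.

```python
from typing import Dict

LENGTH_CODES: Dict[str, int] = {
    "00": 4, "10": 5, "010": 6, "011": 7, "110": 8,
}

DISTANCE_CODES: Dict[str, int] = {
    "0": 0, "110": 1, "1000": 2, "1001": 3,
    "10101": 4, "10111": 5, "11101": 6, "11111": 7,
    "101000": 8, "101001": 9, "101100": 10, "101101": 11,
    "111000": 12, "111001": 13, "111100": 14, "111101": 15,
}


class MsbBitReader:
    """Reads single bits from each byte, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # index of the next bit to read

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit


def decode_symbol(reader: MsbBitReader, table: Dict[str, int]) -> int:
    """Read bits until they match a code in `table`; return its value."""
    longest = max(len(code) for code in table)
    bits = ""
    while len(bits) < longest:
        bits += str(reader.read_bit())
        if bits in table:
            return table[bits]
    raise ValueError(f"no entry for bit pattern {bits}")
```

Note that the length pattern 111 is deliberately absent from `LENGTH_CODES`: in the command table it introduces a run of literals rather than a match, so a full decoder would check for it before calling `decode_symbol`.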
