The CC2M File Format

This document describes the basic format of the CC2M file format. The name "CC2M" is short for "Chip's Challenge 2 Map", and this is the native format used for CC2 levels. However, it is also used for the CC1 levels that accompany the CC2 release, and this is what is documented below. Please note that this document only covers those aspects of the file format as used by the CC1 ruleset; CC2-specific features are documented elsewhere.

Please also note that this is a work in progress. There are still several aspects of the CC2M file format that are unexplained below.

The Basic Layout

The file header is ten bytes in size, and consists of the following three fields:

0-3  signature bytes: "CC2M"
4-7  signature value: 2
8-9  file subtype: a value between 0x33 and 0x37

It is not fully known what the file subtype indicates. The assumption is that it is a version number that allows the software to be backwards-compatible with previous file versions. Note that the observed values map to the ASCII digits "3" through "7". It is assumed that this is not coincidental, and the rest of this text will refer to subtypes 3 through 7.

The file header is followed by a list of one or more blocks. These blocks comprise the file's contents.

Each block begins with an eight-byte header:

0-3  block type, e.g. "LOCK", "TITL"
4-7  size of block contents (not including this header)

The block type signature is always four ASCII characters. Type names which are only three characters long are space-padded, such as with "KEY " and "END ".

The header is followed by the block contents, the details of which depend on the block type.

Many blocks consist of a single string. Strings are always terminated with a NUL byte. An empty string block is encoded as a single NUL byte.

The Block Types

The following list covers the details of each block type.

TITL block (required)

A string containing the level's name.

LOCK block (optional)

A string describing the level's provenance, or something similar. In the CC1 levels this block either names Chuck Sommerville, John K. Elion, or Alice Voith.

CLUE block (optional)

A string containing the hint text. Line breaks are marked with a CR LF pair (though the CC1 game will wrap text with long lines). Levels with no hints typically have a single line break as the complete text instead of an empty string (or just being omitted, which is what they original Lynx levels did).

AUTH block (required?)

A string typically providing the name of the level's author.

NOTE block (optional)

An optional string value, presumably with no explicit requirements as to its contents. Typical usage in CC1 is to name the person who recorded the accompanying solution. (Given the frequency of typos in the existing NOTE blocks, this data is not meant to be displayed to the player.) In CC2 it often contains extra hint texts.

VERS block (optional)

A string value, presumably indicating a version number for the map. This block is not used with any CC1 levels, and only appears in a handful of CC2 levels.

OPTN block (required?)

A short string of bytes indicating various global settings. The first two bytes gives the level's time (with zero indicating no time limit). The third byte is always one in CC1. Levels of subtype 3, 4, and 5 always have only three bytes in this block. Levels of subtype 6 and 7 have up to 25 bytes. The next three bytes, if present, are all low integers, with zero being the most common (only two CC1 have a single one value). The remaining bytes, if present, show no obvious pattern.

PACK block (required)

This block defines the level's map and the placement of its initial contents. As the name suggests, the data in this block has been "packed", or compressed. Descriptions of both the data in this block and the unpacking algorithm are given below.

KEY block (optional)

This block is always 16 bytes long. No obvious pattern is present in the values. Its purpose is unknown. Perhaps it is a checksum of some kind? Only one level in the CC1 set lacks a KEY block; presumably this is due to an oversight rather than that level being actually unique in some way.

REPL / PRPL block (optional?)

This block (whichever one is present) contains a solution replay. Files of subtype 3 and 4 use REPL blocks; file of the later subtypes use PRPL blocks. The replay data in each is the same, but the data in the PRPL blocks has been packed. See below for descriptions of the replay data and of the unpacking algorithm.

END block (required)

No content; the END block always marks the end of the block list.

How to Decompress Packed Block Data

Both the PACK blocks and the PRPL blocks are "packed", or compressed, using a simple scheme. The decompression algorithm works as follows.

The first two bytes of the packed data block give the full size of the data after unpacking.

The next byte specifies the number of unpacked bytes at the beginning of the data. This value cannot be greater than 127. (This value also cannot be zero, since packed bytes must be able to refer to bytes that appear earlier in the data stream.)

Unpacked bytes are copied to the output stream without modification.

After a sequence of unpacked bytes comes a sequence of repeated bytes — typically three or more bytes that matches a sequence appearing earlier in the stream. This repeated sequence is encoded by a pair of bytes in the packed data stream.

The first byte of the pair always has the high bit (0x80) set. The other seven bits specify the length of the repeated sequence. The second byte in the pair specifies how many bytes back from the current position that the sequence begins.

(Note that the two instances of the repeated sequence can actually overlap, in which case the bytes before the overlap are repeated throughout the overlapping region — much like using overlap in C's memcpy() function to lay down a repeating sequence. For example, the stream 0x41 0x42 0x88 0x02 will repeat the pair 0x41 0x42 for 10 bytes.)

If the byte immediately after the pair defining the repeated sequence has its high bit set, then it is assumed to be the start of another pair defining another repeated sequence. Otherwise, the third byte instead provides the number of unpacked bytes that follow the triplet.

If, on the other hand, the first byte of a repeated sequence specification does NOT have the high bit set, then it represents a null sequence, expanding to an empty byte string. There is no second or third byte in the specification in this case, and the one byte that is present provides the number of unpacked bytes that follow. This null-specification is used for when there is more than 127 consecutive unpacked bytes in the data stream.

This alternation between unpacked data and strings of repeated sequences continues until the end of the block is reached. The block will always end with an extra NUL byte, which the last encoded repeated sequence points to. (Note that this final NUL byte simply marks the end of the compressed data, and is not copied to the output stream.)

Description of the Map Data

The PACK block contains the data that (after being unpacked) defines the level's map, including all of its contents and their initial placement.

The map data begins with the first two bytes giving the width and height of the map respectively. The bytes following this pair define the contents of each cell of the map in reading order (left-to-right, top-to-bottom).

Most map cells contain a single object or feature, which is specified by a single byte. Some cell contents require a second byte to fully specify them, however. Also, some contents implicitly allow other objects to share the same cell, which appear in the data stream immediately following. The reader of the map data needs to know which values have these features in order to know when to advance to the next cell.

In the following table, the first column gives the byte value, followed by the name of the object that value represents. If the third column contains a +, then that object takes a second byte that further defines it (see below for details). The fourth column contains a * for objects that can coexist with another object, and so will always be followed by at least one more byte. In the case when there is nothing else in the cell, then the following byte will be 0x01 for empty.

01empty        26red key  *
02wall        27blue key  *
03ice        28yellow key  *
04ice corner NE        29green key  *
05ice corner SE        2Aic chip (required)  *
06ice corner SW        2Bic chip (extra)  *
07ice corner NW        2Cchip socket   
08water        2Dpopup wall   
09fire        2Einvisible wall   
0Aforce floor north        2Finvisible wall, temporary   
0Bforce floor east        30blue block wall   
0Cforce floor south        31blue block floor   
0Dforce floor west        32dirt   
0Etoggle wall, closed        33bug +*
0Ftoggle wall, open        34paramecium +*
11teleport        35ball +*
14exit        36blob +*
16Chip +*     37teeth +*
17block +*     38fireball +*
18walker +*     3Ared button   
19glider +*     3Abrown button   
1Bpartition wall south +*     3Bice skates  *
1Cpartition wall east +*     3Cmagnet boots  *
1Dpartition wall SE +*     3Dfire boots  *
1Egravel        3Eflippers  *
1Fgreen button        3Fthief   
20blue button        40bomb  *
21tank +*     42trap   
22red door        43clone machine   
23blue door        45hint button   
24yellow door        46force floor all   
25green door        46partition wall +*

Most contents with an extra byte are creatures, and the second byte indicates the creature's orientation, i.e. which way the creature is initially facing. The orientation byte will be 0x00 for north, 0x01 for east, 0x02 for south, or 0x03 for west.

A different type of byte is used with 0x6D, a generic partition wall value. For this, the following byte indicates which edges have walls. The byte will have one or more bits set, where 0x01 indicates north, 0x02 east, 0x04 south, and 0x08 west. Note that this value is partly redundant with the content values 0x1B, 0x1C, and 0x1D. It is not known why this is. (It is also not know if the CC1 game will actually permit a partition wall to the north or west.)

All creatures permit a second set of contents to occupy the same map cell. (This is consistent with the original Lynx game, which actually stored creatures separately from the basic map data.)

The remaining contents that permit other contents are all objects that can be removed from the cell, i.e. keys, boots, chips, and bombs. (Not fake walls or doors, though.) In the original Lynx game these objects could only share a cell with a creature (usually a block). It's not clear if CC1 would actually handle such combinations as a key atop ice, or a bomb sharing a spot with a toggle wall, but the file format seems to permit such.

The file format also seems to permit an unlimited stacking of contents in a single cell, but presumably this is just for the sake of consistency. In practice the third set of contents in a map cell is always 0x01 for empty.

Note that no metadata is present in this block (other than the map size). This seems to indicate that not only are red and brown buttons implicitly connected via Lynx's reading-order rule, but that it is also impossible to change the ordering of the creature list to differ from the order implied by the map. (This is different from the original Lynx game, which could set an arbitrary monster order, though this possibility was not exercised in the original levels.)

Description of the Replay Data

This data contains a recording of a solution for the level. The data begins with a four-byte value. Its purpose is unknown. By far the most common value is all-bits zero. For levels with a nonzero value in this field, often only one of the four bytes is nonzero, so this "field" may actually be four separate byte values.

The remainder of the block is a stream of two-byte entries. The first byte value of an entry identifies the command being input. Possible base values are:

00 none
02 down
04 left
08 right
10 up
80 special

A horizontal and vertical value pair can also be bitwise-combined to indicate a "diagonal" move. The zero value "none" is used to record periods when no commands are selected.

The second byte value indicates how long the command is selected. (That is, how long the key is held down.) The time unit for this value appears to be 1/3 of a frame (or 1/6 of a tick), which would be equal to 1/60 of a second.

The maximum possible value for the time field appears to be 0xFC (equal to 42 ticks). If a key is held down for longer than this, then the entry will have a time value of 0xFC, the next entry in the stream will be the special value 0x80 0x00, and then this will be followed by a second entry with the same command value. Its time will be added to the 0xFC of the initial entry to give the full amount. Presumably, multiple fields could be chained together in this fashion to record a suitable long period of time.

(For many solutions, the stream is a sequence alternating between one key down and no keys down. For many other solutions, the stream alternates between entries for one key down and for two keys down, depending on the user's keying habits.)

The end of the stream is marked with at least one entry that contains the values 0x00 0xFF. Sometimes there is more than one such entry. There is often also a trailing 0x00 byte, but not always. Not sure why.

Tile World
Brian Raiter