Kornet's Format Information

KorNet
VVIP member
Posts: 444
Joined: Tue Apr 12, 2005 11:36 am
Been thanked: 4 times

Post by KorNet »

FlatOut BFS format
P.S - Pascal (Delphi) code :D

unit BfsReader;

interface

uses
Classes, SysUtils, ZLib;

type
TBfsHeader = record
Marker: Cardinal;
Version: Cardinal;
DataOffset: Cardinal;
NumEntries: Cardinal;
end;

TBfsEntry = packed record
Info: packed record
CompMethod: Cardinal; // 4 = raw, 5 = zlib
DataOffs: Cardinal;
Size: Cardinal;
CompSize: Cardinal;
CheckSum: Integer; // Checksum of the compressed data. Read it as a signed int, drop the sign and subtract 1 to get the CRC32, e.g. CheckSum = -165278610 gives CRC32 = 165278609
NameLen: Word;
end;
Name: packed array of Char;
end;

TBfsReader = class
private
function GetCount: Integer;
function GetEntry(Index: Integer): TBfsEntry;
protected
FEntries: array of TBfsEntry;
FHeader: TBfsHeader;
FStream: TStream;
public
constructor Create(FileName: string);
destructor Destroy; override;
function Extract(Index: Integer; Dest: TStream): Integer;
property Entries[Index: Integer]: TBfsEntry read GetEntry;
property Count: Integer read GetCount;
end;

implementation

{ TBfsReader }

constructor TBfsReader.Create(FileName: string);
var
i: Integer;
Offsets: array of Cardinal;
begin
inherited Create;
FStream := TFileStream.Create(FileName, fmOpenRead + fmShareCompat);

// Read header
FStream.Read(FHeader, SizeOf(FHeader));

SetLength(Offsets, FHeader.NumEntries);
SetLength(FEntries, FHeader.NumEntries);

// Read file entry offsets
FStream.Read(Offsets[0], FHeader.NumEntries * SizeOf(Cardinal));

// Skip unknown data (parent/child/sibling structure or something?): a count
// followed by two words for each entry
FStream.Read(i, SizeOf(Cardinal));
FStream.Seek(i * (SizeOf(Word)*2), soFromCurrent);

// Read entries
for i := 0 to FHeader.NumEntries -1 do
begin
FStream.Seek(Offsets[i], soFromBeginning);
FStream.Read(FEntries[i].Info, SizeOf(FEntries[i].Info));
SetLength(FEntries[i].Name, FEntries[i].Info.NameLen +1);
FStream.Read(FEntries[i].Name[0], FEntries[i].Info.NameLen);
FEntries[i].Name[FEntries[i].Info.NameLen] := #0;
end;
end;


destructor TBfsReader.Destroy;
begin
FreeAndNil(FStream);
inherited;
end;

function TBfsReader.Extract(Index: Integer; Dest: TStream): Integer;
var
DecompStream: TDecompressionStream;
begin
Result := 0;
if FEntries[Index].Info.Size = 0 then Exit;
FStream.Seek(FEntries[Index].Info.DataOffs, soFromBeginning);

case FEntries[Index].Info.CompMethod of
// Raw data, just copy
4: Result := Dest.CopyFrom(FStream, FEntries[Index].Info.Size);
// ZLib compressed data, decompress
5: begin
DecompStream := TDecompressionStream.Create(FStream);
try
Result := Dest.CopyFrom(DecompStream, FEntries[Index].Info.Size);
finally
FreeAndNil(DecompStream);
end;
end;
else
raise Exception.Create('Unknown compression format');
end
end;

function TBfsReader.GetCount: Integer;
begin
Result := Length(FEntries);
end;

function TBfsReader.GetEntry(Index: Integer): TBfsEntry;
begin
Result := FEntries[Index];
end;

end.
:D
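
The CheckSum comment in TBfsEntry can be sketched in C as below. This is only a literal reading of that comment (drop the sign, subtract 1); the function name is made up for illustration, and the handling of non-negative checksums is an assumption, since the post only gives a negative example.

```c
#include <stdint.h>

/* Hypothetical helper: turn the signed CheckSum field of a BFS entry into
   the CRC32 value, following the comment in TBfsEntry literally:
   take the magnitude, then subtract 1.
   For negative inputs this is the same as a bitwise NOT of the value.
   Non-negative inputs are not covered by the post; treating them the same
   way is an assumption. */
uint32_t bfs_checksum_to_crc32(int32_t checksum)
{
    int64_t magnitude = checksum < 0 ? -(int64_t)checksum : (int64_t)checksum;
    return (uint32_t)(magnitude - 1);
}
```

With the example from the post, bfs_checksum_to_crc32(-165278610) gives 165278609.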

Post by KorNet »

James Bond NightFire 007 (Delphi - Pascal code)

type N007Header = packed record
Magic: Integer; // Always 1 ?
Version: Integer; // 1 = Used in Nightfire Demo
// 3 = Used in Nightfire Retail
Magic2: Integer;
ID: array[0..3] of char;
NumRootDirs: Integer;
end;
// If version = 3 then following the header is an integer giving the number of entries
// Get32 filename
N007Entry = packed record
Compressed: byte;
Size: integer;
CompSize: integer;
end;
// If version = 1 then follows Size bytes of data (if Compressed = 0) or CompSize bytes (if Compressed = 1)
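
A rough C sketch of reading one N007Entry and picking the payload length for version 1 archives, as described in the last comment. The function names are hypothetical, little-endian byte order is assumed (the records come from Delphi on x86), and treating any non-zero Compressed value as "compressed" is also an assumption.

```c
#include <stdint.h>

/* Mirror of the packed N007Entry record: 1 + 4 + 4 = 9 bytes on disk. */
typedef struct {
    uint8_t compressed;
    int32_t size;
    int32_t comp_size;
} N007Entry;

/* Read a little-endian 32-bit value (byte order is an assumption). */
static int32_t n007_read_le32(const uint8_t *p)
{
    return (int32_t)((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                     ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Decode the 9 raw bytes of an entry field by field, so compiler
   padding of the C struct does not matter. */
N007Entry n007_parse_entry(const uint8_t raw[9])
{
    N007Entry e;
    e.compressed = raw[0];
    e.size       = n007_read_le32(raw + 1);
    e.comp_size  = n007_read_le32(raw + 5);
    return e;
}

/* Number of data bytes that follow the entry in a version 1 archive:
   Size if not compressed, CompSize if compressed. */
int32_t n007_payload_size(const N007Entry *e)
{
    return e->compressed ? e->comp_size : e->size;
}
```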

Post by KorNet »

SpellForce PAK file (Delphi - Pascal code)

type MFHeader = packed record
VersionNum: integer;
ID: array[0..23] of char; // 'MASSIVE PAKFILE V 4.0'
Unknown: array[0..43] of byte;
Unknown2: integer;
NbFatEntry: integer;
Unknown3: integer;
DataOffset: integer;
FileSize: integer;
end;
MFEntry = packed record
Size: integer;
RelOffset: integer;
NameOffset: integer;
Unknown: integer;
end;

Post by KorNet »

Heroes of Might & Magic 4 H4R file (Delphi - Pascal code)

type H4R_Header = packed record
ID: array[0..3] of char;
DirOffset: integer;
end;
H4R_Index = packed record
Offset: integer;
Size: integer;
DSize: integer;
Unk: integer; // Always 9C 73 86 3C
end;
// get16 filenameID
// get16 source directory
// 2 bytes 00 00
// Integer (Storage/Compression?)

Post by KorNet »

The 11th Hour GJD file (Delphi - Pascal code)

type GJD_Entry = packed record // 32 Bytes
Unknown: integer;
Offset: longword;
Size: longword;
Index: word;
Filename: array[1..18] of char;
end;

type GJD_Entry_7 = packed record // 20 Bytes
Filename: array[1..12] of char;
Offset: longword;
Size: longword;
end;

Post by KorNet »

DreamWorks BLB file (C++ code)

struct BLBHeader
{
char szID[4];
BYTE bID;
BYTE bUnknown;
WORD wDataSize;
DWORD dwFileSize;
DWORD dwNumber;
};

szID -- string ID is always "\x40\x49\x00\x02".

bID -- byte ID is always 0x07.

wDataSize -- the size of data section of BLB file (see below).

dwFileSize -- the size of BLB file.

dwNumber -- the number of files stored in BLB file.

After the header comes the array of (dwNumber) file IDs. Each file ID is a
DWORD identifying a file in BLB archive.

After the file IDs array comes the array of (dwNumber) directory entries.
Each directory entry contains the info on a file in BLB archive. Each such
entry has the following format:

struct BLBDirEntry
{
BYTE bType;
BYTE bAction;
WORD wDataIndex;
DWORD dwUnknown;
DWORD dwStart;
DWORD dwFileID;
DWORD dwOutSize;
};

bType -- the type of the file:
0x07 -- sound effect,
0x08 -- music,
0x0A -- video file (SMK -- Smacker video, http://www.smacker.com),
there're some more types, but their purpose is not so evident (e.g. 0x02
seems to be graphic file type).

bAction -- defines the action which should be performed for the file:
0x01 -- none: the file is non-compressed, no additional actions are required,
0x03 -- decompress: the file is PKWARE-compressed (see below),
0x65 -- fake: the file is fake, that is no file is really present (see below).

wDataIndex -- the index into file data array (see below) for the byte
correspondent to the file.

dwStart -- the starting position of the file relative to the beginning of the
BLB archive.

dwFileID -- this field is only relevant for fake files. A fake file directory
entry is just a placeholder: there's no real file in the archive for it.
Instead, the entry points to some other file (perhaps in another BLB archive)
whose file ID equals (dwFileID), and that file should be used in place of the
fake one. So a fake file is a kind of alias for another (non-fake) file. Note
that, in principle, a fake file may be an alias for another fake file, etc.,
but the chain of such redirections should finally end at a non-fake file.

dwOutSize -- the output size of the file:
for non-compressed files that's just the size of the file,
for PKWARE-compressed files that's the size of the decompressed file,
for fake files use the value from the directory entry of the corresponding
non-fake file.
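
The fake-file redirection can be sketched in C roughly like this. It only looks inside a single archive (the text says the target may live in another BLB, which this sketch does not handle), the function name is made up, and the hop limit used to catch cyclic aliases is my own addition.

```c
#include <stdint.h>

#define BLB_ACTION_FAKE 0x65

/* In-memory copy of BLBDirEntry. It is filled field by field, not read
   straight from disk, so struct padding is not an issue here. */
typedef struct {
    uint8_t  bType;
    uint8_t  bAction;
    uint16_t wDataIndex;
    uint32_t dwUnknown;
    uint32_t dwStart;
    uint32_t dwFileID;
    uint32_t dwOutSize;
} BLBDirEntry;

/* Follow the alias chain starting at entry `index`.
   fileIDs[k] is the file ID belonging to dir[k].
   Returns the index of the first non-fake entry, or -1 if the target ID
   is not in this archive or the chain loops. */
long blb_resolve_fake(const uint32_t *fileIDs, const BLBDirEntry *dir,
                      uint32_t count, uint32_t index)
{
    uint32_t hops;
    for (hops = 0; hops <= count; hops++) {
        uint32_t j;
        if (index >= count)
            return -1;
        if (dir[index].bAction != BLB_ACTION_FAKE)
            return (long)index;            /* real file found */
        for (j = 0; j < count; j++)        /* look up the alias target by ID */
            if (fileIDs[j] == dir[index].dwFileID)
                break;
        if (j == count)
            return -1;                     /* target is not in this archive */
        index = j;
    }
    return -1;                             /* more hops than entries: a cycle */
}
```

The entry found this way is also the one whose dwOutSize should be used for the fake file.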

After the array of directory entries comes the data section. This is the array
of (wDataSize) bytes. Let's call these bytes "shifts". The shift value for
the file may be obtained by getting the byte with index (wDataIndex) (see above)
from the file data array (indices are zero-based). If the index value is too
large (not less than data section size (wDataSize)), the shift value should
be set to default (0xFF).
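
The shift lookup described above is tiny; a minimal C sketch (the function name is hypothetical, `data` is assumed to hold the wDataSize bytes of the data section):

```c
#include <stdint.h>

/* Return the shift value for a file: the byte at wDataIndex in the data
   section, or the default 0xFF when the index is out of range. */
uint8_t blb_shift_value(const uint8_t *data, uint16_t wDataSize,
                        uint16_t wDataIndex)
{
    if (wDataIndex >= wDataSize)
        return 0xFF;
    return data[wDataIndex];
}
```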

After the file data section come files contained in BLB archive.

============================================
2. Decompression of PKWARE-compressed Files
============================================

I will not describe here the algorithm of PKWARE decompression. What I'll
explain is the easy way you may use to access PKWARE-compressed files via
PKWARE's library PKWARE.DLL. This library is supplied with GAP's BLB RF
plug-in (see below).

Here's the sample C code (using Win32 API):

// decompression function -- returns zero on success
DWORD (__cdecl *Uncompress)
(
char *outputBuffer,
DWORD *pOutSize,
char *inputBuffer,
DWORD dwSize
);

// first, load the library
HINSTANCE hDllInst=LoadLibrary("pkware.dll");

// get the decompression function address
Uncompress=(DWORD (__cdecl *)(char*, DWORD*, char*, DWORD))GetProcAddress(hDllInst,"Uncompress");

// decompress file -- it's assumed here that input buffer contains compressed
// file loaded from BLB archive and output buffer is allocated and has proper
// size (that is, (dwOutSize) value from the corresponding directory entry)
Uncompress(outputBuffer,&dwOutSize,inputBuffer,dwSize);

// now (outputBuffer) contains decompressed file and (dwOutSize) is set to
// decompressed file size (you should use directory entry value of output
// size for output buffer allocation)

================================
3. BLBSFX Sound and Music Files
================================

As was pointed out above, files with type bytes 0x07 and 0x08 are sound and
music files. All of them share the same format, which I refer to as BLBSFX.
A BLBSFX file has no header; it's just a compressed (or non-compressed) waveform
stream. All sound/music files in The Neverhood are 16-bit mono 22050 Hz.
If the shift value for the BLBSFX file is 0xFF, the file is not compressed
and is just a PCM waveform stream (signed 16-bit).
Otherwise, if the shift byte differs from 0xFF, the BLBSFX file is compressed
using the DW ADPCM compression algorithm. Refer to the following section for the
description of the DW ADPCM decompression scheme.
Note that most sound files in The Neverhood are PKWARE-compressed, so you
should first decompress them (e.g. using PKWARE.DLL and the approach
described above) and then apply the DW ADPCM decompression scheme (if needed).
None of the music files are PKWARE-compressed, but most of them are DW ADPCM
compressed.

====================================
4. DW ADPCM Decompression Algorithm
====================================

During decompression a SHORT variable holding the current sample value should
be maintained. Decompression uses the shift value, so it should be obtained first.

Here's the code which decompresses a DW ADPCM compressed BLBSFX file:

BYTE bShift; // shift value
signed char inputBuffer[dwSize]; // compressed data, one signed delta per byte

SHORT iCurValue;
DWORD i;

iCurValue=0x0000;

for (i=0;i<dwSize;i++)
{
iCurValue+=(signed short)inputBuffer[i];
Output(iCurValue<<bShift);
}

Output() is just a placeholder for any action you would like to perform for
decompressed sample value.
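
For reference, here is the same loop as a self-contained C function that writes into a buffer instead of calling Output(). The function name is made up, and two things are assumptions on my side: the shifted value is truncated to 16 bits, and shift values above 15 (including the "not compressed" marker 0xFF) are rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode n DW ADPCM bytes into n 16-bit samples.
   Each input byte is a signed delta added to the running value; the
   output sample is the running value shifted left by `shift`.
   Returns the number of samples written, or 0 if shift is out of range. */
size_t dw_adpcm_decode(const uint8_t *in, size_t n, uint8_t shift,
                       int16_t *out)
{
    int16_t cur = 0;
    size_t i;

    if (shift > 15)
        return 0;                       /* 0xFF means plain PCM, not ADPCM */

    for (i = 0; i < n; i++) {
        cur = (int16_t)(cur + (int8_t)in[i]);
        /* shift as unsigned to avoid undefined behaviour on negatives,
           then keep the low 16 bits (an assumption about Output()) */
        out[i] = (int16_t)(uint16_t)((uint32_t)(uint16_t)cur << shift);
    }
    return n;
}
```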

Post by KorNet »

Atlantis: The Lost Tale;
Atlantis 2;
Aztec;
Egypte 2;
Odyssee;
Chine;

Formats: .APC, .HNM, .BF, .ZIK

struct APCHeader
{
char szID[8];
char szVersion[4];
DWORD dwOutSize;
DWORD dwSampleRate;
LONG lSampleLeft;
LONG lSampleRight;
DWORD dwStereo;
};

szID -- ID string, which is "CRYO_APC".

szVersion -- version ID string, all files in the mentioned games have this
set to "1.20". Note that this field may, in principle, vary.

dwOutSize -- number of samples in the file. May be used for song length
(in seconds) calculation.

dwSampleRate -- sample rate for the file.

lSampleLeft -- the initial value for the left sample (see below).

lSampleRight -- the initial value for the right sample (see below).

dwStereo -- this seems to be boolean stereo flag: if this is not zero, the
audio stream in the file is stereo (music and partly video soundtracks),
otherwise it's mono (sfx, speech).

The resolution is NOT specified in the header, so the default value (16-bit)
should be used.

The IMA ADPCM compressed sound data comes right after the APCHeader. You may
find the IMA ADPCM decompression scheme description further in this document.

=====================================
2. IMA ADPCM Decompression Algorithm
=====================================

During the decompression four LONG variables must be maintained for stereo
stream: lIndexLeft, lIndexRight, lCurSampleLeft, lCurSampleRight and two --
for mono stream: lIndex, lCurSample. At the beginning of the file you must
initialize lCurSampleLeft/Right variables to the values from APCHeader while
lIndexLeft/Right are initialized to zeroes.
Note that LONG here is signed.

Here's the code which decompresses one byte of IMA ADPCM compressed
stereo stream. Other bytes are processed in the same way.

BYTE Input; // current byte of compressed data
BYTE Code;
LONG Delta;

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndexLeft]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexLeft];
if (Code & 2)
   Delta+=StepTable[lIndexLeft]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexLeft]>>2;
if (Code & 8) // sign bit
   lCurSampleLeft-=Delta;
else
   lCurSampleLeft+=Delta;

// clip sample
if (lCurSampleLeft>32767)
   lCurSampleLeft=32767;
else if (lCurSampleLeft<-32768)
   lCurSampleLeft=-32768;

lIndexLeft+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexLeft<0)
   lIndexLeft=0;
else if (lIndexLeft>88)
   lIndexLeft=88;

Code=LONIBBLE(Input); // get LOWER 4-bit nibble

Delta=StepTable[lIndexRight]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexRight];
if (Code & 2)
   Delta+=StepTable[lIndexRight]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexRight]>>2;
if (Code & 8) // sign bit
   lCurSampleRight-=Delta;
else
   lCurSampleRight+=Delta;

// clip sample
if (lCurSampleRight>32767)
   lCurSampleRight=32767;
else if (lCurSampleRight<-32768)
   lCurSampleRight=-32768;

lIndexRight+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexRight<0)
   lIndexRight=0;
else if (lIndexRight>88)
   lIndexRight=88;

// Now we've got lCurSampleLeft and lCurSampleRight which form one stereo
// sample and all is set for the next input byte...
Output((SHORT)lCurSampleLeft,(SHORT)lCurSampleRight); // send the sample to output

HINIBBLE and LONIBBLE are higher and lower 4-bit nibbles:
#define HINIBBLE(byte) ((byte) >> 4)
#define LONIBBLE(byte) ((byte) & 0x0F)
Note that depending on your compiler you may need to use additional nibble
separation in these defines, e.g. (((byte) >> 4) & 0x0F).

StepTable and IndexAdjust are the tables given in the next section of
this document.

Output() is just a placeholder for any action you would like to perform for
decompressed sample value.

Of course, this decompression routine may be greatly optimized.

As to mono sound, it's just analogous:

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndex]>>3;
if (Code & 4)
   Delta+=StepTable[lIndex];
if (Code & 2)
   Delta+=StepTable[lIndex]>>1;
if (Code & 1)
   Delta+=StepTable[lIndex]>>2;
if (Code & 8) // sign bit
   lCurSample-=Delta;
else
   lCurSample+=Delta;

// clip sample
if (lCurSample>32767)
   lCurSample=32767;
else if (lCurSample<-32768)
   lCurSample=-32768;

lIndex+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndex<0)
   lIndex=0;
else if (lIndex>88)
   lIndex=88;

Output((SHORT)lCurSample); // send the sample to output

Code=LONIBBLE(Input); // get LOWER 4-bit nibble
// ...just the same as above for lower nibble

Note that HIGHER nibble is processed first for mono sound and corresponds to
LEFT channel for stereo.

====================
3. IMA ADPCM Tables
====================

LONG IndexAdjust[]=
{
    -1,
    -1,
    -1,
    -1,
     2,
     4,
     6,
     8,
    -1,
    -1,
    -1,
    -1,
     2,
     4,
     6,
     8
};

LONG StepTable[]=
{
    7,	   8,	  9,	 10,	11,    12,     13,    14,    16,
    17,    19,	  21,	 23,	25,    28,     31,    34,    37,
    41,    45,	  50,	 55,	60,    66,     73,    80,    88,
    97,    107,   118,	 130,	143,   157,    173,   190,   209,
    230,   253,   279,	 307,	337,   371,    408,   449,   494,
    544,   598,   658,	 724,	796,   876,    963,   1060,  1166,
    1282,  1411,  1552,  1707,	1878,  2066,   2272,  2499,  2749,
    3024,  3327,  3660,  4026,	4428,  4871,   5358,  5894,  6484,
    7132,  7845,  8630,  9493,	10442, 11487,  12635, 13899, 15289,
    16818, 18500, 20350, 22385, 24623, 27086,  29794, 32767
};
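
Putting the decompression steps and the two tables together, a compact C sketch of the per-nibble step could look like this. The struct and function names are my own, the tables are copied from above, and the caller is assumed to start each channel with an index in the 0..88 range.

```c
#include <stdint.h>

static const int8_t IndexAdjust[16] = {
    -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8
};

static const int16_t StepTable[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16,
    17, 19, 21, 23, 25, 28, 31, 34, 37,
    41, 45, 50, 55, 60, 66, 73, 80, 88,
    97, 107, 118, 130, 143, 157, 173, 190, 209,
    230, 253, 279, 307, 337, 371, 408, 449, 494,
    544, 598, 658, 724, 796, 876, 963, 1060, 1166,
    1282, 1411, 1552, 1707, 1878, 2066, 2272, 2499, 2749,
    3024, 3327, 3660, 4026, 4428, 4871, 5358, 5894, 6484,
    7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899, 15289,
    16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Per-channel decoder state (lIndex / lCurSample in the text). */
typedef struct {
    int32_t index;   /* must be within 0..88 on entry */
    int32_t sample;
} ImaChannel;

/* Decode one 4-bit code and return the new sample for this channel. */
int16_t ima_decode_nibble(ImaChannel *ch, uint8_t code)
{
    int32_t step = StepTable[ch->index];
    int32_t delta = step >> 3;

    if (code & 4) delta += step;
    if (code & 2) delta += step >> 1;
    if (code & 1) delta += step >> 2;
    if (code & 8) ch->sample -= delta;     /* sign bit */
    else          ch->sample += delta;

    if (ch->sample > 32767)  ch->sample = 32767;    /* clip sample */
    if (ch->sample < -32768) ch->sample = -32768;

    ch->index += IndexAdjust[code & 0x0F];          /* adjust index */
    if (ch->index < 0)  ch->index = 0;              /* clip index */
    if (ch->index > 88) ch->index = 88;

    return (int16_t)ch->sample;
}

/* One stereo byte: higher nibble is the left channel, lower the right. */
void ima_decode_stereo_byte(ImaChannel *left, ImaChannel *right,
                            uint8_t input, int16_t out[2])
{
    out[0] = ima_decode_nibble(left, (uint8_t)(input >> 4));
    out[1] = ima_decode_nibble(right, (uint8_t)(input & 0x0F));
}
```

For mono, the same ima_decode_nibble() is called twice per byte on a single channel state, higher nibble first.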

=========================
4. HNM Movie Soundtracks
=========================

.HNM movies don't have a soundtrack inside the HNM movie file itself.
Instead, the soundtrack for an HNM movie is a stand-alone APC file, as a rule
with the same file title. The format of such APC soundtracks is just the
same as described above.

==================================
5. APC Audio Files in BF Archives
==================================

When stored in .BF resources, APC audio files are stored "as is", without
compression or encryption. That means if you want to play/extract an APC
file from a BF resource you just need to search for the (szID) id-string
("CRYO_APC") and read the APC header starting at the position of the
found id-string. This gives you the starting point of the file; the size
of the file is the sum of the APCHeader size and the compressed audio stream
size corresponding to the (dwOutSize) header field (that is, (dwOutSize) bytes
for a stereo stream and (dwOutSize)/2 bytes for a mono stream).
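
The search-and-measure approach can be sketched in C as follows. The function names are hypothetical, the BF resource is assumed to be fully loaded into memory, and header fields are assumed to be little-endian. The header is 8 + 4 + 5 * 4 = 32 bytes, with dwOutSize at offset 12 and dwStereo at offset 28.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define APC_HEADER_SIZE 32u
#define APC_NOT_FOUND   ((size_t)-1)

/* Read a little-endian DWORD (byte order is an assumption). */
static uint32_t apc_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Find the next "CRYO_APC" id-string at or after `start` that still has
   room for a whole header. Returns its offset or APC_NOT_FOUND. */
size_t apc_find(const uint8_t *buf, size_t len, size_t start)
{
    size_t i;
    for (i = start; i + APC_HEADER_SIZE <= len; i++)
        if (memcmp(buf + i, "CRYO_APC", 8) == 0)
            return i;
    return APC_NOT_FOUND;
}

/* Total size of an APC file given a pointer to its header:
   header size plus dwOutSize bytes (stereo) or dwOutSize/2 bytes (mono). */
size_t apc_file_size(const uint8_t *hdr)
{
    uint32_t out_size = apc_read_le32(hdr + 12);
    uint32_t stereo   = apc_read_le32(hdr + 28);
    return APC_HEADER_SIZE + (size_t)(stereo ? out_size : out_size / 2);
}
```

To list every APC in a resource, call apc_find() again starting just past the previous hit; the computed size should also be checked against the end of the buffer before extracting.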

===================
6. ZIK Audio Files
===================

In the game Chine, music is in ZIK files. Most of them are simple RAW files:
16-bit signed stereo 22050 Hz (no header), except for one which is a
regular WAV file. So you can just load and play those ZIKs as RAWs (except
for the one which is WAV).

Post by KorNet »

EA Games formats : ASF, AS4, KSF, EAS, SPH, BNK, CRD, TGV

=======================
1. ASF/AS4 Music Files
=======================

The music in many Electronic Arts games is in .ASF/.AS4 stand-alone files.
These files have a block structure analogous to RIFF. Namely, these files
are divided into blocks (without any global file header like RIFFs have).
Each block has the following header:

struct ASFBlockHeader
{
char szBlockID[4];
DWORD dwSize;
};

szBlockID -- string ID for the block.

dwSize -- size of the block (in bytes) INCLUDING this header.

Further I'll describe the contents of blocks of all block types in ASF/AS4
files. When I say "block begins with..." that means "the contents of that
block (which begin just after ASFBlockHeader) begin with...".
Quoted strings are block IDs.

"1SNh": header block. This is the first block in ASF/AS4.
This block begins with the structure describing the audio stream:

struct EACSHeader
{
char szID[4];
DWORD dwSampleRate;
BYTE bBits;
BYTE bChannels;
BYTE bCompression;
BYTE bType;
DWORD dwNumSamples;
DWORD dwLoopStart;
DWORD dwLoopLength;
DWORD dwDataStart;
DWORD dwUnknown;
};

szID -- ID string, always "EACS".

dwSampleRate -- sample rate for the file.

bBits -- if multiplied by 8 gives the resolution of (decompressed) sound
data, that is 1 means 8-bit and 2 means 16-bit.

bChannels -- channels number: 1 for mono, 2 for stereo.

bCompression -- if 0x00, the data in the file is not compressed: signed 8-bit
PCM or signed 16-bit PCM. If this byte is 0x02, the audio data is compressed
with IMA ADPCM. Note that non-compressed 8-bit files use SIGNED format!
Signed 16-bit data may be sent to the wave output without any additional
conversions, while signed 8-bit data should be converted to unsigned format.
For example you can do that so: unsigned8Bit=signed8Bit+0x80
or, just the same: unsigned8Bit=signed8Bit^0x80 (this's a bit faster).

bType -- type of file: always 0x00 for ASF/AS4 (multi-block) files.

dwNumSamples -- number of samples in the file. May be used for song length
(in seconds) calculation.

dwLoopStart -- beginning of the repeat loop (in samples). 0xFFFFFFFF means
no loop.

dwLoopLength -- length of the repeat loop (in samples). Zero for no loop.

dwDataStart -- in ASF/AS4 files this is not used (equal to 0).
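
The signed-to-unsigned conversion mentioned under bCompression, as a small in-place C helper (the name is made up; it uses the XOR form from the note):

```c
#include <stddef.h>
#include <stdint.h>

/* Convert signed 8-bit PCM to unsigned 8-bit PCM in place by flipping
   the top bit of every sample (same result as adding 0x80 modulo 256). */
void pcm8_signed_to_unsigned(uint8_t *samples, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        samples[i] ^= 0x80;
}
```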

After the EACSHeader the first chunk of sound data comes. If the data isn't
compressed, it's just signed 8/16-bit PCM. If the data is compressed, it
starts with a small chunk header:

struct ASFChunkHeader
{
DWORD dwOutSize;
LONG lIndexLeft;
LONG lIndexRight;
LONG lCurSampleLeft;
LONG lCurSampleRight;
};

dwOutSize -- size of uncompressed audio data in this chunk (in samples).

lIndexLeft, lIndexRight, lCurSampleLeft, lCurSampleRight are initial
values for IMA ADPCM decompression routine for this chunk (for left and right
channels respectively). I'll describe the usage of these further when I get to
IMA ADPCM decompression scheme.

Note that the structure above is ONLY for stereo files. For mono there're
just no lIndexRight and lCurSampleRight fields.

After this chunk header the compressed data comes. You may find IMA ADPCM
decompression scheme description further in this document.

Hereafter by "chunk" I mean the audio data in the "1SNd" data block, that is,
compressed data which starts after ASFChunkHeader.

"1SNd": data block. If no compression is used these blocks contain just
signed 8/16-bit PCM audio data. Otherwise the data in each of these blocks
begins with the same ASFChunkHeader described above and after that comes
compressed data.
Note that the first chunk of audio data is in "1SNh" block, along with the
global EACS header!

"1SNl": loop block. This block defines looping point for the song. It
contains only DWORD value, which is the looping jump position (in samples)
relative to the start of the song. Note that you should make the jump NOT
when you encounter this block but when you come across the "1SNe" block
which may appear some "1SNd" data blocks after this block!

"1SNe": end block. This block indicates the end of the audio stream. Make the
looping jump when you encounter it. It contains no data and its size is 8 bytes,
that is, the size of ASFBlockHeader.
Interestingly, some AS4 files contain audio data beyond this block. This
should be considered a non-standard feature not worth supporting.
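
Walking the block chain can be sketched in C like this. It works on a file already loaded into memory, assumes little-endian sizes, and stops at the "1SNe" block (so any data after it is ignored, as suggested above). The function name and the counting interface are just for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian DWORD (byte order is an assumption). */
static uint32_t asf_read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the chain of ASFBlockHeader blocks and count those whose ID
   matches `id` (4 characters). dwSize includes the 8-byte header, so the
   next block starts at pos + dwSize. The walk stops after "1SNe" or at
   the first malformed block. */
size_t asf_count_blocks(const uint8_t *buf, size_t len, const char *id)
{
    size_t pos = 0, count = 0;

    while (len - pos >= 8) {
        uint32_t size = asf_read_le32(buf + pos + 4);
        if (size < 8 || size > len - pos)
            break;                          /* malformed or truncated */
        if (memcmp(buf + pos, id, 4) == 0)
            count++;
        if (memcmp(buf + pos, "1SNe", 4) == 0)
            break;                          /* end of the audio stream */
        pos += size;
    }
    return count;
}
```

A real player would process the payload at buf + pos + 8 (size - 8 bytes) inside the loop instead of counting.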

===================
2. KSF Music Files
===================

Some EA games use other format for music/speech files: .KSF.
These files begin with "KWK`" ID string. Following this ID, comes PATl header.
It begins with "PATl" ID string and its size is 56 bytes (always?) including
its ID string. After PATl header comes TMpl header:

struct TMplHeader
{
char szID[4];
BYTE bUnknown1;
BYTE bBits;
BYTE bChannels;
BYTE bCompression;
WORD wUnknown2;
WORD wSampleRate;
DWORD dwNumSamples; // ???
BYTE bUnknown3[20];
};

szID -- string ID, always "TMpl".

bBits -- resolution of sound data (0x10 for 16-bit, 0x8 for 8-bit).

bChannels -- channels number: 1 for mono, 2 for stereo.

bCompression -- if 0x00, the data in the file is not compressed: signed 8-bit
PCM or signed 16-bit PCM. If this byte is 0x02, the audio data is compressed
with IMA ADPCM. See the note for EACS header above.

wSampleRate -- sample rate for the file.

dwNumSamples -- number of samples in the file. May be used for song length
(in seconds) calculation. Should be divided by 2 for mono sound.

After TMpl header comes sound data. For compressed files, IMA ADPCM compression
is used (see below).

=====================================
3. IMA ADPCM Decompression Algorithm
=====================================

During the decompression four LONG variables must be maintained for stereo
stream: lIndexLeft, lIndexRight, lCurSampleLeft, lCurSampleRight and two --
for mono stream: lIndex, lCurSample. At the beginning of each "1SNd" data
block and at the beginning of the file -- when processing "1SNh" block --
you must initialize these variables using the values in ASFChunkHeader.
Note that LONG here is signed.

Here's the code which decompresses one byte of IMA ADPCM compressed
stereo stream. Other bytes are processed in the same way.

BYTE Input; // current byte of compressed data
BYTE Code;
LONG Delta;

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndexLeft]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexLeft];
if (Code & 2)
   Delta+=StepTable[lIndexLeft]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexLeft]>>2;
if (Code & 8) // sign bit
   lCurSampleLeft-=Delta;
else
   lCurSampleLeft+=Delta;

// clip sample
if (lCurSampleLeft>32767)
   lCurSampleLeft=32767;
else if (lCurSampleLeft<-32768)
   lCurSampleLeft=-32768;

lIndexLeft+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexLeft<0)
   lIndexLeft=0;
else if (lIndexLeft>88)
   lIndexLeft=88;

Code=LONIBBLE(Input); // get LOWER 4-bit nibble

Delta=StepTable[lIndexRight]>>3;
if (Code & 4)
   Delta+=StepTable[lIndexRight];
if (Code & 2)
   Delta+=StepTable[lIndexRight]>>1;
if (Code & 1)
   Delta+=StepTable[lIndexRight]>>2;
if (Code & 8) // sign bit
   lCurSampleRight-=Delta;
else
   lCurSampleRight+=Delta;

// clip sample
if (lCurSampleRight>32767)
   lCurSampleRight=32767;
else if (lCurSampleRight<-32768)
   lCurSampleRight=-32768;

lIndexRight+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndexRight<0)
   lIndexRight=0;
else if (lIndexRight>88)
   lIndexRight=88;

// Now we've got lCurSampleLeft and lCurSampleRight which form one stereo
// sample and all is set for the next input byte...
Output((SHORT)lCurSampleLeft,(SHORT)lCurSampleRight); // send the sample to output

HINIBBLE and LONIBBLE are higher and lower 4-bit nibbles:
#define HINIBBLE(byte) ((byte) >> 4)
#define LONIBBLE(byte) ((byte) & 0x0F)
Note that depending on your compiler you may need to use additional nibble
separation in these defines, e.g. (((byte) >> 4) & 0x0F).

StepTable and IndexAdjust are the tables given in the next section of
this document.

Output() is just a placeholder for any action you would like to perform for
decompressed sample value.

Of course, this decompression routine may be greatly optimized.

As to mono sound, it's just analogous:

Code=HINIBBLE(Input); // get HIGHER 4-bit nibble

Delta=StepTable[lIndex]>>3;
if (Code & 4)
   Delta+=StepTable[lIndex];
if (Code & 2)
   Delta+=StepTable[lIndex]>>1;
if (Code & 1)
   Delta+=StepTable[lIndex]>>2;
if (Code & 8) // sign bit
   lCurSample-=Delta;
else
   lCurSample+=Delta;

// clip sample
if (lCurSample>32767)
   lCurSample=32767;
else if (lCurSample<-32768)
   lCurSample=-32768;

lIndex+=IndexAdjust[Code]; // adjust index

// clip index
if (lIndex<0)
   lIndex=0;
else if (lIndex>88)
   lIndex=88;

Output((SHORT)lCurSample); // send the sample to output

Code=LONIBBLE(Input); // get LOWER 4-bit nibble
// ...just the same as above for lower nibble

Note that HIGHER nibble is processed first for mono sound and corresponds to
LEFT channel for stereo.

====================
4. IMA ADPCM Tables
====================

LONG IndexAdjust[]=
{
    -1,
    -1,
    -1,
    -1,
     2,
     4,
     6,
     8,
    -1,
    -1,
    -1,
    -1,
     2,
     4,
     6,
     8
};

LONG StepTable[]=
{
    7,	   8,	  9,	 10,	11,    12,     13,    14,    16,
    17,    19,	  21,	 23,	25,    28,     31,    34,    37,
    41,    45,	  50,	 55,	60,    66,     73,    80,    88,
    97,    107,   118,	 130,	143,   157,    173,   190,   209,
    230,   253,   279,	 307,	337,   371,    408,   449,   494,
    544,   598,   658,	 724,	796,   876,    963,   1060,  1166,
    1282,  1411,  1552,  1707,	1878,  2066,   2272,  2499,  2749,
    3024,  3327,  3660,  4026,	4428,  4871,   5358,  5894,  6484,
    7132,  7845,  8630,  9493,	10442, 11487,  12635, 13899, 15289,
    16818, 18500, 20350, 22385, 24623, 27086,  29794, 32767
};

=========================
5. TGV Movie Soundtracks
=========================

.TGV movies have a block structure analogous to that of ASF/AS4.
Video-related data is in "kVGT" and "fVGT" (or "TGVk" and "TGVf") blocks and
sound-related data is just in the same blocks as in ASF/AS4: "1SNh", "1SNd",
"1SNl", "1SNe".
So, to play TGV movie soundtrack, just walk blocks chain, skip video blocks
and process sound blocks.

==================================
6. Sound/Speech Files: .EAS, .SPH
==================================

Some sounds and all speech are usually in .EAS and .SPH files.
These files have the header which is just the same as EACSHeader structure
described above with two additions:
(bType) is always 0xFF for sound/speech files,
(dwDataStart) is the starting position of audio data relative to the beginning
of the file.
After the header, starting at (dwDataStart), comes audio data, up to the end of
the file. The data is either non-compressed or IMA ADPCM compressed depending
on the (bCompression) byte in the header. If it's IMA ADPCM compressed, there're
no initial values for samples and indices at the beginning of the audio data.
Just initialize them all to zeroes and start decompression at (dwDataStart).

====================================
7. Sound Effects in .BNK/.CRD Files
====================================

Most sound effects are stored in .BNK and .CRD resource files. Those .BNKs
and .CRDs may contain several sounds. They begin with some seemingly
meaningless data, but after a chunk of that data (typically starting at
position 0x228, but not necessarily) come several EACS headers describing
all sounds in the .BNK/.CRD. Each EACS header has almost the same format as
described above with some minor changes (some fields have different placement):

struct EACSHeader
{
  char	szID[4];
  DWORD dwSampleRate;
  BYTE	bBits;
  BYTE	bChannels;
  BYTE	bCompression;
  BYTE	bType;
  DWORD dwLoopStart;
  DWORD dwLoopLength;
  DWORD dwNumSamples;
  DWORD dwDataStart;
  DWORD dwUnknown;
};

and with the same two additions just as for .EAS/.SPH speech/sound:
(bType) is always 0xFF,
(dwDataStart) is the starting position of sound data relative to the beginning
of the .BNK/.CRD file containing that sound.
So, what you need to do is just search in .BNK/.CRD for "EACS" ID string and
read EACSHeader from the position where you found "EACS". And the same for
all sounds contained within .BNK/.CRD.
The sound data itself (for each EACS header describing it) starts at
(dwDataStart) and its size may be computed using (dwNumSamples) EACSHeader
field (for example) with the following formula:
Size=dwNumSamples*SampleSize/CompressionRatio,
where:
CompressionRatio=1 for non-compressed sounds,
		 2 for 8-bit IMA ADPCM compressed sounds,
		 4 for 16-bit IMA ADPCM compressed sounds,
SampleSize=bChannels*bBits (1 for mono 8-bit, 2 for mono 16-bit, etc.).
So, starting at (dwDataStart) comes just either PCM audio data (as described
above for .EAS/.SPH files) or IMA ADPCM compressed data (without initial
sample/index values, just as in .EAS/.SPH). Set CurSample(Left/Right) and
Index(Left/Right) to zeroes and start the decompression.
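
The size formula above, written out as a C helper. The name is hypothetical; it assumes bBits is the bytes-per-sample value from the EACS header (1 or 2) and that bCompression 0x02 is the only compressed case.

```c
#include <stdint.h>

/* Size in bytes of the sound data described by one EACS header:
   Size = dwNumSamples * SampleSize / CompressionRatio
   SampleSize = bChannels * bBits
   CompressionRatio = 1 (no compression), 2 (8-bit IMA ADPCM),
                      4 (16-bit IMA ADPCM). */
uint32_t eacs_data_size(uint32_t dwNumSamples, uint8_t bChannels,
                        uint8_t bBits, uint8_t bCompression)
{
    uint32_t sample_size = (uint32_t)bChannels * bBits;
    uint32_t ratio = 1;

    if (bCompression == 0x02)
        ratio = (bBits == 2) ? 4 : 2;

    /* 64-bit intermediate so large sample counts do not overflow */
    return (uint32_t)((uint64_t)dwNumSamples * sample_size / ratio);
}
```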

Post by KorNet »

Westwood formats : AUD, PAK, TLK, VQA

=============
1. AUD Files
=============

Malcolm's AUD files have the same format as C&C's AUDs (which is described
in AUD3.TXT) with only one exception: there's no OutSize field in their header.
So it looks like the following:

struct AUDHeaderOld
{
WORD wSampleRate;
DWORD dwSize;
BYTE bFlags;
BYTE bType;
};

bType is equal to 0x01 for WS ADPCM compressed AUDs.
All WS ADPCM compressed sounds I've ever encountered are 8-bit.

The meanings of the other fields in AUD header are the same as for C&C AUDs.
These AUDs are divided in chunks with the chunk header being the same as for
C&C, but those chunks have variable size (may be NOT 512 bytes) unlike C&C
AUDs!

Note that WS ADPCM compressed AUDs in C&C (death screams) have just the same
format as other AUDs in this game, i.e. with OutSize field.

====================================
2. WS ADPCM Decompression Algorithm
====================================

Each AUD chunk may be decompressed independently of the others. This lets you
implement seeking for WS ADPCM AUDs (unlike IMA ADPCM ones).
While decompressing a given chunk, however, a variable (CurSample) must be
maintained across the whole chunk:

SHORT CurSample;
BYTE InputBuffer[InputBufferSize]; // input buffer containing the whole chunk
WORD wSize, wOutSize; // Size and OutSize values from this chunk's header
BYTE code;
CHAR count; // this is a signed char!
WORD i; // index into InputBuffer
WORD input; // shifted input

if (wSize==wOutSize) // such chunks are NOT compressed
{
  for (i=0;i<wOutSize;i++)
    Output(InputBuffer[i]); // send to output stream
  return; // chunk is done!
}

// otherwise we need to decompress chunk

CurSample=0x80; // unsigned 8-bit
i=0;

// note that wOutSize value is crucial for decompression!

while (wOutSize>0) // until wOutSize is exhausted!
{
  input=InputBuffer[i++];
  input<<=2;
  code=HIBYTE(input);      // upper 2 bits of the byte: command code
  count=LOBYTE(input)>>2;  // lower 6 bits of the byte: count
  switch (code) // parse code
  {
    case 2: // no compression...
      if (count & 0x20)
      {
        count<<=3; // here it's significant that (count) is signed:
        CurSample+=count>>3; // the sign bit will be copied by these shifts!

        Output((BYTE)CurSample);

        wOutSize--; // one byte added to output
      }
      else // copy (count+1) bytes from input to output
      {
        for (count++;count>0;count--,wOutSize--,i++)
          Output(InputBuffer[i]);
        CurSample=InputBuffer[i-1]; // set (CurSample) to the last byte sent to output
      }
      break;
    case 1: // ADPCM 8-bit -> 4-bit
      for (count++;count>0;count--) // decode (count+1) bytes
      {
        code=InputBuffer[i++];

        CurSample+=WSTable4bit[(code & 0x0F)]; // lower nibble

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        CurSample+=WSTable4bit[(code >> 4)]; // higher nibble

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        wOutSize-=2; // two bytes added to output
      }
      break;
    case 0: // ADPCM 8-bit -> 2-bit
      for (count++;count>0;count--) // decode (count+1) bytes
      {
        code=InputBuffer[i++];

        CurSample+=WSTable2bit[(code & 0x03)]; // lower 2 bits

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        CurSample+=WSTable2bit[((code>>2) & 0x03)]; // lower middle 2 bits

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        CurSample+=WSTable2bit[((code>>4) & 0x03)]; // higher middle 2 bits

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        CurSample+=WSTable2bit[((code>>6) & 0x03)]; // higher 2 bits

        CurSample=Clip8BitSample(CurSample);
        Output((BYTE)CurSample);

        wOutSize-=4; // 4 bytes sent to output
      }
      break;
    default: // just copy (CurSample) (count+1) times to output
      for (count++;count>0;count--,wOutSize--)
        Output((BYTE)CurSample);
  }
}

HIBYTE and LOBYTE are just the higher and lower bytes of a WORD:
#define HIBYTE(word) ((word) >> 8)
#define LOBYTE(word) ((word) & 0xFF)
Note that depending on your compiler you may need additional masking in
these defines, e.g. (((word) >> 8) & 0xFF). The same holds for the 4-bit and
2-bit field extraction in the code above.

WSTable4bit and WSTable2bit are the delta tables given in the next section.

Output() is just a placeholder for any action you would like to perform for
decompressed sample value.

Clip8BitSample is quite evident:

SHORT Clip8BitSample(SHORT sample)
{
  if (sample>255)
    return 255;
  else if (sample<0)
    return 0;
  else
    return sample;
}

This algorithm is ONLY for mono 8-bit unsigned sound, as I've never seen any
other sound format used with WS ADPCM compression.

Of course, the decompression routine described above may be greatly
optimized.

===================
3. WS ADPCM Tables
===================

CHAR WSTable2bit[]=
{
  -2, -1, 0, 1
};

CHAR WSTable4bit[]=
{
  -9, -8, -6, -5, -4, -3, -2, -1,
   0,  1,  2,  3,  4,  5,  6,  8
};
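
For reference, here is a minimal buffer-to-buffer sketch of the same
decompression in standard C. It is not the original routine: the function
name, the fixed-width types and the bounds checks are additions made for
illustration. It follows the algorithm described above, including the fact
that the 5-bit delta in code 2 is not clipped.

```c
#include <stddef.h>
#include <stdint.h>

static const int8_t ws_table_2bit[4] = { -2, -1, 0, 1 };
static const int8_t ws_table_4bit[16] = {
    -9, -8, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 8
};

static int clip8(int s)
{
    return s < 0 ? 0 : (s > 255 ? 255 : s);
}

/* Decodes one chunk into out[]; returns the number of bytes written.
   The result is shorter than out_size if the input runs out early. */
size_t ws_adpcm_decode(const uint8_t *in, size_t in_size,
                       uint8_t *out, size_t out_size)
{
    size_t ip = 0, op = 0;
    int cur = 0x80;                        /* unsigned 8-bit midpoint */

    if (in_size == out_size) {             /* stored chunk: plain copy */
        for (; op < out_size; op++)
            out[op] = in[op];
        return op;
    }

    while (op < out_size && ip < in_size) {
        unsigned cmd = in[ip++];
        unsigned code = cmd >> 6;          /* upper 2 bits */
        unsigned n = (cmd & 0x3F) + 1;     /* count+1 for the loop cases */
        unsigned k;

        switch (code) {
        case 2:
            if (cmd & 0x20) {              /* one signed 5-bit delta */
                int delta = (int)(cmd & 0x1F);
                if (delta & 0x10)
                    delta -= 0x20;         /* sign-extend */
                cur += delta;              /* not clipped, as in the text */
                out[op++] = (uint8_t)cur;
            } else {                       /* n raw bytes */
                for (; n > 0 && ip < in_size && op < out_size; n--) {
                    cur = in[ip++];
                    out[op++] = (uint8_t)cur;
                }
            }
            break;
        case 1:                            /* n bytes, two 4-bit codes each */
            for (; n > 0 && ip < in_size; n--) {
                unsigned b = in[ip++];
                for (k = 0; k < 2 && op < out_size; k++) {
                    cur = clip8(cur + ws_table_4bit[(b >> (4 * k)) & 0x0F]);
                    out[op++] = (uint8_t)cur;
                }
            }
            break;
        case 0:                            /* n bytes, four 2-bit codes each */
            for (; n > 0 && ip < in_size; n--) {
                unsigned b = in[ip++];
                for (k = 0; k < 4 && op < out_size; k++) {
                    cur = clip8(cur + ws_table_2bit[(b >> (2 * k)) & 0x03]);
                    out[op++] = (uint8_t)cur;
                }
            }
            break;
        default:                           /* code 3: repeat current sample */
            for (; n > 0 && op < out_size; n--)
                out[op++] = (uint8_t)cur;
            break;
        }
    }
    return op;
}
```

Pass the chunk's Size and OutSize as in_size and out_size; a return value
smaller than out_size suggests a truncated or damaged chunk.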

=====================================================
4. AUDs in Legend Of Kyrandia III: Malcolm's Revenge
=====================================================

The WS ADPCM compression described above is used for all audio in this game:

Music is stand-alone .AUD files.
Speech is AUDs in .TLK resource files.
Sounds are AUDs in .PAK resource files.

These .TLKs and .PAKs do not use any compression or encryption for AUDs, so
AUDs are stored "as is" in them. If you want to extract/play an AUD from
PAK or TLK you just need to search the PAK or TLK for the AUD id, that is,
DWORD value equal to 0x0000DEAF (or, in other words, string "\xAF\xDE\0\0").
Refer to Vladan Bato's AUD3.TXT for more details on AUD file structure.

=========================
5. VQA Movie Soundtracks
=========================

The soundtrack of a VQA movie in Malcolm, C&C, Red Alert and C&C: Tiberian Sun
is stored in SND0, SND1 or SND2 blocks. Refer to VQA_FRMT.TXT by Aaron Glover
for details on the structure of VQA files. Here I only describe the contents
of the VQA sound blocks and the VQHD (header) block.

VQHD block contains header for VQA. To the best of my knowledge, it has the
following format:

struct VQAHeader
{
  WORD wVersion;
  WORD unknown1;
  WORD wNumFrames;
  WORD wWidth;
  WORD wHeight;
  WORD unknown2;
  WORD unknown3;
  WORD unknown4;
  WORD unknown5;
  DWORD unknown6;
  WORD unknown7;
  WORD wSampleRate;
  BYTE bChannels;
  BYTE bResolution;
  char unknown8[14];
};

wVersion -- version of VQA: 1 -- oldest Malcolm's VQAs, 2 -- C&C, Red Alert,
3 -- C&C: Tiberian Sun.

wNumFrames -- number of frames in VQA. Note that the number of sound blocks is
(wNumFrames+1) for VQAs of version 2 (C&C, Red Alert), and (wNumFrames) for
versions 1 and 3. Dune2000 VQAs, however, also have (wNumFrames) sound blocks
even though they are version 2 VQAs.

wSampleRate -- sample rate for soundtrack. Note that version 1 (Malcolm's)
VQAs may have this value set to 0x0000! Use 22050 Hz in such cases.

bChannels -- number of channels (1 -- mono, 2 -- stereo). Note that version 1
VQAs may have this set to 0x00, so use 1 (mono) for such files.

bResolution -- resolution of soundtrack (0x10 -- 16-bit, 0x8 -- 8-bit). Note
that version 1 VQAs may have this set to 0x00, so use 0x8 for such files.
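
The fallback values and the sound block count described above could be
applied as in this small sketch. The struct and function names are
hypothetical; the is_dune2000 flag has to come from the caller, since the
header fields discussed here do not tell Dune2000 files apart.

```c
#include <stdint.h>

/* Hypothetical container for the audio-related VQHD fields. */
struct vqa_audio_params {
    uint16_t version;      /* wVersion */
    uint16_t sample_rate;  /* wSampleRate */
    uint8_t  channels;     /* bChannels */
    uint8_t  bits;         /* bResolution */
};

/* Replace zeroed audio fields of version 1 files with the fallbacks. */
void vqa_fix_audio_params(struct vqa_audio_params *p)
{
    if (p->version != 1)
        return;
    if (p->sample_rate == 0)
        p->sample_rate = 22050;
    if (p->channels == 0)
        p->channels = 1;
    if (p->bits == 0)
        p->bits = 8;
}

/* Expected number of sound blocks for a given header. */
unsigned vqa_sound_block_count(uint16_t version, uint16_t num_frames,
                               int is_dune2000)
{
    if (version == 2 && !is_dune2000)
        return (unsigned)num_frames + 1;
    return num_frames;
}
```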

All VQAs in Malcolm have their sound in either SND0 or SND1 blocks.
SND0 blocks contain non-compressed PCM data.
SND1 blocks contain small header and WS ADPCM compressed sound data.
The header is the following:

struct SND1Header
{
  WORD wOutSize;
  WORD wSize;
};

Following the header comes WS ADPCM compressed sound data.
Each SND1 sound block may be decompressed, just like a chunk of AUD file,
independently of the others and the routine described above may be used for
its decompression without any changes, provided you use wOutSize from the
SND1Header.

As to VQAs in C&C and Red Alert, their sound is in SND2 blocks and is
compressed with the IMA ADPCM algorithm described in Vladan Bato's AUD3.TXT.
An SND2 block contains just compressed data, without any headers, and the
blocks should be decompressed one after another, just like the chunks of an
IMA ADPCM compressed AUD file, as described in AUD3.TXT. This holds only for
mono soundtracks.

But there're also stereo soundtracks in C&C and C&C: Tiberian Sun. They have
different left/right channel nibbles layout.

For C&C (version 2) VQAs the layout is the following:
LL RR LL RR ...
That is, first byte contains two nibbles for two left channel values, next
byte contains nibbles for right channel, etc. Note that lower nibble should
be processed first and then higher one (see AUD3.TXT).

For C&C: Tiberian Sun (version 3) VQAs the layout is different: in an SND2
block all nibbles for the left channel come first, then all nibbles for the
right channel:
LL LL LL ... LL RR RR RR ... RR
Note that nibbles should be processed in the same order: lower nibble first.
So, when decoding an SND2 block, just decompress the first half of the block
data for the left channel, then the second half for the right channel.
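
The two stereo layouts can be illustrated with a sketch that only separates
the 4-bit codes per channel (the IMA ADPCM decoding itself is described in
AUD3.TXT and is not repeated here). The function name and its interface are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Splits a stereo SND2 block into per-channel 4-bit codes, lower nibble
   first. left[] and right[] must each hold `size` entries.
   Returns the number of codes per channel, or 0 for an odd size or a
   version other than 2 or 3. */
size_t vqa_split_stereo_nibbles(int version, const uint8_t *block,
                                size_t size, uint8_t *left, uint8_t *right)
{
    size_t half = size / 2, i;

    if (size % 2 != 0 || (version != 2 && version != 3))
        return 0;

    for (i = 0; i < half; i++) {
        /* version 2: L R L R ... bytes; version 3: all L, then all R */
        uint8_t l = (version == 2) ? block[2 * i]     : block[i];
        uint8_t r = (version == 2) ? block[2 * i + 1] : block[half + i];

        left[2 * i]      = l & 0x0F;
        left[2 * i + 1]  = l >> 4;
        right[2 * i]     = r & 0x0F;
        right[2 * i + 1] = r >> 4;
    }
    return 2 * half;
}
```

Each output array can then be fed to a mono IMA ADPCM decoder with its own
sample and index state.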

Need For Speed FSH format

These store the different bitmaps that make up the dashboard.
The 16-byte header format is the following :


offset  len  data
------  ---  ----
00      4    'SHPI'
04      4    length of the file in bytes
08      4    number of objects in the directory
0C      4    directory identifier string

The directory identifier is 'GIMX' in .FSH dashboard files.

This header is followed by the directory entries, each consisting of a 4-byte identifier string, and a 4-byte offset inside the file pointing to the beginning of the corresponding data.

Each entry in the directory represents a piece of the dashboard. There are gaps between the directory and the first bitmap, and between consecutive bitmaps (significance unknown).

Each directory entry points to a bitmap block with the following structure :


offset  len  data
------  ---  ----
00      1    7Bh
01      3    size of the block (= width x height + 10h)
04      2    width of the bitmap in pixels
06      2    height of the bitmap in pixels
08      4    ?
0C      2    x position to display the bitmap
0E      2    y position to display the bitmap
10      w*h  bitmap data : 1 byte per pixel

Note that the object called 'dash' in the directory takes the whole screen (320x200 or 640x480, at position x=0, y=0).

The various objects, depending on their 4-letter identifier, represent : the dashboard itself, the steering wheel in various positions, the radar detector lights, the gauges, and also pieces of the steering wheel to redraw over the gauges when necessary. Note that value FF in the bitmaps stands for the background : this is useful when a bitmap is drawn on top of another one.

Also note that some SHPI bitmap directories contain entries that actually describe the palette to be used with the bitmaps. Typically, entries with names like '!PAL', and with bitmap dimension 256x3, correspond to palettes. The palette data consists of 256 3-byte records, each record containing the red, green and blue components of the corresponding color (1 byte each).
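
As an illustration of the layout above, the following sketch looks up a
directory entry in an SHPI file held in memory and reads its bitmap block
header. The names are hypothetical, and multi-byte values are assumed to be
little-endian, which the description does not state explicitly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers (byte order is an assumption, see above). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct fsh_bitmap {
    uint16_t width, height, x, y;
    const uint8_t *pixels;    /* width*height bytes, 1 byte per pixel */
};

/* Finds the entry with the given 4-character name.
   Returns 0 on success, -1 if not found or the data looks invalid. */
int fsh_find_bitmap(const uint8_t *file, size_t size, const char *name,
                    struct fsh_bitmap *out)
{
    uint32_t count, i;

    if (size < 16 || memcmp(file, "SHPI", 4) != 0)
        return -1;
    count = rd32(file + 8);
    if (count > (size - 16) / 8)          /* directory must fit in file */
        return -1;

    for (i = 0; i < count; i++) {
        const uint8_t *entry = file + 16 + 8 * (size_t)i;
        uint32_t off = rd32(entry + 4);
        const uint8_t *blk;

        if (memcmp(entry, name, 4) != 0)
            continue;
        if (off > size || size - off < 16)
            return -1;
        blk = file + off;
        if (blk[0] != 0x7B)               /* not a bitmap block as above */
            return -1;
        out->width  = rd16(blk + 4);
        out->height = rd16(blk + 6);
        out->x      = rd16(blk + 12);
        out->y      = rd16(blk + 14);
        if ((size_t)out->width * out->height > size - off - 16)
            return -1;
        out->pixels = blk + 16;
        return 0;
    }
    return -1;
}
```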

Operation Flashpoint RTM format

Types:

integer - 32-bit signed integer
float32 - 32-bit floating-point number (single precision)
zstring32 - zero-terminated string in a fixed 32-byte field (padded), so max. string length = 31

File Structure:


Size  (Type)             Explanation
----------------------------------
4     (integer)          Signature (0x5f4d5452 = 'RTM_')
4     (integer)          Version (0x31303130 = '0101' = 1.1)
12    (Vertex)           Total Moving for [x,y,z] (applies to all
                         components, all frames - see below)
4     (integer)          Frames Count
4     (integer)          Components Count (named selections)
-     (zstring32 Array)  Component Names Table, element
                         count = Components Count
-     (Frame Array)      Frames Table, element count = Frames Count

Vertex:


Size (Type) Explanation
--------------------------
4 (float32) X value
4 (float32) Y value
4 (float32) Z value
Frame:


Size (Type) Explanation
----------------------------------------
4 (float32) KeyFrame Time (valid values: 0.0 .. 1.0)
32 (zstring32) Component Name
48 (Transformation Matrix) Component Transformation Matrix
Transformation Matrix:


Size (Type) Explanation
--------------------------
4 (float32) M11
4 (float32) M12
4 (float32) M13
4 (float32) M21
4 (float32) M22
4 (float32) M23
4 (float32) M31
4 (float32) M32
4 (float32) M33
4 (float32) M41
4 (float32) M42
4 (float32) M43
So final 4x4 matrix looks like:


M11 M12 M13 0.0
M21 M22 M23 0.0
M31 M32 M33 0.0
M41 M42 M43 1.0
If the Total Moving (TM) field is used:


M11 M12 M13 0.0
M21 M22 M23 0.0
M31 M32 M33 0.0
(M41+TM.X) (M42+TM.Y) (M43+TM.Z) 1.0
i.e. if you do not apply TM, the character (soldier) will run in place.

Notes:
1. Most animations have TM = (0, 0, 0).
2. The same transformation matrix is also used for objects inserted into wrp files.
3. To apply an animation to character models (soldiers), you must first center the model.

So the new vertex position will be (LC = LandContact):
x' = x - LC.x;
y' = y - LC.y;
z' = z - LC.z;
x'' = M11 * x' + M21 * y' + M31 * z' + M41 + TM.X
y'' = M12 * x' + M22 * y' + M32 * z' + M42 + TM.Y
z'' = M13 * x' + M23 * y' + M33 * z' + M43 + TM.Z
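
The formulas above translate directly into C. This is a sketch only; the
struct and function names are hypothetical, and the matrix is stored as
m[row][column] in the same order as the 12 floats in the file.

```c
/* Hypothetical types mirroring the Vertex and Transformation Matrix tables. */
struct rtm_vertex { float x, y, z; };
struct rtm_matrix { float m[4][3]; };   /* m[r][c] holds M(r+1)(c+1) */

/* Centers v on the LandContact point lc, applies the matrix and adds the
   Total Moving vector tm. */
struct rtm_vertex rtm_transform(struct rtm_vertex v, struct rtm_vertex lc,
                                const struct rtm_matrix *mat,
                                struct rtm_vertex tm)
{
    struct rtm_vertex c, r;

    c.x = v.x - lc.x;
    c.y = v.y - lc.y;
    c.z = v.z - lc.z;

    r.x = mat->m[0][0] * c.x + mat->m[1][0] * c.y + mat->m[2][0] * c.z
        + mat->m[3][0] + tm.x;
    r.y = mat->m[0][1] * c.x + mat->m[1][1] * c.y + mat->m[2][1] * c.z
        + mat->m[3][1] + tm.y;
    r.z = mat->m[0][2] * c.x + mat->m[1][2] * c.y + mat->m[2][2] * c.z
        + mat->m[3][2] + tm.z;
    return r;
}
```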

Total Annihilation GAF

I'm also a big believer in examples, so I'll be walking you through a GAF
file (Archipelago.GAF) as I explain.

The first part of the file is the header, which looks like this:

typedef struct _GAFHEADER {
  long IDVersion; /* Version stamp - always 0x00010100 */
  long Entries;   /* Number of items contained in this file */
  long Unknown1;  /* Always 0 */
} GAFHEADER;

Let's look at a sample header:
00000000 00 01 01 00 39 00 00 00 00 00 00 00

IDVersion is 0x00010100 like we expect. Entries is 0x39, indicating that
there are 57 items contained in this file.

Immediately following the header is a list of pointers, one for each entry.

The list of pointers here looks like:
0000000C 68 EA 04 00
00000010 98 EA 04 00 C8 EA 04 00 F8 EA 04 00 78 EB 04 00
00000020 A0 EB 04 00 C0 ED 04 00 40 EE 04 00 68 EE 04 00
00000030 98 EE 04 00 C8 EE 04 00 F8 EE 04 00 78 EF 04 00
00000040 A8 EF 04 00 C8 F1 04 00 48 F2 04 00 78 F2 04 00
00000050 A8 F2 04 00 D8 F2 04 00 08 F3 04 00 88 F3 04 00
00000060 A8 F5 04 00 28 F6 04 00 58 F6 04 00 88 F6 04 00
00000070 B8 F6 04 00 E8 F6 04 00 18 F7 04 00 B8 F7 04 00
00000080 E8 F7 04 00 20 FB 04 00 C0 FB 04 00 F0 FB 04 00
00000090 20 FC 04 00 50 FC 04 00 80 FC 04 00 20 FD 04 00
000000A0 68 00 05 00 08 01 05 00 38 01 05 00 68 01 05 00
000000B0 98 01 05 00 C8 01 05 00 68 02 05 00 98 02 05 00
000000C0 F0 05 05 00 90 06 05 00 C0 06 05 00 F0 06 05 00
000000D0 68 07 05 00 28 08 05 00 58 08 05 00 88 08 05 00
000000E0 B8 08 05 00 E8 08 05 00 18 09 05 00 48 09 05 00

The next byte after the pointer list is at offset F0. Remember this.
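
Reading the header and the pointer list from a file held in memory might
look like the sketch below. The function names are hypothetical; values are
read as little-endian, as the sample dump shows (39 00 00 00 = 0x39).

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct gaf_header {
    uint32_t id_version;   /* IDVersion */
    uint32_t entries;      /* Entries */
    uint32_t unknown1;     /* Unknown1 */
};

/* Returns 0 on success, -1 if the buffer is too small or the version
   stamp is not 0x00010100. */
int gaf_parse_header(const uint8_t *buf, size_t size, struct gaf_header *h)
{
    if (size < 12)
        return -1;
    h->id_version = rd_le32(buf);
    h->entries    = rd_le32(buf + 4);
    h->unknown1   = rd_le32(buf + 8);
    return h->id_version == 0x00010100u ? 0 : -1;
}

/* File offset of the first byte after the pointer list. */
size_t gaf_pointer_list_end(const struct gaf_header *h)
{
    return 12 + 4 * (size_t)h->entries;
}

/* Fetches the pointer for entry `index`; returns 0 on success. */
int gaf_entry_offset(const uint8_t *buf, size_t size,
                     const struct gaf_header *h, uint32_t index,
                     uint32_t *offset)
{
    size_t pos = 12 + 4 * (size_t)index;

    if (index >= h->entries || size < 4 || pos > size - 4)
        return -1;
    *offset = rd_le32(buf + pos);
    return 0;
}
```

With the sample header, 12 + 57*4 = 240 = 0xF0, which matches the offset of
the first byte after the pointer list.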

Each pointer points to a structure that looks like this:

typedef struct _GAFENTRY {
  short Frames;   /* Number of frames in this entry */
  short Unknown1; /* Unknown - always 1 */
  long Unknown2;  /* Unknown - always 0 */
  char Name[32];  /* Name of the entry */
} GAFENTRY;

The first pointer directs us to location 0x04EA68. Going
there, we find:

0004EA68 01 00 01 00 00 00 00 00 ........
0004EA70 46 72 6F 6E 64 30 31 00 00 00 00 00 00 00 00 00 Frond01.........
0004EA80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

This entry has 1 frame, and is called 'Frond01'.

Following each entry is another list of structures, one for each frame.

typedef struct _GAFFRAMEENTRY {
  long PtrFrameTable; /* Pointer to frame data */
  long Unknown1;      /* Unknown - varies */
} GAFFRAMEENTRY;

The frame entry looks like this:

0004EA90 78 D5 03 00 02 00 00 00

PtrFrameTable is 0x03D578. This points us to a structure containing data
about the first frame in this entry. It looks like this:

typedef struct _GAFFRAMEDATA {
  short Width;         /* Width of the frame */
  short Height;        /* Height of the frame */
  short XPos;          /* X offset */
  short YPos;          /* Y offset */
  char Unknown1;       /* Unknown - always 9 */
  char Compressed;     /* Compression flag */
  short FramePointers; /* Count of subframes */
  long Unknown2;       /* Unknown - always 0 */
  long PtrFrameData;   /* Pointer to pixels or subframes */
  long Unknown3;       /* Unknown - value varies */
} GAFFRAMEDATA;

Here's the data:

0003D578 31 00 1F 00 15 00 0F 00
0003D580 09 01 00 00 00 00 00 00 F0 00 00 00 00 00 00 00

Width and height are (duh) the width and height of the frame. This frame is
0x31 by 0x1F pixels (49 x 31).

XPos and YPos are actually offsets that give the displacement of the frame
from the entry's actual position on the map. So if the entry itself was
placed at position 100,100, the frame would be at position
100-XPos,100-YPos. These offsets can be negative. Here, they place the frame
21 pixels to the left of, and 15 pixels above, the initial placement of the
item on the map.

Unknown1 is always 9. No idea what it really means.

Compressed is the compression flag. If it's 0, the image is not compressed.
This image is compressed. More on this in a bit.

FramePointers. This is where it gets a little weird. If FramePointers is
0, then PtrFrameData points to pixel data. If it isn't, then PtrFrameData
points to a list of that many more pointers to GAFFRAMEDATA structures.
Together, these subframes are treated as one frame. More on this
in a bit.

Unknown2 is always 0.

PtrFrameData points to the pixel data or to more GAFFRAMEDATA structures,
depending on the value of FramePointers. Here it's pointing to offset 0xF0.
If you remember, this is the first byte after the list of entry pointers way
back at the start of the file.

Unknown3 is a mystery. Sometimes the value is 0. Sometimes it isn't. No
idea what it means or how to calculate it.
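
The 24-byte frame structure can be read field by field as in the following
sketch. The names are hypothetical and values are taken as little-endian,
as in the dump above.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t gaf_rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t gaf_rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct gaf_frame_data {
    int16_t  width, height, xpos, ypos;
    uint8_t  unknown1, compressed;
    uint16_t frame_pointers;
    uint32_t unknown2, ptr_frame_data, unknown3;
};

/* Parses one GAFFRAMEDATA record; returns 0 on success, -1 if fewer
   than 24 bytes are available. */
int gaf_parse_frame_data(const uint8_t *p, size_t avail,
                         struct gaf_frame_data *f)
{
    if (avail < 24)
        return -1;
    f->width          = (int16_t)gaf_rd16(p);
    f->height         = (int16_t)gaf_rd16(p + 2);
    f->xpos           = (int16_t)gaf_rd16(p + 4);   /* may be negative */
    f->ypos           = (int16_t)gaf_rd16(p + 6);
    f->unknown1       = p[8];
    f->compressed     = p[9];
    f->frame_pointers = gaf_rd16(p + 10);
    f->unknown2       = gaf_rd32(p + 12);
    f->ptr_frame_data = gaf_rd32(p + 16);
    f->unknown3       = gaf_rd32(p + 20);
    return 0;
}
```

If frame_pointers is 0, ptr_frame_data is the offset of the pixel data;
otherwise it is the offset of a list of frame_pointers further pointers.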

Ok. Now we have this frame entry. Since FramePointers is 0, PtrFrameData
points to pixel data. If the frame were not compressed, it'd just be the
raw pixels, 31 chunks of 49 bytes each, corresponding to each line. This
frame is compressed, so things are a little different.

Let's look at the data:

000000F0 07 00 29 00 44 17 00 45 21 0E 00 1B 00 44 17 04
00000100 44 B3 07 00 45 09 00 44 1B 10 00 1B 00 A3 0D 00
00000110 44 07 00 A2 03 00 A2 0F 00 44 1D 16 00 13 00 45
00000120 07 00 44 03 00 45 13 00 44 03 00 45 09 00 45 03
00000130 00 A3 1D 1F 00 15 00 45 07 04 44 A3 07 00 45 03
00000140 00 44 03 00 44 03 00 A3 03 00 45 03 00 45 03 08
00000150 44 B3 A2 1F 1D 00 09 04 45 44 0B 04 44 45 07 08
00000160 44 A3 44 03 00 A2 03 04 A2 B3 05 00 45 05 08 44
00000170 A3 B3 23 25 00 0D 00 45 03 04 B3 44 05 00 45 07
00000180 08 A3 44 B3 03 04 44 5B 05 06 45 05 00 44 03 04
00000190 A2 45 09 04 44 45 07 00 44 0F 2B 00 0F 00 A2 03
000001A0 00 45 03 00 45 03 00 44 03 00 45 03 04 B4 35 03
000001B0 00 B3 05 08 45 B3 44 03 04 A2 46 03 08 B4 44 A2
000001C0 05 00 44 0D 00 44 0D 2B 00 09 00 45 05 34 45 44
000001D0 A3 45 A2 44 B4 45 45 B3 36 45 45 35 03 0C A2 46
000001E0 A3 45 03 04 45 46 03 00 A2 03 04 B3 45 03 00 45
000001F0 0B 00 44 0F 28 00 0B 04 45 A2 05 38 B2 A3 B4 A2
00000200 A2 B4 45 A2 45 45 A3 46 44 46 44 03 14 A2 46 46
00000210 44 B3 37 03 00 35 03 04 A2 45 05 06 45 13 2B 00
00000220 04 A3 44 03 00 44 07 34 B4 A2 44 45 46 45 46 A2
00000230 A2 46 A2 B4 46 A2 03 04 B3 44 03 24 B2 A2 44 A2
00000240 36 A2 45 B4 A2 46 03 04 46 44 17 2A 00 09 08 44
00000250 A3 44 05 08 A3 45 36 03 60 A2 45 B2 45 45 A2 A2
00000260 B4 44 A3 35 45 A3 44 B4 A2 46 B2 45 A2 A2 B2 46
00000270 44 A3 07 04 A2 45 0D 29 00 0B 10 44 45 45 B3 45
00000280 03 60 46 44 45 46 A2 45 B3 45 B2 35 A3 44 B3 A3
00000290 B4 A1 46 45 45 A2 46 A2 B4 45 45 03 0C 46 44 A2
000002A0 45 11 2F 00 00 44 03 0C 44 A3 45 B3 03 04 45 35
000002B0 03 00 44 03 0C 45 B4 45 B4 03 58 46 A2 45 B3 44
000002C0 A3 44 45 A2 46 44 45 45 46 44 44 45 A2 44 45 B4
000002D0 A2 46 13 28 00 00 A3 05 00 45 07 00 46 03 00 45
000002E0 03 04 45 A3 0E 45 48 A2 46 45 A2 46 B4 A2 45 46
000002F0 B3 46 45 46 46 45 A2 A2 B4 A3 0A 45 15 31 00 0B
00000300 20 A2 45 45 35 B3 44 46 45 45 03 06 45 00 46 03
00000310 5C A2 46 44 A2 45 B3 45 45 46 36 B3 A1 A3 44 B2
00000320 A2 45 B4 A2 A2 B3 45 44 45 03 04 A3 44 05 00 44
00000330 2F 00 07 00 45 03 04 45 A3 03 14 B3 36 A3 B2 45
00000340 45 03 70 45 A3 B3 36 45 46 B3 A2 44 A3 45 A3 44
00000350 46 35 B4 45 B3 35 46 A2 36 45 B4 36 A2 A3 B2 45
00000360 0B 2C 00 09 00 45 07 0C A3 45 45 36 03 06 45 20
00000370 B4 45 A2 5B 46 36 45 B4 45 03 24 46 44 46 B3 A2
00000380 45 45 A3 A1 B4 05 0C 46 A2 B4 5A 07 00 45 09 2A
00000390 00 07 00 45 03 04 44 45 03 3C 45 B3 A2 45 45 38
000003A0 A2 45 44 45 45 46 45 45 A2 45 03 28 37 B4 45 A2
000003B0 46 44 46 45 46 45 45 07 00 45 13 26 00 09 28 A3
000003C0 45 B3 A2 46 45 A2 B2 A3 44 38 03 04 B3 45 03 00
000003D0 45 03 00 45 09 18 45 46 A1 36 B4 A2 46 05 08 B3
000003E0 45 45 17 27 00 0F 00 46 0A 45 08 A3 45 46 03 06
000003F0 5B 00 A3 03 0C 5B 45 45 37 05 04 45 A2 03 04 45
00000400 B4 03 04 45 A2 07 00 45 05 00 45 13 24 00 09 06
00000410 45 14 46 44 A3 44 B3 45 03 00 38 05 00 5A 05 04
00000420 44 A3 07 14 46 45 B2 35 45 A2 05 04 44 45 0D 06
00000430 45 0F 23 00 07 00 45 09 04 B3 37 03 00 45 05 00
00000440 36 03 00 44 03 04 45 B3 05 20 45 44 A2 46 44 A3
00000450 B3 44 45 0B 00 44 19 1B 00 0D 0C 45 A2 5A 37 0B
00000460 04 44 A3 03 04 46 A3 03 08 35 B3 45 07 04 B3 45
00000470 0F 00 A3 19 1C 00 0B 04 44 A2 07 00 46 07 08 44
00000480 46 44 07 06 45 03 00 45 03 00 45 03 00 35 03 00
00000490 A2 27 15 00 09 00 45 11 08 44 45 A3 09 00 44 05
000004A0 00 A2 0B 00 46 0F 00 44 17 0B 00 1B 00 A3 11 04
000004B0 B3 44 0D 00 44 25 08 00 19 04 B3 44 0B 00 44 3B
000004C0 07 00 19 00 36 0D 00 45 3B 07 00 27 00 A2 05 00
000004D0 A2 35 04 00 2B 00 44 37

PtrFrameData points to a short integer that is a count of bytes for the
first line. Skip ahead that many bytes, and you get to a count for the
second line, etc, etc. The first line is 7 bytes long, and consists of 29
00 44 17 00 45 21. The Height parameter tells you how many lines there are,
in this case, 31. Broken into lines, minus the length data, we get:

Line 0 29 00 44 17 00 45 21
Line 1 1B 00 44 17 04 44 B3 07 00 45 09 00 44 1B
Line 2 1B 00 A3 0D 00 44 07 00 A2 03 00 A2 0F 00 44 1D
Line 3 13 00 45 07 00 44 03 00 45 13 00 44 03 00 45 09 00 45 03 00 A3 1D
Line 4 15 00 45 07 04 44 A3 07 00 45 03 00 44 03 00 44 03 00 A3 03 00 45 03 00 45 03 08 44 B3 A2 1F
Line 5 09 04 45 44 0B 04 44 45 07 08 44 A3 44 03 00 A2 03 04 A2 B3 05 00 45 05 08 44 A3 B3 23
Line 6 0D 00 45 03 04 B3 44 05 00 45 07 08 A3 44 B3 03 04 44 5B 05 06 45 05 00 44 03 04 A2 45 09 04 44 45 07 00 44 0F
Line 7 0F 00 A2 03 00 45 03 00 45 03 00 44 03 00 45 03 04 B4 35 03 00 B3 05 08 45 B3 44 03 04 A2 46 03 08 B4 44 A2 05 00 44 0D 00 44 0D
Line 8 09 00 45 05 34 45 44 A3 45 A2 44 B4 45 45 B3 36 45 45 35 03 0C A2 46 A3 45 03 04 45 46 03 00 A2 03 04 B3 45 03 00 45 0B 00 44 0F
Line 9 0B 04 45 A2 05 38 B2 A3 B4 A2 A2 B4 45 A2 45 45 A3 46 44 46 44 03 14 A2 46 46 44 B3 37 03 00 35 03 04 A2 45 05 06 45 13
Line 10 04 A3 44 03 00 44 07 34 B4 A2 44 45 46 45 46 A2 A2 46 A2 B4 46 A2 03 04 B3 44 03 24 B2 A2 44 A2 36 A2 45 B4 A2 46 03 04 46 44 17
Line 11 09 08 44 A3 44 05 08 A3 45 36 03 60 A2 45 B2 45 45 A2 A2 B4 44 A3 35 45 A3 44 B4 A2 46 B2 45 A2 A2 B2 46 44 A3 07 04 A2 45 0D
Line 12 0B 10 44 45 45 B3 45 03 60 46 44 45 46 A2 45 B3 45 B2 35 A3 44 B3 A3 B4 A1 46 45 45 A2 46 A2 B4 45 45 03 0C 46 44 A2 45 11
Line 13 00 44 03 0C 44 A3 45 B3 03 04 45 35 03 00 44 03 0C 45 B4 45 B4 03 58 46 A2 45 B3 44 A3 44 45 A2 46 44 45 45 46 44 44 45 A2 44 45 B4 A2 46 13
Line 14 00 A3 05 00 45 07 00 46 03 00 45 03 04 45 A3 0E 45 48 A2 46 45 A2 46 B4 A2 45 46 B3 46 45 46 46 45 A2 A2 B4 A3 0A 45 15
Line 15 0B 20 A2 45 45 35 B3 44 46 45 45 03 06 45 00 46 03 5C A2 46 44 A2 45 B3 45 45 46 36 B3 A1 A3 44 B2 A2 45 B4 A2 A2 B3 45 44 45 03 04 A3 44 05 00 44
Line 16 07 00 45 03 04 45 A3 03 14 B3 36 A3 B2 45 45 03 70 45 A3 B3 36 45 46 B3 A2 44 A3 45 A3 44 46 35 B4 45 B3 35 46 A2 36 45 B4 36 A2 A3 B2 45 0B
Line 17 09 00 45 07 0C A3 45 45 36 03 06 45 20 B4 45 A2 5B 46 36 45 B4 45 03 24 46 44 46 B3 A2 45 45 A3 A1 B4 05 0C 46 A2 B4 5A 07 00 45 09
Line 18 07 00 45 03 04 44 45 03 3C 45 B3 A2 45 45 38 A2 45 44 45 45 46 45 45 A2 45 03 28 37 B4 45 A2 46 44 46 45 46 45 45 07 00 45 13
Line 19 09 28 A3 45 B3 A2 46 45 A2 B2 A3 44 38 03 04 B3 45 03 00 45 03 00 45 09 18 45 46 A1 36 B4 A2 46 05 08 B3 45 45 17
Line 20 0F 00 46 0A 45 08 A3 45 46 03 06 5B 00 A3 03 0C 5B 45 45 37 05 04 45 A2 03 04 45 B4 03 04 45 A2 07 00 45 05 00 45 13
Line 21 09 06 45 14 46 44 A3 44 B3 45 03 00 38 05 00 5A 05 04 44 A3 07 14 46 45 B2 35 45 A2 05 04 44 45 0D 06 45 0F
Line 22 07 00 45 09 04 B3 37 03 00 45 05 00 36 03 00 44 03 04 45 B3 05 20 45 44 A2 46 44 A3 B3 44 45 0B 00 44 19
Line 23 0D 0C 45 A2 5A 37 0B 04 44 A3 03 04 46 A3 03 08 35 B3 45 07 04 B3 45 0F 00 A3 19
Line 24 0B 04 44 A2 07 00 46 07 08 44 46 44 07 06 45 03 00 45 03 00 45 03 00 35 03 00 A2 27
Line 25 09 00 45 11 08 44 45 A3 09 00 44 05 00 A2 0B 00 46 0F 00 44 17
Line 26 1B 00 A3 11 04 B3 44 0D 00 44 25
Line 27 19 04 B3 44 0B 00 44 3B
Line 28 19 00 36 0D 00 45 3B
Line 29 27 00 A2 05 00 A2 35
Line 30 2B 00 44 37

To decode a line, do the following:

1. Read a byte. This is a mask.
2. If (mask & 0x01) = 0x01
     skip ahead (mask >> 1) pixels. This is transparency, allowing whatever
     was under the frame to show through.
   else if (mask & 0x02) = 0x02
     copy the next byte (mask >> 2)+1 times to output.
   else
     copy the next (mask >> 2)+1 bytes to output.
3. Go back to 1, until there are no more bytes left in the line.

A C code fragment to do this is:

unsigned char *data; // points to pixel data
int x, y, count, bytes, repeat;
unsigned char mask;

for (y = 0; y < FrameData.Height; y++) {
  bytes = *((unsigned short *) data); // length of this line in bytes
  data += sizeof(short);
  count = 0;
  x = 0;
  while (count < bytes) {
    mask = data[count++];
    if ((mask & 0x01) == 0x01) {
      // transparent
      x += (mask >> 1);
    }
    else if ((mask & 0x02) == 0x02) {
      // repeat next byte
      repeat = (mask >> 2) + 1;
      while (repeat--)
        putpixel(x++, y, data[count]);
      count++;
    }
    else {
      // copy next bytes literally
      repeat = (mask >> 2) + 1;
      while (repeat--)
        putpixel(x++, y, data[count++]);
    }
  }
  data += bytes; // point to next line
}
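
A variant of the same decoding that writes into a pixel buffer and checks
bounds is sketched below. The function names are hypothetical. Transparent
pixels are simply left untouched, so the caller should pre-fill the buffer
with whatever should show through.

```c
#include <stddef.h>
#include <stdint.h>

/* Decodes one compressed line (without its 2-byte length prefix) into
   row[0..width-1]. Returns the number of pixels covered, or -1 if the
   line data ends in the middle of a run. */
int gaf_decode_line(const uint8_t *line, size_t bytes,
                    uint8_t *row, int width)
{
    size_t pos = 0;
    int x = 0, n;

    while (pos < bytes) {
        unsigned mask = line[pos++];

        if (mask & 0x01) {                 /* transparent run */
            x += (int)(mask >> 1);
        } else if (mask & 0x02) {          /* repeat the next byte */
            if (pos >= bytes)
                return -1;
            for (n = (int)(mask >> 2) + 1; n > 0; n--, x++)
                if (x < width)
                    row[x] = line[pos];
            pos++;
        } else {                           /* literal bytes */
            for (n = (int)(mask >> 2) + 1; n > 0; n--, x++) {
                if (pos >= bytes)
                    return -1;
                if (x < width)
                    row[x] = line[pos];
                pos++;
            }
        }
    }
    return x;
}

/* Decodes `height` length-prefixed lines into pixels[height*width].
   Returns 0 on success, -1 on truncated or malformed data. */
int gaf_decode_frame(const uint8_t *data, size_t size,
                     uint8_t *pixels, int width, int height)
{
    size_t pos = 0;
    int y;

    for (y = 0; y < height; y++) {
        size_t bytes;

        if (size - pos < 2)
            return -1;
        bytes = (size_t)data[pos] | ((size_t)data[pos + 1] << 8);
        pos += 2;
        if (size - pos < bytes)
            return -1;
        if (gaf_decode_line(data + pos, bytes,
                            pixels + (size_t)y * width, width) < 0)
            return -1;
        pos += bytes;
    }
    return 0;
}
```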

We do this to the above mess of data, and we get:

Line 0 44 45 *
Line 1 44 44 B3 45 44 *
Line 2 A3 44 A2 A2 44 *
Line 3 45 44 45 44 45 45 A3 *
Line 4 45 44 A3 45 44 44 A3 45 45 44 B3 A2 *
Line 5 45 44 44 45 44 A3 44 A2 A2 B3 45 44 A3 B3 *
Line 6 45 B3 44 45 A3 44 B3 44 5B 45 45 44 A2 45 44 45 44 *
Line 7 A2 45 45 44 45 B4 35 B3 45 B3 44 A2 46 B4 44 A2 44 44 *
Line 8 45 45 44 A3 45 A2 44 B4 45 45 B3 36 45 45 35 A2 46 A3 45 45 46 A2 B3 45 45 44 *
Line 9 45 A2 B2 A3 B4 A2 A2 B4 45 A2 45 45 A3 46 44 46 44 A2 46 46 44 B3 37 35 A2 45 45 45 *
Line 10 A3 44 44 B4 A2 44 45 46 45 46 A2 A2 46 A2 B4 46 A2 B3 44 B2 A2 44 A2 36 A2 45 B4 A2 46 46 44 *
Line 11 44 A3 44 A3 45 36 A2 45 B2 45 45 A2 A2 B4 44 A3 35 45 A3 44 B4 A2 46 B2 45 A2 A2 B2 46 44 A3 A2 45 *
Line 12 44 45 45 B3 45 46 44 45 46 A2 45 B3 45 B2 35 A3 44 B3 A3 B4 A1 46 45 45 A2 46 A2 B4 45 45 46 44 A2 45 *
Line 13 44 44 A3 45 B3 45 35 44 45 B4 45 B4 46 A2 45 B3 44 A3 44 45 A2 46 44 45 45 46 44 44 45 A2 44 45 B4 A2 46 *
Line 14 A3 45 46 45 45 A3 45 45 45 45 A2 46 45 A2 46 B4 A2 45 46 B3 46 45 46 46 45 A2 A2 B4 A3 45 45 45 *
Line 15 A2 45 45 35 B3 44 46 45 45 45 45 46 A2 46 44 A2 45 B3 45 45 46 36 B3 A1 A3 44 B2 A2 45 B4 A2 A2 B3 45 44 45 A3 44 44*
Line 16 45 45 A3 B3 36 A3 B2 45 45 45 A3 B3 36 45 46 B3 A2 44 A3 45 A3 44 46 35 B4 45 B3 35 46 A2 36 45 B4 36 A2 A3 B2 45 *
Line 17 45 A3 45 45 36 45 45 B4 45 A2 5B 46 36 45 B4 45 46 44 46 B3 A2 45 45 A3 A1 B4 46 A2 B4 5A 45 *
Line 18 45 44 45 45 B3 A2 45 45 38 A2 45 44 45 45 46 45 45 A2 45 37 B4 45 A2 46 44 46 45 46 45 45 45 *
Line 19 A3 45 B3 A2 46 45 A2 B2 A3 44 38 B3 45 45 45 45 46 A1 36 B4 A2 46 B3 45 45 *
Line 20 46 45 45 45 A3 45 46 5B 5B A3 5B 45 45 37 45 A2 45 B4 45 A2 45 45 *
Line 21 45 45 46 44 A3 44 B3 45 38 5A 44 A3 46 45 B2 35 45 A2 44 45 45 45 *
Line 22 45 B3 37 45 36 44 45 B3 45 44 A2 46 44 A3 B3 44 45 44 *
Line 23 45 A2 5A 37 44 A3 46 A3 35 B3 45 B3 45 A3 *
Line 24 44 A2 46 44 46 44 45 45 45 45 35 A2 *
Line 25 45 44 45 A3 44 A2 46 44 *
Line 26 A3 B3 44 44 *
Line 27 B3 44 44 *
Line 28 36 45 *
Line 29 A2 A2 *
Line 30 44 *

This is essentially a big green splat, used to represent a patch of
reclaimable foliage.

Total Annihilation HPI format

HPI File Format

I figured some of this stuff out by disassembling WriteHPI. All hail
Eric DeZert.

I'd also like to thank Jesse Michael for his clear and concise explanation
of the compression scheme (which I shamelessly incorporated into this document),
and Barry Pedersen for helpful comments and miscellaneous useful info.

ZLib compression and decompression by Jean-loup Gailly (compression) and
Mark Adler (decompression).
For more info, see the zlib Home Page at http://www.cdrom.com/pub/infozip/zlib/

The rest I figured out on my own by looking at the data and using
a bit of common sense.

Warning: This is intended for use by people that already know what
they're doing.

I'm a C programmer, so I'm doing things in C notation here, but
I'll try to explain it so that those of you that don't speak C
will be able to understand. If you don't understand, write me
at joed@cws.org and I'll try to clear things up.

I'm also a big believer in examples, so I'll be walking you through
an HPI file as I explain.

The first part of the file is a header. Except for the copyright
statement at the end, this is the only unencrypted portion of the
file. The header looks like this:

typedef struct _HPIHeader {
  long HPIMarker;     /* 'HAPI' */
  long SaveMarker;    /* 'BANK' if saved game */
  long DirectorySize; /* The size of the directory */
  long HeaderKey;     /* Decrypt key */
  long Start;         /* File offset of directory */
} HPIHeader;

Here's a hex dump of a sample header:
00000000 48 41 50 49 00 00 01 00 24 02 00 00 7D 00 00 00 HAPI....$...}...
00000010 14 00 00 00

Taken individually:

HPIMarker

This is just a marker. The value is always HAPI in ASCII. In
hex, it's 0x49504148.

SaveMarker

If it's a saved game, the value is BANK in ASCII, or 0x4B4E4142 in
hex. Save game files are something of a special case, and I haven't
done much to try to decode those. The value in normal HPI files is
0x00010000, but I have no idea if this means anything. I just check
for BANK, and ignore it otherwise.

DirectorySize

This is the size of the directory contained in the HPI file. Here,
the value is 0x224, or 548 bytes. This includes the size of the
header.

HeaderKey

The decryption key. Its value is 0x0000007D. More on this later.

Start

The offset in the file where the directory starts. I have yet to
see one that didn't start immediately after the header at offset
0x14, but you never know.

Now we know enough to read the directory. But first, a small
implementation note. Instead of allocating a buffer of DirectorySize
bytes and then reading the directory into it, allocate a buffer of
DirectorySize bytes, and read DirectorySize-Start bytes into the buffer
at position Start. This is because the directory contains pointers,
but the pointers are relative to the start of the file, not the start
of the directory. By moving the directory down Start bytes into the
buffer, we simplify the program. If we didn't do this, we'd have to
subtract Start from every offset, and that would be a royal pain.

Now some of you are undoubtedly looking at an HPI file with a hex dump
program, and saying "That sure doesn't look like a directory to me!"
Well, you're right. That's because it's encrypted.

To decrypt it, first calculate the decryption key from the HeaderKey
variable:

Key = NOT ((HeaderKey * 4) OR (HeaderKey >> 6))

Doing this on 0x0000007D: 0x7D * 4 = 0x1F4, 0x7D >> 6 = 1, 0x1F4 OR 1 = 0x1F5,
and NOT 0x1F5 (as a 32-bit value) = 0xFFFFFE0A.
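
In C, assuming 32-bit unsigned arithmetic, the key calculation is a
one-liner. The function name is hypothetical.

```c
#include <stdint.h>

/* Key = NOT ((HeaderKey * 4) OR (HeaderKey >> 6)), on 32-bit values. */
uint32_t hpi_key(uint32_t header_key)
{
    return ~((header_key * 4u) | (header_key >> 6));
}
```

For the sample header, hpi_key(0x7D) gives 0xFFFFFE0A.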

Here is the C code for the decryption routine. Since everything in the
file is encrypted, I found it easier to combine the read and decryption
functions into one.

/* HPIFile is the open HPI file (FILE *) and Key is the decryption key
   computed from HeaderKey; both are assumed to be globals. */
int ReadAndDecrypt(int fpos, char *buff, int buffsize)
/*
Read "buffsize" bytes from the HPI file at position "fpos"
into "buff", and then decrypt it.
*/
{
  int count;
  int tkey;
  int result;

  /* first, position the file */
  fseek(HPIFile, fpos, SEEK_SET);

  /* read the data into buff */
  result = fread(buff, 1, buffsize, HPIFile);

  /* for each character in buff... */
  for (count = 0; count < buffsize; count++) {

    /* compute tkey = (fpos + count) XOR Key */
    tkey = (fpos + count) ^ Key;

    /* and then decode the character:
       buff[count] = tkey XOR (NOT buff[count]) */
    buff[count] = tkey ^ ~buff[count];
  }

  /* result is the number of bytes actually read in,
     and should be equal to buffsize */
  return result;
}

Note that the position of the byte in the file (fpos+count) is used to
decrypt.
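
If the data is already in memory, the same transformation can be applied
without any file I/O. This is a sketch with a hypothetical function name;
file_pos is the file offset of the first byte in the buffer. Because each
byte is only XORed with position-dependent values, applying the function
twice restores the original data.

```c
#include <stddef.h>
#include <stdint.h>

/* Decrypts len bytes in place; file_pos is the file offset of buf[0]. */
void hpi_decrypt(uint8_t *buf, size_t len, uint32_t file_pos, uint32_t key)
{
    size_t i;

    for (i = 0; i < len; i++) {
        uint32_t tkey = (file_pos + (uint32_t)i) ^ key;

        /* only the low 8 bits of tkey affect the result */
        buf[i] = (uint8_t)(tkey ^ ~(uint32_t)buf[i]);
    }
}
```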

And here is a decoded directory to make it easy to follow. Note that
I loaded the actual directory starting at offset 0x14, so that the
first 0x14 bytes are all zeros. See the implementation
note above.

All numbers here are 32-bit integers, ie "longs".

00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000010 00 00 00 00 08 00 00 00 1C 00 00 00 64 00 00 00 ............d...
00000020 6A 00 00 00 01 97 00 00 00 A0 00 00 00 01 C6 00 j...............
00000030 00 00 CF 00 00 00 01 13 01 00 00 1D 01 00 00 01 ................
00000040 66 01 00 00 6E 01 00 00 01 94 01 00 00 9D 01 00 f...n...........
00000050 00 01 C3 01 00 00 C9 01 00 00 01 EF 01 00 00 F7 ................
00000060 01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00 ....anims.....r.
00000070 00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C ..{........armfl
00000080 61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02 ak_gadget.gaf.$.
00000090 00 00 D8 2D 00 00 01 64 6F 77 6E 6C 6F 61 64 00 ...-...download.
000000A0 01 00 00 00 A8 00 00 00 B1 00 00 00 BD 00 00 00 ................
000000B0 00 41 52 4D 46 4C 41 4B 2E 54 44 46 00 61 28 00 .ARMFLAK.TDF.a(.
000000C0 00 01 01 00 00 01 66 65 61 74 75 72 65 73 00 01 ......features..
000000D0 00 00 00 D7 00 00 00 E0 00 00 00 E8 00 00 00 01 ................
000000E0 63 6F 72 70 73 65 73 00 01 00 00 00 F0 00 00 00 corpses.........
000000F0 F9 00 00 00 0A 01 00 00 00 61 72 6D 66 6C 61 6B .........armflak
00000100 5F 64 65 61 64 2E 74 64 66 00 E3 28 00 00 68 02 _dead.tdf..(..h.
00000110 00 00 01 6F 62 6A 65 63 74 73 33 64 00 02 00 00 ...objects3d....
00000120 00 25 01 00 00 37 01 00 00 43 01 00 00 00 4C 01 .%...7...C....L.
00000130 00 00 5D 01 00 00 00 61 72 6D 66 6C 61 6B 2E 33 ..]....armflak.3
00000140 64 6F 00 39 2A 00 00 7B 14 00 00 01 61 72 6D 66 do.9*..{....armf
00000150 6C 61 6B 5F 64 65 61 64 2E 33 64 6F 00 C9 34 00 lak_dead.3do..4.
00000160 00 1A 11 00 00 01 73 63 72 69 70 74 73 00 01 00 ......scripts...
00000170 00 00 76 01 00 00 7F 01 00 00 8B 01 00 00 00 41 ..v............A
00000180 52 4D 46 4C 41 4B 2E 43 4F 42 00 67 3F 00 00 E4 RMFLAK.COB.g?...
00000190 09 00 00 01 75 6E 69 74 70 69 63 73 00 01 00 00 ....unitpics....
000001A0 00 A5 01 00 00 AE 01 00 00 BA 01 00 00 00 41 52 ..............AR
000001B0 4D 46 4C 41 4B 2E 50 43 58 00 B4 42 00 00 91 25 MFLAK.PCX..B...%
000001C0 00 00 01 75 6E 69 74 73 00 01 00 00 00 D1 01 00 ...units........
000001D0 00 DA 01 00 00 E6 01 00 00 00 41 52 4D 46 4C 41 ..........ARMFLA
000001E0 4B 2E 46 42 49 00 89 63 00 00 39 05 00 00 01 77 K.FBI..c..9....w
000001F0 65 61 70 6F 6E 73 00 01 00 00 00 FF 01 00 00 08 eapons..........
00000200 02 00 00 1B 02 00 00 00 61 72 6D 66 6C 61 6B 5F ........armflak_
00000210 77 65 61 70 6F 6E 2E 74 64 66 00 2D 67 00 00 42 weapon.tdf.-g..B
00000220 02 00 00 01

Let's get started...

00000010 00 00 00 00 08 00 00 00 1C 00 00 00 64 00 00 00 ............d...
                     ^^^^^^^^^^^ ^^^^^^^^^^^

At offset 0x14, you see the number 0x8. This is the number of
entries in the directory. Grabbing the next 32-bit number at
offset 0x18, you get 0x1C. This is the offset of a list of directory entries.
In this case, there are 8 entries in the list. The format of an entry is:

typedef struct _HPIEntry {
    long NameOffset;      /* points to the file name */
    long DirDataOffset;   /* points to directory data */
    char Flag;            /* file flag */
} HPIEntry;

NameOffset

Pointer to the file name. This is a 0-terminated string of varying length.

DirDataOffset

Pointer to the directory data for the file. The actual data varies depending
on whether it's a file or a subdirectory.

Flag
If this is 1, the entry is a subdirectory. If it's 0, it's a file.
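One practical wrinkle: on disk an entry is 9 bytes, but a C compiler will typically pad the struct above to 12 bytes, so reading sizeof(HPIEntry) bytes straight into it is unlikely to work. A field-by-field reader avoids the problem. This is only a sketch; ParsedEntry, read_le32 and parse_entry are names I made up.

```c
#include <stdint.h>

/* In-memory form of one 9-byte directory entry (hypothetical type). */
typedef struct {
    uint32_t name_offset;      /* points to the 0-terminated name */
    uint32_t dir_data_offset;  /* points to the directory data */
    uint8_t  flag;             /* 1 = subdirectory, 0 = file */
} ParsedEntry;

/* Read a 32-bit little-endian value from an unaligned position. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode the entry stored at "entry_offset" in the decrypted directory. */
ParsedEntry parse_entry(const unsigned char *dir, uint32_t entry_offset)
{
    ParsedEntry e;

    e.name_offset     = read_le32(dir + entry_offset);
    e.dir_data_offset = read_le32(dir + entry_offset + 4);
    e.flag            = dir[entry_offset + 8];
    return e;
}
```

Fed the nine bytes 64 00 00 00 6A 00 00 00 01 from offset 0x1C of the dump, it yields 0x64, 0x6A, 1.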

Looking at offset 0x1C, we see:

00000010 64 00 00 00 ............d...
00000020 6A 00 00 00 01 97 00 00 00 A0 00 00 00 01 C6 00 j...............
00000030 00 00 CF 00 00 00 01 13 01 00 00 1D 01 00 00 01 ................
00000040 66 01 00 00 6E 01 00 00 01 94 01 00 00 9D 01 00 f...n...........
00000050 00 01 C3 01 00 00 C9 01 00 00 01 EF 01 00 00 F7 ................
00000060 01 00 00 01

The 8 entries are:
0x064, 0x06A, 1
0x097, 0x0A0, 1
0x0C6, 0x0CF, 1
0x113, 0x11D, 1
0x166, 0x16E, 1
0x194, 0x19D, 1
0x1C3, 0x1C9, 1
0x1EF, 0x1F7, 1

Let's look at the first entry. The Flag is 1, so it's a
subdirectory. At offset 0x64, we see:

00000060 01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00 ....anims.....r.
                     ^^^^^^^^^^^^^^^^^
or 'anims'. This is the name. Since this is a subdirectory,
offset 0x6A contains the number of entries in the subdirectory,
followed by a pointer to the first entry. This is exactly like
the count/pointer pair at 0x14 that got us started. Think recursion.

00000060 01 00 00 01 61 6E 69 6D 73 00 01 00 00 00 72 00 ....anims.....r.
                                       ^^^^^^^^^^^ ^^^^^
00000070 00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C ..{........armfl
         ^^^^^

The number at offset 0x6A is a 1, indicating that there's only 1
file in this subdirectory. 0x6E contains the offset of the first
(and only) entry in the subdirectory, which is:

0x7B, 0x8E, 0

The 0 indicates that this is a file. Looking at offset 0x7B, we see:

00000070 00 00 7B 00 00 00 8E 00 00 00 00 61 72 6D 66 6C ..{........armfl
                                          ^^^^^^^^^^^^^^
00000080 61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02 ak_gadget.gaf.$.
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

or 'armflak_gadget.gaf'. This is the name of the first (and only) file
in the 'anims' subdirectory. Since this is a file, the data at offset
0x8E is a little different.

There are 3 items here instead of 2:

typedef struct _HPIFileData {
    long DataOffset;   /* starting offset of the file */
    long FileSize;     /* size of the decompressed file */
    char Flag;         /* file flag */
} HPIFileData;

DataOffset

This is the offset in the HPI file that this file starts at.

FileSize

This is the decompressed file size. When you extract the file, it
should be this many bytes long.

Flag

If this is 1, the file is compressed with LZ77 compression.
If it's 2, it's compressed with ZLib compression.
If it's 0, it's not compressed at all; this is the format used by the
unit viewer.

00000080 61 6B 5F 67 61 64 67 65 74 2E 67 61 66 00 24 02 ak_gadget.gaf.$.
                                                   ^^^^^
00000090 00 00 D8 2D 00 00 01 64 6F 77 6E 6C 6F 61 64 00 ...-...download.
         ^^^^^ ^^^^^^^^^^^ ^^

Looking at offset 0x8E, we see that the three items are:

0x224, 0x2DD8, 1

If you recall, the directory size was 0x224 bytes. This says the file
starts at the first offset after the directory, which makes sense and means
we're interpreting things correctly. This also says that the extracted
file should be 0x2DD8 (or 11,736) bytes long.

At this point, we know enough to actually traverse the directory tree in the
HPI file. Here's a recursive pseudocode function to do it. The initial call
to it would be 'TraverseTree(".", Header.Start)'.

TraverseTree(string ParentName, int offset)

    Entries = Directory[offset]
    EntryOffset = Directory[offset+4]

    for count = 1 to Entries
        NameOffset = Directory[EntryOffset]
        DataOffset = Directory[EntryOffset+4]
        Flag = Directory[EntryOffset+8]

        Name = ParentName+"\"+Directory[NameOffset]

        print "Processing ",Name

        if Flag = 1
            TraverseTree(Name, DataOffset)    <- recursion!
        else
            ProcessFile(Name, DataOffset)
        End If

        EntryOffset = EntryOffset + 9
    Next Count
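For anyone who wants a head start, the pseudocode translates to C roughly as follows. This is a sketch under a few assumptions of mine: "dir" is the decrypted directory loaded so that the stored offsets index it directly (per the implementation note), names fit in a 512-byte path buffer, and no bounds checking is done. Instead of calling ProcessFile it just prints each path and counts the files.

```c
#include <stdint.h>
#include <stdio.h>

/* Read a 32-bit little-endian value from an unaligned position. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the directory at "offset", print every path, and return the
   number of files found (traverse_tree is a hypothetical name). */
int traverse_tree(const unsigned char *dir, const char *parent,
                  uint32_t offset)
{
    uint32_t entries      = read_le32(dir + offset);
    uint32_t entry_offset = read_le32(dir + offset + 4);
    uint32_t i;
    int files = 0;

    for (i = 0; i < entries; i++) {
        uint32_t name_offset = read_le32(dir + entry_offset);
        uint32_t data_offset = read_le32(dir + entry_offset + 4);
        unsigned char flag   = dir[entry_offset + 8];
        char path[512];

        snprintf(path, sizeof path, "%s\\%s", parent,
                 (const char *)(dir + name_offset));
        printf("%s\n", path);

        if (flag == 1)
            files += traverse_tree(dir, path, data_offset);  /* recursion */
        else
            files++;  /* a real tool would process the file data here */

        entry_offset += 9;  /* each entry is 9 bytes on disk */
    }
    return files;
}
```

The initial call would be traverse_tree(dir, ".", Header.Start), matching the pseudocode.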

If you code this up in your language of choice and run it, it
should print something like this: (if you haven't guessed already,
the file I'm using as an example is the "Arm Flakker" unit's
aflakker.ufo file)

.\anims
.\anims\armflak_gadget.gaf
.\download
.\download\ARMFLAK.TDF
.\features
.\features\corpses
.\features\corpses\armflak_dead.tdf
.\objects3d
.\objects3d\armflak.3do
.\objects3d\armflak_dead.3do
.\scripts
.\scripts\ARMFLAK.COB
.\unitpics
.\unitpics\ARMFLAK.PCX
.\units
.\units\ARMFLAK.FBI
.\weapons
.\weapons\armflak_weapon.tdf

At this point, I urge you to go look at that directory hex dump and
traverse the thing by hand until it makes sense.

I can hear you now. "What the heck is that 'ProcessFile' function?"
It decodes the file. I'll explain in a bit.

But first, here's a list of the various files in this HPI file, and
their starting offsets.

If you don't understand where I got the starting offsets,
go reread the directory hex dump until you do.

anims\armflak_gadget.gaf 0x0224
download\ARMFLAK.TDF 0x2861
features\corpses\armflak_dead.tdf 0x28E3
objects3d\armflak.3do 0x2A39
objects3d\armflak_dead.3do 0x34C9
scripts\ARMFLAK.COB 0x3F67
unitpics\ARMFLAK.PCX 0x42B4
units\ARMFLAK.FBI 0x6389
weapons\armflak_weapon.tdf 0x672D

Because it's a short file, and because it decodes to readable
text, I'm going to use the ARMFLAK.TDF file as the example.

If the file is not compressed at all, then it is just inserted
into the HPI file as one big chunk.

But if it is compressed...

This is where the -REAL- fun begins. I'm going to take it slow
here because I'm still half figuring it out myself (writing this
has actually made me realize some things that I hadn't before).

When the file was compressed, it was broken up into chunks of 64K (65536)
bytes each, plus one more chunk to hold anything left over. Each chunk
was then compressed. Note that some chunks are larger when compressed
than decompressed, which means that some compressed chunks can be larger
than 64K.

The total number of chunks in the file can be obtained by
the following formula:

chunks = Entry.FileSize / 65536
if (Entry.FileSize mod 65536) <> 0
chunks = chunks + 1

The offset in the directory points to a list of 32-bit numbers
that are the compressed sizes of each compressed chunk of data.

Following the list of sizes are the actual compressed chunks
of data.
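In C, the chunk count and the position of the first chunk work out as below. The function names are mine, and the second function simply assumes what the text says: one 32-bit size per chunk, with the chunks following immediately after the list.

```c
#include <stdint.h>

#define HPI_CHUNK_SIZE 65536u  /* decompressed bytes per full chunk */

/* Number of chunks needed for a file of "file_size" decompressed bytes. */
uint32_t chunk_count(uint32_t file_size)
{
    uint32_t chunks = file_size / HPI_CHUNK_SIZE;

    if (file_size % HPI_CHUNK_SIZE != 0)
        chunks++;
    return chunks;
}

/* File offset of the first chunk: skip the list of 32-bit compressed
   sizes that starts at "data_offset". */
uint32_t first_chunk_offset(uint32_t data_offset, uint32_t file_size)
{
    return data_offset + 4u * chunk_count(file_size);
}
```

For ARMFLAK.TDF (DataOffset 0x2861, FileSize 257) this gives one chunk whose header starts at 0x2865.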

In this HPI file, each file has only one chunk, but the totala1.hpi file
contains some files with a dozen or so, and the hpi files on the CDs have
files in them with over a hundred.

Going to offset 0x2861, we read in a chunk o'data and decrypt it
to find this:

00002860 7E 00 00 00 53 51 53 48 02 01 01 6B 00 00 00 .~...SQSH...k...
00002870 01 01 00 00 FE 36 00 00 20 5B 51 49 4E 55 3C 0E .....6.. [QINU<.
00002880 64 64 94 5D 49 5D 11 14 29 7B D5 26 18 55 6E 75 dd.]I]..){.&.Unu
00002890 64 54 34 41 79 6C 9C 71 81 83 8B 3B 59 49 CB 4D dT4Ayl.q...;YI.M
000028A0 43 D1 42 54 9A A5 A8 AA AF B0 B8 64 61 AC 6D B0 C.BT.......da.m.
000028B0 B1 82 72 34 79 B8 B0 BD DD A3 82 81 E8 86 AC 89 ..r4y...........
000028C0 98 92 C2 CF 98 EB 9D E0 56 BF A2 6F AB 5F A8 96 ........V..o._..
000028D0 B5 C3 9F B8 EB B9 BE 7D 4F C7 5F CE 2F D1 4C D1 .......}O._./.L.
000028E0 D0 D2 90

The decompressed file size of ARMFLAK.TDF is 257 bytes. This
tells us that there's only one chunk. The size of this chunk
is 0x7E bytes. The chunk itself immediately follows.

Each chunk looks like this:

typedef struct _HPIChunk {
    long Marker;             /* always 0x48535153 (SQSH) */
    char Unknown1;
    char CompMethod;         /* 1=LZ77, 2=ZLib */
    char Encrypt;            /* is the block encrypted? */
    long CompressedSize;     /* the length of the compressed data */
    long DecompressedSize;   /* the length of the decompressed data */
    long Checksum;           /* Checksum */
    char data[];             /* 'CompressedSize' bytes of data */
} HPIChunk;

Marker

This is the start-of-chunk marker, and is always 0x48535153 (ASCII 'SQSH').

Unknown1
I know not what this is for. It's always 0x02.
Maybe some sort of version number?

CompMethod
This is the compression method. It's 1 for LZ77, 2 for ZLib.

Encrypt
This tells whether the block is encrypted a second time. See below.

CompressedSize

This is the size of the compressed data in the chunk. 0x6B bytes.

DecompressedSize

This is the size of the decompressed data in the chunk. 0x101 bytes.

Checksum

This is a checksum of the data. It's merely the sum of all the bytes of
data (treated as unsigned numbers) added together.

data

The actual compressed data in the chunk. CompressedSize (0x6B) bytes of it.
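The fixed part of the chunk header is 19 bytes on disk (4 + 1 + 1 + 1 + 4 + 4 + 4), and the three "long" fields start at odd offsets, so a byte-wise reader is the safe way to pull it apart. A sketch, with type and function names that are my own:

```c
#include <stdint.h>
#include <string.h>

#define HPI_CHUNK_HEADER_SIZE 19  /* bytes before the compressed data */

/* Decoded chunk header (hypothetical type). */
typedef struct {
    uint8_t  unknown1;
    uint8_t  comp_method;        /* 1 = LZ77, 2 = ZLib */
    uint8_t  encrypt;            /* nonzero = second-stage encryption */
    uint32_t compressed_size;
    uint32_t decompressed_size;
    uint32_t checksum;
} ChunkHeader;

/* Read a 32-bit little-endian value from an unaligned position. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the SQSH marker is missing. */
int parse_chunk_header(const unsigned char *p, ChunkHeader *h)
{
    if (memcmp(p, "SQSH", 4) != 0)
        return -1;
    h->unknown1          = p[4];
    h->comp_method       = p[5];
    h->encrypt           = p[6];
    h->compressed_size   = read_le32(p + 7);
    h->decompressed_size = read_le32(p + 11);
    h->checksum          = read_le32(p + 15);
    return 0;
}
```

Run against the bytes at 0x2865 in the dump above, it reports CompMethod 1, Encrypt 1, CompressedSize 0x6B, DecompressedSize 0x101 and Checksum 0x36FE.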

Let's look at the data.

00002870 20 5B 51 49 4E 55 3C 0E .....6.. [QINU<.
00002880 64 64 94 5D 49 5D 11 14 29 7B D5 26 18 55 6E 75 dd.]I]..){.&.Unu
00002890 64 54 34 41 79 6C 9C 71 81 83 8B 3B 59 49 CB 4D dT4Ayl.q...;YI.M
000028A0 43 D1 42 54 9A A5 A8 AA AF B0 B8 64 61 AC 6D B0 C.BT.......da.m.
000028B0 B1 82 72 34 79 B8 B0 BD DD A3 82 81 E8 86 AC 89 ..r4y...........
000028C0 98 92 C2 CF 98 EB 9D E0 56 BF A2 6F AB 5F A8 96 ........V..o._..
000028D0 B5 C3 9F B8 EB B9 BE 7D 4F C7 5F CE 2F D1 4C D1 .......}O._./.L.
000028E0 D0 D2 90

Doesn't look like much, does it? That's because (YOU GUESSED IT!) it's
encrypted yet again! Note: the checksum is calculated BEFORE this
decryption, i.e. over the bytes exactly as they are stored in the chunk.

The 'Encrypt' field in the HPIChunk header is set to 1 to indicate
that this decryption needs to be done.

To decrypt, do this (more pseudocode):

for x = 0 to CompressedSize-1
data[x] = (data[x] - x) XOR x
next x

This gives us:

00002870 20 5B 4D 45 4E 55 30 00 [MENU0.
00002880 54 52 80 59 31 5D 0D 0A 09 7B D1 00 10 55 4E 49 TR.Y1]...{...UNI
00002890 54 22 00 3D 41 52 60 4D 41 43 4B 3B 11 01 83 01 T".=AR`MACK;....
000028A0 33 81 32 02 42 55 54 54 4F 4E B4 02 19 42 01 4E 3.2.BUTTON...B.N
000028B0 41 70 02 C2 01 46 4C 41 DD 23 02 7D E0 04 20 05 Ap...FLA.#.}.. .
000028C0 18 00 32 CF 00 D3 01 DE 56 3F 02 4F 03 5F 04 68 ..2.....V?.O._.h
000028D0 05 33 1F 06 D3 01 3E 41 8F 07 9F 08 AF 09 80 0D .3....>A........
000028E0 00 00 4C ..L

Woohoo! Look! Readable word fragments! But remember, the chunk is still
compressed.
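Before moving on to decompression, here are the checksum and the second-stage decryption as C, in the order they have to be applied. Function names are my own. As a cross-check, adding up the 0x6B stored bytes of the example chunk gives 0x36FE, which is exactly the Checksum value in its header.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of all data bytes, each treated as unsigned. Call this on the
   chunk data as stored, before chunk_decrypt. */
uint32_t chunk_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Undo the second-stage encryption in place:
   data[x] = (data[x] - x) XOR x, keeping the low 8 bits. */
void chunk_decrypt(unsigned char *data, size_t len)
{
    size_t x;

    for (x = 0; x < len; x++)
        data[x] = (unsigned char)((data[x] - x) ^ x);
}
```

Only call chunk_decrypt when the Encrypt field of the chunk header is set.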

In this case, the block is compressed with LZ77, since CompMethod is 1.

The compression algorithm is a very basic sliding window compression scheme
from the LZ77 family using a 4095 byte history and matches from 2 to 17
bytes long.

The first byte is kind of a "tag" byte which determines if the next eight
pieces of data are literal bytes or history matches. Starting with the
least-significant bit, this tag byte is scanned to figure out what to do.

When the current bit is a zero, the next byte of the input is transferred
directly to the output and added to the end of the history buffer.

When the current bit is a one, the next two bytes taken from the input are
used as an offset/length pair (a 16-bit value stored low byte first). The
upper 12 bits are the offset into the history buffer and the lower 4 bits
are the length. If the offset is zero, the end of the input data has been
reached and the decompressor simply exits.

Since we can assume that there will be no matches with a length of zero
or only one byte, any match is a minimum of two bytes so we just add two
to the length which extends our range from 0-15 to 2-17 bytes.

The match is then copied from the history buffer to the output and also added
onto the end of the history buffer to keep it in sync with the output.

When all eight bits of the tag byte have been used, the mask is reset and
the next tag byte is loaded.

Here is some decompress code:


int Decompress(char *out, char *in, int len)
{
    /*
      Decompress buffer "in" of size "len" into buffer "out" (previously
      allocated). Returns the number of decompressed bytes.
    */

    int x;
    int outbufptr;    /* next write position in the history window */
    int inbufptr;     /* read position in the history window */
    int mask;         /* selects the current bit of the tag byte */
    int tag;
    int inptr;
    int outptr;
    int count;
    char Window[4096];

    /* undo the second-stage encryption described above; this assumes
       the chunk's Encrypt field is set */
    for (x = 0; x < len; x++) {
        in[x] = (in[x] - x) ^ x;
    }

    inptr = 0;
    outptr = 0;
    outbufptr = 1;
    mask = 1;
    tag = (unsigned char) in[inptr++];

    /* stop at the end marker, or when the input runs out */
    while (inptr < len) {
        if ((mask & tag) == 0) {
            /* literal byte */
            out[outptr++] = in[inptr];
            Window[outbufptr] = in[inptr];
            outbufptr = (outbufptr + 1) & 0xFFF;
            inptr++;
        }
        else {
            /* offset/length pair, stored low byte first */
            count = (unsigned char) in[inptr] |
                    ((unsigned char) in[inptr + 1] << 8);
            inptr += 2;
            inbufptr = count >> 4;
            if (inbufptr == 0)
                return outptr;    /* offset 0 marks the end of the data */
            count = (count & 0x0f) + 2;
            for (x = 0; x < count; x++) {
                out[outptr++] = Window[inbufptr];
                Window[outbufptr] = Window[inbufptr];
                inbufptr = (inbufptr + 1) & 0xFFF;
                outbufptr = (outbufptr + 1) & 0xFFF;
            }
        }
        mask *= 2;
        if (mask & 0x0100) {
            mask = 1;
            tag = (unsigned char) in[inptr++];
        }
    }
    return outptr;
}

When fed the data, the routine spits out:

00000000 5B 4D 45 4E 55 45 4E 54 52 59 31 5D 0D 0A 09 7B [MENUENTRY1]...{
00000010 0D 0A 09 55 4E 49 54 4D 45 4E 55 3D 41 52 4D 41 ...UNITMENU=ARMA
00000020 43 4B 3B 0D 0A 09 4D 45 4E 55 3D 33 3B 0D 0A 09 CK;...MENU=3;...
00000030 42 55 54 54 4F 4E 3D 33 3B 0D 0A 09 55 4E 49 54 BUTTON=3;...UNIT
00000040 4E 41 4D 45 3D 41 52 4D 46 4C 41 4B 3B 0D 0A 09 NAME=ARMFLAK;...
00000050 7D 0D 0A 0D 0A 5B 4D 45 4E 55 45 4E 54 52 59 32 }....[MENUENTRY2
00000060 5D 0D 0A 09 7B 0D 0A 09 55 4E 49 54 4D 45 4E 55 ]...{...UNITMENU
00000070 3D 41 52 4D 41 43 56 3B 0D 0A 09 4D 45 4E 55 3D =ARMACV;...MENU=
00000080 33 3B 0D 0A 09 42 55 54 54 4F 4E 3D 33 3B 0D 0A 3;...BUTTON=3;..
00000090 09 55 4E 49 54 4E 41 4D 45 3D 41 52 4D 46 4C 41 .UNITNAME=ARMFLA
000000A0 4B 3B 0D 0A 09 7D 0D 0A 0D 0A 5B 4D 45 4E 55 45 K;...}....[MENUE
000000B0 4E 54 52 59 33 5D 0D 0A 09 7B 0D 0A 09 55 4E 49 NTRY3]...{...UNI
000000C0 54 4D 45 4E 55 3D 41 52 4D 41 43 41 3B 0D 0A 09 TMENU=ARMACA;...
000000D0 4D 45 4E 55 3D 33 3B 0D 0A 09 42 55 54 54 4F 4E MENU=3;...BUTTON
000000E0 3D 33 3B 0D 0A 09 55 4E 49 54 4E 41 4D 45 3D 41 =3;...UNITNAME=A
000000F0 52 4D 46 4C 41 4B 3B 0D 0A 09 7D 0D 0A 0D 0A 0D RMFLAK;...}.....
00000100 0A .

Yay! Clear decoded text. Write this chunk out, and go get the next one.
When there are no more chunks, close the file, and go process the next one.

To recompress, do something like the following:

WHILE look ahead buffer is not empty
    find a match in the window to previously output data
    IF match length >= minimum match length
        output reference pair
        move the window match length to the right
    ELSE
        output window first data item
        move the window one to the right
    ENDIF
END
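A concrete version of that loop might look like the sketch below. To be clear about what is assumed: this is my own greedy brute-force encoder written to match the decompressor's conventions (history index starts at 1, offset 0 is reserved as the end marker, pairs stored low byte first), not the algorithm the original tools use. It produces only the raw LZ77 stream; the second-stage encryption, checksum and SQSH header still have to be added around it.

```c
#include <stddef.h>

/* Compress "len" bytes from "in" into "out" using the tag-byte format.
   "out" must hold at least len + len/8 + 4 bytes. Returns the number of
   bytes written. lz77_compress is a hypothetical name. */
size_t lz77_compress(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t pos = 0;     /* next input byte to encode */
    size_t outpos = 0;  /* next free byte in out */
    size_t tagpos = 0;  /* position of the current tag byte */
    int bit = 8;        /* 8 forces a fresh tag byte on the first item */
    int finished = 0;

    while (!finished) {
        size_t best_len = 0, best_src = 0, src;

        if (bit == 8) {  /* start a new group of eight items */
            tagpos = outpos++;
            out[tagpos] = 0;
            bit = 0;
        }

        /* brute-force search of the previous 4095 bytes */
        for (src = (pos > 4095 ? pos - 4095 : 0); src < pos; src++) {
            size_t n = 0;

            if (((src + 1) & 0xFFF) == 0)
                continue;  /* history index 0 would read as end marker */
            while (n < 17 && pos + n < len && in[src + n] == in[pos + n])
                n++;
            if (n > best_len) {
                best_len = n;
                best_src = src;
            }
        }

        if (pos >= len) {
            /* end of data: a pair whose offset is zero */
            out[tagpos] |= (unsigned char)(1 << bit);
            out[outpos++] = 0;
            out[outpos++] = 0;
            finished = 1;
        } else if (best_len >= 2) {
            unsigned pair = (unsigned)(((best_src + 1) & 0xFFF) << 4) |
                            (unsigned)(best_len - 2);

            out[tagpos] |= (unsigned char)(1 << bit);
            out[outpos++] = (unsigned char)(pair & 0xFF);
            out[outpos++] = (unsigned char)(pair >> 8);
            pos += best_len;
        } else {
            out[outpos++] = in[pos++];  /* literal; tag bit stays 0 */
        }
        bit++;
    }
    return outpos;
}
```

For the six bytes "abcabc" this emits 18 61 62 63 11 00 00 00: three literals, one pair (offset 1, length 3), then the end marker.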

If CompMethod is 2, use ZLib to decompress the block. You can get the zlib
source code from the zlib home page at http://www.cdrom.com/pub/infozip/zlib/

From here, you've got enough data to proceed on your own. Good luck!

Like I said, if you have any questions, let me know.
Post by KorNet »

Heretic 2 image M8 file format

BYTES   NAME            DESCRIPTION
==============================================
4       Version         version 2 is Heretic2
32      File Name       byte string of directory and file name
4[16]   Width           width of images
4[16]   Height          height of images
4[16]   Offset          file image offsets for image data
32      Animation Name  name of next frame in animation chain
768     Palette         bpp RGB (palette format is the same as pcx ver 5)
4       Flags           texture flags
4       Contents        I don't know :O
4       Value           I don't know :P
==============================================
Image data starts at offset 1040
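The field sizes add up to exactly 1040 bytes (4 + 32 + 3*64 + 32 + 768 + 12), which agrees with the image data offset. As a C struct it could be sketched like this. The field names, the unsigned types and the reading of "4[16]" as sixteen 4-byte values are my assumptions.

```c
#include <stdint.h>

#define M8_NUM_IMAGES 16  /* "4[16]" read as sixteen 4-byte values */

/* Hypothetical in-memory layout of the M8 header. */
typedef struct {
    uint32_t version;                /* 2 for Heretic2 */
    char     name[32];               /* directory and file name */
    uint32_t width[M8_NUM_IMAGES];
    uint32_t height[M8_NUM_IMAGES];
    uint32_t offset[M8_NUM_IMAGES];  /* file offsets of the image data */
    char     anim_name[32];          /* next frame in animation chain */
    uint8_t  palette[768];           /* 256 RGB triples */
    uint32_t flags;
    uint32_t contents;
    uint32_t value;
} M8Header;

/* Compile-time check (C11): no padding, header ends where the image
   data is said to start. */
_Static_assert(sizeof(M8Header) == 1040, "M8 header must be 1040 bytes");
```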

I'm not sure, but I think it may be possible to create animated skins for
Heretic2. Just make the "Animation Name" equal the next skin's name.
At the end of the animation, make the "Animation Name" equal the
first skin name of the animation. Then again, maybe not. I know that
Quake textures are treated like that. Just a thought. :)
Post by KorNet »

Carmageddon 2: Carpocalypse Now TWT file format

NOTE: All values are in hexadecimal (0 to F).

Main header
-----------

Offset  Type    Description
---------------------------------------------------------------
0       DWORD   Size of the entire TWT file
4       DWORD   The number of files in this TWT file

Now you get the file definition headers.

File definition headers
-----------------------
Each file definition header is 38h bytes long. A + means the offset is
counted from the beginning of that file header (this is not the
absolute offset).

Offset  Type        Description
--------------------------------------------------------------
+0      DWORD       This is the size of the file
+4      34h x Char  This is the filename. Unused characters
                    have a value of CDh

Now you get the data of the files.
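Reading the headers from a TWT file loaded into memory could be sketched as follows. Assumptions on my part: DWORDs are little-endian, the filename ends at the first zero or CDh byte, and the first file's data begins right after the last definition header with nothing in between. All names are made up.

```c
#include <stddef.h>
#include <stdint.h>

#define TWT_HEADER_SIZE 0x08  /* two DWORDs */
#define TWT_ENTRY_SIZE  0x38  /* one file definition header */
#define TWT_NAME_SIZE   0x34  /* filename field */

/* Decoded file definition header (hypothetical type). */
typedef struct {
    uint32_t size;
    char     name[TWT_NAME_SIZE + 1];
} TwtEntry;

/* Read a 32-bit little-endian value from an unaligned position. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Number of files stored in the archive (main header, offset 4). */
uint32_t twt_file_count(const unsigned char *twt)
{
    return read_le32(twt + 4);
}

/* Decode file definition header number "index" (0-based). */
TwtEntry twt_entry(const unsigned char *twt, uint32_t index)
{
    const unsigned char *p =
        twt + TWT_HEADER_SIZE + (size_t)index * TWT_ENTRY_SIZE;
    TwtEntry e;
    size_t i;

    e.size = read_le32(p);
    /* copy the name up to the first NUL or CDh filler byte */
    for (i = 0; i < TWT_NAME_SIZE && p[4 + i] != 0 && p[4 + i] != 0xCD; i++)
        e.name[i] = (char)p[4 + i];
    e.name[i] = '\0';
    return e;
}

/* Offset just past the last file definition header. */
uint32_t twt_data_start(uint32_t file_count)
{
    return TWT_HEADER_SIZE + file_count * TWT_ENTRY_SIZE;
}
```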

NOTE: although the sizes of the files (the TWT and the files inside)
are defined in the TWT file, Carmageddon sometimes doesn't allow
a file to have a different size (this is the case with the
"data.twt" file).