XeNTaX Forum Index

All times are UTC + 1 hour


 Post subject: The Division SDF Archive Format
PostPosted: Tue Jan 26, 2016 12:51 pm 
Offline
ultra-n00b

Joined: Tue Jan 26, 2016 2:11 am
Posts: 1
Has thanked: 0 time
Have thanks: 1 time

The Division beta is coming up and I've been looking into the archive format. This is what I've
discovered so far. I'm still looking into it, but I'm not very skilled at figuring out
archive formats.

There are three types of files in the data folder.

SDFVER

sdfver files are text-based. They contain numeric key-value pairs separated by spaces.
Each pair is for a different section in the sdfdata files (0 = A, 1 = B, 2 = C, etc.).
The key and value are separated by a colon. Each value consists of 5 numbers separated by spaces.

I currently do not know what the numbers mean; I think they have something to do with patching / version info.
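
A minimal sketch of reading an sdfver file under the layout described above. It assumes decimal integers and entries that look like "key: n n n n n"; the name parse_sdfver and the exact spacing are guesses for illustration, not taken from a real file.

```python
import re

# Assumed entry layout: "<key>: <n1> <n2> <n3> <n4> <n5>", entries separated by whitespace.
ENTRY = re.compile(r"(\d+)\s*:\s*(\d+(?:\s+\d+){4})")

def parse_sdfver(text):
    """Map a section letter (0 = A, 1 = B, ...) to its five numbers."""
    sections = {}
    for key, values in ENTRY.findall(text):
        letter = chr(ord("A") + int(key))  # numeric key to section letter
        sections[letter] = [int(v) for v in values.split()]
    return sections
```

Calling parse_sdfver(open("sdf.sdfver").read()) would then give one list of five integers per section, which at least makes it easy to diff the values between game patches.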

SDFTOC

sdftoc files are binary files with the header magic "WEST". All sdftoc files have a separator(?).
The separator starts with "massive " and ends with "ubisoft " (the spaces being 0x00), with 20
seemingly random characters in between. This separator is repeated multiple times in a file, and is always
at the beginning of the file at offset 0x1C and at the very end of the file.
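
To list where the separators sit in a file, something like the following might do. It is only a sketch based on the description above ("massive" plus a 0x00 byte, 20 arbitrary bytes, "ubisoft" plus a 0x00 byte, so 36 bytes in total); find_separators is a made-up name.

```python
import re

# "massive" + NUL, 20 arbitrary bytes, "ubisoft" + NUL (36 bytes in total).
SEPARATOR = re.compile(rb"massive\x00(.{20})ubisoft\x00", re.DOTALL)

def find_separators(blob):
    """Return (offset, 20-byte payload) for every separator found in blob."""
    return [(m.start(), m.group(1)) for m in SEPARATOR.finditer(blob)]
```

Comparing the 20-byte payloads across separators and across files would show whether they are really random or repeat per file.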

Looking at the name I'm assuming the sdftoc files are Table of Content files, the actual content isn't
readable directly.

SDFDATA

sdfdata files are also binary and have the header magic "BERG". They have the same separator
as sdftoc files, but the first separator is at offset 0x08. They also end with it.
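
The two headers described so far can be checked with a small helper. This is a sketch under the assumptions stated in this post only (magic in the first four bytes, first separator at 0x1C or 0x08, another separator closing the file); identify and the constant names are hypothetical.

```python
# magic -> (kind, offset of the first separator), as observed in this post
MAGIC = {
    b"WEST": ("sdftoc", 0x1C),
    b"BERG": ("sdfdata", 0x08),
}
SEP_START = b"massive\x00"
SEP_END = b"ubisoft\x00"
SEP_LEN = len(SEP_START) + 20 + len(SEP_END)  # 36 bytes

def identify(blob):
    """Return the file kind, or None if the magic or separator placement is off."""
    entry = MAGIC.get(blob[:4])
    if entry is None:
        return None
    kind, offset = entry
    first = blob[offset:offset + SEP_LEN]
    last = blob[-SEP_LEN:]
    for sep in (first, last):
        if len(sep) != SEP_LEN:
            return None
        if not (sep.startswith(SEP_START) and sep.endswith(SEP_END)):
            return None
    return kind
```

Files that hold nothing but the word "dummy" would come back as None here, which is a quick way to count them.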

I have come across sdfdata files that have plain text in them and bnk files.

A lot of the sdfdata files only contain the word "dummy".

Other Observations

Looking at the data folders, there are two: "sdf" and "sdf_streaming".

The sdf folder contains most of the game data and is about 27GB in size for the PC Beta.
In the folder there is a single sdftoc file and a single sdfver file, both named "sdf". There are 2700 sdfdata files,
named "sdf-{0}-{1}.sdfdata". {0} = A, B, or C; each section is 1000 files large. {1} = 0000-2699.
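
For scripting over the folder, the naming scheme can be taken apart like this. A sketch only: it assumes an uppercase section letter and a zero-padded four-digit index, and parse_name is a made-up helper.

```python
import re

# Assumed pattern: sdf-<section letter>-<4-digit index>.sdfdata
NAME = re.compile(r"^sdf-([A-Z])-(\d{4})\.sdfdata$")

def parse_name(filename):
    """Split a name into (section letter, sdfver key, file index), or None."""
    m = NAME.match(filename)
    if m is None:
        return None
    letter, index = m.group(1), int(m.group(2))
    # The sdfver key follows the 0 = A, 1 = B, 2 = C mapping described earlier.
    return letter, ord(letter) - ord("A"), index
```

The middle value is the key that, by the 0 = A mapping, should line up with the matching entry in the sdfver file.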

The sdf_streaming folder contains a folder for nyc_manhatten with a lot
of tile folders and a global folder. Each of these folders contains a data folder with the previously mentioned
sdf data structure. There is also a "tileconfig.txt" file which contains a JSON-like structure with the following
properties: "hasGlobal", "tileSize", and "tiles", which is an array with three numeric values per item.

I have a strong feeling this data structure is meant to be mounted similar to a hard drive.

 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Wed Feb 03, 2016 2:31 pm 
Offline
advanced

Joined: Sun Jun 17, 2012 3:32 am
Posts: 40
Has thanked: 3 times
Have thanks: 3 times
Anything new on this?


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Wed Feb 03, 2016 11:01 pm 
Offline
ultra-n00b

Joined: Wed Feb 03, 2016 10:06 pm
Posts: 2
Has thanked: 0 time
Have thanks: 2 times
I hope I can extend the great investigative post of sgtfrankieboy with some further notes.

My primary interest in this content was the audio/music, so I experimented with the resources while focusing on this particular type of content.
About 260+ of the sdfdata resource files seem to contain both Audiokinetic Wwise RIFF Vorbis and Wwise_SoundBank files as embedded content.

The Wwise audio files with RIFF/WAVE headers are easy enough to split out into individual files using regular splitter tools with header management (vgmtoolbox did a great job on this). The wave headers were "usual", but a very few of them had illegal lengths, causing the splitter to extract multiple wave headers into one file (this affected only a few files, compared to the tens of thousands of audio files it found and saved properly). Extracting the actual bnk files and using bnkextr might help to get the individual files properly; not tested so far.
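
For anyone who wants to do the splitting by hand, here is a rough sketch of carving RIFF/WAVE chunks out of an sdfdata blob. It relies on the standard RIFF layout ("RIFF", a little-endian 32-bit size, then "WAVE"). Clamping an oversized length at the next header is my own heuristic for the illegal lengths mentioned above, not what vgmtoolbox does, and the function names are made up.

```python
import struct

def _next_header(blob, start):
    """Offset of the next 'RIFF....WAVE' header at or after start, else -1."""
    pos = blob.find(b"RIFF", start)
    while pos != -1:
        if blob[pos + 8:pos + 12] == b"WAVE":
            return pos
        pos = blob.find(b"RIFF", pos + 1)
    return -1

def carve_riff(blob):
    """Return a list of (offset, chunk_bytes) for each embedded RIFF/WAVE chunk."""
    chunks = []
    pos = _next_header(blob, 0)
    while pos != -1:
        # The size field counts everything after the first 8 bytes.
        (size,) = struct.unpack_from("<I", blob, pos + 4)
        end = min(pos + 8 + size, len(blob))
        nxt = _next_header(blob, pos + 12)
        if nxt != -1 and nxt < end:
            end = nxt  # declared length runs into the next header: clamp it
        chunks.append((pos, blob[pos:end]))
        pos = _next_header(blob, end)
    return chunks
```

Each chunk can then be written out as a .wem file and fed to the usual conversion tools.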

Regardless, ww2ogg did a nice job using the --pcb packed_codebooks_aoTuV_603.bin switch, with revorb then doing the polishing. The ogg files come in mono/stereo/surround with VBR and a 48 kHz sample rate.
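
A batch run of the two tools might be scripted roughly as below. This is a sketch assuming ww2ogg and revorb are on the PATH and the codebook file is in the working directory. The -o output switch is how I remember ww2ogg's usage text, so treat it as an assumption and check it against your build; the function names are made up.

```python
import subprocess
from pathlib import Path

def conversion_commands(wem, codebooks="packed_codebooks_aoTuV_603.bin"):
    """Build the ww2ogg and revorb command lines for one extracted file."""
    wem = Path(wem)
    ogg = wem.with_suffix(".ogg")
    return [
        # "-o" is assumed from ww2ogg's usage text; "--pcb" is the switch named above.
        ["ww2ogg", str(wem), "-o", str(ogg), "--pcb", codebooks],
        # revorb is given only the ogg file here.
        ["revorb", str(ogg)],
    ]

def convert(wem):
    """Run both tools in order; raises if either exits with an error."""
    for cmd in conversion_commands(wem):
        subprocess.run(cmd, check=True)
```

Looping convert() over every extracted .wem file would reproduce the manual steps described above.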

The music content is just perfect for my taste; I made up some playlists for personal enjoyment. All the music is in stereo format; I found no evidence of a 5.1 version of any of it. Ambiences are mostly 6-channel ogg files, but stereo and mono occur sometimes too. Looped music content is very common, which most likely helps the cinematic experience in the gameplay.

Fun facts: some of the dialogue (which was never actually part of the beta gameplay) is already "here" (spoiler alert, heh), and a few lines are text-to-speech recordings for now. I also found one particular speech line that exists in both a text-to-speech and a real voice actor version, which (I speculate) indicates that the packages may not be entirely polished, or are simply rolling forward without cleaning out old/unused content, and that no replacement will occur in later patches either. Moreover, about 25% of the contents happen to be binary duplicates (sometimes more than one duplicate in different sdfdata files), and sometimes alternative versions of a given piece of content (music/speech etc.) also occur. That strengthens my suspicion that the packages are largely unpolished, and may mean duplicates in other content (3D models, textures etc.) as well. Such a waste of HDD space.
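
Binary duplicates like these are easy to find once the files are extracted. A minimal sketch that groups files by SHA-256 digest (find_duplicates is a hypothetical helper, and it reads each file fully into memory, which is fine for audio clips):

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def find_duplicates(folder):
    """Group files under folder by SHA-256 digest; return groups of 2 or more."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Dividing the number of redundant files by the total file count gives a duplicate ratio to compare against the rough 25% figure.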

As a side note, the extracted ogg audio files were always complete and I did not hit any missing file ends, which indicates (with high probability) that the sdfdata files are always complete. This of course doesn't rule out that any/all of the sdfdata contents are part of a huge virtual disk file. I also speculate that the sdf_streaming data will most likely be (as the folder structure also indicates) the sliced-up open-world model without the textures, which helps the Snowdrop engine organize and stream far parts of the environment in and out with the given LOD requirements.

I wish the best to anyone up to the challenge of deciphering the resource format, and I sure hope this bit of information will also prove useful in the process.
:up:


Last edited by bulihack on Fri Feb 05, 2016 7:01 am, edited 1 time in total.

 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Thu Feb 04, 2016 7:27 pm 
Offline
ultra-n00b

Joined: Thu Feb 04, 2016 7:24 pm
Posts: 9
Has thanked: 0 time
Have thanks: 0 time
This page here: http://zenhax.com/viewtopic.php?t=2072 details how to dump everything from the archives,
but that method produces thousands and thousands of unnamed files, which makes finding anything specific pretty hard.

The good news is that pretty much everything seems to be in regular formats: xml, lua, .dat, dds, and the raw shader sources are there as well.

If someone could make a tool that dumps everything with the correct folder structure and filenames... that would be great!


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Fri Feb 05, 2016 7:23 am 
Offline
ultra-n00b

Joined: Wed Feb 03, 2016 10:06 pm
Posts: 2
Has thanked: 0 time
Have thanks: 2 times
offzip does indeed extract some content, but it's hit and miss. I ran it through a few audio-filled sdfdata files, but all I got was junk instead of wem/bnk. This is actually the expected result, since it indicates that some parts are zipped while others are uncompressed data streams.

I have also tested the sdf/../sdf.sdftoc file with the offzip tool, and it quickly revealed that the table of contents is actually there in a zipped format, but the unpacked result was broken: only partial filenames are human-readable in the output stream. This indicates that either the zipped stream is malformed, offzip is broken, the deflated stream is encrypted, or the zip algorithm is altered, which makes it harder to get the proper index file out of the stream. Anyway, this index file seems to contain all filenames embedded in the /sdf resource files, which have to be used to recreate the folder structure. The same applies to the .sdftoc files in the sdf_streaming folders too.
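
For reference, the kind of brute-force scan offzip performs can be approximated in a few lines. This is a sketch of the general idea (try to inflate at every offset that looks like a zlib header), not offzip's actual code; scan_zlib and min_size are made up.

```python
import zlib

def scan_zlib(blob, min_size=16):
    """Return (offset, inflated_bytes) for each complete zlib stream in blob."""
    found = []
    pos = 0
    while pos + 2 <= len(blob):
        cmf, flg = blob[pos], blob[pos + 1]
        # zlib header: method 8 (deflate) and a 16-bit value divisible by 31.
        if cmf & 0x0F == 8 and ((cmf << 8) | flg) % 31 == 0:
            inflater = zlib.decompressobj()
            try:
                data = inflater.decompress(blob[pos:])
            except zlib.error:
                data = None
            # Accept only streams that inflate cleanly to their end marker.
            if data is not None and inflater.eof and len(data) >= min_size:
                consumed = len(blob) - pos - len(inflater.unused_data)
                found.append((pos, data))
                pos += consumed
                continue
        pos += 1
    return found
```

Because it only accepts streams that reach their end marker, a truncated or altered stream shows up as a gap between hits rather than as junk output, which may help narrow down why the TOC comes out partly garbled.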


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Mon Feb 22, 2016 8:11 am 
Offline
ultra-n00b

Joined: Mon Feb 22, 2016 8:04 am
Posts: 1
Has thanked: 2 times
Have thanks: 0 time
Has anyone made any progress on recreating the file structures? I've matched a few files up based on their association with other files, but that's about it.

I've also been unable to find or extract the image data, both the pngs and the sprite sheets. This is my first time working with offzip, so I don't know whether or not it outputs png from the extracted data, but none of the files were extracted as png. I also haven't been able to identify any by the file signature.
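
Sorting the unnamed dump by signature can be automated with a small lookup. A sketch only: the list covers the well-known magic numbers of formats mentioned in this thread and is certainly incomplete, and sniff is a hypothetical helper.

```python
# (magic bytes, label) pairs; all of these are standard, publicly known signatures.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"DDS ", "dds"),
    (b"RIFF", "riff"),  # Wwise audio in this thread uses RIFF headers
    (b"BKHD", "bnk"),   # first chunk of a Wwise SoundBank
    (b"OggS", "ogg"),
    (b"<?xml", "xml"),
]

def sniff(data):
    """Return a label for the first matching signature, else 'unknown'."""
    for magic, kind in SIGNATURES:
        if data.startswith(magic):
            return kind
    return "unknown"
```

If nothing in the dump comes back as "png", the images are probably stored as dds or in some engine-specific container instead.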


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Mon Feb 22, 2016 2:02 pm 
Offline
Moderator

Joined: Mon Jul 05, 2010 8:55 pm
Posts: 586
Has thanked: 19 times
Have thanks: 215 times
I took a look at the SDF format a while ago. It is pretty complex at the moment. Right now, the TOC files need to be decrypted; that's most likely why nobody has written an unpacker, because the TOC seems to be obfuscated/encrypted.

_________________
Click the thanks button if I helped!


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Thu Feb 25, 2016 10:38 pm 
Offline
veteran

Joined: Mon Aug 06, 2012 4:14 am
Posts: 94
Has thanked: 0 time
Have thanks: 61 times
There's no encryption/obfuscation going on in the TOC file.


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Thu Feb 25, 2016 10:43 pm 
Offline
Moderator

Joined: Mon Jul 05, 2010 8:55 pm
Posts: 586
Has thanked: 19 times
Have thanks: 215 times
Sir Kane wrote:
There's no encryption/obfuscation going on in the TOC file.


Then what's this block of data? (:

Image



 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Sat Feb 27, 2016 2:44 am 
Offline
ultra-n00b

Joined: Sat Feb 27, 2016 2:40 am
Posts: 1
Has thanked: 0 time
Have thanks: 0 time
offzip takes care of that

Image


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Sat Feb 27, 2016 3:23 am 
Offline
Moderator

Joined: Mon Jul 05, 2010 8:55 pm
Posts: 586
Has thanked: 19 times
Have thanks: 215 times
Scrapz wrote:
offzip takes care of that

Image


Actually, no it doesn't because that block I highlighted is NOT zlib compressed data. :)

The problem here is figuring out how to extract files properly without offzip. It's an interesting file format, I've not seen one like this yet. But I'm fairly certain some information is missing in order to unpack the files properly like the offsets and sizes of the data.



 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Sat Feb 27, 2016 4:38 pm 
Offline
veteran

Joined: Mon Aug 06, 2012 4:14 am
Posts: 94
Has thanked: 0 time
Have thanks: 61 times
Names, offsets, sdfdata indices and all that are in the TOC file's zlib compressed chunk. That 0x140 bytes block is probably some signature or something like that.


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Wed Mar 02, 2016 8:12 pm 
Offline
ultra-n00b

Joined: Wed Mar 02, 2016 8:09 pm
Posts: 4
Has thanked: 2 times
Have thanks: 0 time
Sir Kane wrote:
Names, offsets, sdfdata indices and all that are in the TOC file's zlib compressed chunk. That 0x140 bytes block is probably some signature or something like that.


I wasn't able to get clean enough data to determine that, unless I am not viewing it properly... I used offzip to unzip the sdftoc. Then I opened the single dat file with HxD to view it. I see patterns, but also see data that looks fragmented. Maybe HxD isn't the best viewer for the content.

Example pattern:
Code:
®..1.. ..á..C......_discover.mmissionB.
®..1'.—.=ã..C......eventtemplate.mmissionB.
®..1å.š.Ôä..C.....uO-£..o_verticalslice_sÿ,£.p ,£.cw,£..bar.mmissionB.


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Wed Mar 02, 2016 10:59 pm 
Offline
ultra-n00b

Joined: Wed Mar 02, 2016 8:09 pm
Posts: 4
Has thanked: 2 times
Have thanks: 0 time
After a little more digging, it appears that there is a more specific pattern at play.

In hex:
Code:
42 __ __ AE 1E 00 __


The 42 is what generates the ASCII 'B' at the end of every visible path. 1E is technically considered a Record Separator. Since AE and 1E are always paired, I am assuming that they symbolize the termination of a record.

So a regex pattern for a record match with two sub matches might be:
Code:
(([0-9A-F]+)42([0-9A-F]{4}))AE1E
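
The same idea can be run on the raw bytes instead of a hex dump, which avoids matches that start in the middle of a byte (a hex-string regex will happily find "42" inside "A4 2B"). A sketch assuming the text before the 'B' is printable ASCII; find_records is a made-up name.

```python
import re

# Printable run, then 'B' (0x42), two arbitrary bytes, then AE 1E.
RECORD = re.compile(rb"([\x20-\x7e]+)\x42(.{2})\xae\x1e", re.DOTALL)

def find_records(blob):
    """Return (text before the 'B', the two bytes after it) for each match."""
    return [
        (m.group(1).decode("ascii"), m.group(2))
        for m in RECORD.finditer(blob)
    ]
```

Run over the inflated TOC, this gives the readable path tails together with the two bytes that follow each 'B', which can then be compared across records.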


 
 Post subject: Re: The Division SDF Archive Format
PostPosted: Tue Mar 08, 2016 2:04 pm 
Offline
ultra-n00b

Joined: Thu Feb 04, 2016 7:24 pm
Posts: 9
Has thanked: 0 time
Have thanks: 0 time
:)



Last edited by redspike474 on Thu Mar 24, 2016 7:35 pm, edited 1 time in total.

 