
Help with The Godfather II .str files

venixuc
beginner
Posts: 22
Joined: Tue Aug 13, 2019 3:00 am
Has thanked: 2 times
Been thanked: 1 time

Re: Help with The Godfather II .str files

Post by venixuc » Tue Jan 26, 2021 3:21 am

Delete 4 bytes from the beginning of NAM_1.dat and you will be able to decompress it.
You are right. This makes it possible to decompress the file, and the result looks exactly like kapR. It means that GE only reads the file after those 4 bytes, and nothing else; this finding makes GE obsolete.

This script, written by aluigi, extracts only the decompressed NAM_1.dat from the .str (the output file has the same name as the .str archive, dac49de4 in this case):

Code: Select all

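comtype dk2
# decompressed size, stored at offset 0x34
goto 0x34
get SIZE long
# compressed size, stored at 0x44 right after the "kapR" tag (the bytes B8 FF 01 00)
goto 0x44
get ZSIZE long
# the compressed data starts immediately after the size field
savepos OFFSET
get NAME basename
# decompress ZSIZE bytes at OFFSET into a SIZE-byte file named after the archive
clog NAME OFFSET ZSIZE SIZE

The decompressed file is the same as kapR, but the script doesn't compress it back.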

The decompressed NAM_1 (and/or kapR) starts with 4C 43 48 32 (LCH2). The compressed one extracted from .str has B8 FF 01 00 10 FB 03 65 DF E2 (¸˙...ű.eßâ) before this string, and it cannot be decompressed without removing B8 FF 01 00.
Using dk2 as the comtype parameter in the bms script for decompression makes the data fully readable; xrefpack gives the error "the uncompressed data is bigger than the allocated buffer"; and ct_refpack crashes QuickBMS.
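
A possible reading of the four leading bytes: taken as a little-endian 32-bit integer, B8 FF 01 00 is 131,000, which matches the compressed size mentioned later in the thread, so the prefix may simply be a length field that the decompressor does not expect. The Python sketch below strips it under that assumption; `strip_size_prefix` is a hypothetical helper, not part of any tool discussed here.

```python
import struct

def strip_size_prefix(chunk: bytes) -> bytes:
    """Drop a 4-byte little-endian length prefix from a NAM_1.dat chunk.

    Assumption: the first four bytes hold the length of the data that follows.
    """
    if len(chunk) < 4:
        raise ValueError("chunk is too short to hold a size prefix")
    (declared,) = struct.unpack_from("<I", chunk, 0)
    payload = chunk[4:]
    if declared != len(payload):
        raise ValueError(f"prefix says {declared} bytes, found {len(payload)}")
    return payload

# The prefix seen in the archive, read as a little-endian integer.
print(struct.unpack("<I", bytes.fromhex("B8FF0100"))[0])  # 131000
```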

The compression results are different. For example, "Skarsvaag" string is compressed as "Skŕvaag" in the original file and as "Sűkŕvaag" in dk2 or ct_refpack comtype script; the strings before LCH2 are also different - BC FF 01 00 E0 10 FB 01 FF 06 05 BC E0 FB 0A 05 BC FB FB E4 B8 FF 01 00 10 FB 03 65 DF E2 (Ľ˙..ŕ.ű.˙..Ľŕű..Ľűűä¸˙...ű.eßâ) for ct_refpack and 10 FB 01 FF BC E4 BC FF 01 00 E0 10 FB 01 FF 06 05 BC E0 FB 0A 05 BC FB FB E4 01 13 B8 FB 10 FB 03 65 DF E2 (.ű.˙ĽäĽ˙..ŕ.ű.˙..Ľŕű..Ľűűä..¸ű.ű.eßâ) for dk2. Using xrefpack for compression gives an error "unsupported compression -1 in reimport mode". The B8 FF 01 00 part is present only in ct_refpack compression.

The data compression for dk2 and ct_refpack is similar, and the "gap" from 0001b3c0 to 0001ffbc is present for both comtypes. The last byte in the original file is at offset 0001ffbb, and it's part of compressed data.

An important question is: are dk2, ct_refpack and xrefpack the only refpack comtypes?

ikskoks
Moderator
Posts: 738
Joined: Thu Jul 26, 2012 5:06 pm
Location: Poland, Łódź
Has thanked: 463 times
Been thanked: 203 times

Post by ikskoks » Tue Jan 26, 2021 12:00 pm

So it seems that the only problem you may have now with packing is making the compressed file smaller than or equal to the original, so that it can replace the original in the archive.
I think that I have solved that issue, explanation below.

I have checked refpack-brute-force once again and it seems that it works fine, but the author didn't include any progress tracking,
so both of us probably thought that the tool was broken. :D
I made some minor changes to the source and now progress can be tracked: https://i.imgur.com/tjnFxeO.png
It will take some time (around 5-10 minutes), but it works and it will output a file smaller than the original: https://i.imgur.com/bRxydAa.png

If you need this new file to match the original size (131 000), just pad the end of the file with zeroes and that should do the trick.
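
Padding with zeroes could look like this in Python. This is only a sketch: `pad_to_size` is a hypothetical helper, and 131000 is the original size quoted above.

```python
def pad_to_size(data: bytes, target_size: int) -> bytes:
    """Append zero bytes until data is exactly target_size bytes long."""
    if len(data) > target_size:
        raise ValueError(
            f"data is {len(data) - target_size} bytes larger than the target"
        )
    return data + b"\x00" * (target_size - len(data))

# Toy example: a 5-byte "compressed file" padded to 8 bytes.
print(pad_to_size(b"\x10\xfb\x03\x65\xdf", 8).hex(" "))
```

For the real file the call would be along the lines of `pad_to_size(compressed, 131000)`.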

Updated refpack-brute-force code is in the attachment with compiled executable ready to use.

Post by venixuc » Wed Jan 27, 2021 2:12 am

ikskoks wrote: Updated refpack-brute-force code is in the attachment with compiled executable ready to use.
This one works! But it takes some extra steps to keep the game from crashing.

The only way to reimport the modified kapR into the .str is by using the dead_space_3.bms script. This script, however, reads the NAM_1.dat inside the archive starting from B8 FF 01 00; the file compressed with refpack-brute-force doesn't have these bytes.

Inside the .str, the content from 00000040 to 0000004F is 6B 61 70 52 B8 FF 01 00 10 FB 03 65 DF E2 4C 43 (kapR¸˙...ű.eßâLC). After using dead_space_3.bms to reimport the kapR compressed with refpack-brute-force, the "B8 FF 01 00" is overwritten, and 4 extra zero bytes are added at the end, before the "COHSÔ...TADSbal Borbon" string at the beginning of NAM_2.bal.

This crashes the game, but the workaround is simple: add those 4 bytes manually between 6B 61 70 52 and 10 FB 03 65, and remove the 4 zero bytes at the end. Then it fully works: the game doesn't crash and the in-game text is displayed properly.
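
The manual fix described above can be sketched in Python. This is only an illustration: `restore_size_field` is a hypothetical helper, and it assumes the slot passed in is exactly the region that follows the kapR tag after the reimport, ending in the 4 stray zero bytes.

```python
def restore_size_field(slot: bytes, size_field: bytes) -> bytes:
    """Put the 4-byte size field back in front and drop 4 trailing zero bytes.

    The returned slot has the same length as the input, so nothing after it
    in the archive moves.
    """
    if len(size_field) != 4:
        raise ValueError("size field must be exactly 4 bytes")
    if slot[-4:] != b"\x00" * 4:
        raise ValueError("slot does not end with 4 zero bytes")
    return size_field + slot[:-4]

# Toy slot: compressed data followed by zero padding.
broken = bytes.fromhex("10FB0365DFE2") + b"\x00" * 6
fixed = restore_size_field(broken, bytes.fromhex("B8FF0100"))
print(fixed.hex(" "))
```

For the archive in this thread the slot would presumably be the 131,004 bytes starting at offset 0x44, but that is inferred from the hex dumps here and has not been verified.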

This wraps up the compression; now, it's possible to edit text and test the results.

Just one thing remains: how do I find a string's pointer inside the decompressed kapR? The first text string is "1", followed by "Specs", then "Choose a hand accessory style", and so on. The first pointers are 88 1B, 8A 1B and 90 1B.

But the names are located relatively far from those first strings. The Specs string starts at 00011B8A, while "Big Greenie" Grinberg starts at 00018363 (viewed with HxD).

So, how do I find the pointer for the "Big Greenie" Grinberg string?

Post by ikskoks » Wed Jan 27, 2021 4:29 pm

venixuc wrote: This wraps up the compression; now, it's possible to edit text and test the results.
Great!

venixuc wrote: So, how do I find the pointer for the "Big Greenie" Grinberg string?
I have created a text exporter that will make it easier for you.
It outputs data in four columns: ID, pointer offset, text offset and text.
https://github.com/bartlomiejduda/Tools ... xt_Tool.py

So, for example, the pointer for the text "Big Greenie" Grinberg is at offset 41196 (0000A0EC in hex).
https://i.imgur.com/S92tDGd.png
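
For readers without access to the linked tool, here is a minimal Python sketch of the same four-column idea. It is not the tool's actual code: `export_rows` is a hypothetical helper, and it assumes that each pointer is a 4-byte little-endian absolute file offset and that the strings are single-byte, null-terminated text (decoded as latin-1 here).

```python
import struct

def export_rows(data: bytes, table_offset: int, count: int):
    """Return (ID, pointer offset, text offset, text) for each table entry."""
    rows = []
    for index in range(count):
        pointer_offset = table_offset + 4 * index
        (text_offset,) = struct.unpack_from("<I", data, pointer_offset)
        end = data.find(b"\x00", text_offset)
        if end == -1:          # last string may lack a terminator
            end = len(data)
        text = data[text_offset:end].decode("latin-1")
        rows.append((index, pointer_offset, text_offset, text))
    return rows

# Toy file: 8-byte header, two pointers, then two strings.
toy = b"HEADER00" + struct.pack("<2I", 16, 18) + b"1\x00Specs\x00"
for row in export_rows(toy, table_offset=8, count=2):
    print(row)
```

On the real decompressed file, the table offset and entry count would be the hardcoded values from the tool (36308 and 6324, quoted later in this thread).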

Post by venixuc » Thu Jan 28, 2021 2:38 am

ikskoks wrote: I have created a text exporter that will make it easier for you.
It outputs data in four columns - ID, pointer offset, text offset and text.
https://github.com/bartlomiejduda/Tools ... xt_Tool.py

This one is great. It's basically vital for finding pointers.

Now, in one of your earlier posts, you wrote:
As for pointers, I have explained it with screenshots earlier
https://i.imgur.com/poy87uD.png
https://i.imgur.com/5glpvrs.png

You only need to change values. For example you have "Specs" word with pointer 7050.
When you change it to 7052, game will read this as "ecs".
But it will of course give you 2 extra characters to use in the previous string.

Here are screenshots of the "Big Greenie" Grinberg pointer, at offset 0000A0EC (value 63 83):

https://ibb.co/z6dHmMp (Little endian)

https://ibb.co/GWxSFY4 (Big endian)

Since increasing pointer value makes the string shorter, then decreasing it should make it longer (by reading fragments of previous string). Does this mean that pointers mark beginning of the string, while null terminators mark its end?
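
A small Python sketch of that reading of the format (a pointer marks where a string starts, the next zero byte ends it), using a toy buffer instead of real kapR data. `read_cstring` is a hypothetical helper and latin-1 is an assumed encoding.

```python
def read_cstring(data: bytes, start: int) -> str:
    """Read from start up to (not including) the next zero byte."""
    end = data.find(b"\x00", start)
    if end == -1:
        end = len(data)
    return data[start:end].decode("latin-1")

blob = b"1\x00Specs\x00"
print(read_cstring(blob, 2))  # Specs
print(read_cstring(blob, 4))  # ecs  (pointer increased by 2)
```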

I don't know if you played the game; it has a "Don's View", where various things can be viewed - including "Family Tree" and businesses (with owners). The very first string I changed was "Your Family" (string length 12) at offset 00012F44; I changed it to "Corleone" (string length 9) without editing pointers, I just filled the remaining bytes with zeroes - and the text was properly shown in game.
So the player family name now was Corleone instead of Your Family; the business ownership text was also changed, since it also reads from the same offset - for example, instead of reading Appliance King - Controlled by Your Family it reads Appliance King - Controlled by Corleone.
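
The same kind of in-place edit (a shorter replacement, with the rest of the slot filled with zeroes and no pointer changes) could be scripted roughly as follows. `overwrite_in_place` is a hypothetical helper and the buffer is a toy example, not real game data.

```python
def overwrite_in_place(data: bytearray, offset: int, old: bytes, new: bytes) -> None:
    """Replace old with new at offset, zero-filling the rest of the slot."""
    slot = len(old) + 1                      # text plus its null terminator
    if bytes(data[offset:offset + slot]) != old + b"\x00":
        raise ValueError("old string not found at this offset")
    if len(new) > len(old):
        raise ValueError("new string does not fit without moving other strings")
    data[offset:offset + slot] = new.ljust(slot, b"\x00")

buf = bytearray(b"Your Family\x00Next\x00")
overwrite_in_place(buf, 0, b"Your Family", b"Corleone")
print(bytes(buf))
```

Because the slot length is unchanged, every following string keeps its offset, which is why no pointer needs to be touched in this case.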

But it worked this way because I didn't edit previous or next string.
Again, I'll take the first 2 Corleone names as an example: changing "Big Greenie" Grinberg to Alessandro Esposito (length 23 to 20) and Michael Galli to Virgilio Zingaro (length 14 to 17). The pointer value for the first name is 63 83 (33635 as a little-endian uint16; 25475 is the big-endian reading) and 7A 83 (33658 little-endian; 31363 big-endian) for the next one.
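
The arithmetic behind this example can be checked with a few lines of Python. Read as little-endian, 63 83 is 0x8363, which matches the low 16 bits of the string's position at 00018363; the lengths include the null terminator.

```python
first_ptr = int.from_bytes(bytes.fromhex("6383"), "little")   # 0x8363
second_ptr = int.from_bytes(bytes.fromhex("7A83"), "little")  # 0x837A

old_first_len = len('"Big Greenie" Grinberg') + 1   # 23 with terminator
new_first_len = len("Alessandro Esposito") + 1       # 20 with terminator

# The original second pointer sits right after the first string.
print(first_ptr + old_first_len == second_ptr)       # True

# After shortening the first name, the second string can start 3 bytes earlier.
new_second_ptr = first_ptr + new_first_len
print(hex(new_second_ptr), new_second_ptr.to_bytes(2, "little").hex(" "))  # 0x8377 77 83
```

Starting the second string at 0x8377 leaves exactly 17 bytes before the position where the third string originally begins, which is the space Virgilio Zingaro needs.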

Michael Galli (Virgilio Zingaro) after pointer editing in game:

https://ibb.co/bXcqbKZ (original)

https://ibb.co/NYjSCx9 (edited)

The Alessandro Esposito string works just fine without pointer editing, since its beginning isn't moved. The screenshots for this one are missing because I forgot to take one in game; he was spawned first and I was focused on finding Zingaro.

The obvious limits for editing the decompressed file are file size and total number of strings. Are these the only limits?

Post by ikskoks » Thu Jan 28, 2021 12:49 pm

venixuc wrote: Does this mean that pointers mark beginning of the string, while null terminators mark its end?
Yes, exactly.

venixuc wrote: The obvious limits for editing the decompressed file are file size and total number of strings. Are these the only limits?
Yes. ;)

Post by venixuc » Fri Jan 29, 2021 1:56 am

Well, the game was released in April 2009 - if we exclude the trainers and the graphics enhancement (which is also an "external" mod like Silent's Patch), it took nearly 12 years to be able to mod it. Whether that was because it never became as popular as several other games of its genre, or because the file formats weren't easy to reverse, or both, I don't know.

I started playing it in 2015; by that time, the online mode was already history - not that I would've played it anyway, since I'm not into online gaming. But I've stumbled upon numerous sites where people discussed translating the game into other languages, and to this day no one has succeeded. This means that there are still people who care about this game (offline, single player mode).

Since the English localization is everything I need, I definitely won't be translating it; I'm only interested in editing names. However, I believe that this topic, with all the information, screenshots and attachments, is a great place to start for anyone who may be interested in translating the game into their language.

ikskoks, thanks for your patience and the time you invested in this topic.

Post by ikskoks » Fri Jan 29, 2021 10:08 am

venixuc wrote: ikskoks, thanks for your patience and the time you invested in this topic.
Sure, no problem. ;)

Post by venixuc » Sun Jan 31, 2021 2:29 am

One last thing. When refpack-brute-force is used for compression, the file size is 125 KB as opposed to 127 KB. There is no problem reimporting the smaller file into the .str.

However, after the compressed data, exactly 2901 zero bytes are added during reimporting to match the original size. This is a relatively large waste of space; these extra bytes could be used for replacing shorter strings with longer ones.

I decided to test that out, knowing what could happen: I replaced a 12-character string with another one (19 characters long) by overwriting it in a hex editor, and removed 7 zero bytes. When I launched the game, it didn't crash; but as I expected, the text stored after that string was displayed incorrectly because the strings were shifted without pointer editing (every pointer after that string has to have its value increased by 7).

The last pointer value inside decompressed file is at offset 00011B84, so doing this manually would take ages. Is there a faster way to update the pointer values after inserting a longer string?

Post by ikskoks » Sun Jan 31, 2021 11:11 am

venixuc wrote: Is there a faster way to update the pointer values after inserting a longer string?
Yeah, you could write a text importer that automates this task and updates the pointers as it runs.

Post by venixuc » Mon Feb 01, 2021 1:53 am

ikskoks wrote: Yeah, you could write a text importer that automates this task and updates the pointers as it runs.
That would be the best solution - basically a custom tool. However, it would require not only understanding the pointers and text strings, but also the data above them. According to https://www.metadata2go.com/, the decompressed file header looks like this:
https://ibb.co/7n8ZhvQ
I don't know what the data between the header and the pointers block represents.

When I edit a string and move its beginning, I only edit the uint16 pointer value - the other values update automatically. While the last pointer is at offset 00011B84, the last important pointer is at offset 0000F0A0 (it belongs to last text string stored inside NAM_1.dat - Cristo); other pointers belong to text strings located in NAM_2.bal.

Considering the fact that editing kapR and compressing it to NAM_1.dat mustn't move the NAM_2.bal content (it would mean that the .str size is changed), the pointers from 0000F0A4 to 00011B84 don't require value editing.

I never considered a custom text importer for kapR. If I replaced, for example, the string Steve Szakal (12 characters) at offset 00018CB1 (pointer at 0000A364) with a 19-character string, all pointers from 0000A368 to 0000F0A0 would need their values increased by 7.

I thought of something like this: replacing the string with hex editor, then running some kind of external script which would increase (or decrease) the pointer values at required offsets. It would be necessary to repeat this for every longer string, but still easier than reversing the kapR and writing a custom tool (in my opinion).
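
A pointer-shifting script of that kind could be sketched in Python as below. `shift_pointers` is a hypothetical helper; it assumes the table entries are 4 bytes wide and little-endian (the pointer offsets quoted in this thread are 4 bytes apart), so a carry out of the low 16 bits is handled as well.

```python
import struct

def shift_pointers(data: bytearray, first: int, last: int, delta: int) -> int:
    """Add delta to every 4-byte little-endian value from first to last (inclusive)."""
    if (last - first) % 4 != 0:
        raise ValueError("range is not a whole number of 4-byte entries")
    changed = 0
    for offset in range(first, last + 1, 4):
        (value,) = struct.unpack_from("<I", data, offset)
        struct.pack_into("<I", data, offset, value + delta)
        changed += 1
    return changed

# Toy table with three pointers; only the last two are shifted.
table = bytearray(struct.pack("<3I", 100, 112, 131))
shift_pointers(table, 4, 8, 7)
print(struct.unpack("<3I", table))  # (100, 119, 138)
```

For the Steve Szakal example the call would be roughly `shift_pointers(kapr, 0xA368, 0xF0A0, 7)` on a bytearray of the decompressed file, with 0000F0A0 being the last pointer this thread attributes to NAM_1.dat.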

I'm not quite sure what to do; but if everything else fails, I'll go back to manual pointer editing.
------------------------------------------------------------------------------------------------------------------------------------------------

The Granados mobster names are located inside the NAM_2.bal file. That file is just plain text; the pointers are inside NAM_1.dat. How do I find the corresponding NAM_2.bal pointers inside kapR?

Post by ikskoks » Mon Feb 01, 2021 6:30 pm

venixuc wrote: I thought of something like this: replacing the string with hex editor, then running some kind of external script which would increase (or decrease) the pointer values at required offsets. It would be necessary to repeat this for every longer string, but still easier than reversing the kapR and writing a custom tool (in my opinion).
That could work, but why bother when you would lose all the benefits of a text importer?

I would do it like this:

#Export (basically the same logic as in my Python tool)
1. Read the unknown data block from kapR and save it as "temp.bin"
2. Read the pointer table
3. Read all texts using the pointer table
4. Save the texts to the file "out.txt"

#Import
1. Read the data from "temp.bin"
2. Read "out.txt"
3. Calculate the new pointers from the string lengths
4. Save the data from temp.bin, then the new pointer table and then the new texts as a new "kapR" file.
5. (optionally) do some checks on file size etc.
6. (optionally) automate it even more by executing brute-force refpack at the end.
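
The export and import steps above can be sketched in Python, working on bytes in memory instead of "temp.bin" and "out.txt". This is an illustration under assumptions, not the tool itself: `export_kapr` and `import_kapr` are hypothetical names, pointers are assumed to be 4-byte little-endian absolute offsets, and the texts are assumed to follow the pointer table directly.

```python
import struct

def export_kapr(data: bytes, table_offset: int, count: int):
    """Return (leading data block, list of raw strings)."""
    head = data[:table_offset]
    texts = []
    for index in range(count):
        (ptr,) = struct.unpack_from("<I", data, table_offset + 4 * index)
        end = data.find(b"\x00", ptr)
        if end == -1:
            end = len(data)
        texts.append(data[ptr:end])
    return head, texts

def import_kapr(head: bytes, texts: list) -> bytes:
    """Rebuild the file: leading block, new pointer table, then the texts."""
    text_start = len(head) + 4 * len(texts)
    pointers = []
    body = bytearray()
    for text in texts:
        pointers.append(text_start + len(body))   # new pointer from lengths
        body += text + b"\x00"
    table = struct.pack(f"<{len(pointers)}I", *pointers)
    return head + table + bytes(body)

# Toy round trip: edit one string and let the pointers be recalculated.
toy = b"HEAD" + struct.pack("<2I", 12, 14) + b"1\x00Specs\x00"
head, texts = export_kapr(toy, table_offset=4, count=2)
texts[0] = b"One"
rebuilt = import_kapr(head, texts)
print(export_kapr(rebuilt, table_offset=4, count=2)[1])
```

A rebuild like this writes every string once per pointer and always adds a terminator, so it would need extra care wherever two pointers share a string or a string runs on without a terminator.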

venixuc wrote: The Granados mobster names are located inside the NAM_2.bal file. That file is just plain text; the pointers are inside NAM_1.dat. How do I find the corresponding NAM_2.bal pointers inside kapR?
Why do you think the pointers are in NAM_1.dat? I don't see any pointers at the beginning of this unknown data block.
Besides, the BAL file uses a completely different format (string + NULL), so I think that the pointers are somewhere in the game executable.

Post by venixuc » Tue Feb 02, 2021 1:54 am

ikskoks wrote: Why do you think the pointers are in NAM_1.dat? I don't see any pointers at the beginning of this unknown data block.
I don't think that the NAM_2.bal pointers are within the unknown data block inside NAM_1.dat; that wouldn't make sense, since that block is located above the pointer block. I think that the NAM_2.bal pointers are inside NAM_1.dat's pointer block.

When I used your Python script to get string pointers from NAM_1.dat, the last one listed was at offset 0000F0A0 (61600 in decimal):
https://ibb.co/mX7sdKH

However, that pointer isn't the last one inside NAM_1.dat; there are plenty more pointers from 0000F0A4 to 00011B84:

The next one:
https://ibb.co/g6Djmgw

The last one, before first text strings:
https://ibb.co/LPfmLV3

Roughly 2/3 of the text is inside NAM_1.dat; the remaining third is inside NAM_2.bal. I strongly suspect that NAM_2 is actually a .dat file; the .bal extension could be derived from its first 4 bytes (bal ):
https://ibb.co/GFwjmJZ

And, the last string inside NAM_1.dat is Cristo, but that isn't a standalone string; the full string is Cristobal Borbon - bal Borbon fragment is inside NAM_2. There is no null terminator after Cristo string inside NAM_1.dat, nor before bal Borbon inside NAM_2 - this probably means that pointer at 0000F0A0 covers the whole string.

If that's the case, then pointer at 0000F0A4 should belong to Cody Flowers string - the first standalone string inside NAM_2.

Also, this part of your Python script caught my attention:

Code: Select all

pointer_arr_offset = 36308  # These values are hardcoded
num_of_entries = 6324       # for decompressed NAM_1.DAT from The Godfather

So I decided to merge kapR with NAM_2.bal:
https://ibb.co/dMcVD1D
This could be useful for connecting the remaining pointers with their strings - but num_of_entries value has to be changed.

I've attached kapR merged with NAM_2. What would be the new value for num_of_entries?

ikskoks wrote: That could work, but why bother when you would lose all the benefits of a text importer?
My programming skills aren't the best - if I decided to write something as complex as that, it would probably take more time than editing every single pointer manually. And I obviously can't ask you to do it, because I'm not the only one on this forum who needs help.
But if you get some time, the pointer value-changing script would be enough. Don't take this as a request, either.

Post by ikskoks » Tue Feb 02, 2021 10:11 am

venixuc wrote: I don't think that the NAM_2.bal pointers are within the unknown data block inside NAM_1.dat; that wouldn't make sense, since that block is located above the pointer block. I think that the NAM_2.bal pointers are inside NAM_1.dat's pointer block.
However, that pointer isn't the last one inside NAM_1.dat; there are plenty more pointers from 0000F0A4 to 00011B84:
Yes, of course, now I see it. :D You are right. Good catch with this one. To be honest, I haven't noticed it earlier.

venixuc wrote: I've attached kapR merged with NAM_2. What would be the new value for num_of_entries?
New value should be 9069. That will cover merged files.
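
The value 9069 is consistent with the offsets quoted in this thread if each table entry is 4 bytes wide (an assumption based on the 4-byte spacing of the pointer offsets). A quick check in Python:

```python
pointer_arr_offset = 36308        # start of the pointer table (from the tool)
entry_size = 4                    # assumed width of one table entry

last_nam1_pointer = 0xF0A0        # last pointer listed by the exporter
last_pointer = 0x11B84            # last pointer before the first text string

nam1_entries = (last_nam1_pointer - pointer_arr_offset) // entry_size + 1
all_entries = (last_pointer - pointer_arr_offset) // entry_size + 1

print(nam1_entries)  # 6324
print(all_entries)   # 9069
```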

venixuc wrote: But if you get some time, the pointer value-changing script would be enough. Don't take this as a request, either.
I'll see what I can do. No promises, but I will check it. ;)

Post by venixuc » Wed Feb 03, 2021 3:40 am

Ok.
Now, excluding the header, the .str file is divided into 3 chunks: NAM_1.dat, NAM_2.bal and NAM_3.dat. Their sizes are 127 KB (131,004 bytes), 67.1 KB (68,808 bytes) and 60.7 KB (62,240 bytes) respectively.

It appears that not only the .str archive size matters, but also the sizes of these individual files. The kapR (decompressed NAM_1.dat) size is 217 KB (222,687 bytes); refpack-brute-force's higher compression ratio creates 2901 spare zero bytes at the end of the compressed file, which gives an opportunity to make several strings longer than the originals; that in turn requires updating the pointer values for the NAM_1.dat strings.

With NAM_2.bal, the case is different. The text inside it isn't compressed, and the file itself doesn't have spare zero bytes at its end. It can be compressed and reimported into the .str, but then the text inside it appears as compressed garbage in game.

The NAM_3.dat file consists only of zero bytes; it has the "LLIF" identifier at the beginning (FILL), which suggests that this file's purpose is to keep the .str size in check.
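
The LLIF/FILL observation suggests that four-character tags in this archive are stored byte-reversed. A tiny Python sketch of that reading; `fourcc` is a hypothetical helper, and applying it to tags other than LLIF is only an assumption.

```python
def fourcc(raw: bytes) -> str:
    """Return a 4-byte tag with its byte order reversed."""
    if len(raw) != 4:
        raise ValueError("a tag must be exactly 4 bytes")
    return raw[::-1].decode("ascii")

# Tags that appear in the hex dumps quoted in this thread.
for raw in (b"LLIF", b"kapR", b"COHS", b"TADS"):
    print(raw.decode("ascii"), "->", fourcc(raw))
```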

I tried to use NAM_3.dat's zero bytes as spare ones by making one string inside NAM_2.bal 2 bytes longer and removing 2 zero bytes at NAM_3.dat's end. When I launched the game, it didn't load and it didn't crash; instead, it froze. No keyboard combination worked when I tried exiting to desktop; the only way out was to use the reset button.

So, when the .str size is preserved but the sizes of its chunks are altered, the game locks up; it apparently has no error handling for this situation and gets stuck, neither loading nor crashing.

Is it possible that unknown data block inside NAM_1.dat contains size checks for these 3 chunks? If it does, then it would (theoretically) be possible to edit size checks for NAM_2.bal and NAM_3.dat in order to make the former larger and latter smaller.
