dealing with unicode filenames

finale00
M-M-M-Monster veteran
Posts: 2382
Joined: Sat Apr 09, 2011 1:22 am
Has thanked: 170 times
Been thanked: 307 times

dealing with unicode filenames

Post by finale00 »

Is there any way to deal with non-English filenames in quickbms?
Suppose I know which encoding the names use.
chrrox
Moderator
Posts: 2602
Joined: Sun May 18, 2008 3:01 pm
Has thanked: 57 times
Been thanked: 1411 times

Re: dealing with unicode filenames

Post by chrrox »

I have no issues with non-English characters in quickbms.
What error or problem are you getting?
finale00

Re: dealing with unicode filenames

Post by finale00 »

No error, I just don't know how to extract them properly.
Which commands do I use?
chrrox

Re: dealing with unicode filenames

Post by chrrox »

get name string
or
getdstring name length
what is the problem?
finale00

Re: dealing with unicode filenames

Post by finale00 »

All of the non-English filenames are garbled when I extract the launcherSkin.axp archive with this script.

The sample archives I posted only have a couple of non-English names, but all of the meshes have non-English names.
My computer can display Chinese, Korean and Japanese fine, so I think the problem is in how I am getting the names.

encoding: GB2312
chrrox

Re: dealing with unicode filenames

Post by chrrox »

I don't have any problem just logging normal file names. I am not sure what problem you are having.
finale00

Re: dealing with unicode filenames

Post by finale00 »

Do you remember any games with non-English names that you wrote scripts for?
Could be my computer settings too lol
Mippithedork
ultra-n00b
Posts: 4
Joined: Mon Feb 20, 2012 3:47 am
Has thanked: 4 times

Re: dealing with unicode filenames

Post by Mippithedork »

It seems I'm having similar issues: the names are not coming out in Chinese even though I have Chinese support installed on my PC. I need the file names to match what they are supposed to be inside the packs. I only get garbled text and symbols for names, no Chinese, and it also asks me to rename practically every single file in some of these axp's.

One more thing: I'm also looking for information on how to repack these files into the axp pack format after I've made my changes. Any ideas?
aluigi
VVIP member
Posts: 1916
Joined: Thu Dec 08, 2005 12:26 pm
Location: www.ZENHAX.com
Has thanked: 4 times
Been thanked: 661 times

Re: dealing with unicode filenames

Post by aluigi »

I'm quite interested in this point.

At the moment quickbms reads strings until it reaches a zero byte, so any ASCII/UTF-8 string gets read correctly.
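
To illustrate the point about zero-terminated strings, here is a minimal Python sketch (not quickbms code). It reads raw bytes up to the first zero and only then interprets them with a known charset. The helper name read_cstring and the sample entry are made up for the example, and GB2312 is assumed only because that is the encoding mentioned earlier in the thread.

```python
import io

def read_cstring(stream):
    """Read raw bytes up to (not including) the first zero byte or EOF."""
    out = bytearray()
    while True:
        b = stream.read(1)
        if not b or b == b"\x00":
            break
        out += b
    return bytes(out)

# Hypothetical archive entry: a GB2312 name, a zero terminator, then other data.
raw_entry = "模型.mesh".encode("gb2312") + b"\x00" + b"\x10\x00\x00\x00"
stream = io.BytesIO(raw_entry)

name_bytes = read_cstring(stream)

# The bytes are intact; what matters is which charset is used to interpret them.
name = name_bytes.decode("gb2312")                        # charset known: readable
garbled = name_bytes.decode("cp1252", errors="replace")   # wrong charset: mojibake
```

The bytes survive the read either way; the garbling in this sketch appears only when they are interpreted with a charset other than the one they were written in.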

Then, when writing the files, it cleans the filename using the function src\utils.h->clean_filename, which removes carriage returns and line feeds, removes any of the characters "?%*:|\"<>", strips all the spaces and dots at the end of the filename (for example "myname.txt "), and of course removes absolute paths and directory traversal sequences ("c:", "..").
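
The cleaning rules listed above can be approximated in a few lines of Python. This is only a sketch of the described behaviour, not the actual clean_filename from quickbms; applying the trailing space/dot rule to every path component and joining the result with "/" are assumptions made for the example.

```python
import re

# Characters described as removed: CR, LF and ?%*:|"<>
FORBIDDEN = set('?%*:|"<>\r\n')

def clean_filename(name):
    # Drop a leading drive prefix such as "c:" (absolute path).
    name = re.sub(r"^[A-Za-z]:", "", name)
    # Drop CR/LF and the forbidden characters.
    name = "".join(ch for ch in name if ch not in FORBIDDEN)
    # Split on either separator and strip trailing spaces and dots from
    # each component; "." and ".." become empty and are discarded, which
    # also removes leading separators and directory traversal.
    parts = [p.rstrip(" .") for p in re.split(r"[\\/]+", name)]
    return "/".join(p for p in parts if p)

cleaned = clean_filename("c:\\..\\data\\my?name.txt . ")
asian = clean_filename("模型/贴图.dds")
```

In this sketch non-ASCII characters pass through untouched, which matches the observation that the cleaning step by itself should not damage Asian names.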

So at first look I don't see any problems with "get NAME string".

The "unicode" type (16-bit, like the bytes "61 00 61 00") is a different story. It is certainly bugged, because it was written to work with English characters only, so:
getdstring NAME 0x20
set NAME unicode NAME
will produce an invalid name if it's Asian.
Maybe I will find a universal solution in the future, but I doubt it, because it looks like you must specify a charset for the unicode->utf8 conversion. If you set one, the solution is no longer "universal"; if you don't, any name that uses a different charset will be converted wrongly.
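
The difference between an English-only conversion and a full one can be shown with a short Python sketch. Both function names are hypothetical and neither is taken from quickbms: narrow_low_bytes keeps only the low byte of each 16-bit unit, which is one way an English-only conversion could behave, while decode_utf16le_field decodes the whole field as UTF-16LE. As far as Python's codecs are concerned, UTF-16LE to UTF-8 is a direct conversion; a charset such as GB2312 only comes into play when the target is a legacy code page.

```python
def narrow_low_bytes(raw):
    """English-only shortcut (assumed): keep the low byte of each 16-bit unit."""
    low = bytes(raw[0::2])
    return low.split(b"\x00")[0].decode("latin-1")

def decode_utf16le_field(raw):
    """Decode a fixed-size UTF-16LE field, stopping at the first 16-bit zero."""
    out = bytearray()
    for i in range(0, len(raw) - 1, 2):
        unit = raw[i:i + 2]
        if unit == b"\x00\x00":
            break
        out += unit
    return out.decode("utf-16-le")

# Fixed 0x20-byte fields, mirroring "getdstring NAME 0x20".
english_field = "aa".encode("utf-16-le").ljust(0x20, b"\x00")  # 61 00 61 00 ...
asian_field = "模型".encode("utf-16-le").ljust(0x20, b"\x00")

english_ok = narrow_low_bytes(english_field)    # the shortcut works here
asian_bad = narrow_low_bytes(asian_field)       # high bytes are lost
asian_ok = decode_utf16le_field(asian_field)    # full decode keeps the name
asian_utf8 = asian_ok.encode("utf-8")           # bytes usable as a UTF-8 name
```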

So, to sum up:
- get NAME string: should be 99% correct for Asian names
- set NAME unicode NAME: correct only for English characters