Do you see any ?s or Chinese in the output? Read this!


Reg2Inf converter will help a user convert a .reg file into a .inf file that can be used in updatepacks/addonpacks and for other uses.

Post by n7Epsilon » Thu Aug 17, 2006 2:47 am

Starting with version 0.35, Reg2Inf treats REG files a little differently.

Let me explain the difference between REG files that have a REGEDIT4 (v4) signature and those that have a Windows Registry Editor Version 5.00 (v5) signature.

REGEDIT4 files are designed to be readable by non-Unicode operating systems (NT/9x/ME) so are saved in ASCII encoding by REGEDIT when you export to v4 format (or using NT/9x/ME's REGEDIT).

In v4 REG files, REG_EXPAND_SZ (hex(2)) and REG_MULTI_SZ (hex(7)) entries are specified as comma-separated, null-terminated single-byte char hex values (and hence cannot support non-ASCII codepages).

On the other hand, in v5 REG files, REG_EXPAND_SZ (hex(2)) and REG_MULTI_SZ (hex(7)) entries are specified as comma-separated, null-terminated double-byte char hex values (and thus support all languages!).

A common problem is that sometimes when copying/pasting multiple REG tweaks from different sources, some tweaks may be in v4 format and others in v5 format and so on...

Now, mixing these together in a single REG file with an incorrect signature WILL lead to problems.

For example, I created a new key with a test multi-string value and exported it twice, once in each REG format...

Here's the output:
v5:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\n7Epsilon]
"TestMultiString"=hex(7):48,00,65,00,6c,00,6c,00,6f,00,21,00,00,00,00,00


v4:
REGEDIT4

[HKEY_LOCAL_MACHINE\SOFTWARE\n7Epsilon]
"TestMultiString"=hex(7):48,65,6c,6c,6f,21,00,00


Notice how in v5 each character of the word (Hello!) is expressed as two bytes (H = 48h,00h | e = 65h,00h), while in v4 each character is expressed as a single byte (H = 48h | e = 65h)...
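
To make the difference concrete, here is a minimal Python sketch (not part of Reg2Inf) that builds the hex(7) byte list for both formats. The function name multi_sz_hex is made up for illustration, and v4 is assumed to be plain ASCII as described above.

```python
def multi_sz_hex(strings, version):
    """Build the comma-separated hex(7) data for a list of strings.

    version 5 -> UTF-16LE, two bytes per character, 00,00 terminators
    version 4 -> ASCII (assumption), one byte per character, 00 terminators
    """
    if version == 5:
        encoding, nul = "utf-16-le", b"\x00\x00"
    elif version == 4:
        encoding, nul = "ascii", b"\x00"
    else:
        raise ValueError("version must be 4 or 5")
    # Each string gets its own terminator, then one extra ends the list.
    data = b"".join(s.encode(encoding) + nul for s in strings) + nul
    return ",".join(format(b, "02x") for b in data)


print(multi_sz_hex(["Hello!"], 5))
print(multi_sz_hex(["Hello!"], 4))
```

Both printed lines should match the hex(7) data in the two exports above.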

Now, the old Reg2Inf engine converted consecutive 00,00s into a \r\n, which it then used as the basis for splitting the lines of the REG_MULTI_SZ. This only worked for English letters (simply because all of them use 00 for the second byte, which satisfies the replacement condition). In other languages that use both bytes, it would not split or would return incorrect results.

So now, in v0.35 of Reg2Inf, if you have a REG_MULTI_SZ that uses double-byte characters, you MUST specify the Windows Registry Editor Version 5.00 signature or it will not convert correctly.
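
As a rough illustration (my own sketch, not the actual Reg2Inf code), one way to split a v5 REG_MULTI_SZ safely is to decode the whole buffer as UTF-16LE first and only then split on the null character, so the split always lands on a character boundary. The helper names are hypothetical, and the hex data is assumed to be on a single line with no backslash continuations.

```python
def hex_to_bytes(hex_text):
    """Turn '48,00,65,00' into the raw bytes it describes."""
    return bytes(int(tok, 16) for tok in hex_text.split(","))


def decode_multi_sz_v5(hex_text):
    """Decode v5 hex(7) data into a list of strings."""
    text = hex_to_bytes(hex_text).decode("utf-16-le")
    # Strings are separated by U+0000; drop the empty pieces that the
    # trailing terminators leave behind.
    return [s for s in text.split("\x00") if s]


sample = "48,00,65,00,6c,00,6c,00,6f,00,21,00,00,00,00,00"
print(decode_multi_sz_v5(sample))

# Why searching the raw bytes for 00,00 is fragile: 'a' followed by
# U+0100 encodes as 61,00,00,01, which contains 00,00 without any
# terminator being present.
tricky = "a\u0100".encode("utf-16-le")
print(b"\x00\x00" in tricky)
```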

The same thing goes for REG_EXPAND_SZ too...

* Now, what does Reg2Inf do when it encounters an incorrect condition?
- In case of an erroneous v4 signature, any 00s may incorrectly be interpreted as new lines in the REG_MULTI_SZ, and any non-ASCII hex codes will be converted to a ? (REG_EXPAND_SZ and REG_MULTI_SZ).

- In case of an erroneous v5 signature, you may get Chinese characters in the output INF instead of the result you expect.
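
Both symptoms are easy to reproduce outside Reg2Inf. The Python sketch below only imitates the behaviour described above (it is not the converter's real code, and read_as_v4 is a made-up helper): it reads the v4 sample as if it were double-byte, and the v5 sample as if it were single-byte.

```python
V4_HEX = "48,65,6c,6c,6f,21,00,00"
V5_HEX = "48,00,65,00,6c,00,6c,00,6f,00,21,00,00,00,00,00"


def to_bytes(hex_text):
    return bytes(int(tok, 16) for tok in hex_text.split(","))


def read_as_v4(data):
    """Treat every byte as one character: 00 ends a string, and any
    byte above 7Fh becomes '?' (assumed to mirror the behaviour above)."""
    text = "".join(chr(b) if b < 0x80 else "?" for b in data)
    return [piece for piece in text.split("\x00") if piece]


# Single-byte data under a v5 signature: byte pairs fuse into one
# 16-bit character each, which lands in the CJK range for this sample.
wrong_v5 = to_bytes(V4_HEX).decode("utf-16-le")
print([hex(ord(ch)) for ch in wrong_v5])

# Double-byte data under a v4 signature: every 00 looks like a
# terminator, so "Hello!" falls apart into one-letter lines.
print(read_as_v4(to_bytes(V5_HEX)))

# A non-ASCII byte (E9h, the low byte of U+00E9) turns into '?'.
print(read_as_v4(bytes([0xE9, 0x00])))
```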

So, to sum it up: use Unicode v5 strings whenever you can. If you have any blocks using the single-byte ordering, put them in a separate REG file that has the v4 signature, then convert both files simultaneously with the @"list.txt" method, and Reg2Inf will properly merge and convert both.

Pheww... long post :D . If anyone has any comments and questions, please do post them :) .

Post by MrWoo » Thu Dec 20, 2007 12:53 am

Interesting. I came across that. I had to find a way to read both ASCII and Unicode files, and output to INF. What I ended up doing was converting the ASCII to Unicode piece by piece.

Take for instance your reg test file that uses a unicode character of ى

This little gem is Unicode, meaning that it is expressed as 49,06. What ASCII shows would be maybe just a 49. So an extended character like that will never be found in ASCII, nor can you ever convert the value to an ASCII INF or registry file.

In order to be able to read the Unicode values, I would have to upgrade something like 20,2c,20 (" , ") to 20,00,2c,00,20,00 (again, Unicode " , "). That way I could inspect a hex value that was Unicode and, with not much more effort, upgrade UTF 8 to UTF 16. The process takes just a little bit longer when upgrading like that, but at least I can handle both varieties.
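
For anyone who wants to try that upgrade step, here is a small Python sketch of the idea as I understand it. It is not MrWoo's actual code; the names widen_hex and widen_bytes are made up, and the first helper assumes every input byte is plain ASCII (below 80h).

```python
def widen_hex(hex_text):
    """Add a 00 high byte after each single-byte value.

    Only valid when every value is plain ASCII (assumption).
    """
    return ",".join(tok + ",00" for tok in hex_text.split(","))


def widen_bytes(data, source_encoding="ascii"):
    """More general route: decode with the real source encoding
    (for example 'utf-8'), then re-encode as UTF-16LE."""
    return data.decode(source_encoding).encode("utf-16-le")


print(widen_hex("20,2c,20"))

# The Arabic letter from the test file, going from UTF-8 to UTF-16LE.
utf16 = widen_bytes("\u0649".encode("utf-8"), "utf-8")
print(",".join(format(b, "02x") for b in utf16))
```

The second helper is the safer choice whenever the source may contain anything beyond ASCII, since simply padding with 00 would give the wrong character there.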

I like your app. You have done a great job with it.

MrWoo

