- changed status to invalid
Writing a unicode string to file is incorrect
Issue #235
invalid
There is a problem writing a Unicode string to a text file in DWScript using the FileWrite() function.
// uses dwsFileFunctions
var s: string;
s := 'Test测试'; // Unicode (Chinese) string
Println(s);
var f := FileCreate('.\test.txt');
FileWrite(f, s);
FileClose(f);
The bytes in test.txt are now: $54 $65 $73 $74 $4B $D5
which is wrong. They should be:
ANSI (GBK, code page 936): $54 $65 $73 $74 $B2 $E2 $CA $D4
UTF8: $54 $65 $73 $74 $E6 $B5 $8B $E8 $AF $95
Workaround:
// dwsUtils.pas
procedure RawByteStringToScriptString(const s : RawByteString; var result : UnicodeString); overload;
begin
   if s = '' then
      result := ''
   else
      // BytesToScriptString(Pointer(s), Length(s), result);
      result := Utf8Decode(s); // use UTF8 function instead
end;

procedure ScriptStringToRawByteString(const s : UnicodeString; var result : RawByteString); overload;
var
   n : Integer;
begin
   if s = '' then
      result := ''
   else begin
      // n := Length(s);
      // SetLength(Result, n);
      // WordsToBytes(Pointer(s), Pointer(Result), n);
      result := Utf8Encode(s); // use UTF8 function instead
   end;
end;
Comments (2)
- repo owner: This is as designed. File functions treat strings as containers of byte data, so only the lower 8 bits of each character are used.
If you want to write with UTF-8 encoding you have to use the UTF-8 encoder, or one of the other encoders for other formats (like UTF-16).
- reporter: Thank you for the information.
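To make the "container of byte data" behaviour concrete, here is a minimal sketch in plain Delphi / Free Pascal (not DWScript itself) that reproduces both byte sequences from the report. The helpers LowBytes and HexDump are hypothetical names introduced only for this illustration. LowBytes is an assumption about what "only the lower 8 bits are used" amounts to, not the actual DWScript implementation.

```pascal
program LowByteDemo;

{$IFDEF FPC}{$MODE DELPHIUNICODE}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: keeps only the low 8 bits of each UTF-16 code unit,
// mimicking the "string as byte container" behaviour described in the thread.
function LowBytes(const s: UnicodeString): RawByteString;
var
  i: Integer;
begin
  SetLength(Result, Length(s));
  for i := 1 to Length(s) do
    Result[i] := AnsiChar(Ord(s[i]) and $FF);
end;

// Hypothetical helper: formats raw bytes as "$54 $65 ..." for comparison
// with the dumps quoted in the issue.
function HexDump(const b: RawByteString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(b) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + '$' + IntToHex(Ord(b[i]), 2);
  end;
end;

var
  s: UnicodeString;
begin
  // The two Chinese characters are given as code points (U+6D4B, U+8BD5)
  // so the result does not depend on the source file encoding.
  s := 'Test' + WideChar($6D4B) + WideChar($8BD5);
  Writeln('low bytes: ', HexDump(LowBytes(s)));
  Writeln('UTF-8:     ', HexDump(UTF8Encode(s)));
end.
```

The first line should match the bytes the reporter found in test.txt ($54 $65 $73 $74 $4B $D5), because $4B and $D5 are the low bytes of U+6D4B and U+8BD5. The second should match the UTF-8 sequence listed in the report. This suggests the file functions are not corrupting the data: the string has to be encoded explicitly before it is written.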