Reworked unicode support

Issue #17 new
Gennady Trafimenkov repo owner created an issue

At the moment there are a number of bugs related to Unicode string support: #4, #5, #6.

Most of them come from differences in the size of wchar_t on various platforms (2 bytes on Windows, 4 bytes on Linux, ? bytes on MinGW on Linux).
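To make the size difference concrete, here is a minimal sketch. The helper name wide_storage_bytes is hypothetical and not part of the project; it only reports how many bytes the characters of a wide string occupy on the compiler at hand.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the project): number of bytes occupied by
// the characters of a wide string on the current compiler/platform.
// The result depends on sizeof(wchar_t), which is compiler-specific.
std::size_t wide_storage_bytes(const std::wstring& s) {
    return s.size() * sizeof(wchar_t);
}
```

With the sizes listed above, wide_storage_bytes(L"save") would be 8 on Windows and 16 on Linux, so any code that writes raw wchar_t data into a file produces incompatible output across those platforms.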

There is also an issue with Android NDK, which doesn't properly support wchar_t.

It might be a good idea to:

  • not use wchar_t at all;
  • keep Unicode strings encoded in UTF-8;
  • convert to UTF-16 only where it is necessary (displaying text, writing into files such as saves).
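The approach in the list above could look roughly like this: strings live in std::string as UTF-8, and a conversion to UTF-16 happens only at the boundary. This is a sketch under assumptions, not the project's code. The function name utf8_to_utf16 is hypothetical, it needs C++11 for std::u16string, and a library such as UTF8-CPP could replace the hand-written decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the project): decode a UTF-8 string into
// UTF-16 code units. Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string& in) {
    // Smallest code point that legitimately needs a sequence of each length;
    // used to reject overlong encodings.
    static const std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;

        // The lead byte determines the sequence length.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");
        i += len;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points outside the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Because char16_t is always a 16-bit code unit, the result has the same layout on every platform, unlike wchar_t. Byte order still has to be chosen explicitly when the code units are written to a save file.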

The ISO/IEC 10646:2003 Unicode standard 4.0 says: "The width of wchar_t is compiler-specific and can be as small as 8 bits. Consequently, programs that need to be portable across any C or C++ compiler should not use wchar_t for storing Unicode text. The wchar_t type is intended for storing compiler-defined wide characters, which may be Unicode characters in some compilers."

Here is a seemingly good library for the job: UTF8-CPP (UTF-8 with C++ in a Portable Way). Another option is QString, but it is a huge dependency for the project.
