citrus_euctw's wcrtomb(3) doesn't understand CNS11643-* characters.

Issue #51 closed
Takehiko NOZAKI repo owner created an issue

euctw's wchar_t mapping is:

  • CNS11643-1 = 'G' << 24 = 0x47000000
  • CNS11643-2 = 'H' << 24 = 0x48000000
  • CNS11643-3 = 'I' << 24 = 0x49000000
  • CNS11643-4 = 'J' << 24 = 0x4a000000
  • CNS11643-5 = 'K' << 24 = 0x4b000000
  • CNS11643-6 = 'L' << 24 = 0x4c000000
  • CNS11643-7 = 'M' << 24 = 0x4d000000

but the wchar_t mask that wcrtomb uses is 0x7f000080, so every CNS11643-* character fails with EILSEQ.
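The arithmetic behind this can be sketched as follows. This is only an illustration, not the library's code: the helper names are hypothetical, and it assumes the charset is recognized by comparing the masked value with the bare tag from the list above. Because the low byte of a CNS11643 wide character has its high bit set, the 0x80 bit of the mask survives and the masked value no longer equals any tag.

```c
#include <assert.h>
#include <stdint.h>

/* Mask quoted in this report. */
#define EUCTW_WC_MASK 0x7f000080u

/* Tag for CNS11643 plane 1..7, i.e. 'G' << 24 through 'M' << 24. */
static uint32_t
cns_plane_tag(int plane)
{
        return (uint32_t)('G' + plane - 1) << 24;
}

/* Hypothetical helper: what is left of a wide character after masking. */
static uint32_t
masked_charset(uint32_t wc)
{
        return wc & EUCTW_WC_MASK;
}

/*
 * Assumption: the charset is identified by comparing the masked value
 * with the bare plane tag. The real library code may differ.
 */
static int
matches_plane(uint32_t wc, int plane)
{
        return masked_charset(wc) == cns_plane_tag(plane);
}

/* Self-check using the wide character from this report. */
static void
check_report_example(void)
{
        uint32_t wc = 0x4800a3a5u;

        assert(cns_plane_tag(2) == 0x48000000u);
        assert(masked_charset(wc) == 0x48000080u); /* 0x80 bit leaks through */
        assert(!matches_plane(wc, 2));             /* so the tag never matches */
}
```

Under that assumption, a mask of 0x7f000000 would leave only the tag byte, so the comparison could succeed.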

#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

int
main(void) {
        mbstate_t st;
        size_t i, n, ret;
        wchar_t wc = 0x4800a3a5;
        const char *test[] = {
            "\xa3\xa5",
            "\x8e\xa2\xa3\xa5"
        }, *s;
        char b[MB_LEN_MAX];

        setlocale(LC_CTYPE, "zh_TW.eucTW");
        for (i = 0; i < __arraycount(test); ++i) {
                s = test[i];
                n = strlen(s);
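                memset(&st, 0, sizeof(st));
                /* multibyte -> wide character */
                ret = mbrtowc(&wc, s, n, &st);
                assert(ret == n);
                memset(&st, 0, sizeof(st));
                /* wide character -> multibyte; this is the step that fails */
                ret = wcrtomb(b, wc, &st);
                assert(ret == n);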
        }
}
