citrus_euc module accepts illegal byte sequences.

Issue #12 closed
Takehiko NOZAKI repo owner created an issue

The citrus_euc module doesn't check the range of the second byte, so an illegal byte sequence (e.g. "\xA4\x0") may be wrongly converted to wchar_t (by mbrtowc) or to another encoding (by iconv). This bug is inherited from the 4.4BSD rune code, so it's 20 years old!
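
For illustration, the missing check could look roughly like the sketch below. This is not the citrus_euc source: `eucjp_char_len` is a hypothetical helper, and it assumes the usual EUC-JP layout (lead byte 0xA1-0xFE or SS2 0x8E for two-byte characters, SS3 0x8F for three-byte characters) with every trailing byte required to fall in 0xA1-0xFE. A stricter implementation might narrow the range after SS2 further.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of citrus_euc): examine at most n bytes
 * at s and return the length of the EUC-JP character found there,
 * 0 if the sequence is incomplete, or -1 if it is illegal.
 */
int
eucjp_char_len(const unsigned char *s, size_t n)
{
        size_t len, i;

        if (n == 0)
                return 0;
        if (s[0] < 0x80)
                return 1;                       /* ASCII */
        if (s[0] == 0x8f)
                len = 3;                        /* SS3 + two trailing bytes */
        else if (s[0] == 0x8e || (s[0] >= 0xa1 && s[0] <= 0xfe))
                len = 2;                        /* SS2 or two-byte character */
        else
                return -1;                      /* illegal lead byte */

        /* The step the report says is skipped: validate each trailing byte. */
        for (i = 1; i < len && i < n; i++) {
                if (s[i] < 0xa1 || s[i] > 0xfe)
                        return -1;
        }
        return (n < len) ? 0 : (int)len;
}
```

With a check like this, the sequence "\xA4\x0" is rejected because 0x00 is outside the assumed trailing-byte range, which is the case where mbrtowc should fail with EILSEQ.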

Comments (3)

  1. Takehiko NOZAKI reporter

    minimal test case:

    #include <assert.h>
    #include <errno.h>
    #include <locale.h>
    #include <stdio.h>
    #include <string.h>
    #include <wchar.h>
    
    int
    main(void)
    {
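            /* The literal's terminating NUL supplies the illegal second
             * byte, so mbrtowc sees "\xa4\x00". */
            const char *s = "\xa4";
            mbstate_t st;
            wchar_t wc;
            size_t ret;
            int e;

            setlocale(LC_CTYPE, "ja_JP.eucJP");
            memset(&st, 0, sizeof(st));
            errno = 0;      /* make sure a stale errno cannot satisfy the check */
            ret = mbrtowc(&wc, s, 2, &st);
            e = errno;
            printf("%zd\n", ret);
            printf("%s\n", strerror(e));
            /* A correct implementation rejects the sequence with EILSEQ. */
            assert(ret == (size_t)-1 && e == EILSEQ);
            return 0;
    }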
    