UTF-16 Mac file not detected

Issue #74 resolved
Former user created an issue

Original [issue 74](https://code.google.com/p/okapi/issues/detail?id=74) created by @ysavourel on 2009-05-26T20:00:45.000Z:

UTF-16 Mac files (with BOM) don't seem to be detected properly by the PlainText Filter. They appear to be read as if they were UTF-16LE files instead of UTF-16BE files. See an input file example.

Comments (4)

  1. Former user Account Deleted

    Comment [3.](https://code.google.com/p/okapi/issues/detail?id=74#c3) originally posted by @ysavourel on 2009-05-27T00:51:09.000Z:

    Good catch. It seems to be a problem with BOMNewlineEncodingDetector. It correctly detects the BOM but returns UTF-16, not UTF-16LE or UTF-16BE, so the stream seems to use the platform's UTF-16 byte order. We'll have to resolve this, since it affects most filters. That also means you shouldn't have to change the code of PlainTextFilter.
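
For illustration, the idea in the comment above (map the BOM to an explicit byte order rather than to plain UTF-16) could be sketched as follows. This is not the Okapi code: `BomSniffer`, `detectUtf16` and `decode` are hypothetical names, and the sketch handles only the two UTF-16 BOMs. It does not tell a UTF-32LE BOM (`FF FE 00 00`) apart from UTF-16LE.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Okapi: maps a UTF-16 BOM to an explicit byte order.
public class BomSniffer {

    /** Returns UTF-16BE or UTF-16LE according to the BOM, or null if no UTF-16 BOM is present. */
    public static Charset detectUtf16(byte[] data) {
        if (data == null || data.length < 2) {
            return null;
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        if (b0 == 0xFE && b1 == 0xFF) {
            return StandardCharsets.UTF_16BE; // FE FF: big-endian
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return StandardCharsets.UTF_16LE; // FF FE: little-endian
        }
        return null;
    }

    /** Decodes the bytes after the BOM with the explicit charset the BOM indicates. */
    public static String decode(byte[] data) {
        Charset charset = detectUtf16(data);
        if (charset == null) {
            throw new IllegalArgumentException("No UTF-16 BOM found");
        }
        // Skip the 2-byte BOM; the explicit BE/LE charsets do not consume it themselves.
        return new String(data, 2, data.length - 2, charset);
    }

    public static void main(String[] args) {
        // "Hi" encoded as UTF-16BE with a BOM.
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x48, 0x00, 0x69};
        System.out.println(detectUtf16(bigEndian) + ": " + decode(bigEndian));
    }
}
```

Returning the explicit charset means that a reader created after the BOM has been consumed does not have to guess the byte order.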
