- changed status to open
UTF-16 Mac file not detected
Original [issue 74](https://code.google.com/p/okapi/issues/detail?id=74) created by @ysavourel on 2009-05-26T20:00:45.000Z:
UTF-16 Mac files (with BOM) don't seem to be detected properly by the PlainText Filter. They appear to be read as if they were UTF-16LE files instead of UTF-16BE files. See the input file example.
Comments (4)
-
Account Deleted -
Account Deleted Comment [2.](https://code.google.com/p/okapi/issues/detail?id=74#c2) originally posted by @ysavourel on 2009-05-26T21:03:39.000Z:
The open (InputStream input) code that caused this was from the Properties Filter.
-
Account Deleted Comment [3.](https://code.google.com/p/okapi/issues/detail?id=74#c3) originally posted by @ysavourel on 2009-05-27T00:51:09.000Z:
Good catch. It seems to be a problem with BOMNewlineEncodingDetector. It detects the BOM correctly but returns UTF-16, not UTF-16LE or UTF-16BE, so the stream seems to use the platform's UTF-16 byte order. We'll have to resolve this: it affects most filters. That also means you shouldn't have to change the code of PlainTextFilter.
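For illustration only, here is a minimal sketch of mapping a UTF-16 BOM to an explicit byte order instead of the generic "UTF-16" name. This is not the BOMNewlineEncodingDetector code or the eventual fix; the class `BomSniffer` and the method `utf16CharsetFor` are hypothetical names, and the sketch ignores UTF-8 and UTF-32 BOMs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Okapi.
public class BomSniffer {

    /**
     * Returns the explicit-endian UTF-16 charset implied by a leading BOM,
     * or null if the first two bytes are not a UTF-16 BOM.
     * Note: FF FE 00 00 is also the UTF-32LE BOM; this sketch does not check for it.
     */
    public static Charset utf16CharsetFor(byte[] head) {
        if (head == null || head.length < 2) {
            return null;
        }
        int b0 = head[0] & 0xFF;
        int b1 = head[1] & 0xFF;
        if (b0 == 0xFE && b1 == 0xFF) {
            return StandardCharsets.UTF_16BE; // big-endian BOM
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return StandardCharsets.UTF_16LE; // little-endian BOM
        }
        return null;
    }

    public static void main(String[] args) {
        // BOM FE FF followed by the letter "A" in big-endian order.
        byte[] sample = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        Charset cs = utf16CharsetFor(sample);
        // Skip the 2-byte BOM when decoding with an explicit-endian charset.
        String text = new String(sample, 2, sample.length - 2, cs);
        System.out.println(cs.name() + " " + text); // prints "UTF-16BE A"
    }
}
```

Returning the explicit charset means the decoder no longer has to decide the byte order itself, which is the ambiguity described above.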
-
Account Deleted - changed status to resolved
Comment [4.](https://code.google.com/p/okapi/issues/detail?id=74#c4) originally posted by @ysavourel on 2009-06-11T00:08:24.000Z:
Fixed by the new RawDocument, good job, Jim.
Comment [1.](https://code.google.com/p/okapi/issues/detail?id=74#c1) originally posted by @ysavourel on 2009-05-26T20:26:53.000Z: