- attached testBOM.html
hasUTF8BOM not set in HTML
Original [issue 19](https://code.google.com/p/okapi/issues/detail?id=19) created by @ysavourel on 2009-03-10T02:55:20.000Z:
In the same HTML example file as for issue comment 18\. (ruby.htm), the original file is UTF-8 and has a BOM. When the START\_DOCUMENT event is sent, the resource's hasUTF8BOM() method returns false. It should return true. This prevents the writer from prepending the BOM as it should.
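To illustrate why the flag matters, here is a minimal sketch of a writer that prepends the UTF-8 BOM only when it is told the source had one. `BomAwareWriter`, `write`, and `toBytes` are hypothetical names used for illustration. They are not Okapi's writer API, and the real writer may work differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Okapi: shows how output depends on the BOM flag.
final class BomAwareWriter {
    // UTF-8 byte order mark: EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Writes the BOM first only when the flag says the source document had one.
    static void write(OutputStream out, String text, boolean hasUtf8Bom) throws IOException {
        if (hasUtf8Bom) {
            out.write(UTF8_BOM);
        }
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    // Convenience wrapper that returns the encoded bytes.
    static byte[] toBytes(String text, boolean hasUtf8Bom) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            write(buffer, text, hasUtf8Bom);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

With a flag that wrongly reports false, as described in this issue, such a writer would silently drop the BOM from the output.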
Comments (3)
Comment [2.](https://code.google.com/p/okapi/issues/detail?id=19#c2) originally posted by @ysavourel on 2009-03-11T17:21:26.000Z:
This code passes, so we know the detector is working. I will look at the filter next.
InputStream htmlStream = HtmlDetectBomTest.class.getResourceAsStream("/ruby.html");
BOMNewlineEncodingDetector bomDetector = new BOMNewlineEncodingDetector(htmlStream);
assertTrue(bomDetector.hasBom());
assertTrue(bomDetector.hasUtf8Bom());
assertFalse(bomDetector.hasUtf7Bom());
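For cross-checking the detector without any Okapi classes, a UTF-8 BOM can be recognized by comparing the first three bytes against EF BB BF. The sketch below uses only the JDK. `Utf8BomCheck` is a hypothetical name, and this is not how `BOMNewlineEncodingDetector` is implemented.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-alone check, independent of Okapi.
final class Utf8BomCheck {
    // True when the buffer starts with the UTF-8 BOM bytes EF BB BF.
    static boolean hasUtf8Bom(byte[] head) {
        return head.length >= 3
            && (head[0] & 0xFF) == 0xEF
            && (head[1] & 0xFF) == 0xBB
            && (head[2] & 0xFF) == 0xBF;
    }

    // Reads up to three bytes from the stream and checks them.
    // Note: the bytes read are consumed, so the caller must reopen or wrap the stream.
    static boolean hasUtf8Bom(InputStream in) throws IOException {
        byte[] head = new byte[3];
        int read = 0;
        while (read < 3) {
            int n = in.read(head, read, 3 - read);
            if (n < 0) {
                break; // end of stream before three bytes were available
            }
            read += n;
        }
        return read == 3 && hasUtf8Bom(head);
    }
}
```

The stream overload could be called with the same `getResourceAsStream("/ruby.html")` stream used above, provided a fresh stream is opened for each check.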
-
Account Deleted - changed status to resolved
Comment [3.](https://code.google.com/p/okapi/issues/detail?id=19#c3) originally posted by @ysavourel on 2009-03-11T22:08:09.000Z:
BaseMarkupFilter modified to better handle BOM and newline detection; correct values are now set in BaseFilter.
Comment [1.](https://code.google.com/p/okapi/issues/detail?id=19#c1) originally posted by @ysavourel on 2009-03-10T03:06:36.000Z:
Another example, with a little twist: the attached file has a UTF-8 BOM. As with the previous example, the hasUTF8BOM flag is not set. But here, the skeleton of the first document part comes with an extra \r\n before the <html> element (where the BOM is).