Commits

Tim Hatch committed 019c68d Merge

Merged in andyli/pygments-main/BOM (pull request #139: fixed #822 (remove BOM if present))

Files changed (3)

pygments/lexer.py

                 text = decoded
             else:
                 text = text.decode(self.encoding)
+        else:
+            if text.startswith(u'\ufeff'): 
+                text = text[len(u'\ufeff'):]
+        
         # text now *is* a unicode string
         text = text.replace('\r\n', '\n')
         text = text.replace('\r', '\n')

tests/examplefiles/BOM.js

+/* There is a BOM at the beginning of this file. */

tests/test_examplefiles.py

     text = text.strip(b('\n')) + b('\n')
     try:
         text = text.decode('utf-8')
+        if text.startswith(u'\ufeff'):
+            text = text[len(u'\ufeff'):]
     except UnicodeError:
         text = text.decode('latin1')
     ntext = []
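
The two hunks above apply the same idea: once the input is a unicode string, drop a single leading U+FEFF before any further processing. The following is a minimal standalone sketch of that pattern, not the code from the commit itself. The helper names `strip_bom` and `decode_source` are hypothetical and do not exist in Pygments; the sketch assumes Python 3, where the `u''` prefix is optional.

```python
import codecs

BOM = '\ufeff'  # U+FEFF, the character a UTF-8 BOM decodes to


def strip_bom(text):
    """Return text without one leading U+FEFF, if present (hypothetical helper)."""
    if text.startswith(BOM):
        return text[len(BOM):]
    return text


def decode_source(raw):
    """Decode bytes as UTF-8, falling back to latin1 (hypothetical helper).

    Mirrors the test-suite hunk: the BOM is stripped only on the UTF-8 path.
    """
    try:
        return strip_bom(raw.decode('utf-8'))
    except UnicodeError:
        return raw.decode('latin1')


if __name__ == '__main__':
    sample = codecs.BOM_UTF8 + b'/* There is a BOM at the beginning of this file. */'
    print(repr(decode_source(sample)))
```

When the input is known to be UTF-8 bytes, Python's built-in `utf-8-sig` codec removes the BOM during decoding. The explicit `startswith` check is still needed when the text arrives already decoded, which is the case the `else:` branch added to `pygments/lexer.py` handles.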