File character encoding changes from utf-8 to iso8859-1

Issue #13450 duplicate
Tomas Jakstas
created an issue

Bitbucket online editor handles file character encoding incorrectly

Adding a dot after "options" for some reason changes a pound sign elsewhere in the file as well. The file was correctly encoded before using the online editor.

Looking at diff on bitbucket shows:

- from just £99 per month. Perfect for new and small businesses. 
+ from just £99 per month. Perfect for new and small businesses. 

- options..
+ options...

file before commit:

file -bi src/pages/index.md
text/html; charset=utf-8

file after commit with online editor:

file -bi src/pages/index.md
text/html; charset=iso-8859-1

Modifying a file with local editor correctly changes just that line

-options...
+options....

Attached are 2 files: the first before modification, the second after modification with the online editor.

File character encoding should remain utf-8
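
For anyone who wants to confirm the same symptom without relying on `file -bi`, here is a minimal sketch in Python. It only checks whether a byte string decodes as strict UTF-8. The helper names `is_valid_utf8` and `file_is_valid_utf8` are made up for this illustration, and the sample text is taken from the diff above; it says nothing about how the Bitbucket editor works internally.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def file_is_valid_utf8(path: str) -> bool:
    """Hypothetical convenience wrapper: read a file as raw bytes and check it."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())


text = "from just £99 per month"

# The same text, stored the way it was before and after the online edit
# (assuming the "after" state really is ISO-8859-1, as `file` reported).
utf8_bytes = text.encode("utf-8")      # pound sign stored as 0xC2 0xA3
latin1_bytes = text.encode("latin-1")  # pound sign stored as the single byte 0xA3

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False: a lone 0xA3 is not valid UTF-8
```

Running `file_is_valid_utf8("src/pages/index.md")` on the two attached versions should separate them in the same way, assuming the path matches the local checkout.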

Comments (14)

  1. Sean Farley staff

    Ah, yes, this is due to us not being able to set the encoding in the repo (i.e. gitattributes). I've made significant progress on this over the last two months (basically, upgrading our pygit2 usage) but don't know how long the UI part will take.

  2. Giovanni Mascellani

    I am also experiencing the same bug. What is funny is that I do not understand what would lead BitBucket to decide that my file is Latin1 instead of UTF-8. Also, this bug shows up sometimes, and sometimes it does not, and I have no idea what triggers it. But at least the resulting file was always consistently encoded in one of UTF-8 or Latin1, which is much better than having some characters encoded with one of them and the rest with the other (or with other encodings).

  3. Giovanni Mascellani

    The bug is for example triggered in https://bitbucket.org/giomasce/test/commits/77e5a9429106b048bec0f2c7e726fbff5cb4af86?at=master. That commit was authored with the BitBucket editor, using Chrome (recent version) on Windows 7. The OS language is Italian. You can see an accented letter "ò" which is converted from UTF-8 to Latin 1 with the commit.

    In the same repository I tried to trigger the same bug many times, but only that commit is faulty. I don't know what it depends on.

  4. Giovanni Mascellani

    Actually, in my case it seems that the file is converted to CP1252 rather than Latin1. This seems to indicate an interaction with the computer's default encoding, since as I said the computer is running Windows.
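
To make the Latin1 versus CP1252 distinction concrete, here is a small Python sketch that prints the bytes a few characters produce under each encoding. The helper `encodings_of` is a name invented for this example. The euro sign is added as an assumption for illustration, since it does not appear in the files discussed above.

```python
def encodings_of(ch: str) -> dict:
    """Return the hex bytes of a character in UTF-8, Latin-1 and CP1252.

    A value of None means the character cannot be encoded in that codec.
    """
    result = {}
    for codec in ("utf-8", "latin-1", "cp1252"):
        try:
            result[codec] = ch.encode(codec).hex()
        except UnicodeEncodeError:
            result[codec] = None
    return result


# "£" and "ò" come from the reports above; "€" is an extra sample.
for ch in ("£", "ò", "€"):
    print(ch, encodings_of(ch))
```

For "£" and "ò" the Latin-1 and CP1252 bytes are identical (`a3` and `f2`), so a file containing only such characters cannot show which of the two encodings was used. The two differ only in the 0x80 to 0x9F range: "€" is the single byte `80` in CP1252 and has no Latin-1 encoding at all, so characters from that range are what would tell the two apart.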
