Commit messages in UTF-8 are garbled (BB-8715)

BitBucket does not display UTF-8 commit messages correctly. It interprets them as Latin-1, so every non-ASCII character turns into garbage and the affected commit messages become unreadable.

Since it is very simple to determine whether a string is valid UTF-8, BitBucket could easily decode valid messages as UTF-8, and only fall back to something obsolete like Latin-1 if the message is not valid UTF-8.

Note that such a change has no effect on commit messages in pure ASCII, which probably make up the vast majority: ASCII bytes decode to the same characters under both UTF-8 and Latin-1.
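
To illustrate the fallback described above, here is a minimal sketch in Python. It is not BitBucket's actual code; the function name `decode_commit_message` is made up for this example, and it assumes the commit message is available as raw bytes.

```python
def decode_commit_message(raw: bytes) -> str:
    """Hypothetical helper: prefer UTF-8, fall back to Latin-1."""
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid sequence,
        # so success means the message is valid UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails.
        return raw.decode("latin-1")


if __name__ == "__main__":
    utf8_bytes = "Gr\u00fc\u00dfe".encode("utf-8")
    latin1_bytes = "Gr\u00fc\u00dfe".encode("latin-1")

    # Both encodings of the same text decode to the same string.
    print(decode_commit_message(utf8_bytes) == decode_commit_message(latin1_bytes))

    # Pure ASCII is unaffected by the choice of decoder.
    print(decode_commit_message(b"Fix typo in README"))

    # What the bug looks like: UTF-8 bytes read as Latin-1.
    print(repr(utf8_bytes.decode("latin-1")))
```

A Latin-1 message that happens to also be valid UTF-8 would be decoded as UTF-8 here. That requires specific sequences of accented characters and should be rare in real commit messages.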

Comments (10)

  1. Timwi reporter

    Argh. I didn’t see the previous comment from 5 days earlier. I’ll have to test this to see if it’s fixed now. In the meantime, please delete my rash comment. (I tried deleting it myself, but it just stays there and doesn’t get deleted. Perhaps worth another bug report.)

  2. Timwi reporter

    Well, I have tested it now, and it has not been fixed. Change descriptions are still badly mangled (it looks like double-conversion to UTF-8).

