Issue #1790 resolved

Files invisible in web UI when folder contains international character (BB-802)

Morten Mertner
created an issue

I populated a repository with files from my Windows 7 partition. These contained Danish (aka international) characters. Mercurial turned out to be unfriendly in the way it obtains file names and thus checks in junk names.

In the Bitbucket source browser I noticed that one folder (imported from a folder with the Å character in its name) was empty, even though it actually contained several files locally. After I renamed the folder so that it no longer contained the character, the files inside became visible.

Note that Å is a particularly troublesome character because it has two Unicode representations: a single precomposed code point (A-WITH-RING, U+00C5) and a decomposed sequence (a plain A followed by a combining ring, U+030A), both representing the same character. These correspond to the NFC and NFD normalization forms, and each form produces different bytes when encoded as UTF-8. Because my files originated on a Mac, which converts every file name it saves to disk to NFD, their current form is NFD (Windows uses the form it's given, which avoids a ton of problems, but I digress).
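To illustrate the point about Å, here is a minimal Python sketch of how the two forms differ at the byte level and how normalizing both sides makes them compare equal. The folder name `Århus` and the helper `same_name` are hypothetical and only serve the example; this is not a description of how Mercurial or Bitbucket handle file names.

```python
import unicodedata

# Hypothetical folder name, written in both forms.
# NFC: one precomposed code point, U+00C5 (A WITH RING ABOVE).
nfc_name = "\u00c5rhus"
# NFD: plain "A" (U+0041) followed by U+030A (COMBINING RING ABOVE).
nfd_name = "A\u030arhus"

def same_name(a, b):
    """Compare two names after normalizing both to NFC (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The two strings render identically but are different code point sequences,
# so a naive comparison fails and their UTF-8 bytes differ.
print(nfc_name == nfd_name)        # False
print(nfc_name.encode("utf-8"))    # b'\xc3\x85rhus'
print(nfd_name.encode("utf-8"))    # b'A\xcc\x8arhus'

# After normalization they compare equal.
print(same_name(nfc_name, nfd_name))  # True
```

Any code that looks up a path by its raw bytes would treat these as two different names, which is one plausible way a folder could appear empty.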

On another note, I had serious problems getting the push to work at all. Anything above 30-50MB is lucky to get through; the server returns either 502 Bad Gateway or 500 Server Error. It took 14 (!!!) pushes to finally send all the repository data across the wire. Is this a Bitbucket or Mercurial problem?

Comments (7)

  1. Morten Mertner reporter

    In this day and age I actually think that calls for a :( rather than a :)

    I could understand if there was a problem with individual files > 2GB, but not being able to transmit a bunch of medium-sized files (none over 10MB) is just plain sad. Why is there even a hard limit on HTTP in the first place?

    And if this limit applies to all files being pushed, could we not have the limit raised to 1GB or something that better reflects the file/directory sizes of today?

    Maybe I should open a separate issue for this, as my rant is causing us to digress from the original bug description ;)

  2. Erik van Zijst staff

    This issue is nearing its 5th anniversary. A lot has happened in the meantime and most of the code that interacts with the repositories on disk has been rewritten.

    I'm going to preemptively close this issue, but please reopen it if you are still having encoding-related issues.
