handling of repository directory is very brittle

Issue #150 resolved
Former user created an issue

Observation: Rhodecode is very sensitive to unexpected situations in the Mercurial repository directory. It returns error 500 for 1) special characters such as ü in repository folder names and 2) defective repositories (in my case, the .hg directory was missing a file due to an aborted copy).

The error 500 occurs even when trying to display the login page, so Rhodecode becomes completely unusable whenever anything unexpected appears in the repo directory.

Expectation: 1) special characters in paths should be handled correctly and 2) defective repositories should not impair the use of non-broken repositories or of Rhodecode as a whole.
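
Expectation 2) could be sketched roughly as follows. This is only an illustration, not Rhodecode's actual scanning code: `scan_repositories` and `default_check` are hypothetical names, and the health check assumes that a usable Mercurial repository has a `.hg/store` directory. A real application would more likely try to open each repository with its VCS library and catch whatever that raises.

```python
import os


def default_check(repo_path):
    """Hypothetical health check: raise if the repository looks incomplete.

    Assumption: a usable Mercurial repository has a `.hg/store` directory.
    """
    if not os.path.isdir(os.path.join(repo_path, ".hg", "store")):
        raise ValueError("missing .hg/store")


def scan_repositories(root, check=default_check):
    """Return (usable, broken) for the direct subdirectories of `root`.

    `broken` maps each failing path to an error message, so one bad
    repository is reported instead of aborting the whole scan.
    """
    usable, broken = [], {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if not os.path.isdir(os.path.join(path, ".hg")):
            continue  # not a Mercurial repository at all
        try:
            check(path)
        except Exception as exc:  # deliberately broad: isolate any failure
            broken[path] = str(exc)
        else:
            usable.append(path)
    return usable, broken
```

The `broken` mapping is what a notification about defective repositories could be built from, while `usable` keeps the rest of the application working.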

Comments (5)

  1. Marcin Kuzminski repo owner

    As for 1), it's already known and there are two issues regarding it. It was fixed in 1.2beta, and that fix will be ported into the next bugfix release.

    2) I need more details. If the repository was corrupted, I find it correct that Rhodecode crashes. A missing file in the hg index is a serious corruption, and such a repository should either be fixed or not be present at all.

  2. Former user Account Deleted

    You're right Marcin, corrupted repositories obviously are a bad thing.

    However, I still think Rhodecode should recover more gracefully if it stumbles over one. Not even being able to log in anymore is too strong a reaction. Ideally, there would be some kind of notification about a broken repo, but even just ignoring it would be better, since at least Rhodecode as a whole would remain operational.

    In the particular case, the entire .hg/storage/ directory was missing, so it was even a rather simple case, as opposed to some funny database corruption.

  3. Marcin Kuzminski repo owner

    Please provide me with steps to reproduce error no. 2, so that I can make fixes to keep it from crashing the entire app.

  4. Thomas Waldmann

    Let me state that non-ASCII characters in file/directory names are a bad situation and either tricky or even impossible to handle, for the following reasons:

    On Windows, filenames can be Unicode (== NOT byte strings). There might be different APIs / behaviours. (A developer who uses Windows might be able to provide details; I usually do not use Windows.)

    On POSIX OSes, usually all filenames that are byte strings (with 0-termination) are acceptable. The OS does not really care about the encoding, but handles this transparently. In the best POSIX case, all filenames are UTF-8 encoded (ASCII is a subset of this), it is known that everything is UTF-8, and everything is fine. In the worst POSIX case, the fs encoding is not known and you have a historical mix of different encodings. This is often seen on Samba shares that are a bit older and lived through various encodings. So, even if you can decode some filenames correctly, there might be some (from an older encoding, like ISO-8859-1) that can't be decoded.

    Thus, the best way is to avoid that pain by not using anything non-ASCII on the filesystem.

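
The mixed-encoding situation described for POSIX filesystems can be illustrated with a small sketch. It assumes filenames are read as bytes (in Python 3, `os.listdir` returns bytes names when given a bytes path) and tries a list of candidate encodings in order. `decode_filename` and its default encoding list are hypothetical and chosen only for illustration.

```python
def decode_filename(raw, encodings=("utf-8", "iso-8859-1")):
    """Decode a filename given as bytes, trying each encoding in order.

    Returns (text, encoding_used). encoding_used is None when no candidate
    matched and undecodable bytes were replaced with U+FFFD.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # Nothing matched: keep the name displayable instead of raising.
    return raw.decode("ascii", errors="replace"), None


# A name written under UTF-8 and the same character written under ISO-8859-1.
print(decode_filename("ü".encode("utf-8")))
print(decode_filename(b"\xfc"))
```

Note that ISO-8859-1 accepts every byte sequence, so with the default list the final fallback is never reached, and a name written in some other legacy encoding would decode to the wrong characters without any error. That is the core of the problem: decoding can be made not to crash, but it cannot be made reliably correct when the encoding is unknown.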