Issue #2597 open

Better slugs (BB-1429)

Renato Pedigoni
created an issue

IMHO, characters with accents (á, ã, â, ...) should be converted to their ASCII equivalents, not to hyphens, as Django's `slugify` filter does. For example, "Maçã" should become "maca". Currently it gets converted to "ma--".

This happens in both project slugs and issue slugs.

Thank you
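
The requested behaviour can be sketched in a few lines of standard-library Python. This is a minimal illustration modelled loosely on the approach Django's `slugify` takes (decompose accented characters, then drop whatever is not ASCII); it is not Bitbucket's code. The names `naive_slug` and `ascii_slug` are hypothetical, and `naive_slug` is only a guess at logic that would produce the "ma--" result described above.

```python
import re
import unicodedata


def naive_slug(name: str) -> str:
    # Hypothetical stand-in for the current behaviour (an assumption):
    # every character outside a-z and 0-9 becomes a hyphen.
    return re.sub(r"[^a-z0-9]", "-", name.lower())


def ascii_slug(name: str) -> str:
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Dropping non-ASCII code points removes the marks and keeps the base letters.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Remove punctuation, then collapse whitespace and hyphen runs into one hyphen.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_only).strip().lower()
    return re.sub(r"[-\s]+", "-", cleaned)


print(naive_slug("Ma\u00e7\u00e3"))  # "Maçã"
print(ascii_slug("Ma\u00e7\u00e3"))
```

Note that this approach only strips diacritics. Characters with no ASCII base letter (Cyrillic, for example) are dropped entirely, so they would need a real transliteration step.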

Comments (24)

  1. David Chambers

    Excellent suggestion, Renato. The current behaviour is far from ideal.

    I've written a patch which replaces our custom slug function with Django's `slugify`. There are backwards-compatibility issues to address before it can be deployed, though. I'll keep you posted.

    David

  2. Timwi

    I disagree with the proposed fix. I think all characters should be kept as they are. “Maçã” should stay as “Maçã”.

    In the URL, this means encoding the Unicode characters to UTF-8 sequences; in this case, “Ma%C3%A7%C3%A3”.

  3. David Chambers

    I like the gist of your suggestion, Timwi. Do you mean to say, though, that a repository named "Foo Bar" would appear as "Foo%20Bar" in URLs? This doesn't seem ideal.

    What's required is a fundamental change to the way repository slugs are treated. At the moment they're derived from their respective repository names. This is the root of the problem. If instead we were to allow users to craft repository slugs, "maca", "Maca", and possibly even "Ma%C3%A7%C3%A3" could be used. The repository's name would simply become an optional label, in no way tied to the slug (this is exactly how a user's first and last names are treated). This suggests that when creating a repository one should be prompted for a slug rather than a name. If one wished one could go on to provide a name for the repository, though this would be unnecessary in many cases.

  4. Andrew Berezovskyi

    Hi everyone!

    I'm facing the same issue: non-ASCII characters are converted to `-`. Searching the issue tracker, I found the following duplicates of this issue:

    Since you are using Python, I suggest using the Unidecode package to handle non-ASCII characters correctly.

    P.S. I'm only suggesting this improvement for issue slugs.

  5. yark

    #8529 was mine, sorry. I couldn't find this issue in search.

    As that slug is optional and both

    https://bitbucket.org/site/master/issue/2597/better-slugs-bb-1429

    and

    https://bitbucket.org/site/master/issue/2597/better-wewewerwe

    will work fine, I think it's possible to leave Unicode characters as they are. Spaces can be replaced with a plus sign or an underscore, as Wikipedia does. Alternatively, transliterate Unicode to ASCII using something like iconv:

    iconv -f UTF-8 -t ASCII//TRANSLIT
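
The other option raised in the comments, keeping Unicode characters in the slug and only replacing spaces, could look roughly like the sketch below. It uses only the standard library's `urllib.parse`. The function names `unicode_slug` and `slug_to_url_path` are hypothetical, and the choice of underscore as the space replacement is an assumption taken from the Wikipedia comparison.

```python
from urllib.parse import quote, unquote


def unicode_slug(name: str) -> str:
    # Keep every character as typed; only whitespace runs become underscores.
    return "_".join(name.split())


def slug_to_url_path(slug: str) -> str:
    # Percent-encode the slug's UTF-8 bytes for use as one URL path segment.
    return quote(slug, safe="")


slug = unicode_slug("Ma\u00e7\u00e3")  # "Maçã"
path = slug_to_url_path(slug)
print(path)
print(unquote(path) == slug)  # decoding the path recovers the original slug
print(slug_to_url_path(unicode_slug("Foo Bar")))
```

Replacing spaces before encoding avoids the "Foo%20Bar" form, since underscores pass through percent-encoding unchanged.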
