Issues

Issue #2597 open

Better slugs (BB-1429)

Renato Pedigoni
created an issue

IMHO, characters with accents (á, ã, â, ...) should be converted to their ASCII equivalents, not to hyphens, as Django's slugify filter does. For example, "Maçã" should become "maca". Currently it gets converted to "ma--".
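
For illustration, the requested conversion can be approximated with the Python standard library alone. This is a minimal sketch, not Bitbucket's code or Django's actual implementation; the function name `ascii_slug` is hypothetical.

```python
import re
import unicodedata


def ascii_slug(value: str) -> str:
    """Hypothetical helper: fold accented characters to ASCII, then slugify."""
    # NFKD splits "ç" into "c" plus a combining cedilla.
    decomposed = unicodedata.normalize("NFKD", value)
    # Dropping everything outside ASCII removes the combining marks.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Remove anything that is not a word character, whitespace, or hyphen.
    cleaned = re.sub(r"[^\w\s-]", "", ascii_only).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", cleaned)


print(ascii_slug("Maçã"))  # maca
```

Characters without a Unicode decomposition (for example "ø" or "ß") are simply dropped by this approach, so a real implementation would likely need a transliteration table as well.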

This happens in both project slugs and issue slugs.

Thank you

Comments (32)

  1. David Chambers

    Excellent suggestion, Renato. The current behaviour is far from ideal.

    I've written a patch which replaces our custom slug function with Django's `slugify`. There are backwards-compatibility issues to address before it can be deployed, though. I'll keep you posted.

    David

  2. Timwi

    I disagree with the proposed fix. I think all characters should be kept as they are. “Maçã” should stay as “Maçã”.

    In the URL, this means encoding the Unicode characters to UTF-8 sequences; in this case, “Ma%C3%A7%C3%A3”.
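
As a quick check of the encoding mentioned here, Python's standard `urllib.parse.quote` percent-encodes non-ASCII text as UTF-8 by default. This is only a sketch of the encoding step, not of how Bitbucket builds its URLs.

```python
from urllib.parse import quote, unquote

name = "Maçã"
# quote() encodes the string as UTF-8 and percent-escapes each non-ASCII byte.
encoded = quote(name)
print(encoded)  # Ma%C3%A7%C3%A3

# The encoding is reversible, so the original name can be recovered.
assert unquote(encoded) == name
```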

  3. David Chambers

    I like the gist of your suggestion, Timwi. Do you mean to say, though, that a repository named "Foo Bar" would appear as "Foo%20Bar" in URLs? This doesn't seem ideal.

    What's required is a fundamental change to the way repository slugs are treated. At the moment they're derived from their respective repository names. This is the root of the problem. If instead we were to allow users to craft repository slugs, "maca", "Maca", and possibly even "Ma%C3%A7%C3%A3" could be used. The repository's name would simply become an optional label, in no way tied to the slug (this is exactly how a user's first and last names are treated). This suggests that when creating a repository one should be prompted for a slug rather than a name. If one wished one could go on to provide a name for the repository, though this would be unnecessary in many cases.
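
A minimal sketch of that decoupling, assuming a hypothetical `Repository` record: the slug is supplied by the user and validated on its own, while the name is an optional label. The class, its fields, and the slug rule below are all assumptions for illustration, not Bitbucket's data model.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed rule for illustration only: ASCII letters, digits, ".", "_" and "-".
SLUG_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


@dataclass
class Repository:
    slug: str                   # chosen by the user, used in URLs
    name: Optional[str] = None  # free-form label, never used to build the slug

    def __post_init__(self) -> None:
        if not SLUG_PATTERN.fullmatch(self.slug):
            raise ValueError(f"invalid slug: {self.slug!r}")

    @property
    def display_name(self) -> str:
        # Fall back to the slug when no label was provided.
        return self.name or self.slug


repo = Repository(slug="maca", name="Maçã")
print(repo.slug, repo.display_name)  # maca Maçã
```

Because the slug is never derived from the name, renaming the repository would not change its URL under this design.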

  4. Andrew Berezovskyi

    Hi everyone!

    I'm facing the same issue: non-ASCII characters are converted to "-". Searching the issue tracker, I found the following duplicates of this issue:

    Since you are using Python, I suggest using the Unidecode package to handle non-ASCII characters correctly.

    P.S. I'm only suggesting this improvement for issue slugs.

  5. yark

    Issue #8529 was mine. Sorry, I couldn't find this issue when searching.

    As that slug is optional and both

    https://bitbucket.org/site/master/issue/2597/better-slugs-bb-1429

    and

    https://bitbucket.org/site/master/issue/2597/better-wewewerwe

    will work fine, I think it's possible to leave Unicode symbols as they are. Spaces can be replaced with a plus sign or an underscore; see how Wikipedia does it. Alternatively, transliterate Unicode to ASCII using something like iconv:

    iconv -f UTF-8 -t ASCII//TRANSLIT
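
The point that the slug is optional can be sketched as a route that resolves an issue by its number and ignores whatever slug follows. The URL pattern is inferred from the two example links in this comment, and the helper name `issue_id_from_path` is hypothetical; the actual Bitbucket routing is not shown in this thread.

```python
import re
from typing import Optional

# Assumed URL shape: /<owner>/<repo>/issue/<number>[/<any slug>]
ISSUE_PATH = re.compile(
    r"^/(?P<owner>[^/]+)/(?P<repo>[^/]+)/issue/(?P<id>\d+)(?:/[^/]*)?/?$"
)


def issue_id_from_path(path: str) -> Optional[int]:
    """Hypothetical helper: return the issue number, ignoring any trailing slug."""
    match = ISSUE_PATH.match(path)
    return int(match.group("id")) if match else None


print(issue_id_from_path("/site/master/issue/2597/better-slugs-bb-1429"))  # 2597
print(issue_id_from_path("/site/master/issue/2597/better-wewewerwe"))      # 2597
```

Since the slug segment never takes part in the lookup, it could contain Unicode characters without affecting which issue is served.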

  6. Davi Alexandre

    I think David Chambers is correct. The best solution would be to make the slug totally independent of the repository name. Besides fixing the problem with accents, it would also make the following situation possible:

    Repository Name: SGL - Sistema Gerenciador de Laboratórios
    Slug: sgl

  7. Oskar Thornblad

    I don't have a problem with the dashes, but you currently can't ping people with ASCII usernames but non-ASCII names, and you can't auto-link issues with non-ASCII chars in the issue name.

    If I type Johan Åkerberg, it doesn't convert to a link because of his name, even though the username is ASCII. (edit: he's now pingable because he changed Å to A to make it work)

    If I type issue #123 it won't link to the issue if the issue's title has non-ASCII characters.

    That is more than an aesthetic issue.
