ASCII

Issue #3 open
Former user created an issue

Hello. Please do this:

class UrlHelper(object):
    def __init__(self, full_path):
        # If full_path is an UrlHelper instance, extract the full path from it
        if type(full_path) is UrlHelper:
            full_path = full_path.get_full_path()

        # parse the path
        r = urlparse.urlparse(full_path.encode('ASCII'))

Because the current behaviour is incorrect for URLs like this one, for example: ?search_word=слово&

Comments (27)

  1. Branko Vukelic

    Thanks for reporting this.

    Could you please write a short test case for this, showing the API call and the expected output? It doesn't need to be a pull request; feel free to write it in the comments here.

  2. Branko Vukelic

    @iaa I'm working on this issue right now. Sorry I wasn't paying closer attention yesterday; I was in the middle of some work. Anyway, I believe UrlHelper works correctly without your fix. The URL is correctly encoded using % escape sequences, and in most browsers, when you paste a URL with % escape sequences, you see the correct Unicode representation.

    I think it was mentioned on the Bottle issue tracker once, but URLs are just a stream of bytes with no predefined encoding scheme, so we can't simply assume UTF-8 or anything along those lines. Percent-encoding seems to be the generally accepted solution to this problem right now.

    In all methods that should return Python unicode strings, we still need to make sure that's the case, so I'll be adding your tests to the test case. So far, it looks like they will all pass.
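
To illustrate the point about URLs being bytes with no predefined encoding, here is a minimal sketch. It assumes Python 3's `urllib.parse` (the thread itself uses the Python 2 `urlparse` module), and the choice of cp1251 as the "other" encoding is only an example.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

word = "слово"

# quote() percent-encodes the UTF-8 bytes of the string by default.
encoded_utf8 = quote(word)

# The same word under a different (assumed) encoding yields different escapes.
encoded_cp1251 = quote(word, encoding="cp1251")

# The escapes only describe bytes; the receiver has to pick an encoding.
raw_bytes = unquote_to_bytes(encoded_utf8)
decoded_right = unquote(encoded_utf8)                     # UTF-8 by default
decoded_wrong = unquote(encoded_utf8, encoding="cp1251")  # wrong guess

print(encoded_utf8)
print(encoded_cp1251)
print(decoded_right, decoded_wrong)
```

The percent-encoded form is unambiguous as a byte sequence, but turning it back into text still requires agreeing on an encoding.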

  3. Branko Vukelic

    Ok, one issue that definitely needs to be dealt with is passing pre-encoded URLs to update_query_data(). We do want to make sure this works:

    u.update_query_data(redir='/foo/bar/?q=два+слова')
    

    (currently doesn't).
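
One plausible expectation for this case, sketched with Python 3's standard library rather than with UrlHelper itself (whose internals are not shown in this thread): a URL passed as a parameter value should be escaped as a whole, so that its own `?`, `=` and `+` survive a round trip.

```python
from urllib.parse import urlencode, parse_qs

redir = "/foo/bar/?q=два+слова"

# urlencode() escapes the value, including the nested '?', '=' and '+'.
query = urlencode({"redir": redir})

# Parsing the query string back should return the original value unchanged.
round_tripped = parse_qs(query)["redir"][0]

print(query)
print(round_tripped)
```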

  4. Алексей Ионов

    If my URL is

    /foo?foo=1&bar=слово
    

    then template tags, for example add_params, give:

    /foo?foo=1&bar=слово
    
  5. Branko Vukelic

    This is strange. I just added this test case:

    def test_full_path_with_unicode_query_param(self):
        u = UrlHelper('/foo')
        u.update_query_data(foo='слово')
        self.assertEqual(u.get_full_path(),
                         '/foo?foo=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE')
    

    Works as expected. This is what the {% add_params %} tag is doing behind the scenes, so it's definitely not a tag issue.

  6. Branko Vukelic

    Anyway, between 0.0.7 and 0.0.8, we fixed issue #2, which dealt with a double-encoding issue. I released the patch a short while ago. Perhaps you are seeing that issue?

    EDIT: Sorry, was too late with my comment. :)

  7. Branko Vukelic

    Can you give me some examples of what i.name and sort might be? And also the full path for that page with any query strings that are already there.

  8. Branko Vukelic

    That's totally confusing. I see no reason why that shouldn't work... I've just run all the tests on Django 1.4.x, 1.5.x and 1.6.x, and they all pass except for the one described in #4, which is expected to fail.

  9. Branko Vukelic

    Well, what can I say. I'll add the remaining tests, and push them and then poke around a bit. If you find out something new, let me know.

  10. Branko Vukelic

    Hey, I completely forgot about this. There was a patch that hasn't been released yet. Could you try getting the master branch and see if that works for you?

  11. Branko Vukelic

    You know what? Sorry, I'm totally tripping. All patches have made it into 0.0.8. Sorry for the confusion.

    Anyhow, I pushed all the unit tests, so if you care, you could try and run them on your end, make sure everything passes (except test_with_query_params_in_url_unicode).

  12. Алексей Ионов

    Ok, the tests pass, but when the URL is:

    ?search_by_field=name&search_word=слово
    

    the tag displays

    ?search_word=%C3%91%C2%81%C3%90%C2%BB%C3%90%C2%BE%C3%90%C2%B2%C3%90%C2%BE&search_by_field=name
    

    although request.get_full_path gives

    ?search_by_field=name&search_word=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE
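
The two encodings above differ in a recognisable way: the longer one is what you get when the UTF-8 bytes of the word are misread as Latin-1 text and then encoded to UTF-8 a second time. A minimal sketch reproducing both strings, assuming Python 3's `urllib.parse`; where in the tag or in UrlHelper this misreading happens is not established by this thread.

```python
from urllib.parse import quote

word = "слово"
utf8_bytes = word.encode("utf-8")

# Correct: percent-encode the UTF-8 bytes once.
correct = quote(utf8_bytes)

# Double encoding: treat the UTF-8 bytes as Latin-1 characters,
# then encode those characters to UTF-8 again before escaping.
misread = utf8_bytes.decode("latin-1")
double_encoded = quote(misread.encode("utf-8"))

print(correct)
print(double_encoded)
```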
    