ASCII

Issue #3 open

Former user created an issue 2014-02-12

Hello. Please do this:

```
class UrlHelper(object):
    def __init__(self, full_path):
        # If full_path is an UrlHelper instance, extract the full path from it
        if type(full_path) is UrlHelper:
            full_path = full_path.get_full_path()

        # parse the path
        r = urlparse.urlparse(full_path.encode('ASCII'))
```

Because the result is incorrect for a URL such as this one: ?search_word=слово&

Comments (27)

Branko Vukelic
Thanks for reporting this.

Could you please write a short test case for this, showing the API call and the expected output? It doesn't need to be a pull request; feel free to write it in the comments here.
- 2014-02-12T15:23:45+00:00
Branko Vukelic
- changed status to open
- 2014-02-12T15:23:58+00:00
Алексей Ионов
take a look at this http://yadi.sk/d/RTE_NBp2HmwFJ
- 2014-02-12T16:07:53+00:00
Branko Vukelic
Thanks for that. I'll make the necessary corrections asap.
- 2014-02-12T16:16:01+00:00
Branko Vukelic
- changed milestone to 0.1
- marked as enhancement
- 2014-02-12T16:16:39+00:00
Branko Vukelic
- assigned issue to Branko Vukelic
- 2014-02-12T16:16:47+00:00
Алексей Ионов
Thank you very much!
- 2014-02-12T17:21:50+00:00
Branko Vukelic
@iaa I'm working on this issue right now. Sorry I wasn't paying closer attention yesterday; I was in the middle of some work. Anyway, I believe UrlHelpers work correctly without your fix. The URL is correctly encoded using % sequences, and in most browsers, when you paste a URL with % escape sequences, you see the correct Unicode representation.

I think it was mentioned on the Bottle issue tracker once, but URLs are just a stream of bytes with no predefined encoding scheme, so we can't simply assume UTF-8 or anything along those lines. It seems that % encoding is the generally accepted solution to this problem right now.

In all methods that should return Python unicode strings, we still need to make sure that's the case, so I'll be adding your tests to the test case. So far, it looks like they will all pass.
- 2014-02-13T08:15:01+00:00
Branko Vukelic
To expand on my 'correctly works' bit. Here's your input URL:
```
/foo?foo=1&bar=слово
```
and here is the expected output of the UrlHelpers.get_query_string call.
```
foo=1&bar=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE
```
If you take your browser to example.com/?foo=1&bar=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE, you'll see that the % sequences are decoded properly.
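As a minimal sketch of the percent-encoding described above, assuming Python 3's `urllib.parse` and a UTF-8 encoding of the query value (the thread itself uses Python 2's `urlparse`, and this is not the UrlHelper implementation):

```python
from urllib.parse import quote, unquote, urlencode

# Percent-encode the Cyrillic word; quote() encodes str input as UTF-8 by default.
word = 'слово'
encoded_word = quote(word)

# Build the whole query string; a list of pairs keeps the parameter order.
query = urlencode([('foo', '1'), ('bar', word)])

# Decoding the escape sequences as UTF-8 recovers the original text.
decoded_word = unquote(encoded_word)

print(encoded_word)
print(query)
print(decoded_word)
```

The UTF-8 choice here is an assumption of the sketch; as noted above, a URL does not itself declare an encoding.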
- 2014-02-13T08:17:10+00:00
Branko Vukelic
Ok, one issue that definitely needs to be dealt with is passing pre-encoded URLs to update_query_data(). We do want to make sure this works:
```
u.update_query_data(redir='/foo/bar/?q=два+слова')
```
(currently doesn't).
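
For illustration only, here is how a URL that itself contains a query string can be carried as a parameter value and recovered unchanged. It uses Python 3's standard library rather than UrlHelper; the name `redir` comes from the call above, and everything else is an assumption of this sketch:

```python
from urllib.parse import parse_qs, urlencode

# A value that is itself a URL with its own query string.
redir = '/foo/bar/?q=два+слова'

# urlencode() escapes '/', '?', '=', '+' and the non-ASCII characters,
# so the nested URL cannot be confused with the outer query string.
query = urlencode({'redir': redir})

# Parsing the outer query string gives the nested URL back unchanged.
recovered = parse_qs(query)['redir'][0]

print(query)
print(recovered)
```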
- 2014-02-13T08:27:12+00:00
Алексей Ионов
If my URL is
```
/foo?foo=1&bar=слово
```
then template tags, for example add_params, output:
```
/foo?foo=1&bar=ÑÐ»Ð¾Ð²Ð¾
```
- 2014-02-13T08:44:07+00:00
Branko Vukelic
Are you using the latest v0.0.8?
- 2014-02-13T09:12:23+00:00
Branko Vukelic
Also, can you paste the template code you are using?
- 2014-02-13T09:13:11+00:00

Branko Vukelic
This is strange. I just added this test case:
```
def test_full_path_with_unicode_query_param(self):
    u = UrlHelper('/foo')
    u.update_query_data(foo='слово')
    self.assertEqual(u.get_full_path(),
                     '/foo?foo=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE')
```
Works as expected. This is what the {% add_params %} tag is doing behind the scenes, so it's definitely not a tag issue.
- 2014-02-13T09:18:13+00:00

Алексей Ионов
yes, 0.0.8

template code: http://yadi.sk/d/-iAWoXBSHp278
- 2014-02-13T09:20:56+00:00
Branko Vukelic
Anyway, between 0.0.7 and 0.0.8, we fixed issue #2, which dealt with a double-encoding issue. I released the patch a short while ago. Perhaps you are seeing that issue?

EDIT: Sorry, was too late with my comment. :)
- 2014-02-13T09:21:42+00:00
Branko Vukelic
Can you give me some examples of what i.name and sort might be? And also the full path for that page with any query strings that are already there.
- 2014-02-13T09:24:21+00:00
Алексей Ионов
for example 'id' and 'asc'
- 2014-02-13T09:26:28+00:00
Branko Vukelic
That's totally confusing. I see no reason why that shouldn't work... I've just run all the tests on Django 1.4.x, 1.5.x and 1.6.x, and they all pass except for the one described in #4, which is expected to fail.
- 2014-02-13T09:32:49+00:00
Алексей Ионов
For
```
test_full_path_with_unicode_query_param
```
yes, it is strange.
- 2014-02-13T09:35:16+00:00
Branko Vukelic
Well, what can I say. I'll add the remaining tests, and push them and then poke around a bit. If you find out something new, let me know.
- 2014-02-13T09:51:59+00:00
Алексей Ионов
OK, agreed.
- 2014-02-13T09:58:31+00:00
Branko Vukelic
Hey, I completely forgot about this. There is a patch that hasn't been released yet. Could you try getting the master branch and see if that works for you?
- 2014-02-13T10:05:10+00:00
Branko Vukelic
You know what? Sorry, I'm totally tripping. All patches have made it into 0.0.8. Sorry for the confusion.

Anyhow, I pushed all the unit tests, so if you like, you could run them on your end and make sure everything passes (except test_with_query_params_in_url_unicode).
- 2014-02-13T10:23:07+00:00
Алексей Ионов
Well, I'll try
- 2014-02-13T10:28:32+00:00

Алексей Ионов
OK, the tests pass, but when the URL is
```
?search_by_field=name&search_word=слово
```
the tag outputs
```
?search_word=%C3%91%C2%81%C3%90%C2%BB%C3%90%C2%BE%C3%90%C2%B2%C3%90%C2%BE&search_by_field=name
```
although request.get_full_path gives
```
?search_by_field=name&search_word=%D1%81%D0%BB%D0%BE%D0%B2%D0%BE
```
- 2014-02-16T21:59:23+00:00

Branko Vukelic
Ok, I get it now. The quoted characters are being quoted again. I will look into it.
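
One way the output reported above can arise, sketched with Python 3's `urllib.parse`: the UTF-8 bytes of the word are read back as Latin-1 text (one character per byte) and then percent-encoded a second time. Whether this is the actual code path inside the library is an assumption; the sketch only reproduces the two strings from this thread.

```python
from urllib.parse import quote

word = 'слово'

# Correct, single encoding: UTF-8 bytes turned into % sequences.
quoted_once = quote(word)

# The suspected mistake: UTF-8 bytes interpreted as Latin-1 characters.
mojibake = word.encode('utf-8').decode('latin-1')

# Quoting the mis-decoded text encodes every byte a second time.
quoted_twice = quote(mojibake)

# Reversing the wrong decode step recovers the original word.
repaired = mojibake.encode('latin-1').decode('utf-8')

print(quoted_once)
print(quoted_twice)
print(repaired)
```

Each original byte ends up as two % sequences, which is why the broken value is exactly twice as long as the correct one.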
- 2014-02-17T10:22:57+00:00

Assignee: Branko Vukelic

Type: enhancement

Priority: major

Status: open

Component: Entire project

Milestone: 0.1

Version: 0.0.8

Votes: 0

Watchers: 2