Reserved characters / spaces are double-escaped
Hey, I'm really loving your template tags. :)
I did just attempt using add_params with a datetime, though, and I hit a snag: it comes out as:
/?my_date=2013-10-01%252000%3A00%3A00
What tipped me off was the %2520
where there should have been a space (separating the date from the time). Of course, this can't be parsed naturally on the receiving end.
I dug deeper, and it looks like the problem is that update_query_data stores the values using iri_to_uri (which encodes the space), and then get_full_path uses urlquote (which re-encodes the space's %20 and encodes the time's colons as well).
So the overall process looks like this:
>>> from url_tools.helper import UrlHelper
>>> url = UrlHelper('/')
>>> url.update_query_data(test='1111-11-11 22:22:22')
>>> print url.query_dict
<QueryDict: {u'test': [u'1111-11-11%2022:22:22']}>
>>> print url.get_full_path()
/?test=1111-11-11%252022%3A22%3A22
>>> print url.get_query_string() # more specifically (get_full_path calls this)
test=1111-11-11%252022%3A22%3A22
I took a crack at addressing this myself (so as not to be an empty complainer). I chose to alter update_query_data since the other setters (i.e. overload_params) don't seem to do any escaping, rather than changing the outgoing get_query_string.
Specifically, my changes were to lines 37 and 39 of helpers.py, removing iri_to_uri and passing in the raw val / v instead:
def update_query_data(self, **kwargs):
for key in kwargs:
val = kwargs[key]
if hasattr(val, '__iter__'):
self.query_dict.setlist(key, [v for v in val])
else:
self.query_dict[key] = val
This solved the problem totally for me, but I'm only using add_params
and del_params
via the template tags.
Thoughts? I can make a formal change if you agree!
Comments (17)
-
-
Guys, I'm really sorry. For some reason I missed a few issues that were filed on various repos on BitBucket, including this one.
@hjkelly Yes, the change looks good.
-
-
assigned issue to
-
assigned issue to
-
- changed status to open
-
reporter Branko, looks like you're taking care of the code tweaks. If that's wrong, lemme know and I can fumble through the contributory process and try to get a patch to you. :)
-
@hjkelly If you'd like, I'd appreciate a pull request. :) Otherwise, I'll take care of it in a day or two.
-
I am working on it now, got a test for it. Just need to learn enough git to see me through, as you're one of the funny people who use Bitbucket without hg ;)
-
LOL
-
You can track my progress at https://bitbucket.org/trojjer/django-url-tools if you like. I need to head home from the office now though, approaching 6pm!
-
See comment regarding lack of string coercion for non-string values: https://bitbucket.org/trojjer/django-url-tools/commits/bc0958749c032a4839a888775e3e09c09432d9e5
Either smart_text, force_text or six.text_type(value if six.PY3 else bytes(value)) could be used for this in update_query_data -- the latter seems preferable, given comments in Django's bytes_to_text function regarding querystring data conversion. Unfortunately, that function can't be used in this case as it does nothing!
-
Of course, if this project isn't targeting Python 3 at all yet it can be simplified to unicode(bytes(value)). It's a bit annoying that bytes_to_text doesn't do this, given the seemingly undocumented expectation that values stored in QueryDicts must be Unicode strings.
-
Ah, so confusing! Seems that unicode(1) is essentially the same as unicode(bytes(1)) for this purpose after all, at least in Py 2.7.3 at work. It seems that errors='replace' is an important kwarg for querystring data though, and to preserve the QueryDict's internal encoding the encoding kwarg can be specified. Funnily enough, converting the value with bytes then becomes necessary in 2.x!
-
@trojjer @hjkelly When setting values for query params in the UrlHelper, I think it's best we leave values as they are (no coercion to
unicode
). Since the helpers is going to calliri_to_uri
on the values when getting the full URL, I see no reason we we should do anything to the values until needed (doing the opposite created this bug to begin with). As far as no other test case breaks, I think it's fine to change the three failing cases to expect ints. I doubt anyone depends on the particular behavior we have right now. Thoughts? -
OK I will be submitting the pull request soon.
-
- changed status to resolved
Thanks you guys. This should now be fixed. I'll make a new patch release soonish.
-
@hjkelly Released in v0.0.8. Please test it and let me know if any new issues crop up. File a new report for any issues found in the new release.
-
- changed milestone to 0.0.7 COMPLETE
- Log in to comment
Watching this issue, hoping it is resolved soon so I can use this lovely module! I was expecting the querystring values themselves to be form encoded, e.g. 'Space+here' instead of 'Space%20here' -- looks like Django's urlquote_plus could be used for this, as it uses urllib.quote_plus. Am I missing something, or is that not preferable as a default case? Edit: Nevermind that, turns out that QueryDict.urlencode() does the job of course. As hjkelly said, it appears that iri_to_uri() is unnecessary in update_query_data().