Issue #5 resolved

Don't quote ':@&+$,' in SCRIPT_NAME and PATH_INFO.

Daniel Nouri
created an issue

Adding an issue on Sergey's request. From the pull request:

From RFC 2396: 'Within a path segment, the characters "/", ";", "=", and "?" are reserved.'

That is, ":", "@", "&", "+" and "$" don't need to be quoted within path segments.
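
To illustrate the point, here is a minimal sketch using Python 3's `urllib.parse.quote` (the thread itself predates Python 3). The `PATH_SAFE` constant and the `quote_path` helper are hypothetical names made up for this example; they are not WebOb's actual code, and the safe set simply mirrors the characters listed in the issue title.

```python
from urllib.parse import quote

# Assumption: characters taken from the issue title, plus "/" as the
# segment separator. This is illustrative, not WebOb's real constant.
PATH_SAFE = "/:@&+$,"


def quote_path(path: str) -> str:
    """Percent-encode a path while leaving the PATH_SAFE characters alone."""
    return quote(path, safe=PATH_SAFE)


# With the default safe set ("/" only), "@" gets escaped.
default_quoted = quote("/@@mypage")
# With the relaxed safe set, "@" passes through unchanged.
relaxed_quoted = quote_path("/@@mypage")

print(default_quoted)
print(relaxed_quoted)
```

Characters outside the safe set, such as spaces, would still be escaped by `quote_path`.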

The current quoting behaviour easily leads to quoting hell. Consider this example:

{{{
>>> request.environ['PATH_INFO']
'/@@mypage'
>>> request.url
'http://example.org/%40%40mypage'

return HTTPFound(location=request.url)
}}}

On the next request, request.url will be

{{{
>>> request.url
'http://example.org/%2540%2540mypage'
}}}

and the redirect will no longer work.
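
The double escaping can be reproduced outside WebOb. The sketch below assumes a hypothetical `naive_url` helper that stands in for a `request.url`-like property quoting `PATH_INFO` with the default safe set; it is not WebOb's implementation. It also assumes the second request's `PATH_INFO` arrives still escaped, which is the situation described in this report.

```python
from urllib.parse import quote


def naive_url(host: str, path_info: str) -> str:
    """Hypothetical stand-in for a URL property that quotes PATH_INFO."""
    # The default safe set of quote() is "/" only, so "@" and "%" are escaped.
    return "http://" + host + quote(path_info)


# First request: PATH_INFO holds the raw path.
first = naive_url("example.org", "/@@mypage")

# Second request: PATH_INFO was not unescaped, so "%" itself is quoted to "%25".
second = naive_url("example.org", "/%40%40mypage")

print(first)
print(second)
```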

The patch is here: https://bitbucket.org/dnouri/webob/changeset/bb042d67bca1

Comments (12)

  1. Sergey Schetinin
    • changed status to open

    One thing that seems wrong to me is the claim that the redirect will cause a different url. The %-escaping of the @ sign is not necessary, but the user-agent should not re-escape the %-sign itself. In other words, both '%40' and '@' are valid encodings of the same string.

    Having said that, I think that we should not do unnecessary escaping in req.url and related properties, so if we clear this out, I'd accept the patch w/ a different commit message.

  2. Daniel Nouri reporter

    I should've been more clear. Here's the second example with another line added:

    On the next request, request.url will be

      >>> request.environ['PATH_INFO']
      '/%40%40mypage'
      >>> request.url
      'http://example.org/%2540%2540mypage'
    

    In effect, it's not the user agent that's quoting anything. It's just that request.url's quoting behaviour becomes quite annoying when you have any of these characters in the path, since it'll happily quote %40 to %2540.

  3. Daniel Nouri reporter

    I passed the quoted form in

        return HTTPFound(location=request.url)
    

    I'm using paste.httpserver and wsgi_intercept.zope_testbrowser and the behaviour is consistent between the two. That is, no unescaping seems to happen.

  4. Sergey Schetinin

    Yes, the HTTP redirect and then the request will contain /%40... but the server should set the PATH_INFO to /@...

    I'm not sure how you get that result; a demo app would be nice. But paste.httpserver most certainly does the unescaping as required by the spec. See the wsgi_setup method in http://svn.pythonpaste.org/Paste/trunk/paste/httpserver.py:

            (scheme, netloc, path, query, fragment) = urlparse.urlsplit(self.path)
            path = urllib.unquote(path)
            ...
            self.wsgi_environ = {
                   ...
                   ,'PATH_INFO': path
                   ...
                   }
    
  5. Daniel Nouri reporter

    Looks like I'm mistaken. I can no longer reproduce with paste.httpserver. Seems to only happen with wsgi_intercept.zope_testbrowser. I'll look into wsgi_intercept, it's probably missing that unescaping bit.

    In any case, it'd be nice to land the patch. Do you want me to make a separate commit that removes the example from the commit message?

  6. Sergey Schetinin

    If you don't mind, I'll just commit the patch w/ a different message: "Make sure that req.url and related properties do not unnecessarily escape ":", "@", "&", "+" and "$" in the URI path".
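
For reference, the server-side unescaping discussed in comment 4 can be sketched in Python 3 as follows (the quoted paste.httpserver excerpt uses the Python 2 `urlparse` and `urllib` modules). The `path_info_from_target` function is a hypothetical name for this example and is not part of Paste or WebOb.

```python
from urllib.parse import unquote, urlsplit


def path_info_from_target(request_target: str) -> str:
    """Derive a PATH_INFO value from the request line's target.

    Assumption: mirrors the urlsplit + unquote steps in the excerpt
    quoted in comment 4; everything else a server does is omitted.
    """
    path = urlsplit(request_target).path
    return unquote(path)


# Both spellings of the path decode to the same PATH_INFO.
escaped = path_info_from_target("/%40%40mypage?x=1")
literal = path_info_from_target("/@@mypage")

print(escaped)
print(literal)
```

A server or test harness that skips the `unquote` step would hand the application the still-escaped path, which matches the behaviour reported for wsgi_intercept in comment 5.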
