Issue #1208 new

Pollution of `mapper.environ` with RoutesDispatcher

Alfredo Deza
created an issue

There is a behavioral repercussion when setting the environ in a routes.request_config object.

The RoutesDispatcher in CherryPy will set this if it finds it in cherrypy.serving.request, but the problem is that setting the environ is doing a couple of things:

  • Calls mapper.routematch with whatever it finds in environ['PATH_INFO']
  • Sets the following attributes in the config object: mapper_dict, route, host, and protocol.
  • Sets the WSGI environ to self.mapper.environ

If we look closer into routes._RequestConfig.load_wsgi_environ it is actually calculating a proper match for the path. This has consequences as a few attributes got changed, not to mention a performance hit because the RoutesDispatcher calculates this again (from RoutesDispatcher.find_handler) when it calls self.mapper.match::

    config = routes.request_config()
    config.mapper = self.mapper
    if hasattr(request, 'wsgi_environ'):
        config.environ = request.wsgi_environ = request.headers.get('Host', None)
    config.protocol = request.scheme
    config.redirect = self.redirect

    result = self.mapper.match(path_info)

    config.mapper_dict = result

Another thing to note is that setting the environ in the routes config like the RoutesDispatcher does is optional, as per the docs in routes::

environ (optional)
    Set to the WSGI environ for automatic prefix support if the
    webapp is underneath a 'SCRIPT_NAME'

The approach I propose to avoid the behavior when setting the environ onto config.environ is to pass the environ to the match method from the mapper. This method allows to pass the path_info and environ as keyword arguments, and will respect the fact that if an environ is passed it will prefer that environ as opposed to the one in mapper.environ.

That is a cleaner option because it ensures that the values that are being retrieved from CherryPy (host, protocol and redirect) will take precedence and the RoutesDispatcher will not delegate that to Routes figuring it out from the WSGI environ.

The above example would now look like::

    config = routes.request_config()
    config.mapper = self.mapper = request.headers.get('Host', None)
    config.protocol = request.scheme
    config.redirect = self.redirect

    result = self.mapper.match(url=path_info, 
                               environ=getattr(request, 'wsgi_environ'))

    config.mapper_dict = result

Doing a getattr on the request object will fall back to None, preserving the default value for environ.

Finally, this fixes a problem on our end that has been really difficult to reproduce, where some code was calling cherrpy.request.dispatch.mapper.generate and causing it to cache the wrong values because the environ that was retrieving from the mapper was polluted with values that were no longer correct (I assume because of the side effect of setting config.environ).

Comments (4)

  1. Jason R. Coombs

    This script, attached, reproduces the issue. At some point, maybe someone would consider integrating this into the test suite. However, because it depends on setting up a CherryPy server and relies on 'localhost' and port 8080, incorporates a routes dispatcher, and triggers undesirable state changes, I'm not prepared to include it in the test suite directly.

  2. Log in to comment