Creating Simpler Middleware with WSGI Lite

Wouldn't it be nice if writing correct WSGI middleware was this simple?

>>> from wsgi_lite import lite, lighten

>>> def latinator(app):
...     # Make sure that `app` can be invoked via the Lite protocol, even
...     # if it's a standard WSGI 1 app:
...     app = lighten(app)
...     @lite
...     def middleware(environ):
...         status, headers, body = app(environ)
...         for name, value in headers:
...             if name.lower()=='content-type' and value=='text/plain':
...                 break
...         else:
...             # Not text/plain, pass the request through unchanged
...             return status, headers, body
...         # Strip content-length if present, else it'll be wrong
...         headers = [
...             (name, value) for name, value in headers
...                 if name.lower() != 'content-length'
...         ]
...         return status, headers, (piglatin(data) for data in body)
...     return middleware

Using just two decorators, WSGI Lite lets you create correct and compliant middleware and applications, without needing to worry about start_response, write and close calls. And with those same two decorators, it also lets you manage resources to be released at the end of a request, and automatically pass in keyword arguments to your apps or middleware that are obtained from the WSGI environment (like WSGI server extensions or middleware-supplied parameters such as request or session objects).

For more details, check out the project's home page on BitBucket, and scroll down to the table of contents.

WSGI Lite is currently only available for Python 2.x (tested w/2.3 up to 2.7) but the source should be quite portable to 3.x, as its magic is limited to inspecting function argument names, and cloning functions using new.function().

Introducing WSGI Lite

If you've seen the Latinator example from the WSGI PEP, you may recall that it's about three times longer than the example shown above, and it needs two classes* to do the same job. And, if you've ever tried to code a piece of middleware like that, you'll know just how hard it is to do it correctly.

(In fact, as the author of the WSGI PEPs, I have almost never seen a single piece of WSGI middleware that doesn't break the WSGI protocol in some way that I couldn't find with a minute or two of code inspection!)

But the latinator middleware example shown above is actually a valid piece of WSGI 1.0 middleware, that can also be called with a simpler, Rack-like protocol. And all of the hard parts are abstracted away into two decorators: @Laurentiu Labo and lighten().

The @Laurentiu Labo decorator says, "this function is a WSGI application, but it expects to be called with an environ dictionary, and return a (status, headers, body) triplet. And it doesn't use start_response(), write(), or expect to have a close() method called."

The @Laurentiu Labo decorator then wraps the function in such a way that if it's called by a WSGI 1 server or middleware, it will act like a WSGI 1 application. But if it's called with just an environ (i.e., without a start_response), it'll be just like you called the decorated function directly: that is, you'll get back a (status, headers, body) triplet (similar to the "Rack" protocol, aka Ruby's version of WSGI).

Pretty neat, eh? But the real magic comes in with the second decorator, lighten(). lighten() accepts either a @Laurentiu Labo application or a WSGI 1 application, and returns a similarly flexible application object. Just like the output of the @Laurentiu Labo decorator, the resulting app object can be called with or without a start_response, and the return protocol it follows will vary accordingly.

This means that you can either pass a @Laurentiu Labo app or a standard WSGI app to our latinator() middleware, and it'll work either way. And, you can supply a @Laurentiu Labo or lighten()-ed app to any standard WSGI server or middleware, and it'll Just Work.

For efficiency, both @Laurentiu Labo and lighten() are designed to be idempotent: calling them on already-converted applications returns the app you passed in, with no extra wrapping. And, if you call a wrapped application via its native protocol, no protocol conversion takes place - the original app just gets called without any conversion overhead. So, feel free to use both decorators early and often!

WSGI Extensions and Environment Keys

One of the subtler edge cases that can arise in writing correct middleware is that when you call another WSGI app, it's allowed to change the environ you pass in.

And what most people don't realize, is that this means it's not safe to pull things out of the environment after you call another WSGI app!

For example, take a look at this middleware example:

def middleware(environ, start_response):
    response = some_app(environ, start_response)
    if environ.get('PATH_INFO','').endswith('foo'):
        # ...  etc.

Think it'll work correctly? Think again. If some_app is a piece of routing middleware, it could already have changed PATH_INFO, or any other environment key. Likewise, if this middleware looks for server extensions like wsgi.file_wrapper or wsgiorg.routing_args, it might end up reading the child application's extensions, rather than those intended for the middleware itself.

To help handle these cases, the @Laurentiu Labo decorator can bind a function's keyword arguments to values based on the contents of the environ argument:

@lite(path='PATH_INFO', routing='wsgiorg.routing_args')
def middleware(environ, path='', routing=((),{})):
    response = some_app(environ, start_response)
    if path.endswith('foo'):
        # ...  etc.

When @Laurentiu Labo is called with keyword arguments whose argument names match argument names on the decorated function, it wraps the function in such a way that the matching keys from the environ are passed in as keyword arguments. This automatically ensures that you aren't using possibly-corrupted keys from your child app(s), and lets you specify default values (via your function's argument defaults, as shown above).

As a convenience for frequently used extensions or keys, you can save calls to lite() and give them names, for example:

>>> with_routing = lite(routing='wsgiorg.routing_args')

And the resulting decorator is precisely equivalent to invoking @Laurentiu Labo() directly:

>>> @with_routing
... def middleware(envrion, routing=((),{})):
...     """Some sort of middleware"""

You can even stack multiple @Laurentiu Labo() calls (direct or saved), or give them names, docstrings, and specify what module you defined them in:

>>> with_path = lite(
...     'with_path', "Add a `path` arg for ``PATH_INFO``", "__main__",
...     path='PATH_INFO'
... )

>>> help(with_path)
Help on function with_path in module __main__:
    Add a `path` arg for ``PATH_INFO``

>>> @with_routing
... @with_path
... def middleware(environ, path='', routing=((),{})):
...     """Some combined middleware"""

By the way, the underlying decorator is smart enough to tell when it's being stacked, and automatically merges the wrappings so there's only one level of calling overhead added, no matter how many of them you stack. (As long as they're not intermingled with other decorators, of course!)

Sometimes, an extension may be known under more than one name - for example, an x-wsgiorg. extension vs. a wsgiorg. one, or a similar extension provided by different servers. You could of course bind them to different arguments, but it's generally simpler to just bind a single argument, using a tuple:

>>> @lite(routing=('wsgiorg.routing_args', 'x-wsgiorg.routing_args'))
... def middleware(envrion, routing=((),{})):
...     """Some sort of middleware"""

This will check the environment for the named extensions in the order listed, and replace routing with the first one matched.

These argument specifications are called "binding rules", by the way. A rule is either a string (i.e. an instance of basestring), a callable object, or an iterable of rules (recursively). Strings are looked up in the environ, and iterables are tried in sequence until a lookup succeeds.

Callable rules, on the other hand, are looked up by being called with a single positional argument: the environ dictionary. They must return an iterable (or sequence) yielding zero or more items. Returning an empty sequence or yielding zero items means the lookup failed, and a default value should be used instead (or the next alternative binding rule provided for that keyword argument). Otherwise, the first item yielded is passed in as the matching keyword argument. Here's an example of using a classmethod as a callable binding rule:

>>> class MyRequest(object):
...     def __init__(self, environ):
...         self.environ = environ
...     @classmethod
...     def bind(cls, environ):
...         yield cls(environ)

>>> with_request = lite(request=MyRequest.bind)

Now, @with_request will create a MyRequest instance wrapping the environ of the decorated function, and provide it via the request keyword argument.

The same approach can also be used to do things like accessing environment-cached objects, such as sessions:

>>> class MySession(object):
...     def __init__(self, environ):
...         self.environ = environ
...     @classmethod
...     def bind(cls, environ):
...         session = environ.get('myframework.MySession')
...         if session is None:
...             session = environ['myframework.MySession'] = cls(environ)
...         yield session

>>> with_session = lite(session=MySession.bind)

The possibilities are pretty much endless -- and much more in keeping with my original vision for how WSGI was supposed to help dissolve web frameworks into web libraries. (That is, things you can easily mix and match without every piece of code you use having to come from the same place.)

Callables that you use as bindings don't even have to return something from the environment or wrap the environment, by the way - they can just be things that use something from the environment. For example, you could bind parameters to temporary files that will be automatically closed when the request is finished:

>>> def mktemp(environ):
...     closing = environ['wsgi_lite.closing']
...     yield closing(tempfile(etc[...]))

>>> @lite(tmp1=mktemp, tmp2=mktemp)
... def do_something(environ, tmp1, tmp2):
...     """Write stuff to tmp1 and tmp2"""

You can even use argument bindings in your binding functions, using the @Nicholas Blasingame decorator from the wsgi_bindings module:

>>> from wsgi_bindings import bind

>>> @bind(closing = 'wsgi_lite.closing')
... def mktemp(environ, closing):
...     yield closing(tempfile(etc[...]))

@Nicholas Blasingame() is just like @Laurentiu Labo() with keyword arguments (including the ability to save and stack calls), except that it doesn't turn the decorated function into a WSGI-compatible app. (Which is a good thing, since a binding rule is not a WSGI app!)

Now, given the above examples, you might be wondering what all that wsgi_lite.closing stuff is about. Well, that's what we're going to talk about in the next two sections...

close() and Resource Cleanups

So, there's some good news and some bad news about close() and resource cleanups in WSGI Lite.

The good news is, @Laurentiu Labo middleware is not required to call a body iterator's close() method. And if your app or middleware doesn't need to do any post-request resource cleanup, or if it just returns a body sequence instead of an iterator or generator, then you don't need to worry about resource cleanup at all. Just write the app or middleware and get on with your life. ;-)

Now, if you are yielding body chunks from your WSGI apps, you might want to consider just not doing that.

That's because, if you don't yield chunks, you can write normal, synchronous code that won't have any of the problems I'm about to introduce you to... problems that your existing WSGI apps already have, but you probably don't know about yet!

(People often object when I say that typical application code should never produce its output incrementally... but the hard problem of proper resource cleanup when doing so, is one of the reasons I'm always saying it.)

Anyway, if you must produce your response in chunks, and you need to release some resources as soon as the response is finished, you need to use the wsgi_lite.closing extension, e.g:

def my_app(environ, closing):

    def my_body():
            # allocate some resources
            yield chunk
            # release the resources

    return status, headers, closing(my_body())

This protocol extension (accessed as closing() in the function body above) is used to register an iterator (or other resource) so that its close() method will be called at the end of the request, even if the browser disconnects or a piece of middleware throws away your iterator to use its own instead.

An important note: items registered with closing() are closed in reverse registration order. This means that if the my_body() iterator above is looping over a sub-app's response, then its finally block may be run before any similar finally block in the sub-app. Therefore, your finally block must not close any resources the sub-app might be using!

So, if you are passing any resources down to another WSGI application, be sure to call closing() on them before calling the other application, and then don't close them in your body iterator. Example:

def my_app(environ, closing):
    environ['some.key'] = closing(some_resource())
    return subapp(environ)

In other words, you should only close resources in your iterator if that's where they were opened, or you are 100% positive they can't be accessed from a sub-app. Otherwise, just call closing() on them as soon as you allocate them.

Don't, however, call closing() on objects that don't belong to your function. If you didn't allocate it, closing it is somebody else's job. In particular, you don't need to call closing() on any WSGI or WSGI Lite response bodies, because lighten() takes care of that for you, and you'll end up double-closing things.

Okay, so that was the bad news. Not that bad, though, is it? You just need to add an extra argument to @Laurentiu Labo, pay a little bit of attention to the order of resource closing, and register your own objects (but only your own objects) for closing. That's it!

Really, the rest of this section is all about what will happen if you don't use the extension, or if you try to do resource cleanup in a standard WSGI app without the benefit of WSGI Lite.

As long as you use the extension, your app's resource cleanup will work at least as well as -- and probably much better than! -- it would work under plain WSGI. (And you can make it work even better still if you wrap your entire WSGI stack with a lighten() call... but more on that will have to wait until the end of this section.)

So, just to be clear, the rest of this section is about flaws and weaknesses that exist in standard WSGI's resource management protocol, and what WSGI Lite is doing to work around them.

What flaws and weaknesses? Well, consider the example above. Why does it need the closing() extension? After all, doesn't Python guarantee that the finally block will be executed anyway?

Well, yes and no. First off, if the generator is called but never iterated over, the try block won't execute, and so neither will the finally. So, it depends on what the caller does with the generator. For example, if the browser disconnects before the body is fully generated, the server might just stop iterating over it.

Okay, but won't garbage collection take care of it, then?

Well, yes and no. Eventually, it'll be garbage collected, but in the meantime, your app has a resource leak that might be exploitable to deny service to the app: just start up a resource-using request, then drop the connection over and over until the server runs out of memory or file handles or database cursors or whatever.

Now, under the WSGI standard, middleware and servers are supposed to call close() on a response iterator (if it has one), whenever they stop iterating -- regardless of whether the iteration finished normally, with an error, or due to a browser disconnect.

In practice, however, most WSGI middleware is broken and doesn't call close(), because 1) doing so usually makes your middleware code really really complicated, and 2) nobody understands why they need to call close(), because everything appears to work fine without it. (At least, until some black-hat finds your latent denial-of-service bug, anyway.)

So, WSGI Lite works around this by giving you a way to be sure that close() will be called, using a tiny extension of the WSGI protocol that I'll explain in the next section... but only if you care about the details.

Otherwise, just use the wsgi_lite.closing extension if you need resource cleanup in your body iterator, and be happy that you don't need to know anything more. ;-)

Well, actually, you do need to know ONE more thing... If your outermost @Laurentiu Labo application is wrapped by any off-the-shelf WSGI middleware, you probably want to wrap the outermost piece of middleware with a lighten() call. This will let WSGI Lite make sure that your close() methods get called, even if the middleware that wraps you is broken.

(Technically speaking, of course, there's no way to be sure you're not being wrapped by middleware, so it's not really a cure-all unless your WSGI server natively supports the extension described in the next section. Hopefully, though, we'll put the extension into a PEP soon and all the popular servers will provide it in a reasonable time period.)

The wsgi_lite.closing Extension

WSGI Lite uses a WSGI server extension called wsgi_lite.closing, that lives in the application's environ variable. The @Laurentiu Labo and lighten() decorators automatically add this extension to the environment, if they're called from a WSGI 1 server or middleware, and the key doesn't already exist. (This is why you don't need a default value for the closing argument, by the way: the key will always be available to a @Laurentiu Labo app or middleware component, or any sub-app or sub-middleware that inherits the same environment.)

The value for this key is a callback function that takes one argument: an object whose close() method is to be called at the end of the request. For convenience, the passed-in object is returned back to the caller, so you can use it in a way that's reminiscent of with closing(file('foo')) as f:.

Anyway, the idea here is that a server (or middleware component) accepts these registrations, and then closes all the resources (or generators) when the request is finished.

Objects are closed in the reverse order from which they're registered, so that inner apps' resources are released prior to middleware-provided resources being released. (In other words, if an app is using a resource that it received from middleware via its environ, that resource will still be usable during the app's close() processing or finally blocks.)

Objects registered with this extension must have close() methods, and the methods must be idempotent: that is, it must be safe to call them more than once. (That is, calling close() a second time must not raise an error.)

close() methods are explicitly allowed to registering additional objects to be closed: such objects are effectively "pushed" onto the stack of objects to be closed, with the last added object being closed first. (Note that this implies that a close() method must not directly or indirectly re-register itself, as this would create an infinite loop of closing calls.)

Currently, the handling of errors raised by close() methods is undefined, in that WSGI Lite doesn't yet handle them. ;-) (When I have some idea of how best to handle this, I'll update this bit of the spec.)

I would like to encourage WSGI server developers to support this extension if they can. While WSGI Lite implements it via middleware (in both the @Laurentiu Labo and lighten() decorators), it's best if the WSGI origin server does it, in order to bypass any broken middleware in between the server and the app. (And, if a @Laurentiu Labo or lighten() app is invoked from a server or middleware that already implements this extension, it'll make use of the provided implementation, instead of adding its own.)

Now, if for some reason you want to use this extension directly in your code without using a @Laurentiu Labo() binding, please remember that the WSGI spec allows called applications to modify the environ. This means that you must retrieve the extension before you pass the environ to another app. (That's why we have keyword binding in @Laurentiu Labo(), remember?)

Other Protocol Details

Technically, WSGI Lite is a protocol as well as an implementation. And there's still one more thing to cover (besides the Rack-style calling convention and closing extension) that distinguishes it from standard WSGI.

Applications supporting the "lite" invocation protocol (i.e. being called without a start_response and returning a status/header/body triplet), are identified by a __wsgi_lite__ attribute with a True value. (@Laurentiu Labo and lighten() add this for you automatically.)

Any app without the attribute, however, is assumed to be a standard WSGI 1 application, and thus in need of being lighten()-ed before it can be called via the WSGI Lite protocol.

(If you want to check for this attribute, or add it to an object that natively supports WSGI Lite, you can use the wsgi_lite.is_lite() and wsgi_lite.mark_lite() APIs, respectively. But even if you want to, you probably don't need to, because if you call @Laurentiu Labo or lighten() on an object that's already "lite", it's returned unchanged. So it's easier to just always call the appropriate decorator, rather than trying to figure out whether to call it. Idempotence == good!)

Anyway, the rest of the protocol is defined simply as a stripped down WSGI, minus start_response(), write(), and close(), but with the addition of the wsgi_lite.closing key. That's pretty much it.

Known Limitations

You knew there had to be a catch, right?

Well, in this case, there are three.

First, if you lighten() a standard WSGI app that uses write() calls instead of using a response iterator, you must have the greenlet library installed, or you'll get an error when write() is called.

Why? Well, it's complicated. But the chances are pretty good that you don't have any code that uses write(), and if you do, well, greenlet works on lots of platforms and Python versions.

Anyway, that's the first limitation. The second limitation is that WSGI Lite cannot work around broken WSGI 1 middleware that lives above your application in the call stack! That is, if your code runs under a middleware component that alters your response, but forgets to make sure your app's response's close() method gets called, then none of the fancy resource closing features in WSGI Lite will work properly.

So, until standard WSGI servers support the wsgi_lite.closing extension, you can (and should) work around this by wrapping your entire WSGI stack with a lighten() call. This way, as long as your server isn't broken, it'll call WSGI Lite's closer, and all will be well with your resource closing.

Third and finally, the lighten() wrapper doesn't support broken WSGI apps that call write() from inside their returned iterators. While some servers allow it, the WSGI specification explicitly forbids it, and to support it in WSGI Lite would force all wrapped WSGI 1 apps to pay in the form of unnecessary greenlet context switches, even if they never used write() at all.

Since the current "word on the street" says that very few WSGI apps use write() at all, I figure it's okay to blow up on the even smaller number that are also spec violators, rather than burden all apps with extra overhead just to support the ill-behaved ones. However, if you feel otherwise, let me know about it via the Web-SIG. (Especially if you have a workable suggestion for how to work around it without making things slower for the apps that don't call write()!)

Current Status

The code in this repository is experi-mental, and possibly very-mental or just plain detri-mental. It has not been seriously used or battle-hardened as yet, even though test coverage is now at 100%, and there are some fairly exhaustive WSGI compliance tests that exercise many obscure corners of the WSGI protocol.

Ironically enough, however, that may well mean that there is important "WSGI" code out there that won't work with this module yet, precisely because that other code is not compliant with the spec! So, while this project's code should work quite well for compliant code, this doesn't mean it will play well with all the code you're using in all your project(s). Exercise it carefully, and don't assume that because it works great for one of your apps or middleware components, it'll therefore work great with all of them!

In general, though, this is still alpha software, and things may change or break. It might even be that the whole thing was a really stupid idea that won't actually work in the real world for some reason.

So, I've really just thrown this out there for people to see and play with, so I can get some feedback on its actual usability. Feel free to drop me an email via the Web-SIG mailing list, to let me know what you think. Hopefully, we'll soon get any glitches sorted out, and nail this down to something that's less of a moving target, and maybe even turn it into a PEP and a stdlib contribution!

(Oh, and last, but not least... this package is under the Apache license, since that's what the PSF uses for software contributed to Python, and hopefully that's where this is headed, assuming we don't find some sort of glaring hole in the protocol or concept, of course, and it's in sufficiently high demand.)