PEP: XXXX
Title: Python Web Server Gateway Interface v1.1
Version: $Revision: 71593 $
Last-Modified: $Date: 2009-04-13 13:58:19 -0700 (Mon, 13 Apr 2009) $
Author: Armin Ronacher <armin.ronacher@active-4.com>
Discussions-To: Python Web-SIG <web-sig@python.org>
Status: Draft
Type: Informational
Content-Type: text/x-rst
Created: 19-Sep-2009


Abstract
========

This document specifies a proposed standard interface between web
servers and Python web applications or frameworks, to promote web
application portability across a variety of web servers.

It supersedes :pep:`0333` for unicode-aware applications on both
Python 2.x and Python 3.x.


Rationale and Goals
===================

Starting with Python 3.0, Python features two distinct string types,
one for text and one for binary data.  This made it necessary to
specify a new revision of WSGI that is based on unicode.


Specification Overview
======================

This specification only highlights the differences between WSGI 1.0
and WSGI 1.1.

String Types
------------

The following string types are used throughout the specification:

-   byte string
-   unicode string
-   native string

A 'native string' is the primary string type for a particular Python
implementation.  For Python 2.x this is a byte string; for Python 3.x
it is a unicode string.

=========== =============== ===============
String type Python 2.x      Python 3.x
=========== =============== ===============
native      `str` (bytes)   `str` (unicode)
bytes       `str`           `bytes`
unicode     `unicode`       `str`
=========== =============== ===============
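
As an illustration only, the table can be mirrored in code that runs on
either major version.  The names ``bytes_type``, ``unicode_type``,
``native_type`` and ``is_native`` are hypothetical helpers and are not
defined by this specification.

```python
import sys

# Hypothetical aliases mirroring the table above; not defined by WSGI.
if sys.version_info[0] >= 3:
    bytes_type = bytes
    unicode_type = str
else:
    bytes_type = str
    unicode_type = unicode  # noqa: F821 (this name only exists on Python 2.x)

# On both versions the native string type is the builtin `str`.
native_type = str


def is_native(value):
    """Return True if `value` is a native string on this interpreter."""
    return isinstance(value, native_type)
```

On Python 3.x ``native_type`` is the same object as ``unicode_type``; on
Python 2.x it is the same object as ``bytes_type``.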


Differences to WSGI 1.0
=======================

Headers and Environment
-----------------------

- The application is passed an instance of a Python dictionary containing what
  is referred to as the WSGI environment.  All keys in this dictionary are
  native strings.  The names of CGI variables are ISO-8859-1, so where native
  strings are unicode strings, that encoding is used to decode the names of
  CGI variables.

- For the WSGI variables ``'wsgi.url_scheme'`` and ``'wsgi.uri_encoding'``
  contained in the WSGI environment, the value of the variable should be a
  native string.

- For the CGI variables contained in the WSGI environment, the values of the
  variables are native strings.  Where native strings are unicode strings,
  the values are decoded as `iso-8859-1` so that the original character data
  is preserved: if necessary, the unicode string can be encoded back to bytes
  and then decoded again using a different encoding.  (URI values are an
  exception; see the URL Decoding section.)

- The WSGI input stream ``'wsgi.input'`` contained in the WSGI environment and
  from which request content is read, MUST yield byte strings.

- The status line specified by the WSGI application should be a byte string.
  Where native strings are unicode strings, a native string can also be
  used, in which case the server encodes it as `iso-8859-1`.

- The list of response headers specified by the WSGI application should
  contain tuples consisting of two values, where each value is a byte string.
  Where native strings are unicode strings, native strings can also be
  used, in which case the server encodes them as `iso-8859-1`.

- The iterable returned by the application and from which response content
  is derived, MUST yield byte strings.

- The version information in the WSGI environment (``'wsgi.version'``) is
  ``(1, 1)``.
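
The rules above can be sketched as a minimal application.  This is an
illustration under the assumptions of this draft, written for Python 3.x
where native strings are unicode strings.  The helper ``recode`` and the
choice of ``HTTP_USER_AGENT`` as the example variable are hypothetical and
not part of the specification.

```python
def recode(value, encoding="utf-8"):
    """Re-decode a CGI value that the server decoded as iso-8859-1.

    Hypothetical helper: encoding back to iso-8859-1 recovers the
    original bytes, which are then decoded with `encoding`.
    """
    return value.encode("iso-8859-1").decode(encoding, "replace")


def application(environ, start_response):
    # CGI values arrive as native strings decoded as iso-8859-1.
    agent = recode(environ.get("HTTP_USER_AGENT", ""))
    body = ("Hello, " + agent).encode("utf-8")

    # Status line and header values are given as byte strings.
    start_response(b"200 OK", [
        (b"Content-Type", b"text/plain; charset=utf-8"),
        (b"Content-Length", str(len(body)).encode("iso-8859-1")),
    ])

    # The response iterable must yield byte strings.
    return [body]
```

The round trip in ``recode`` is lossless because `iso-8859-1` maps each of
the 256 byte values to exactly one code point.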


URL Decoding
------------

For the keys ``SCRIPT_NAME``, ``PATH_INFO`` (and ``REQUEST_URI`` if
available) the server has to use the following algorithm for decoding:

-   it decodes all values as `utf-8`.
-   if that fails, it decodes all values as `iso-8859-1`.

The latter always succeeds, because `iso-8859-1` maps every byte to a code
point.  The encoding the server used to decode the values is then stored in
``'wsgi.uri_encoding'``.  The application MUST use this value to decode the
``'QUERY_STRING'`` as well.
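
A server-side sketch of this decoding algorithm follows.  The function name
``decode_uri_values`` and its dictionary-based signature are assumptions
made for illustration; the specification only defines the order in which
the encodings are tried.

```python
def decode_uri_values(raw_values):
    """Decode byte-string URI values, trying utf-8 and then iso-8859-1.

    `raw_values` maps keys such as 'SCRIPT_NAME' and 'PATH_INFO' to
    byte strings.  Returns a tuple (decoded_values, encoding_used).
    """
    for encoding in ("utf-8", "iso-8859-1"):
        try:
            # All values are decoded with the same encoding.
            decoded = dict(
                (key, value.decode(encoding))
                for key, value in raw_values.items()
            )
        except UnicodeDecodeError:
            continue
        return decoded, encoding
    # Not reachable in practice: iso-8859-1 decoding cannot fail.
    raise AssertionError("iso-8859-1 decoding failed unexpectedly")
```

A server could then update the environment with the decoded values and
store the returned encoding name under ``'wsgi.uri_encoding'``.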

URL Encoding
------------

If the application encodes URLs, it is required to encode them as
`utf-8`, regardless of the value of ``'wsgi.uri_encoding'``.
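
As a minimal sketch for Python 3.x, an application might percent-encode a
unicode path with the standard library's ``urllib.parse.quote``.  The
helper name ``encode_path`` is hypothetical; the point is only that the
encoding is fixed to `utf-8` and never read from the environment.

```python
from urllib.parse import quote


def encode_path(path):
    """Percent-encode a unicode path as utf-8 (hypothetical helper).

    The encoding is always utf-8, whatever 'wsgi.uri_encoding' says.
    """
    return quote(path, safe="/", encoding="utf-8")
```

For example, ``encode_path("/caf\u00e9")`` returns ``"/caf%C3%A9"``.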

write() deprecated
------------------

The WSGI server has to provide a `write()` function that works
exactly like the function in WSGI 1.0, but it is required to emit a
deprecation warning that warns about this function being obsolete.
`write()` will be removed in WSGI 2.0, which will be based on WSGI 1.1.
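
One way a server might meet this requirement is to wrap its existing
WSGI 1.0 ``write`` callable.  This is a sketch, not a mandated
implementation: the name ``make_deprecated_write``, the warning text and
the use of ``DeprecationWarning`` as the category are assumptions.

```python
import warnings


def make_deprecated_write(write):
    """Wrap a WSGI 1.0 style `write` callable so that each call warns."""
    def deprecated_write(data):
        warnings.warn(
            "write() is deprecated in WSGI 1.1 and will be removed "
            "in WSGI 2.0",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
        return write(data)
    return deprecated_write
```

The server would return the wrapped callable from ``start_response`` in
place of the original one, so behaviour is unchanged apart from the
warning.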