Commits

mitsuhiko  committed 063f750

Integrated Graham's proposal.

  • Parent commits 6bc78fe

Files changed (1)

File pep-XXXX.txt

 This specification only highlights the differences between WSGI 1.0
 and WSGI 1.1.
 
+String Types
+------------
+
+The following string types are used throughout the specification:
+
+-   byte string
+-   unicode string
+-   native string
+
+A 'native string' is the primary string type for a particular Python
+implementation.  For Python 2.x this is a byte string; for Python 3.x
+this is a unicode string.
+
+=========== =============== ===============
+            Python 2.x      Python 3.x
+=========== =============== ===============
+native      `str` (bytes)   `str` (unicode)
+bytes       `str`           `bytes`
+unicode     `unicode`       `str`
+=========== =============== ===============
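
To illustrate the table above (this sketch is not part of the commit), a conversion helper might look as follows. The name `to_native` and its default encoding are assumptions made for the example; the draft itself does not define such a function.

```python
import sys

# True when running under Python 2, where the native `str` type holds bytes.
PY2 = sys.version_info[0] == 2


def to_native(value, encoding="iso-8859-1"):
    """Return `value` as the native string type of the running interpreter.

    Hypothetical helper: the name and the default encoding are assumptions.
    """
    if isinstance(value, str):
        return value  # already the native type on both major versions
    if PY2:
        return value.encode(encoding)  # Python 2: unicode -> str (bytes)
    return value.decode(encoding)  # Python 3: bytes -> str (unicode)
```

On Python 3, `to_native(b"text/plain")` yields a unicode `str`; on Python 2 the same call returns its argument unchanged, because `str` is already the byte string type there.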
+
 
 Differences to WSGI 1.0
 =======================
 Headers and Environment
 -----------------------
 
-The WSGI server is required to process the headers as `latin1` (also known
-as `iso-8859-1`).  This also affects the request line that specifies the
-request path and HTTP method.
+- The application is passed an instance of a Python dictionary containing what
+  is referred to as the WSGI environment.  All keys in this dictionary are
+  native strings.  The names of CGI variables are always representable in
+  ISO-8859-1, so where native strings are unicode strings, that encoding is
+  used for the names of CGI variables.
 
-The keys of the WSGI dictionary are unicode values.  The values of the
-WSGI dictionary that contain text are unicode as well.  This affects both
-the headers and standard CGI variables.
+- For the WSGI variables ``'wsgi.url_scheme'`` and ``'wsgi.uri_encoding'``
+  contained in the WSGI environment, the value of the variable should be a
+  native string.
 
-The version information in the WSGI environment (`wsgi.version`) is
-``(1, 1)``.
+- For the CGI variables contained in the WSGI environment, the values of the
+  variables are native strings.  Where native strings are unicode strings,
+  the `iso-8859-1` encoding is used so that the original character data is
+  preserved; if necessary, the unicode string can be converted back to bytes
+  and then decoded to unicode again using a different encoding.  (URI values
+  are the exception; see the URL Decoding section.)
+
+- The WSGI input stream ``'wsgi.input'``, which is contained in the WSGI
+  environment and from which request content is read, MUST yield byte strings.
+
+- The status line specified by the WSGI application should be a byte string.
+  Where native strings are unicode strings, the native string type may also
+  be returned, in which case it is encoded as `iso-8859-1`.
+
+- The list of response headers specified by the WSGI application should
+  contain tuples consisting of two values, where each value is a byte string.
+  Where native strings are unicode strings, the native string type may also
+  be returned, in which case it is encoded as `iso-8859-1`.
+
+- The iterable returned by the application, from which response content
+  is derived, MUST yield byte strings.
+
+- The version information in the WSGI environment (`wsgi.version`) is ``(1, 1)``.
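
The rules in the list above can be sketched as a small application (not part of the commit). It assumes Python 3, where native strings are unicode. The header `HTTP_X_NAME` and the helper `recode` are hypothetical names chosen for illustration, and the byte-string status and headers follow this draft rather than the behaviour of any particular server.

```python
def recode(value, encoding="utf-8"):
    """Undo the server's iso-8859-1 decoding and decode the bytes again.

    Hypothetical helper; assumes `value` is a unicode native string that
    the server produced by decoding raw bytes as iso-8859-1.
    """
    return value.encode("iso-8859-1").decode(encoding)


def application(environ, start_response):
    # CGI values arrive as native strings; recover the original bytes
    # and decode them with the encoding the application expects.
    name = recode(environ.get("HTTP_X_NAME", "world"))
    body = ("Hello, %s!" % name).encode("utf-8")

    # Status line and headers are byte strings, as the draft recommends.
    status = b"200 OK"
    headers = [
        (b"Content-Type", b"text/plain; charset=utf-8"),
        (b"Content-Length", str(len(body)).encode("iso-8859-1")),
    ]
    start_response(status, headers)

    # The response iterable must yield byte strings.
    return [body]
```

Because iso-8859-1 maps every byte to exactly one code point, the round trip in `recode` is lossless, which is the property the draft relies on when it chooses that encoding for CGI values.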
+
 
 URL Decoding
 ------------
 -   if that fails, it decodes all values as `iso-8859-1`.
 
 The latter will always work.  The encoding the server used to decode the
-value is then stored in `wsgi.uri_encoding`.  The application MUST use this
-value to decode the `QUERY_STRING` as well.
+value is then stored in ``'wsgi.uri_encoding'``.  The application MUST use this
+value to decode the ``'QUERY_STRING'`` as well.
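
As an illustration of the requirement above (not part of the commit), an application on Python 3 might decode the query string as sketched below. The function name `query_args` is hypothetical, and falling back to `iso-8859-1` when the key is absent is an assumption of this example; the draft expects the server to set the key.

```python
from urllib.parse import parse_qs


def query_args(environ):
    """Parse QUERY_STRING using the encoding recorded by the server."""
    # Assumption: fall back to iso-8859-1 if the server did not set the key.
    encoding = environ.get("wsgi.uri_encoding", "iso-8859-1")
    # parse_qs decodes percent-escapes with the given encoding.
    return parse_qs(environ.get("QUERY_STRING", ""), encoding=encoding)
```

For example, with `QUERY_STRING` set to `name=Zo%C3%AB` and `wsgi.uri_encoding` set to `utf-8`, the function returns a dictionary mapping `name` to a one-element list containing `Zoë`.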
 
 URL Encoding
 ------------