Title: Extended HTTP functionality and WebDAV
Author: firstname.lastname@example.org (Greg Stein)
Type: Standards Track
This PEP has been rejected. It has failed to generate sufficient
community support in the six years since its proposal.
This PEP discusses new modules and extended functionality for Python's
HTTP support, notably the addition of authenticated requests, proxy
support, authenticated proxy usage, and WebDAV_ capabilities.
Python has been quite popular as a result of its "batteries included"
positioning. One of the most heavily used protocols, HTTP (see RFC
2616), has been included with Python for years (``httplib``). However,
this support has not kept up with the full needs and requirements of
many HTTP-based applications and systems. In addition, new protocols
based on HTTP, such as WebDAV and XML-RPC, are becoming useful and are
seeing increasing usage. Supplying this functionality meets Python's
"batteries included" role and also keeps Python at the leading edge of
While authentication and proxy support are two very notable features
missing from Python's core HTTP processing, they are minimally handled
as part of Python's URL handling (``urllib`` and
``urllib2``). However, applications that need fine-grained or
sophisticated HTTP handling cannot make use of the features while they
reside in urllib. Refactoring these features into a location where
they can be directly associated with an HTTP connection will improve
their utility for both urllib and for sophisticated applications.
The motivation for this PEP came from several people requesting these
features directly, and from a number of feature requests on
SourceForge. Since the exact form of the modules to be provided and
the classes/architecture used could be subject to debate, this PEP was
created to provide a focal point for those discussions.
Two modules will be added to the standard library: ``httpx`` (HTTP
extended functionality), and ``davlib`` (WebDAV library).
[ suggestions for module names are welcome; ``davlib`` has some
precedent, but something like ``webdav`` might be desirable ]
The ``httpx`` module will provide a mixin for performing HTTP
authentication (for both proxy and origin server authentication). This
mixin (``httpx.HandleAuthentication``) can be combined with the
``HTTPConnection`` and the ``HTTPSConnection`` classes (the mixin may
possibly work with the HTTP and HTTPS compatibility classes, but that
is not a requirement).
The mixin will delegate the authentication process to one or more
"authenticator" objects, allowing multiple connections to share
authenticators. The use of a separate object allows for a long-term
connection to an authentication system (e.g. LDAP). An authenticator
for the Basic and Digest mechanisms (see RFC 2617) will be
provided. User-supplied authenticator subclasses can be registered and
used by the connections.
A "credentials" object (``httpx.Credentials``) is also associated with
the mixin, and stores the credentials (e.g. username and password)
needed by the authenticators. Subclasses of Credentials can be created
to hold additional information (e.g. NT domain).
The mixin overrides the ``getresponse()`` method to detect ``401
(Unauthorized)`` and ``407 (Proxy Authentication Required)``
responses. When one is found, the response object, the connection,
and the credentials are passed to the authenticator corresponding to
the authentication scheme specified in the response (multiple
authenticators are tried in decreasing order of security if multiple
schemes are in the response). Each authenticator can examine the
response headers and decide whether and how to resend the request with
the correct authentication headers. If no authenticator can
successfully handle the authentication, then an exception is raised.
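
The selection step described above can be sketched as follows. This is a minimal illustration, not the proposed API: the class names ``BasicAuthenticator`` and ``DigestAuthenticator``, the ``strength`` attribute, and the helper ``pick_authenticators`` are all hypothetical, and the ranking of Digest above Basic is an assumption about what "decreasing order of security" would mean in practice.

```python
import base64


class BasicAuthenticator:
    """Hypothetical authenticator for the Basic scheme (RFC 2617)."""

    scheme = "basic"
    strength = 1  # assumed ranking: a higher number means more secure

    def authorization(self, username, password):
        # Basic credentials are "user:password", base64-encoded.
        raw = f"{username}:{password}".encode("utf-8")
        return "Basic " + base64.b64encode(raw).decode("ascii")


class DigestAuthenticator:
    """Placeholder for the Digest scheme; the computation is omitted here."""

    scheme = "digest"
    strength = 2

    def authorization(self, username, password):
        raise NotImplementedError("Digest is left out of this sketch")


def pick_authenticators(offered_schemes, registry):
    """Return registered authenticators for the offered schemes, strongest first."""
    offered = {name.lower() for name in offered_schemes}
    matches = [auth for auth in registry if auth.scheme in offered]
    return sorted(matches, key=lambda auth: auth.strength, reverse=True)
```

A connection mixin would walk the returned list in order and raise an exception only once every candidate has declined or failed.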
Resending a request, with the appropriate credentials, is one of the
more difficult portions of the authentication system. The difficulty
arises in recording what was sent originally: the request line, the
headers, and the body. By overriding ``putrequest()``, ``putheader()``, and
``endheaders()``, we can capture all but the body. Once
``endheaders()`` is called, we capture all calls to ``send()`` (until the next
``putrequest()`` call) to hold the body content. The mixin will have
a configurable limit for the amount of data to hold in this fashion
(e.g. only hold up to 100k of body content). Assuming that the entire
body has been stored, then we can resend the request with the
appropriate authentication information.
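
The recording scheme described above might look roughly like the sketch below. Everything here is illustrative: ``StubConnection`` is a stand-in that only logs what would be transmitted, and ``RecordRequest``, ``max_body``, and ``resend()`` are hypothetical names. A real connection class may also route header bytes through ``send()``, which this sketch does not attempt to handle.

```python
class StubConnection:
    """Stand-in for a connection class; it only logs what would be sent."""

    def __init__(self):
        self.wire = []

    def putrequest(self, method, url):
        self.wire.append(("request", method, url))

    def putheader(self, name, value):
        self.wire.append(("header", name, value))

    def endheaders(self):
        self.wire.append(("endheaders",))

    def send(self, data):
        self.wire.append(("data", data))


class RecordRequest:
    """Hypothetical mixin that remembers the last request for a later resend."""

    max_body = 100 * 1024  # assumed limit on buffered body bytes
    _in_body = False
    _body_complete = True

    def putrequest(self, method, url):
        # A new request starts: discard whatever was recorded before.
        self._request = (method, url)
        self._headers = []
        self._body = []
        self._body_size = 0
        self._in_body = False
        self._body_complete = True
        super().putrequest(method, url)

    def putheader(self, name, value):
        self._headers.append((name, value))
        super().putheader(name, value)

    def endheaders(self):
        super().endheaders()
        self._in_body = True  # from here on, send() carries body data

    def send(self, data):
        if self._in_body and self._body_complete:
            if self._body_size + len(data) > self.max_body:
                # Too large to hold: give up recording and free the buffer.
                self._body = []
                self._body_complete = False
            else:
                self._body.append(data)
                self._body_size += len(data)
        super().send(data)

    def resend(self, extra_headers=()):
        """Replay the recorded request; return False if the body was not kept."""
        if not self._body_complete:
            return False
        method, url = self._request
        headers, body = list(self._headers), list(self._body)
        self.putrequest(method, url)
        for name, value in headers + list(extra_headers):
            self.putheader(name, value)
        self.endheaders()
        for chunk in body:
            self.send(chunk)
        return True


class RecordingConnection(RecordRequest, StubConnection):
    """The mixin is listed first so that its overrides take precedence."""
```

The authentication headers computed by an authenticator would be supplied through ``extra_headers`` when the request is replayed.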
If the body is too large to be stored, then ``getresponse()``
simply returns the response object, indicating the 401 or 407
error. Since the authentication information has been computed and
cached (into the Credentials object; see below), the caller can simply
regenerate the request. The mixin will attach the appropriate
A "protection space" (see RFC 2617, section 1.2) is defined as a tuple
of the host, port, and authentication realm. When a request is
initially sent to an HTTP server, we do not know the authentication
realm (the realm is only returned when authentication fails). However,
we do have the path from the URL, and that can be useful in
determining the credentials to send to the server. The Basic
authentication scheme is typically set up hierarchically: the
credentials for ``/path`` can be tried for ``/path/subpath``. The
Digest authentication scheme has explicit support for the hierarchical
setup. The ``httpx.Credentials`` object will store credentials for
multiple protection spaces, and these can be looked up in two different
ways:
1. looked up using ``(host, port, path)`` -- this lookup scheme is
used when generating a request for a path where we don't know the
authentication realm.
2. looked up using ``(host, port, realm)`` -- this mechanism is used
during the authentication process when the server has specified that
the Request-URI resides within a specific realm.
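
A minimal sketch of a credentials store supporting both lookups is shown below. The method names (``add``, ``lookup_path``, ``lookup_realm``), the internal dictionaries, and the longest-prefix rule for path matching are assumptions made for illustration; the text above does not define the actual interface of ``httpx.Credentials``.

```python
def _covers(prefix, path):
    """True if path equals prefix or lies beneath it in the hierarchy."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


class Credentials:
    """Hypothetical store keyed by protection space, with a path index."""

    def __init__(self):
        self._by_realm = {}  # (host, port, realm) -> (username, password)
        self._by_path = {}   # (host, port) -> list of (path_prefix, realm)

    def add(self, host, port, realm, path, username, password):
        self._by_realm[(host, port, realm)] = (username, password)
        self._by_path.setdefault((host, port), []).append((path, realm))

    def lookup_realm(self, host, port, realm):
        """Used once the server has named the realm in a challenge."""
        return self._by_realm.get((host, port, realm))

    def lookup_path(self, host, port, path):
        """Used before the realm is known; the longest matching prefix wins."""
        best = None
        for prefix, realm in self._by_path.get((host, port), []):
            if _covers(prefix, path):
                if best is None or len(prefix) > len(best[0]):
                    best = (prefix, realm)
        if best is None:
            return None
        return self._by_realm[(host, port, best[1])]
```

With this layout, credentials registered for ``/path`` are found for ``/path/subpath`` but not for an unrelated path such as ``/pathology``.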
The ``HandleAuthentication`` mixin will override ``putrequest()`` to
automatically insert credentials, if available. The URL passed to
``putrequest()`` is used to determine the appropriate authentication
information to use.
It is also important to note that the mixin uses and stores two sets
of credentials: one for any proxy that may be used, and one for the
target origin server. Since proxies do not have
paths, the protection spaces in the proxy credentials will always use
"/" for storing and looking up via a path.
The ``httpx`` module will provide a mixin for using a proxy to perform
HTTP(S) operations. This mixin (``httpx.UseProxy``) can be combined
with the ``HTTPConnection`` and the ``HTTPSConnection`` classes (the
mixin may possibly work with the HTTP and HTTPS compatibility classes,
but that is not a requirement).
The mixin will record the ``(host, port)`` of the proxy to use. XXX
will be overridden to use this host/port combination for connections
and to rewrite request URLs into the absoluteURIs referring to the
origin server (these URIs are passed to the proxy server).
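
The URL rewriting mentioned above can be illustrated with a small helper. The function name ``absolute_uri`` and its parameters are hypothetical, and the sketch deliberately says nothing about which connection method would be overridden, since the text leaves that open. It assumes plain host names (no IPv6 literals).

```python
def absolute_uri(scheme, host, port, path):
    """Build the absoluteURI for an origin server, as sent to a proxy."""
    default_port = {"http": 80, "https": 443}.get(scheme)
    # Omit the port when it is unset or is the default for the scheme.
    if port is None or port == default_port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    if not path.startswith("/"):
        path = "/" + path
    return f"{scheme}://{netloc}{path}"
```

A request for ``/index.html`` on ``example.org`` port 8080 would then be sent to the proxy as ``http://example.org:8080/index.html``.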
Proxy authentication is handled by the ``httpx.HandleAuthentication``
class since a user may directly use ``HTTP(S)Connection`` to speak
The ``davlib`` module will provide a mixin for sending WebDAV requests
to a WebDAV-enabled server. This mixin (``davlib.DAVClient``) can be
combined with the ``HTTPConnection`` and the ``HTTPSConnection``
classes (the mixin may possibly work with the HTTP and HTTPS
compatibility classes, but that is not a requirement).
The mixin provides methods to perform the various HTTP methods defined
by HTTP in RFC 2616, and by WebDAV in RFC 2518.
A custom response object is used to decode ``207 (Multi-Status)``
responses. The response object will use the standard library's xml
package to parse the multistatus XML information, producing a simple
structure of objects to hold the multistatus data. Multiple parsing
schemes will be tried/used, in order of decreasing speed.
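
As a rough illustration of decoding a ``207 (Multi-Status)`` body, the sketch below uses ``xml.etree.ElementTree`` from the standard library. The function name ``parse_multistatus`` and the shape of the returned dictionaries are assumptions, and only a single parsing scheme is shown rather than the several that the text envisions.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree spelling of the "DAV:" XML namespace


def parse_multistatus(xml_text):
    """Turn a multistatus document into a list of simple dictionaries."""
    root = ET.fromstring(xml_text)
    if root.tag != DAV + "multistatus":
        raise ValueError("not a DAV: multistatus document")
    results = []
    for response in root.findall(DAV + "response"):
        propstats = []
        for propstat in response.findall(DAV + "propstat"):
            prop = propstat.find(DAV + "prop")
            names = [child.tag for child in prop] if prop is not None else []
            propstats.append((propstat.findtext(DAV + "status"), names))
        results.append({
            "href": response.findtext(DAV + "href"),
            "status": response.findtext(DAV + "status"),  # None if absent
            "propstats": propstats,
        })
    return results
```

Each entry keeps the resource's ``href`` together with the status line reported for each group of properties.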
The actual (future/final) implementation is being developed in the
``/nondist/sandbox/Lib`` directory, until it is accepted and moved
into the main Lib directory.
.. _WebDAV: http://www.webdav.org/
This document has been placed in the public domain.