Commits

Anonymous committed 638870a Merge

merged

Files changed (3)

+0.8.0
+    ProxyInfo objects now can construct themselves from environment
+    variables commonly-used in Unix environments. By default, the Http
+    class will construct a ProxyInfo instance based on these environment
+    variables. To achieve the previous behavior, where environment
+    variables are ignored, pass proxy_info=None to Http().
+
+    The following issues have been addressed:
+
+    Issue 159: automatic detection of proxy configuration.
+
 0.7.1
     Fix failure to install cacerts.txt for 2.x installs.
 
     Fixes issue 72. Always lowercase authorization header.
     Fix issue 47. Redirects that become a GET should not have a body.
     Fixes issue 19. Set Content-location on redirected HEAD requests
-    Fixes issue 139. Redirect with a GET on 302 regardless of the originating method. 
-    Fixes issue 138. Handle unicode in headers when writing and retrieving cache entries. Who says headers have to be ASCII! 
+    Fixes issue 139. Redirect with a GET on 302 regardless of the originating method.
+    Fixes issue 138. Handle unicode in headers when writing and retrieving cache entries. Who says headers have to be ASCII!
     Add certificate validation. Work initially started by Christoph Kern.
     Set a version number. Fixes issue # 135.
     Sync to latest version of socks.py
 
    The following issues have been addressed:
 
-    #51 - Failure to handle server legitimately closing connection before request body is fully sent	
-    #77 - Duplicated caching test	 
+    #51 - Failure to handle server legitimately closing connection before request body is fully sent
+    #77 - Duplicated caching test
     #65 - Transform _normalize_headers into a method of Http class
-    #45 - Vary header	  
-    #73 - All files in Mercurial are executable	 
-    #81 - Have a useful .hgignore	 	 
-    #78 - Add release tags to the Mercurial repository		 
-    #67 - HEAD requests cause next request to be retried	  
+    #45 - Vary header
+    #73 - All files in Mercurial are executable
+    #81 - Have a useful .hgignore
+    #78 - Add release tags to the Mercurial repository
+    #67 - HEAD requests cause next request to be retried
 
    Mostly bug fixes, the big enhancement is the addition of proper Vary: header
    handling. Thanks to Chris Dent for that change.
 
    Fixed the following bugs:
 
-      #12 - Cache-Control: only-if-cached incorrectly does request if item not in cache 
+      #12 - Cache-Control: only-if-cached incorrectly does request if item not in cache
       #39 - Deprecation warnings in Python 2.6
       #54 - Http.request fails accessing Google account via http proxy
       #56 - Block on response.read() for HEAD requests.
 
    Added support for proxies if the Socksipy module is installed.
 
-   Fixed bug with some HEAD responses having content-length set to 
+   Fixed bug with some HEAD responses having content-length set to
    zero incorrectly.
 
    Fixed most except's to catch a specific exception.
 
    Added 'connection_type' parameter to Http.request().
- 
+
    The default for 'force_exception_to_status_code' was changed to False. Defaulting
    to True was causing quite a bit of confusion.
 
 
    Many improvements to the file cache:
 
-     1.  The names in the cache are now much less 
+     1.  The names in the cache are now much less
          opaque, which should help with debugging.
 
-     2.  The disk cache is now Apache mod_asis compatible. 
-     
+     2.  The disk cache is now Apache mod_asis compatible.
+
      3.  A Content-Location: header is supplied and stored in the
          cache which points to the original requested URI.
 
    Http.add_credentials() now takes an optional domain to restrict
    the credentials to being only used on that domain.
 
-   Added Http.add_certificate() which allows setting 
+   Added Http.add_certificate() which allows setting
    a key and cert for SSL connections.
 
    Many other bugs fixed.
 0.2.0
    Added support for Google Auth.
 
-   Added experimental support for HMACDigest.  
+   Added experimental support for HMACDigest.
 
    Added support for a pluggable caching system. Now supports
    the old system of using the file system and now memcached.
 
-   Added httplib2.debuglevel which turns on debugging. 
+   Added httplib2.debuglevel which turns on debugging.
 
    Change Response._previous to Response.previous.
 
    Added Http.follow_all_redirects which forces
-   httplib2 to follow all redirects, as opposed to 
+   httplib2 to follow all redirects, as opposed to
    following only the safe redirects. This makes the
    GData protocol easier to use.
 
     4. Subsequent requests to resources that had timed out would raise an exception.
     And one feature request for 'method' to default to GET.
 
-    Xavier Verges Farrero supplied what I needed to make the 
+    Xavier Verges Farrero supplied what I needed to make the
     library work with Python 2.3.
 
     I added distutils based setup.py.
 
-0.1 Rev 86 
-    
+0.1 Rev 86
+
     Initial Release
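
The 0.8.0 entry says proxy settings are read from environment variables unless proxy_info=None is passed to Http(). As a rough, standalone sketch of that lookup (not the library code itself): the helper name proxy_url_from_env and its environ parameter are assumptions made for illustration, and the lowercase-then-uppercase order mirrors what the __init__.py change in this commit appears to do.

```python
import os


def proxy_url_from_env(method="http", environ=None):
    """Return the proxy URL configured for `method`, or None if unset.

    Hypothetical helper: it only illustrates the lookup order, it is not
    part of httplib2.
    """
    if environ is None:
        environ = os.environ
    # Only the http and https schemes are considered.
    if method not in ("http", "https"):
        return None
    name = method + "_proxy"
    # The lowercase variable is tried first, the uppercase one is the fallback.
    url = environ.get(name, environ.get(name.upper()))
    # An empty value is treated the same as an unset variable.
    return url or None
```

Passing a plain dict as environ keeps the sketch easy to try without touching the real process environment, for example proxy_url_from_env("https", {"HTTPS_PROXY": "http://proxy.local:81"}).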
 

python2/httplib2/__init__.py

 import zlib
 import httplib
 import urlparse
+import urllib
 import base64
 import os
 import copy
 import time
 import random
 import errno
-# remove depracated warning in python2.6
 try:
     from hashlib import sha1 as _sha, md5 as _md5
 except ImportError:
+    # prior to Python 2.5, these were separate modules
     import sha
     import md5
     _sha = sha.new
     name/password are mapped to key/cert."""
     pass
 
+class AllHosts(object): pass
 
 class ProxyInfo(object):
-  """Collect information required to use a proxy."""
-  def __init__(self, proxy_type, proxy_host, proxy_port, proxy_rdns=None, proxy_user=None, proxy_pass=None):
-      """The parameter proxy_type must be set to one of socks.PROXY_TYPE_XXX
-      constants. For example:
+    """Collect information required to use a proxy."""
+    bypass_hosts = ()
 
-p = ProxyInfo(proxy_type=socks.PROXY_TYPE_HTTP, proxy_host='localhost', proxy_port=8000)
-      """
-      self.proxy_type, self.proxy_host, self.proxy_port, self.proxy_rdns, self.proxy_user, self.proxy_pass = proxy_type, proxy_host, proxy_port, proxy_rdns, proxy_user, proxy_pass
+    def __init__(self, proxy_type, proxy_host, proxy_port,
+        proxy_rdns=None, proxy_user=None, proxy_pass=None):
+        """The parameter proxy_type must be set to one of socks.PROXY_TYPE_XXX
+        constants. For example:
 
-  def astuple(self):
-    return (self.proxy_type, self.proxy_host, self.proxy_port, self.proxy_rdns,
-        self.proxy_user, self.proxy_pass)
+        p = ProxyInfo(proxy_type=socks.PROXY_TYPE_HTTP,
+            proxy_host='localhost', proxy_port=8000)
+        """
+        self.proxy_type = proxy_type
+        self.proxy_host = proxy_host
+        self.proxy_port = proxy_port
+        self.proxy_rdns = proxy_rdns
+        self.proxy_user = proxy_user
+        self.proxy_pass = proxy_pass
 
-  def isgood(self):
-    return (self.proxy_host != None) and (self.proxy_port != None)
+    def astuple(self):
+        return (self.proxy_type, self.proxy_host, self.proxy_port,
+            self.proxy_rdns, self.proxy_user, self.proxy_pass)
+
+    def isgood(self):
+        return (self.proxy_host != None) and (self.proxy_port != None)
+
+    @classmethod
+    def from_environment(cls, method='http'):
+        """
+        Read proxy info from the environment variables.
+        """
+        if method not in ['http', 'https']: return
+
+        env_var = method+'_proxy'
+        url = os.environ.get(env_var, os.environ.get(env_var.upper()))
+        if not url: return
+        pi = cls.from_url(url, method)
+
+        no_proxy = os.environ.get('no_proxy', os.environ.get('NO_PROXY', ''))
+        bypass_hosts = no_proxy.split(',') if no_proxy else []
+        # special case, no_proxy=* means all hosts bypassed
+        if no_proxy == '*': bypass_hosts = AllHosts
+
+        pi.bypass_hosts = bypass_hosts
+        return pi
+
+    @classmethod
+    def from_url(cls, url, method='http'):
+        """
+        Construct a ProxyInfo from a URL (such as http_proxy env var)
+        """
+        url = urlparse.urlparse(url)
+        ident, sep, host_port = url.netloc.rpartition('@')
+        username, sep, password = ident.partition(':')
+        host, sep, port = host_port.partition(':')
+        if port:
+            port = int(port)
+        else:
+            port = dict(https=443, http=80)[method]
+        proxy_type = 3 # socks.PROXY_TYPE_HTTP
+        return cls(
+            proxy_type = proxy_type,
+            proxy_host = host,
+            proxy_port = port,
+            proxy_user = username or None,
+            proxy_pass = password or None,
+        )
+
+    def applies_to(self, hostname):
+        return not self.bypass_host(hostname)
+
+    def bypass_host(self, hostname):
+        """Has this host been excluded from the proxy config"""
+        return self.bypass_hosts is AllHosts or any(
+            hostname.endswith(domain)
+            for domain in self.bypass_hosts
+        )
 
 
 class HTTPConnectionWithTimeout(httplib.HTTPConnection):
 
 and more.
     """
-    def __init__(self, cache=None, timeout=None, proxy_info=None,
+    def __init__(self, cache=None, timeout=None,
+                 proxy_info=ProxyInfo.from_environment,
                  ca_certs=None, disable_ssl_certificate_validation=False):
         """
-        The value of proxy_info is a ProxyInfo instance.
-
         If 'cache' is a string then it is used as a directory name for
         a disk cache. Otherwise it must be an object that supports the
         same interface as FileCache.
         for example the docs of socket.setdefaulttimeout():
         http://docs.python.org/library/socket.html#socket.setdefaulttimeout
 
+        `proxy_info` may be:
+          - a callable that takes the http scheme ('http' or 'https') and
+            returns a ProxyInfo instance per request. By default, uses
+            ProxyInfo.from_environment.
+          - a ProxyInfo instance (static proxy config).
+          - None (proxy disabled).
+
         ca_certs is the path of a file containing root CA certificates for SSL
         server certificate validation.  By default, a CA cert file bundled with
         httplib2 is used.
                 scheme = 'https'
                 authority = domain_port[0]
 
+            proxy_info = self._get_proxy_info(scheme, authority)
+
             conn_key = scheme+":"+authority
             if conn_key in self.connections:
                 conn = self.connections[conn_key]
                         conn = self.connections[conn_key] = connection_type(
                                 authority, key_file=certs[0][0],
                                 cert_file=certs[0][1], timeout=self.timeout,
-                                proxy_info=self.proxy_info,
+                                proxy_info=proxy_info,
                                 ca_certs=self.ca_certs,
                                 disable_ssl_certificate_validation=
                                         self.disable_ssl_certificate_validation)
                     else:
                         conn = self.connections[conn_key] = connection_type(
                                 authority, timeout=self.timeout,
-                                proxy_info=self.proxy_info,
+                                proxy_info=proxy_info,
                                 ca_certs=self.ca_certs,
                                 disable_ssl_certificate_validation=
                                         self.disable_ssl_certificate_validation)
                 else:
                     conn = self.connections[conn_key] = connection_type(
                             authority, timeout=self.timeout,
-                            proxy_info=self.proxy_info)
+                            proxy_info=proxy_info)
                 conn.set_debuglevel(debuglevel)
 
             if 'range' not in headers and 'accept-encoding' not in headers:
 
         return (response, content)
 
+    def _get_proxy_info(self, scheme, authority):
+        """Return a ProxyInfo instance (or None) based on the scheme
+        and authority.
+        """
+        hostname, port = urllib.splitport(authority)
+        proxy_info = self.proxy_info
+        if callable(proxy_info):
+            proxy_info = proxy_info(scheme)
+
+        if (hasattr(proxy_info, 'applies_to')
+            and not proxy_info.applies_to(hostname)):
+            proxy_info = None
+        return proxy_info
 
 
 class Response(dict):
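
The _get_proxy_info hunk above accepts three kinds of proxy_info value: a callable, a static object, or None. The sketch below restates that dispatch as a standalone function so the three cases can be tried in isolation. The names resolve_proxy and StaticProxy are hypothetical stand-ins, not httplib2 API, and the code is written for a current Python rather than the Python 2 of the diff.

```python
def resolve_proxy(proxy_info, scheme, hostname):
    """Pick the proxy config for one request, or None for a direct connection."""
    # A callable is invoked per request and receives the scheme.
    if callable(proxy_info):
        proxy_info = proxy_info(scheme)
    # Objects that expose applies_to() can opt out for specific hosts.
    if hasattr(proxy_info, "applies_to") and not proxy_info.applies_to(hostname):
        return None
    return proxy_info


class StaticProxy:
    """Hypothetical stand-in for ProxyInfo, used only in this sketch."""

    def __init__(self, host, bypass=()):
        self.host = host
        self.bypass = tuple(bypass)

    def applies_to(self, hostname):
        # The proxy applies unless the hostname ends with a bypass entry.
        return not any(hostname.endswith(d) for d in self.bypass)
```

With proxy = StaticProxy("proxy.local", bypass=["localhost"]), resolve_proxy(proxy, "http", "localhost") gives None, while resolve_proxy(proxy, "http", "www.example.org") gives the proxy object back. Passing None as proxy_info always yields None, which matches the "proxy disabled" case in the docstring.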

python2/httplib2test.py

         end2end = httplib2._get_end2end_headers(response)
         self.assertEquals(0, len(end2end))
 
+
+class TestProxyInfo(unittest.TestCase):
+    def setUp(self):
+        self.orig_env = dict(os.environ)
+
+    def tearDown(self):
+        os.environ.clear()
+        os.environ.update(self.orig_env)
+
+    def test_from_url(self):
+        pi = httplib2.ProxyInfo.from_url('http://myproxy.example.com')
+        self.assertEquals(pi.proxy_host, 'myproxy.example.com')
+        self.assertEquals(pi.proxy_port, 80)
+        self.assertEquals(pi.proxy_user, None)
+
+    def test_from_url_ident(self):
+        pi = httplib2.ProxyInfo.from_url('http://zoidberg:fish@someproxy:99')
+        self.assertEquals(pi.proxy_host, 'someproxy')
+        self.assertEquals(pi.proxy_port, 99)
+        self.assertEquals(pi.proxy_user, 'zoidberg')
+        self.assertEquals(pi.proxy_pass, 'fish')
+
+    def test_from_env(self):
+        os.environ['http_proxy'] = 'http://myproxy.example.com:8080'
+        pi = httplib2.ProxyInfo.from_environment()
+        self.assertEquals(pi.proxy_host, 'myproxy.example.com')
+        self.assertEquals(pi.proxy_port, 8080)
+        self.assertEquals(pi.bypass_hosts, [])
+
+    def test_from_env_no_proxy(self):
+        os.environ['http_proxy'] = 'http://myproxy.example.com:80'
+        os.environ['https_proxy'] = 'http://myproxy.example.com:81'
+        os.environ['no_proxy'] = 'localhost,otherhost.domain.local'
+        pi = httplib2.ProxyInfo.from_environment('https')
+        self.assertEquals(pi.proxy_host, 'myproxy.example.com')
+        self.assertEquals(pi.proxy_port, 81)
+        self.assertEquals(pi.bypass_hosts, ['localhost',
+            'otherhost.domain.local'])
+
+    def test_from_env_none(self):
+        os.environ.clear()
+        pi = httplib2.ProxyInfo.from_environment()
+        self.assertEquals(pi, None)
+
+    def test_applies_to(self):
+        os.environ['http_proxy'] = 'http://myproxy.example.com:80'
+        os.environ['https_proxy'] = 'http://myproxy.example.com:81'
+        os.environ['no_proxy'] = 'localhost,otherhost.domain.local,example.com'
+        pi = httplib2.ProxyInfo.from_environment()
+        self.assertFalse(pi.applies_to('localhost'))
+        self.assertTrue(pi.applies_to('www.google.com'))
+        self.assertFalse(pi.applies_to('www.example.com'))
+
+    def test_no_proxy_star(self):
+        os.environ['http_proxy'] = 'http://myproxy.example.com:80'
+        os.environ['NO_PROXY'] = '*'
+        pi = httplib2.ProxyInfo.from_environment()
+        for host in ('localhost', '169.254.38.192', 'www.google.com'):
+            self.assertFalse(pi.applies_to(host))
+
+
 if __name__ == '__main__':
     unittest.main()
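
The tests above exercise the no_proxy handling: a comma-separated list, a special case for "*", and a suffix match against the hostname. A minimal standalone sketch of that behaviour follows. ALL_HOSTS, parse_no_proxy and is_bypassed are names invented here for illustration (ALL_HOSTS plays the role that the AllHosts class plays in the diff).

```python
# Sentinel meaning "every host bypasses the proxy" (assumed name).
ALL_HOSTS = object()


def parse_no_proxy(value):
    """Turn a no_proxy string into a list of suffixes or the ALL_HOSTS sentinel."""
    if value == "*":
        return ALL_HOSTS
    # Entries are split on commas only; surrounding whitespace is not stripped.
    return value.split(",") if value else []


def is_bypassed(hostname, bypass_hosts):
    """Return True if `hostname` should skip the proxy."""
    if bypass_hosts is ALL_HOSTS:
        return True
    # Plain suffix test, as in the diff: no check for a domain boundary.
    return any(hostname.endswith(domain) for domain in bypass_hosts)
```

Because the comparison is a bare endswith, an entry such as example.com also matches notexample.com. A stricter variant could require the match to start at a dot, but that would be a change in behaviour rather than a description of this commit.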