Issue #274 new

disparity in {metadata,resource}_listdir for EggMetadata/ZipProvider

wickman
created an issue

This is on distribute-0.6.24.

When using EggMetadata/ZipProviders, there is a disparity between how metadata and resources are handled. I've initialized a Distribution for async here:

{{{

!text

>>> async_dist
async 0.6.1 (/Users/wickman/clients/science/3rdparty/python/async-0.6.1-py2.6-macosx-10.6-x86_64.egg)

>>> async_dist.resource_listdir('/')
['async', 'EGG-INFO']

>>> async_dist.resource_listdir(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 1219, in resource_listdir
    return self._listdir(self._fn(self.module_path,resource_name))
  File "/Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 1476, in _listdir
    return list(self._index().get(self._zipinfo_name(fspath), ()))
  File "/Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 1357, in _zipinfo_name
    "%s is not a subpath of %s" % (fspath,self.zip_pre)
AssertionError: /Users/wickman/clients/science/3rdparty/python/async-0.6.1-py2.6-macosx-10.6-x86_64.egg is not a subpath of /Users/wickman/clients/science/3rdparty/python/async-0.6.1-py2.6-macosx-10.6-x86_64.egg/

>>> async_dist.metadata_listdir('/')
[]

>>> async_dist.metadata_listdir(None)
['native_libs.txt', 'dependency_links.txt', 'PKG-INFO', 'SOURCES.txt', 'not-zip-safe', 'top_level.txt']

}}}

Rather than using standard os.path.relpath and the like (I guess because we can't rely upon os.sep) there is a rather buggy implementation in _zipinfo_name on ZipProvider:

{{{

!python

def _zipinfo_name(self, fspath):
    # Convert a virtual filename (full path to file) into a zipfile subpath
    # usable with the zipimport directory cache for our target archive
    if fspath.startswith(self.zip_pre):
        return fspath[len(self.zip_pre):]
    raise AssertionError(
        "%s is not a subpath of %s" % (fspath,self.zip_pre)
    )

}}}

Unfortunately the paths passed to this by EggMetadata._fn contain trailing slashes. _zipinfo_name does not strip them but the self._dir_index does.
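
For illustration, the prefix mismatch can be reproduced without pkg_resources at all. The sketch below is stand-alone: `zipinfo_name_strict` copies the check quoted above, `zipinfo_name_tolerant` is a hypothetical variant (not anything in distribute), and the `/tmp/example.egg` path is made up. As in the traceback, the failing path is the egg root without the trailing separator that `zip_pre` carries.

```python
def zipinfo_name_strict(fspath, zip_pre):
    # Same prefix check as the quoted _zipinfo_name, as a free function.
    if fspath.startswith(zip_pre):
        return fspath[len(zip_pre):]
    raise AssertionError("%s is not a subpath of %s" % (fspath, zip_pre))


def zipinfo_name_tolerant(fspath, zip_pre):
    # Hypothetical variant: compare against the prefix without its trailing
    # slash, and treat the archive root itself as the empty subpath.
    root = zip_pre.rstrip('/')
    if fspath.rstrip('/') == root:
        return ''
    if fspath.startswith(root + '/'):
        # Drop the prefix and any trailing slash on the subpath.
        return fspath[len(root) + 1:].rstrip('/')
    raise AssertionError("%s is not a subpath of %s" % (fspath, zip_pre))


zip_pre = '/tmp/example.egg/'  # made-up archive prefix, trailing slash included

try:
    zipinfo_name_strict('/tmp/example.egg', zip_pre)
except AssertionError as exc:
    print('strict check fails: %s' % exc)

print(repr(zipinfo_name_tolerant('/tmp/example.egg', zip_pre)))
print(repr(zipinfo_name_tolerant('/tmp/example.egg/EGG-INFO/', zip_pre)))
```

The tolerant variant only assumes '/'-separated paths, which is the same assumption the monkeypatch below makes.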

I've been using the following code to monkeypatch pkg_resources in production, but it'd be great if this would make mainline:

{{{

!python

def monkeypatch_pkg_resources():
  import pkg_resources

  _EggMetadata = pkg_resources.EggMetadata

  def normalized_elements(path):
    # Split on '/' and drop trailing empty or '.' components.
    path_split = path.split('/')
    # Guard against an empty list so paths such as '/' do not raise IndexError.
    while path_split and path_split[-1] in ('', '.'):
      path_split.pop(-1)
    return path_split

  class EggMetadata(_EggMetadata):
    def __init__(self, *args, **kw):
      _EggMetadata.__init__(self, *args, **kw)

    def _fn(self, base, resource_name):
      return '/'.join(normalized_elements(_EggMetadata._fn(self, base, resource_name)))

    def _zipinfo_name(self, fspath):
      # Compare path components rather than raw string prefixes.
      fspath_elements = normalized_elements(fspath)
      zip_pre = normalized_elements(self.zip_pre)
      if fspath_elements[:len(zip_pre)] == zip_pre:
        return '/'.join(fspath_elements[len(zip_pre):])
      raise AssertionError(
        "%s is not a subpath of %s" % (fspath, self.zip_pre)
      )

  pkg_resources.EggMetadata = EggMetadata

}}}

Comments (1)

  1. wickman reporter

    It's worthwhile to note other disparities, e.g. resource_listdir when backed by a zip .egg vs exploded egg:

    dir:

    >>> ds[0].resource_listdir('/async/AUTHORS')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 1219, in resource_listdir
        return self._listdir(self._fn(self.module_path,resource_name))
      File "/Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 1310, in _listdir
        return os.listdir(path)
    OSError: [Errno 20] Not a directory: '/private/tmp/async.XXXXXX/async/AUTHORS'
    

    zip:

    >>> ds[3].resource_listdir('/async/AUTHORS')
    []
    

    Either both should swallow the exception and return [], or both should raise something like a ResourceNotADirectoryError(Exception).

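
The two consistent behaviours suggested in the comment above could look roughly like this. It is a filesystem-only sketch, not pkg_resources code: `ResourceNotADirectoryError` is the hypothetical name from the comment, and `listdir_or_empty` and `listdir_strict` are made-up helper names.

```python
import os


class ResourceNotADirectoryError(Exception):
    """Hypothetical error type, named after the suggestion in the comment."""


def listdir_or_empty(path):
    # Option 1: swallow the error so a non-directory lists as empty,
    # matching what the zip-backed provider returns today.
    try:
        return os.listdir(path)
    except OSError:
        return []


def listdir_strict(path):
    # Option 2: raise one dedicated error type for non-directories.
    if not os.path.isdir(path):
        raise ResourceNotADirectoryError(path)
    return os.listdir(path)
```

Either helper would have to be applied to both the directory-backed and the zip-backed provider for the disparity to go away; which of the two is preferable is left open here.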