Markus Zapke-Gründemann / hg-importfs

Commits

Markus Zapke-Gründemann  committed 11b3992

Added new option --exclude-path to exclude a path relative to SOURCE.

  • Parent commits f1dcab3
  • Branches default

Files changed (3)

File CHANGELOG

 
 - Symlinks are no longer dereferenced on Linux.
 - Renamed --exclude option to --exclude-pattern.
+- Added new option --exclude-path to exclude a path relative to SOURCE.
 
 1.0.1
 =====

File importfs.py

                 onerror(os.remove, path)
 
 
-def copyfiles(src, dst, exclude_pattern=None, ignore_errors=False):
-    """Copy recursively from src to dst directory.
+def smartcopy(src, dst, node, ignore=None, exclude_path=None, no_errors=False):
+    """Copies a node (recursively) from src to dst directory.
 
-    If src is a file copy only the file.
+    If node is a file only the file is copied.
 
-    If src is a directory copy recursively the entire directory tree. The
-    destination directory, named by dst, must not already exist; it will be
-    created as well as missing parent directories.
+    If node is a directory the entire directory tree is copied recursively. The
+    destination directory must not already exist; it will be created as well as
+    missing parent directories.
 
-    All files and directories matching exclude_pattern will be ignored. The
-    pattern ['*.pyc', 'tmp*'] will copy everything except .pyc files and files
-    or directories whose name starts with tmp.
+    If ignore is given, it must be a callable that will receive as its
+    arguments the directory being visited by smartcopy(), and a list of its
+    contents, as returned by os.listdir(). Since smartcopy() is called
+    recursively, the ignore callable will be called once for each directory
+    that is copied. The callable must return a sequence of directory and file
+    names relative to the current directory (i.e. a subset of the items in its
+    second argument); these names will then be ignored in the copy process.
+    shutil.ignore_patterns() can be used to create such a callable that ignores
+    names based on glob-style patterns.
 
-    If ignore_errors is True all errors will be returned as a list of warnings.
+    If given, exclude_path must be a list of paths relative to src (no leading
+    slash!). All paths exactly matching an element of exclude_path will be
+    ignored.
+
+    If no_errors is True all errors will be returned as a list of warnings.
     """
+    srcpath = os.path.join(src, node)
+    dstpath = os.path.join(dst, node)
     warnings = []
+    # Use exclude options.
+    if exclude_path and node in exclude_path:
+        # Terminate immediately if node is in the exclude path list.
+        return warnings
+    if ignore is not None:
+        if len(ignore(src, (node,))) > 0:
+            # Terminate immediately if the node matches the exclude pattern.
+            return warnings
+    # Either create a symlink or copy the files recursively.
+    symlinks = True
+    if os.name == 'nt':
+        # Dereference symlinks for Windows.
+        symlinks = False
+    if symlinks and os.path.islink(srcpath):
+        linkto = os.readlink(srcpath)
+        os.symlink(linkto, dstpath)
+    else:
+        try:
+            if os.path.isdir(srcpath):
+                copytree(srcpath, dstpath, symlinks, ignore, exclude_path)
+            else:
+                shutil.copy(srcpath, dstpath)
+        except shutil.Error, err:
+            if not no_errors:
+                raise
+            warnings.append(err)
+    return warnings
+
+
+def get_common_suffix(path1, path2):
+    """Takes two paths and returns the common suffix.
+
+    If the common suffix has a leading directory separator it's removed.
+    """
+    common_chars = []
+    path1 = list(path1)
+    path2 = list(path2)
+    while path1 and path2:
+        char1 = path1.pop()
+        char2 = path2.pop()
+        if char1 == char2:
+            common_chars.insert(0, char1)
+        else:
+            break
+    common_suffix = ''.join(common_chars).lstrip(os.sep)
+    return common_suffix
+
+
+def copytree(src, dst, symlinks=False, ignore=None, exclude_path=None):
+    """Recursively copy an entire directory tree rooted at src. The destination
+    directory, named by dst, must not already exist; it will be created as well
+    as missing parent directories. Permissions and times of directories are
+    copied with copystat(), individual files are copied using copy2().
+
+    If symlinks is true, symbolic links in the source tree are represented as
+    symbolic links in the new tree; if false or omitted, the contents of the
+    linked files are copied to the new tree.
+
+    If ignore is given, it must be a callable that will receive as its
+    arguments the directory being visited by copytree(), and a list of its
+    contents, as returned by os.listdir(). Since copytree() is called
+    recursively, the ignore callable will be called once for each directory
+    that is copied. The callable must return a sequence of directory and file
+    names relative to the current directory (i.e. a subset of the items in its
+    second argument); these names will then be ignored in the copy process.
+    ignore_patterns() can be used to create such a callable that ignores names
+    based on glob-style patterns.
+
+    If the suffix of a file or directory in src matches a string in the
+    exclude_path list it's not copied.
+
+    If exception(s) occur, an Error is raised with a list of reasons.
+    """
+    names = os.listdir(src)
+    if ignore is not None:
+        ignored_names = ignore(src, names)
+    else:
+        ignored_names = set()
+
+    os.makedirs(dst)
+    errors = []
+    for name in names:
+        if name in ignored_names:
+            continue
+        srcname = os.path.join(src, name)
+        dstname = os.path.join(dst, name)
+        if exclude_path is not None:
+            common_suffix = get_common_suffix(srcname, dstname)
+            if common_suffix in exclude_path:
+                continue
+        try:
+            if symlinks and os.path.islink(srcname):
+                linkto = os.readlink(srcname)
+                os.symlink(linkto, dstname)
+            elif os.path.isdir(srcname):
+                copytree(srcname, dstname, symlinks, ignore, exclude_path)
+            else:
+                shutil.copy2(srcname, dstname)
+        except (IOError, os.error), why:
+            errors.append((srcname, dstname, str(why)))
+        # catch the Error from the recursive copytree so that we can
+        # continue with other files
+        except shutil.Error, err:
+            errors.extend(err.args[0])
     try:
-        if exclude_pattern:
-            ignore = shutil.ignore_patterns(*exclude_pattern)
-            if len(ignore(src, os.path.split(src))):
-                # Terminate immediately if the src path matches the exclude
-                # pattern.
-                return warnings
-        else:
-            ignore = None
-        symlinks = True
-        if os.name == 'nt':
-            # Do only dereference symlinks for Windows.
-            symlinks = False
-        if os.path.isdir(src):
-            shutil.copytree(src, dst, symlinks=symlinks, ignore=ignore)
-        else:
-            shutil.copy(src, dst)
-    except shutil.Error, err:
-        if not ignore_errors:
-            raise
-        warnings.append(err)
-    return warnings
+        shutil.copystat(src, dst)
+    except WindowsError:
+        # can't copy file access times on Windows
+        pass
+    except OSError, why:
+        errors.append((src, dst, str(why)))
+    if errors:
+        raise shutil.Error(errors)
 
 
 def importfs(ui, repo, source, *pats, **opts):
     and files or directories whose name starts with tmp.
 
     $ hg importfs repo source --exclude-pattern *.pyc --exclude-pattern tmp
+
+    The --exclude-path option takes an exact path as value. The result is that
+    the files in the path are not imported. The option can be used several
+    times for different paths. The path is specified relative to SOURCE.
     """
     sources = []
     for node in (source,) + pats:
         path = util.expandpath(node)
         if not os.path.exists(path):
             raise Abort(_('directory %s does not exist') % path)
-        sources.append(util.expandpath(path))
+        sources.append(path)
     repo = get_repo(ui, repo)
     update_repo(ui, repo, opts.get('rev'), opts.get('branch'))
     purge_repo(repo)
+    # Prepare exclusion rules.
+    exclude_path = opts.get('exclude_path')
+    exclude_pattern = opts.get('exclude_pattern')
+    if exclude_pattern:
+        ignore = shutil.ignore_patterns(*exclude_pattern)
+    else:
+        ignore = None
     # Copy all files into the repository.
     for sourcepath in sources:
         for node in os.listdir(sourcepath):
-            src = os.path.join(sourcepath, node)
-            dst = os.path.join(repo.root, node)
-            if os.name != 'nt' and os.path.islink(src):
-                # Only reference symlinks for Linux.
-                linkto = os.readlink(src)
-                os.symlink(linkto, dst)
-            else:
-                warnings = copyfiles(src, dst, opts.get('exclude_pattern'),
-                    opts.get('ignore_copy_errors'))
-                if len(warnings) == 0:
-                    continue
-                # Print warnings. This will only happen if copyfiles was called
-                # with ignore_errors = True.
-                for warning in warnings:
-                    ui.write('Warning: Failed to copy %s to %s (%s).\n' %
-                        warning.args[0][0])
+            warnings = smartcopy(sourcepath, repo.root, node, ignore,
+                exclude_path, opts.get('ignore_copy_errors'))
+            if len(warnings) == 0:
+                continue
+            # Print warnings. This will only happen if smartcopy was called
+            # with the ignore_copy_errors option set to True.
+            for warning in warnings:
+                ui.write('Warning: Failed to copy %s to %s (%s).\n' %
+                    warning.args[0][0])
     commands.addremove(ui, repo, similarity=opts.get('similarity'))
     message = opts.get('message') or 'importfs commit.'
     commands.commit(ui, repo, message=message)
         'tag string is used.'), _('TEXT')),
     ('t', 'tag', '', _('The tag for the resulting revision. If omitted the '
         'revision is not tagged.'), _('NAME')),
-    ('', 'exclude-pattern', list(), _('Exclude all files matching the given pattern.'),
-        _('PATTERN')),
+    ('', 'exclude-pattern', list(),
+        _('Exclude all files matching the given pattern.'), _('PATTERN')),
+    ('', 'exclude-path', list(),
+        _('Exclude the exact path relative to SOURCE.'), _('PATH')),
     ('', 'ignore-copy-errors', None, _('Turn all errors during the file copy '
         'operation into warnings.'))],
     '[OPTION]... REPO SOURCE...')
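
The character-wise comparison in get_common_suffix() can be tried in isolation. The sketch below is a Python 3 port (the commit itself targets Python 2) with the loop bounded by both paths. It also adds a hypothetical component-wise variant, common_suffix_components(), which is not part of the commit. The variant assumes that the source and repository roots may share trailing characters, in which case the character-wise suffix would pick up part of the root name and no longer match an --exclude-path value.

```python
import os


def get_common_suffix(path1, path2):
    """Character-wise common suffix, ported from the commit to Python 3."""
    common_chars = []
    chars1, chars2 = list(path1), list(path2)
    # Stop as soon as either path is exhausted so pop() never fails.
    while chars1 and chars2:
        char1, char2 = chars1.pop(), chars2.pop()
        if char1 != char2:
            break
        common_chars.insert(0, char1)
    return ''.join(common_chars).lstrip(os.sep)


def common_suffix_components(path1, path2):
    """Hypothetical alternative: compare whole path components."""
    parts1 = path1.split(os.sep)
    parts2 = path2.split(os.sep)
    common = []
    for part1, part2 in zip(reversed(parts1), reversed(parts2)):
        if part1 != part2:
            break
        common.insert(0, part1)
    return os.sep.join(common).lstrip(os.sep)


if __name__ == '__main__':
    src = os.path.join('tmp', 'proj', 'src', 'spam')
    dst = os.path.join('tmp', 'myproj', 'src', 'spam')
    print(get_common_suffix(src, dst))
    print(common_suffix_components(src, dst))
```

With a source root ending in proj and a repository root ending in myproj, the character-wise version keeps the shared "proj" characters in the suffix, while the component-wise variant stops at the differing root component. Neither approach helps when both roots end in the same component.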

File test-importfs-copytree.t

   $ echo "[extensions]" >> $HGRCPATH
   $ echo "importfs = $TESTDIR/importfs.py" >> $HGRCPATH
 
-Create a simple filesystem structure for import:
+Create a simple file system structure for import:
 
   $ mkdir d1
   $ echo "c1" > d1/f1
   adding f 2
   adding f1
 
+Create a file system structure where a directory name appears twice:
+
+  $ mkdir -p proj/doc/{eggs,spam}
+  $ mkdir -p proj/src/{eggs,spam}
+  $ echo Index > proj/doc/index
+  $ echo "Index eggs" > proj/doc/eggs/index
+  $ echo "Index spam" > proj/doc/spam/index
+  $ echo "1 + 1 = 2" > proj/src/eggs/calc
+  $ echo "1 + 1 = 2" > proj/src/spam/calc
+
+Perform an import ignoring the proj/src/spam directory:
+
+  $ hg importfs r4 proj --exclude-path src/spam
+  created repository $TESTTMP/r4
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  adding doc/eggs/index
+  adding doc/index
+  adding doc/spam/index
+  adding src/eggs/calc
+
+Perform an import ignoring the proj/doc/eggs and proj/src/spam directories:
+
+  $ hg importfs r5 proj --exclude-path doc/eggs --exclude-path src/spam
+  created repository $TESTTMP/r5
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  adding doc/index
+  adding doc/spam/index
+  adding src/eggs/calc
+
+See if the --exclude-path option also works at the top level:
+
+  $ hg importfs r7 proj/doc --exclude-path eggs
+  created repository $TESTTMP/r7
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  adding index
+  adding spam/index
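
The behaviour these tests expect can be approximated outside Mercurial with the standard library alone. The following is a minimal Python 3 sketch, not the extension's implementation: make_exclude_path_ignore() and demo_import() are hypothetical helpers. The first builds an ignore callable for shutil.copytree() from paths given relative to the source root; the second recreates the proj tree from the test in a temporary directory and copies it.

```python
import os
import shutil
import tempfile


def make_exclude_path_ignore(root, exclude_paths):
    """Return an ignore callable that skips exact paths relative to root."""
    excluded = {os.path.normpath(path) for path in exclude_paths}

    def ignore(directory, names):
        rel_dir = os.path.relpath(directory, root)
        ignored = []
        for name in names:
            # Normalise so that './src' and 'src' compare equal.
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if rel in excluded:
                ignored.append(name)
        return ignored

    return ignore


def list_files(root):
    """List all files below root as sorted, slash-separated relative paths."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            rel = os.path.relpath(os.path.join(dirpath, filename), root)
            found.append(rel.replace(os.sep, '/'))
    return sorted(found)


def demo_import(exclude_paths):
    """Copy a sample tree shaped like the test's proj directory."""
    sample = ('doc/index', 'doc/eggs/index', 'doc/spam/index',
              'src/eggs/calc', 'src/spam/calc')
    with tempfile.TemporaryDirectory() as tmp:
        proj = os.path.join(tmp, 'proj')
        for rel in sample:
            path = os.path.join(proj, *rel.split('/'))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, 'w') as handle:
                handle.write(rel + '\n')
        target = os.path.join(tmp, 'copy')
        ignore = make_exclude_path_ignore(proj, exclude_paths)
        shutil.copytree(proj, target, ignore=ignore)
        return list_files(target)


if __name__ == '__main__':
    for path in demo_import(['src/spam']):
        print('adding', path)
```

Because the callable works on paths relative to the source root, excluding src/spam leaves doc/spam untouched, which is the distinction the tests with the duplicated directory names check.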