Commits

Pedro Silva  committed 2bbfd6e

Initial import of 'duplicates' plugin

  • Parent commits 7b452fd

Files changed (4)

File beetsplug/duplicates.py

+# This file is part of beets.
+# Copyright 2013, Pedro Silva.
+#
+# Permission is hereby granted, free of charge, to any person obtaining
+# a copy of this software and associated documentation files (the
+# "Software"), to deal in the Software without restriction, including
+# without limitation the rights to use, copy, modify, merge, publish,
+# distribute, sublicense, and/or sell copies of the Software, and to
+# permit persons to whom the Software is furnished to do so, subject to
+# the following conditions:
+#
+# The above copyright notice and this permission notice shall be
+# included in all copies or substantial portions of the Software.
+
+"""List duplicate tracks or albums.
+"""
+import logging
+
+from beets.plugins import BeetsPlugin
+from beets.ui import decargs, print_obj, Subcommand
+
+PLUGIN = 'duplicates'
+log = logging.getLogger('beets')
+
+
+def _counts(items):
+    """Group ITEMS into lists keyed by MusicBrainz track (or album) ID.
+    """
+    import collections
+    counts = collections.defaultdict(list)
+    for item in items:
+        item_id = getattr(item, 'mb_trackid', item.mb_albumid)
+        counts[item_id].append(item)
+    return counts
+
+
+def _duplicates(items, full):
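+    """Yield (id, count, items) for each ID shared by several ITEMS.
+
+    Unless FULL is true, the first item of each group is left out.
+    """
+    counts = _counts(items)
+    offset = 0 if full else 1
+    for item_id, group in counts.iteritems():
+        if len(group) > 1:
+            yield (item_id, len(group) - offset, group[offset:])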
+
+
+class DuplicatesPlugin(BeetsPlugin):
+    """List duplicate tracks or albums
+    """
+    def __init__(self):
+        super(DuplicatesPlugin, self).__init__()
+
+        self.config.add({'format': ''})
+        self.config.add({'count': False})
+        self.config.add({'album': False})
+        self.config.add({'full': False})
+
+        self._command = Subcommand('duplicates',
+                                   help=__doc__,
+                                   aliases=['dup'])
+
+        self._command.parser.add_option('-f', '--format', dest='format',
+                                        action='store', type='string',
+                                        help='print with custom FORMAT',
+                                        metavar='FORMAT')
+
+        self._command.parser.add_option('-c', '--count', dest='count',
+                                        action='store_true',
+                                        help='count duplicate tracks or '
+                                             'albums')
+
+        self._command.parser.add_option('-a', '--album', dest='album',
+                                        action='store_true',
+                                        help='show duplicate albums instead '
+                                             'of tracks')
+
+        self._command.parser.add_option('-F', '--full', dest='full',
+                                        action='store_true',
+                                        help='show all versions of duplicate '
+                                             'tracks or albums')
+
+    def commands(self):
+        def _dup(lib, opts, args):
+            self.config.set_args(opts)
+            fmt = self.config['format'].get()
+            count = self.config['count'].get()
+            album = self.config['album'].get()
+            full = self.config['full'].get()
+
+            if album:
+                items = lib.albums(decargs(args))
+            else:
+                items = lib.items(decargs(args))
+
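+            for obj_id, obj_count, objs in _duplicates(items, full):
+                # Build the format per group so the count suffix never
+                # leaks into the next iteration.
+                out_fmt = fmt
+                if count:
+                    if not out_fmt:
+                        if album:
+                            out_fmt = '$albumartist - $album'
+                        else:
+                            out_fmt = '$albumartist - $album - $title'
+                    # Append the count directly; calling str.format() on
+                    # the whole template would choke on literal braces.
+                    out_fmt += ': {0}'.format(obj_count)
+                for o in objs:
+                    print_obj(o, lib, fmt=out_fmt)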
+
+        self._command.func = _dup
+        return [self._command]
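
Reviewer note (not part of the commit): the grouping logic in `_counts` and `_duplicates` can be exercised without a beets library. The following is a minimal sketch under stated assumptions: `Track` is a hypothetical stand-in for a beets item, the helper names are illustrative, and the code targets Python 3, so `dict.items()` takes the place of `iteritems()`.

```python
import collections

# Hypothetical stand-in for a beets item; only the fields used here.
Track = collections.namedtuple('Track', ['title', 'mb_trackid'])


def group_by_id(tracks):
    """Group tracks into lists keyed by their MusicBrainz track ID."""
    groups = collections.defaultdict(list)
    for track in tracks:
        groups[track.mb_trackid].append(track)
    return groups


def duplicates(tracks, full):
    """Yield (id, count, tracks) for every ID seen more than once.

    With full=False the first track of each group is treated as the
    original and skipped; with full=True every track is reported.
    """
    offset = 0 if full else 1
    for track_id, group in group_by_id(tracks).items():
        if len(group) > 1:
            yield track_id, len(group) - offset, group[offset:]


tracks = [
    Track('Song A', 'id-1'),
    Track('Song A (copy)', 'id-1'),
    Track('Song B', 'id-2'),
]

# Default mode reports only the extra copy; full mode reports both.
default_result = list(duplicates(tracks, full=False))
full_result = list(duplicates(tracks, full=True))
```

In this sketch "Song B" never appears in either result because its ID occurs only once, which mirrors the `len(items) > 1` check in the plugin.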

File docs/changelog.rst

 1.1.1 (in development)
 ----------------------
 
+* New :doc:`/plugins/duplicates`: Find tracks or albums in your
+  library that are **duplicated**.
 * New :doc:`/plugins/missing`: Find albums in your library that are **missing
   tracks**. Thanks to Pedro Silva.
 * Your library now keeps track of **when music was added** to it. The new

File docs/plugins/duplicates.rst

+Duplicates Plugin
+=================
+
+This plugin adds a new command, ``duplicates`` or ``dup``, which finds
+and lists duplicate tracks or albums in your collection.
+
+Installation
+------------
+
+Enable the plugin by putting ``duplicates`` on the ``plugins`` line in
+your :doc:`config file </reference/config>`::
+
+    plugins:
+        duplicates
+        ...
+
+Configuration
+-------------
+
+By default, the ``beet duplicates`` command lists the names of tracks
+in your library that are duplicates. It assumes that MusicBrainz track
+and album IDs are unique to each track or album. That is, it lists
+every track or album whose ID has already been seen in the library.
+
+You can customize the output format, count the number of duplicate
+tracks or albums, and list all tracks that have duplicates or just the
+duplicates themselves. These options can either be specified in the
+config file::
+
+    duplicates:
+        format: $albumartist - $album - $title
+        count: no
+        album: no
+        full: no
+
+or on the command line::
+
+    -f FORMAT, --format=FORMAT
+                          print with custom FORMAT
+    -c, --count           count duplicate tracks or
+                          albums
+    -a, --album           show duplicate albums instead
+                          of tracks
+    -F, --full            show all versions of duplicate
+                          tracks or albums
+
+format
+~~~~~~
+
+The ``format`` option (default: :ref:`list_format_item`) lets you
+specify the format used to print each track or album. This uses the
+same template syntax as beets’ :doc:`path formats
+</reference/pathformat>`.  The usage is inspired by, and therefore
+similar to, the :ref:`list <list-cmd>` command.
+
+count
+~~~~~
+
+The ``count`` option (default: false) appends a count of duplicates
+to each printed track or album. If no ``format`` is set, the output
+format defaults to ``$albumartist - $album - $title: $count``, or to
+``$albumartist - $album: $count`` with the ``-a`` option.
+
+album
+~~~~~
+
+The ``album`` option (default: false) lists duplicate albums instead
+of tracks.
+
+full
+~~~~
+
+The ``full`` option (default: false) lists every track or album that
+has duplicates, not just the duplicates themselves.
+
+Examples
+--------
+
+List all duplicate tracks in your collection::
+
+  beet duplicates
+
+List all duplicate tracks from 2008::
+
+  beet duplicates year:2008
+
+Print out a Unicode histogram of duplicate track years using `spark`_::
+
+  beet duplicates -f '$year' | spark
+  ▆▁▆█▄▇▇▄▇▇▁█▇▆▇▂▄█▁██▂█▁▁██▁█▂▇▆▂▇█▇▇█▆▆▇█▇█▇▆██▂▇
+
+Print out a listing of all duplicate albums, with their respective counts::
+
+  beet duplicates -ac
+
+The same as the above, but include the original albums and show their paths::
+
+  beet duplicates -acF -f '$path'
+
+
+TODO
+----
+
+- Allow deleting duplicates.
+
+.. _spark: https://github.com/holman/spark
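
Reviewer note (not part of the commit): the interaction between the ``format`` and ``count`` options described above can be sketched in isolation. This is an illustration under assumptions, not the plugin's code: `build_format` is a hypothetical helper, and Python's `string.Template` stands in for beets' own template engine, which supports more syntax.

```python
from string import Template

DEFAULT_ITEM_FORMAT = '$albumartist - $album - $title'
DEFAULT_ALBUM_FORMAT = '$albumartist - $album'


def build_format(fmt, count, album, obj_count):
    """Return the template used to print one duplicate.

    Hypothetical helper: when counting, fall back to a default
    template if none was given, then append the count as a suffix.
    """
    if count:
        if not fmt:
            fmt = DEFAULT_ALBUM_FORMAT if album else DEFAULT_ITEM_FORMAT
        fmt += ': {0}'.format(obj_count)
    return fmt


# Made-up field values, for illustration only.
fields = {'albumartist': 'Artist', 'album': 'Album', 'title': 'Title'}

# string.Template is only a stand-in for beets' template engine.
line = Template(build_format('', True, False, 2)).substitute(fields)
print(line)
```

A user-supplied format such as ``$path`` is kept as given and only gains the count suffix, so `build_format('$path', True, True, 1)` yields `$path: 1` in this sketch.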

File docs/plugins/index.rst

    smartplaylist
    mbsync
    missing
-   
+   duplicates
+
 Autotagger Extensions
 ''''''''''''''''''''''
 
   a different directory.
 * :doc:`info`: Print music files' tags to the console.
 * :doc:`missing`: List missing tracks.
-  
+* :doc:`duplicates`: List duplicate tracks or albums.
+
 .. _MPD: http://mpd.wikia.com/
 .. _MPD clients: http://mpd.wikia.com/wiki/Clients
 
 
 .. toctree::
     :hidden:
-    
+
     writing