Commits

Jonathan Eunice committed 5c534ec

major rewrite successful; now a meta-metaclass aka metaclass factory

Files changed (4)

-Classes that use this caching metaclass will have their instances
-automatically cached based on instantiation-time arguments (i.e. to ``__init__``).
-Useful for not repetitively creating expensive-to-create objects,
-and for making sure that objects are created only once.
+A quick way to make Python classes automatically memoize (a.k.a. cache) their
+instances based on the arguments they are instantiated with (i.e. args to
+``__init__``). It's a simple way to avoid repetitively creating
+expensive-to-create objects, and to make sure objects are created only once.
 
 Usage
 =====
 
-Say you have a class ``Thing`` that requires expensive computation to
-create, or that should be created only once. You can make that happen
-simply by adding one line to its definition::
+Say you have a class ``Thing`` that requires expensive computation to create, or
+that should be created only once. In Python 2.x you can make that happen by
+adding one line to its definition::
 
     from mementos import MementoMetaclass
 
     class Thing(object):
         
-        __metaclass__ = MementoMetaclass
+        __metaclass__ = MementoMetaclass  # now I'm memoized!
         
         def __init__(self, name):
             self.name = name
 Python 3
 ========
 
-``mementos`` works in Python 3 just as in Python 2, but with different
-syntax. Instead of the
-double-underscore class attribute assignment, Python 3 uses a keyword argument
-at class creation time::
+``mementos`` works in Python 3 just as in Python 2, but with different syntax.
+Instead of the double-underscore class attribute assignment, Python 3 uses a
+keyword argument at class creation::
 
     class Thing3(object, metaclass=MementoMetaclass):
         ...
        
-Unfortunately, Python 2 and Python 3 don't recognize each other's syntax
-for metaclass specification, so straightforwad code designed
-for one won't even compile for
-the other. You can get around this by using the
-``with_metaclass()`` function from the
-``six`` cross-version compatibility module. It's very short, so you can
-drop it into any program thus::
+Unfortunately, Python 2 and Python 3 don't recognize each other's syntax for
+metaclass specification, so straightforward code for one won't even compile for
+the other. You can get around this by using the ``with_metaclass()`` function
+from the ``six`` cross-version compatibility module. It's very short, so you can
+drop it into any program::
     
     def with_metaclass(meta, base=object):
         """Create a base class with a metaclass."""
     from six import with_metaclass
     ...
     
-(You can even inline what ``with_metaclass()`` does directly, but it's
-ugly and not recommended.)
+(You could even inline what ``with_metaclass()`` does directly, but it's
+really ugly, hard to read, and not recommended.)
+
+Careful with Call Signatures
+============================
+
+``MementoMetaclass`` caches on call signature, which can vary if keyword args
+are used. E.g. ``def func(a, b=2)`` could be called ``func(1)``, ``func(1, 2)``,
+``func(a=1)``, ``func(1, b=2)``, or ``func(a=1, b=2)``--and all resolve to the
+same logical call. And this is just for two parameters! If there is more than
+one ``kwarg``, they can be arbitrarily ordered, creating *many* logically
+identical permutations. Thank goodness Python doesn't allow kwargs to come before
+positional args, else there'd be even more ways to make the same call.
+    
+So if you instantiate an object once, then again with a logically identical call
+but using a different calling structure/signature, the object won't be created
+and cached just once--it will be created and cached multiple times::
+    
+    o1 = Thing("lovely")
+    o2 = Thing(name="lovely")
+    assert o1 is not o2     # because the call signature is different
+        
+This may degrade performance, and can also create errors, if you're counting on
+``mementos`` to create just one object. So don't do that. Use a consistent
+calling style, and it won't be a problem.
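
To see why those two calls miss each other in the cache, it can help to look at the keys themselves. The sketch below is illustrative only: ``default_key`` is a hypothetical stand-in that mirrors the lambda ``MementoMetaclass`` is built with in this commit, and nothing is imported from ``mementos``.

```python
def default_key(cls, args, kwargs):
    # Hypothetical helper; same expression as the lambda passed to
    # memento_factory() for MementoMetaclass in this commit.
    return (cls,) + args + tuple(kwargs.items())


class Thing(object):
    def __init__(self, name):
        self.name = name


# Thing("lovely") arrives as args=("lovely",), kwargs={}
positional_key = default_key(Thing, ("lovely",), {})

# Thing(name="lovely") arrives as args=(), kwargs={"name": "lovely"}
keyword_key = default_key(Thing, (), {"name": "lovely"})

print(positional_key)
print(keyword_key)
print(positional_key == keyword_key)  # False
```

The positional call contributes the bare value to the key, while the keyword call contributes a ``(name, value)`` pair, so the two tuples never compare equal even though ``__init__`` would receive the same data.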
+
+In most cases, this isn't an issue, because objects tend to be instantiated with
+a limited number of parameters, and you can take care that you instantiate them
+with parallel call signatures. Since this works 99% of the time and has a simple
+implementation, it's worth the price of this inelegance.
+    
+If you want only part of the initialization-time call signature (i.e. arguments
+to ``__init__``) to define an object's identity/cache key, there are two
+approaches. One is to use ``MementoMetaclass`` and design ``__init__`` without
+superfluous attributes, then create one or more secondary methods to add/set
+useful-but-not-essential data. E.g.::
+
+    class OtherThing(with_metaclass(MementoMetaclass, object)):
+    
+        def __init__(self, name):
+            self.name = name
+            self.color = None   # unset for now
+            self.weight = None
+            
+        def set(self, color=None, weight=None):
+            self.color = color or self.color
+            self.weight = weight or self.weight
+            return self
+    
+    ot1 = OtherThing("one").set(color='blue')
+    ot2 = OtherThing("one").set(weight='light')
+    assert ot1 is ot2
+    assert ot1.color == ot2.color == 'blue'
+    assert ot1.weight == ot2.weight == 'light'
+
+Or you can just define your own memoizing metaclass, using the factory function
+described below.
+
+Visiting the Factory
+====================
+
+The first iteration of ``mementos`` defined a single metaclass. It's since been
+reimplemented as a parameterized meta-metaclass. Cool, huh? That basically means
+that it defines a function, ``memento_factory()``, that, given a metaclass name
+and a function defining how cache keys are constructed, returns a corresponding
+metaclass. ``MementoMetaclass`` is the only metaclass that the module
+pre-defines, but it's easy to define your own memoizing metaclass::
+
+    from mementos import memento_factory
+    
+    IdTracker = memento_factory('IdTracker',
+                                lambda cls, args, kwargs: (cls, id(args[0])) )
+                                
+    class MyTracker(with_metaclass(IdTracker, object)):
+        ...
+        
+        # object identity is the object id of first argument to __init__
+        # (and there must be one, else the args[0] reference => IndexError)
+
+The first argument to ``memento_factory()`` is the name of the metaclass being
+defined. The second is a callable (e.g. lambda expression or function object)
+that takes three arguments: a class object, an argument ``tuple``, and a keyword
+arg ``dict``. Note that there is no ``*`` or ``**`` magic--args passed to the
+key function have already been resolved into basic data structures.
+
+The callable must return a globally-unique, hashable key for an object. This key
+will be stored in the ``_memento_cache``, which is a simple ``dict``.
+
+When various arguments are used as the cache key/object identity, you may use a
+``tuple`` that includes the class and arguments you want to key off of. This can
+also help debugging, should you need to examine the ``_memento_cache`` cache
+directly. But in cases like the ``IdTracker`` above, it's not mandatory that you
+keep extra information around. The raw ``id(args[0])`` integer value would
+suffice, as would a constructed string or other immutable, hashable value.
+
+For the 1% edge-cases where multiple call variations must be
+conclusively resolved to a unique canonical signature, that can be done on a
+custom basis (based on the specific args). Or in Python 2.7 and 3.x, the
+``inspect`` module's ``getcallargs()`` function can be used to create a generic
+"call fingerprint" that can be used as a key. (See the tests for example code.)
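
As a rough illustration of resolving call variations "on a custom basis", the sketch below keys on ``name`` however it was passed. It assumes Python 3 class syntax. ``name_key``, ``NameKeyed``, and this ``Thing`` are hypothetical names, and ``memento_factory`` is copied from the ``mementos.py`` hunk of this commit only so the example runs without the package installed.

```python
_memento_cache = {}


def memento_factory(name, func):
    """Copy of the factory shown in this commit's mementos.py hunk."""
    def call(cls, *args, **kwargs):
        key = func(cls, args, kwargs)
        try:
            return _memento_cache[key]
        except KeyError:
            instance = type.__call__(cls, *args, **kwargs)
            _memento_cache[key] = instance
            return instance
    return type(name, (type,), {'__call__': call})


def name_key(cls, args, kwargs):
    # Hypothetical key function: pick up ``name`` whether it came in
    # positionally or by keyword, and ignore everything else.
    name = args[0] if args else kwargs['name']
    return (cls, name)


NameKeyed = memento_factory('NameKeyed', name_key)


class Thing(metaclass=NameKeyed):   # Python 3 metaclass syntax
    def __init__(self, name):
        self.name = name


o1 = Thing("lovely")
o2 = Thing(name="lovely")
print(o1 is o2)  # True
```

This only handles the one parameter it knows about. A class with several optional arguments would need either a longer hand-written key function or the generic ``getcallargs()`` approach mentioned above.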
 
 Notes
 =====
 
- *  Mementos was derived from `an ActiveState recipe
+ *  ``mementos`` is not to be confused with `memento <http://pypi.python.org/pypi/memento>`_,
+    which does something completely different.
+
+ *  Mementos was originally derived from `an ActiveState recipe
     <http://code.activestate.com/recipes/286132-memento-design-pattern-in-python/>`_ by Valentino Volonghi.
     Thank you, Valentino!
     
     their class is a part of the cache key, so the values are distinct.
    
  *  This implementation is *not* thread-safe, in and of itself. If you're
-    in a multi-threaded environment, wrap object instantiation in a lock.
+    in a multi-threaded environment, consider wrapping object instantiation
+    in a lock.
    
- *  Be careful with call signatures. This metaclass
-    caches on call signature, which can vary if keyword args
-    are used. E.g. ``def func(a, b=2)`` could be called ``func(1)``, ``func(1,2)``,
-    ``func(a=1)``, ``func(1, b=2)``, or ``func(a=2, b=2)``--and all resolve to the
-    same logical call. And this is just for two parameters! If there are more
-    than one kwarg, they can be arbitrarily ordered, creating *many* logically
-    identical permuations. Thank Goodness Python doesn't allow kwargs to come
-    before positional args, else there'd be even more ways to make the same
-    call.
-    
-    So if you instantiate an object once, then again with a logically
-    identical call but using a different calling structure/signature, the
-    object won't be created and cached just once--it will be created and
-    cached multiple times.::
-    
-        o1 = Thing("lovely")
-        o2 = Thing(name="lovely")
-        assert o1 is not o2   # because the call signature is different
-        
-    This may degrade performance, and can also create
-    errors, if you're counting on ``mementos`` to create just one object.
-    So don't do that. Use a consistent calling style, and it won't be a problem.
-    
-    In most cases, this isn't an issue, because objects tend to be
-    instanitated with a limited number of parameters, and you can take care
-    that you instantiate them with parallel call signatures. Since this works
-    99% of the time and has a simple implementation, it's worth the price of
-    this inelegance. For the 1% edge-cases where multiple call signature
-    variations must be conclusively resolved to a unique canonical signature,
-    the ``inspect`` module could be used (e.g. ``getargvalues()`` or
-    ``getcallargs()`` to create such a unified key.
-    
- *  Not to be confused with the `memento <http://pypi.python.org/pypi/memento>`_
-    module, which does something completely different.
-    
- *  In some cases, you may not want the entire initialization-time call signature
-    to define the cache key--only part of it. (Because some arguments to ``__init__``
-    may be useful, but not really part of the object's 'identity'.) The
-    best option is to design ``__init__`` without superflous attributes, then
-    create a second
-    method that adds sets useful-but-not-essential data. E.g.::
-
-        class OtherThing(object):
-                
-            __metaclass__ = MementoMetaclass
-            
-            def __init__(self, name):
-                self.name = name
-                self.color = None
-                self.weight = None
-                
-            def set(self, color=None, weight=None):
-                self.color = color or self.color
-                self.weight = weight or self.weight
-                return self
-        
-        ot1 = OtherThing("one").set(color='blue')
-        ot2 = OtherThing("one").set(weight='light')
-        assert ot1 is ot2
-        assert ot1.color == ot2.color == 'blue'
-        assert ot1.weight == ot2.weight == 'light'
-        
  *  Automated multi-version testing with 
     `pytest <http://pypi.python.org/pypi/pytest>`_
     and `tox <http://pypi.python.org/pypi/tox>`_ modules has commenced. ``mementos`` is now
     successfully packaged for, and tested against, all late-model versions of
-    Python (2.6, 2.7, 3.2, and 3.3) and one (2.5) that isn't so very recent.
+    Python (2.6, 2.7, 3.2, and 3.3), plus one (2.5) that isn't so very recent.
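
The thread-safety note above suggests wrapping instantiation in a lock. One way that might look is sketched here, under stated assumptions: ``locked_memento_factory``, ``LockedMemento``, and ``Expensive`` are hypothetical names and not part of ``mementos``, and the factory body is adapted from the one in this commit with a lock added around the lookup and creation.

```python
import threading

_cache = {}
_cache_lock = threading.RLock()   # re-entrant, see the note below


def locked_memento_factory(name, func):
    """Hypothetical variant of memento_factory() that serializes creation."""
    def call(cls, *args, **kwargs):
        key = func(cls, args, kwargs)
        with _cache_lock:
            try:
                return _cache[key]
            except KeyError:
                instance = type.__call__(cls, *args, **kwargs)
                _cache[key] = instance
                return instance
    return type(name, (type,), {'__call__': call})


LockedMemento = locked_memento_factory(
    'LockedMemento',
    lambda cls, args, kwargs: (cls,) + args + tuple(kwargs.items()))


class Expensive(metaclass=LockedMemento):   # Python 3 metaclass syntax
    created = 0

    def __init__(self, name):
        Expensive.created += 1   # count how often __init__ really runs
        self.name = name


results = []


def worker():
    results.append(Expensive("shared"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(Expensive.created)  # 1
```

An ``RLock`` is used rather than a plain ``Lock`` so that an ``__init__`` which itself instantiates another memoized class on the same thread does not deadlock. The trade-off is that a single global lock serializes all memoized construction, which may or may not matter for a given program.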
 
 Installation
 ============
 
     pip install mementos
     
-(You may need to prefix this with "sudo " to authorize installation.)
+To ``easy_install`` under a specific Python version (3.3 in this example)::
+
+    python3.3 -m easy_install mementos
+    
+(You may need to prefix these with "sudo " to authorize installation.)
-class MementoMetaclass(type):
+
+_memento_cache = {}
+
+def memento_factory(name, func):
     """
-    Classes that use this caching metaclass will have their instances
-    automatically cached based on instantiation-time arguments (i.e. to __init__).
-    Super-useful for not repetitively creating expensive-to-create objects.
+    Return a memoizing metaclass with the given name and key function.
+    And yes that makes this a parametrized meta-metaclass, which is probably
+    the most meta thing you've ever seen. If it isn't, both congratulations
+    and sympathies are in order!
+    """
+
+    def call(cls, *args, **kwargs):
+        key = func(cls, args, kwargs)
+        try:
+            return _memento_cache[key]
+        except KeyError:
+            instance = type.__call__(cls, *args, **kwargs)
+            _memento_cache[key] = instance
+            return instance
+        
+    mc = type(name, (type,), { '__call__': call })
+    return mc
+
     
-    See http://code.activestate.com/recipes/286132-memento-design-pattern-in-python/
-    """
-    cache = {}
+MementoMetaclass = memento_factory("MementoMetaclass",
+                                   lambda cls, args, kwargs: (cls, ) + args + tuple(kwargs.items()) )
 
-    def __call__(self, *args, **kwargs):
-        key = (self, ) + args + tuple(kwargs.items())
-        try:
-            return self.cache[key]
-        except KeyError:
-            instance = type.__call__(self, *args, **kwargs)
-            self.cache[key] = instance
-            return instance
-
-# Looking to possibility of creating a parametrized metaclass, metaclass
-# factory, or similar mecahnism to be able to configure MementoMetaclass on the
-# fly. This would provide a mechanism for MementoMetaclass users to stipulate
-# what parameters are used as the object-identifying key (currently: all of
-# them, in exactly the call signature). Not ready to pull the trigger on that,
-# however. Metaclasses are a bit tricky, and must be "done right." 
-
+# 
 # Some reading:
 # http://bytes.com/topic/python/answers/40084-parameterized-metaclass-metametaclass
 # http://www.acooke.org/cute/PythonMeta0.html
 # http://www.python.org/dev/peps/pep-3115/
+
+
 
 setup(
     name='mementos',
-    version=verno("0.404"),
+    version=verno("0.452"),
     author='Jonathan Eunice',
     author_email='jonathan.eunice@gmail.com',
     description='Memoizing metaclass. Drop-dead simple way to create cached objects',
 from mementos import *
+import sys, pytest
 
 def with_metaclass(meta, base=object):
     """Create a base class with a metaclass."""
     o2 = Thing23(name="lovely")
     assert o1 is not o2   # because the call signature is different
     
+def test_id_metaclass():
     
-# need to extend for Py3
-# http://mikewatkins.ca/2008/11/29/python-2-and-3-metaclasses/
+    # Test the metaclass that uses the id of the first arg
+    # but really as much a test of memento_factory, since IdMementoMetaclass not
+    # part of base module
+       
+    IdMementoMetaclass = memento_factory("IdMementoMetaclass",
+                                         lambda cls, args, kwargs: (cls, id(args[0])) )
+
+    class IdTrack(with_metaclass(IdMementoMetaclass, object)):
+        def __init__(self, name, *args):
+            self.name = name
+            self.args = args
+            
+    id1 = IdTrack("joe")
+    id2 = IdTrack("joe")
+    assert id1 is id2
+    
+    id3 = IdTrack("joe", 1, 4)
+    assert id3 is id2
+    
+    id4 = IdTrack("Joe", 1, 4)
+    id5 = IdTrack("Joe")
+    assert id4 is id5
+    assert id4 is not id3
+    assert id5 is not id3
+    
+def test_id_metaclass_primitive():
+    
+    # Test a metaclass keyed on the bare id of the first arg (no class in the
+    # key); again as much a test of memento_factory as anything, since
+    # IdMementoPrimitive is not part of the base module
+       
+    IdMementoPrimitive = memento_factory("IdMementoPrimitive",
+                                         lambda cls, args, kwargs: id(args[0]) )
+
+    class IdTrackPrim(with_metaclass(IdMementoPrimitive, object)):
+        def __init__(self, name, *args):
+            self.name = name
+            self.args = args
+            
+    id1 = IdTrackPrim("joe")
+    id2 = IdTrackPrim("joe")
+    assert id1 is id2
+    
+    id3 = IdTrackPrim("joe", 1, 4)
+    assert id3 is id2
+    
+    id4 = IdTrackPrim("Joe", 1, 4)
+    id5 = IdTrackPrim("Joe")
+    assert id4 is id5
+    assert id4 is not id3
+    assert id5 is not id3
+    
+def test_metaclass_factory():
+    
+    mymeta = memento_factory("mymeta", lambda cls, args, kwargs: (cls, kwargs['b']))
+    
+    class BTrack(with_metaclass(mymeta, object)):
+        def __init__(self, name, **kwargs):
+            assert 'b' in kwargs
+            self.name = name
+            self.b = kwargs['b']
+            
+    b1 = BTrack('andy', b=1)
+    b2 = BTrack('dave', b=1)
+    assert b1 is b2
+    assert b1.name == 'andy'  # because b1 got there first, defined object such that b=1
+    
+    b3 = BTrack('andy', b=2)
+    assert b3 is not b1
+    assert b3 is not b2
+    assert b3.name == 'andy'
+    
+@pytest.mark.skipif("sys.version_info < (2,7)")
+def test_perfect_signatures():
+    # use inspect.getcallargs to make signatures that don't vary
+    
+    from inspect import getcallargs
+    import hashlib
+    import six
+    
+    def call_fingerprint(cls, args, kwargs):
+        """
+        Given a complex __init__ call with varied positional, keyword, variable, and
+        variable keyword arguments, canonicalize the argument values and return a
+        suitable hash key.
+        """
+        callitems = list(getcallargs(cls.__init__, None, *args, **kwargs).items())
+        callitems.sort()
+        h = hashlib.md5()
+        h.update(repr(callitems).encode('utf-8'))
+        return h.hexdigest()
+        
+    Perfect = memento_factory("Perfect", call_fingerprint)
+    
+    class PerfectCall(with_metaclass(Perfect, object)):
+        def __init__(self, name, a=1, b=2, c=3, *args, **kwargs):
+            self.vector = (name, a, b, c)
+    
+    p1 = PerfectCall('amy')
+    p2 = PerfectCall('amy', b=2)
+    p3 = PerfectCall('amy', c=3)
+    p4 = PerfectCall('amy', c=3, a=1)
+    p5 = PerfectCall(c=3, a=1, b=2, name='amy')
+    assert p1 is p2
+    assert p2 is p3
+    assert p3 is p4
+    assert p4 is p5
+    
+    p6 = PerfectCall('amy', b=33)
+    assert p1 is not p6
+    
+    p7 = PerfectCall(**{'name': 'amy', 'b': 33})
+    assert p6 is p7
+