Commits

Jonathan Eunice committed edb5327

first commit

Comments (0)

Files changed (6)

+syntax: glob
+*.swp.{py,txt,html,css,js}
+*.pyc
+.DS_Store
+build/*
+dist/*
+*.egg-info
+setup.cfg
+PKG-INFO
+Intensional (rule-defined) sets for Python.
+
+Overview
+========
+
+There are two ways of defining a set: intensional and extensional. Extensional
+sets like ``set([1,3,5,'daisy'])`` enumerate every member of the set explicitly.
+
+Intensional sets, in contrast, are defined by rules. For example "the set of all
+prime numbers" or "every word beginning with ``'a'`` and ending with ``'t'``.
+Intensional sets are often infinite. It may be possible to generate a list of
+their members, but it's not as simple as a "give me everything you've got!"
+``for`` loop.
+
+Once you know what you're looking for, intensional sets are *everywhere*.
+
+Python doesn't represent them directly, but regular expressions, many list
+comprehensions, and all manner of testing and filtering operations are really
+faces of the intensional set concept. Many functions test whether something
+'qualifies'. ``os.path.isdir(d)`` for example, tests whether ``d`` is in the set
+of legitimate directories, and ``isinstance(s, str)`` tests whether ``s`` is a
+member of the set of ``str`` objects. Even the core ``if`` conditional can be
+construed as testing for membership in an intensional set--the set of all items
+that pass the test.
+
+Many such tests have a temporal aspect--they determine whether a value is a
+member at the time of the test. In the future, the answer may change, if
+conditions change. Others are invariant over time. ``%%734`` will never be a
+valid Python identifier, no matter how many times it's tested--unless the rules
+of the overall Python universe change.
+
+Intensional sets are part and parcel of all programming, even if they're not
+called by that name or explictly manifested. ``intensional`` helps Python
+prograims represent intensional sets directly. 
+
+Usage
+=====
+
+Instead of::
+
+    match = re.search(r'(pattern\w*)', some_string)
+    if match:
+        print match.group(1)
+
+You can do an *en passant* test::
+
+    if some_string in Re(r'(pattern\w*)'):
+        print _[1]
+        
+Note this turns the sense of the matching around, asking "is a given string *in*
+this pattern?" The ``Re`` pattern is an intensionally defined set--namely, the
+set of all strings matching the pattern. This makes excellent sene in cases where
+you have a clear intent for the match--for example, determining "is the given string
+within the set of *all legitimate commands*?"
+
+If you prefer the more traditional ``re`` pattern, they are also available in a
+convenient *en passant* style.::
+
+    if Re(r'(pattern\w*)').search('string with pattern in it'):
+        print _[1]
+        
+Note that the underscore name ``_`` holds the result.
+        
+This works even better with named pattern components, such as::
+
+    person = 'John Smith 48'
+    if Re(r'(?P<name>[\w\s]*)\s+(?P<age>\d+)').search(person):
+        print _.name, "is", _.age, "years old"
+    else:
+        print "don't understand '{}'".format(person)
+        
+### need go back meahcnism
+        
+It's possible also to loop over the results, as in::
+
+    for found in Re(r'(pattern\w*)').findall('pattern is as pattern does'):
+        print found
+        
+``Re`` objects are `memoized <http://en.wikipedia.org/wiki/Memoization>`_ for efficiency, so they
+are only compiled once, regardless of how many times they're mentioned in the program.
+
+        
+Or if you 
+Say you have a class ``Thing`` that requires expensive computation to
+create, or that should be created only once. You can make that happen
+simply by adding one line to its definition::
+
+    from mementos import MementoMetaclass
+
+    class Thing(object):
+        
+        __metaclass__ = MementoMetaclass
+        
+        def __init__(self, name):
+            self.name = name
+        
+        ...
+
+Then ``Thing`` objects will be memoized::
+
+    t1 = Thing("one")
+    t2 = Thing("one")
+    assert t1 is t2
+    
+
+Notes
+=====
+
+ *  Derived from `an ActiveState recipe
+    <http://code.activestate.com/recipes/286132-memento-design-pattern-in-python/>`_ by Valentino Volonghi.
+    
+ *  It is safe to memoize multiple classes at the same time. They will all be stored in the same cache, but
+    their class is a part of the cache key, so the values are distinct.
+   
+ *  This implementation is *not* thread-safe, in and of itself. If you're
+    in a multi-threaded environment, wrap object instantiation in a lock.
+   
+ *  Be careful with call signatures. This metaclass
+    caches on call signature, which can vary if keyword args
+    are used. E.g. ``def func(a, b=2)`` could be called ``func(1)``, ``func(1,2)``,
+    ``func(a=1)``, ``func(1, b=2)``, or ``func(a=2, b=2)``--and all resolve to the
+    same logical call. And this is just for two parameters! If there are more
+    than one kwarg, they can be arbitrarily ordered, creating *many* logically
+    identical permuations. Thank Goodness Python doesn't allow kwargs to come
+    before positional args, else there'd be even more ways to make the same
+    call.
+    
+    So if you instantiate an object once, then again with a logically
+    identical call but using a different calling structure/signature, the
+    object won't be created and cached just once--it will be created and
+    cached multiple times.::
+    
+        o1 = Thing("lovely")
+        o2 = Thing(name="lovely")
+        assert o1 is not o2   # because the call signature is different
+        
+    This may degrade performance, and can also create
+    errors, if you're counting on ``mementos`` to create just one object.
+    So don't do that. Use a consistent calling style, and it won't be a problem.
+    
+    In most cases, this isn't an issue, because objects tend to be
+    instanitated with a limited number of parameters, and you can take care
+    that you instantiate them with parallel call signatures. Since this works
+    99% of the time and has a simple implementation, it's worth the price of
+    this inelegance. For the 1% edge-cases where multiple call signature
+    variations must be conclusively resolved to a unique canonical signature,
+    the ``inspect`` module could be used (e.g. ``getargvalues()`` or
+    ``getcallargs()`` to create such a unified key.
+    
+ *  Not to be confused with the `memento <http://pypi.python.org/pypi/memento>`_
+    module, which does something completely different.
+    
+ *  In some cases, you may not want the entire initialization-time call signature
+    to define the cache key--only part of it. (Because some arguments to ``__init__``
+    may be useful, but not really part of the object's 'identity'.) The
+    best option is to design ``__init__`` without superflous attributes, then
+    create a second
+    method that adds sets useful-but-not-essential data. E.g.::
+
+        class OtherThing(object):
+                
+            __metaclass__ = MementoMetaclass
+            
+            def __init__(self, name):
+                self.name = name
+                self.color = None
+                self.weight = None
+                
+            def set(self, color=None, weight=None):
+                self.color = color or self.color
+                self.weight = weight or self.weight
+                return self
+        
+        ot1 = OtherThing("one").set(color='blue')
+        ot2 = OtherThing("one").set(weight='light')
+        assert ot1 is ot2
+        assert ot1.color == ot2.color == 'blue'
+        assert ot1.weight == ot2.weight == 'light'
+
+Installation
+============
+
+::
+
+    pip install mementos
+    
+(You may need to prefix this with "sudo " to authorize installation.)

__init__.py

Empty file added.
+
+
+from mementos import MementoMetaclass   # to memoize IntensionalSet
+from stuf import orderedstuf            # for ReMatch
+import re                               # for Re
+import fnmatch                          # for Glob
+import inspect                          # for en passant operation in Re
+
+class IntensionalSet(object):
+    """
+    An intensional set (actually, an intensionally defined set) is a set that
+    is defined by a definition, rather than an explicit listing of members
+    (which would be an extensional set).
+    
+    Like other Python sets, IntensionalSet objects can include explicit
+    items--but it also admits the possibility of rule-based membership.
+    """
+
+    __metaclass__ = MementoMetaclass
+    
+    
+# need to complete more of the set functions like union, difference, etc
+
+
+class ReMatch(orderedstuf):
+    """
+    An attributes-exposed regular expression match object. Ideally this would be
+    a subclass of the re module's match object, but they are of type ``_sre.SRE_Match``
+    and `appear to be unsubclassable
+    <http://stackoverflow.com/questions/4835352/subclassing-matchobject-in-python>`_.
+    Thus, ReMatch is a subclass of an attributes-exposed dict type, ``stuf``, into
+    which the relevant data has been copied.
+    """
+    pass
+
+    # in future, might want to do a pass-through for methods of _sre.SRE_Match
+    # though they are currently available through x._match.method
+        
+def show_frames():
+    f = inspect.currentframe().f_back # the caller
+    while f:
+        print f, f.f_lineno
+        print "   f_locals:", ' '.join(f.f_locals.keys())
+        f = f.f_back
+    
+class Re(IntensionalSet):
+    
+    def __init__(self, pattern, flags=0):
+        self.pattern = pattern
+        self.flags   = flags
+        self.re = re.compile(pattern, flags)
+        self.groups     = self.re.groups
+        self.groupindex = self.re.groupindex
+        
+    def _regroup(self, m):
+        """
+        Given an _sre.SRE_Match object, create and return a corresponding
+        ReMatch object. Also, set the en passant variable to it.
+        """
+        if not m:
+            return m
+        result = ReMatch()
+        result['_match'] = m   # access to original match object
+        result.update(m.groupdict())  # update named groups
+        # update numerical / positional groups
+        result[0] = m.group(0)
+        for i, g in enumerate(m.groups(), start=1):
+            result[i] = g
+            
+        # this method -> calling method -> MementosMetaclass.__call__ -> Re.__init__ -> caller
+        inspect.currentframe().f_back.f_back.f_back.f_back.f_locals['_'] = result
+
+        return result
+        
+    def __contains__(self, item):
+        # if not isinstance(item, basestring):
+        #     item = str(item)
+        return self._regroup(self.re.search(item))
+    
+    # methods that return ReMatch objects
+    
+    def search(self, *args, **kwargs):
+        return self._regroup(self.re.search(*args, **kwargs))
+
+    def match(self, *args, **kwargs):
+        return self._regroup(self.re.match(*args, **kwargs))
+    
+    # methods that don't need ReMatch objects
+    
+    def finditer(self, *args, **kwargs):
+        return self.re.finditer(*args, **kwargs)
+      
+    def findall(self, *args, **kwargs):
+        return self.re.findall(*args, **kwargs)
+    
+    def split(self, *args, **kwargs):
+        return self.re.split(*args, **kwargs)
+    
+    def sub(self, *args, **kwargs):
+        return self.re.sub(*args, **kwargs)
+    
+    def subn(self, *args, **kwargs):
+        return self.re.subn(*args, **kwargs)
+    
+    def escape(self, *args, **kwargs):
+        return self.re.escape(*args, **kwargs)
+
+class Glob(IntensionalSet):
+        
+    def __init__(self, expr):
+        self.expr = expr
+        
+    def __contains__(self, item):
+        return fnmatch.fnmatch(item, self.expr)
+    
+class Any(IntensionalSet):
+    """
+    An item is in an Any if it is or is in any member of the set.
+    """
+    
+    def __init__(self, *args):
+        self.items = set(args)
+        
+    def __contains__(self, item):
+        if item in self.items:
+            return True
+        
+        for i in self.items:
+            if hasattr(i, '__contains__'):
+                if item in i:
+                    return True
+            else:
+                if item == i:
+                    return True
+        return False
+    
+class Every(IntensionalSet):
+    """
+    An item is in an Every if it is or is in every member of the set.
+    """
+    
+    def __init__(self, *args):
+        self.items = set(args)
+
+    def __contains__(self, item):
+        for i in self.items:
+            if hasattr(i, '__contains__'):
+                if item not in i:
+                    return False
+            else:
+                if item != i:
+                    return False
+        return True
+
+   
+class Test(IntensionalSet):
+    
+    def __init__(self, expr, *args, **kwargs):
+        IntensionalSet.__init__(self)
+        self.args = args
+        self.kwargs = kwargs
+        self.expr = expr
+        if isinstance(expr, basestring):
+            if not expr.startswith('lambda'):
+                expr = 'lambda x: ' + expr
+            self.func = eval(expr)
+        elif hasattr(expr, '__call__'):
+            self.func = expr
+        else:
+            raise ValueError('expr needs to be string or callable')
+        
+    def __contains__(self, item):
+        return self.func(item, *self.args, **self.kwargs)
+#! /usr/bin/env python
+
+from setuptools import setup
+from decimal import Decimal
+import re
+
+def linelist(text):
+    """
+    Returns each non-blank line in text enclosed in a list.
+    """
+    return [ l.strip() for l in text.strip().splitlines() if l.split() ]
+    
+    # The double-mention of l.strip() is yet another fine example of why
+    # Python needs en passant aliasing.
+
+
+def verno(s):
+    """
+    Update the version number passed in by extending it to the 
+    thousands place and adding 1/1000, then returning that result
+    and as a side-effect updating setup.py
+
+    Dangerous, self-modifying, and also, helps keep version numbers
+    ascending without human intervention.
+    """
+    d = Decimal(s)
+    increment = Decimal('0.001')
+    d = d.quantize(increment) + increment
+    dstr = str(d)
+    setup = open('setup.py', 'r').read()
+    setup = re.sub('verno\(\w*[\'"]([\d\.]+)[\'"]', 'verno("' + dstr + '"', setup)
+    open('setup.py', 'w').write(setup)
+    return dstr
+
+setup(
+    name='intensional',
+    version=verno("0.01"),
+    author='Jonathan Eunice',
+    author_email='jonathan.eunice@gmail.com',
+    description='Intensional sets in Python',
+    long_description=open('README.rst').read(),
+    url='',
+    py_modules=['intensional'],
+    install_requires=['stuf','mementos'],
+    classifiers=linelist("""
+        Development Status :: 3 - Alpha
+        Operating System :: OS Independent
+        License :: OSI Approved :: BSD License
+        Intended Audience :: Developers
+        Programming Language :: Python
+        Topic :: Software Development :: Libraries :: Python Modules
+    """)
+)
+from testharness import import_from_parent, test_run
+
+import_from_parent()
+from intensional import *
+
+def test_Re():
+    tests = 'some string with things in it ok?'
+    testpat = Re(r'\b(s\w*)\b')
+    testpat1 = Re(r'\b(s\w*)\b')
+    assert testpat is testpat1   # test memoization
+    if tests in testpat:
+        print _[1]
+        assert _[1] == 'some'
+        assert _._match.end(1) == 4
+    
+    found = testpat.findall(tests)
+    assert found == ['some', 'string']
+    
+def test_Any():
+    ext = Any(1,2,3)
+    assert 2 in ext
+    assert 33 not in ext
+    print ext
+    
+def test_Test():
+    mytest = Test('x.startswith("a")')
+    print mytest
+    assert 'a' in mytest
+    assert 'alpha' in mytest
+    assert 'sljd' not in mytest
+    
+    
+if __name__ == '__main__':
+    test_run()