Jonathan Eunice committed 2b875a1

reorging

Files changed (8)

+syntax: glob
+*.swp.{py,txt,html,css,js}
+*.pyc
+.DS_Store
+build/*
+dist/*
+*.egg-info
+setup.cfg
+PKG-INFO
+.tox
+A simplified interface to Python's regular expression (``re``)
+string search. As a bonus, it also provides a compatible way to
+access Unix glob searches.
+
+Usage
+=====
+
+Python regular expressions are powerful, but the language's lack of an *en passant*
+assignment requires a preparatory
+motion and then a test, like this::
+
+    import re
+    
+    match = re.search(pattern, some_string)
+    if match:
+        print match.group(1)
+
+With ``simplere``, you can do it in fewer steps::
+
+    from simplere import Re
+
+    if some_string in Re(pattern):
+        print _[1]
+        
+There are two things to note here: First,
+this turns the sense of the matching around, asking "is the given string *in*
+the set of items this pattern describes?" To be fancy, the
+``Re`` pattern is an intensionally
+defined set (namely "all strings matching the pattern"). This order often makes
+excellent sense when you have a clear intent for the test. For example, "is the
+given string within the set of *all legitimate commands*?"
+
+Second, the ``in`` test had the side effect of setting the underscore
+name ``_`` to the result. Python doesn't support *en passant* assignment, so
+you can't both test and collect results in the same motion, even though that's
+sometimes exactly appropriate. ``simplere`` uses introspection to get around this
+difficulty and provide neater code.
+
+If you prefer the more traditional ``re`` calls, you can still use them with the
+convenient *en passant* style::
+
+    if Re(pattern).search(some_string):
+        print _[1]
+
+``Re`` works even better with named pattern components, such as::
+
+    person = 'John Smith 48'
+    if person in Re(r'(?P<name>[\w\s]*)\s+(?P<age>\d+)'):
+        print _.name, "is", _.age, "years old"
+    else:
+        print "don't understand '{}'".format(person)
+        
+It's also possible to loop over the results::
+
+    for found in Re(r'pattern (\w+)').finditer('pattern is as pattern does'):
+        print found
+
+Or collect them all in one fell swoop::
+
+    found = Re(r'pattern (\w+)').findall('pattern is as pattern does')
+    
+Pretty much all of the methods and properties one can access from the standard
+``re`` module are available.
+   
+``Re`` objects are `memoized <http://en.wikipedia.org/wiki/Memoization>`_ for efficiency,
+so they're only compiled once, regardless of how many times
+they're mentioned in the program.
+
+Bonus: Globs
+============
+
+Regular expressions are wonderfully powerful, but sometimes the simpler `Unix glob
+<http://en.wikipedia.org/wiki/Glob_(programming)>`_ works just fine. As a bonus,
+``simplere`` also provides simple glob access::
+
+    if 'globtastic' in Glob('glob*'):
+        print "Yes! It is!"
+    else:
+        raise ValueError('YES IT IS')
+
+
+Notes
+=====
+
+ *  ``simplere`` is part of a larger (as yet unpublished) effort to add intensional sets to Python.
+
+Installation
+============
+
+::
+
+    pip install simplere
+    
+(You may need to prefix this with "sudo " to authorize installation.)
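
The README's ``in`` test can be approximated with nothing but the standard ``re`` module. The block below is a minimal sketch, not the code in this commit: ``PatternSet`` and its ``last`` attribute are hypothetical names chosen for illustration. It shows why the match has to be stored somewhere as a side effect: Python converts whatever ``__contains__`` returns to a plain ``bool``, so the match object would otherwise be lost.

```python
import re


class PatternSet(object):
    """Hypothetical stand-in for Re: the set of strings matching a pattern."""

    def __init__(self, pattern, flags=0):
        self.regex = re.compile(pattern, flags)
        self.last = None          # most recent match object, or None

    def __contains__(self, text):
        # ``in`` coerces the return value to a bool, so the match object
        # is stashed on the instance to keep it reachable afterwards.
        self.last = self.regex.search(text)
        return self.last is not None


people = PatternSet(r'(?P<name>[\w\s]*)\s+(?P<age>\d+)')
if 'John Smith 48' in people:
    name = people.last.group('name')
    age = int(people.last.group('age'))
```

Storing the match on the pattern object is the simplest option, but it means the result is shared by every user of that object, which matters once instances are memoized.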
+class MementoMetaclassSRE(type):
+    """
+    Classes that use this caching metaclass will have their instances
+    automatically cached based on instantiation-time arguments (i.e. to __init__).
+    Super-useful for not repetitively creating expensive-to-create objects.
+    
+    See http://code.activestate.com/recipes/286132-memento-design-pattern-in-python/
+    """
+    cache = {}
+
+    def __call__(self, *args, **kwargs):
+        print "MementoMetaclassSRE.__call__()"
+        key = (self, ) + args + tuple(kwargs.items())
+        try:
+            return self.cache[key]
+        except KeyError:
+            instance = type.__call__(self, *args, **kwargs)
+            self.cache[key] = instance
+            return instance
+
+# Looking into the possibility of creating a parametrized metaclass, metaclass
+# factory, or similar mechanism to be able to configure MementoMetaclass on the
+# fly. This would provide a mechanism for MementoMetaclass users to stipulate
+# what parameters are used as the object-identifying key (currently: all of
+# them, in exactly the call signature). Not ready to pull the trigger on that,
+# however. Metaclasses are a bit tricky, and must be "done right." 
+
+# Some reading:
+# http://bytes.com/topic/python/answers/40084-parameterized-metaclass-metametaclass
+# http://www.acooke.org/cute/PythonMeta0.html
+# http://www.python.org/dev/peps/pep-3115/
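
The metaclass factory contemplated in the comments above could look roughly like the following. This is only a sketch under stated assumptions: ``make_memento_metaclass``, ``first_arg_key``, and ``Labelled`` are hypothetical names, not part of this commit. The caller supplies a function that builds the cache key, so it can decide which constructor arguments identify an instance.

```python
def make_memento_metaclass(key_func):
    """Return a caching metaclass whose cache key is built by key_func.

    Hypothetical helper: key_func(cls, args, kwargs) must return a
    hashable key.
    """
    class Memento(type):
        cache = {}

        def __call__(cls, *args, **kwargs):
            key = key_func(cls, args, kwargs)
            try:
                return cls.cache[key]
            except KeyError:
                instance = super(Memento, cls).__call__(*args, **kwargs)
                cls.cache[key] = instance
                return instance

    return Memento


def first_arg_key(cls, args, kwargs):
    # Only the class and the first positional argument identify an instance.
    return (cls, args[0])


def _init(self, name, note=None):
    self.name = name
    self.note = note


FirstArgMemento = make_memento_metaclass(first_arg_key)

# Calling the metaclass directly creates the class in a way that works
# on both Python 2 and Python 3.
Labelled = FirstArgMemento('Labelled', (object,), {'__init__': _init})

a = Labelled('x', note='first')
b = Labelled('x', note='second')   # same first argument: cached instance
c = Labelled('y')
```

Here ``b`` is the object created for ``a``, so its ``note`` is still ``'first'``. That is the trade-off of keying on a subset of the arguments.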
+[pytest]
+python_files = test/*.py
+#!/usr/bin/env python
+
+from setuptools import setup
+from decimal import Decimal
+import re
+
+def linelist(text):
+    """
+    Return the non-blank lines of text as a list, each stripped
+    of surrounding whitespace.
+    """
+    return [ l.strip() for l in text.strip().splitlines() if l.strip() ]
+    
+    # The double-mention of l.strip() is yet another fine example of why
+    # Python needs en passant aliasing.
+
+
+def verno(s):
+    """
+    Bump the version number passed in: round it to the thousandths
+    place, add 1/1000, and return the result as a string. As a
+    side effect, rewrite setup.py so it carries the new number.
+
+    Dangerous and self-modifying, but it keeps version numbers
+    ascending without human intervention.
+    """
+    d = Decimal(s)
+    increment = Decimal('0.001')
+    d = d.quantize(increment) + increment
+    dstr = str(d)
+    with open('setup.py', 'r') as f:
+        setup = f.read()
+    setup = re.sub(r'verno\(\w*[\'"]([\d\.]+)[\'"]', 'verno("' + dstr + '"', setup)
+    with open('setup.py', 'w') as f:
+        f.write(setup)
+    return dstr
+
+setup(
+    name='simplere',
+    version=verno("0.108"),
+    author='Jonathan Eunice',
+    author_email='jonathan.eunice@gmail.com',
+    description='Simpler, cleaner access to regular expressions. Globs too.',
+    long_description=open('README.rst').read(),
+    url='https://bitbucket.org/jeunice/simplere',
+    py_modules=['simplere'],
+    install_requires=['mementos'],
+    classifiers=linelist("""
+        Development Status :: 3 - Alpha
+        Operating System :: OS Independent
+        License :: OSI Approved :: BSD License
+        Intended Audience :: Developers
+        Programming Language :: Python
+        Topic :: Software Development :: Libraries :: Python Modules
+    """)
+)
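
The arithmetic and substitution inside ``verno`` can be tried without touching any file. The sketch below splits the two steps into hypothetical helpers, ``bump`` and ``rewrite_call``, and applies them to an in-memory string instead of ``setup.py``. Its pattern is a simplified variant of the one in ``verno``, assuming only optional whitespace before the quoted number.

```python
from decimal import Decimal
import re


def bump(version):
    """Round to three decimal places, then add 0.001."""
    step = Decimal('0.001')
    return str(Decimal(version).quantize(step) + step)


def rewrite_call(source, new_version):
    """Replace the quoted number inside a verno("...") call."""
    return re.sub(r'verno\(\s*[\'"][\d.]+[\'"]',
                  'verno("' + new_version + '"', source)


line = 'version=verno("0.108"),'
updated = rewrite_call(line, bump("0.108"))
# updated is now 'version=verno("0.109"),'
```

``Decimal`` is used rather than ``float`` so that ``0.108 + 0.001`` comes out as exactly ``0.109``, with no binary rounding noise in the string.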

simplere/simplere.py

+"""
+A simpler way to access and use regular expressions. As a bonus,
+also simpler access to globs.
+"""
+
+from mementos import MementoMetaclass, with_metaclass   # to memoize Re objects
+import re                               # for Re
+import fnmatch                          # for Glob
+import sys                              # for en passant operation in Re
+
+class ReMatch(object):
+    """
+    An easier-to-use proxy for regular expression match objects. Ideally this would be
+    a subclass of the re module's match object, but their type ``_sre.SRE_Match``
+    `appears to be unsubclassable
+    <http://stackoverflow.com/questions/4835352/subclassing-matchobject-in-python>`_.
+    Thus, ReMatch is a proxy that exposes the match object's numeric (positional) and
+    named groups through indices and attributes. If a named group has the same
+    name as a match object method or property, the named group takes precedence. Either
+    change the name of the match group or access the underlying property thus:
+    ``x._match.property``
+    """
+     
+    def __init__(self, match):
+        self._match = match
+        self._groupdict = match.groupdict()
+        
+    def __getattr__(self, name):
+        if name in self.__dict__:
+            return self.__dict__[name]
+        if name in self._groupdict:
+            return self._groupdict[name]
+        try:
+            return getattr(self._match, name)
+        except AttributeError:
+            raise AttributeError("no such attribute '{}'".format(name))
+        
+    def __getitem__(self, index):
+        return self._match.group(index)
+        
+class Re(with_metaclass(MementoMetaclass, object)):
+        
+    # convenience copy of re flag constants
+    
+    DEBUG      = re.DEBUG
+    I          = re.I
+    IGNORECASE = re.IGNORECASE
+    L          = re.L
+    LOCALE     = re.LOCALE
+    M          = re.M
+    MULTILINE  = re.MULTILINE
+    S          = re.S
+    DOTALL     = re.DOTALL
+    U          = re.U
+    UNICODE    = re.UNICODE
+    X          = re.X
+    VERBOSE    = re.VERBOSE
+    
+    _ = None
+
+    def __init__(self, pattern, flags=0):
+        self.pattern = pattern
+        self.flags   = flags
+        self.re = re.compile(pattern, flags)
+        self.groups     = self.re.groups
+        self.groupindex = self.re.groupindex
+        
+    def _regroup(self, m):
+        """
+        Given an _sre.SRE_Match object, create and return a corresponding
+        ReMatch object. 
+        """
+        result = ReMatch(m) if m else m
+        Re._ = result
+        return result
+
+    def __contains__(self, item):
+        # if not isinstance(item, basestring):
+        #     item = str(item)
+        return self._regroup(self.re.search(item))
+    
+    def into(self, obj):
+        self.retobj = obj
+        return self
+    
+    ### methods that return ReMatch objects
+    
+    def search(self, *args, **kwargs):
+        return self._regroup(self.re.search(*args, **kwargs))
+
+    def match(self, *args, **kwargs):
+        return self._regroup(self.re.match(*args, **kwargs))
+        
+    def finditer(self, *args, **kwargs):
+        for m in self.re.finditer(*args, **kwargs):
+            yield self._regroup(m)
+    
+    ### methods that don't need ReMatch objects
+      
+    def findall(self, *args, **kwargs):
+        return self.re.findall(*args, **kwargs)
+    
+    def split(self, *args, **kwargs):
+        return self.re.split(*args, **kwargs)
+    
+    def sub(self, *args, **kwargs):
+        return self.re.sub(*args, **kwargs)
+    
+    def subn(self, *args, **kwargs):
+        return self.re.subn(*args, **kwargs)
+    
+    def escape(self, *args, **kwargs):
+        # escape() is a function of the re module; compiled patterns lack it
+        return re.escape(*args, **kwargs)
+    
+
+class Glob(with_metaclass(MementoMetaclass, object)):
+    """
+    An item matches a Glob via Unix filesystem glob semantics.
+    
+    E.g. 'alpha' matches 'a*' and 'a????' but not 'b*'
+    """
+
+    def __init__(self, pattern):
+        self.pattern = pattern
+        
+    def __contains__(self, item):
+        return fnmatch.fnmatch(item, self.pattern)
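
The glob semantics in the ``Glob`` docstring come straight from the standard ``fnmatch`` module, so they can be checked directly. This short sketch uses ``fnmatch.fnmatchcase``, which always compares case-sensitively, whereas ``fnmatch.fnmatch`` (used by ``Glob``) first normalizes case according to the operating system's conventions. It also uses ``fnmatch.translate`` to show that a glob is simply a restricted regular expression.

```python
import fnmatch
import re

names = ['alpha', 'beta', 'Alpha']

# Case-sensitive glob test, independent of the host operating system.
matched = [n for n in names if fnmatch.fnmatchcase(n, 'a*')]

# A glob can be converted to an equivalent regular expression source.
as_regex = re.compile(fnmatch.translate('a????'))
five_letters = as_regex.match('alpha') is not None
too_long = as_regex.match('alphabet') is not None
```

If platform-independent behavior matters, ``fnmatchcase`` may be the safer choice inside ``__contains__``. The commit as written keeps ``fnmatch.fnmatch``.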
+
+try:
+    from testharness import import_from_parent, test_run
+    # raise ImportError
+    import_from_parent()
+except ImportError:
+    test_run = None
+
+from simplere import *
+
+def test_Re():
+    _ = 1
+    print "test_Re()"
+    tests = 'some string with things in it ok?'
+    testpat  = Re(r'\b(s\w*)\b')
+    testpat1 = Re(r'\b(s\w*)\b')
+    assert testpat is testpat1   # test memoization
+    if tests in Re(r'\b(s\w*)\b').into(_):
+        print Re._[1]
+        print _
+        print _[1]
+        assert _[1] == 'some'
+        assert _.end(1) == 4
+        assert _._match.group(1) == _[1]
+    print 'x'  * 6
+    
+    if tests in testpat:
+        print _[1]
+        assert _[1] == 'some'
+        assert _.end(1) == 4
+        assert _._match.group(1) == _[1]
+    else:
+        raise ValueError('yes it is!!!')
+    
+    found = testpat.findall(tests)
+    assert found == ['some', 'string']
+    
+    person = 'John Smith 48'
+    if person in Re(r'(?P<name>[\w\s]*)\s+(?P<age>\d+)'):
+        assert _.name == 'John Smith'
+        print _.name
+        assert int(_.age) == 48
+        assert _.name == _._match.group('name')
+        assert _.age  == _._match.group('age')
+    else:
+        raise ValueError('yes it is!!!')
+    
+    for found in Re(r'pattern (\w+)').finditer('pattern is as pattern does'):
+        print found[1]
+        assert isinstance(found, ReMatch)
+        assert found[1] in ['is','does']
+    
+    found = Re(r'pattern (\w+)').findall('pattern is as pattern does')
+    print found
+    assert found == 'is does'.split()
+    
+def test_Glob():
+    assert "alpha" in Glob("a*")
+    assert "beta" not in Glob("a*")
+    
+    if 'globtastic' in Glob('glob*'):
+        print "Yes! You're right. It is!"
+    else:
+        raise ValueError('YES IT IS')
+    
+if __name__ == '__main__':
+    if test_run:
+        test_run()
+    else:
+        for f in [test_Re, test_Glob]:
+            print "-" * 20, f.func_name, "-" * 20
+            f()
+        print "-" * 20, "done", "-" * 20
+[tox]
+envlist = py26, py27, py32, py33, pypy
+
+[testenv]
+changedir=test
+deps=
+    pytest 
+commands=py.test {posargs}