Intensional (rule-defined) sets for Python.

Overview

There are two ways of defining a set: intensional and extensional. Extensional sets like set([1,3,5,'daisy']) enumerate every member of the set explicitly.

Intensional sets, in contrast, are defined by rules. For example, "the set of all prime numbers" or "every word beginning with 'a' and ending with 't'". Intensional sets are often infinite. It may be possible to generate a list of their members, but it's not as simple as a "give me everything you've got!" for loop.

Once you know what you're looking for, intensional sets are everywhere.

Python doesn't represent them directly, but regular expressions, many list comprehensions, and all manner of testing and filtering operations are really faces of the intensional set concept. Many functions test whether something 'qualifies'. os.path.isdir(d), for example, tests whether d is in the set of legitimate directories, and isinstance(s, str) tests whether s is a member of the set of str objects. Even the core if conditional can be construed as testing for membership in an intensional set--the set of all items that pass the test.
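To make the idea concrete, here is a minimal sketch of a rule-defined set built on Python's `__contains__` protocol. The class name `RuleSet` and the example rules are hypothetical and are not part of intensional's API; they only illustrate how a membership test can stand in for an explicit list of members.

```python
class RuleSet(object):
    """A set defined by a membership rule rather than by listing members."""

    def __init__(self, rule):
        # rule: any callable that returns a truthy value for members
        self.rule = rule

    def __contains__(self, item):
        # The `in` operator dispatches here
        return bool(self.rule(item))


# "the set of all even integers": infinite, yet membership is easy to test
evens = RuleSet(lambda n: isinstance(n, int) and n % 2 == 0)

# "every word beginning with 'a' and ending with 't'"
a_to_t = RuleSet(
    lambda w: isinstance(w, str) and w.startswith('a') and w.endswith('t')
)

assert 10 in evens
assert 7 not in evens
assert 'ant' in a_to_t
assert 'bat' not in a_to_t
```

Such a set cannot be iterated with a plain for loop, since only the rule is stored, not the members.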

Many such tests have a temporal aspect--they determine whether a value is a member at the time of the test. In the future, the answer may change, if conditions change. Others are invariant over time. %%734 will never be a valid Python identifier, no matter how many times it's tested--unless the rules of the overall Python universe change.
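The difference between time-dependent and time-invariant membership can be sketched with the standard library alone. The identifier pattern below is a simplified, ASCII-only assumption made for illustration; Python's real identifier rules are broader.

```python
import os
import re
import tempfile

# Time-dependent membership: "the set of existing directories"
d = tempfile.mkdtemp()
was_member = os.path.isdir(d)   # a member while the directory exists
os.rmdir(d)
is_member = os.path.isdir(d)    # no longer a member once it is removed

# Time-invariant membership: a simplified, ASCII-only identifier rule
# (an assumption for illustration, not Python's full definition)
IDENTIFIER = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')
never_valid = IDENTIFIER.match('%%734') is None

assert was_member and not is_member
assert never_valid
```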

Intensional sets are part and parcel of all programming, even if they're not called by that name or explicitly manifested. intensional helps Python programs represent intensional sets directly.

Usage

Instead of:

match = re.search(r'(pattern\w*)', some_string)
if match:
    print match.group(1)

You can do an en passant test:

if some_string in Re(r'(pattern\w*)'):
    print _[1]

Note this turns the sense of the matching around, asking "is a given string in this pattern?" The Re pattern is an intensionally defined set--namely, the set of all strings matching the pattern. This makes excellent sense in cases where you have a clear intent for the match--for example, determining "is the given string within the set of all legitimate commands?"
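One way such a pattern-as-set could be built is sketched below. This is not intensional's actual implementation: `PatternSet` and its `last_match` attribute are hypothetical names, and the match is stored on the object as a stand-in for the `_` result used in the examples here.

```python
import re


class PatternSet(object):
    """Hypothetical sketch: the set of all strings matching a pattern."""

    def __init__(self, pattern):
        self.regex = re.compile(pattern)
        self.last_match = None  # stand-in for the `_` result

    def __contains__(self, string):
        # Remember the match so the caller can inspect groups afterwards
        self.last_match = self.regex.search(string)
        return self.last_match is not None


# "the set of all legitimate commands" (example commands are made up)
commands = PatternSet(r'^(start|stop|status)$')

assert 'stop' in commands
assert commands.last_match.group(1) == 'stop'
assert 'explode' not in commands
assert commands.last_match is None
```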

If you prefer the more traditional re calls, they are also available in a convenient en passant style:

if Re(r'(pattern\w*)').search('string with pattern in it'):
    print _[1]

Note that the underscore name _ holds the result.

This works even better with named pattern components, such as:

person = 'John Smith 48'
if Re(r'(?P<name>[\w\s]*)\s+(?P<age>\d+)').search(person):
    print _.name, "is", _.age, "years old"
else:
    print "don't understand '{}'".format(person)

### need go-back mechanism

It's possible also to loop over the results, as in:

for found in Re(r'(pattern\w*)').findall('pattern is as pattern does'):
    print found

Re objects are memoized for efficiency, so they are only compiled once, regardless of how many times they're mentioned in the program.

Say you have a class Thing that requires expensive computation to create, or that should be created only once. You can make that happen simply by adding one line to its definition:

from mementos import MementoMetaclass

class Thing(object):

    __metaclass__ = MementoMetaclass

    def __init__(self, name):
        self.name = name

    ...

Then Thing objects will be memoized:

t1 = Thing("one")
t2 = Thing("one")
assert t1 is t2
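For readers curious how this kind of memoization can work, here is a minimal sketch of a caching metaclass. It is not the mementos source: `MemoizingMeta` and `Widget` are hypothetical names, the key construction is an assumption, and the sketch uses the Python 3 `metaclass=` spelling rather than the `__metaclass__` attribute shown above.

```python
class MemoizingMeta(type):
    """Hypothetical metaclass: one instance per (class, call signature)."""

    _cache = {}  # shared by every class that uses this metaclass

    def __call__(cls, *args, **kwargs):
        # The class is part of the key, so different classes never collide.
        # Arguments must be hashable for this simple key to work.
        key = (cls, args, tuple(sorted(kwargs.items())))
        if key not in MemoizingMeta._cache:
            MemoizingMeta._cache[key] = super().__call__(*args, **kwargs)
        return MemoizingMeta._cache[key]


class Widget(metaclass=MemoizingMeta):  # Python 3 spelling
    def __init__(self, name):
        self.name = name


w1 = Widget("one")
w2 = Widget("one")
assert w1 is w2                       # same call signature, same object
assert Widget("two") is not w1        # different arguments, different object
assert Widget(name="one") is not w1   # keyword form yields a different key
```

Because the key is built from the arguments exactly as passed, positional and keyword forms of the same logical call are cached separately in this sketch.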

Notes

  • Derived from an ActiveState recipe by Valentino Volonghi.

  • It is safe to memoize multiple classes at the same time. They will all be stored in the same cache, but their class is a part of the cache key, so the values are distinct.

  • This implementation is not thread-safe, in and of itself. If you're in a multi-threaded environment, wrap object instantiation in a lock.

  • Be careful with call signatures. This metaclass caches on call signature, which can vary if keyword args are used. E.g. def func(a, b=2) could be called func(1), func(1,2), func(a=1), func(1, b=2), or func(a=1, b=2)--and all resolve to the same logical call. And this is just for two parameters! If there is more than one kwarg, they can be arbitrarily ordered, creating many logically identical permutations. Thank goodness Python doesn't allow kwargs to come before positional args, else there'd be even more ways to make the same call.

    So if you instantiate an object once, then again with a logically identical call but using a different calling structure/signature, the object won't be created and cached just once--it will be created and cached multiple times:

    o1 = Thing("lovely")
    o2 = Thing(name="lovely")
    assert o1 is not o2   # because the call signature is different
    

    This may degrade performance, and can also create errors, if you're counting on mementos to create just one object. So don't do that. Use a consistent calling style, and it won't be a problem.

    In most cases, this isn't an issue, because objects tend to be instantiated with a limited number of parameters, and you can take care that you instantiate them with parallel call signatures. Since this works 99% of the time and has a simple implementation, it's worth the price of this inelegance. For the 1% edge-cases where multiple call signature variations must be conclusively resolved to a unique canonical signature, the inspect module could be used (e.g. getargvalues() or getcallargs()) to create such a unified key.

  • Not to be confused with the memento module, which does something completely different.

  • In some cases, you may not want the entire initialization-time call signature to define the cache key--only part of it. (Some arguments to __init__ may be useful, but not really part of the object's 'identity'.) The best option is to design __init__ without superfluous attributes, then create a second method that sets useful-but-not-essential data. E.g.:

    class OtherThing(object):
    
        __metaclass__ = MementoMetaclass
    
        def __init__(self, name):
            self.name = name
            self.color = None
            self.weight = None
    
        def set(self, color=None, weight=None):
            self.color = color or self.color
            self.weight = weight or self.weight
            return self
    
    ot1 = OtherThing("one").set(color='blue')
    ot2 = OtherThing("one").set(weight='light')
    assert ot1 is ot2
    assert ot1.color == ot2.color == 'blue'
    assert ot1.weight == ot2.weight == 'light'
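
The canonical-signature idea mentioned in the notes on call signatures can be sketched with the inspect module. This is an illustration only, not something mementos does: the helper name `canonical_key` is hypothetical, and it uses `inspect.signature(...).bind(...)` (Python 3.5+ for `apply_defaults()`), a newer relative of the `getcallargs()` function named above.

```python
import inspect


def canonical_key(func, *args, **kwargs):
    """Return a hashable key that is equal for logically identical calls."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # fill in defaults so func(1) matches func(1, 2)
    # Arguments come back in parameter order, however they were passed
    return tuple(bound.arguments.items())


def func(a, b=2):
    return a + b


keys = {
    canonical_key(func, 1),
    canonical_key(func, 1, 2),
    canonical_key(func, a=1),
    canonical_key(func, 1, b=2),
    canonical_key(func, a=1, b=2),
    canonical_key(func, b=2, a=1),
}

assert len(keys) == 1  # every calling style collapses to one key
assert canonical_key(func, 1) == (('a', 1), ('b', 2))
```

A key built this way costs a signature lookup on every call, which is the trade-off against the simpler "use a consistent calling style" advice.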
    

Installation

pip install mementos

(You may need to prefix this with "sudo " to authorize installation.)