lazy initialization to support per-thread engine proxy

Issue #40 resolved
Former user created an issue

SQLAlchemy would be much easier to use in some scenarios if it were possible to rebind a mapped table to a new engine.

For instance, in a web app meant to be paste-deployable, connection parameters (including the engine to use) must be thread-safe and mutable per request, because two copies of the same app may be installed in the same server process, but with different configurations. Currently, as far as I can tell, this would require creating all of the mappers at the start of each request -- far too much redundant work to repeat on every request.

It would be much more efficient to be able to do something like:

table.use_engine(engine_for_this_request)

Except it looks like it would require a great deal of work to make that operation thread-safe.

So, what I wanted to do was create a proxy engine that could be used at definition time, and that would delegate to an actual engine at connection time. Unfortunately, some of the implementation in schema.py makes this impossible. Specifically, Table and Column reach into the engine when they are instantiated, e.g. (from Table.__init__):

self._impl = self.engine.tableimpl(self)

If those sorts of eager initializations were replaced with lazy initializations using properties, then it would be possible to build a delegating proxy engine that would enable define-once, use-many-engines tables and mappers.
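For illustration, a delegating proxy engine along those lines might look roughly like the following. This is only a minimal sketch; the class and method names (ProxyEngine, connect_to) are assumptions for this example, not an existing SQLAlchemy API.

import threading

class ProxyEngine(object):
    """Sketch: stands in for a real engine at table-definition time and
    forwards every attribute access to whichever engine the current
    thread has bound, so the binding can change per request."""

    def __init__(self):
        self._local = threading.local()

    def connect_to(self, real_engine):
        # called at the start of each request with that request's engine
        self._local.engine = real_engine

    def __getattr__(self, name):
        # only reached for attributes not defined on the proxy itself
        engine = getattr(self._local, 'engine', None)
        if engine is None:
            raise AttributeError(
                "no engine bound to the current thread; call connect_to() first")
        return getattr(engine, name)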

Example patch for schema.py:

Index: sqlalchemy/schema.py
===================================================================
--- sqlalchemy/schema.py    (revision 841)
+++ sqlalchemy/schema.py    (working copy)
@@ -233,7 +233,7 @@
         column, which generally isnt in column lists.
         """
         self.name = str(name) # in case of incoming unicode
-        self.type = type
+        self.coltype = type
         self.args = args
         self.key = kwargs.pop('key', name)
         self.primary_key = kwargs.pop('primary_key', False)
@@ -247,6 +247,20 @@

     original = property(lambda s: s._orig or s)
     engine = property(lambda s: s.table.engine)
+
+    def _get_type(self):
+        # some caching would be nice here, if types are invariant
+        # per engine
+        if self.table.engine is not None:
+            return self.table.engine.type_descriptor(self.coltype)
+        return self.coltype
+    type = property(_get_type)
+
+    def _get_impl(self):
+        if self.table.engine is not None:
+            return self.table.engine.columnimpl(self)
+        return None        
+    _impl = property(_get_impl)

     def __repr__(self):
        return "Column(%s)" % string.join(
@@ -271,10 +285,6 @@
             if self.primary_key:
                 table.primary_key.append(self)
         self.table = table
-        if self.table.engine is not None:
-            self.type = self.table.engine.type_descriptor(self.type)
-            
-        self._impl = self.table.engine.columnimpl(self)

         if self.default is not None:
             self.default = ColumnDefault(self.default)
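Assuming the lazy-initialization patch above and a delegating proxy like the earlier sketch, define-once, use-many-engines usage might look roughly like this. engine_for() is a hypothetical per-request helper, and the query call is sketched against this era's Table API, so treat it as illustrative only.

from sqlalchemy import Table, Column, Integer, String

proxy = ProxyEngine()          # the delegating proxy sketched above

# the table is defined once, against the proxy rather than a real engine
users = Table('users', proxy,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(40)))

def handle_request(request):
    # bind this thread's proxy target to whatever engine the request's
    # configuration calls for; engine_for() is a hypothetical helper
    proxy.connect_to(engine_for(request))
    return users.select().execute().fetchall()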

Comments (2)

  1. Mike Bayer (repo owner)

    OK, well at first I was thinking, "you don't have to do that...." but yes, since a class is a global declaration, which points to its mapper, which points to tables, the same table would have to talk to different DBs per "request". I would say right off that building such an app, i.e. one that context-switches all over the place, is generally a huge pain in the ass and takes away from one of the original reasons people like Python: that it's easy! Also I was surprised that Paste has this requirement -- it rings a bell -- but I would guess that nothing is seriously going to make use of it... it's just too easy to start separate servers for different applications, or to use something like mod_python, which has PythonInterpPerXXXX functionality... that would also perform better.

    But yes, the need will exist regardless. Still, I think this need is going to be an exception case, so if an application is not using the proxy engine, it should not have to pay any price. Placing the burden of "context-switching" and "caching" of attributes within the Table implies that the ProxyEngine doesn't worry about it, and also that anything else which wants to hold onto ProxyEngine values has to worry about the same thing. Additionally, it's code overhead on every attribute access, no matter what engine is being used.

    So why not have ProxyEngine do more of the work, and have it return proxying versions of TypeEngine, TableImpl, ColumnImpl, and whatever else as well? That way ProxyEngine can define some high-performing, cache-aware scheme to quickly return thread-local/context-local attributes in only one place, instead of sprinkled around Table and whatever else in the sql.* package that might need it, and no external code has to worry about it. If ProxyEngine is not used, as I think will be more common, then the additional per-attribute overhead is gone. It also means that if you do something like repr(table), or table.c.col1.type == table2.c.col2.type, it will work consistently, since it's the same object instances being returned.
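For illustration, the alternative described in the comment above might be sketched as follows: the ProxyEngine hands out a single proxying impl object, which resolves and caches the real impl per engine in thread-local storage. current_engine() is an assumed accessor on the proxy, not an existing API.

import threading

class ProxyTableImpl(object):
    """Sketch: one stable object handed to the Table, which looks up and
    caches the real impl for whichever engine the current thread is using."""

    def __init__(self, proxy_engine, table):
        self._proxy_engine = proxy_engine
        self._table = table
        self._local = threading.local()

    def _real_impl(self):
        engine = self._proxy_engine.current_engine()   # assumed accessor
        cache = getattr(self._local, 'cache', None)
        if cache is None:
            cache = self._local.cache = {}
        impl = cache.get(engine)
        if impl is None:
            # resolve once per engine per thread, so repeated attribute
            # access stays cheap and returns the same object instances
            impl = cache[engine] = engine.tableimpl(self._table)
        return impl

    def __getattr__(self, name):
        return getattr(self._real_impl(), name)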
