proposal: sqlalchemy.ext.altpersist

Issue #1518 wontfix
Mike Bayer repo owner created an issue

Yes I know, just take back what I said at Pycon.

First the use case:

  1. An existing application uses relational storage and has many classes mapped.

  2. certain classes within the model are to be migrated away from the relational storage and into an alternative database system, which is almost certainly a key/value oriented system.

  3. SQLAlchemy's existing attribute-tracking and unit of work workflow is still desired. the objects in question are still linked to relationally-persisted objects, still need to issue SQL upon persist operations that update the database, and as such would like to be "flushed" in the same workflow as they do within the relational system.

  4. as is usually the case with k/v stores, ACID is not terribly important. It is a given that if part of a flush() fails on the relational side, nonrelational data may have already been persisted, can't be rolled back, etc.

  5. Querying is totally to be hand-rolled outside of SQLAlchemy. These are not relational objects so there are no tables, foreign keys, SELECT statements, aggregates, aggregate update/deletes, or anything. a plain Query(MyAltPersistedClass) without explicit addition of such a capability throws an error. there are also no relation()s - such linkages again need to be hand-rolled.
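Point 4 above can be illustrated with a small, self-contained sketch. It does not use SQLAlchemy: `kv_store` (a plain dict), the `flush()` helper and the `users` table are hypothetical stand-ins, and the standard-library `sqlite3` module plays the relational side. The k/v write happens first and survives even though the relational transaction is rolled back.

```python
import sqlite3

# Stand-in for a key/value database; it has no transaction to roll back.
kv_store = {}


def flush(conn, kv_items, rows):
    """Hypothetical flush: write the k/v data, then the relational rows."""
    for key, value in kv_items:
        kv_store[key] = value
    try:
        # The connection context manager commits on success and rolls
        # back if an exception escapes the block.
        with conn:
            conn.executemany(
                "INSERT INTO users (id, name) VALUES (?, ?)", rows
            )
    except sqlite3.IntegrityError:
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# The duplicate primary key makes the relational half of the flush fail.
ok = flush(conn, [("profile:1", {"bio": "hi"})], [(1, "a"), (1, "b")])
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

After this runs, the relational table is empty while `kv_store` still holds `profile:1`, which is exactly the inconsistency the use case accepts as a given.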

the extension would be an expert-level extension that would allow construction of plugins designed towards various k/v storage engines. It would present:

  1. a subclassable base class that provides hooks to satisfy the minimal mapper() requirements. These hooks would proxy into the private underscore methods currently used and would provide the bridge from a public extension into the ORM's internals:

a. _configure_class_instrumentation would still occur probably in the exact same way, with possible hooks on the extension class.
b. deferred_scalar_loader / _load_scalar_attributes() - if reloading of attributes is not available, the plugin should somehow signal to the session that its objects cannot be expired. this would have some impact on _expire_state, and is the only change to current internals that I can see so far here. If the extension can handle it, an extension method get_reloader() would allow a reload function to be returned, or alternatively a method like refresh(object, attrs).
c. User-defined extensions would use the attributes package directly to instrument attributes. The system of MapperProperty objects would not be used by extensions - it's too heavy-handed and would be difficult to use. That said it still could be leveraged by a user-defined class if that was desired. By using attributes to instrument, we get session, state, identity_map, and unit of work compatibility as well as history and "dirty" states.
d. _save_obj() and _delete_obj() would hook into persist(), update() and remove() methods on the extension. nothing is assumed about these, and a user-defined extension would probably want to query attribute history in order to figure out what attributes have changed, etc. new pk values would need to be generated here too.
e. _register_dependencies() and _register_processors() would be linked to get_dependencies() and get_processors(), which return tuples of mappers/extension mappers, and dependent processor functions, respectively.
f. base_mapper and polymorphic_iterator() would be proxied as well in order to support inheritance patterns, although initial extensions would probably not use these features.
g. _identity_key_from_state(state) would be via get_key(object). the full tuple-based identity key consisting of (class, tuple(pk vals)) would be generated by the base extension class.
h. querying. obviously we aren't getting into that at all. Query can be subclassed, and an extension that wishes to build into Query would need to provide a subclass that does whatever is desired when an alternatively-persisted class is requested.
i. cascade_iterator(). yup, need that too. cascading along related mappers would have to be provided by user defined classes.
j. _post_configure_properties() - post_configure() allows delayed configuration after all mappers are assembled.
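To make the shape of the hooks above more concrete, here is a rough sketch in plain Python. Nothing in it is an existing SQLAlchemy API: the hook names (get_key, persist, update, remove, get_reloader, get_dependencies, get_processors, post_configure) are taken from the list above, while the class names `AltPersistExtension`, `DictStoreExtension` and `Document`, the `identity_key()` helper, the method signatures and the dict-backed store are all assumptions made for illustration. The wiring into the mapper's private methods is left out entirely.

```python
import itertools


class AltPersistExtension:
    """Hypothetical base class for a k/v persistence plugin.

    Hook names follow the proposal; signatures are assumptions.
    """

    def __init__(self, class_):
        self.class_ = class_

    # Hooks a plugin is expected to implement.
    def get_key(self, obj):
        raise NotImplementedError

    def persist(self, obj):
        raise NotImplementedError

    def update(self, obj):
        raise NotImplementedError

    def remove(self, obj):
        raise NotImplementedError

    # Optional hooks with do-nothing defaults.
    def get_reloader(self):
        # Assumption: None signals that objects cannot be expired.
        return None

    def get_dependencies(self):
        return ()

    def get_processors(self):
        return ()

    def post_configure(self):
        pass

    # Built by the base class: (class, tuple(pk vals)).
    def identity_key(self, obj):
        key = self.get_key(obj)
        if not isinstance(key, tuple):
            key = (key,)
        return (self.class_, key)


class DictStoreExtension(AltPersistExtension):
    """Toy plugin that persists into an in-memory dict."""

    def __init__(self, class_):
        super().__init__(class_)
        self.store = {}
        self._ids = itertools.count(1)

    def get_key(self, obj):
        return obj.id

    def persist(self, obj):
        # New primary key values are generated at persist time.
        obj.id = next(self._ids)
        self.store[obj.id] = dict(vars(obj))

    def update(self, obj):
        self.store[obj.id] = dict(vars(obj))

    def remove(self, obj):
        del self.store[obj.id]


class Document:
    def __init__(self, title):
        self.id = None
        self.title = title


ext = DictStoreExtension(Document)
doc = Document("draft")
ext.persist(doc)
doc.title = "final"
ext.update(doc)
```

A real plugin would consult attribute history inside update() rather than rewriting the whole record, and would be driven by the unit of work instead of being called by hand as it is here.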

The MapperExtension system is not part of the system - user-defined objects would have to use other methods and/or reimplement the usage of MapperExtension. SessionExtension is fully usable though.

Comments (5)

  1. Mike Bayer reporter

    The extension should probably expose InstanceState too. we'd otherwise spend a lot of time marshalling back and forth and it would require a whole range of translation methods to the attributes package.
