session.flush_object() ?

Issue #3100 resolved
Mike Bayer repo owner created an issue

This is for the common use case where folks need to INSERT or UPDATE a single row, fast.

session.flush_object(some_object)

This would hook into the persistence.py mechanics to flush just that one object. No collections, no related items. It would skip all of flush() and unitofwork.py and go directly to save_obj, or, if the object were marked deleted, to delete_obj.

Essentially it's what a simple ORM would do with an obj.save() method.

For starters, it would not do any foreign key syncs from a related many-to-one; that's a UOW function. I'm sure people will complain soon enough though.
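To make the "simple ORM obj.save()" comparison concrete, here is a minimal sketch using only the standard-library sqlite3 module. The `Row` class and its `save()` method are hypothetical and are not part of SQLAlchemy; the table layout is an assumption chosen for illustration.

```python
import sqlite3


class Row:
    """Hypothetical single-row record with a save() method (not SQLAlchemy)."""

    def __init__(self, x, y, id=None):
        self.id, self.x, self.y = id, x, y

    def save(self, conn):
        if self.id is None:
            # No primary key yet: INSERT and pick up the generated id.
            cur = conn.execute(
                "INSERT INTO a (x, y) VALUES (?, ?)", (self.x, self.y)
            )
            self.id = cur.lastrowid
        else:
            # Already persistent: UPDATE only this row, no related objects.
            conn.execute(
                "UPDATE a SET x = ?, y = ? WHERE id = ?",
                (self.x, self.y, self.id),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER)")

row = Row(x=1, y=2)
row.save(conn)  # emits an INSERT
row.x = 10
row.save(conn)  # emits an UPDATE
saved = conn.execute("SELECT id, x, y FROM a").fetchall()
```

Note what is missing here: no cascades, no collections, no foreign key synchronization. That is the trade-off the proposal describes.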

Comments (4)

  1. Mike Bayer reporter

    poc for this:

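    from sqlalchemy import inspect
    from sqlalchemy.orm import persistence

    # NOTE: FakeUOWTransaction is a stand-in for UOWTransaction and is
    # not defined in this snippet.
    def fast_save(obj, session):
        state = inspect(obj)
        # no identity key yet means the object is pending, i.e. an INSERT
        isinsert = state.key is None
        mapper = state.mapper
        states = [state]
        uowtransaction = FakeUOWTransaction(session)
    
        with session.begin(subtransactions=True) as transaction:
            uowtransaction.transaction = transaction
            # go straight to the persistence layer, bypassing flush()
            persistence.save_obj(mapper, states, uowtransaction, single=True)
            if isinsert:
                # move the object from pending to persistent
                instance_key = mapper._identity_key_from_state(state)
                state.key = instance_key
                session.identity_map.replace(state)
                session._new.pop(state)
            # reset change history so the object is no longer dirty
            state._commit_all(state.dict, instance_dict=session.identity_map)
            session._register_altered(states)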
    
  2. Mike Bayer reporter

    OK, what we also need is fast insert for multiple objects, no PK, that will use executemany() and not worry about postfetch. How do we do that?
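For readers unfamiliar with the DBAPI side of this, the sketch below shows what "use executemany()" means at the driver level, using the standard-library sqlite3 module. It is an illustration only, not how the ORM implements it; the table and parameter names are assumptions.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, x INTEGER, y INTEGER)")

# One parameter tuple per row; primary keys are left to the database.
params = [
    (random.randint(1, 1000), random.randint(1, 1000)) for _ in range(10000)
]

with conn:
    # A single statement is executed against the whole parameter list.
    conn.executemany("INSERT INTO a (x, y) VALUES (?, ?)", params)

inserted = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
```

The catch is that executemany() generally does not hand back the generated primary key for each row, which is why this style of insert only fits when the caller does not need those values fetched afterwards.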

  3. Mike Bayer reporter

    starting with:

    11854532c7f7137fc3a40c4faf913

    runner:

    import cProfile
    import io
    import pstats
    import contextlib
    import time
    
    @contextlib.contextmanager
    def profiled():
        # profile the enclosed block and report call count and wall time
        pr = cProfile.Profile()
        pr.enable()
        now = time.time()
        yield
        total = time.time() - now
        pr.disable()
        # io.StringIO replaces the Python 2-only StringIO module
        s = io.StringIO()
        ps = pstats.Stats(pr, stream=s).sort_stats('cumulative')
        print("Total calls: %s Total Time: %s" % (ps.total_calls, total))
    
    
    
    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    import random
    
    Base = declarative_base()
    
    
    class A(Base):
        __tablename__ = 'a'
    
        id = Column(Integer, primary_key=True)
        x = Column(Integer)
        y = Column(Integer)
    
    
    def bulk():
        e = create_engine("sqlite://", echo=False)
    
        Base.metadata.create_all(e)
        sess = Session(e)
    
        with profiled():
            sess.bulk_save(
                [
                    A(x=random.randint(1, 1000), y=random.randint(1, 1000))
                    for i in range(10000)
                ]
            )
    
    
    def flush(ids):
        e = create_engine("sqlite://", echo=False)
    
        Base.metadata.create_all(e)
        sess = Session(e)
    
        with profiled():
            if ids:
                sess.add_all(
                    [
                        A(id=i,
                            x=random.randint(1, 1000), y=random.randint(1, 1000))
                        for i in range(10000)
                    ]
                )
            else:
                sess.add_all(
                    [
                        A(x=random.randint(1, 1000), y=random.randint(1, 1000))
                        for i in range(10000)
                    ]
                )
            sess.flush()
    
    flush(False)
    flush(True)
    bulk()
    

    call counts and timings, in the order flush(False), flush(True), bulk():

    $ python test.py
    Total calls: 1600844 Total Time: 1.86700105667
    Total calls: 1170455 Total Time: 1.18801999092
    Total calls: 650394 Total Time: 0.738065004349
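The three output lines follow the call order at the bottom of the runner: flush(False), flush(True), bulk(). A small sketch of the arithmetic, using only the figures printed above, shows how each flush variant compares to the bulk run:

```python
# (total calls, total seconds) copied from the output above
runs = {
    "flush(False)": (1600844, 1.86700105667),
    "flush(True)": (1170455, 1.18801999092),
    "bulk()": (650394, 0.738065004349),
}

bulk_calls, bulk_time = runs["bulk()"]

# ratio of each run to the bulk run, for both call count and wall time
ratios = {
    name: (calls / bulk_calls, secs / bulk_time)
    for name, (calls, secs) in runs.items()
}

for name, (call_ratio, time_ratio) in ratios.items():
    print("%-13s calls x%.2f  time x%.2f" % (name, call_ratio, time_ratio))
```

On these numbers the plain flush without ids costs roughly 2.5 times the bulk run in both calls and wall time, and supplying ids up front closes a good part of that gap.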
    
  4. Mike Bayer reporter
    • A new series of :class:`.Session` methods which provide hooks directly into the unit of work's facility for emitting INSERT and UPDATE statements has been created. When used correctly, this expert-oriented system can allow ORM-mappings to be used to generate bulk insert and update statements batched into executemany groups, allowing the statements to proceed at speeds that rival direct use of the Core. Fixes #3100.

    → <<cset 3f1477e2ecf3>>
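As a rough usage sketch of what these bulk methods look like in released SQLAlchemy versions: the names below, `Session.bulk_save_objects()` and `Session.bulk_insert_mappings()`, are the released method names as far as I know, and differ from the `sess.bulk_save()` call used in the runner above, which targets the in-progress commit. The import path for `declarative_base` assumes SQLAlchemy 1.4 or later.

```python
import random

from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base  # 1.4+ import path

Base = declarative_base()


class A(Base):
    __tablename__ = "a"

    id = Column(Integer, primary_key=True)
    x = Column(Integer)
    y = Column(Integer)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as sess:
    # Bulk-save mapped objects; relationships and cascades are not handled.
    sess.bulk_save_objects(
        [
            A(x=random.randint(1, 1000), y=random.randint(1, 1000))
            for _ in range(1000)
        ]
    )
    # Or skip object construction entirely and pass plain dictionaries.
    sess.bulk_insert_mappings(
        A,
        [
            {"x": random.randint(1, 1000), "y": random.randint(1, 1000)}
            for _ in range(1000)
        ],
    )
    sess.commit()
    total = sess.query(A).count()
```

The dictionary form avoids creating ORM objects at all, which is where much of the remaining per-row overhead goes; the object form is more convenient when instances already exist.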
