update operations inside a loop over a large query result take a surprisingly long time

Issue #3270 invalid
simo chao created an issue

When I query a large number of records and then loop over the result, updating a single record takes a lot of time.

from time import time

_users = db.session.query(WxUser).limit(10000).all()
for _user in _users:
    t1 = time()
    db.session.query(WxUser).filter_by(id=_user.id).update(dict(sex=WxUser.Sex.MALE))
    db.session.commit()
    t2 = time()
    print(t2 - t1)

Output:

0.242075920105
15.5323040485
16.6957418919
...

Each commit costs far more time than normal. With 'SQLALCHEMY_ECHO' enabled, the log shows that SQLAlchemy re-queries every record by id.

Comments (1)

  1. Mike Bayer repo owner

    There's no bug here.

    Two techniques will resolve the time spent.

    1. Turn off the in-Python evaluation of loaded objects to see whether they match the criteria:

        s.query(WxUser).filter_by(id=_user.id).update(dict(sex=WxUser.Sex.MALE), synchronize_session=False)
    

    docs: http://docs.sqlalchemy.org/en/rel_0_9/orm/query.html?highlight=query.update#sqlalchemy.orm.query.Query.update.params.synchronize_session

    2. Don't expire all 10000 objects in the session on each commit before calling update(). The slowdown indicates that for the default synchronize_session='evaluate' to function, it must hit every single WxUser object and evaluate its "id", which requires that each one be re-loaded from the database:

      session = Session(expire_on_commit=False)
    

    docs: http://docs.sqlalchemy.org/en/rel_0_9/orm/session.html#committing
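    The two techniques above can be combined into one self-contained sketch. This is a hypothetical stand-in (a minimal WxUser model with an in-memory SQLite database, not the reporter's actual schema): expire_on_commit=False keeps commit() from expiring every loaded object, and synchronize_session=False skips the in-Python matching step, so the per-iteration cost stays flat.

    ```python
    # Hypothetical minimal model standing in for the reporter's WxUser;
    # the real schema and Flask-SQLAlchemy setup are not shown in the issue.
    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import declarative_base, Session

    Base = declarative_base()

    class WxUser(Base):
        __tablename__ = "wx_user"
        id = Column(Integer, primary_key=True)
        sex = Column(String(1), default="F")

    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)

    # Fix 2: expire_on_commit=False means commit() does not expire every
    # object in the identity map, so nothing gets re-SELECTed afterwards.
    session = Session(engine, expire_on_commit=False)
    session.add_all(WxUser(id=i) for i in range(1, 101))
    session.commit()

    users = session.query(WxUser).all()
    for user in users:
        # Fix 1: synchronize_session=False emits the UPDATE without
        # scanning the session's loaded objects against the criteria.
        session.query(WxUser).filter_by(id=user.id).update(
            {"sex": "M"}, synchronize_session=False
        )
        session.commit()

    print(session.query(WxUser).filter_by(sex="M").count())
    ```

    Note that with synchronize_session=False the already-loaded objects keep their stale in-memory attribute values; re-query if the loop needs to read the updated column.
    
    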
