session.refresh(obj) spawns new connection that are held until session is closed

Issue #3866 invalid
literg created an issue

Environment: MySQL 14.14, SQLAlchemy 1.0.11, 1.0.16, 1.1.4

When using session.refresh(obj) inside session, new connection is spawned and are held until the end of session.

Code that i used:

session = session_maker()
try:
    yield session
    session.commit()
    for obj in session:
        session.refresh(obj)
except Exception as e:
    session.rollback()
    raise e
finally:
    session.close()
with make_session(...) as session:
        for mapped in [self._mapper.map(obj) for obj in objects]:
            saved_entities.append(mapped)
            session.add(mapped)
        session.flush()

Hint: The curious thing is that when i use session.commit() after session.refresh(obj) the connection is reused by next session.refresh(obj) invocation.

Comments (14)

  1. Mike Bayer repo owner

    session.commit() ends the transaction, and the connection in use is returned to the connection pool. session.refresh() then is being asked to load new data from the database, so it starts a new transaction and hence a new connection is acquired, which will remain until the transaction is again explicitly ended. What alternate behavior are you expecting?

  2. literg reporter

    The problem in this code is that. I am trying to create multiple objects at once which needs to be refreshed at the end of session. When refresh happens new connections are spawned and the connection pool is quickly exhausted causing the next refresh to timeout due to lack of available connection. I would expect that refresh after refreshing object would close the transaction/connection and not wait until close of entire session (alternatively next refresh would use already open connection). So next refresh would reuse the connection from the pool. I tried to do something like session.connection().close() but it did not make any changes, connection was still held until session is closed.

  3. Mike Bayer repo owner

    the multiple refresh() calls will all participate in the same transaction. there is only one new connection used for the full range of refresh() calls in your loop. refresh() is like any other Session method, the only way the Session has a connection is that a transaction is started, unless you are using autocommit mode.

    I suspect the problem is that you really just want to turn expire_on_commit=False for this Session. that way there's no need to refresh the objects after the commit(). Without that setting, objects that are attached to the Session will only have state when a connection is procured. If you want the objects to remain attached and populated while the Session is between transactions, you need to turn that flag off.

  4. literg reporter

    Actually I would like not to turn this to False. The reason why I use refresh after commit is that some data are filled by database triggers and I would like to use just created object in other parts of the code.

    Unfortunately is not what you have described multiple refresh uses multiple transactions. I am attaching log from MySQL:

    #!
    
     7897 Query     SELECT 1
                     7897 Query     INSERT INTO ...
                     7897 Query     INSERT INTO ...
    ...
                     7907 Query     SELECT 1
                     7907 Query     COMMIT
                     7897 Query     COMMIT
                     7907 Query     ROLLBACK
                     7897 Query     ROLLBACK
                     7916 Query     SELECT 1
                     7916 Query     SELECT ... where id = 3192
                     7935 Query     SELECT 1
                     7935 Query     SELECT ... where id = 3186
                     7934 Query     SELECT 1
                     7934 Query     SELECT ... where id = 3184
    ...
                     7935 Query     ROLLBACK
                     7987 Query     ROLLBACK
                     7916 Query     ROLLBACK
                     7972 Query     ROLLBACK
                     7934 Query     ROLLBACK
                     7907 Query     ROLLBACK
                     7953 Query     ROLLBACK
                     7954 Query     ROLLBACK
                     7971 Query     ROLLBACK
    
  5. Mike Bayer repo owner

    please attach a full MCVE that reproduces this behavior. nothing like that happens here obviously.

  6. literg reporter

    I tried to create simple example but surprisingly i failed. In my example it is working as intended (1 connection is used for entire session). I only wonder how the people that have done our framework for db achieved such behavior.

    Sorry for the trouble.

  7. Mike Bayer repo owner

    are you able to reproduce this in your own environment or this is not something you observed directly?

  8. literg reporter

    I am able to reproduce it using our framework which includes wrapper for creation of sqlalchemy engine which sets parameters and adds implementation of ping_connection plus some json to db model mapper plus some cache for sqlalchemy engines. Nothing that on first look could produce such issue.

    I have tried to create minimum example using code snippets from tutorial and the session management from our code.

  9. Mike Bayer repo owner

    set your pool size to 1, max_overflow=0, and the timeout to nothing, it should produce a stack trace on the second concurrent checkout.

  10. literg reporter

    I did what you asked but I don't see anything interesting here:

    Traceback (most recent call last):
      File "/home/.../lib/python2.7/site-packages/generic_resource_manager/api/generic_api.py", line 140, in POST
        for to_insert_entity_json in to_create_json_payload])
      File "/home/.../lib/python2.7/site-packages/.../autolog.py", line 30, in logging_wrapper
        return_value = func(*args, **kwargs)
      File "/home/.../lib/python2.7/site-packages/generic_resource_manager/service/manager_base.py", line 259, in create_many
        session.add(mapped)
      File "/usr/local/lib/python2.7/contextlib.py", line 24, in __exit__
        self.gen.next()
      File "/home/.../lib/python2.7/site-packages/.../sql_alchemy_helper.py", line 186, in make_session
        raise e
    TimeoutError: QueuePool limit of size 1 overflow 0 reached, connection timed out, timeout 0
    
  11. Mike Bayer repo owner

    well that stack trace is not at the point of refresh(), it's at the point of your function yielding a session. It would suggest you have lots of calls to that method opening many sessions simutaneously.

  12. literg reporter

    Yes, if you look on the mySQL log you will notice that before commit for the connection that inserts entries there is some test query using other connection. As for many concurrent calls I disabled rest of the system for that tests and it was not different. During this one session (which is created locally from session maker) multiple connection is used.

  13. Log in to comment