session.refresh(obj) spawns new connection that are held until session is closed
Environment: MySQL 14.14, SQLAlchemy 1.0.11, 1.0.16, 1.1.4
When using session.refresh(obj) inside session, new connection is spawned and are held until the end of session.
Code that i used:
session = session_maker()
try:
yield session
session.commit()
for obj in session:
session.refresh(obj)
except Exception as e:
session.rollback()
raise e
finally:
session.close()
with make_session(...) as session:
for mapped in [self._mapper.map(obj) for obj in objects]:
saved_entities.append(mapped)
session.add(mapped)
session.flush()
Hint: The curious thing is that when i use session.commit() after session.refresh(obj) the connection is reused by next session.refresh(obj) invocation.
Comments (14)
-
repo owner -
reporter The problem in this code is that. I am trying to create multiple objects at once which needs to be refreshed at the end of session. When refresh happens new connections are spawned and the connection pool is quickly exhausted causing the next refresh to timeout due to lack of available connection. I would expect that refresh after refreshing object would close the transaction/connection and not wait until close of entire session (alternatively next refresh would use already open connection). So next refresh would reuse the connection from the pool. I tried to do something like session.connection().close() but it did not make any changes, connection was still held until session is closed.
-
repo owner the multiple refresh() calls will all participate in the same transaction. there is only one new connection used for the full range of refresh() calls in your loop. refresh() is like any other Session method, the only way the Session has a connection is that a transaction is started, unless you are using autocommit mode.
I suspect the problem is that you really just want to turn expire_on_commit=False for this Session. that way there's no need to refresh the objects after the commit(). Without that setting, objects that are attached to the Session will only have state when a connection is procured. If you want the objects to remain attached and populated while the Session is between transactions, you need to turn that flag off.
-
reporter Actually I would like not to turn this to False. The reason why I use refresh after commit is that some data are filled by database triggers and I would like to use just created object in other parts of the code.
Unfortunately is not what you have described multiple refresh uses multiple transactions. I am attaching log from MySQL:
#! 7897 Query SELECT 1 7897 Query INSERT INTO ... 7897 Query INSERT INTO ... ... 7907 Query SELECT 1 7907 Query COMMIT 7897 Query COMMIT 7907 Query ROLLBACK 7897 Query ROLLBACK 7916 Query SELECT 1 7916 Query SELECT ... where id = 3192 7935 Query SELECT 1 7935 Query SELECT ... where id = 3186 7934 Query SELECT 1 7934 Query SELECT ... where id = 3184 ... 7935 Query ROLLBACK 7987 Query ROLLBACK 7916 Query ROLLBACK 7972 Query ROLLBACK 7934 Query ROLLBACK 7907 Query ROLLBACK 7953 Query ROLLBACK 7954 Query ROLLBACK 7971 Query ROLLBACK
-
repo owner please attach a full MCVE that reproduces this behavior. nothing like that happens here obviously.
-
reporter I tried to create simple example but surprisingly i failed. In my example it is working as intended (1 connection is used for entire session). I only wonder how the people that have done our framework for db achieved such behavior.
Sorry for the trouble.
-
reporter - changed status to invalid
-
repo owner are you able to reproduce this in your own environment or this is not something you observed directly?
-
reporter I am able to reproduce it using our framework which includes wrapper for creation of sqlalchemy engine which sets parameters and adds implementation of ping_connection plus some json to db model mapper plus some cache for sqlalchemy engines. Nothing that on first look could produce such issue.
I have tried to create minimum example using code snippets from tutorial and the session management from our code.
-
repo owner set your pool size to 1, max_overflow=0, and the timeout to nothing, it should produce a stack trace on the second concurrent checkout.
-
reporter I will try to do that but I will get back to you tomorrow.
Thanks for the assist.
-
reporter I did what you asked but I don't see anything interesting here:
Traceback (most recent call last): File "/home/.../lib/python2.7/site-packages/generic_resource_manager/api/generic_api.py", line 140, in POST for to_insert_entity_json in to_create_json_payload]) File "/home/.../lib/python2.7/site-packages/.../autolog.py", line 30, in logging_wrapper return_value = func(*args, **kwargs) File "/home/.../lib/python2.7/site-packages/generic_resource_manager/service/manager_base.py", line 259, in create_many session.add(mapped) File "/usr/local/lib/python2.7/contextlib.py", line 24, in __exit__ self.gen.next() File "/home/.../lib/python2.7/site-packages/.../sql_alchemy_helper.py", line 186, in make_session raise e TimeoutError: QueuePool limit of size 1 overflow 0 reached, connection timed out, timeout 0
-
repo owner well that stack trace is not at the point of refresh(), it's at the point of your function yielding a session. It would suggest you have lots of calls to that method opening many sessions simutaneously.
-
reporter Yes, if you look on the mySQL log you will notice that before commit for the connection that inserts entries there is some test query using other connection. As for many concurrent calls I disabled rest of the system for that tests and it was not different. During this one session (which is created locally from session maker) multiple connection is used.
- Log in to comment
session.commit() ends the transaction, and the connection in use is returned to the connection pool. session.refresh() then is being asked to load new data from the database, so it starts a new transaction and hence a new connection is acquired, which will remain until the transaction is again explicitly ended. What alternate behavior are you expecting?