Implicit ROLLBACK and PostgreSQL 9.5

Issue #4139 closed
Dariusz Suchojad
created an issue

Hello,

a Zato user posted a question and findings regarding an implicit ROLLBACK issued by session objects.

During investigation it turned out that, if I interpret it correctly, in PostgreSQL 9.5 a warning is issued if such ROLLBACK without a transaction is executed, which ultimately may lead to confusion and log files growing spuriously.

In particular, here are links to PostgreSQL's source code where it looks that a recent change now causes warnings to be logged for such ROLLBACK.

I'm not clear how to proceed further - I recommended that this warning be filtered out by syslog for now but it may hide genuine issues that use the same warning, so it probably is not the best approach (there are no warning/error codes in PostgreSQL as far as I can see and the same message may be triggered from different places).

Do I understand it correctly that this is a core feature of ORM and is unlikely to change? Or is there a way to prevent this implicit ROLLBACK?

The SQLAlchemy version in question is 0.9.9 but we will be upgrading to latest 1.x in Zato soon.

I'm happy to discuss it in PostgreSQL's mailing lists but I'm just not sure what the best course of action is - clearly the behaviour is not wanted but I'm not certain on whose end there should be any work carried out to deal with it.

I'm also curious - hasn't anyone raised this question before? This probably concerns all users of SQLAlchemy and PostgreSQL 9.5+?

Regards.

Comments (14)

  1. Michael Bayer repo owner

    So this kind of thing has to be handled at the driver level, e.g. psycopg2, since the "BEGIN" is already implicit as per pep-249. So we as users of psycopg2 don't have any immediate way of knowing if this implicit BEGIN were emitted unless we tracked other calls to the API which is much easier for the API to do itself; also, some drivers don't emit the BEGIN except for certain kinds of statements. The driver is the one which knows for certain if the BEGIN was emitted or not and it is already tracking it.

    However, googling, the "there is no transaction in progress" warning does not see to be a Postgresql 9.5 thing. here is from 2015 and 9.5 wasn't released until a year later.

    finally, I have PG 9.6.5 here locally and no such warning is emitted if I call rollback():

    >>> from sqlalchemy import create_engine
    >>> e = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")
    >>> c = e.connect()
    >>> e.dialect.server_version_info
    (9, 6, 5)
    >>> raw = c.connection.connection
    >>> raw
    <connection object at 0x7f068be382c0; dsn: 'host=localhost password=xxx user=scott dbname=test', closed: 0>
    >>> raw.rollback()
    >>> raw.rollback()
    >>> raw.rollback()
    >>> raw.rollback()
    >>> c.close()
    >>> 
    

    so perhaps this user is on a very old psycopg2 version? would need software versions and code that reproduces the issue, thanks.

  2. Michael Bayer repo owner

    Also I note that poster is using pg8000, which is a vastly less used driver and I was hoping maybe that's where the behavior is but again I cannot reproduce.

  3. Michael Bayer repo owner

    here's the most direct test, PG 9.6.5, my conf file no setting for log_min_messages and states that it defaults to warning:

    import pg8000
    
    conn = pg8000.connect(user="scott", password="tiger", host="localhost", database="test")
    
    conn.rollback()
    conn.rollback()
    conn.rollback()
    

    can you try this script and see if it emits a warning?

  4. Dariusz Suchojad reporter

    Thanks Michael, this warning appears to be used in several places by PostgreSQL and it's not clear if the message from 2015 pertained to the same situation.

    In xact.c there is a commit which adds the warning to stray ROLLBACK statements - unless I am mistaken, that is, I'm not really familiar with PG's source code but please look up 'case TBLOCK_IMPLICIT_INPROGRESS:' in xact.c if you are curious.

    Version of pg8000 is 1.10.2 - I will ask to repeat with the latest one.

    My impression after reading the stanza below was that it was issued by SQLAlchemy, hence the ticket.

    """ When the close() method is called, the referenced DBAPI connection is released to the connection pool. From the perspective of the database itself, nothing is actually “closed”, assuming pooling is in use. The pooling mechanism issues a rollback() call on the DBAPI connection so that any transactional state or locks are removed, and the connection is ready for its next usage """

  5. Dariusz Suchojad reporter

    Hi Mike,

    I have had a look at pg8000's source and there is no implicit rollback in there - I was using pg8000 1.10.2 (I mention it because 1.11.x saw internal rewrites but still, nothing related to this case).

    I created two gists to reproduce the situation.

    The first simply returns the expected result but the latter, using sessionmaker, will add ROLLBACK at the end. Could you please help me interpret it?

    I'm using SA 0.9.9 which I realize is not the latest one and updating to 1.x is planned but it has been hampered by the fact that we cannot set custom attributes on models unless we use the WritableTuple approach - it is doable, but it needs some time to implement, hence we are still on 0.9.9.

    https://gist.github.com/dsuch/2c4786c304c5ce63a43e61ab2478a9d0 https://gist.github.com/dsuch/de67d63ee142f575dcd2a583feb6d572

    Thanks.

  6. Michael Bayer repo owner

    Hi Dariusz -

    the implicit ROLLBACK is a behavior of SQLAlchemy's connection pool and is both normal and necessary; without it, locks and transactional state remain upon the connection as it is returned to the connection pool where it will both leave idle locks hanging open and interfere with subsequent use. While there is an option to turn it off, this option is dangerous as it means your application must absolutely not fail to COMMIT or ROLLBACK any transaction it makes even in the case of application failure. You can use reset_on_return if you'd like (see http://docs.sqlalchemy.org/en/latest/core/pooling.html?highlight=reset_on_return#sqlalchemy.pool.Pool.params.reset_on_return , this parameter is also accepted by create_engine) however you will likely have more severe issues as a result.

    in your test scripts, the first script doesn't illustrate "ROLLBACK" in the log but it is in fact doing this. In the second script, the Session itself is doing a rollback() of an explicit transaction because you are not calling "commit()". When the session is closed it must dispose of this transaction which means "rollback()", and this ROLLBACK because it is outside of the connection pool is logged. Then when the connection is returned to the connection pool, the implicit "rollback()" proceeds. The DBAPI should be tracking that in fact the connection was already "rolled back" and not be emitting additional statements so no warning occurs.

    When I run either script, there is no warning and no bug is reproducible on this end.

  7. Dariusz Suchojad reporter

    Thanks, I think this confirms what I initially read in documentation, i.e.

    """ The pooling mechanism issues a rollback() call on the DBAPI connection so that any transactional state or locks are removed, and the connection is ready for its next usage. """

    I understand that by DBAPI we mean pg8000 in this case, which should be emitting ROLLBACK only if there is a transaction in progress, since it is the place that best knows, by definition, whether it is in a transaction already?

    This makes sense but if you still have a moment - how can it be reconciled with the fact that there is an implicit BEGIN, even for SELECT statements, like in the second gist - right before SELECT 1+1?

    Or, is that BEGIN (implicit) on SA level and DBAPI does not necessarily have a transaction yet at this point?

    I'm happy to check in pg8000's source code if that ROLLBACK can be prevented - I would just like to ask for guidance as to who issues which BEGIN.

    Thanks again.

  8. Michael Bayer repo owner

    This makes sense but if you still have a moment - how can it be reconciled with the fact that there is an implicit BEGIN, even for SELECT statements, like in the second gist - right before SELECT 1+1?

    the implicit BEGIN is part of the DBAPI, e.g. pg8000 in this case. See https://www.python.org/dev/peps/pep-0249/#commit. This is not written completely clearly, but by "autocommit must be initially off", that means a transaction is implicitly present.

    databases don't operate without a transaction in progress. What you usually see when you use a SQL command line is actually "autocommit" - each statement you type implicitly runs in a transaction that is committed at the end. By typing "BEGIN", you turn off this "autocommit" feature and state that the transaction should continue until ROLLBACK or COMMIT is called. pep 249 states that all DBAPIs should first and foremost work without autocommit being turned on.

    pg8000 has an "autocommit" property here: http://pythonhosted.org/pg8000/dbapi.html#pg8000.Connection.autocommit which is exposed in SQLAlchemy via the AUTOCOMMIT isolation level: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#pg8000-transaction-isolation-level

    if you turn on this isolation level, then the connection will unconditionally autocommit all statements. SQLAlchemy's transactional constructs on top of it are not aware of this setting, and continue to function normally, but are ignored by the DBAPI (at least if pg8000 works correctly, they should be ignored).

    I'm happy to check in pg8000's source code if that ROLLBACK can be prevented

    Still don't know why the ROLLBACK is a problem. Again, I keep hearing about a "warning" but am not shown how to reproduce.

  9. Markus Liuska

    Greetings,

    I am the user who originally posted this issue into the Zato forum.

    Based on earlier discussion, I have done testing with scripts that were provided by Michael against PostgreSQL versions 9.5.9 and 9.6.6.

    Environment

    Ubuntu Server LTS and the PostgreSQL installed directly from the Ubuntu apt repository.

    Versions:

    • Ubuntu 16.04.3 LTS
    • PostgreSQL 9.5.9 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit
    • Python 2.7.12

    Testing

    I started by creating new database called "test", with user scott, then I took code snippets from previous posts and pasted them into separated files; pg8000_test.py and psycopg2_test.py. For avoiding any conflicts, I created separated virtual environments for both of tests.

    pg8000 test

    #//pg8000_test.py 
    import pg8000
    conn = pg8000.connect(user="scott", password="tiger", host="localhost", database="test")
    conn.rollback()
    conn.rollback()
    conn.rollback()
    

    As I ran the test, transaction warnings appeared on postgresql-9.5-main.log.

    (v_env) developer@xxxx-17:~/dbtest$ pip freeze
    pg8000==1.10.2
    pkg-resources==0.0.0
    
    (v_env) developer@xxxx-17:~/dbtest$ python pg8000_test.py
    
    developer@xxxx-17:/var/log/postgresql$ sudo tailf postgresql-9.5-main.log
    
    2017-12-11 10:35:48 UTC [state:00000, session:5a2e5f84.1082, procid:4226, session-ln:1] scott@test LOG:  execute pg8000_statement_0/pg8000_portal_0: rollback
    2017-12-11 10:35:48 UTC [state:25P01, session:5a2e5f84.1082, procid:4226, session-ln:2] scott@test WARNING:  there is no transaction in progress
    2017-12-11 10:35:48 UTC [state:00000, session:5a2e5f84.1082, procid:4226, session-ln:3] scott@test LOG:  execute pg8000_statement_0/pg8000_portal_1: rollback
    2017-12-11 10:35:48 UTC [state:25P01, session:5a2e5f84.1082, procid:4226, session-ln:4] scott@test WARNING:  there is no transaction in progress
    2017-12-11 10:35:48 UTC [state:00000, session:5a2e5f84.1082, procid:4226, session-ln:5] scott@test LOG:  execute pg8000_statement_0/pg8000_portal_2: rollback
    2017-12-11 10:35:48 UTC [state:25P01, session:5a2e5f84.1082, procid:4226, session-ln:6] scott@test WARNING:  there is no transaction in progress
    

    psycopg2 test

    from sqlalchemy import create_engine
    e = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")
    c = e.connect()
    print(e.dialect.server_version_info)
    raw = c.connection.connection
    print(raw)
    
    raw.rollback()
    raw.rollback()
    raw.rollback()
    raw.rollback()
    c.close()
    

    I followed previous procedure with the psycopg2 test, however when connection was created with psycopg2 warnings were missing completely.

    (v_env_psy) developer@xxxx-17:~/dbtest$ pip freeze
    pkg-resources==0.0.0
    psycopg2==2.5.3
    SQLAlchemy==0.9.9
    
    (v_env_psy) developer@xxxx-17:~/dbtest$ python psycopg2_test.py 
    (9, 5, 9)
    <connection object at 0x7f7ac4c336e0; dsn: 'dbname=test user=scott password=xxxxx host=localhost', closed: 0>
    
    
    developer@xxxx-17:/var/log/postgresql$ sudo tailf  postgresql-9.5-main.log
    
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:1] scott@test LOG:  statement: BEGIN
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:2] scott@test LOG:  statement: SELECT t.oid, typarray
            FROM pg_type t JOIN pg_namespace ns
                ON typnamespace = ns.oid
            WHERE typname = 'hstore';
    
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:3] scott@test LOG:  statement: ROLLBACK
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:4] scott@test LOG:  statement: BEGIN
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:5] scott@test LOG:  statement: select version()
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:6] scott@test LOG:  statement: select current_schema()
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:7] scott@test LOG:  statement: show transaction isolation level
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:8] scott@test LOG:  statement: SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:9] scott@test LOG:  statement: SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:10] scott@test LOG:  statement: SELECT 'x' AS some_label
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:11] scott@test LOG:  statement: ROLLBACK
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:12] scott@test LOG:  statement: BEGIN
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:13] scott@test LOG:  statement: show standard_conforming_strings
    2017-12-11 10:53:27 UTC [state:00000, session:5a2e63a7.1272, procid:4722, session-ln:14] scott@test LOG:  statement: ROLLBACK
    2017-12-11 10:53:27 UTC [state:08006, session:5a2e63a7.1272, procid:4722, session-ln:15] scott@test LOG:  could not receive data from client: Connection reset by peer
    

    PG configuration and tests

    PostgreSQL log snippets use custom log_line_prefix and log_statement is set to 'all'. The log_min_messages is commented out (defaulted). However, just to make sure, I ran tests with log_statements commented out as well and warning messages stayed as expected.

    Running tests with PG 9.6.6

    I wanted to see if the warnings would be written into the log with PG 9.6.5. Sadly I only managed to do testing with PG 9.6.6. However my results were identical to previous tests with older version.

    (v_env) developer@xxxx-17:~/dbtest$ python pg8000_test.py 
    (v_env_psy) developer@xxxx-17:~/dbtest$ python psycopg2_test.py 
    (9, 6, 6)
    <connection object at 0x7f497ae256e0; dsn: 'dbname=test user=scott password=xxxxx host=localhost', closed: 0>
    
    
    root@xxxx-17:/var/log/postgresql# tailf postgresql-9.6-main.log 
    2017-12-11 13:26:51 EET [3200-1] scott@test WARNING:  there is no transaction in progress
    2017-12-11 13:26:51 EET [3200-2] scott@test WARNING:  there is no transaction in progress
    2017-12-11 13:26:51 EET [3200-3] scott@test WARNING:  there is no transaction in progress
    2017-12-11 13:27:08 EET [3223-1] scott@test LOG:  could not receive data from client: Connection reset by peer
    

    EOP

    I might be wrong, but I think that Dariusz is looking possibility for preventing the ROLLBACK as the transaction can be controlled via Zato's code. Anyway, I hope that this post gives enough information for reproducing this issue, if it is seen necessary to do so.

    Ps. Thank you both for your time.

  10. Michael Bayer repo owner

    OK so the warnings are in PG's log, that's useful.

    Anyway, psycopg2 does not have the issue because it properly bookkeeps for ROLLBACKs. the bug is in pg8000. I would suggest reporting this issue to pg8000 and/or just using psycopg2 which is vastly more widely used than pg8000.

    If you'd like to turn off implicit rollback in SQLAlchemy:

    • set pool reset_on_return to None - prevents any extra ROLLBACK when returning connection to the pool, however this is very dangerous as many forms of execution (like metadata.create_all(engine)) rely upon this ROLLBACK to end the transaction (which note is always present unless autocommit is set on the connection)

    • when using the Session, set it up with autocommit=True. The session.close() here is emitting the other ROLLBACK otherwise.

    • set AUTOCOMMIT isolation on the database connection, driver should not emit ROLLBACK or COMMIT ever.

    • use psycopg2

    • report bug to pg8000

    The job of tracking the "rollback" count on the connection so there is not an extra ROLLBACK is the job of the driver, not SQLAlchemy, no action to be taken here.

  11. Dariusz Suchojad reporter

    Thanks for all your pointers, I am truly grateful for it!

    We will follow up in pg8000's tracker, the reason we are using this library is that it is pure-Python so it is much easier to reason about it in a multi-greenlet/gevent-based server such as Zato.

  12. Log in to comment