1.1.0b3 Core PostgreSQL INSERT CTE rollback bug with engine.execute and default pool_reset_on_return

Issue #3805 resolved

Christopher Wilson created an issue 2016-09-23

I'm using the new insert CTEs to implement a select or insert in one statement using the engine.execute() approach. The default reset_on_return works fine for separate select() followed by an insert() as the insert transaction is COMMITted. However, using an insert CTE, the transaction is ROLLBACKed whenever the connection is returned to the pool.

MWE: using PostgreSQL 9.3.

import sqlalchemy as sa

metadata = sa.MetaData()

exTable = sa.Table('the_table_name',metadata,
        sa.Column('id', sa.Integer, primary_key = True),
        sa.Column('text', sa.TEXT, unique=True),
        )

theText = "some text here"
returningCols = [exTable.c.id]

connStr = 'postgresql+psycopg2://user:pass@server.example.com/the_example_db'

engine = sa.create_engine(connStr,
                         #pool_reset_on_return = 'commit', #uncomment this line for working example
                         )
lookup = exTable.select().with_only_columns(returningCols).where(exTable.c.text==theText)
lookupCTE = lookup.cte('selq')
insertQuery = exTable.insert().returning(*returningCols).from_select([exTable.c.yaml],sa.select([sa.literal(theText)]).where(~sa.exists(sa.select([sa.text('*')]).select_from(lookupCTE))))
insertCTE = insertQuery.cte('insq')

selParams = [sa.text('*')]
lookupSel = sa.select(selParams).select_from(lookupCTE)
lookupIns = sa.select(selParams).select_from(insertCTE)
query = lookupSel.union_all(lookupIns)
idres = engine.execute(query)
theId = None
for item in idres:
  print item
  theId = item['id']

idLookup = engine.execute(lookup)
print ("Should find ID: %d..." % (theId))
for item in idLookup:
  print "item found: %d" % item['id']

prints:

(1,)
Should find ID: 1...

Setting pool_reset_on_return = 'commit':

(2,)
Should find ID: 2...
item found: 2

This leads to confusing behavior: * if the data is present, the correct ID is returned, * if the data is not present, then the data is inserted, the corresponding ID returned, and then that ID is invalid as the transaction is rolled back at the end of the execute() call.

Yes, I can arbitrarily use the pool_reset_on_return='commit' option for the engine or manually wrap it in a transaction, but the expected behavior would be a COMMIT transaction on the use of INSERT, UPDATE, or DELETE CTEs, since that is the behavior of a separate INSERT command with the default pool_reset_on_return behavior:

idres = engine.execute(insertQuery)
theId = None
for item in idres:
  print item
  theId = item['id']

idLookup = engine.execute(lookup)
print ("Looking up row %d..." % (theId))
for item in idLookup:
  print "item found: %d" % item['id']

Gives:

(3,)
Looking up row 3...
item found: 3

Comments (13)

Christopher Wilson reporter
- edited description
- 2016-09-23T20:36:22+00:00
Christopher Wilson reporter
- edited description
- 2016-09-23T20:37:40+00:00
Mike Bayer repo owner
sure this is because the statement you're running is ultimately a SELECT, and a SELECT doesn't signal autocommit=True when a transaction is not otherwise being used. You can work around this using my_stmt = my_stmt.execution_options(autocommit=True).

if you can let me know that resolves, we can work on possibly setting the flag automatically when a DML CTE is embedded into a SELECT.
- 2016-09-23T21:28:42+00:00
Mike Bayer repo owner
- changed milestone to 1.1
- changed component to sql
- 2016-09-23T21:29:06+00:00
Christopher Wilson reporter
- removed component
- changed title to 1.1.0b3 Core PostgreSQL INSERT CTE rollback bug with engine.execute and default pool_reset_on_return
- edited description
- removed milestone
tried to clarify that the default insertQuery command will execute the COMMIT on the default engine transaction when the connection is returned to the pool, whereas the insertCTE will not.
- 2016-09-23T21:29:24+00:00
Christopher Wilson reporter
Ooops, I was editing that description when you responded. My bad. Thanks for the work around and the fantastic product!
- 2016-09-23T21:31:13+00:00
Mike Bayer repo owner
- changed milestone to 1.1
- changed component to sql
- 2016-09-23T21:31:43+00:00
Christopher Wilson reporter
Thanks! query=query.execution_options(autocommit=True) does work. However, I do think the better option is an autocommit with a DML CTE in use.
- 2016-09-23T21:45:13+00:00
Christopher Wilson reporter
Interesting behavior on the workaround. Specifying a query with execution_options(autocommit=True) as a cte ignores the execution_options. For example:
```
insertQuery = exTable.insert().returning(*returningCols).from_select([exTable.c.text],sa.select([sa.literal(theText)]).where(~sa.exists(sa.select([sa.text('*')]).select_from(lookupCTE)))).execution_options(autocommit=True)
insertCTE = insertQuery.cte('insq')
```
At first inspection, I would expect the autocommit (if explicitly set) to be propagated to whatever query used the CTE. I realize there could be some interesting implementation edge cases if two CTEs were explicitly given different autocommit options.

Of course, this could probably be mitigated by simply doing automatic inspection on the final query and setting the commit flag if the query contains a DML CTE.
- 2016-09-26T13:21:13+00:00
Mike Bayer repo owner
IIUC we're talking here about autocommit on an inner query object propagating out to the outermost. So right now nothing does that. To fix this, we'd be looking to propagate autocommit=True to containing queries in all cases.
- 2016-09-26T14:41:19+00:00
Mike Bayer repo owner
yeah and as far as if they conflict, shrugs, will probably just push towards True and not towards False. autocommit is not "harmful", theoretically everthing could just say COMMIT at the end rather than ROLLBACK.
- 2016-09-26T14:49:03+00:00
Christopher Wilson reporter
Yes. You understand correctly. It could be an implementation to get the autocommit working for DML CTE's. Inserts/Updates/Deletes automatically set the autocommit as true in UpdateBase, allowing CTE to contain/propagate that seems a straightforward approach, depending on how easy it is to lift the execution_options from subqueries, and to prefer autocommit=True over autocommit = False.

Are there any other cases besides autocommit where it would be useful to propagate options up from a subquery to the parent query unless explicitly overwritten on the parent query?
- 2016-09-26T15:26:44+00:00
Mike Bayer repo owner
- changed status to resolved
Propagate execution_options at compile stage

Compiler can now set up execution options and additionally will propagate autocommit from embedded CTEs.

Change-Id: I19db7b8fe4d84549ea95342e8d2040189fed1bbe Fixes: ~~#3805~~

→ <<cset 71030d67bd7c>>
- 2016-10-05T20:51:35+00:00
Log in to comment

Assignee: –

Type: bug

Priority: major

Status: resolved

Component: sql

Milestone: 1.1

Votes: 0

Watchers: 1