invalid input syntax with on_conflict_do_update and JSONB

Issue #3888 resolved

Samuel Colvin created an issue 2017-01-13

Using postgres 9.5, python 3.6 and aiopg with SQLAlchemy==1.1.4

I've looked and couldn't find another issue pertaining to this.

I have a table with a JSONB field, I'm trying to run

from sqlalchemy.dialects.postgresql import insert as pg_insert

...
id = data.pop('id')
await conn.execute(
    pg_insert(sa_contractors)
    .values(id=id, **data)
    .on_conflict_do_update(
        index_elements=[sa_contractors.c.id],
        set_=data
    )
)

and getting the exception:

...
  File "/home/samuel/code/socket-server/env/lib/python3.6/site-packages/aiopg/connection.py", line 134, in _ready
    state = self._conn.poll()
psycopg2.DataError: invalid input syntax for integer: "Bio"
LINE 1: ...ARRAY['name', 'sort_index', 'value', 'id'], ARRAY['Bio', 0, ...

Is it possible that sqlalchemy is forming the query wrongly for the update of the json field like this SO question?

The project is open source so I can give links to code if that helps.

Comments (11)

Samuel Colvin reporter
Forgot to say if I remove the JSONB field from the set data it works fine.
- 2017-01-13T14:23:40+00:00
Mike Bayer repo owner
- changed status to wontfix
Hi there -

Thanks for reporting this issue. However, it is non-actionable for two reasons. The first, more minor reason is: "The project is open source so I can give links to code if that helps." Unfortunately we cannot act on bug reports that don't supply an mcve as we don't have the resources to debug other people's projects. But much more critically is that the aiopg library entirely replaces the majority of SQLAlchemy statement execution internals with its own homegrown approaches, which we obviously cannot support. The handling of special datatypes is deeply linked into the statement execution internals.

This is not all to say there is not a bug in SQLAlchemy, however this would need to be demonstated against SQLAlchemy directly without replacing the entire Connection/Engine execution system with that of another library (e.g. a single standalone script that does not import anything outside of sqlalchemy modules itself).
- 2017-01-13T15:43:29+00:00
Mike Bayer repo owner
- changed status to invalid
- 2017-01-13T15:44:14+00:00

Samuel Colvin reporter

Hi Michael, I'd suggest it's rather abrupt of you to mark this as "invalid" without giving me the opportunity to provide the extra proof you needed. Surely you want people to report bugs?

Complete working example with just sqlalachemy:

#!/usr/bin/env python3.6
import os

import psycopg2
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.engine.url import URL
from sqlalchemy.ext.declarative import declarative_base

db = dict(
    password=os.getenv('POSTGRESQL_PASSWORD'),
    host='localhost',
    port='5432',
    user='postgres',
)

conn = psycopg2.connect(**db)

DB_NAME = 'jsonb_upsert_test'
conn.autocommit = True
cur = conn.cursor()
cur.execute('DROP DATABASE IF EXISTS {}'.format(DB_NAME))
cur.execute('CREATE DATABASE {}'.format(DB_NAME))
cur.close()
conn.close()

Base = declarative_base()

class Person(Base):
    __tablename__ = 'people'
    id = Column(Integer, primary_key=True, nullable=False)
    name = Column(String(63))
    extra = Column(JSONB)

people = Person.__table__

db['username'] = db.pop('user')
engine = create_engine(URL(database=DB_NAME, drivername='postgres', **db))
Base.metadata.create_all(engine)

extra = [1, 2, 3]
with engine.connect() as conn:
    data1 = dict(id=123, name='joe', extra=extra)
    print('trying upsert without JSON obj in "update" data...')
    id1 = data1.pop('id')
    q = (
        pg_insert(people)
        .values(id=id1, **data1)
        .on_conflict_do_update(
            index_elements=[people.c.id],
            set_={k: v for k, v in data1.items() if k != 'extra'}
        )
    )
    conn.execute(q)
    print('successful, trying again so we do an update...')
    conn.execute(q)
    print('successful, current db:')
    for row in conn.execute(select([people.c.id, people.c.name, people.c.extra])):
        print(row)

    print('\n####################################\n')
    print('trying upsert **with** JSON obj in "update" data...')
    data2 = dict(id=321, name='joe', extra=extra)
    id2 = data2.pop('id')
    conn.execute(
        pg_insert(people)
        .values(id=id2, **data2)
        .on_conflict_do_update(
            index_elements=[people.c.id],
            set_={k: v for k, v in data2.items()}
        )
    )
    print('successful')

engine.dispose()

output:

(env) /path/to ➤  POSTGRESQL_PASSWORD=<password> ./demo.py 
trying upsert without JSON obj in "update" data...
successful, trying again so we do an update...
successful, current db:
(123, 'joe', [1, 2, 3])

####################################

trying upsert **with** JSON obj in "update" data...
Traceback (most recent call last):
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 469, in do_execute
    cursor.execute(statement, parameters)
psycopg2.ProgrammingError: column "extra" is of type jsonb but expression is of type integer[]
LINE 1: ...CONFLICT (id) DO UPDATE SET name = 'joe', extra = ARRAY[1, 2...
                                                             ^
HINT:  You will need to rewrite or cast the expression.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./demo.py", line 73, in <module>
    set_={k: v for k, v in data1.items()}
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 945, in execute
    return meth(self, multiparams, params)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
    context)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1393, in _handle_dbapi_exception
    exc_info
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 202, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 185, in reraise
    raise value.with_traceback(tb)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
    context)
  File "/path/to/env/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 469, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) column "extra" is of type jsonb but expression is of type integer[]
LINE 1: ...CONFLICT (id) DO UPDATE SET name = 'joe', extra = ARRAY[1, 2...
                                                             ^
HINT:  You will need to rewrite or cast the expression.
 [SQL: 'INSERT INTO people (id, name, extra) VALUES (%(id)s, %(name)s, %(extra)s) ON CONFLICT (id) DO UPDATE SET name = %(param_1)s, extra = %(param_2)s'] [parameters: {'id': 321, 'name': 'joe', 'extra': '[1, 2, 3]', 'param_1': 'joe', 'param_2': [1, 2, 3]}]

I've used a list for extra so the error matches the original one, but you get another error if you use a dict.

If this doesn't prepresent sufficient information, please let me know what you need.

2017-01-13T16:23:16+00:00

Mike Bayer repo owner
Hi Samuel -

re: "I'd suggest it's rather abrupt of you to mark this as "invalid"", I apologize for this, and I certainly don't do this for every case. However in this case I'd note that I have very prominent messaging on the "post a new bug report" screen which emphatically requests that the poster provide a complete test case, and additionally links to the SQLAlchemy Community Guide which features very detailed instructions on what we're looking for in a bug report. TBH,the suggestion that I would need to go look at your own project, figure out how to install/run/use it and then guess my way into reproducing your test case is diametrically opposed to the spirit of these guidelines.

Combined with the code example that uses aiopg, a library that seems it may be trying to ride on SQLAlchemy's success without doing the real work of collaborating with us on a true async solution and instead invents their own execution system, I felt the need to make the case that my time is very valuable and that bug posters should at least make an attempt to follow the guidelines.

In the vast majority of cases where people aren't able to provide a test case initially, they are unable or unwilling to provide a test case at all and these posters usually disappear. That you've come back with a test case in half an hour makes this a very unique case.

Thank you for posting a proper test case and I will look into your test case now.
- 2017-01-13T16:33:20+00:00
Mike Bayer repo owner
- changed status to open
- 2017-01-13T16:33:29+00:00
Samuel Colvin reporter
makes sense, thanks for looking into it. And thank you for sqlalchemy, it's great (even when used with aiopg :-)).
- 2017-01-13T16:37:14+00:00
Mike Bayer repo owner
your case should work for now if you set a literal:
```
from sqlalchemy import literal

set_={"somevalue": literal([1, 2, 3], JSONB)}
```
patch going through at https://gerrit.sqlalchemy.org/285
- 2017-01-13T19:37:37+00:00
Samuel Colvin reporter
That's great, thank you get much. literal solution is working fine. Not sure if you want me to close this now or wait until it's merged/deployed?
- 2017-01-13T19:45:34+00:00
Mike Bayer repo owner
you're all done, the issue closes automatically when i merge the fix.
- 2017-01-13T19:51:33+00:00
Mike Bayer repo owner
- changed status to resolved
Use full column->type processing for ON CONFLICT SET clause

Fixed bug in new "ON CONFLICT DO UPDATE" feature where the "set" values for the UPDATE clause would not be subject to type-level processing, as normally takes effect to handle both user-defined type level conversions as well as dialect-required conversions, such as those required for JSON datatypes. Additionally, clarified that the keys in the set_ dictionary should match the "key" of the column, if distinct from the column name. A warning is emitted for remaining column names that don't match column keys; for compatibility reasons, these are emitted as they were previously.

Fixes: ~~#3888~~ Change-Id: I67a04c67aa5f65e6d29f27bf3ef2f8257088d073

→ <<cset afd78a37dafe>>
- 2017-01-13T20:47:18+00:00
Log in to comment

Assignee: –

Type: bug

Priority: minor

Status: resolved

Component: –

Milestone: –

Votes: 0

Watchers: 1