PostgreSQL Sequence not created by metadata.create_all() from declarative mixin
I'd been copying this snippet from my old cookbook to add a surrogate key to a declarative table class:
class MixinIntKey(object):
@declared_attr
def key(cls):
return Column(
pg.INTEGER,
Sequence(f"sqk_{cls.__tablename__}", 1, 1),
nullable=False,
primary_key=True
)
This code was working before (some 3-4 years ago, I think around version 0.8 ): it generates the Sequence when I build the database objects through metadata.create_all(). The only change I made to the snippet was to update it to use format strings (Python 3.6).
After some googling around and testing (see attached file/link), the Sequence can be generated only if the metadata is passed during Sequence init:
Sequence(f"sqk_{cls.__tablename__}", 1, 1, metadata=metadata)
My environment: psycopg2==2.7.1, SQLAlchemy==1.1.8, Python 3.6 64bit on Windows 8.1 64bit, PostgreSQL 9.6 server
Comments (17)
-
repo owner -
repo owner I can't run your attached test. I get:
#! sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) data type integer has no default operator class for access method "gist" HINT: You must specify an operator class for the index or define a default operator class for the data type.
the test case has many extraneous details that need to be removed. So far the issue looks like you have "schema" set up for your MetaData and you expect that to take place for the Sequence, which would indicate there's no bug here.
-
repo owner yup with schema, we get what's expected:
CREATE SEQUENCE sqk_a INCREMENT BY 1 START WITH 1 2017-04-01 11:59:43,919 INFO sqlalchemy.engine.base.Engine {} 2017-04-01 11:59:43,920 INFO sqlalchemy.engine.base.Engine COMMIT 2017-04-01 11:59:43,925 INFO sqlalchemy.engine.base.Engine CREATE TABLE test_schema.a ( id INTEGER NOT NULL, key INTEGER NOT NULL, PRIMARY KEY (id, key) ) 2017-04-01 11:59:43,925 INFO sqlalchemy.engine.base.Engine {} 2017-04-01 11:59:43,933 INFO sqlalchemy.engine.base.Engine COMMIT
that is, the sequence is created without a schema, because it does not specify one.
Sequence needs to be associated with the MetaData() in order to get metadata.schema involved, this is documented, or specify schema to Sequence directly. When Sequence is present inside of Table(), the Sequence is not associated with the metadata, this is not implicit.
-
repo owner Please confirm your test does create the sequence, just not in the schema you intended due to the lack of association of Sequence with MetaData (again, not a bug). I can't run your test from here unless you can simplify it.
-
repo owner - changed status to closed
alrighty, as described the behavior is what's expected
-
reporter Sorry for the late reply on this and for reopening as I was travelling.
I did several test runs to check if the issue is due to timing of object creation. It appears to be so. As you have recommended, I strip down the code to bare minimum for a PostgreSQL "best practice" use case: tables in non-public schema, and use of explicit sequences instead of serial.
The mixin approach works when both Table and Sequence are without a schema. This is probably due to both being created on the public schema.
from sqlalchemy import create_engine, MetaData, Column, Sequence from sqlalchemy.dialects.postgresql import INTEGER, TEXT from sqlalchemy.ext.declarative import declarative_base, declared_attr ENGINE_STR = "postgresql+psycopg2://testuser:correcthorse@192.168.159.129:5432/test" engine = create_engine(ENGINE_STR, echo=True) Base = declarative_base() class MixinIntKey(object): @declared_attr def key(cls): return Column( INTEGER, Sequence(f"sqk_{cls.__tablename__}", 1, 1), nullable=False, primary_key=True ) class TableA(Base, MixinIntKey): __tablename__ = "table_a" column_a = Column(TEXT) Base.metadata.create_all(bind=engine) """ 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine CREATE SEQUENCE sqk_table_a INCREMENT BY 1 START WITH 1 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine COMMIT 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine CREATE TABLE table_a ( column_a TEXT, key INTEGER NOT NULL, PRIMARY KEY (key) ) 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:14:11,164 INFO sqlalchemy.engine.base.Engine COMMIT """
Now, when a schema needs to be specified the typical approach (AFAIK) would be to instantiate declarative Base with a MetaData object with the schema name. The Table inheriting from Base would be created in the schema but the Sequence will not be created at all.
SCHEMA = "testschema" ENGINE_STR = "postgresql+psycopg2://testuser:correcthorse@192.168.159.129:5432/test" engine = create_engine(ENGINE_STR, echo=True) Base = declarative_base(metadata=MetaData(schema=SCHEMA)) class MixinIntKey(object): @declared_attr def key(cls): return Column( INTEGER, Sequence(f"sqk_{cls.__tablename__}", 1, 1), nullable=False, primary_key=True ) class TableA(Base, MixinIntKey): __tablename__ = "table_a" column_a = Column(TEXT) Base.metadata.create_all(bind=engine) """ 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine select version() 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine select current_schema() 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1 2017-04-04 19:19:04,462 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine select relname from pg_class c join pg_namespace n on n.oid=c.relnamespace where n.nspname=%(schema)s and relname=%(name)s 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine {'schema': 'testschema', 'name': 'table_a'} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine SELECT relname FROM pg_class c join pg_namespace n on n.oid=c.relnamespace where relkind='S' and n.nspname=current_schema() and relname=%(name)s 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine {'name': 'sqk_table_a'} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine CREATE TABLE testschema.table_a ( column_a TEXT, key INTEGER NOT NULL, PRIMARY KEY (key) ) 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine {} 2017-04-04 19:19:04,478 INFO sqlalchemy.engine.base.Engine COMMIT """
The correct approach to make things work is to explicitly specify medata on the Sequence mixin.
class MixinIntKey(object): @declared_attr def key(cls): return Column( INTEGER, Sequence(f"sqk_{cls.__tablename__}", 1, 1, metadata=cls.metadata), nullable=False, primary_key=True )
I believe the bug case would be the mixin working when no schema is specified: it relied on PostgreSQL's behavior of creating everything in public for it to work. A couple of possible fix:
- make a Sequence mixin work like ForeignKey (feature request?): create it under the same schema of the Column's Table when metadata is not passed. This may be problematic due to sequences in Postgres, unlike foreign keys and indices, are owned by the schema and not the table, and that sequences may be shared across tables in different schema.
- make schema argument mandatory for Postgres by raising an error: this should catch attention and not allow the case of public sequence in public table from working. That use case is just bound to surprise people as they progress into the more magical realm of mixins.
-
reporter - marked as proposal
-
reporter - changed status to open
-
repo owner why can't you just add the Base.metadata to the Sequence as documented?
-
repo owner This may be problematic due to sequences in Postgres, unlike foreign keys and indices, are owned by the schema and not the table, and that sequences may be shared across tables in different schema.
yes, that's why we can't really do that. At best there'd be a flag like, "inherit_schema=True", but then, you're passing a flag, might as well make it "metadata = Base.metadata" ;) this was why "metadata" was added as an argument to Sequence, because someone contributed the whole
MetaData(schema='foo')
feature, which I also was a little uncomfortable with due to side effects like these.make schema argument mandatory for Postgres by raising an error: this should catch attention and not allow the case of public sequence in public table from working. That use case is just bound to surprise people as they progress into the more magical realm of mixins.
use of "schema" is not necessarily a widespread practice, and Postgresql itself (as is the case with all other databases) never require a "schema" argument, there's a default in place. So even if this weren't a huge backwards-incompatible change, it still wouldn't be appropriate.
As you have recommended, I strip down the code to bare minimum for a PostgreSQL "best practice" use case: tables in non-public schema, and use of explicit sequences instead of serial.
I'm not familiar with what's wrong with SERIAL? If your column is a simple integer primary key, using SERIAL is definitely the simplest approach...which is a large reason why this issue is IMO not too critical.
Having to specify "metadata" to the Sequence is a little bit more awkward than I'd prefer, but this is not a common use case and I really can't think of any alternative that is worth it.
-
repo owner not to mention, I'm not familiar with why you need to have tables in a "non-public" schema as some kind of best practice. The "public" schema is already local to a database that itself is a "local" namespace.
-
repo owner also going back to the original use case. We're talking about a declarative mixin. Just do this:
class MixinIntKey(object): @declared_attr def key(cls): return Column( pg.INTEGER, Sequence(f"sqk_{cls.__tablename__}", 1, 1, metadata=cls.metadata), nullable=False, primary_key=True )
the Sequence also will receive the "after_parent_attach" event on behalf of the Column to which it is associated. This can be used to link to the Table association and set the "schema" after the fact (setting the MetaData on an existing Sequence is not public API right now):
@event.listens_for(Sequence, "after_parent_attach") def _assoc_seq_col(seq_element, parent): @event.listens_for(parent, "after_parent_attach") def _assoc_seq_table(col_element, parent): seq_element.schema = parent.schema
-
reporter The schema in Postgres is essentially a namespace. Explicitly defining a schema avoids namespace collisions and makes a large project manageable. Although the Zen of Python agrees to that as best practice, I understand that it's not the responsibility of this project (or anyone for that matter, unless you're an Inquisitor) to enforce "rightness".
Perhaps, the path of least resistance would be to document this use case on the Postgres dialect section? Or is it more appropriate to be in the wiki/usage recipe?
-
repo owner I'm all for adding some tips to the documentation about "Sequence -> metadata". I do think people generally use SERIAL in most cases though.
-
reporter True, SERIAL is the no-nonsense autoincrement "datatype": Postgres implements it as a sequence that's linked to the table (gets dropped with table). However, an explicit stand-alone sequence is a cheap way of generating unique IDs across partitioned tables without the bloat of UUID.
-
reporter - changed status to closed
-
repo owner Issue
#3979was marked as a duplicate of this issue. - Log in to comment
havent run your full test yet as it seems to have lots going on that is not related. A simple test passes:
output: