marking subclass of declared_attr with cascading ignores subclass
Before a class is mapped, I want to check if it defines a primary key. If the column is defined in a declared_attr
, I can't know if it's primary without evaluating the attr, which SQLAlchemy correctly warns about, since the model is not mapped yet.
To solve this, I thought I'd require marking declared attrs that define a primary key with a declared_pkey
subclass. However, this breaks when using declarded_pkey.cascading
, since internally _stateful_declared_attr
creates a declared_attr
instead of knowing about the subclass.
I realize that cascading
is not necessary in this contrived example, but it still demonstrates the issue. Running the following code should print 'pkey named id' but does not.
class declared_pkey(declared_attr):
primary_key = True
class IDMixin:
@declared_pkey.cascading
def id(cls):
return Column(Integer, primary_key=True)
class ExampleMeta(DeclarativeMeta):
def __init__(cls, name, bases, d):
for base in cls.__mro__:
for key, value in base.__dict__.items():
if (
isinstance(value, (declared_attr, sa.Column, InstrumentedAttribute))
and getattr(value, 'primary_key', False)
):
print(f'pkey named {key}')
super(ExampleMeta, cls).__init__(name, bases, d)
Base = declarative_base(metaclass=ExampleMeta)
class User(IDMixin, Base):
__tablename__ = 'user'
For now my solution has been to document that this doesn't work and tell people to do id.primary_key = True
manually, but it would be nice if subclasses were preserved.
class IDMixin:
@declared_attr.cascading
def id(cls):
return Column(Integer, primary_key=True)
id.primary_key = True
To solve this, _stateful_declared_attr
could be modifed to take the class as well.
class _stateful_declared_attr(declared_attr):
def __init__(self, cls, **kw):
self.cls = cls
self.kw = kw
def _stateful(self, **kw):
new_kw = self.kw.copy()
new_kw.update(kw)
return _stateful_declared_attr(self.cls, **new_kw)
def __call__(self, fn):
return self.cls(fn, **self.kw)
class declared_attr(...):
...
def _stateful(cls, **kw):
return _stateful_declared_attr(cls, **kw)
...
Comments (31)
-
reporter -
repo owner Subclassing of declared_attr is not supported and I don't see how it's necessary? I don't understand what:
class declared_pkey(declared_attr): primary_key = True
is intended to achieve.
Adding arbitrary attributes to the InstrumentedAttribute is also not supported (use .info), and I don't see how a "MyClass.foo.primary_key" is useful considering that all the other columns / relationships / whatever will not have a "primary_key" attirbute, how would you check that?
-
repo owner Before a class is mapped, I want to check if it defines a primary key. If the column is defined in a declared_attr, I can't know if it's primary without evaluating the attr
why would this class use the "IDMixin" if it already defines the thing that IDMixin provides? Why not a simple "defines_primary_key=True" class attribute ?
-
repo owner I see you have the getattr() in that metaclass and all that. basically we need to step back and talk about what you actually need to achieve b.c. custom metaclasses and subclassing things that dont want to support subclassing is not how to do it.
-
reporter This is in support of Flask-SQLAlchemy, which generates tablenames automatically, but currently has some bugs / deficiencies. I understand you're not the biggest fan of that feature, but it is something I want to continue to support. Part of that support is attempting to handle as many of SQLAlchemy's options for defining models as possible.
Part of the process is detecting if the model defines a primary key, or if one of its MRO defines a primary key as a declared attr. However, accessing a
declared_attr
on a mixin (unmapped) class causes SQLAlchemy to emit a warning. The solution to that was to look at the value in the__dict__
, which doesn't execute the property and so doesn't emit the warning. However, now we can't tell if the property defines the primary key.This subclass was the most obvious way to support this case to me. Now the property has a
primary_key
attribute like regular columns, and so the process works correctly. However, if the user then decides to use thecascading
option, it breaks the subclass.It doesn't seem like there's a reason that subclasses aren't supported here, and the change seems minimal, which is why I decided to report it. As I pointed out, I can also document that the attribute should be set manually, I just figured a subclass used in the same way as
declared_attr
would be easier to understand for most users. -
reporter I would like to be able to detect that the following three examples all define a primary key. The first two are easy. The last one is what I'm trying to solve. This has to happen before the model is mapped because the tablename must be set before the model is mapped.
class DirectModel(Base): # isinstance(value, InstrumentedAttribute); value.primary_key == True id = Column(Integer, primary_key=True) class DirectMixin: # isinstance(value, Column); value.primary_key == True id = Column(Integer, primary_key=True) class DeclMixin: # isinstance(value, declared_attr); value.primary_key == unknown without access, access emits warning @declared_attr def id(cls): return Column(Integer, primary_key=True)
-
repo owner This is in support of Flask-SQLAlchemy, which generates tablenames automatically, but currently has some bugs / deficiencies. I understand you're not the biggest fan of that feature, but it is something I want to continue to support. Part of that support is attempting to handle as many of SQLAlchemy's options for defining models as possible.
I'm totally cool with that! I've blogged about auto-tablename patterns and openstack uses that kind of thing like crazy. That's the whole point of
__tablename__
and all that.Part of the process is detecting if the model defines a primary key, or if one of its MRO defines a primary key as a declared attr. However, accessing a declared_attr on a mixin (unmapped) class causes SQLAlchemy to emit a warning. The solution to that was to look at the value in the dict, which doesn't execute the property and so doesn't emit the warning. However, now we can't tell if the property defines the primary key.
why must the class be unmapped ?
-
repo owner also, the warning is to prevent people from making mistakes. in this case, if you want to inspect a mixin, you are not making a mistake. So the real bug is, the warning not having any way of skipping it (besides filtering it, which nobody ever seems to want to do).
-
repo owner here's the change where that warning was added, does not seem to be in response to any specific issue: 7f82c55fa764b031110309fb3a819e4b518e741d
-
repo owner OK so i think what we are warning here on is that when you call upon the @declared_attr without the mapping, you are creating a brand new object and then throwing it away. That is, if you said "MyMixin.some_attr.foo = 'bar'", that "foo" is now gone. Also if there are side effects from the method, like creating a table or constraint that modifies the metadata, this is a problem also.
would need to see the way you are using this but a "safe" means might be a sub-method ".invoke".
-
reporter I'm totally cool with that!
Ha, for some reason I thought you didn't like it, not sure where I got that. :-)
why must the class be unmapped
A mixin class is just a plain class, it will never be mapped. We traverse the MRO, so we'd be accessing the attribute on the mixin directly.
I think the warning is correct, we should not be accessing the value outside a real, mapped model. I can think of plenty of contrived examples that won't work if we access them. If I can come up with contrived examples, I don't want to accidentally break a real use case that I couldn't think of. For example:
class Mixin: @declared_attr def id(cls): return Column(Integer, primary_key=cls.__name__.lower()[0] == 'a') class Aardvark(Mixin, Base): pass # If Mixin.id is accessed, we would determine that the column is not a primary key # but it will end up on Aardvark and so *should* be a primary key
Again, this is contrived, but I bet there are real cases where this would cause problems.
-
repo owner We traverse the MRO,
yeah when is that? you're in a metaclass?
-
repo owner this is the thing. if you are making a decision about whether or not there is a primary key, that means you have the table to be mapped. There are events where you can make decisions about the table before the mapper() is applied - such as, "if not table.primary_key: table.append_column(some_primary_key)".
Since this is an extension aiming to be very generic you might want to break into using events for things. they are more flexible than metaclasses.
-
repo owner here's what that looks like:
from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base, declared_attr from sqlalchemy import event Base = declarative_base() @event.listens_for(Base, "instrument_class", propagate=True) def _ensure_pk(mapper, class_): table = mapper.local_table if table is not None and not table.primary_key: table.append_column(Column('id', Integer, primary_key=True)) class DeclMixin: @declared_attr def id(cls): return Column(Integer, primary_key=True) class NoPk(Base): __tablename__ = 'foobar' class HasPk(DeclMixin, Base): __tablename__ = 'batho' class SingleInhSubcl(HasPk): pass print NoPk.id == 5 print HasPk.id == 10 print SingleInhSubcl.id == 20
that way you don't have to tell your users to use a special decorator
-
reporter I've pushed an initial patch to fix tablename generation in Flask-SQLAlchemy. I'm pretty sure I tried moving the tablename detection to a
declared_attr
or events before and couldn't get it to work in all cases, but I don't remember clearly, it was months ago that I worked on that. If you can offer any insight in the PR, that would be much appreciated. https://github.com/mitsuhiko/flask-sqlalchemy/pull/541 -
repo owner that approach looks very complicated. did you consider setting up the primary key contract as a simple mediation between the Table and the mapper() as in my example above? Declarative is all about bolting Python class construction patterns into a system that creates mappings - but trying to ensure a mapping has certain attributes set up, that seems like it can happen at a more fundamental level. Obviously for buliding up the name of the table itself, the usual table / tablename hooks need to be employed since you can't create Table without a name (but without a PK column, no problem).
-
reporter What are "the usual tablename hooks"? A declared attr for tablename doesn't propagate. If an intermediate class sets a tablename directly, subclasses of it need to re-inherit the declared attr or it's not called.
class NameMixin: @declared_attr # doesn't matter if .cascading is used here def __tablename__(cls): return cls.__name__.lower() class User(NameMixin, Base): __tablename__ = 'USER' # interrupts mixin tablename, no longer inherited id = Column(Integer, primary_key=True) class Employee(User): id = Column(ForeignKey(User.id), primary_key=True) # tablename is not set by attr, even though that's the most likely desired outcome in this case
instrument_class
is called too late, if the tablename is not already set, SQLAlchemy errors before the event is triggered.before_
andafter_configured
do not receive information about the class being configured.mapper_configured
is not called consistently (and is also called too late), for example I calledmetadata.create_all()
andmapper_configured
wasn't triggered. -
repo owner OK I am not doing a good job with communication today, elsewhere as well. The
__table__()
__tablename__()
are "hooks" because they are methods one can implement and they are called when declarative wants to know about the table it is creating. But in your case, you are making use of__table__
and__tablename__
"hooks" in your approach at https://github.com/mitsuhiko/flask-sqlalchemy/pull/541/files#diff-9697d7f9c915064678df24a3479e4972R579 . I tried to state that the approach of using the "instrument_class" event would not be appropriate for this part of what you are trying to do. However the "create a primary key" part of it can.Here's a demo of automatic tablename and pk, is this simpler ?
from sqlalchemy import * from sqlalchemy.orm import * from sqlalchemy.ext.declarative import declarative_base, declared_attr from sqlalchemy import event import re def camel_to_snake_case(name): return ( name[0].lower() + re.sub(r'([A-Z])', lambda m: "_" + m.group(0).lower(), name[1:]) ) class SetsTablename(object): @declared_attr def __tablename__(cls): return camel_to_snake_case(cls.__name__) Base = declarative_base(cls=SetsTablename) @event.listens_for(Base, "instrument_class", propagate=True) def _ensure_pk(mapper, class_): table = mapper.local_table if table is not None and not table.primary_key: table.append_column(Column('id', Integer, primary_key=True)) class DeclMixin: @declared_attr def id(cls): return Column(Integer, primary_key=True) class NoPkNoTable(Base): pass class NoPkHasTable(Base): __tablename__ = 'foobar' class Abstract(Base): __abstract__ = True class HasPkNoTable(DeclMixin, Base): pass class HasPkHasTable(DeclMixin, Base): __tablename__ = 'batbar' class SingleInhSubcl(HasPkNoTable): __tablename__ = None for cls in NoPkHasTable, NoPkNoTable, HasPkNoTable, HasPkHasTable, SingleInhSubcl: print cls.__table__.name print cls.__table__.primary_key
-
reporter Thanks for continuing to work through this with me. Unfortunately, that example fails to work with the example I showed above; a manually set tablename stops the declared tablename from continuing on to later subclasses.
What I've found over time with improving the method we use now is that there's always another corner case, which keeps making the declared attr tablename less viable, at least given the attempts I've made in the past. Let me try to explain the "rules" that we've been using for tablename generation.
- I want the user to specify (or not) the primary key.
- If they set a primary key, that means they either have a basic table or want joined table inheritance, and need a name set.
- If they don't set a primary key, that means they want single table inheritance, so a name should not be set.
- If the user specifies the tablename themselves, this should take priority over generation.
- If they've used a declared attr, that should be used going forward for the subclass branch.
- Otherwise, generation should resume in subclasses. This doesn't work with
declared_attr
; once a class sets it manually it doesn't carry to subclasses unless the attr is mixed in again.
- Abstract models, mixins, and declared (cascading) attrs should be usable for both the tablename and primary key, in any combination and hierarchy.
This set of rules is something that I've accumulated over time as the behavior that requires the least manual intervention. Unless the user decides to use single-table instead of joined inheritance, they hopefully never need to worry about setting the tablename manually.
The metaclass, while perhaps not the "natural" SQLAlchemy way to do things, works, and while I'd love to figure out a better way, I'm ok supporting it as-is. The only issue that comes up is that we can't see into declared attrs safely using this method (at least as far as I can tell). Hence the initial bug report as I tried to find a solution to that.
- I want the user to specify (or not) the primary key.
-
repo owner So another thing I was trying to indicate in my earlier comment is that your current metaclass approach is doing the tablename thing that you want already, so there is no reason to change it, either. Using the approach you already have does not preclude handling this new primary key case using an event listener. you can use the metaclass to decide tablename and an event listener to find the primary key column if any.
I am still optimistic that straight usage of the tablename method can achieve all your use cases but this is a separate concern from the primary key issue.
as far as the bug here, I'd want to know if using an event listener as I've illustrated, just to handle assignment of primary key but not tablename, is sufficient to solve for the use case first linked to the bug. It's true thi is more of a mailing list issue at this point but we're here now and thankfully you are very responsive so it's fine.
-
repo owner oh. here's the problem:
"If they don't set a primary key, that means they want single table inheritance, so a name should not be set."
I remember this issue, and this is unfortunately the source of all your problems. If it were me, I'd simply not have it work this way. Single inheritance can be indicated by any other means, including a flag like "single_inheritance = True", or "tablename = None", etc. The original proposal here included that you were trying to add a new @declared_attr "declared_pkey" in any case, so already this would be new API for Flask-SQLAlchemy. It's a lot simpler IMO to just have a simple flag or mixin on the class.
-
repo owner But , this case as written still may be doable without the mixin. not sure yet. need to play with it.
-
reporter Assigning primary keys is not the goal of this bug, that sneaked in as a side effect of one of the examples I gave, sorry for the confusion.
The goal is to give the user a way to indicate that a cascading declared attr will return a primary key, without needing to evaluate the attr. Alternatively, if I can have a way to evaluate the attr on the class being mapped (
value.__get__(cls)
) without raising a warning, that may be better. That goes back to a comment you made earlier:OK so i think what we are warning here on is that when you call upon the @declared_attr without the mapping, you are creating a brand new object and then throwing it away. That is, if you said "MyMixin.some_attr.foo = 'bar'", that "foo" is now gone. Also if there are side effects from the method, like creating a table or constraint that modifies the metadata, this is a problem also.
would need to see the way you are using this but a "safe" means might be a sub-method ".invoke".
I would want to be sure that doing this wouldn't cause unintended side effects.
-
reporter Hmm, yes, I may be able to work this a different way, by always assuming joined inheritance and generating a tablename, even if no primary key is set, and requiring the user to set
__tablename__ = None
for single table inheritance. -
repo owner well, if im understanding it this time around, you want to create a Table only if a primary key was somehow set up by the mapping, through whatever means, and if there is no primary key, then use single table inheritance. your earlier examples seemed to indicate you wanted to do the standard "create a pk if one is not present" thing but IIUC that is the opposite of what you want to do here.
This is conditional creation of
__table__
and sort of cooincidentally there is a hook already that does this. It is completely undocumented and not part of public API, only used by an internal test fixture. But, it makes what you have described, if im not missing anything, pretty simple. How about this:from sqlalchemy import Integer, Column, Table, ForeignKey from sqlalchemy.orm import clear_mappers from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.ext.declarative import has_inherited_table from sqlalchemy.ext.declarative import declared_attr import re import unittest def camel_to_snake_case(name): return ( name[0].lower() + re.sub(r'([A-Z])', lambda m: "_" + m.group(0).lower(), name[1:]) ) class SetsTablename(object): @declared_attr def __tablename__(cls): return camel_to_snake_case(cls.__name__) @classmethod def __table_cls__(cls, *arg, **kw): for col in arg: if isinstance(col, Column) and col.primary_key: return Table(*arg, **kw) else: return None def auto_base(): return declarative_base(cls=SetsTablename) # that's it. # ########################################################################33 # test suite. Can you add new tests here that illustrate cases that # still aren't working? class GenTableTest(unittest.TestCase): def setUp(self): self.Base = auto_base() def tearDown(self): self.Base = None clear_mappers() def test_spec_pk_direct_noname(self): """If they set a primary key, that means they either have a basic table or want joined table inheritance, and need a name set.""" class MyClass(self.Base): id = Column(Integer, primary_key=True) self.assertEqual("my_class", MyClass.__table__.name) def test_spec_pk_abstract_noname(self): """If they set a primary key, that means they either have a basic table or want joined table inheritance, and need a name set.""" class MyAbstract(self.Base): __abstract__ = True @declared_attr def id(cls): return Column(Integer, primary_key=True) class MyClass(MyAbstract): pass self.assertEqual("my_class", MyClass.__table__.name) def test_spec_pk_mixin_noname(self): """If they set a primary key, that means they either have a basic table or want joined table inheritance, and need a name set.""" class MyMixin(object): @declared_attr def id(cls): return Column(Integer, primary_key=True) class MyClass(MyMixin, self.Base): pass self.assertEqual("my_class", MyClass.__table__.name) def test_spec_pk_cascading_noname(self): """If they set a primary key, that means they either have a basic table or want joined table inheritance, and need a name set.""" class MyMixin(object): @declared_attr.cascading def id(cls): if has_inherited_table(cls): return Column(ForeignKey('my_class.id'), primary_key=True) else: return Column(Integer, primary_key=True) class MyClass(MyMixin, self.Base): pass class MySubClass(MyClass): pass self.assertEqual("my_class", MyClass.__table__.name) self.assertEqual("my_sub_class", MySubClass.__table__.name) if __name__ == '__main__': unittest.main()
-
repo owner here's the test for single, passes:
def test_spec_pk_mixin_single_inh(self): class MyMixin(object): @declared_attr def id(cls): return Column(Integer, primary_key=True) class MyClass(MyMixin, self.Base): pass class MySubClass(MyClass): pass self.assertEqual("my_class", MyClass.__table__.name) self.assertIsNone(MySubClass.__table__) self.assertTrue(MyClass.__table__ is MySubClass.__mapper__.local_table)
is that it?
-
reporter Wow, this looks really promising. Testing now.
-
reporter I pushed a solution that simplifies how the metaclass decides to generate the tablename, then uses
__table_cls__
to create the table or clear the name. Seems to cover all test cases and open issues about it, I'm just waiting on some confirmation from a couple users.I just want to confirm that
__table_cls__
won't go anywhere even though it's undocumented. Is it safe to use that in Flask-SQLAlchemy?At this point the original issue isn't relevant, so I'll close it. Thanks for your help working through this!
-
reporter - changed status to resolved
-
repo owner a test case should be added on the SQLA side to maintain the behavioral contract of this attribute, and there should be a bit of a docstring somewhere. it's currently used as part of test cases within fixtures.
-
repo owner - changed status to open
this goes back to 1.0
https://gerrit.sqlalchemy.org/#/q/I87be5bcb72205cab89871fa586663bf147450995
- Log in to comment
Additionally, it would be nice if
declared_attr.__init__
was modified to accept any keyword arguments, so that we can add other instance attributes besidescascading
, although it's not necessary for this issue.