ORM mapping with jython has incorrect values.

Issue #1547 resolved

Former user created an issue 2009-09-20

I setup a test database and table and inserted data using sqlalchemy. When I query data back the fields are not in the right place. I apologize if I am doing something completely wrong here but I was unable to find documentation about a proper way to do this.

OS X 10.5[BR] Jython 2.5[BR] mysql-connector-java-5.1.8-bin.jar[BR] sqlalchemy 0.6[BR]

Comments (24)

Former user Account Deleted
- attached jsql-test.py
Example of bad data.
- 2009-09-20T19:31:05+00:00

Former user Account Deleted

My debug output shows:

INFO:sqlalchemy.engine.base.Engine.0x...0x8:SELECT userstest.id AS userstest_id, userstest.name AS userstest_name, userstest.fullname AS userstest_fullname, userstest.password AS userstest_password 
FROM userstest
INFO:sqlalchemy.engine.base.Engine.0x...0x8:[]
DEBUG:sqlalchemy.engine.base.Engine.0x...0x8:Col (u'userstest_id', u'userstest_name', u'userstest_fullname', u'userstest_password')
DEBUG:sqlalchemy.engine.base.Engine.0x...0x8:Row (1, u'ed', u'Ed Jones', u'edspassword')
DEBUG:sqlalchemy.engine.base.Engine.0x...0x8:Row (2, u'ed', u'Ed Jones', u'edspassword')

edspassword Ed Jones

edspassword Ed Jones <- this should actually be Usertest.name / Usertest.id

I ran the same script with cpython and it works fine.

2009-09-20T19:50:54+00:00

Philip Jenvey
- marked as critical
- assigned issue to
- changed milestone to 0.6.0
- 2009-09-20T20:00:08+00:00
Philip Jenvey
- marked as blocker
This is due to the hash workaround for ~~#1475~~. In this case ClauseElement's simple hash based on Jython's id() (which just generates ids in a sequence) ends up colliding in ResultProxy._props with integer column indexes that also go in there

If I added 1000 or so to ClauseElement's hash it'd avoid this, but I think we're going to have potential collisions somewhere else down the line. We may already with some with the intermittent failures on the build bot

We just need to use a real identity map on Java. If the dicts that need this aren't used by too many threads we could use a synchronized IdentityHashMap, otherwise we could use a normal dict with a key wrapper
- 2009-09-21T12:54:39+00:00
Mike Bayer repo owner
did we ever actually try putting a __nonzero__ on BinaryClause and going through all the effort to fix the ORM/SQL to never do "if clause" ? the __nonzero__ would have to be tailored to the two endpoints of the expression being the same element.
- 2009-09-21T13:03:47+00:00
Mike Bayer repo owner
i mean, the same element, as well as if == or != or other expressions were used to generate the BinaryClause.
- 2009-09-21T13:04:29+00:00
Mike Bayer repo owner
I suggest this because ResultProxy._props couldn't even be an identityhashmap, since we're putting integer indexes and string column names into it. or we'd have to reorganize ResultProxy not to mix these types.
- 2009-09-21T13:07:30+00:00
Mike Bayer repo owner
a patch is attached which starts to address this. I got test.sql.test_select to pass fully and part of test.orm.test_query by putting a pdb in the __nonzero__() method to see where it's getting called.

Surprisingly, it is getting called in the ResultProxy dictionry lookup even on cython. I'm not sure why this is, since that would imply lots of breakages. I'd have to research further to understand that.
- 2009-09-21T13:22:15+00:00
Mike Bayer repo owner
this patch is closer. It also suggests (though I need to think about it) that we should not call bool(element) on any ClauseElement anywhere and raising within the base __nonzero__() would also export the idea into userland when they use 0.6 that they need to use "clause is None" instead of "bool(clause)"
- 2009-09-21T13:55:00+00:00
Philip Jenvey
Yea I'd expect it to be called in dicts, equality comparisons (a == b) is always a.eq(b).nonzero() (if that's what you're asking). I'm definitely down with this solution if you think it's workable
- 2009-09-21T15:36:09+00:00
Philip Jenvey
The fix we discussed yesterday works with something ugly like this (the clause can be Annotated):

http://pylonshq.com/pasties/d8e2a5f4ca3405d0473b508b9f0540d6
- 2009-09-22T12:35:23+00:00
Mike Bayer repo owner
OK I think instead ClauseElement should have some kind of _equals() method so that Annotated_ can do the right thing.
- 2009-09-22T12:52:02+00:00
Philip Jenvey
That's simple enough, might an Annotated ClauseElement ever have _equals() called on it against another Annotated ClauseElement? (it never happens during the test suite)
- 2009-09-22T14:20:27+00:00
Mike Bayer repo owner
Replying to pjenvey:

That's simple enough, might an Annotated ClauseElement ever have _equals() called on it against another Annotated ClauseElement? (it never happens during the test suite)

strange that it doesn't. But yeah of course it can be called between two of those. I think if you were to say bool(MyClass.foo == OtherClass.bar) would do it (i.e. declarative class attributes).
- 2009-09-22T16:22:16+00:00
Philip Jenvey
r6375 seems to pass the test suite and fix this issue, but it just makes the collisions less likely -- I'm still worried about potential collisions at some point since the comparison is ultimately based on hash codes
- 2009-09-23T13:35:08+00:00
Mike Bayer repo owner
collisions between two ClauseElements is impossible, since __hash__() returns their unique id().

Hash collisions between ClauseElement and an integer or String are possible, but then the hashing algorithm calls __eq__(), which would create a BinaryExpression, which has a left and right value that are always ClauseElement objects - the int/string is coerced - and then __nonzero__() on that element would return False. so I think we're covered.
- 2009-09-23T14:06:55+00:00
Philip Jenvey
ok so those are always ClauseElements, I'm sold then

I also liked the idea of getting rid of the jython hash workaround since you pointed out that can be a perf hit when there's many ClauseElements. We can always change that later if there's a better way, though. I'll merge this to trunk, thanks!
- 2009-09-23T15:06:26+00:00
Mike Bayer repo owner
well we still need __hash__() to return id() from ClauseElement at least because they internally rely upon comparing that number (plus the Annotated_ thing). i looked at some other ways to achieve that but hash() seems simplest.
- 2009-09-23T15:32:17+00:00
Philip Jenvey
- changed status to resolved
fixed in 5a9c1b8824bb84aaf8baccdfa2780a94af5c0f44, thanks for the original test case
- 2009-09-23T22:16:26+00:00
Former user Account Deleted
(original author: ged) I haven't followed the issue very closely but 5a9c1b8824bb84aaf8baccdfa2780a94af5c0f44 breaks a lot of code (at least in Elixir) for not a good reason (AFAICT). What does the __hash__ issue have to do with the __nonzero__ method? Couldn't the __nonzero__ method just always return True?

I have a lot of "if MyClass.table:" and "if prop:" in my code and even if I could change it all to "is None", I don't understand why you went through all that trouble of a hash problem which I don't see how it is even affected by the __nonzero__ change.
- 2009-09-24T10:35:53+00:00
Mike Bayer repo owner
Replying to ged:

I haven't followed the issue very closely but 5a9c1b8824bb84aaf8baccdfa2780a94af5c0f44 breaks a lot of code (at least in Elixir) for not a good reason (AFAICT).

it broke a lot of code in SQLAlchemy and it took me about an hour to fix all the errors. In the course of which it uncovered half a dozen tests that were invalid since they were assuming clause1 == clause2 would return True if the two objects were the same. I think the effort to no longer consider a clause element as having a boolean value is well worth it.

What does thehashissue have to do with thenonzeromethod? Couldn't thenonzero` method just always return True?

a set or dict in python hashes things based on the return value of hash(obj). when two objects return the same hash value, the __eq__() method is then used to determine if they are the same object or not. In SQLAlchemy, __eq__() returns a BinaryExpression object. therefore __nonzero__() must return True or False depending on how the BinaryExpression was created - otherwise ClauseElements cannot be safely hashed. We had one such issue in cpython when enormous lists of bind parameters were used and we did some workarounds for that one. But in Jython and other systems this kind of situation is much more prevalent. Just try out Elixir + SQLA 0.6 before this change + the test case attached to this ticket adapted for Elixir to see for yourself.
- 2009-09-24T10:49:12+00:00
Mike Bayer repo owner
its possible, but then you have this case where any code that says "if clause:" will randomly break depending on the type of clause. since there is one very common subclass of "clause" forwhich saying "if clause:" is not valid, its a much better idea to set that up for the whole hierarchy with a loud failure. By setting it on ClauseElement I found a lot more places where I was making this assumption.
- 2009-09-24T12:46:00+00:00
Former user Account Deleted
(original author: ged) As always, thanks for your legendary patience and useful explanations. Crystal clear now. Crazy but makes sense.
- 2009-09-24T15:17:51+00:00
Mike Bayer repo owner
- removed milestone
Removing milestone: 0.6.0 (automated comment)
- 2014-02-23T18:14:32+00:00
Log in to comment

Assignee: Philip Jenvey

Type: bug

Priority: blocker

Status: resolved

Component: mysql

Milestone: –

Votes: 0

Watchers: 0