When filtering for
table.c.column == foo_string, if
foo_string contains a NUL (
\x00) character, the string comparison stops just before the first NULL byte. That is,
.where(table.c.column == 'foo') and
.where(table.c.column == 'foo\x00bar') return identical result sets. This happens with Postgres and the postgresql dialect.
This is surprising behavior because it means that the returned rows do not satisfy the invariant I would expect, that
row.column == foo_string. It also has the potential to cause security vulnerabilities if the programmer assumes that
foo_string has passed some kind of validation because it matched a database row.
Attached is a MCVE, and here is its output on my system:
% createdb nullbytes; python sqla_null.py createdb: database creation failed: ERROR: database "nullbytes" already exists Python: 3.5.1 (default, Dec 7 2015, 21:59:08) [GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.1.76)] SQLAlchemy: 1.0.12 Postgres: PostgreSQL 9.4.4 on x86_64-apple-darwin14.3.0, compiled by Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn), 64-bit 'firstname.lastname@example.org' == 'email@example.com\x00 huh??'
I would expect the output not to include the last line (because the result set of the query should be empty).