Patch to correct ambiguous/buggy handling of postgres schemas; also, precompile regular expressions to speed up autoloads

Issue #402 resolved
Former user created an issue

The postgres autoload support breaks down with the following statement:

payments = Table('payments', ff_schema, autoload=True, schema='agents')

If a "payments" table appears both in the agents schema and in another schema earlier in the namespace search path (i.e., if both public.payments and agents.payments exist), the SQL used throws an error ("Too many rows returned by subquery").

Also, even when the table name is unambiguous, primary- and foreign-key introspection was spotty whenever a schema was named in the Table() invocation, because the SQL used to find these constraints didn't properly account for named schemas.
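The schema ambiguity can be avoided by qualifying the catalog lookup explicitly. Below is a minimal, hypothetical sketch (table_oid_query is an illustrative helper, not the patch's actual code) that restricts a pg_catalog lookup to the named schema, so a table name present in several schemas no longer returns multiple rows:

```python
def table_oid_query(table_name, schema=None):
    """Build a pg_catalog query that resolves a table OID unambiguously.

    Without the schema filter, a table name present in several schemas on
    the search path makes the lookup return more than one row.
    """
    query = (
        "SELECT c.oid FROM pg_catalog.pg_class c "
        "JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace "
        "WHERE c.relname = %(name)s"
    )
    params = {"name": table_name}
    if schema is not None:
        # restrict to the named schema instead of the whole search path
        query += " AND n.nspname = %(schema)s"
        params["schema"] = schema
    return query, params
```

When schema is None the query still relies on the search path, which is exactly where the original ambiguity came from.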

In this patch I also took the time to clean up some loop code by precompiling regular expressions. When autoloading 176 tables, this significantly improved startup time.
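The precompilation change can be sketched like this (NOT_NULL_RE, DEFAULT_RE, and parse_column_ddl are illustrative names, not the patch's actual identifiers): the patterns are compiled once at import time instead of inside the per-column loop:

```python
import re

# compiled once at module import, rather than on every loop iteration
NOT_NULL_RE = re.compile(r"NOT NULL", re.IGNORECASE)
DEFAULT_RE = re.compile(r"DEFAULT\s+(\S+)", re.IGNORECASE)

def parse_column_ddl(ddl):
    """Pull nullability and the default value out of a column DDL fragment."""
    nullable = NOT_NULL_RE.search(ddl) is None
    m = DEFAULT_RE.search(ddl)
    default = m.group(1) if m else None
    return nullable, default
```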

Comments (6)

  1. Former user Account Deleted

    Also, this patch fixed an error raised when a DOUBLE PRECISION column was encountered: types.Float expects a single argument, but the code was passing in two.
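The mismatch can be illustrated with a stand-in class (this Float is a hypothetical placeholder, not the real sqlalchemy.types.Float): a constructor that takes one positional argument raises a TypeError when handed two:

```python
class Float:
    """Hypothetical stand-in for types.Float, which per the comment above
    expects a single precision argument (the real class is not imported)."""
    def __init__(self, precision=10):
        self.precision = precision

# the buggy reflection code passed two arguments, which fails at call time
try:
    Float(53, 0)
    raised = False
except TypeError:
    raised = True

# the fixed code passes only the precision
col_type = Float(53)
```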

  2. Mike Bayer repo owner

    this is a great patch. cleaning up the row fetches and improving the queries for schema support (something that's missing from the current unit test suite) is great, and all the current reflection unit tests still pass, so it's good to go. the patch is committed in changeset:2172

    However, I did move the regular expressions back into inline re.search() and re.split() calls; the code is clearer and easier to read when the regexes are defined where they are used, and precompiling regular expressions is nearly always an unnecessary optimization with negligible performance effect. the re module caches the compiled version of an expression in a dictionary the first time it is used, so an inline call only adds the overhead of a single native dictionary lookup (and a little method-call overhead).
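The caching behavior is easy to confirm: an inline re.search() and a precompiled pattern return identical results, since after the first call the inline form is served from re's internal pattern cache (the pattern and function names below are illustrative):

```python
import re

PATTERN = r"(\w+)\s+(\w+)"
PRECOMPILED = re.compile(PATTERN)

def parse_inline(text):
    # re.search() compiles the pattern on first use, then hits re's cache
    return re.search(PATTERN, text).groups()

def parse_precompiled(text):
    return PRECOMPILED.search(text).groups()
```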

    here's how long it takes to run the SA suite of reflection tests with postgres, with two runs for each version of the code:

    running:  python test/engine/reflection.py --db postgres
    
    SA rev 2171
    9.594s    9.533s
    
    using patch as posted to ticket #402:
    9.554s  9.751s
    
    SA rev 2172 with patch/inlined regexes:
    9.708s  9.514s
    

    so, no significant difference on my machine. also attached is a test program which calls search() on each regex 10000 times, far more than would ever be needed for 176 tables even if every table had 50 columns. on my machine, precompiling produces about a 32/100ths-of-a-second improvement (1.55 seconds vs. 1.87 seconds), not a very big number considering a reflection of 176 tables probably takes well over a minute.
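The attached test program is not reproduced in the ticket; a rough reconstruction under the same assumptions (10000 search() calls per style, with an illustrative pattern) might look like:

```python
import re
import timeit

PATTERN = r"(\w+)\s+REFERENCES\s+(\w+)"
SAMPLE = "FOREIGN KEY agent_id REFERENCES agents"
COMPILED = re.compile(PATTERN)

# time 10000 inline searches against 10000 precompiled searches
inline_time = timeit.timeit(lambda: re.search(PATTERN, SAMPLE), number=10000)
compiled_time = timeit.timeit(lambda: COMPILED.search(SAMPLE), number=10000)
print(f"inline: {inline_time:.3f}s  precompiled: {compiled_time:.3f}s")
```

Absolute numbers will vary by machine; the point of the original test was only that the gap is small relative to the cost of the reflection queries themselves.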
