consider removing name.lower() logic with result column targeting

Issue #2423 resolved
Mike Bayer repo owner created an issue

Attached patch adds a flag "case_sensitive", off by default. As expected, when the flag is turned off, it adds approximately 7% method call overhead to selected performance tests, as it doubles the calls to "lower()" within result column rendering and processing. With the flag on, tests go back to normal.

If we were to drop the "lower()" thing entirely, then we'd speed things up overall.

Would need to decide among three options:

  1. remove the lower() thing totally
  2. add a flag, and change default behavior in 0.8
  3. add a flag, and reverse default behavior in 0.8

Option #3 would appear to be the middle ground here that will maintain performance going forward.

Comments (11)

  1. Mike Bayer reporter

    also note that any DBAPI that is converting case within cursor.description will break with this behavior turned off, and we should try pretty hard to see if any DBAPIs are doing this.

  2. Mike Bayer reporter

    fails on Oracle, for one.

    Here's some result keys:

    RESULT KEYS [u'query_users_user_name'](u'query_users_user_id',)
    METADATA KEYS ['QUERY_USERS_USER_NAME']('QUERY_USERS_USER_ID',)
    

    so this is due to "case insensitive" names, perhaps if they were quoted then we'd get back the actual case. Just makes the logic of lowercasing that much more complicated.

  3. Mike Bayer reporter

    OK that one isn't a big deal, we just need to move up the "name normalize" step, updated patch attached.

  4. Mike Bayer reporter

    It's creating semi-silent failures with Pyodbc + MSSQL. This is more serious as we're getting back from information schema a metadata like this:

    ['TABLE_NAME', 'COLUMN_NAME', 'IS_NULLABLE', 'DATA_TYPE', 'ORDINAL_POSITION', 'CHARACTER_MAXIMUM_LENGTH', 'NUMERIC_PRECISION', 'NUMERIC_SCALE', 'COLUMN_DEFAULT', 'COLLATION_NAME']('TABLE_SCHEMA',)
    

    although, that is how we named them in information_schema.py, I recall there was some reason for that.

    For some reason the result_map has partially lower case names during checks for tables, so these aren't matching up.

  5. Mike Bayer reporter

    another easy one, mssql compiler reimplements visit_column so there's a lower() in there. another patch attached.

  6. Mike Bayer reporter

    let's look into using conditionals and ternaries against the flag rather than a function. the inlining should be worth it.

  7. Log in to comment