Index hinting in selects and joins

Issue #921 resolved
jek created an issue

Some dbs allow index hints on queries, typically tied to an (or each) table in a FROM clause. Some partial examples:

mysql:: ... FROM blah FORCE {INDEX|KEY} [{JOIN|ORDER BY|GROUP BY}]({FOR) (index_list) sybase:: ... FROM blah [ WITH ( table-hint | INDEX ( index-name ) ) | FORCE INDEX ( index-name ) ] oracle:: SELECT /*+ index(blah ix_blah) */ ...

sqlite seems to have some join order hinting via using CROSS JOIN instead of INNER JOIN.

Comments (11)

  1. Mike Bayer repo owner

    heres two assumptions I make about index hints:

    • they are oriented towards a very specific SQL statement which has proven, through analysis, to require the hint (i.e. you would not be issuing "hints" generically across statements)
    • because of the above rule, they are also oriented towards a specific database, since every DB has different index behavior (i.e., you would not be issuing "hints" generically across databases)

    So an easy to implement and mostly open ended API looks like:

    select(...).with_hint(blah_table, "FORCE {INDEX|KEY} [{JOIN|ORDER BY|GROUP BY}]({FOR) (index_list)")
    
    select(...).with_hint(blah_table, "WITH ( table-hint | INDEX ( index-name ) ) | FORCE INDEX ( index-name )")
    
    select(...).with_hint(blah_table, "+ index(blah ix_blah)")
    

    so the compiler is only taking care of inserting the hint at the appropriate location in the statement.

    a more locked down API would be along the lines of:

    select(...).with_hint(blah_table, ["ix_foo", ...]("ix_blah",))
    

    but that doesn't account for the various syntaxes I see here. like whats the difference in sybase between "WITH INDEX foo" versus "WITH FORCE INDEX foo" ?

  2. jek reporter

    a strict syntax probably won't fly, the implementations are just too varied. i like the with_hint(t, s). one other note: blah_table can be a table or alias.

  3. Former user Account Deleted

    Hi,

    I'm on a team using SqlAlchemy on top of MySQL. MySQL consistently chooses the wrong index for several queries that we do, so this feature is currently at the top of needs for SqlAlchemy.

    Can you give some more details about when this feature might be scheduled for implementation?

    If we were to implement this feature ourselves, are there any gotchas that we should look out for?

    Thanks! Chris

    cegner at yahoo-inc dot com

  4. Mike Bayer repo owner

    Assuming the proposed API does whats needed, the implementation would go something like this: a "with_hint()" method is added to sqlalchemy.sql.expression.Select, which generates a copy of itself in the same manner as all the other generative methods (such as the "where()" method, for example), and stores the "hints" in a dictionary keying Table object to "hint" string. The dictionary of hints would need to be copied each time since the method of select's "generative" behavior is that the new select() is only a shallow copy of the previous; each mutator method must deeply copy whatever mutable structures it changes (so i guess that qualifies as a "gotcha").

    the compiler step is a little more weird since the position of the index hint is not consistent in the output. It seems like the mysql and sybase compilers would override "visit_table()", check for the "asfrom=True" flag and if present, compile the hint into the returned string; so fairly straightforward. For Oracle, it looks like oraclecompiler can implement an overridden "get_select_precolumns()" method and place its little comment-oriented hint right before the columns list using that method (while ensuring to also include the output of the default get_select_precolumns() method).

    so a working patch and especially with compilation-testing unit tests in test/dialect/mysql.py etc. we can commit right away.

  5. Mike Bayer repo owner

    I would also add that we can accept a mysql-only patch to start with as well, no issue there.

  6. Mike Bayer repo owner
    • changed milestone to 0.6.0

    tentatively putting this on 0.6.0, since this should be a very quick implementation. But it could still move out to 0.6.xx

  7. Log in to comment