postgresql - to_tsquery docs and implementation detail
I recently realized some implementation details of
to_tsquery that are incompatible with the docs I drafted and how sqlalchemy integrates it.
Not sure how to handle this.
I based the docs/examples on existing tests and text. So we have this as the first bit of "Full Text Search" ( http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#full-text-search )
select([sometable.c.text.match("search string")]) SELECT text @@ to_tsquery('search string') FROM table
well, if we put this into psql...
badsql=> select to_tsquery('search string') ; ERROR: syntax error in tsquery: "search string"
that's because tsquery has a rigid enforcement of input. text must either be tokenized and joined with acceptable operators :
select to_tsquery('cat & rat');
be quoted :
select to_tsquery('''search string''');
or use the alternate function
select plainto_tsquery('search string');
So, these are acceptable:
select to_tsquery('search & string'); select to_tsquery('''search string'''); select plainto_tsquery('search string');
but this is not:
select to_tsquery('search string');
I'm not sure the best way to handle this nuance in the docs.
As far as the implementation detail goes...
this creates issues with
Column.match, because that generates (invalid) sql like this:
table.column @@ to_tsquery('search string')
not valid sql like:
table.column @@ to_tsquery('''search string''') table.column @@ plainto_tsquery('search string')
This is an annoying implementation detail , but the content/type of string will affect the function or format t search on.
i thought about just regexing this into submission, but the multitude of possible operators and edge cases suggests that the text would need to be parsed for tokenization instead -- otherwise throwing in a bit of text that has an operator will function as a real operator and probably trigger an invalid syntax.
i think the easiest scenario would be to replace match's