Wednesday, January 20, 2010

When SQL "Like" clause will use your indexes?

I had to run a search query over a varchar column in a MySQL table that have over 8  million rows, that column had a full text index created for it.

The query went fine when i was querying that table and using "=" to filter values of that column , but when i changed from "=" to "LIKE"  , things got changed.

I was using wildcards (%) to get all possible results , the "WHERE" clause was "WHERE myColumn LIKE '%abc%' "

Running this query , it took 10x of the time needed to execute the query when it was "WHERE myColumn='abc'"

The reason here is that , when "=" was used , the MySQL used the full-text index to search for the value i want, but with wildcards , i killed any chance to use an index, and MySQL chosen to scan the table to get my query results.

So what can be done to speed this query? The answer is tune your "Like" clause or to be more specific tune the wildcards placement to give MySQL more chances to use an index.

In my case the tunning i did was removing the wildcard at the beginning, so the "WHERE" clause became " WHERE myColumn LIKE 'abc%' ".

Running this query it gave  me results in almost even if not less than the time it took to execute the query with "=". Why ?
Because when we removed the starting wildcard, MySQL used the index we have which is mostly a B-Tree index to execute the query.

But what if you really want to have the wildcard as the beginning ("WHERE" clause became " WHERE myColumn LIKE '%abc' ")

Running this query will get us to the same query execution time problem, because we killed the chance to use our full-text index.

So how to fix it? if we were on Oracle, we would use a Reverse Key Index , but we are on MySQL now.

The solution is a little hack to the query that makes use of the MySQL "Reverse()" function, first you will use this function to store your strings, then you will use in your query.
So assuming you have already stored strings after applying "reverse()" to each value, and have the full-text index created in that column , you can use wildcard in the beginning with "LIKE" clause like this
" WHERE myColumn LIKE REVERSE('%abc); "

And thats it.


Anonymous said...

What a useless post. Using reverse then removing the wildcard from the "end" of the query denies you the freedom of searching from the string within a string.

There is no tweak or use of indexes when a keyword is wrapped in wildcards. The only answer is to use Full Text Search.