Don’t mess with LIKE
As I have already shown you, implicit type conversion is one of the most dangerous features of Oracle SQL and PL/SQL. It is dangerous because it happens automatically without your knowledge and may lead to unpredictable results. These problems are most common when dealing with DATE conversions, but not limited to them.
For example, let’s have a look at this Stackoverflow.com question by James Collins. James had a problem, the following query was slow:
SELECT a1.* FROM people a1 WHERE a1.ID LIKE '119%' AND ROWNUM < 5
A1.ID was indexed, the index wasn’t used and
the explain plan looked like this:
SELECT STATEMENT ALL_ROWS Cost: 67 Bytes: 2,592 Cardinality: 4 2 COUNT STOPKEY 1 TABLE ACCESS FULL TABLE people Cost: 67 Bytes: 3,240 Cardinality: 5
James was wondering why.
Well, the key to the issue lies, as it often happens with Oracle, in an implicit data type conversion. Because Oracle is capable to perform automatic data conversions in certain cases, it sometimes does that without you knowing. And as a result, performance may suffer or code may behave not exactly like you expect.
In our case that happened because
ID column was
LIKE pattern-matching condition expects to see character
types as both left-hand and right-hand operands. When it encounters a
NUMBER, it implicitly converts it to
VARCHAR2. Hence, that query was basically silently rewritten to this:
SELECT a1.* FROM people a1 WHERE To_Char(a1.ID) LIKE '119%' AND ROWNUM < 5
That was bad for 2 reasons:
- The conversion was executed for every row, which was slow;
- Because of a function (though implicit) in a
WHEREpredicate, Oracle was unable to use the index on
If you came across a problem like that, there is a number of ways to resolve it. Some of the possible options are:
- Create a function-based index on
CREATE INDEX people_idx5 ON people (To_char(ID));
- If you need to match records on first 3 characters of
IDcolumn, create another column of type
NUMBERcontaining just these 3 characters and use a plain = operator on it.
- Create a separate column
VARCHAR2and fill it with
TO_CHAR(id). Then index it and use instead of
- Or, as David Aldridge
pointed out: “It might also be possible to rewrite
the predicate as
ID BETWEEN 1190000 and 1199999, if the values are all of the same order of magnitude. Or if they’re not then
ID = 119 OR ID BETWEEN 1190 AND 1199, etc.”
Of course if you choose to create an additional column
based on existing ID column, you need to keep those 2 synchronized.
You can do that in batch as a single
UPDATE, or in an
or add that column to the appropriate
UPDATE statements in
James choose to create a function-based index and it worked like a charm.