Don’t mess with LIKE
As I have already shown you, implicit type conversion is one of the most dangerous features of Oracle SQL and PL/SQL. It is dangerous because it happens automatically without your knowledge and may lead to unpredictable results. These problems are most common when dealing with DATE conversions, but not limited to them.
For example, let’s have a look at this Stackoverflow.com question by James Collins. James had a problem, the following query was slow:
SELECT a1.*
FROM people a1
WHERE a1.ID LIKE '119%'
AND ROWNUM < 5
Despite column A1.ID
was indexed, the index wasn’t used and
the explain plan looked like this:
SELECT STATEMENT ALL_ROWS
Cost: 67 Bytes: 2,592 Cardinality: 4 2 COUNT STOPKEY 1 TABLE ACCESS FULL TABLE people
Cost: 67 Bytes: 3,240 Cardinality: 5
James was wondering why.
Well, the key to the issue lies, as it often happens with Oracle, in an implicit data type conversion. Because Oracle is capable to perform automatic data conversions in certain cases, it sometimes does that without you knowing. And as a result, performance may suffer or code may behave not exactly like you expect.
In our case that happened because ID
column was NUMBER
.
You see, LIKE
pattern-matching condition expects to see character
types as both left-hand and right-hand operands. When it encounters a
NUMBER
, it implicitly converts it to VARCHAR2
. Hence, that query was basically silently rewritten to this:
SELECT a1.*
FROM people a1
WHERE To_Char(a1.ID) LIKE '119%'
AND ROWNUM < 5
That was bad for 2 reasons:
- The conversion was executed for every row, which was slow;
- Because of a function (though implicit) in a
WHERE
predicate, Oracle was unable to use the index onA1.ID
column.
If you came across a problem like that, there is a number of ways to resolve it. Some of the possible options are:
- Create a function-based index on
A1.ID
column:
CREATE INDEX people_idx5 ON people (To_char(ID));
- If you need to match records on first 3 characters of
ID
column, create another column of typeNUMBER
containing just these 3 characters and use a plain = operator on it. - Create a separate column
ID_CHAR
of typeVARCHAR2
and fill it withTO_CHAR(id)
. Then index it and use instead ofID
in yourWHERE
condition. - Or, as David Aldridge
pointed out: “It might also be possible to rewrite
the predicate as
ID BETWEEN 1190000 and 1199999
, if the values are all of the same order of magnitude. Or if they’re not thenID = 119 OR ID BETWEEN 1190 AND 1199
, etc.”
Of course if you choose to create an additional column
based on existing ID column, you need to keep those 2 synchronized.
You can do that in batch as a single UPDATE
, or in an ON-UPDATE
trigger,
or add that column to the appropriate INSERT
and UPDATE
statements in
your code.
James choose to create a function-based index and it worked like a charm.