When I was going through PL/SQL procedures written by some of my colleagues, I noticed a few mistakes around Oracle's date conversion functions. There are some peculiarities about those functions that I thought everyone knew about, but I reckon that writing about them may help others avoid such mistakes. I have also allowed myself to outline a few rules that, if you adhere to them, will help you write better programs.
There are two functions in Oracle to convert strings to dates and back.
The first one is TO_DATE – it takes a string parameter and returns a date. Ok, it's actually a date/time combination packed into a single data type – Oracle's DATE.
The second one is TO_CHAR – it does the opposite: it takes a date/time value as Oracle's DATE data type and converts it to a string.
Actually, these functions are a bit more complex than that, but for our purpose that will do. What's important to understand here is the distinction between a date as a DATE data type and its string representation.
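To make that distinction concrete, here is a minimal sketch of both directions (the format masks and the literal are just examples of my own):

DECLARE
  v_date DATE;
  v_text VARCHAR2(20);
BEGIN
  v_date := TO_DATE('01/04/2009', 'DD/MM/YYYY');   -- string -> DATE
  v_text := TO_CHAR(v_date, 'DD-MON-YYYY');        -- DATE -> string
  dbms_output.put_line(v_text);                    -- prints 01-APR-2009 (with the default English date language)
END;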
For example, when you type '01-APR-09' in the procedure's text, that's a string representing a date. Pay attention here: although you meant a date, Oracle sees a string. For Oracle, everything that is enclosed in single quotation marks is a string. To make it a date, we need to convert this string to the DATE data type. Such a conversion can be carried out in two possible ways: explicitly or implicitly.
Explicit conversion is when we apply the TO_DATE function to the string:
v_date DATE;
v_date := TO_DATE('01/04/2009', 'DD/MM/YYYY');
Now v_date is a date representing April 1st, 2009.
Implicit conversion is when we let Oracle perform the conversion itself:
v_date DATE;
v_date := '01-APR-09';
It has the same effect. Every time Oracle sees a string in a place where it expects a date, it is smart enough to perform the conversion for us.
"Well," you may think, "that's great. Oracle does it all for us, so we don't have to. Life is easier, let's go for another coffee break."
Not quite.
You see, when Oracle does such an implicit conversion, it relies on some assumptions. If you read the documentation for the TO_DATE and TO_CHAR functions, you'll find that they take another optional parameter – the date format. That format tells Oracle how the string representing the date/time should be treated. If the format parameter is not specified, it is taken from the NLS_DATE_FORMAT Oracle parameter. Here's the crux: we can't assume that this parameter will be the same on all Oracle systems. Although it is 'DD-MON-RR' by default, and it is left like that on most Oracle systems, we can't assume that it's going to be like this always and everywhere. And if you rely on implicit date conversions and some DBA changes the NLS_DATE_FORMAT parameter – WHAM! – all your programs will stop working.
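Here is a small sketch of that mechanism (the ALTER SESSION below is purely hypothetical, simulating a DBA or a tool changing the setting):

-- simulate an environment where the default format differs from what the code assumes
ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD';

DECLARE
  v_implicit DATE;
  v_explicit DATE;
BEGIN
  -- implicit: parsed with whatever NLS_DATE_FORMAT happens to be; depending on
  -- the session setting this either raises ORA-01861 or silently produces a wrong date
  v_implicit := '01-APR-09';

  -- explicit: unaffected by NLS_DATE_FORMAT, always parsed the same way
  v_explicit := TO_DATE('01/04/2009', 'DD/MM/YYYY');
END;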
So, a good practice and rule of thumb for you should be:
Never ever rely on implicit date conversions! Whenever you need to convert a date to a string or vice versa, use the appropriate TO_DATE or TO_CHAR function and always specify the date format.
Just like this:
v_date DATE;
v_date := TO_DATE('01/04/2009', 'DD/MM/YYYY');
The danger of NLS_DATE_FORMAT being changed is the biggest threat, but not the only one.
Pay attention to the default date format I mentioned above – 'DD-MON-RR'. Do you notice anything suspicious? The year is 2 digits. Here Oracle tries to be a smartass and guess whether you mean the XX or the XXI century. Your only hope is that it figures out what you meant and doesn't make a mistake. But if it does make a mistake – oops, welcome back, Millennium Bug. This brings us to the second rule:
Always specify the 4-digit year.
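To see what the two-digit year can cost you, here is a small sketch (I assume it runs in a year whose last two digits fall between 00 and 49, such as 2009; the date itself is made up):

DECLARE
  -- the same two-digit year ends up in different centuries depending on the mask
  v_rr DATE := TO_DATE('01-04-50', 'DD-MM-RR');   -- RR: previous century, 1950
  v_yy DATE := TO_DATE('01-04-50', 'DD-MM-YY');   -- YY: current century, 2050
BEGIN
  dbms_output.put_line(TO_CHAR(v_rr, 'DD/MM/YYYY'));   -- 01/04/1950
  dbms_output.put_line(TO_CHAR(v_yy, 'DD/MM/YYYY'));   -- 01/04/2050
END;

With a four-digit year in both the data and the format mask, there is simply nothing left for Oracle to guess.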
Another dangerous programming technique is trying to convert a Date to a Date where no conversion is necessary.
Let's have a look at the following example, or should I say, a puzzle?
DECLARE
  v_date   DATE := '01-APR-09';
  v_date_2 DATE := TO_DATE (v_date, 'DD/MM/YYYY');
BEGIN
  dbms_output.put_line (TO_CHAR (v_date_2, 'DD/MM/YYYY'));
END;
Try to guess what will be printed as a result.
If you think '01/04/2009', you've just screwed up your business-critical application and sent it two thousand years back in time.
In fact, you'll get '01/04/0009'.
This is where it all goes bad:
v_date_2 DATE := TO_DATE (v_date, 'DD/MM/YYYY');
And here's why:
The first thing Oracle tries to do is execute the TO_DATE function. There is only one TO_DATE function in Oracle – the one that takes a string and converts it to a date. Although we know that v_date is not a string, Oracle still proceeds with its logic. If you run this code, it won't produce an error.
Oracle successfully convinces itself that it sees a string where it has a date. That happens because Oracle is able to implicitly convert that date to a string, effectively turning that line into
v_date_2 DATE := TO_DATE (TO_CHAR(v_date), 'DD/MM/YYYY');
But, as we've already learned, implicit date-to-string conversions are performed using the date format recorded in the NLS_DATE_FORMAT Oracle parameter, which is by default set to 'DD-MON-RR'. Hence, what Oracle effectively does is this:
v_date_2 DATE := TO_DATE (TO_CHAR(v_date, 'DD-MON-RR'), 'DD/MM/YYYY');
Can you spot the error already? The date formats are inconsistent! This
is what you get when you don’t pay attention to the details.
So, here comes rule 3:
Avoid unnecessary conversions. Never convert dates to dates.
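For completeness, here is a minimal sketch of how the puzzle above could have been written following all three rules (the literal and the masks are my own choice):

DECLARE
  v_date   DATE := TO_DATE('01-APR-2009', 'DD-MON-YYYY');  -- explicit conversion, 4-digit year
  v_date_2 DATE := v_date;                                 -- DATE to DATE is a plain assignment, no conversion at all
BEGIN
  dbms_output.put_line(TO_CHAR(v_date_2, 'DD/MM/YYYY'));   -- prints 01/04/2009
END;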
If you think that all this stuff is pretty confusing, that's because it indeed is. The good news is that you can avoid the confusion altogether by learning to program in a clearer, more concise way. That is the foundation of a good programming style.
Virtually every software development methodology stresses the need to specify how the software end product is expected to work before the programming starts. But the truth is that writing specifications is boring. And most programmers, being lazy people (in the good sense of "lazy" – yes, I believe there is a good sense), try to avoid boring work. So, when the specs are written, they are often incomplete and don't cover all the possible scenarios. Moreover, as the requirements change, these specs become outdated and, hence, even less useful.
I wonder if there is a way to replace the boring specs with something
which is fun.
One possible way to go might be "Role-play-style modelling". Just think about it. Usually a software product consists of a number of modules communicating with each other. Each module does a specific task. In a typical software project, different developers are assigned to implement different modules.
Now imagine that, before diving into coding, you have a kind of role-playing game where each programmer pretends to be the module he is going to program. It goes like this:
Dave is a module that reads an XML file and converts it into a spreadsheet,
Bob is a module that is responsible for taking the spreadsheet and sending
it to users as an e-mail attachment.
Dave: Trying to open the XML file. Wait a minute, what if it’s
not a well-formed XML, what am I supposed to do then? I should probably
send an error message back. Ok, I've read it and transformed it successfully.
Now I notify Bob that it’s ready. Bob, we need to decide on this
notification protocol. Now, it’s your turn.
Bob: Ok, first I need to create an e-mail. Where do we get the e-mail
address and subject line from – should we ask the user? This is something
to find out. Then I need to check that the e-mail address is well-formed
and the subject line is not empty. Then I read the spreadsheet file Dave
provided and attach it. Dave, how big can this file be?
Dave: I reckon it’s up to 1 Mb, but I need to clarify it.
Bob: I need to find out whether I need to compress it. Then I try to send the e-mail. If it is sent successfully, I report an Ok status back to Dave. If it fails, I report a failure status with the error message.
…
And so on.
That would give all the participants an understanding of how the end product should work. And, assuming the programmers take notes as they go, everyone will end up with a mini-spec of their own.
Another advantage of such an approach is that it would encourage people to discuss every possible situation and ask "what if" questions. And asking them is the most important part of the whole process. While there is at least one outstanding question, the spec is incomplete. So, after our game, Bob and Dave would set off to harass whoever is appropriate, such as business analysts or even end users, to extract the answers from them. And once they have the answers, we would repeat the game. And we would do it over and over again, until everything unknown has been ruled out.
It sounds like an odd approach, but it just might be odd enough to work.
A few days ago I went to a presentation by an Oracle recruiter. He talked about the ways Oracle searches for potential candidates and hires them. One particular thing that struck me was that the Australian Oracle recruitment team does not use specialised recruitment web sites.
About a decade ago, looking for a job was all about reading newspaper
ads. Then along came job web sites, and the recruitment landscape
changed forever.
I am myself of a younger generation, and looking for a job, for me, was never associated with newspapers. When I look for a job, I go to web sites such as Seek or Monster.com and search there. I take it as a given that all the jobs I may be interested in are posted there, either directly or through recruitment agents.
This approach is more convenient for both job advertisers and job seekers, because it allows them to find each other more easily. And most of the activity is concentrated around a few dominant players such as the aforementioned Seek and Monster. This is because of the so-called network effect: such web sites are created to connect people to each other, in our case job seekers and advertisers, and because the number of possible connections is proportional to the square of the number of participants, the probability of "getting connected" through a large job web site is much higher than through a small one. Thus, size does matter. Indeed, if I want to find a job, the first thing I do is look at the largest local job site, because this is where most jobs are, much like Willie Sutton robbing banks "because that's where the money is".
But that is changing too.
As I have already said, I was surprised to hear that Australian Oracle
recruiters do not use job web sites. Instead, they exploit a relatively
new phenomenon of Internet social network services.
Social networks benefit from the network effect on a scale far beyond that of traditional job sites. This is because the network size of a job site depends on the number of job advertisers paying to place their ads. In other words, it depends on the supply of ads. And if the supply side does not increase, the network will not grow.
On the other hand, the growth rate of a social network is not limited by
either supply or demand. As long as new members keep subscribing,
the network will grow. Even if they stop, the existing members will still be
able to establish new connections until all the members are connected to
each other.
A good example is LinkedIn, which is
effectively a tool for building a professional social network. It allows
you to easily find your co-workers, provided they are registered with
the site, as well as connect with potential employers.
So, what is in it for employers? The benefit for them is that they are able to build their network even if they are not actively hiring. And later, when they need it, they will be able to instantly find the right candidates through their connections. It is like having a bank of job seekers at their disposal, a readily available database of resumes. And that bank will update and grow by itself.
That leads us to an interesting conclusion: within this new paradigm, if you're not online, you cannot be found. And if you cannot be found, you can't be hired. Simply because you are outside the network.
All in all, we are witnessing the birth of new recruitment methods. Who knows where it is going to lead us. Maybe in 5 years it will be mandatory for job seekers to have a Second Life avatar and all interviews will be conducted in virtual reality.
The Australian Oracle recruitment team has a blog; they also follow the LinkedIn, Twitter and MySpace networks.
Once, during one of my projects, I was asked what the risk was of something going wrong. I remember answering that we, programmers, do not assess risks. We just do everything possible to prevent them. "There is no point," I said, "in calculating the chances and the impact of a file not being in the incoming directory at the start-up of the process if we can implement a simple check and avoid the disaster."
Needless to say, I was wrong.
Nowadays I work in a bank. And one of the things I have learned about banks is that they all have departments responsible for managing risks. A bank, in its business, plays a risky game – it lends money. It is risky because there's always a possibility that it will not get the money back. And banks have learnt to manage these risks. They ask: "What can go wrong? What can we do to prevent it? And what can we do to minimize the impact if we don't manage to prevent it?"
The software industry is much, much younger than banking. But we are proud of ourselves, because we use cutting-edge technology and the most modern methodologies. Yet we haven't learnt some basic principles that other industries have employed for decades.
We don’t manage risks.
I've seen a few project screw-ups, all for various reasons: the chosen technology wasn't robust enough to support the solution, the staff's skills were not strong enough to implement it, the requirements were unrealistic… The causes may differ, but the result is always the same – failure, or infinite slippage of the deadline, which amounts to the same thing.
And in every case the failure seemed to have been unexpected. It
puzzled me - why could nobody predict it? And it happens over and over
again. A few years ago I was a faithful reader of The Daily WTF blog. Every day it published failure stories
like the ones I witnessed. So, if we are so smart, why does it keep
happening over and over again?
Because in a software project nobody asks what can go wrong.
Not all risks are the same. Some risks can be assessed and eliminated, such as the risk that the existing or planned hardware will not be able to handle the workload. This kind of risk is calculable. We can do a stress test, measure the execution time, do some basic calculations and determine not only that we need a new server with a more powerful CPU, but exactly how powerful it needs to be.
Others are harder to predict. How would you assess the risk of your project manager being a crook and ruining the project?
And, finally, there's the mother of all risks – a highly improbable event, the Black Swan, as Nassim Nicholas Taleb calls it in his book "The Black Swan: The Impact of the Highly Improbable" (which I would highly recommend to everyone who wants to learn more about risks). What, for example, is going to happen to your project if all the programmers decide to quit at the same time?
Some time ago I worked on a project that employed a so-called custom rules engine for performing complex batch calculations on a huge dataset – about a hundred million rows. We had a few thousand rules defined by business people, and in order to get a result we needed to apply every rule to the source dataset. As you can imagine, it was slow. Scanning a hundred million rows is not fast, especially when you need to do it a thousand times. So, a strategic decision was made to speed it up. One of our local geniuses came up with a brilliant, as he thought, idea: instead of applying each of the thousand rules to the source dataset in its own pass, we would reformat the existing rules so that they formed one monstrous rule, containing all thousand, which could be applied to the source dataset in a single pass. And so we did. One thing, though, we didn't take into account: when combined all together, the rules formed a frankensteinian SQL expression more than 1,000,000 characters long (that's right, more than a million). Yes, it worked, and the process became reasonably fast. But what was the price paid?
Imagine that you are a support programmer. You are quite unhappy with your life, and with your job in particular, because you're fed up with those stupid users and their problems. And one evening, just as you are about to go home, you get a call. Apparently, the rules engine has failed. You open the log and what do you see? A million-character-long SQL statement and an error message at the end, saying that an error happened somewhere in there. Yep, somewhere in those million characters. You see, that's the way Oracle works. It tells you that an error has happened and which one (let's say, division by zero), but it doesn't tell you where exactly within the given SQL it happened. So, what are you going to do, step by step? I don't know about you, but I would start writing a resignation letter. And I wasn't the only one confused. No one I asked could answer that question.
You may ask, what does this have to do with risk? There is a direct connection.
We had a certain risk that the system being built couldn't handle the workload. The amount of that risk was known. We could calculate how much time it took to process the data set, extrapolate it to the given requirements and find out that it took 10 times longer than was acceptable. We could resolve it by buying a server 10 times more powerful, or a cluster of less powerful servers. In any case, the risk and the cost of eliminating it were calculable.
After the strategic decision was made to solve the performance problem with a much more complex rules engine, the risk became incalculable. Yes, they prevented the performance problem, but they introduced another, much more serious one: in the case of a single failure, the system would go down for an undefined time, because no one would be able to fix it. And undefined downtime means unlimited loss for the business, because in the business world time is always money.
So, what they did was replace an imminent but assessable risk, one that, in fact, could be prevented, with a less probable but potentially much more severe one that could not be assessed. I don't believe that solving a minor problem by introducing a much more severe one, and hiding the possibility of a failure under the rug, is proper risk management. In the long term it is never a good strategy.