Recently, I’ve been working with Hive a lot and I bumped into this simple thing that I wasn’t aware of existed in SQL. So I decided to share in this blog post.
Let’s say you need to extract a month from a timestamp. The following code snippet will return a string: one or two numeric characters, depending on the month. October – December will have two characters, all other months will have one.
MONTH(your_timestamp) AS month
This can be annoying, for example when you’re concatenating the month to the year. There are multiple ways to ensure you always receive two digits.
Using a CASE statement
First, you could add a CASE statement that checks the month and adds a zero when necessary.
CASE WHEN MONTH(your_timestamp) < 10 THEN CONCAT(0,MONTH(your_timestamp)) ELSE MONTH(your_timestamp) END AS month
But maybe, let’s not add four lines of SQL when it’s not necessary.
However, there’s a more elegant way: use LPAD. By setting the minimum length of the strength, and setting the padding character to a zero, you’ll achieve the same thing.
LPAD(MONTH(your_timestamp),2,0) AS month
By the way, this should work in many other SQL dialects such as MySQL and Oracle.
Finally, you could stick with date functions and replace MONTH with DATE_FORMAT. By specifying the MM date pattern, the DATE_FORMAT function will always return two digits for the month.