Python: Timezones and Daylight Saving Time

Timezones are a hard problem. DST is an even harder one. I kept running into problem after problem when I started using datetime in Python properly, so I decided to write a blog post to share my experience.

“Naive” and “Aware”

The first thing to know is that Python has two kinds of datetime: offset-naive and offset-aware. Offset-naive means the datetime carries no timezone information, which is very error prone if you are new to Python. If you mix a naive datetime with an aware one, for example by comparing or subtracting them, you get a TypeError. Python also does not ship a timezone database out of the box (at least not before the zoneinfo module arrived in Python 3.9), so you need pytz, a module that provides timezone information.

import pytz
from datetime import datetime
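
Here is a minimal sketch of the mixing error described above. The helper name try_compare is made up for this example; it only wraps the comparison so the script keeps running after the error.

```python
from datetime import datetime
import pytz

naive = datetime(2017, 11, 15, 12, 0)   # no tzinfo attached
aware = pytz.utc.localize(naive)        # same wall-clock time, tagged as UTC

def try_compare(a, b):
    """Return the TypeError message raised by `a < b`, or None if it works."""
    try:
        a < b
    except TypeError as exc:
        return str(exc)
    return None

mixed_error = try_compare(naive, aware)  # a message: naive vs aware is rejected
same_kind = try_compare(aware, aware)    # None: aware vs aware is fine
print(mixed_error)
```

Subtracting a naive datetime from an aware one fails the same way, so the error tends to show up far from the place where the naive value was created.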

To avoid this kind of problem, make sure your datetime objects are either all offset-naive or all offset-aware. Since we can’t avoid dealing with timezones, my best practice is to always work with offset-aware datetimes.
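
One way to enforce the "always aware" rule is to tag naive values as soon as they enter your code. The sketch below is only an illustration: ensure_aware and assumed_zone are hypothetical names, and defaulting to UTC is an assumption you should replace with whatever zone your naive inputs are really in.

```python
from datetime import datetime
import pytz

def ensure_aware(dt, assumed_zone=pytz.utc):
    """Return dt unchanged if it has a tzinfo, else tag it with assumed_zone.

    Hypothetical helper: the caller must know which zone naive inputs are in.
    """
    if dt.tzinfo is not None:
        return dt
    # localize() attaches the zone without shifting the wall-clock time
    return assumed_zone.localize(dt)

from_db = datetime(2017, 11, 15, 12, 0)  # naive input, e.g. read from storage
tagged = ensure_aware(from_db)           # now aware, in UTC
untouched = ensure_aware(tagged)         # already aware, returned as-is
```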

Python defaults to naive datetimes. now() returns a naive datetime in your local timezone, and utcnow() also returns a naive datetime, even though the function name already says UTC! You will need the pytz module, and also the tzlocal module, to get things right. Below is how I get the time with the right timezone.

import datetime
import pytz
import tzlocal
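
A minimal sketch of getting an aware "now", assuming pytz is installed. The function names utc_now and now_in are made up for this example. For the machine's own zone, tzlocal.get_localzone() should give you a tzinfo you can pass to now() in the same way; I leave it out here to keep the sketch to one dependency.

```python
from datetime import datetime
import pytz

def utc_now():
    """Current time as an offset-aware datetime in UTC."""
    return datetime.now(pytz.utc)

def now_in(zone_name):
    """Current time as an offset-aware datetime in the named zone."""
    return datetime.now(pytz.timezone(zone_name))

now_utc = utc_now()
now_pacific = now_in("US/Pacific")
print(now_utc.isoformat(), now_pacific.isoformat())
```

Passing the zone to now() lets the zone object pick the right offset for the current instant, so there is no naive intermediate value to get wrong.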

Parsing datetime with timezone

Update: This is only an issue in Python 2; strptime in Python 3 understands %z.

Another surprise is that Python's strptime parser does not work with timezones. The code below will just fail.

>>> from datetime import datetime
>>> datetime.strptime('2017-11-15T12:00:00-0700', '%Y-%m-%dT%H:%M:%S%z')
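
For comparison, here is a small sketch of the same call on Python 3, where it should succeed and return an offset-aware datetime. Only the standard library is used.

```python
from datetime import datetime, timedelta

# Same input and format string as above, run on Python 3
parsed = datetime.strptime('2017-11-15T12:00:00-0700', '%Y-%m-%dT%H:%M:%S%z')

offset = parsed.utcoffset()   # the -07:00 offset parsed from the string
print(parsed.isoformat(), offset == timedelta(hours=-7))
```

Note that the result carries a fixed offset, not a named zone, so it knows nothing about DST rules.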

To parse a datetime with a timezone, you will need another module, python-dateutil. It provides a parser that handles timezones. Now you can parse datetimes with timezones!

>>> from dateutil import parser
>>> parser.parse('2017-11-15T12:00:00-07')

Timedelta and DST

The last thing to be careful about is manipulating offset-aware datetimes. pytz can tell you whether a date falls under DST through the dst() method.

>>> from datetime import datetime, timedelta
>>> import pytz
>>> pst = pytz.timezone("US/Pacific")
>>> pst.localize(datetime(2017, 10, 1)).dst()
datetime.timedelta(0, 3600)
>>> pst.localize(datetime(2017, 12, 1)).dst()
datetime.timedelta(0)

Note: DST for 2017 ended on Nov 5th, so any later date in 2017 has a dst() of timedelta(0).

But the dst() information is not updated when you do arithmetic on the datetime: the result keeps the tzinfo of the original value.

>>> (pst.localize(datetime(2017, 10, 1)) + timedelta(days=60)).dst()
datetime.timedelta(0, 3600)

What you can do instead is convert back to an offset-naive datetime, apply the delta, and localize the result again:

pst.localize(yourdate.replace(tzinfo=None) + td).dst()
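
To make the difference concrete, here is a sketch that compares three results for the same 60-day delta. The variable names are mine. The third variant uses pytz's normalize(), which is not what I do above: it keeps the elapsed time exact and moves the wall-clock hour, while the naive round trip keeps the wall-clock hour and changes the elapsed time by an hour.

```python
from datetime import datetime, timedelta
import pytz

pacific = pytz.timezone("US/Pacific")
start = pacific.localize(datetime(2017, 10, 1))  # midnight, DST in effect
delta = timedelta(days=60)

# Plain arithmetic: the result still carries the summer-time tzinfo
stale = start + delta

# Naive round trip: same wall-clock hour, offset looked up again
wall_clock = pacific.localize(start.replace(tzinfo=None) + delta)

# normalize(): same instant as `stale`, re-expressed with the correct offset
elapsed = pacific.normalize(start + delta)

for label, value in [("stale", stale), ("wall_clock", wall_clock), ("elapsed", elapsed)]:
    print(label, value.isoformat(), value.dst())
```

Which one you want depends on the question: "same time of day, 60 days later" is the naive round trip, "exactly 60 x 24 hours later" is normalize().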

Summary

Working with datetime in Python is error prone, because Python does not support datetimes and timezones well out of the box. You have to use a lot of additional modules. However, there is something close to a silver bullet: you can just use arrow. This module provides a replacement for Python's datetime module that is always offset-aware, and it takes care of the problems I mention above. The only thing I might worry about is integration with external applications like Spark.

I write, so I learn.