Python: Timezones and Daylight Saving Time
Timezones are a hard problem. DST is an even harder one. I kept running into problem after problem when I started using datetime in Python properly, so I decided to write a blog post to share my experience.
“Naive” and “Aware”
The first thing to know is that Python has two kinds of datetime: offset-naive and offset-aware. Offset-naive means that the datetime carries no timezone information, which is very error-prone if you are new to Python. If you mix a naive datetime and an aware datetime in an ordering comparison or a subtraction, you will get a TypeError. The standard library also ships very little timezone support of its own (Python 3 has a fixed-offset datetime.timezone class, and Python 3.9 added zoneinfo), so in this post I use pytz, a module that provides timezone information.
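import pytz
from datetime import datetime

tznaive_datetime = datetime(2018, 1, 1, 12, 0)
tzaware_datetime = datetime(2018, 1, 1, 12, 0, tzinfo=pytz.utc)

# this raises TypeError: can't compare offset-naive and offset-aware datetimes
# (on Python 3, == between the two does not raise, it just returns False)
tznaive_datetime < tzaware_datetime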
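To make the distinction concrete, here is a minimal sketch that uses only the standard library. datetime.timezone.utc stands in for pytz.utc, and the helper name is_aware is my own, not part of any library. It follows the rule from the Python documentation that a datetime is aware when its tzinfo is set and utcoffset() returns a value.

```python
from datetime import datetime, timezone

def is_aware(dt):
    """Hypothetical helper: True if dt carries a usable UTC offset."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None

naive = datetime(2018, 1, 1, 12, 0)
aware = datetime(2018, 1, 1, 12, 0, tzinfo=timezone.utc)

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# An ordering comparison between the two kinds raises TypeError.
try:
    naive < aware
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(mixed_ok)  # False
```

A check like this is handy at the boundary of your code, for example to reject naive values before they get stored.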
To avoid this kind of problem, you will have to make sure that your datetime objects are always offset-naive or always offset-aware. But we can’t avoid dealing with timezone, so my best practice is to always work with offset-aware datetime.
Python defaults to naive datetimes. datetime.now() returns a naive datetime in your local timezone, and datetime.utcnow() also returns a naive datetime, even though the function name already says UTC! You will need the pytz module, and also the tzlocal module, to get things right. Below is how I get the current time with the right timezone.
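from datetime import datetime
import pytz
import tzlocal

def utcnow():
    # passing a tzinfo to now() returns an aware datetime directly
    return datetime.now(pytz.utc)

def now():
    # get_localzone() returns the system's local timezone
    return datetime.now(tzlocal.get_localzone())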
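If you are on Python 3 and would rather not add dependencies, a rough equivalent can be sketched with the standard library alone. This assumes Python 3.6 or later, where astimezone() with no argument treats a naive value as system local time. The function names aware_utcnow and aware_now are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def aware_utcnow():
    """Hypothetical helper: current time as an aware datetime in UTC."""
    return datetime.now(timezone.utc)

def aware_now():
    """Hypothetical helper: current time as an aware local datetime."""
    # astimezone() with no argument attaches the current local UTC offset.
    return datetime.now().astimezone()

utc_now = aware_utcnow()
local_now = aware_now()

print(utc_now.utcoffset())           # 0:00:00
print(local_now.tzinfo is not None)  # True
```

Note that the local result carries a fixed offset rather than a full set of DST rules, so it is fine for timestamps but not for date arithmetic across a DST change.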
Parsing datetime with timezone
Update: This is only an issue in Python 2. Since Python 3.2, strptime understands the %z directive.
Another surprise is that Python's strptime parser does not work with timezones. The code below simply fails.
>>> from datetime import datetime
>>> datetime.strptime('2017-11-15T12:00:00-0700', '%Y-%m-%dT%H:%M:%S%z')
ValueError: 'z' is a bad directive in format '%Y-%m-%dT%H:%M:%S%z'
To parse a datetime with a timezone, you will need another module, python-dateutil. It provides a parser that works with timezones. Now you can parse datetimes with timezones!
>>> from dateutil import parser
>>> parser.parse('2017-11-15T12:00:00-07')
datetime.datetime(2017, 11, 15, 12, 0, tzinfo=tzoffset(None, -25200))
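As the update at the top of this section notes, Python 3 no longer has this limitation. The following sketch, assuming Python 3.2 or later, parses the same string with strptime and %z and then converts the result to UTC. No third-party module is involved.

```python
from datetime import datetime, timedelta, timezone

# %z accepts a numeric UTC offset such as -0700.
parsed = datetime.strptime('2017-11-15T12:00:00-0700', '%Y-%m-%dT%H:%M:%S%z')
print(parsed.isoformat())  # 2017-11-15T12:00:00-07:00

# The parsed value is offset-aware, so converting to UTC is unambiguous.
as_utc = parsed.astimezone(timezone.utc)
print(as_utc.isoformat())  # 2017-11-15T19:00:00+00:00
```

dateutil is still the more forgiving choice when the input format varies, since strptime needs an exact format string.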
Timedelta and DST
The last thing to be careful about is arithmetic on offset-aware datetimes. pytz can tell you whether a date is under DST through the dst() method.
>>> import pytz
>>> from datetime import datetime, timedelta
>>> pst = pytz.timezone("US/Pacific")
>>> pst.localize(datetime(2017, 10, 1)).dst()
datetime.timedelta(0, 3600)
>>> pst.localize(datetime(2017, 12, 1)).dst()
datetime.timedelta(0)
Note: DST for 2017 ended on Nov 5th, so any date after Nov 5th has a dst() of timedelta(0).
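But the dst() information is not updated when you manipulate the datetime, because the result of the arithmetic keeps the tzinfo that localize() attached to the original value: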
>>> (pst.localize(datetime(2017, 10, 1)) + timedelta(days=60)).dst()
datetime.timedelta(0, 3600)
What you can do is convert back to an offset-naive datetime, apply the delta, and then localize again:
pst.localize(yourdate.replace(tzinfo=None) + td).dst()
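For comparison, here is a sketch of the same 60-day shift using zoneinfo from the standard library instead of pytz. This is a swapped-in technique, not what the rest of this post uses, and it assumes Python 3.9 or later with a timezone database available on the system. I use the key America/Los_Angeles, which covers the same zone as US/Pacific. A ZoneInfo object looks up the offset for each datetime when asked, so dst() reflects the date after the arithmetic.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

pacific = ZoneInfo("America/Los_Angeles")

# With zoneinfo, the timezone can be passed straight to the constructor.
start = datetime(2017, 10, 1, tzinfo=pacific)
later = start + timedelta(days=60)  # 2017-11-30, after DST ended on Nov 5th

print(start.dst())  # 1:00:00
print(later.dst())  # 0:00:00
```

The addition here is wall-clock arithmetic: later is still midnight local time, only the UTC offset has changed from -7 to -8 hours.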
Summary
Working with datetime in Python is error-prone, because Python does not support datetimes and timezones well out of the box. You will have to use several additional modules. However, there is actually a silver bullet: you can just use arrow. This module provides a replacement for the Python datetime module. It is always offset-aware, and it also solves the problems I mention above. The only thing I still worry about is integration with external applications like Spark.