Working with date and time data is one of the most challenging tasks in Data Science, as well as in programming in general. Between different date formats, different time zones, daylight saving time, and whatnot, it can be difficult to keep track of which days or times you are referencing.
Fortunately, Python's built-in datetime module comes to our rescue.
Why programming with dates and times is hard
Programming with dates and times can be such a pain because of a fundamental disconnect: a computer program prefers its events to be ordered and regular, while humans tend to use dates and times in irregular and unordered ways.
One great example of such irregularity is daylight saving time, as adopted by the United States and Canada. Clocks are set forward one hour on the second Sunday of March and set back one hour on the first Sunday of November.
Things get more complicated once you factor time zones into your projects. Ideally, time zones would follow straight lines along the longitudes. However, for political and historical reasons, these lines are seldom straight.
Standard date formats
Different cultures use different date formats, such as D M Y, Y M D, or M D Y. As you can see, it would be chaos if there were no standardized format, so the International Organization for Standardization (ISO) developed ISO 8601 to define one format and avoid confusion.
According to the standard, values are written from the most significant to the least significant: year, month, day, hour, minute, second. Thus, the format is: YYYY-MM-DDThh:mm:ss (for example, 2021-05-09T13:13:06).
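In Python, this most-significant-first ordering is what the datetime class writes and reads through isoformat() and fromisoformat(). The snippet below is only a small sketch; the variable names and sample dates are made up for illustration. It also shows a handy side effect of the ordering: ISO date strings sort chronologically when sorted as plain text.

```python
from datetime import datetime

# Build a datetime and write it out as an ISO 8601 string.
moment = datetime(2021, 5, 9, 13, 13, 6)
iso_text = moment.isoformat()
print(iso_text)  # 2021-05-09T13:13:06

# Read the string back into an equal datetime object.
parsed = datetime.fromisoformat(iso_text)
print(parsed == moment)  # True

# Most-significant-first strings sort chronologically as plain text.
dates_as_text = ["2021-05-09", "2020-12-31", "2021-01-15"]
print(sorted(dates_as_text))
```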
How computers measure time
Most computers count time from an arbitrary instant called the Unix epoch: January 1st, 1970, at 00:00:00 UTC. Coordinated Universal Time (UTC) refers to the time at 0° longitude and is popularly equated with Greenwich Mean Time (GMT). It is not adjusted for daylight saving time, so every day keeps a constant twenty-four hours.
Unix time is measured in seconds from January 1, 1970. You can easily view the current Unix time with a few lines of code in Python.
from datetime import datetime

current_time = datetime.now()
a = datetime.timestamp(current_time)
print(a)
> 1620566182.766565
Thus, 1620566182.766565 seconds had passed since January 1, 1970, at the moment I was writing this blog! Here is an interesting fact about Unix time. Since many older operating systems are 32-bit, they store Unix time in a 32-bit signed integer.
You already know where this is going if you are familiar with the Y2K Problem. A 32-bit signed integer can count up to 03:14:07 UTC on January 19th, 2038. One second later the integer overflows, resulting in what's known as the Year 2038 Problem, or Y2038 for short. To avoid catastrophic consequences for critical systems, this problem needs to be addressed soon.
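The overflow date can be checked with a few lines of Python. This is only an illustrative sketch: the names MAX_INT32 and wrapped are made up here, and the wraparound is simulated with plain arithmetic, since Python integers do not overflow on their own.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit integer can hold.
MAX_INT32 = 2**31 - 1

# Second zero of Unix time, expressed in UTC.
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(epoch)  # 1970-01-01 00:00:00+00:00

# The last second a signed 32-bit counter can represent.
last_moment = epoch + timedelta(seconds=MAX_INT32)
print(last_moment)  # 2038-01-19 03:14:07+00:00

# Simulate adding one more second to a signed 32-bit counter:
# the value wraps around to the most negative number.
wrapped = (MAX_INT32 + 1 + 2**31) % 2**32 - 2**31
print(wrapped)  # -2147483648
print(epoch + timedelta(seconds=wrapped))  # 1901-12-13 20:45:52+00:00
```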
The datetime Module
The datetime module is built into Python, so there's no need to install it separately. Just import it into your code and you are good to go.
The module supplies different classes for working with dates, times, and time intervals. The two main ones are date and datetime. There are six classes in total; here they are with their official descriptions:
1) datetime.date: Used to manipulate date without interfering with time. Attributes are year, month, and day.
2) datetime.time: Used to manipulate time without interfering with date, assuming every day has exactly 24*3600 seconds. The attributes are hour, minute, second, microsecond, and tzinfo.
3) datetime.datetime: Used to manipulate the combination of date and time. The attributes are year, month, day, hour, minute, second, microsecond, and tzinfo.
4) datetime.timedelta: Duration between two date, time, or datetime objects with resolution up to microseconds.
5) datetime.tzinfo: This is an abstract base class for time zone objects. It can be used by the datetime and time classes to provide a customizable notion of time adjustment (e.g. to account for time zones and/or daylight saving time).
6) datetime.timezone: An implementation of the tzinfo abstract base class.
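As a quick, informal tour of the classes listed above, the sketch below builds one object of each concrete class. The values are arbitrary sample data chosen for illustration.

```python
from datetime import date, time, datetime, timedelta, timezone

d = date(2021, 5, 10)                      # calendar date only
t = time(16, 28, 5)                        # time of day only
dt = datetime.combine(d, t)                # date and time together
delta = timedelta(days=1, hours=2)         # a duration
utc_dt = dt.replace(tzinfo=timezone.utc)   # timezone is a concrete tzinfo

print(d)           # 2021-05-10
print(t)           # 16:28:05
print(dt)          # 2021-05-10 16:28:05
print(delta)       # 1 day, 2:00:00
print(dt + delta)  # 2021-05-11 18:28:05
print(utc_dt)      # 2021-05-10 16:28:05+00:00
```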
You can read more about these classes in depth in the official Python documentation for the datetime module. Now let's get started with the coding!
Creating DateTime objects
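You can create a datetime object by passing its components to the constructor in the same order the ISO format uses: year, month, day, hour, minute, second.
from datetime import datetime

a = datetime(2021, 5, 9, 13, 13, 6)
print(a)
> 2021-05-09 13:13:06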
To get the present date:
today = datetime.now()
print(today)
> 2021-05-10 16:28:05
print(type(today))
> <class 'datetime.datetime'>
Thus, today is indeed a datetime object. Some of the useful attributes that come with the datetime class are shown below.
dt_nw = datetime.now()

# to get hour from datetime
print('Hour: ', dt_nw.hour)
> Hour: 16

# to get minute from datetime
print('Minute: ', dt_nw.minute)
> Minute: 28
We can get the day of the week as a number using the method .weekday(), where Monday is 0 and Sunday is 6. That number can then be converted into a name (i.e. Monday, Tuesday, Wednesday, ...) using another module called calendar.
First, we import the calendar module and then use some of its useful operations.
import calendar

my_date = datetime.now()

# To get month from date
print('Month: ', my_date.month)
> Month: 5

# To get year from date
print('Year: ', my_date.year)
> Year: 2021

# To get day of the month
print('Day of Month:', my_date.day)
> Day of Month: 10

# to get day of the week (as a number, Monday = 0) from date
print('Day of Week (number): ', my_date.weekday())
> Day of Week (number): 0

# to get name of day from date
print('Day of Week (name): ', calendar.day_name[my_date.weekday()])
> Day of Week (name): Monday
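As a side note, the standard library offers two numbering schemes for weekdays, plus a formatting shortcut for the name. The sketch below uses a fixed sample date so the results are reproducible. The name printed by strftime("%A") depends on the active locale, so "Monday" is what you would see under an English or default C locale.

```python
from datetime import date

sample = date(2021, 5, 10)  # a Monday

# weekday(): Monday is 0, Sunday is 6
print(sample.weekday())     # 0

# isoweekday(): Monday is 1, Sunday is 7 (ISO 8601 numbering)
print(sample.isoweekday())  # 1

# Day name without the calendar module (locale-dependent)
print(sample.strftime("%A"))
```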
How to deal with Timezones
Date and time objects can be broadly divided into two categories, namely 'aware' and 'naive'. In simple words, if an object contains time zone information it is aware; otherwise, it is naive.
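A small sketch of the distinction follows. The helper is_aware is a hypothetical name introduced here for illustration; it follows the rule given in the Python documentation, which treats an object as aware when tzinfo is set and its utcoffset() does not return None.

```python
from datetime import datetime, timezone

def is_aware(dt):
    """Return True if dt carries usable time zone information."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None

naive = datetime(2021, 5, 10, 16, 28, 5)
aware = datetime(2021, 5, 10, 16, 28, 5, tzinfo=timezone.utc)

print(is_aware(naive))  # False
print(is_aware(aware))  # True

# Ordering a naive and an aware datetime is an error.
try:
    naive < aware
    mixed_comparison_failed = False
except TypeError:
    mixed_comparison_failed = True
print(mixed_comparison_failed)  # True
```

Mixing the two kinds in one comparison is a common source of bugs, so it usually pays to pick one convention per project.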
datetime and time objects have an optional attribute called tzinfo that makes them aware; date objects are always naive. But tzinfo itself is an abstract class, so to use it directly you need to know exactly which methods to implement. The pytz module makes handling time zones easier. Using this module we can deal with daylight saving time in locations that use it, as well as cross-time-zone conversions.
from datetime import datetime
from pytz import timezone

# Create timezone UTC
utc = timezone('UTC')

# Localize date & time
loc = utc.localize(datetime(2020, 5, 10, 17, 41, 0))
print(loc)
> 2020-05-10 17:41:00+00:00

# Convert localized date & time into Asia/Dhaka timezone
dhaka = timezone("Asia/Dhaka")
print(loc.astimezone(dhaka))
> 2020-05-10 23:41:00+06:00

# Convert localized date & time into Europe/Berlin timezone
berlin = timezone('Europe/Berlin')
print(loc.astimezone(berlin))
> 2020-05-10 19:41:00+02:00
The localize() method of a pytz time zone attaches that time zone to a naive datetime object. The astimezone() method of a datetime object converts it from its present time zone to another specified time zone.
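If pytz is not available, the same kind of conversion can be sketched with the standard library alone, using datetime.timezone with a fixed offset. Note the assumption: the +06:00 offset is hard-coded here, so unlike a named zone such as Asia/Dhaka it knows nothing about daylight saving time or historical offset changes.

```python
from datetime import datetime, timedelta, timezone

# An aware datetime in UTC.
utc_moment = datetime(2020, 5, 10, 17, 41, 0, tzinfo=timezone.utc)

# A fixed-offset time zone six hours ahead of UTC (assumed offset).
plus_six = timezone(timedelta(hours=6))

# Same instant, shown on a clock six hours ahead.
local_moment = utc_moment.astimezone(plus_six)
print(local_moment)                # 2020-05-10 23:41:00+06:00

# Aware datetimes compare by instant, not by wall-clock reading.
print(local_moment == utc_moment)  # True
```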
Time difference or Timespan
Sometimes while programming we need to find the remaining time for some task, or specify some kind of time span. This is where timedelta objects come into play. In most scientific jargon, delta means the difference between two things. We can use this to add or subtract dates and times from each other. Here’s how it’s done:
from datetime import timedelta

# get current time
now = datetime.now()
print("Today's date & time: ", str(now))
> Today's date & time: 2021-05-10 12:56:08.979894

# add 365 days to current date
future_date_after_one_year = now + timedelta(days = 365)
print('Date & time after one year: ', future_date_after_one_year)
> Date & time after one year: 2022-05-10 12:56:08.979894

# subtract 7 days from current date
seven_days_ago = now - timedelta(days = 7)
print('Date & time seven days ago: ', seven_days_ago)
> Date & time seven days ago: 2021-05-03 12:56:08.979894
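Subtracting one datetime from another also yields a timedelta, which covers the "remaining time" case mentioned above. The sketch below uses fixed, made-up dates (the names deadline and start are illustrative) so the numbers are reproducible. It also shows that adding 365 days is not always the same as adding one calendar year.

```python
from datetime import datetime, timedelta

start = datetime(2021, 5, 10, 12, 0, 0)
deadline = datetime(2021, 12, 31, 23, 59, 59)

# datetime - datetime gives a timedelta
remaining = deadline - start
print(remaining)                  # 235 days, 11:59:59
print(remaining.days)             # 235
print(remaining.total_seconds())  # 20347199.0

# 365 days after a date in a leap year lands one day short of the anniversary.
leap_start = datetime(2024, 2, 28)
print(leap_start + timedelta(days=365))  # 2025-02-27 00:00:00
```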
Date Formatting using strftime() & strptime()
datetime, date, and time objects all support the strftime() method, which converts an object to a string in an explicit format. The reverse is done using the datetime.strptime() method, which creates a datetime object from a string.
from datetime import datetime

date_str = "10 May, 2021"

# parse the string into a datetime object
date_obj = datetime.strptime(date_str, "%d %B, %Y")
print("Today's date is: ", date_obj)
> Today's date is: 2021-05-10 00:00:00
Now let’s see an example of strftime():
# current date and time
now = datetime.now()

# format time in HH:MM:SS
time = now.strftime("%H:%M:%S")
print("Time:", time)
> Time: 14:05:55

# format date
date_time = now.strftime("%m/%d/%Y, %H:%M:%S")
print("Date and Time:",date_time)
> Date and Time: 05/10/2021, 14:05:55
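Real-world data often mixes several date formats, and strptime() raises ValueError when a string does not match the given format. One possible way to cope, sketched below, is to try a list of candidate formats in turn. The helper parse_date and its default format list are hypothetical and would need adjusting to the formats your data actually uses.

```python
from datetime import datetime

def parse_date(text, formats=("%Y-%m-%d", "%d/%m/%Y")):
    """Try each candidate format in order; return the first successful parse."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f"no known format matches {text!r}")

print(parse_date("2021-05-10"))  # 2021-05-10 00:00:00
print(parse_date("10/05/2021"))  # 2021-05-10 00:00:00

# Round trip: format a datetime and parse it back with the same format string.
stamp = datetime(2021, 5, 10, 14, 5, 55)
text = stamp.strftime("%m/%d/%Y, %H:%M:%S")
print(text)                                                    # 05/10/2021, 14:05:55
print(datetime.strptime(text, "%m/%d/%Y, %H:%M:%S") == stamp)  # True
```

Keep in mind that a string like 10/05/2021 is ambiguous between day-first and month-first conventions, so the order of the candidate formats matters.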
While working with time data or time series, it is common to deal with timestamps. We can use datetime.timestamp() to convert a datetime object into Unix timestamp format for storage.
# get current date
now = datetime.now()

# convert current date into timestamp
timestamp = datetime.timestamp(now)
print("Date and Time :", now)
> Date and Time : 2021-05-10 14:17:54.739358
print("Timestamp:", timestamp)
> Timestamp: 1620656274.739358
Similarly, we can obtain a datetime object from a timestamp:
timestamp = 1620656274.739358

# convert timestamp to datetime object
date_obj = datetime.fromtimestamp(timestamp)
print("Today's date & time:", date_obj)
> Today's date & time: 2021-05-10 14:17:54.739358
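One caveat worth sketching: without a tz argument, fromtimestamp() interprets the timestamp in the local time zone of the machine running the code, so the printed result can differ between computers. Passing tz=timezone.utc gives the same answer everywhere. The whole-second timestamp below is the integer part of the value used above.

```python
from datetime import datetime, timezone

ts = 1620656274  # whole seconds since the Unix epoch

# Interpret the timestamp explicitly in UTC (same result on any machine).
utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)
print(utc_dt)  # 2021-05-10 14:17:54+00:00

# An aware datetime converts back to the same timestamp.
print(utc_dt.timestamp())  # 1620656274.0
```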
Pandas DateTime objects
Pandas can safely be called one of the pillars of any Data Science project, and it makes dealing with date and time objects much easier. While working with DataFrames, to_datetime() comes in handy for converting text and strings into pandas Timestamp objects, which behave like Python datetime objects.
import pandas as pd

# create date object using to_datetime() function
date = pd.to_datetime("10th of May, 2021")
print(date)
> 2021-05-10 00:00:00
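In a DataFrame, to_datetime() is more commonly applied to a whole column at once. The sketch below uses a made-up column named signup with sample values. It passes an explicit format, which avoids guessing, and then uses the .dt accessor to pull out date parts.

```python
import pandas as pd

# Hypothetical sample data: dates stored as text.
df = pd.DataFrame({"signup": ["2021-05-10", "2021-05-11", "2021-12-31"]})

# Convert the text column to datetime values using an explicit format.
df["signup"] = pd.to_datetime(df["signup"], format="%Y-%m-%d")

# The .dt accessor exposes datetime properties for the whole column.
df["weekday"] = df["signup"].dt.day_name()
df["month"] = df["signup"].dt.month

print(df)
```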
There is so much more that one can do with Pandas. It is not in the scope of this one single article to go in-depth with that. But here is an awesome blog by Analytics Vidhya where you can read some more about it.
I hope you had as great a time reading the article as I had writing it. Have a good day, Cheers!!