This article was published as a part of the Data Science Blogathon.
A time series is a sequence of data points gathered over time and ordered chronologically, for example at hourly, daily, monthly, or yearly intervals. Examples include sensor readings from industrial processes, weather data such as precipitation, rainfall, and temperature, crop growth in agriculture, and a patient's medical records over a period. Time series analysis looks for hidden patterns such as trends or seasonality and draws insight from them.
The patterns analyzed in a time series are:
Trends represent a long-term increase or decrease in the data over the period.
Seasonality is a characteristic of a time series in which the data experiences regular and predictable changes that recur every calendar year. Any predictable fluctuation or pattern that recurs or repeats over one year is said to be seasonal. For instance, the holiday season affects the rate of tourism at different locations, and retail sales show similar seasonal patterns.
Cyclicity describes patterns that repeat, but not over a fixed period of time. Stock market values are cyclic.
Irregularity is the variation of observations in a time series that is unexpected and is usually unpredictable. Floods, fires, revolutions, epidemics, and strikes are some examples.
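To make these components concrete, here is a minimal sketch that builds a synthetic daily series from a trend, a yearly seasonal swing, and random irregularity. The additive form and every number in it are assumptions chosen for illustration, and the cyclic component is left out for brevity.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n_days = 3 * 365  # three years of daily values

# Trend: a slow, steady increase over the whole period.
trend = [0.01 * t for t in range(n_days)]
# Seasonality: a swing that repeats every 365 days.
seasonal = [5 * math.sin(2 * math.pi * t / 365) for t in range(n_days)]
# Irregularity: unpredictable day-to-day noise.
irregular = [random.gauss(0, 1) for _ in range(n_days)]

series = [a + b + c for a, b, c in zip(trend, seasonal, irregular)]

# Averaging over a full year cancels the seasonal swing,
# so the yearly means mostly reflect the trend.
yearly_means = [sum(series[y * 365:(y + 1) * 365]) / 365 for y in range(3)]
```

Comparing the entries of yearly_means shows the upward trend even though individual days are dominated by the seasonal swing and the noise.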
ML models are used to generate future values of the series, i.e., to forecast over the coming months or years. Thus, time series forecasting is the act of predicting the future by understanding the past.
What is a Heatmap?
A heatmap represents the values of a main variable of interest (like rainfall, temperature, or sensor data) across two axis variables as a grid of colored squares. Each cell's color indicates the value of the main variable for the corresponding cell range. Each cell reports a numeric value, as in a standard data table, but the value is accompanied by a color, with larger values associated with darker colors (in the following example, a darker color represents a higher temperature). By observing how the cell colors vary, we can see whether any pattern is present in the data.
A calendar heatmap uses colored cells, typically in a single base hue extended with its shades, tones, and tints, such as shades of blue from light to dark. It shows the relative number of events for each day in a calendar view. Days are arranged into columns by week and grouped by month and year. That lets you quickly recognize daily and weekly patterns.
Most of us have come across GitHub, where a user's profile shows their daily contributions. This is a common example of time-series data sampled by day and shown as a heatmap per calendar year. GitHub's contributions plot represents the number of contributions made by the user in past years. The color of each tile represents the number of contributions, as indicated by the color scale shown below the plot. From this heatmap we can detect daily patterns as well as monthly ones, such as elevated participation in March and April.
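The column-by-week arrangement described above can be sketched with the standard library alone. The helper name calendar_cell is hypothetical, and this is a simplified layout with Monday as the first row; a plotting library may number its weeks differently.

```python
from datetime import date


def calendar_cell(day: date) -> tuple[int, int]:
    """Return (week_column, weekday_row) for a date within its own year."""
    jan1 = date(day.year, 1, 1)
    day_of_year = day.timetuple().tm_yday  # 1 for 1 January
    # Offset by the weekday of 1 January so every column starts on a Monday.
    week_column = (day_of_year - 1 + jan1.weekday()) // 7
    weekday_row = day.weekday()  # Monday = 0 ... Sunday = 6
    return week_column, weekday_row


first = calendar_cell(date(2000, 1, 1))    # a Saturday in the first column
last = calendar_cell(date(2000, 12, 31))   # a Sunday in the last column
```

Each date maps to exactly one cell, and a year needs at most 54 columns, which is why a full year fits in one wide, short strip.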
Visualization is a great way to gain insight into data. When examining time-series data, it is essential to identify any seasonal or cyclic behavior in it. We will work with the calplot Python library to create the heatmap. calplot creates heatmaps from pandas time-series data.
In this example we will use temperature data, since climate is a familiar example of seasonality. We can expect the data to behave differently in specific months depending on the location.
Let’s dive into visualizing the time-series for average daily temperature for the year 2000.
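As a minimal sketch of this step, the snippet below builds a synthetic daily temperature series for 2000 and wraps the basic calplot.calplot call in a helper. The synthetic formula is a stand-in because the original dataset is not included here, and the helper name plot_year is hypothetical. calplot is installed separately, for example with pip install calplot.

```python
import math

import pandas as pd

# Synthetic stand-in for the article's dataset: one value per day of 2000,
# coldest around the new year and warmest in early July.
days = pd.date_range("2000-01-01", "2000-12-31", freq="D")
temps = pd.Series(
    [15 - 12 * math.cos(2 * math.pi * d.dayofyear / 366) for d in days],
    index=days,
    name="avg_temp_c",
)


def plot_year(series):
    """Draw one calendar heatmap row per year found in the series index."""
    import calplot  # imported here so the data preparation runs without it

    fig, axes = calplot.calplot(series)
    return fig, axes
```

Calling plot_year(temps) returns the matplotlib figure and axes, so the result can be shown with matplotlib's pyplot.show() or saved with fig.savefig.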
From the above heatmap, June to August show high temperatures compared to other months because of summer, and December shows low temperatures because it is winter. The heatmap makes the extreme low and high temperatures easy to spot. You may also notice a white block in February. Note that 2000 was a leap year, so February had 29 days; a white cell generally marks a date for which the plot has no value.
calplot separates each month of the year and the weekdays, which gives a clean visual. We can customize the visuals as needed by adding a custom color or style to the heatmap.
Here the colormap gradient makes it easy to relate the colors to the temperature, and the month edge lines are not drawn.
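A styling call along these lines might look like the sketch below. The choice of the coolwarm colormap is an assumption, since the article does not name the one it used, and plot_styled is a hypothetical helper. cmap and edgecolor are keyword arguments that calplot accepts.

```python
# Keyword arguments forwarded to calplot.calplot.
style_options = {
    "cmap": "coolwarm",  # any matplotlib colormap name; this one is an assumption
    "edgecolor": None,   # do not draw the border lines around each month
}


def plot_styled(series, **options):
    """Calendar heatmap with the given styling options applied."""
    import calplot  # imported lazily; install separately

    return calplot.calplot(series, **options)
```

With a daily pandas Series named temps, the call would be plot_styled(temps, **style_options).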
In the above heatmap, we included a title and made the year text color black.
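The title and year-label styling could be set as in this sketch. The title wording is made up for illustration and plot_with_title is a hypothetical helper. suptitle and yearlabel_kws are calplot.calplot arguments.

```python
# Keyword arguments forwarded to calplot.calplot.
title_options = {
    "suptitle": "Average daily temperature, 2000",  # figure title (illustrative wording)
    "yearlabel_kws": {"color": "black"},            # text properties of the year label
}


def plot_with_title(series, **options):
    """Calendar heatmap with a figure title and a restyled year label."""
    import calplot  # imported lazily; install separately

    return calplot.calplot(series, **options)
```

With a daily pandas Series named temps, the call would be plot_with_title(temps, **title_options).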
We can also display the values inside the cells and resize the image using figsize for a better visual. Here missing values are filled with '-'. However, this dataset has no missing data.
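A sketch of the options involved is shown below. The figure size and the number format are illustrative assumptions and plot_with_values is a hypothetical helper. textformat, textfiller, and figsize are arguments accepted by calplot.calplot.

```python
# Keyword arguments forwarded to calplot.calplot.
text_options = {
    "textformat": "{:.0f}",  # format string applied to each cell's value
    "textfiller": "-",       # text shown in cells that have no value
    "figsize": (16, 3),      # width and height in inches (illustrative)
}

# The format string rounds each value to a whole number for display.
example_label = text_options["textformat"].format(21.6)


def plot_with_values(series, **options):
    """Calendar heatmap with the value printed inside every cell."""
    import calplot  # imported lazily; install separately

    return calplot.calplot(series, **options)
```

A wide figure helps here because the printed numbers need more room than the colored cells alone.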
Visualizing the missing values in the heatmap
The white blocks above indicate missing values. When the values are displayed inside the squares, each missing value is replaced with '-', as discussed before.
By default, calplot aggregates the values for each day, so a day with no data can end up with a value of zero. Unless dropzero=False is set, calplot drops these zeros and treats the cells as empty, so they are shown with the filler '-'.
We can make the lines separating the months thicker using linewidth.
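To experiment with missing values, one option is to remove a few days from a complete series, as in the sketch below. The synthetic data, the chosen gap, and the helper name plot_with_gaps are all assumptions for illustration. The plot call leaves calplot's aggregation and dropzero settings at their defaults.

```python
import math

import pandas as pd

# Synthetic daily temperatures for 2000 (stand-in data).
days = pd.date_range("2000-01-01", "2000-12-31", freq="D")
temps = pd.Series(
    [15 - 12 * math.cos(2 * math.pi * d.dayofyear / 366) for d in days],
    index=days,
)

# Remove one week in March to simulate a sensor outage.
gap = pd.date_range("2000-03-06", "2000-03-12", freq="D")
temps_with_gap = temps.drop(gap)

# Reindexing to the full calendar makes the missing days explicit as NaN.
full_year = temps_with_gap.reindex(days)
missing_days = full_year[full_year.isna()].index


def plot_with_gaps(series):
    """Calendar heatmap that prints values and marks empty cells with '-'."""
    import calplot  # imported lazily; install separately

    return calplot.calplot(series, textformat="{:.0f}", textfiller="-")
```

Listing missing_days before plotting is a quick way to confirm which cells should appear blank in the heatmap.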
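The sketch below shows one way to set the line width, using calplot.yearplot to draw a single year on an existing matplotlib axes. The width of 3, the figure size, and the helper name plot_thick_lines are illustrative assumptions.

```python
LINE_WIDTH = 3  # illustrative value; larger numbers give thicker lines


def plot_thick_lines(series, year, linewidth=LINE_WIDTH):
    """Draw one year of the series with wider separator lines."""
    import calplot  # imported lazily; install separately
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(14, 2.5))
    calplot.yearplot(series, year=year, linewidth=linewidth, ax=ax)
    return fig, ax
```

With a daily pandas Series named temps, the call would be plot_thick_lines(temps, year=2000).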
Custom colormap in calplot
To make a custom color scale, create a list of hex colors, build a colormap from it, and pass that colormap to calplot. The colors are applied in the order given, from low values to high values. To reverse the gradient, reverse the list of hex values.
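One way to turn a list of hex colors into a gradient is matplotlib's LinearSegmentedColormap.from_list, sketched below. Both the palette and the use of this particular constructor are assumptions, since the article does not show how its colormap was built. plot_custom is a hypothetical helper.

```python
from matplotlib.colors import LinearSegmentedColormap, to_rgb

# Hypothetical palette, ordered from low (light) to high (dark) values.
hex_colors = ["#ffffcc", "#fd8d3c", "#800026"]

# Interpolate between the listed colors to get a smooth gradient.
custom_cmap = LinearSegmentedColormap.from_list("custom_temps", hex_colors)

# Reversing the list flips the gradient.
reversed_cmap = LinearSegmentedColormap.from_list("custom_temps_r", hex_colors[::-1])


def plot_custom(series, cmap=custom_cmap):
    """Calendar heatmap drawn with a custom colormap."""
    import calplot  # imported lazily; install separately

    return calplot.calplot(series, cmap=cmap)
```

Ordering the palette from light to dark keeps the usual reading that darker cells mean higher values.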
When to use a calendar map?
A calendar heatmap is useful for analyzing daily values or day-of-week patterns. If we want to view daily data for a whole year, calendar heatmaps are helpful.
I hope you had an enjoyable time reading about my work and my insights! Any suggestions and feedback are always welcome.
I’m Kavitha, a software developer and ML enthusiast fascinated by computer science and A.I.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.