How to create animated GIF images for data visualization using gganimate (in R)?

guest_blog 26 Jul, 2020 • 5 min read

Introduction

Data visualization is probably the most important, and typically the least talked-about, area of data science.

I say that because how you create data stories and visualizations has a huge impact on how your customers look at your work. Ultimately, data science is not only about how complicated and sophisticated your models are. It is about solving problems using data-based insights. And in order to implement these solutions, your stakeholders need to understand what you are proposing.

One of the challenges in creating effective visualizations is to create images that speak for themselves. This article shows one way to do so, using animated GIF (Graphics Interchange Format) images. This is particularly helpful when you want to tell time- or flow-based stories. Using animation, you can plot comparable data over time for a specific set of parameters. In other words, it becomes easy to see how a parameter grows over time.

Let me show this with an example.

 

Example – GDP vs. Life expectancy over time

Let us say you want to show how GDP and life expectancy have changed for various continents / countries over time. What do you think is the best way to represent this relationship?

You can think of multiple options like:

  • Creating a 3D plot with GDP, life expectancy and time on 3 axes, and drawing a line for each continent / country. The problem is that the human eye is really bad at interpreting 3D visualizations in 2D, especially when there is a lot of data. So, this option would not work.
  • Creating 2 plots side by side – one showing GDP over time and the other life expectancy over time. While these are 2-dimensional plots, we have left a lot for the user to interpret. The person needs to pick a country, follow its movement on each plot and then correlate the two. Again, I would not ask this of my stakeholders.

Now, let us look at the same data as an animated plot in a .gif file:

The recent development of the gganimate package has made this possible and easy. By the end of this article, you will be able to make your own .gif file and create your own customised frames to compare different parameters on a global or local scale.

 

Pre-requisites

Please install the following packages:

  • ggmap
  • gganimate
  • dplyr
  • animation

In addition to the above R packages, you will also need the ImageMagick software on your system. You can download and install it from ImageMagick.
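Before running the code, it can help to confirm that everything is in place. The following is a minimal sketch in base R, assuming the package names listed above plus plyr and ggplot2 (which the code later in this article also loads). The `magick` and `convert` executable names are the ones ImageMagick usually installs, but this is an assumption and may differ on your system; on Windows, `convert` can also match an unrelated system tool.

```r
# Packages named in the list above, plus plyr and ggplot2, which the code also loads
needed <- c("ggmap", "gganimate", "dplyr", "animation", "plyr", "ggplot2")

# requireNamespace() returns FALSE, instead of raising an error, when a package is absent
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  message("Install these first: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}

# Sys.which() returns "" for an executable that is not on the PATH
imagemagick <- Sys.which(c("magick", "convert"))
if (all(imagemagick == "")) {
  message("ImageMagick was not found on the PATH")
}
```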

 

Get the Data

This article builds a .gif file from earthquake data for 1965-2016. Plotting global seismic activity year by year is more informative than a static view of all the values on the map. The earthquake data set is available on Kaggle.
It contains data on global seismic activity from 1965 to 2016. Please visit the above link and scroll down to get the .csv file.

 

Earthquakes of magnitude 7 and above on the Richter scale, 1965-2016

The dataset has been filtered so that only earthquakes of magnitude 7 or above on the Richter scale are considered for the study.

 

Data Manipulation

From the .csv file we have selected only a few parameters for the sake of simplicity.

  • Date
  • Time
  • Latitude
  • Longitude
  • Type is the type of seismic activity
  • Depth is the depth of the earthquake's focus below the surface
  • Magnitude is the reading on the Richter scale
  • ID is the event ID of the seismic activity
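To make the column list concrete, here is a small sketch that reads a made-up CSV with these columns and keeps only the strong events. The three rows are invented for illustration and are not taken from the Kaggle file; only the column names follow the list above.

```r
# Three invented rows using the column names listed above (not real Kaggle data)
csv_text <- "Date,Time,Latitude,Longitude,Type,Depth,Magnitude,ID
02-01-1965,13:44:18,19.2,145.6,Earthquake,131.6,6.0,EV001
04-02-1965,05:01:22,51.3,178.7,Earthquake,30.3,8.7,EV002
11-03-2011,05:46:24,38.3,142.4,Earthquake,29.0,9.1,EV003"

# read.csv() accepts literal text, which is handy for small experiments
toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Keep only events of magnitude 7 or above
strong <- toy[toy$Magnitude >= 7, ]

str(strong)
```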

We are all set to start coding in R. I have used the RStudio environment. You are free to use any environment you prefer.

 

R Code

## Read the dataset and load the necessary packages
library(plyr)   # load plyr before dplyr so that dplyr's verbs are not masked
library(dplyr)
library(ggmap)
library(ggplot2)
library(gganimate)
EQ <- read.csv("eq.csv", stringsAsFactors = FALSE)
names(EQ)
## Select only the rows with magnitude greater than or equal to 7
EQ <- EQ %>% filter(Magnitude >= 7)

Split the Date into day, month and year

This is done in order to get the frame, which is important for the plot. In other words, the core of the approach is to treat frame (as in, the time point within an animation) as another dimension, just like x, y, size, color, and so on. Thus, a variable in your data can be mapped to frame just as others are mapped to x or y.
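One way to picture "frame as a dimension" is to build the frames by hand. The sketch below uses base R and an invented four-row data set. With cumulative frames, each frame holds every event up to and including that year, which is what `cumulative = TRUE` asks for in the plotting code further down. The `toy` data and the `frames` name are illustrative only.

```r
# Invented events; only the Year column matters for the framing logic
toy <- data.frame(
  Year      = c(1965, 1965, 1966, 1968),
  Magnitude = c(7.1, 7.4, 8.0, 7.2)
)

# One frame per distinct year, in chronological order
years <- sort(unique(toy$Year))

# Cumulative framing: the frame for year y contains all rows with Year <= y
frames <- lapply(years, function(y) toy[toy$Year <= y, ])
names(frames) <- years

# Number of points drawn in each frame
sapply(frames, nrow)
```

A non-cumulative animation would instead use `toy[toy$Year == y, ]`, so each frame would show only that year's events.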

## Convert the dates to character in order to split the column into "dd", "mm" and "yy" columns
EQ$Date <- as.character(EQ$Date)

## Split each date on "-" and store the pieces in a list
date_parts <- strsplit(EQ$Date, "-")

## Convert the list into a data frame (ldply() comes from plyr, loaded above)
EQ_Date1 <- ldply(date_parts)
colnames(EQ_Date1) <- c("Day", "Month", "Year")

## Column bind with the main data frame
EQ <- cbind(EQ, EQ_Date1)
names(EQ)
## Convert Year to numeric
EQ$Year <- as.numeric(EQ$Year)
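The split above only yields three columns when every date uses "-" as its separator; with any other separator, strsplit() returns a single piece per date and the colnames() assignment fails. As an alternative sketch, base R's date parser can extract the year directly. The "%d-%m-%Y" format string is an assumption based on the Day, Month, Year column order used above; adjust it to match how dates appear in your copy of the file.

```r
# Invented date strings; the last one deliberately uses a different layout
dates <- c("02-01-1965", "04-02-1965", "11-03-2011", "1965/01/05")

# as.Date() returns NA for strings that do not match the format,
# which makes badly formatted rows easy to find
parsed <- as.Date(dates, format = "%d-%m-%Y")

# Extract the year as a number
year <- as.numeric(format(parsed, "%Y"))

year
which(is.na(parsed))  # positions of dates that need attention
```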

## Get the world map for the plot (map_data() comes from ggplot2, loaded above)
world <- map_data("world")

## Remove the Antarctica region from the world map
world <- world[world$region != "Antarctica", ]
map <- ggplot() +
  geom_map(data = world, map = world,
           aes(x = long, y = lat, map_id = region),
           color = "#333300", fill = "#663300")

## Plot the points on the world map
p <- map +
  geom_point(data = EQ,
             aes(x = Longitude, y = Latitude,
                 frame = Year,        # one animation frame per year
                 cumulative = TRUE),  # keep earlier years' points on screen
             alpha = 0.3, size = 2.5, color = "#336600",
             position = position_jitter(width = 0.1)) +
  labs(title = "Earthquakes of magnitude 7 and above on the Richter scale") +
  theme_void()

## Render the .gif with gganimate(), the rendering function of the original (pre-1.0) gganimate interface.
## gganimate 1.0 and later replaced the frame aesthetic and gganimate() with transition_*() and animate().
gganimate(p)

Earthquake
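The frame aesthetic and gganimate() call above belong to the original gganimate interface. In gganimate 1.0 and later, the same idea is expressed with a transition_*() function added to the plot. The following is a rough sketch of an equivalent under that newer interface, not the author's code: it uses invented data, leaves out the map layer for brevity, and skips itself if ggplot2 or gganimate is not installed.

```r
# Invented events standing in for the filtered earthquake data
toy <- data.frame(
  Longitude = c(145.6, 178.7, 142.4, -72.9),
  Latitude  = c(19.2, 51.3, 38.3, -36.1),
  Year      = c(1965, 1965, 2010, 2011)
)

have_pkgs <- requireNamespace("ggplot2", quietly = TRUE) &&
  requireNamespace("gganimate", quietly = TRUE)

if (have_pkgs) {
  library(ggplot2)
  library(gganimate)

  p_new <- ggplot(toy, aes(x = Longitude, y = Latitude)) +
    geom_point(alpha = 0.3, size = 2.5, color = "#336600") +
    # {current_frame} is filled in by transition_manual() for each frame
    labs(title = "Year: {current_frame}") +
    theme_void() +
    # One frame per Year; cumulative = TRUE keeps earlier years visible
    transition_manual(Year, cumulative = TRUE)

  # Rendering needs a renderer such as the gifski package:
  # anim <- animate(p_new, fps = 5)
  # anim_save("earthquakes.gif", anim)
}
```

In this interface the playback speed is set through the fps argument of animate() rather than through the animation package.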

 

Speed up the .gif using the animation package

As we can see, the plot covers many years, from 1965 to 2016. To speed up the visualization, we can use the animation package and shorten the delay between frames with ani.options():

library(animation)
ani.options(interval = 0.15)  # seconds between frames
gganimate(p)

Earthquake – 1.5x speed
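As a quick check on what the interval setting means: the total running time is the number of frames multiplied by the interval. The sketch below assumes one frame per year from 1965 to 2016. The 1-second comparison value is, to my knowledge, the animation package's default interval, so treat it as an assumption.

```r
# One frame per year, assuming every year from 1965 to 2016 appears in the data
n_frames <- length(1965:2016)

interval_fast <- 0.15  # seconds per frame, as set with ani.options() above
interval_slow <- 1     # assumed default interval, for comparison

n_frames * interval_fast  # total running time in seconds
n_frames * interval_slow
```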

 

Conclusion

This article was an introductory tutorial to the world of animated maps. Readers can try this and apply the same approach in other projects. Some examples are:

  • The same technique can be used to compare weather heat maps across a nation over time.
  • Floods or other natural disasters in a particular location over a period of time.
  • The growth of a city's metro network, drawn using Delaunay triangles. Please see the interesting article posted by Page Piccinini on R-bloggers, Metro Systems over Time, or access it directly from her official site, Page Piccinini.

Hope you found the article useful. If you have any questions, please feel free to ask in the comments below.

Aritra Chatterjee is a professional in the field of Data Science and Operations Management with more than 5 years of experience. He aspires to develop skills in the fields of Automation, Data Science and Machine Learning.

This post was received as part of our blogging competition – The Mightiest Pen. Check out other competitions here.

Responses From Readers


Kartik Patnaik 19 Jun, 2017

Awsome! Aritra, Great Work! Keep going..

Credendina Draco 19 Jun, 2017

gganimate package is not available for R version 3.4.0. How can I get it to work?

Vineet Kapoor 19 Jun, 2017

great blog , learning blog

Mun Arthur 19 Jun, 2017

Wow!

Mahesh 20 Jun, 2017

I am getting following error.Could you please help me with it.I cannot find ImageMagick with convert = "convert" but I can find it from the Registry Hive: C:\Program Files\ImageMagick-6.9.8-Q8 Executing:

Sangamesh K S 29 Jun, 2017

Interesting.. Thank you :)

Lila 02 Aug, 2017

Very nice work... I´m trying to replicate it but when I´m trying to run this: ## Convert the list into dataframe library(plyr) EQ_Date1<-ldply(list) colnames(EQ_Date1) colnames(EQ_Date1) <- c("Day","Month","Year") Error in `colnames<-`(`*tmp*`, value = c("Day", "Month", "Year")) : 'names' attribute [3] must be the same length as the vector [1] How can I solve that? Thanks ...
