6 Essential Data Visualization Python Libraries – Matplotlib, Seaborn, Bokeh, Altair, Plotly, GGplot

[email protected] 25 Jun, 2020
6 min read

Introduction

“Visualization gives you answers to questions you didn’t know you had.” – Ben Shneiderman

My day-to-day work as a Data Scientist requires a great deal of experimentation. That means I rely a lot on data visualization to explore the dataset I’m working on.

And I couldn’t relate more to Ben Shneiderman’s quote! Data visualization gives me answers to questions I hadn’t even considered before. After all, a picture is worth a thousand data points!

This naturally leads to the million-dollar question – which Python library should you use for data visualization? There are quite a few to choose from, and even seasoned data scientists can get lost in the sea of features that each one has to offer.

That’s why I wanted to write this article outlining the advantages and unique features of the different data visualization Python libraries. We will cover some of the most useful visualization libraries available in Python. Each of them has its own flair and is particularly well suited to a certain kind of visualization task.

So without further ado, let’s start!

The 6 Data Visualization Python Libraries We’ll Cover

  1. Matplotlib
  2. Seaborn
  3. Bokeh
  4. Altair
  5. Plotly
  6. ggplot

 

1. Matplotlib

Chances are you’ve already used matplotlib in your data science journey. From beginners in data science to experienced professionals building complex data visualizations, matplotlib is usually the default visualization Python library data scientists turn to.

Matplotlib is known for the great flexibility it provides as a 2-D plotting library in Python. If you have a MATLAB programming background, you’ll find Matplotlib’s pyplot interface very familiar. You’ll be off with your first visualization in no time at all!

 

Unique features of Matplotlib

Matplotlib can be used in many Python environments, including Python scripts, the Python and IPython shells, and Jupyter Notebooks. This is why it’s used not just by data scientists but also by researchers who need publication-quality graphs.

Matplotlib supports all the popular charts (plots, histograms, power spectra, bar charts, error charts, scatterplots, etc.) right out of the box. There are also extensions that you can use to create advanced visualizations such as 3-D plots.

What I personally like about matplotlib is that because it’s so flexible, it lets the user control aspects of the visualization at the most granular level, from a single line or dot in the graph to the entire chart. This means you can customize nearly every element of a figure.
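To make this granular control concrete, here is a minimal sketch that styles a single line, annotates a single point, and then adjusts the chart as a whole. The data, labels, and output filename are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up sample data, for illustration only
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

fig, ax = plt.subplots(figsize=(6, 4))

# Control a single line: color, width, dash style, and marker
(line,) = ax.plot(x, y, color="tab:red", linewidth=2, linestyle="--", marker="o")

# Control a single point by annotating it with an arrow
ax.annotate("last point", xy=(4, 16), xytext=(2, 14),
            arrowprops={"arrowstyle": "->"})

# Control the chart as a whole: title, axis labels, and grid
ax.set_title("Squares of small integers")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.grid(True, alpha=0.3)

# Hypothetical output filename
fig.savefig("matplotlib_example.png", dpi=150)
```

In a Jupyter Notebook you would typically drop the `matplotlib.use("Agg")` line and let the figure render inline.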

 

2. Seaborn

When I look at visualizations built with Seaborn, only one word comes to mind – beautiful! Seaborn is built on top of matplotlib and provides a simple, intuitive interface for building visualizations. When using Seaborn, you will also notice that many of the default settings in the plots work quite well right out of the box.

 

Unique features of Seaborn

The first unique feature of Seaborn is that it is designed so that you write far less code to achieve high-grade visualizations. Here is an example of this simplicity. Notice how we can create a complex visualization with just a single line of plotting code:

[Figure: a Seaborn relplot built with a single line of plotting code]

Source: Seaborn

The second useful feature of Seaborn is that it supports a plethora of advanced plots right out of the box, such as categorical plots (catplot), distribution plots with KDE (distplot, which newer Seaborn releases deprecate in favor of displot and histplot), and swarm plots. And of course, we saw one example of relplot above.

Now, because Seaborn is built on top of matplotlib, it is highly compatible with it. So that means when building visualizations, you can start with advanced plots that seaborn already supports and then customize them as much as you want with the help of matplotlib.
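The two ideas in this section, a single plotting call and follow-up customization through matplotlib, can be sketched together as follows. The small DataFrame, the column names, and the output filename are assumptions made for this example; they are not taken from the Seaborn gallery.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import pandas as pd
import seaborn as sns

# Small made-up dataset, for illustration only
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 1, 2, 3, 4],
    "score": [52, 60, 71, 80, 48, 55, 63, 70],
    "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
})

# One plotting call: a scatter plot colored and styled by group
grid = sns.relplot(data=df, x="hours", y="score", hue="group", style="group")

# relplot returns a FacetGrid; with no faceting, the single matplotlib Axes
# is available as grid.ax and can be customized like any other Axes
ax = grid.ax
ax.set_title("Score vs. hours studied")
ax.axhline(60, color="gray", linestyle=":", linewidth=1)

# Hypothetical output filename
grid.savefig("seaborn_example.png")
```

The design choice here is to let Seaborn handle the grouping, legend, and default styling, and to drop down to matplotlib only for the finishing touches.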

 

3. Bokeh

Bokeh is a library designed to generate visualizations for web interfaces and browsers. That is specifically what this visualization library targets.

You will also notice that the visualizations generated by Bokeh are interactive, which means you can convey information more intuitively through your plots.

 

Unique features of Bokeh

Bokeh supports visualizations like geospatial plots and network graphs right out of the box. If you want to show these visualizations in a browser, you can export them, and you can also use Bokeh directly from JavaScript!
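As a rough sketch of the export path described above, the snippet below builds a small interactive plot and writes it to a standalone HTML page that any browser can open. The data, the tool selection, and the output filename are assumptions chosen for illustration.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Made-up sample data, for illustration only
x = [1, 2, 3, 4, 5]
y = [6, 7, 2, 4, 5]

# The tools string enables interactive panning, zooming, and hover tooltips
p = figure(
    title="Simple interactive line",
    x_axis_label="x",
    y_axis_label="y",
    tools="pan,wheel_zoom,box_zoom,reset,hover",
)
p.line(x, y, line_width=2)
p.scatter(x, y, size=8)

# Render the plot as a standalone HTML document that loads BokehJS from a CDN
html = file_html(p, CDN, "Bokeh example")

# Hypothetical output filename
with open("bokeh_example.html", "w", encoding="utf-8") as f:
    f.write(html)
```

Opening the generated file in a browser should show the plot with its toolbar; no Python process needs to be running at that point.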

 

4. Altair

Altair is a declarative library for data visualization. Its principle is that rather than focusing on the code, you should focus on the visualization itself and write as little code as possible while still creating beautiful and intuitive plots. That’s right up my alley!

 

Unique features of Altair

Since Altair uses a declarative style to create plots, it is quick and easy to iterate through visualizations and experiments when using this library.

 

5. Plotly

The first thing that comes to my mind when I think about Plotly is interactivity! This data visualization library is by far my go-to library whenever I want to create visualizations that need to be highly interactive for the user.

Just check out this visualization created using Plotly:

[Figure: an interactive visualization created with Plotly]

 

Unique features of Plotly

Plotly is highly compatible with Jupyter Notebook and Web Browsers. This means whatever interactive plots you create can easily be shared in the same manner with your teammates or end-users.

I also want to point out that Plotly supports a wide gamut of plots, from basic chart types to Seaborn-like statistical plots, 3-D plots, map-based visualizations, and scientific plots. The list is endless!

Plotly’s plots also support animation. So, it’s a pretty useful library if you want to do storytelling through visualizations.

 

6. ggplot

ggplot is the Python version of R’s famous ggplot2 and the Grammar of Graphics language. If you have used it in R before, you will know just how simple it is to create plots using this library.

I personally love the flexibility of ggplot. We can easily wrangle data while building plots on the fly – a super useful concept!

 

Unique features of ggplot

ggplot is also a declarative-style library, like Altair, and it is tightly coupled with Pandas. This means you can easily build visualizations directly from your Pandas dataframe!

 

End Notes

In this article, we explored some of the must-know libraries for performing data visualization in Python. Each of these libraries is quite popular in its own right and shines in different scenarios.

I hope this article serves as a Rosetta Stone when you decide which library to use for your next project.

Do you think any other data visualization library should be on this list? Did you like the article? If yes, comment below!
