Harshit Ahluwalia — September 1, 2021
Beginner Data Visualization

Overview

  • What is Data Visualization
  • How to choose the right chart for your data visualization

By the end of this article, you will know how to choose the right chart for your data visualization.

 

Introduction

I love data visualization. The sheer amount of knowledge it conveys to the audience in such a limited space is astonishing. It makes it easy to get your message across and lets the audience grasp the insights quickly. I’ve worked with many data visualization tools such as Power BI, Tableau, and MS Excel. These are brilliant tools for data cleaning, data preprocessing, and data visualization in analytics projects. They offer many varieties of graphs, such as bar graphs, line charts, scatter plots, dual-axis charts, sparklines, waterfall charts, pie charts, area charts, column charts, and many more. In this article, I want to answer the eternal question: “How do you decide which chart to choose for your problem or your project?” The choice can be overwhelming if you are new to this, and choosing the right chart is very important.

If you are new to the data visualization field and excited to learn more, make sure you check out the FREE “MS Excel” and “Tableau” courses. You will learn the basic functionalities and how to create different charts. It’s a perfect starting point.

 

Table of contents

  1. Importance of Data Visualization
  2. The objective of your visual
  3. Choosing the right visualization for your data
    1. Comparison charts
    2. Distribution charts
    3. The breakup of a whole charts
    4. Relationship charts
    5. Trend charts

 

Importance of Data Visualization

Data visualization is the graphical representation of data, and it plays a vital role in making information easier to understand.

Look at the data that is displayed below:

Picture 1: Doesn’t make any sense

Picture 2: This makes sense (because of visualization)

Which of the two pictures lets you grasp the insights more easily?

Of course, it is the second picture because of the graphical representation of the data.

I’ve listed down some benefits of visualization:

  1. It helps us convey the right message to the audience through visuals.
  2. It helps us find outliers in our data.
  3. It helps business leaders make informed decisions.
  4. It helps us understand how the data is distributed over time.

The objective of your visual

Before making the visualization, it is best to ask yourself what the audience will be looking for in your chart. Understand the requirements and preferences of your viewer. Know their background. Do they have enough time for a detailed visualization? How aware are they of the context of the visualization? What additional information are they looking for? Are they aware of the graphs being used? And so on. Your viewer’s information needs should be your guide in creating effective and compelling data visualizations.

 

Choose the right visualization for your data

There are a tremendous number of charts available, and choosing the right one is paramount, especially when you’re presenting to a senior leader. It is not as easy as it sounds: an incorrect representation can send the wrong message or lead the audience to a wrong decision, and the message you had in mind when creating the chart might never reach them. Your focus should be on conveying the right message to your audience as clearly as possible. Now let me take you through the types of messages that we usually send out when we’re creating impactful visualizations in business.

These are the types of messages that you usually work with. Maybe you want to show a comparison, for example region-wise sales; the distribution of the data; the breakup of a whole into its parts; the relationship between variables; or a trend, for example sales trends.

Let’s look at all these one by one and see what kinds of charts we can use to convey the right message.
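As a quick reference, the mapping that the following sections walk through can be sketched as a small lookup table. This is only an illustration: the names `CHARTS_BY_MESSAGE` and `suggest_charts` are made up for this sketch, and the lists simply repeat the charts named in this article.

```python
# Hypothetical lookup table: message type -> charts suggested in this article.
CHARTS_BY_MESSAGE = {
    "comparison": ["column chart", "bar chart", "line chart", "scatter plot"],
    "distribution": ["histogram", "box plot", "KDE plot"],
    "breakup of a whole": [
        "pie chart",
        "donut chart",
        "stacked column chart",
        "stacked bar chart",
    ],
    "relationship": ["scatter plot", "line chart"],
    "trend": ["line chart", "area chart", "column chart"],
}


def suggest_charts(message: str) -> list[str]:
    """Return the candidate charts for a message type (case-insensitive)."""
    key = message.strip().lower()
    if key not in CHARTS_BY_MESSAGE:
        known = ", ".join(CHARTS_BY_MESSAGE)
        raise ValueError(f"Unknown message type {message!r}; expected one of: {known}")
    return CHARTS_BY_MESSAGE[key]


print(suggest_charts("Trend"))
```

The table only narrows the choice down; which candidate fits best still depends on your audience and your data.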

1) Comparison Chart

A comparison chart compares one value with another, for example region-wise sales or the economy rates of bowlers in cricket. We can use the following charts for comparison.

  • Column charts
    • It is used to compare values across multiple categories.
    • Here, the categories appear horizontally (X-axis) and the values vertically (Y-axis).
    • Column charts can also show parts of a whole across different categories, in absolute values as well as in relative terms. This is where stacked column charts and 100% stacked column charts come in.
  • Bar charts

    • If you are familiar with column charts, you will find that working with bar charts is very similar.
    • The only difference is that in a bar chart, the values are on the X-axis and the categories on the Y-axis.
    • We typically use a bar chart to show values across categories when the category text is long or when the values are durations.
    • Stacked bar charts are used to compare parts of a whole (relative and absolute) and to compare change over categories or time.
  • Line charts
    • It is one of the most popular charts and is widely used in most industries.
    • Whether you’re analyzing sales data, tracking year-on-year profit, or looking at how a person’s salary increased over the last year, line charts are very helpful.
    • The line chart is used to show trends over time or categories.
    • Here, the categories appear horizontally (X-axis) and the values vertically (Y-axis).
  • Scatter plots
    • An XY (scatter) chart uses numerical values along both axes.
    • Scatter plots are useful for showing a correlation between two variables that may not be easy to see from the data alone.
    • It is used for displaying and comparing numerical values, such as scientific or statistical data.
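To make the bar chart layout concrete (categories on the Y-axis, values along the X-axis), here is a minimal text-only sketch. The function name `text_bar_chart` and the sales figures are invented for illustration; a real tool such as Excel, Tableau, or Power BI would of course draw proper bars.

```python
def text_bar_chart(values: dict[str, float], width: int = 30) -> list[str]:
    """Render a horizontal bar chart as text: one category per row,
    bar length proportional to the value (scaled to the largest value)."""
    if not values:
        return []
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        bar_length = round(value / largest * width) if largest > 0 else 0
        lines.append(f"{label.ljust(label_width)} | {'#' * bar_length} {value}")
    return lines


# Made-up region-wise sales; long category names are where bar charts shine.
sales = {"North America": 120, "Asia Pacific": 90, "Europe": 60}
for line in text_bar_chart(sales, width=20):
    print(line)
```

Because each category gets its own row, long labels stay readable, which is the main reason to prefer a bar chart over a column chart in that case.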

2) Distribution charts

These charts show how data values are spread over categories or a continuous range, for example the distribution of bugs found during 10 weeks of a software testing phase. We can use the following charts to visualize the distribution of the data.

  • Histogram

    • It is used to graph the frequency of values across a distribution. It is a very useful chart in the analytics world, and many useful insights can be drawn from it.
    • Visually, all the bars are touching each other with no space between them.
  • Box plot
    • It is also known as a box-and-whisker plot.
    • The line in the middle of the box is the median value. This means that 50% of the data are above the median and 50% are below it.
    • Medians are useful because they are not swayed by outliers the way the mean is.
    • The box spans the first quartile to the third quartile: 25% of the data lie between the first quartile and the median and 25% between the median and the third quartile, so 50% of the data is within the box.
    • With this plot, we can easily spot outliers and see the distribution of the data.

  • KDE Plot
    • KDE stands for Kernel Density Estimation.
    • A KDE plot visualizes the distribution of observations in a dataset, analogous to a histogram, but as a smooth curve.
    • Relative to a histogram, KDE can produce a plot that is less cluttered and more interpretable, especially when drawing multiple distributions.
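The numbers behind these three charts can be sketched with the Python standard library alone. Please treat this as an illustration under assumptions, not as what any particular tool does: the weekly bug counts are made up, the 1.5 × IQR rule for flagging outliers is a common convention that this article does not prescribe, the quartiles use the "inclusive" method of `statistics.quantiles`, and the KDE bandwidth is picked by hand.

```python
import math
import statistics


def histogram_counts(data, bin_count):
    """Count how many values fall into each of bin_count equal-width bins."""
    low, high = min(data), max(data)
    width = (high - low) / bin_count or 1  # fall back to 1 if all values are equal
    counts = [0] * bin_count
    for x in data:
        index = min(int((x - low) / width), bin_count - 1)  # maximum goes in last bin
        counts[index] += 1
    return counts


def box_plot_summary(data):
    """Quartiles plus outliers flagged with the (assumed) 1.5 * IQR rule."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


def gaussian_kde(data, bandwidth):
    """Return a function that estimates the density at x with a Gaussian kernel."""
    n = len(data)

    def density(x):
        total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    return density


# Made-up data: bugs found in each of 10 weeks of testing (week 10 is unusual).
weekly_bugs = [12, 15, 11, 14, 13, 16, 12, 15, 14, 40]

print(histogram_counts(weekly_bugs, bin_count=3))
print(box_plot_summary(weekly_bugs))
print(statistics.mean(weekly_bugs), statistics.median(weekly_bugs))
density = gaussian_kde(weekly_bugs, bandwidth=2.0)
print(density(14), density(30))
```

Comparing the mean and the median of this data shows why the median is the more robust summary: the single unusual week pulls the mean upward, while the median barely moves.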

3) The breakup of a whole chart

These charts are used to analyze how various parts make up the whole. They are handy in many scenarios, for example analyzing revenue contribution by region or the sides of the ground on which a batsman scored runs. The charts used for this are listed below.

  • Pie Chart
    • If you want to represent your categorical data as part of the whole, then you should use a pie chart.
    • Each slice represents the percentage that the given category occupies out of the whole.
    • It’s better to use a pie chart when you have fewer than five categories.
  • Donut Chart
    • It is a variant of a pie chart, with a hole in the center.
    • It displays the categories as arcs rather than slices.
  • Stacked Column Chart
    • A stacked column chart stacks multiple data series in each column. In its 100% variant, each series is shown as a relative percentage, and the total (cumulative) of each stacked column always equals 100%.
    • The 100% stacked column chart can show the part-to-whole proportions over time, for example, the proportion of quarterly sales per region or the proportion of monthly mortgage payment that goes toward interest vs. principal.
  • Stacked Bar Chart
    • A stacked bar chart is the horizontal counterpart; in its 100% variant, it shows the relative percentage of multiple data series in each stacked bar.
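Pie slices and 100% stacked columns both rest on the same arithmetic: each part divided by the total. A minimal sketch, with a hypothetical helper name `percentage_shares` and made-up revenue figures:

```python
def percentage_shares(values: dict[str, float]) -> dict[str, float]:
    """Convert absolute values into percentages of the whole."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("The total must be positive to compute shares.")
    return {label: 100 * value / total for label, value in values.items()}


# Made-up revenue by region, in the same currency unit.
revenue = {"North": 50, "South": 30, "East": 20}
for region, share in percentage_shares(revenue).items():
    print(f"{region}: {share:.1f}%")
```

For a 100% stacked column chart over time, you would apply the same function to each period separately, so that every column sums to 100%.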

 

4) Relationship charts

Relationship charts are helpful when we want to know how different variables relate to each other. The charts used to visualize the relationship between variables are listed below.

  • Scatter Plot
    • A scatter chart uses numerical values along both axes.
    • It uses dots to represent the values of two different numerical variables.
    • The position of each dot on the horizontal and vertical axes signifies the values of a particular data point.
    • It is useful for showing a correlation between the data points that may not be easy to see from the data alone.
    • It is used for displaying and comparing numerical values, such as scientific or statistical data.
  • Line Chart
    • As discussed above, a line chart is also used to find the relationship between the two variables.
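A scatter plot lets you see a correlation; if you also want a number for it, one common choice is the Pearson correlation coefficient. The article does not prescribe a particular measure, so take this as one possible sketch, with made-up advertising and sales figures.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two sequences of the same length (at least 2 values).")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("Correlation is undefined when a variable is constant.")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up data: advertising spend vs. sales for five months.
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 58, 83, 99]
print(round(pearson_correlation(ad_spend, sales), 3))
```

A value close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or none; the scatter plot is still worth drawing, because it also reveals non-linear patterns that this number misses.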

5) Trend charts

These charts are used to visualize how values change over time; such data is also known as “time series” data. Examples include an over-by-over run rate tracker and the hourly temperature variation during a day. Listed below are the charts used to represent time series data.

  • Line Chart
    • The best way to visualize trend data is with a line chart.
    • Line charts are also used to see the trends in various domains.
  • Area Chart
    • It is used to see the magnitude of the values.
    • It shows the relative importance of values over time.
    • It is similar to a line chart, but because the area between lines is filled in, the area chart emphasizes the magnitude of values more than the line chart does.
  • Column Chart
    • As discussed above, a column chart can also show trends of values over time or categories.
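To give a feel for the kind of series a trend chart plots, here is a small sketch of the over-by-over run rate example. It assumes that "run rate" means total runs scored so far divided by overs bowled so far; the function name and the runs per over are made up.

```python
from itertools import accumulate


def run_rate_by_over(runs_per_over):
    """Cumulative run rate after each over: total runs so far / overs bowled."""
    totals = accumulate(runs_per_over)
    return [total / over for over, total in enumerate(totals, start=1)]


# Made-up runs scored in each of the first five overs.
runs = [6, 4, 8, 2, 10]
for over, rate in enumerate(run_rate_by_over(runs), start=1):
    print(f"Over {over}: run rate {rate:.2f}")
```

Plotting the over number on the X-axis and the run rate on the Y-axis gives the line chart described above.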

 

End Notes

With this, we’ve reached the end of the article. To keep it short and concise, I’ve listed some of the basic plots that we can use in different scenarios. Let me know in the comments if you want me to cover more visualization concepts in the future.
