See the data. Show the visual. Tell the story. Engage the audience.
Tableau is one of the most popular Data Visualization tools used by Data Science and Business Intelligence professionals today. It enables you to create insightful and impactful visualizations in an interactive and colorful way.
Its use is not limited to creating traditional graphs and charts. You can also use it to mine actionable insights, thanks to the many features and customizations it offers.
Tableau is known for its ease of use, and building an insightful dashboard like the one below takes only a few clicks:
In this article, we will look at a few advanced graphs that go beyond the drag and drop feature. We will create calculations to dive deeper into our data to extract insights. We will also look at how R can be integrated and used with Tableau.
This article assumes that you have a fair amount of experience with Tableau, including basic chart creation, calculations, and parameters. If you don’t, I would recommend reading the following articles first and then heading back here:
Almost all Tableau users are familiar with the elementary graphs, such as those shown in the introductory dashboard. Such charts can be easily made using the ‘Show Me’ feature of Tableau. But since this article is meant for advanced users, we are going to move beyond ‘Show Me’ and explore graphs that require some extra computations.
First, let’s take a quick look at what we are going to be making in the next few sections. Below is some basic analysis of the Sales and Profit of our Superstore. Simple graphs like those in the dashboard would serve the same purpose, but I think you would agree that there is something exciting and captivating about the grandeur of these charts.
Before we begin, have a look at Hans Rosling’s World Economics Representation visualization. Hit play, and see the magic unfold.
Interested in making one of your own now? If you have already started worrying about animation, don’t! What you saw is called a Motion Chart. With it, you can watch how your data changes over time.
So let’s start by downloading the Superstore dataset which can be found here.
By now making trend lines like the following should be easy for you:
But what we are first going to learn in this section is how to make the below trend lines in motion:
So let’s get started!
Suppose you want to explore the Sales of the various segments of the Superstore (for an entire year). One way to do this is the following:
An alternative is the following:
Although the Line chart manages to show the difference in Sales between Segments, the Bump Chart (in the image above) gives a clearer and more concise picture of the same outcome.
Such charts are mostly used to understand how the popularity of a particular product is changing over the years.
Let’s try and make one of our own now:
The chart that you will get won’t look like the chart in the dashboard because it lacks the Labels. Let’s remedy that quickly, with the help of a Dual Axis:
You see Rank and Rank (2) in the Marks Pane? We are going to use these to create those circled Labels.
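The ranking that a bump chart plots can be sketched outside Tableau as well. The following is a minimal Python illustration, not the Tableau calculation itself: the sales figures are hypothetical (they are not taken from the Superstore dataset), and `rank_by_period` is a helper name made up for this example. It simply gives rank 1 to the segment with the highest sales in each month and does not handle ties.

```python
from collections import defaultdict

# Hypothetical monthly sales per segment (illustrative numbers only).
sales = [
    ("2014-01", "Consumer", 500.0),
    ("2014-01", "Corporate", 300.0),
    ("2014-01", "Home Office", 400.0),
    ("2014-02", "Consumer", 200.0),
    ("2014-02", "Corporate", 450.0),
    ("2014-02", "Home Office", 100.0),
]


def rank_by_period(rows):
    """Return {period: {segment: rank}}, with rank 1 for the highest sales."""
    by_period = defaultdict(list)
    for period, segment, amount in rows:
        by_period[period].append((segment, amount))

    ranks = {}
    for period, entries in by_period.items():
        # Sort segments from highest to lowest sales within the period.
        ordered = sorted(entries, key=lambda entry: entry[1], reverse=True)
        ranks[period] = {
            segment: position
            for position, (segment, _) in enumerate(ordered, start=1)
        }
    return ranks


ranks = rank_by_period(sales)
for period in sorted(ranks):
    print(period, ranks[period])
```

Plotting these ranks per month, one line per segment, gives the crossing lines of a bump chart; the circled Labels are the same rank values drawn a second time on the dual axis.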
A donut chart is yet another representation of an elementary chart. To put it candidly, it’s a pie chart with a hole in the middle, but it helps put more emphasis on the various segments, as you can see below:
Let’s understand the difference as we create this.
You must have understood by now that all the above charts, although different in their final looks, were derived from the core graphs of the ‘Show Me’ feature. But wait, it’s not over yet. I have more to show you.
A waterfall chart derives its name from its analogous orientation and flow. Here we have plotted the Running Sales of the Superstore over its years, and you can see the two small red areas in the middle of 2013 and the beginning of 2014, indicating that the Sales actually dipped, and by how much.
This implies that such charts are used to analyze the cumulative effect of a Measure, and see how it increases and decreases as a whole. To understand this better, let’s visualize it.
A waterfall chart is a derivative of a Line Chart, so we will begin with this graph:
Note: Here the X-axis is Order Date (in Month-Year format and converted to Discrete). And the Y-axis is Profit.
The calculated field was used to fill in the space in the Gantt Chart. A negative value in Profit would extend the bar downwards, whereas a positive one would extend it upwards.
The length of each small bar in the chart represents the amount of change in Profit from one month to the next.
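The arithmetic behind these bars can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly profit values are hypothetical, and the sizing field is assumed to be the negated Profit (a common way to build waterfall charts on top of a Gantt mark), so that each bar reaches back from the current running total to the previous one. `bar_span` is a helper name invented for this sketch.

```python
from itertools import accumulate

# Hypothetical monthly profit values (illustrative only).
months = ["2013-05", "2013-06", "2013-07", "2013-08"]
profit = [1200.0, -300.0, 450.0, -150.0]

# Running total of profit: the position of each Gantt mark on the Y-axis.
running_total = list(accumulate(profit))

# Assumed sizing field: the negated profit, so that each bar extends from
# the current running total back to the previous running total.
bar_size = [-p for p in profit]


def bar_span(total, size):
    """Return the (low, high) interval on the Y-axis covered by one bar."""
    low, high = sorted((total, total + size))
    return low, high


spans = [bar_span(t, s) for t, s in zip(running_total, bar_size)]

for month, p, span in zip(months, profit, spans):
    direction = "up" if p >= 0 else "down"
    print(month, direction, span)
```

Each bar's height equals the absolute change for that month, which is why a month with negative Profit shows up as a short bar stepping down from the previous level.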
The graph that you will get could be very easily represented in the form of a Bar Chart as well. Do note that I have reversed the colors here, to make the anomalies stand out:
But I am sure you would agree that using a Waterfall chart was a more intuitive way of representing the data, especially to see the changes in Measures such as Sales and Profit over the years.
Below I have visualized a popular 80-20 principle of data analytics. If you have not heard of it, let me try and explain it with our example. It is often observed that the majority of the sales of a Superstore come from a select few products.
One cannot expect bread and eggs to have the same sales figures as cakes, right? This is known as the 80-20 principle (also called the Pareto principle), meaning that roughly 80% of the Sales come from 20% of the Products. In our Superstore, this principle can be observed in the chart below, where most of the sales are generated by Phones and Chairs:
Quite a popular visualization, Pareto charts are often used in Risk Management to determine the most common problems that are having the most negative impact on a project; but as we will see, they can have other applications too.
Let’s see how it’s done:
One thing I like about Tableau is that it’s not just a tool meant to create pretty graphs with mere drag and drop actions. With the release of Tableau 8.1 in 2013 came a plethora of new functionalities.
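Before building it in Tableau, it may help to see the calculation a Pareto chart relies on. The sketch below is a plain Python illustration with hypothetical sales totals (not the actual Superstore figures), and `pareto_table` is a name invented for this example. It sorts sub-categories by sales in descending order, then reports the cumulative share of products against the cumulative share of sales.

```python
from itertools import accumulate

# Hypothetical sales totals per sub-category (illustrative only).
sales_by_subcategory = {
    "Binders": 80.0,
    "Phones": 330.0,
    "Tables": 120.0,
    "Chairs": 320.0,
    "Storage": 150.0,
}


def pareto_table(totals):
    """Return rows of (name, cumulative % of items, cumulative % of sales)."""
    # Largest contributors first, as on the X-axis of a Pareto chart.
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    grand_total = sum(amount for _, amount in ordered)
    running = accumulate(amount for _, amount in ordered)
    count = len(ordered)
    return [
        (name, 100.0 * position / count, 100.0 * cumulative / grand_total)
        for position, ((name, _), cumulative)
        in enumerate(zip(ordered, running), start=1)
    ]


table = pareto_table(sales_by_subcategory)
for name, pct_items, pct_sales in table:
    print(f"{name:<8} {pct_items:5.1f}% of items  {pct_sales:5.1f}% of sales")
```

Reading down the rows shows how quickly the cumulative sales share climbs for the first few sub-categories, which is the pattern the chart makes visible.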
The introduction of R, to enable making richer and dynamic visualizations, was one of the predominant features. R can be used with Tableau for techniques like Clustering, Prediction and Forecasting, to name a few.
I wanted to start the exploration of R and Tableau through Clustering, so I used the ultra popular Iris Dataset. It contains different features to distinguish between 3 types of flowers, namely Virginica, Setosa and Versicolor. As you can see in the below image, the R integration quite easily creates clusters of these 3 species:
Interested in making this yourself? First let’s go through the basics and the installation process, before delving into the visualization!
The following depicts the flow of control between Tableau and R to make this integration possible:
R scripts are written in Tableau as Table Calculations and sent to the Rserve package of R, which carries out the necessary computations and returns the result to Tableau.
Note: To properly understand and use this feature, you need some knowledge of R and its syntax. For that, you may refer to the following tutorial:
Learn Data Science in R from scratch
Now let’s look at the steps for this integration:
install.packages("Rserve"); library("Rserve"); Rserve()
So now that you have the proper ingredients ready, let’s start cooking!
As was shown in the image above, you make use of Tableau’s Table Calculation to communicate with R:
If you scroll down the list of functions, you will come across the following four:
Tableau automatically understands that the script is meant for R when these functions (SCRIPT_BOOL, SCRIPT_INT, SCRIPT_REAL and SCRIPT_STR) are included in the calculation area.
I hope that your initial excitement of making the clusters is still there! Let’s proceed.
What we have above is a Scatter Plot, which shows the data points grouped into 3 distinct clusters.
Let’s try doing the same with R now, and compare the two visualizations that we will get. We will be using the most common clustering algorithm, K-Means:
SCRIPT_INT( 'result <- kmeans(data.frame(.arg1,.arg2,.arg3,.arg4), 3);result$cluster;', SUM([Petal length]), SUM([Petal width]),SUM([Sepal length]),SUM([Sepal width]))
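To make the idea behind the script more concrete, here is a minimal sketch of k-means in plain Python. It is not what R runs: R's `kmeans` uses the Hartigan-Wong algorithm by default and picks random starting centers, whereas this sketch uses the simpler Lloyd's algorithm with fixed starting centers so the result is reproducible. The measurements are hypothetical iris-like values using two features only, and the function and variable names are made up for this example.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centroids) with 1-based labels."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: math.dist(p, centroids[k])) + 1
            for p in points
        ]
        if new_labels == labels:
            break  # No point changed cluster, so the algorithm has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k + 1]
            if members:  # Keep the old centroid if a cluster ends up empty.
                centroids[k] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical (petal length, petal width) pairs, three per species.
points = [
    (1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
    (4.5, 1.5), (4.7, 1.4), (4.4, 1.3),
    (6.0, 2.5), (5.9, 2.1), (6.1, 2.3),
]
# Fixed starting centers (an assumption made for reproducibility).
start = [points[0], points[3], points[6]]

labels, centers = kmeans(points, start)
print(labels)
```

The returned labels play the same role as `result$cluster` in the Tableau calculation: one integer per data point, which Tableau can then place on Color to shade the clusters.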
Although there are a few overlaps, the two visualisations do appear to be quite accurate.
This was a small glimpse of the potential of integrating R with Tableau. Its applications are limitless, and I am sure you have already started to think of the different ways you can use it.
It would be naive of me to say that this is all there is to Tableau. As new versions roll in, so do new functionalities.
Not only that, people are always experimenting and exploring Tableau, and coming up with new visuals. There are multiple blogs where people publish their experiments with data too. Do check them out.
You can also find new and gorgeous visualizations weekly on Tableau’s official Gallery page. I would definitely advise you to keep referring to these posts, creating your own visuals, and sharing them with the community.
Stay creative and all the best on your journey as a Data Explorer!