Amit Kulkarni — January 25, 2022

This article was published as a part of the Data Science Blogathon.

 

Dashboards in Python
Source: Designed by Freepik

Introduction

We live in a world where data is collected at every transaction: a cab ride, an online purchase (what was bought and how much), a reminder to get a vehicle serviced or to pay an insurance premium. Inboxes are flooded with targeted marketing in which companies offer a particular product at a specific price based on a model’s customer segmentation. With smartphones and mobile applications, collecting and processing data at an individual level has become that much easier. Gone are the days when companies said, "If only we had enough data to make a business decision."

These advancements in technology, coupled with the ease of procuring and processing data, bring with them a whole set of challenges. Working with the data itself consumes almost 65-75% of the time in data science projects, and making sense of the data is only possible if it is visualized. This has given rise to a plethora of tools and packages to help us navigate this hurdle. The other interesting thing to note is that visualizing data to extract useful information is not only a technical skill; a lot depends on the individual’s creative skills as well. Creative work is generally very subjective, and it can bring in preconceived notions and biases. This is evident when reports from different sources draw different perspectives and conclusions from the same data, not to mention the multiple rounds of reviews and revisions needed to generate a report.

The Need for Dashboards in Python

There are also scenarios where the business disagrees with certain reports or recommendations purely because they understand the business better than the model and the data on which it was built. So, how should a business make its decision when it has conflicting recommendations? A simple way is to put the power of data and visualization in the hands of business users, so they can view the data from a business perspective with minimal dependency on technical teams. This has given rise to dashboards, and tools like Power BI and Tableau have enabled businesses to make decisions on the fly. But these tools are not free: there is a premium associated with them, and it is sometimes charged per user.

In this blog, we will look into the explainerdashboard package for Python, which helps us build an interactive dashboard around a predictive model (performance metrics, feature contributions, what-if analysis) with a minimum number of lines of code. Needless to say, it is all free and open source.

Before we move further with the blog, let’s take a step back to understand how far and how fast the technology is evolving.

Last year, I built a dashboard/data app with the Plotly Dash package in Python. This was for a customer loyalty program. Every single plot, data point, color, layout, and interaction had to be defined and coded for the app to work. It has three files: app.py, which defines the functionality; dataProcess.py, which prepares data for the dashboard; and layout.py, which defines the layout. The application came out really well in the end, but it involved more than a decent amount of Python code. Read more on the blog.

GitHub code: Customer Loyalty Program

Furthermore, I built an R Shiny application that builds a model automatically based on the user's selections in the UI. The app was very basic: it has UI and server files with defined interactivity. Again, needless to say, everything in the app was coded end to end (read more on the blog). Based on user selections, it performed the following tasks:

  • Data showcase
  • Data summary & basic statistics
  • Let the user experiment with the train/test split and visualize plots such as correlations.
  • Build the model and view variable importance followed by a final prediction on test data.

GitHub code: Interactive modeling with Shiny

The need of dashboard

Source: Author

 

The two example apps that we saw above were built with Python (Plotly Dash) and R Shiny respectively. This quick flashback was needed so that we understand the skills required to build such dashboards and can appreciate the explainer dashboard that we will explore in the next section. We will achieve more with very few lines of code, with everything built in.

Let’s Get Started!

Install the library: Open your terminal and install the library with the command below

pip install explainerdashboard

We will build a random forest classifier model using a built-in dataset. Hence we will import the other dependencies as follows

from sklearn.ensemble import RandomForestClassifier
from explainerdashboard import ClassifierExplainer, ExplainerDashboard
from explainerdashboard.datasets import titanic_survive, feature_descriptions

Create our train/test dataset as below

X_train, y_train, X_test, y_test = titanic_survive()

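Next, we fit the random forest model on the training data and wrap it in an explainer object, which is what the dashboard is built from.

# Fit the classifier first; the hyperparameter values here are illustrative
model = RandomForestClassifier(n_estimators=50, max_depth=5).fit(X_train, y_train)

# Wrap the fitted model and the test data in an explainer
explainer = ClassifierExplainer(model, X_test, y_test, 
                               cats=['Sex', 'Deck', 'Embarked'],
                               descriptions=feature_descriptions,
                               labels=['Not survived', 'Survived'])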

Finally, we will pass the explainer object to ExplainerDashboard and run it to view the dashboard.

ExplainerDashboard(explainer).run()
 * Serving Flask app "explainerdashboard.dashboards" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:8050/ (Press CTRL+C to quit)

Now we can open http://0.0.0.0:8050/ (or http://localhost:8050/ on the same machine) in a browser to view the dashboard.
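Putting the steps above together, a minimal end-to-end sketch could look like the following. The helper names launch_dashboard and dashboard_url, the hyperparameter values n_estimators=50 and max_depth=5, and the choice of exposing the port as an argument are illustrative assumptions rather than anything the library prescribes. The explainerdashboard calls are the ones used above, plus the port argument of run(). The imports sit inside the function so the file can be loaded even before the packages are installed.

```python
def dashboard_url(port=8050, host="localhost"):
    """Build the address to open in a browser (hypothetical helper)."""
    return f"http://{host}:{port}/"


def launch_dashboard(port=8050):
    """Fit a model on the built-in Titanic data and serve the dashboard."""
    # Third-party imports are kept local to this function (an assumption made
    # for this sketch so that defining it does not require the packages).
    from sklearn.ensemble import RandomForestClassifier
    from explainerdashboard import ClassifierExplainer, ExplainerDashboard
    from explainerdashboard.datasets import titanic_survive, feature_descriptions

    # Load the train/test split that ships with the library
    X_train, y_train, X_test, y_test = titanic_survive()

    # Illustrative hyperparameters, not tuned values
    model = RandomForestClassifier(n_estimators=50, max_depth=5)
    model.fit(X_train, y_train)

    # Wrap the fitted model and test data, as in the snippet above
    explainer = ClassifierExplainer(
        model, X_test, y_test,
        cats=['Sex', 'Deck', 'Embarked'],
        descriptions=feature_descriptions,
        labels=['Not survived', 'Survived'],
    )

    print(f"Open {dashboard_url(port)} in a browser")
    # run() blocks until the server is stopped with CTRL+C
    ExplainerDashboard(explainer).run(port=port)


# Only the URL helper is exercised here; the server is not started on import.
print(dashboard_url(8051))
```

Calling launch_dashboard(8051) from a script or a Python shell would then serve the same dashboard on port 8051, which is handy when 8050 is already taken by another app.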

 

Model Explainer | Dashboards in Python

Source: Author

 

Model performance metrics and confusion matrix

Source: Author

Precision and classification Plot | Dashboards in Python
Source: Author
ROC AUC and PR AUC Plots | Dashboards in Python
Source: Author
Lift curve and cumulative precision | Dashboards in Python
Source: Author

All the plots and metrics that we saw above are generated automatically by the explainer dashboard, and they are neatly organized into tabs. This lets users look into areas of their interest, change the configuration, and draw their own conclusions.
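To make the metrics on the model performance tab less of a black box, here is a small sketch of how a confusion matrix and the accuracy, precision, and recall derived from it can be computed by hand. The labels below are made-up toy values, not results from the Titanic model, and the function names confusion_counts and summarize are hypothetical; the dashboard computes these figures for us.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted positive, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Derive accuracy, precision and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels for illustration only (1 = survived, 0 = not survived)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(y_true, y_pred))  # (tp, fp, fn, tn)
print(summarize(y_true, y_pred))
```

In the dashboard, moving the cutoff slider changes which predictions count as positive, which is why the confusion matrix and these metrics shift together.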

Switching off the tabs: We can also hide the tabs that we don’t need with the below setting.

ExplainerDashboard(explainer,
   importances=True,
   model_summary=False,
   contributions=True,
   whatif=True,
   shap_dependence=True,
   shap_interaction=False,
   decision_trees=True
).run()
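When the list of tabs to hide changes often, it can be easier to name only the hidden ones and generate the switches. The sketch below assumes the seven tab names used in the snippet above; the helper tab_switches and the ALL_TABS tuple are hypothetical conveniences and not part of explainerdashboard.

```python
# Tab names taken from the snippet above (assumed to be the full set we care about)
ALL_TABS = (
    "importances",
    "model_summary",
    "contributions",
    "whatif",
    "shap_dependence",
    "shap_interaction",
    "decision_trees",
)


def tab_switches(hidden):
    """Return keyword arguments that switch off the tabs named in `hidden`."""
    unknown = set(hidden) - set(ALL_TABS)
    if unknown:
        # Fail early on typos instead of silently showing every tab
        raise ValueError(f"Unknown tab name(s): {sorted(unknown)}")
    return {tab: tab not in hidden for tab in ALL_TABS}


switches = tab_switches(["model_summary", "shap_interaction"])
print(switches)

# Usage, assuming `explainer` was created as earlier in the article:
# ExplainerDashboard(explainer, **switches).run()
```

This produces the same True/False settings as the hand-written call, with a typo in a tab name raising an error instead of being passed along.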

One of the interesting sections is “What if…”, where users can change the input values for an individual record and instantly see how the prediction changes. This is very similar to what we implemented in the R Shiny app, and here we achieved more than that with only a few lines of code.

Closing Note on Dashboards in Python

Data visualization is needed because our brain is not equipped with the capacity to analyze and process huge volumes of structured/unstructured data, identify the trends and make meaningful sense out of it. Graphically representing data lets us interact with data and interpret the result in a way that makes business sense to us. The dashboard by itself doesn’t answer all of our questions but definitely helps us figure out answers in our own way.

Hope you liked the blog on dashboards in Python. Happy Learning!

You can connect with me – Linkedin &  Github

References

https://explainerdashboard.readthedocs.io/en/latest/

https://www.freepik.com/free-vector/site-stats-concept-illustration_7140739.htm#query=dashboard&position=3&from_view=search

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion. 
