Churn Analytics in Telecommunications Company

Shweta Rawat 19 Jan, 2023


What is Churn Analytics? And how do telecommunication companies effectively use this analysis in day-to-day activities? Learn from the industry expert Sakshi Gujral who will take you through all the essential details and give you some tips on improving churn analytics results when used practically. 

About the Speaker: Sakshi Gujral is a Data Scientist at Concentrix and is pursuing her Ph.D. at IIIT-Delhi. She is a GATE Scholar and UGC-NET qualifier, and an alumna of the Defence Research and Development Organisation. She has 5 years of industry experience at companies such as TCS and Genpact, solving data science problems in finance, health care, and telecommunications. Sakshi has done research in machine learning, NLP, and the Internet of Things.

Connect with Sakshi on Linkedin.

Table of Contents

  1. Churn Analytics: Telecommunication Industries
  2. What is Churn Analytics?
  3. Using Data Science and Analytics to Analyze the Churn
  4. Extensive Exploratory Data Analysis
  5. Dataset Discussion
  6. Hands-on Python Notebook
  7. Conclusion

Churn Analytics: Telecommunication Industries

Customer retention is a key measure that businesses use to assess their growth and the effectiveness of their strategies. In this DataHour, Sakshi discusses the factors that cause business to degrade due to churn, especially in telcos.

What is Churn Analytics?

To put it in simple terms, suppose a boy named Rahul used to buy his groceries from retail stores but now orders them from online stores. Rahul has shifted from retail stores to online stores, so Rahul is “churn” for the retail shops. When a person who uses a service from company X stops using it and shifts to another service provider, company Y, that person is churn for company X. The reasons could be more benefits, better options, more accessible customer care, and much more. Churn analytics helps identify these reasons.

When a company faces a high churn rate, its revenue eventually decreases. This also affects the company's share price and erodes its brand value, which can in turn lead to layoffs.

In the past few years, you may have noticed millions of people shifting to a particular telecommunication company because it offered free data and calling services. With easy online applications and doorstep SIM delivery, people frequently port to another service provider for a better experience.

Hence, it is necessary to monitor your churn rate and adjust your business strategy before going bankrupt.

A few sectors where churn analytics has a large impact are telecommunications, gaming, local shops, restaurants, banks, and shopping malls.
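To make "churn rate" concrete, here is a minimal sketch of one common way to compute it: customers lost during a period divided by customers at the start of that period. The function name and the numbers are hypothetical and are not taken from Sakshi's notebook.

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Return the fraction of customers lost during a period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start

# Hypothetical numbers: 10,000 subscribers at the start of the month,
# of which 250 ported out during the month.
rate = churn_rate(10_000, 250)
print(f"Monthly churn rate: {rate:.1%}")  # Monthly churn rate: 2.5%
```

Tracking this number month over month is what lets a company notice a problem early.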

Using Data Science and Analytics to Analyze the Churn

Now we will understand this problem from a Data Science and analytics perspective.

  1. Data set procurement and understanding: First, we need the complete data in digitized form, containing all the features that will help analyze churn. For this, we will use Python code with a standard data set, which will give you an overview of how your churn data should look, especially for the telecommunication sector.
  2. Data enrichment and preparation: In real-world scenarios, you almost always receive the data in a very messy form. So first, you need to enrich it and prepare it so that it is easy to understand.
  3. Exploratory Data Analysis: Analysis means finding the hidden trends within the data.
  4. Handling imbalance in the data set: Often, we see an imbalance in the data set; it may occur because a particular customer class or group is over-represented.
  5. Performing modeling: We will use machine learning and deep learning models for better understanding.
  6. Evaluating and analyzing results: At the end, we will evaluate our findings from the above processes.

Data-Set Description: The data set we will use in this project is the “IBM Telco Churn Dataset.” It has 33 columns that describe the characteristics of clients of a fictional telecommunication company. The churn column (response variable) indicates whether the customer left in the last few months: class ‘NO’ marks clients who have not left the company, and class ‘YES’ marks clients who have.

Below is the Python notebook that Sakshi prepared for this project. Here is the Telco Churn Excel file; you can see 33 columns.

Below is a picture of all the columns by name. We will go through them separately. All these parameters will help us understand churn.


Now the target column for us is “Churn Label,” as shown in the screenshot below. It is either YES or NO, as described before.
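As a small illustration of working with the target column, the sketch below counts the two classes and encodes them as 0/1. The label values are a made-up sample, not rows from the actual data set, and the variable names are assumptions; in a notebook you would typically read the column from the Excel file instead.

```python
from collections import Counter

# Hypothetical sample of the "Churn Label" column (not real dataset rows).
churn_labels = ["No", "No", "Yes", "No", "No", "Yes", "No", "No", "No", "No"]

# Count each class and compute its share of the sample.
counts = Counter(churn_labels)
total = len(churn_labels)
shares = {label: count / total for label, count in counts.items()}

# Encode the target as 0/1 so that models can consume it.
y = [1 if label == "Yes" else 0 for label in churn_labels]

print(counts)
print(shares)
```

Looking at the class shares first is a quick way to spot an imbalanced target before any modeling.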


In the pie chart below, we can clearly see that the data set is imbalanced: the “NO” class is far larger than the “YES” class. So it is important to handle the class imbalance here. We use SMOTE, the ‘Synthetic Minority Oversampling Technique’, which helps handle imbalanced data sets by generating synthetic samples of the minority class. So, from the original data set, we have generated a few more samples.
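The idea behind SMOTE can be sketched in a few lines: pick a minority-class sample, pick one of its nearest minority-class neighbours, and create a new point somewhere on the line between them. The version below is a simplified standard-library illustration, not the notebook's code; the function name `smote`, its parameters, and the sample rows are assumptions. In practice, a library implementation such as `SMOTE` from the imbalanced-learn package is the usual choice.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Sort the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class rows: [tenure in months, monthly charges].
churners = [[2.0, 70.5], [5.0, 89.1], [1.0, 99.7], [8.0, 74.4]]
new_rows = smote(churners, n_new=6)
print(len(new_rows), "synthetic rows generated")
```

Because every synthetic row lies between two real minority rows, the new samples stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only.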


Extensive Exploratory Data Analysis

In the Python notebook, the hidden trends already present in the data help us decide which kind of machine learning model to apply at a later stage.

Below are the points that we are going to look at in this project.


Now we will start enriching the data so that we can apply machine learning models. The data contains a combination of categorical and float columns, which we need to convert into a format that machine learning algorithms can process easily. After pre-processing, we computed summary statistics such as the mean and standard deviation. The last 2 columns, ‘CLTV’ (customer lifetime value) and ‘Churn Reason’, are very important. On the ‘Churn Reason’ text data, I have applied NLP to find the trends that make customers move to another company. We can also analyze other companies’ data and design our product so that customers won’t leave in the future.
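One common way to make categorical columns digestible for a model is one-hot encoding, and summary statistics such as the mean and standard deviation are a one-liner once the numeric column is isolated. The sketch below shows both with the standard library only; the rows, the helper `one_hot`, and the choice of encoding are assumptions for illustration, since the article does not say which encoding the notebook uses.

```python
from statistics import mean, stdev

# Hypothetical rows mixing a categorical and a float column.
rows = [
    {"Contract": "Month-to-month", "Monthly Charges": 70.0},
    {"Contract": "One year", "Monthly Charges": 50.0},
    {"Contract": "Two year", "Monthly Charges": 30.0},
    {"Contract": "Month-to-month", "Monthly Charges": 90.0},
]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(new_row)
    return encoded

encoded = one_hot(rows, "Contract")
charges = [row["Monthly Charges"] for row in rows]

print(encoded[0])
print(f"mean={mean(charges):.2f}, std={stdev(charges):.2f}")
```

With pandas, the same steps are usually done with `get_dummies` and `describe`.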


Below is the correlation matrix; highly correlated variable pairs have coefficients close to 1. For example, ‘Total Charges’ shows a coefficient of 0.93, which Sakshi reads as a sign that customers stay for a longer period when service charges are pocket-friendly.
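For readers who want to see what a correlation coefficient measures, here is a minimal sketch of the Pearson coefficient for one pair of columns. The values are invented, and pairing tenure with total charges is an assumption made only for illustration; the article does not state which pair produced 0.93.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical values, not taken from the dataset.
tenure_months = [1, 12, 24, 36, 60]
total_charges = [70.0, 850.0, 1700.0, 2400.0, 4300.0]

r = pearson(tenure_months, total_charges)
print(f"correlation: {r:.2f}")
```

A value near 1 means the two columns rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.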


With the code below, we study the distribution of each variable, which is very important in any machine learning problem. It helps us decide which algorithm to use later.


Now we need to separate the numerical and categorical values. Below you can clearly see churn broken down by gender. We can conclude that churn does not differ much by gender.
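A breakdown like this boils down to computing the churn rate within each category. The sketch below does that for a made-up set of customers; the helper `churn_rate_by` and the records are hypothetical, and the equal rates are constructed to mirror the "no real difference" finding rather than taken from the data.

```python
from collections import defaultdict

def churn_rate_by(records, key):
    """Return the share of churned customers within each value of `key`."""
    totals, churned = defaultdict(int), defaultdict(int)
    for rec in records:
        totals[rec[key]] += 1
        churned[rec[key]] += rec["churn"]
    return {group: churned[group] / totals[group] for group in totals}

# Hypothetical customers; churn is 1 if the customer left, else 0.
customers = [
    {"gender": "Female", "churn": 1},
    {"gender": "Female", "churn": 0},
    {"gender": "Female", "churn": 0},
    {"gender": "Female", "churn": 0},
    {"gender": "Male", "churn": 1},
    {"gender": "Male", "churn": 0},
    {"gender": "Male", "churn": 0},
    {"gender": "Male", "churn": 0},
]

rates = churn_rate_by(customers, "gender")
print(rates)  # {'Female': 0.25, 'Male': 0.25}
```

The same helper works for any categorical column, such as contract type or payment method.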


In the chart below, we can see that customers on longer-term plans show better retention. For them, the churn rate is much lower.


Below is an exciting chart showing that as tenure increases, the chance of churn decreases.
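One simple way to check a tenure trend numerically is to group customers into tenure bands and compare the churn rate per band. In the sketch below, the band edges and the customer data are arbitrary assumptions chosen to illustrate the pattern, not values from the notebook.

```python
from collections import defaultdict

def tenure_band(months):
    """Map a tenure in months to a coarse band (band edges are arbitrary)."""
    if months <= 12:
        return "0-12"
    if months <= 24:
        return "13-24"
    if months <= 48:
        return "25-48"
    return "49+"

# Hypothetical (tenure in months, churned flag) pairs, not real dataset rows.
customers = [(3, 1), (8, 1), (10, 0), (18, 1), (20, 0),
             (30, 0), (40, 0), (45, 1), (60, 0), (72, 0)]

totals, churned = defaultdict(int), defaultdict(int)
for months, flag in customers:
    band = tenure_band(months)
    totals[band] += 1
    churned[band] += flag

band_rates = {band: churned[band] / totals[band] for band in totals}
for band, rate in band_rates.items():
    print(f"{band:>5} months: {rate:.0%} churn")
```

If the rates fall from one band to the next, as they do in this toy sample, the data supports the claim that longer-tenured customers churn less.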


Through all these graphs, we are trying to figure out which factors drive higher churn rates and which do not, so that we can work on the loopholes and reduce churn. For example, we can advise telecom companies to offer longer plans at a discount so that the churn rate decreases.

For the last column, ‘Churn Reason,’ we apply some NLP to the customer texts and display them as word clouds. You can see all the reasons that contribute to churn.
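A word cloud is essentially a picture of word frequencies, so the underlying step can be sketched as tokenizing the text, dropping stopwords, and counting. The reason strings and the tiny stopword list below are invented for illustration; the article does not describe the exact NLP steps used.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "of", "to", "and", "was", "by", "too"}

# Hypothetical churn-reason texts, not real dataset rows.
reasons = [
    "Competitor offered higher download speeds",
    "Attitude of support person",
    "Competitor made better offer",
    "Price too high",
]

def word_frequencies(texts):
    """Count lowercase words across texts, ignoring stopwords."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        words.extend(t for t in tokens if t not in STOPWORDS)
    return Counter(words)

freq = word_frequencies(reasons)
print(freq.most_common(3))
```

The resulting counts are what a word-cloud library would scale into font sizes.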


The graph shows the relationship between churn score with zip code, latitude, longitude, tenure, etc.

Results: The sketchbook plot below shows the distribution of 0 and 1 labels on test data.
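Beyond plotting the label distribution, predictions on test data are often summarized with a confusion matrix and metrics such as precision and recall. The article does not say which metrics the notebook reports, so the sketch below is only an assumption about how such an evaluation could look, with made-up labels where 1 means churn.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical test labels and model predictions.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)   # share of predicted churners who really churned
recall = tp / (tp + fn)      # share of actual churners the model caught
accuracy = (tp + tn) / len(y_true)

print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

For an imbalanced target like churn, recall on the churn class is usually more informative than accuracy alone.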


By applying analytics and acting on the insights, telcos can improve their plans and reduce churn. Other sectors such as hotels, retail shops, and shopping malls can use the same method to keep their customers from churning.

Below are our takeaways from the above analysis.

  1. Customers who pay by electronic check churn the most.
  2. Month-to-month customers are more likely to churn because they can leave without contract terms.
  3. Customers without online security or tech support show the highest churn.
  4. Non-senior citizens are the highest churners.

Here is another interesting article: Bank Customer Churn Prediction Using Machine Learning.

The media shown in this article is not owned by Analytics Vidhya and is used from the presenter’s presentation.
