A Detailed Study on COVID 19 Vaccinations Data

Akshita Chugh 22 Apr, 2022 • 6 min read

This article was published as a part of the Data Science Blogathon.

Introduction

The global battle against the COVID-19 pandemic can be won only if a large part of the world gets vaccinated against the SARS-CoV-2 virus. Vaccination rates in low-income countries have been considerably lower than elsewhere. In this blog, we study COVID-19 vaccination trends across the world using Python, and we aim to derive key insights from the data that can help policymakers adjust their policies.

Data

The country vaccinations data were downloaded from Kaggle and were last updated on March 8, 2022.

Link: https://www.kaggle.com/code/terencemao/covid-vaccination-rates/data

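The country vaccinations data contain the following columns: country, iso_code, date, total_vaccinations, people_vaccinated, people_fully_vaccinated, daily_vaccinations_raw, daily_vaccinations, total_vaccinations_per_hundred, people_vaccinated_per_hundred, people_fully_vaccinated_per_hundred, daily_vaccinations_per_million, vaccines, source_name, and source_website.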

Importing libraries in Python

This study uses Python's pandas library for data handling and the plotly, matplotlib, and seaborn libraries for data visualization.

import warnings

import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
import plotly.express as px
import plotly.figure_factory as ff
import plotly.graph_objects as go
from plotly.colors import n_colors
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
from plotly.subplots import make_subplots
from pywaffle import Waffle
from wordcloud import WordCloud, ImageColorGenerator

# Enable plotly rendering inside the notebook and silence library warnings
init_notebook_mode(connected=True)
warnings.filterwarnings("ignore")

Reading the Data File

The pandas library is used to read the CSV files in Python.

# Raw strings (r"...") keep the backslashes in Windows paths from being read as escape sequences
df_vaccination = pd.read_csv(r"C:\Users\ASUS\Downloads\archive\country_vaccinations.csv", parse_dates=['date'])
df_manufacture = pd.read_csv(r"C:\Users\ASUS\Downloads\archive\country_vaccinations_by_manufacturer.csv", parse_dates=['date'])
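The effect of parse_dates can be checked without the downloaded files. The sketch below reads a small in-memory CSV; the three rows and their values are made up for illustration and only the column names follow the Kaggle file.

```python
import io

import pandas as pd

# Hypothetical three-row sample; only the column names mirror the real file.
sample_csv = io.StringIO(
    "country,date,daily_vaccinations\n"
    "India,2021-01-16,100\n"
    "India,2021-01-17,200\n"
    "Brazil,2021-01-17,50\n"
)

# parse_dates converts the listed columns from text to datetime values,
# which is what later makes date-based plots and grouping possible.
sample = pd.read_csv(sample_csv, parse_dates=["date"])

print(sample.dtypes)
print(sample["date"].dt.year.unique())
```

Without parse_dates the date column would be read as plain strings, and the .dt accessor used above would not be available.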

Exploratory Data Analysis

DataFrame.info() is used to summarize a data frame in pandas. There are 81,976 rows and 15 columns in the df_vaccination dataset and 31,127 rows and 4 columns in the df_manufacture dataset. Both datasets contain missing values.
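As a complement to the info summary, missing values can also be counted per column directly. This is a minimal sketch on a made-up toy frame (the values are not from the dataset); the same two calls should work on df_vaccination.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; NaN marks days on which no total was reported.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "total_vaccinations": [10.0, np.nan, np.nan, 40.0],
    "daily_vaccinations": [10.0, 12.0, 20.0, 20.0],
})

# Number of missing entries in each column.
missing_count = toy.isna().sum()
# Share of missing entries in each column, in percent.
missing_pct = toy.isna().mean() * 100

print(pd.DataFrame({"missing": missing_count, "percent": missing_pct}))
```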

df_vaccination.info()

RangeIndex: 81976 entries, 0 to 81975
Data columns (total 15 columns):
 #   Column                               Non-Null Count  Dtype
---  ------                               --------------  -----
 0   country                              81976 non-null  object
 1   iso_code                             81976 non-null  object
 2   date                                 81976 non-null  datetime64[ns]
 3   total_vaccinations                   41873 non-null  float64
 4   people_vaccinated                    39638 non-null  float64
 5   people_fully_vaccinated              37119 non-null  float64
 6   daily_vaccinations_raw               34033 non-null  float64
 7   daily_vaccinations                   81697 non-null  float64
 8   total_vaccinations_per_hundred       41873 non-null  float64
 9   people_vaccinated_per_hundred        39638 non-null  float64
 10  people_fully_vaccinated_per_hundred  37119 non-null  float64
 11  daily_vaccinations_per_million       81697 non-null  float64
 12  vaccines                             81976 non-null  object
 13  source_name                          81976 non-null  object
 14  source_website                       81976 non-null  object
dtypes: datetime64[ns](1), float64(9), object(5)
memory usage: 9.4+ MB

df_manufacture.info()

RangeIndex: 31127 entries, 0 to 31126
Data columns (total 4 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   location            31127 non-null  object
 1   date                31124 non-null  datetime64[ns]
 2   vaccine             31127 non-null  object
 3   total_vaccinations  31127 non-null  int64
dtypes: datetime64[ns](1), int64(1), object(2)
memory usage: 972.8+ KB

# Create a new data frame df with a limited set of columns: the highest fully vaccinated percentage per country
df = df_vaccination.groupby(["country"])['people_fully_vaccinated_per_hundred'].max().reset_index()
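Taking the per-country maximum works here on the assumption that people_fully_vaccinated_per_hundred is a cumulative figure, so its largest value is also the most recent one. The sketch below shows the same groupby on a made-up toy frame and illustrates that missing entries are skipped by max.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps (NaN) on days without a report.
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-02"]
    ),
    "people_fully_vaccinated_per_hundred": [1.0, np.nan, 3.5, np.nan, 0.4],
})

latest = (
    toy.groupby("country")["people_fully_vaccinated_per_hundred"]
    .max()          # NaN entries are ignored by default
    .reset_index()  # turn the country index back into a column
)
print(latest)
```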

Vaccination Trends Across the World

The African continent has the lowest percentage of fully immunized people compared with the other continents.

fig = px.choropleth(df,locations = 'country',locationmode = 'country names',color = 'people_fully_vaccinated_per_hundred',
                   title = 'people_fully_vaccinated %',hover_data= ['people_fully_vaccinated_per_hundred'])
fig.show()

Vaccine Schemes used Across the Countries

Across the world, various vaccine schemes were used for immunization. India used Covaxin and Oxford/AstraZeneca, while Russia used EpiVacCorona and Sputnik V. Pfizer, Oxford/AstraZeneca, and Moderna were used by Australia, Canada, and the UK. China used CanSino, Sinopharm, Sinovac, and ZF2001.

fig = px.choropleth(df_vaccination, locations = 'country',locationmode = 'country names', color = 'vaccines',
                   title = 'Vaccination by Country', height = 1000)
fig.update_layout({'legend_orientation':'h'})
fig.update_layout({'legend_title':'Vaccine scheme'})
fig.show()
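Each value in the vaccines column is a comma-separated scheme, so counting how many countries use an individual vaccine requires splitting it first. A minimal sketch on made-up rows follows; with the real data one would likely start from df_vaccination[["country", "vaccines"]].drop_duplicates(), and the exact separator is an assumption.

```python
import pandas as pd

# Hypothetical one-row-per-country frame; the strings mimic the
# comma-separated format of the vaccines column.
schemes = pd.DataFrame({
    "country": ["A", "B", "C"],
    "vaccines": [
        "Moderna, Pfizer/BioNTech",
        "Pfizer/BioNTech",
        "Covaxin, Oxford/AstraZeneca",
    ],
})

# Split each scheme into a list, then give every vaccine its own row.
per_vaccine = (
    schemes.assign(vaccine=schemes["vaccines"].str.split(", "))
    .explode("vaccine")
)

# Number of distinct countries using each individual vaccine.
countries_per_vaccine = (
    per_vaccine.groupby("vaccine")["country"]
    .nunique()
    .sort_values(ascending=False)
)
print(countries_per_vaccine)
```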

Average Daily Vaccination Availability Across the World

The average daily vaccination count (in millions) is highest in China, followed by India, the United States, and Brazil.

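# Average daily vaccinations per country
dfdailyvaccination = df_vaccination.groupby(["country"])['daily_vaccinations'].mean().reset_index()
fig = px.choropleth(dfdailyvaccination,locations = 'country',locationmode = 'country names',color = 'daily_vaccinations',
                   title = 'Average daily_vaccinations',hover_data= ['daily_vaccinations'])
fig.show()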
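A per-country average like the one plotted on this map can be computed with a groupby followed by mean. The sketch below uses made-up numbers purely to show the shape of the result.

```python
import pandas as pd

# Hypothetical daily counts for two countries.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "daily_vaccinations": [100.0, 300.0, 10.0, 30.0],
})

avg_daily = (
    toy.groupby("country")["daily_vaccinations"]
    .mean()                                        # average over all reported days
    .reset_index()
    .sort_values("daily_vaccinations", ascending=False)
)
print(avg_daily)
```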

Top 10 Countries with the Highest Vaccination Availability in Billions

China has administered the most vaccine doses, followed by India and the United States, which reflects their large populations. The number of doses can exceed a country's population because most immunization programs give each person two doses of a COVID-19 vaccine.

vaccine = df_vaccination.groupby(["country"])['total_vaccinations'].max().nlargest(10).reset_index()
vaccine.columns = ["country", "Total vaccinations"]
fig = px.bar(vaccine, x='country', y='Total vaccinations')
fig.show()
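The point about doses exceeding the population carries over to the per-hundred columns used in the next sections: a value above 100 means more doses were given than there are residents. A small worked example, with a hypothetical helper name and made-up figures:

```python
def doses_per_hundred(total_doses: float, population: float) -> float:
    """Doses administered per 100 residents."""
    return total_doses * 100 / population


# Hypothetical country: 1,000,000 residents and 1,800,000 doses administered.
rate = doses_per_hundred(1_800_000, 1_000_000)
print(rate)
```

With two doses for most residents, a rate between 100 and 200 is what one would expect for a well-covered population.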

Top 10 Countries with the Highest Vaccination Availability per Capita

df1 = df_vaccination.groupby(["country"])['total_vaccinations_per_hundred'].max().nlargest(10).reset_index()
df1.columns = ["country", "total_vaccinations_per_capita"]
fig = px.bar(df1, x='country', y='total_vaccinations_per_capita', height = 1000 , width = 1000)
fig.show()

The total vaccination count per capita (doses per hundred people) is highest for Gibraltar, Cuba, Chile, Singapore, UAE, Malta, and Brunei.

Top 10 Countries with the Lowest Vaccination Availability per Capita

Countries from the African continent like Burundi, Democratic Republic of Congo, Chad, Madagascar, and Tanzania have the least vaccination count per capita.

df1 = df_vaccination.groupby(["country"])['total_vaccinations_per_hundred'].max().nsmallest(10).reset_index()
df1.columns = ["country", "total_vaccinations_per_capita"]
fig = px.bar(df1, x='country', y='total_vaccinations_per_capita', height = 1000 , width = 1000)
fig.show()
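The highest and lowest rankings in these sections follow one pattern: take each country's peak value for a column, then pick the n largest or smallest. The sketch below wraps that pattern in a helper; rank_countries and the toy data are hypothetical and not part of the original analysis.

```python
import numpy as np
import pandas as pd


def rank_countries(data, column, n=10, lowest=False):
    """Return the n countries with the highest (or lowest) peak value of column."""
    # Countries that never reported the column end up as NaN and are dropped.
    peak = data.groupby("country")[column].max().dropna()
    picked = peak.nsmallest(n) if lowest else peak.nlargest(n)
    return picked.reset_index()


# Hypothetical toy data; country C never reported a value.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "C", "D"],
    "total_vaccinations_per_hundred": [5.0, 9.0, 120.0, np.nan, 0.3],
})

top = rank_countries(toy, "total_vaccinations_per_hundred", n=2)
bottom = rank_countries(toy, "total_vaccinations_per_hundred", n=2, lowest=True)
print(top)
print(bottom)
```

Dropping the missing values explicitly keeps countries without any report out of the "lowest" ranking.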

Top 10 Countries with the Highest Vaccinated Population (at least one dose) per Capita

Gibraltar, Pitcairn, United Arab Emirates, Portugal, Cuba, Chile, Cayman Islands, Brunei, Singapore, and Malta have the highest vaccinated (at least one dose) population per capita.

vaccine = df_vaccination.groupby(["country"])['people_vaccinated_per_hundred'].max().nlargest(10).reset_index()
vaccine.columns = ["country", "people_vaccinated_per_capita"]
fig = px.bar(vaccine, x='country', y='people_vaccinated_per_capita')
fig.show()

Top 10 Countries with the Least Vaccinated Population (at least one dose) per Capita

African countries such as Burundi, Democratic Republic of Congo, Chad, Madagascar, and Tanzania, along with Haiti, Yemen, and Papua New Guinea, have the lowest vaccinated population count per capita.

vaccine = df_vaccination.groupby(["country"])['people_vaccinated_per_hundred'].max().nsmallest(10).reset_index()
vaccine.columns = ["country", "people_vaccinated_per_capita"]
fig = px.bar(vaccine, x='country', y='people_vaccinated_per_capita')
fig.show()

Top 3 Most Frequently Used Vaccination Schemes in the World

China's scheme (CanSino, Sinopharm, and Sinovac) is the most frequently used, followed by India's scheme (Covaxin, Oxford/AstraZeneca, and Sputnik V) and the United States' scheme (Johnson & Johnson, Moderna, and Pfizer).

colors=['#fae588','#f79d65','#f9dc5c','#e8ac65','#e76f51','#ef233c','#b7094c'] #color palette
vaccinetotalpop = df_vaccination.groupby(["country", "vaccines"])['total_vaccinations'].max().nlargest(3).reset_index()
fig = px.treemap(vaccinetotalpop, path = ['country','vaccines' ], values = 'total_vaccinations',
                title="Total vaccinations per country grouped by vaccine scheme",  height = 800 , width = 1000 )
fig.update_layout( font_family = "Courier New", font_color = "black", treemapcolorway = colors)
fig.show()

Daily Vaccination Trend

Daily vaccinations peaked in Q2 2021 for the USA and China. India and Indonesia saw a rise in Q3 2021, and the vaccination counts for Pakistan and Bangladesh spiked in March 2022.

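# Keep only the columns needed for the trend plot and give them display names
country_vaccine_time = df_vaccination[["country", "date", "daily_vaccinations"]].copy()
country_vaccine_time.columns = ["Country", "Date", "Daily vaccinations"]
countries = ['India', 'Germany', 'United Kingdom', 'United States', 'China',
             'Brazil', 'Indonesia', 'Japan', 'Pakistan', 'Bangladesh']
# Restrict the data to the selected countries before plotting
country_vaccine_time1 = country_vaccine_time[country_vaccine_time["Country"].isin(countries)]
fig = px.line(country_vaccine_time1, x="Date", y="Daily vaccinations", color='Country')
fig.show()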
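Quarter-level statements like the ones above can be checked numerically by summing the daily counts per quarter and picking each country's largest quarter. This is a sketch on made-up numbers, assuming a frame with the Country, Date, and Daily vaccinations columns used for the line chart.

```python
import pandas as pd

# Hypothetical daily counts for two countries across three quarters of 2021.
toy = pd.DataFrame({
    "Country": ["A"] * 4 + ["B"] * 4,
    "Date": pd.to_datetime(
        ["2021-02-01", "2021-05-01", "2021-06-01", "2021-08-01"] * 2
    ),
    "Daily vaccinations": [10, 50, 60, 20, 5, 5, 5, 40],
})

# Label every row with its calendar quarter and total the doses per quarter.
quarterly = (
    toy.assign(Quarter=toy["Date"].dt.to_period("Q"))
    .groupby(["Country", "Quarter"])["Daily vaccinations"]
    .sum()
    .reset_index()
)

# For each country, keep the row of the quarter with the largest total.
peak_rows = quarterly.loc[quarterly.groupby("Country")["Daily vaccinations"].idxmax()]
peak_quarter = peak_rows.set_index("Country")["Quarter"].astype(str)
print(peak_quarter)
```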

Conclusion

The total COVID-19 vaccination count (in billions) is highest in China, followed by India, the United States, and Brazil. However, the total vaccination count per capita is highest for Gibraltar, Cuba, Chile, Singapore, UAE, Malta, and Brunei.

China's scheme (CanSino, Sinopharm, and Sinovac) is the most frequently used, followed by India's scheme (Covaxin, Oxford/AstraZeneca, and Sputnik V) and the United States' scheme (Johnson & Johnson, Moderna, and Pfizer).

The analysis suggests that countries in the African continent have extremely low vaccination rates and are far behind the other continents of the world. Therefore, organizations such as the WHO should intervene to ensure the equitable distribution of vaccines across the world.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.
