Sentiment Analysis Using VADER

Juveriya Last Updated : 27 Jan, 2025

In this article, we will perform sentiment analysis using VADER. Sentiment analysis determines the attitude a piece of text expresses: positive, negative, or neutral. It builds on semantics, the study of what words and symbols mean and how they relate to one another. Before turning to VADER, let us briefly look at what NLP is and at the NLTK library.

This article was published as a part of the Data Science Blogathon.

What is NLP?
NLTK
Sentiment Analysis
Applications of Sentiment Analysis
NLTK’s VADER module
Practical Exercise
Conclusion

What is NLP?

Natural Language Processing (NLP) is the automatic processing and manipulation of human language. We use NLP to extract meaningful information from textual data. Its applications include sentiment analysis, chatbots, speech recognition, machine translation, spell checking, information extraction, keyword search, and advertisement matching. Real-world examples include Google Assistant and Google Translate.

NLTK

The Natural Language Toolkit (NLTK) is a widely used Python library for NLP. It contains packages that help programs interpret human language and produce an appropriate response. NLTK has many built-in packages for processing textual data at each stage of a workflow, which typically includes steps such as data cleaning, visualization, and vectorization.

Sentiment Analysis

Sentiment analysis finds the polarity of a text, that is, whether it is positive, negative, or neutral. It is an active research area in natural language processing and is widely used in data mining and text mining. It helps collect and analyze opinions about a brand or a product by processing blog posts, comments, reviews, tweets, and similar text.

The polarity of a given text can be classified at the document, sentence, or feature level. In each case the result tells us whether the opinion expressed is positive, negative, or neutral.

Applications of Sentiment Analysis

Social media monitoring: More than 55% of customers share reviews of their purchases on social networking sites. Analyzing that volume of reviews manually is nearly impossible. Sentiment analysis lets us analyze them and derive meaning from them automatically.
Brand monitoring: Brand owners use sentiment analysis tools to keep track of bad reviews about their brand. They can also use machine learning algorithms to predict outcomes based on the results of the sentiment analysis.
Voice of customer: Sentiment analysis algorithms let us analyze the voice of the customer, such as which products customers need most and which products are highly rated. Brand owners can create a personalized customer experience based on these evaluations.
Customer service: Chatbots are a widespread way of delivering customer service. Using sentiment analysis, you can transfer a chat to a customer service associate whenever needed. You can also automate tasks such as booking a ticket or a salon appointment.
Market research: Using sentiment analysis, you can research how well your competitors are growing and what positive feedback they receive from customers. You can also analyze the way they deal with their customers and, in turn, work on the issues behind your own product's failures.
Product analysis: You can do keyword research to identify the products in demand and the highly rated products. You can also determine which features of a particular product customers or end users appreciate most.

NLTK’s VADER module

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a sentiment analyzer available in NLTK that provides sentiment scores based on the words used. It is a lexicon and rule-based analyzer: each term in its lexicon is rated according to its semantic orientation, positive or negative, and a set of rules adjusts those ratings for context.
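To make the idea of a lexicon plus rules concrete, here is a deliberately tiny sketch. It is not VADER's implementation: the lexicon entries, the booster and negation lists, and the function name `toy_score` are all made up for illustration.

```python
# Toy lexicon- and rule-based scorer. NOT VADER's implementation:
# every word, valence, and rule below is a hypothetical example.
TOY_LEXICON = {"amazing": 2.8, "good": 1.9, "bad": -2.5, "terrible": -2.1}
NEGATIONS = {"not", "never", "no"}
BOOSTERS = {"very": 0.3, "really": 0.3}


def toy_score(text):
    """Sum word valences, applying simple booster and negation rules."""
    tokens = text.lower().split()
    total = 0.0
    for i, token in enumerate(tokens):
        valence = TOY_LEXICON.get(token)
        if valence is None:
            continue  # word carries no sentiment in the toy lexicon
        previous = tokens[i - 1] if i > 0 else ""
        # Rule 1: a booster directly before the word strengthens it.
        if previous in BOOSTERS:
            boost = BOOSTERS[previous]
            valence += boost if valence > 0 else -boost
        # Rule 2: a negation directly before the word flips its sign.
        if previous in NEGATIONS:
            valence = -valence
        total += valence
    return total


for sentence in ["the food was good",
                 "the food was very bad",
                 "the food was not good"]:
    print(sentence, "->", toy_score(sentence))
```

The first sentence scores above zero, while the boosted and the negated sentences score below zero. The real analyzer uses a far larger lexicon and more rules, but the overall shape, looking up words and then adjusting for context, is the same idea.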

First, we will create a sentiment intensity analyzer to categorize our dataset. Then, we use the polarity scores method to determine the sentiment.

Practical Exercise

In this exercise, I will use a CSV file containing reviews for different products. The link for the file is :

import io
import numpy as np
import pandas as pd
import nltk
#download the VADER lexicon from nltk
nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer
#creating an object of sentiment intensity analyzer
sia = SentimentIntensityAnalyzer()
#uploading csv file (this step works only inside Google Colab)
from google.colab import files
uploaded = files.upload()
#reading csv file from the uploaded bytes
df = pd.read_csv(io.BytesIO(uploaded['reviews.csv']))
df.head()

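polarity_scores: This method returns the sentiment strength of the given text as a dictionary with four scores. The neg, neu, and pos values are the proportions of the text rated negative, neutral, and positive, and compound is a normalized overall score between -1 (most negative) and 1 (most positive).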

For example:

text= "Bobby is an amazing guy"
sia.polarity_scores(text)

{'compound': 0.5859, 'neg': 0.0, 'neu': 0.513, 'pos': 0.487}

The compound score is well above zero and the neg score is 0.0, so the above statement is positive.

text= "The food delivered was really very bad"
sia.polarity_scores(text)

{'compound': -0.6214, 'neg': 0.404, 'neu': 0.596, 'pos': 0.0}

This example statement is a negative one.
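The compound values in the two examples above always fall between -1 and 1 because the summed word valences are normalized. The sketch below shows that normalization as it is commonly described for VADER, with a constant `alpha` of 15. The raw valence of 2.8 used here is an assumption chosen for illustration because it reproduces the first example's compound score; it is not read from the lexicon in this snippet.

```python
import math


def normalize(raw_sum, alpha=15):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return raw_sum / math.sqrt(raw_sum * raw_sum + alpha)


# Assumed raw valence sum for "Bobby is an amazing guy" (illustrative).
print(round(normalize(2.8), 4))   # 0.5859

# Larger sums approach, but never reach, the limits -1 and 1.
print(normalize(50) < 1, normalize(-50) > -1)
```

Because of this squashing, doubling the raw sum does not double the compound score, so compound values are best compared against thresholds rather than added together.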

Let us now create a new column in our DataFrame that stores the polarity scores of each review. The review text is in the column named body.

#creating new column scores using polarity scores function
df['scores']=df['body'].apply(lambda body: sia.polarity_scores(str(body)))
df.head()

Next, we extract the compound, positive, and negative scores from the scores dictionaries into three separate columns.

df['compound']=df['scores'].apply(lambda score_dict:score_dict['compound'])
df.head()
df['pos']=df['scores'].apply(lambda pos_dict:pos_dict['pos'])
df.head()
df['neg']=df['scores'].apply(lambda neg_dict:neg_dict['neg'])
df.head()
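As an alternative to the three separate apply calls above, the dictionaries can be expanded into columns in one step. This is a minimal sketch on a hypothetical two-row frame named `df_demo`; the score values are copied from the two example outputs earlier in the article so that the snippet runs without NLTK.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame.
df_demo = pd.DataFrame({
    "body": ["Bobby is an amazing guy",
             "The food delivered was really very bad"],
    "scores": [
        {"neg": 0.0, "neu": 0.513, "pos": 0.487, "compound": 0.5859},
        {"neg": 0.404, "neu": 0.596, "pos": 0.0, "compound": -0.6214},
    ],
})

# Build one column per dictionary key, aligned on the original index.
expanded = pd.DataFrame(df_demo["scores"].tolist(), index=df_demo.index)
df_demo = df_demo.join(expanded)

print(df_demo[["body", "compound", "pos", "neg"]])
```

This also produces a neu column, which the article's version does not create; drop it if it is not needed.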

We then create a new column named type, which indicates whether the review is pos, neg, or neutral.

df['type']=''
df.loc[df.compound>0,'type']='POS'
df.loc[df.compound==0,'type']='NEUTRAL'
df.loc[df.compound<0,'type']='NEG'
df.head()
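The code above treats only a compound score of exactly 0 as neutral. A convention often suggested for VADER is to use a small band around zero instead, with cutoffs at 0.05 and -0.05. The helper below is a sketch of that variant; the function name `label_from_compound` and its `threshold` parameter are hypothetical, and the right cutoff for a given dataset is a judgment call.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to POS, NEG, or NEUTRAL using a neutral band."""
    if compound >= threshold:
        return "POS"
    if compound <= -threshold:
        return "NEG"
    return "NEUTRAL"


for score in [0.5859, -0.6214, 0.02, 0.0]:
    print(score, label_from_compound(score))
```

With the article's DataFrame, this could be applied as df['type'] = df['compound'].apply(label_from_compound). Note that the resulting counts would differ from the ones reported below, which use the zero cutoff.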

Finally, we loop through the rows and count the total number of positive, negative, and neutral reviews.

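#counting the reviews of each type
rows = len(df)
pos = 0
neg = 0
neutral = 0
for i in range(rows):
    #look up the label by column name instead of a hard-coded position
    review_type = df.loc[i, 'type']
    if review_type == "POS":
        pos = pos + 1
    elif review_type == "NEG":
        neg = neg + 1
    elif review_type == "NEUTRAL":
        neutral = neutral + 1
print("Positive :" + str(pos) + "  Negative :" + str(neg) + "   Neutral :" + str(neutral))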

Positive :46060 Negative :13670 Neutral :8256

Therefore, using the VADER module, we concluded that our data has 46060 positive reviews, 13670 negative reviews, and 8256 neutral reviews.
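The same totals can be obtained without an explicit loop by letting pandas count the labels. The sketch below uses a small hypothetical frame, `df_demo`, in place of the article's data, so the numbers it prints are not the article's results.

```python
import pandas as pd

# Hypothetical stand-in for the 'type' column built earlier.
df_demo = pd.DataFrame(
    {"type": ["POS", "POS", "NEG", "NEUTRAL", "POS", "NEG"]}
)

# value_counts returns one count per distinct label.
counts = df_demo["type"].value_counts()

# .get with a default avoids a KeyError if a label never occurs.
pos = int(counts.get("POS", 0))
neg = int(counts.get("NEG", 0))
neutral = int(counts.get("NEUTRAL", 0))

print(f"Positive: {pos}  Negative: {neg}  Neutral: {neutral}")
```

On a DataFrame with tens of thousands of rows, this vectorized count is typically much faster than looping over rows with df.loc.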

Conclusion

More than 55% of customers share their opinions or reviews about their purchases on social media, which makes automated analysis of that text valuable. Scoring the reviews in this exercise should have given you a glimpse of how sentiment analysis is done using the concepts of NLP. As discussed earlier in the article, sentiment analysis has many other applications beyond review analysis.

In this article:

We briefly introduced sentiment analysis and its real-world applications.
We covered the basics of Natural Language Processing.
We used NLTK and its VADER analyzer to conduct sentiment analysis on Amazon reviews.
In short, NLTK is an open-source toolkit for natural language processing, and VADER is a lexicon and rule-based analyzer, available in NLTK, that is used to conduct sentiment analysis.

I hope this information helped you understand what sentiment analysis is and how it is done practically.
