Part 10: Step by Step Guide to Master NLP – Named Entity Recognition

CHIRAG GOYAL 23 Jun, 2021 • 8 min read

This article was published as a part of the Data Science Blogathon

Introduction

This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article, we discussed semantic analysis, one of the levels of NLP tasks, along with its techniques. One of those techniques, entity extraction, is especially important to understand in NLP.

In this article, we take a deep dive into one entity extraction technique, Named Entity Recognition (NER), which is a very useful component of an NLP pipeline.

This is part-10 of the blog series on the Step by Step Guide to Natural Language Processing.

 

Table of Contents

1. What is Named Entity Recognition (NER)?

2. Different blocks present in a Typical NER model

3. Deep understanding of Named Entity Recognition with an example

4. How does Named Entity Recognition work?

5. Use-cases of Named Entity Recognition

6. How can I use NER?

What is Named Entity Recognition (NER)?

Let’s first discuss what entities are.

Entities are the most important chunks of a sentence, such as noun phrases, verb phrases, or both. Generally, entity detection algorithms are ensemble models of:

  • Rule-based Parsing,
  • Dictionary lookups,
  • POS Tagging,
  • Dependency Parsing.
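
As a rough illustration of how two of these components can be combined, here is a minimal sketch that merges a dictionary lookup with one regular-expression rule. The lookup table, the weekday rule, and the sample sentence are all hypothetical, and POS tagging and dependency parsing are left out for brevity.

```python
import re

# Hypothetical lookup table (gazetteer): surface form -> entity type.
GAZETTEER = {
    "Sundar Pichai": "Person",
    "Google": "Organization",
    "California": "Location",
}

# Hypothetical rule: weekday names are treated as dates.
WEEKDAY_RULE = re.compile(
    r"\b(Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)\b"
)


def detect_entities(text):
    """Return (start, end, surface, label) tuples sorted by position."""
    found = []
    # Dictionary lookup: search the text for every known surface form.
    for surface, label in GAZETTEER.items():
        for match in re.finditer(r"\b" + re.escape(surface) + r"\b", text):
            found.append((match.start(), match.end(), surface, label))
    # Rule-based parsing: apply the regular-expression rule.
    for match in WEEKDAY_RULE.finditer(text):
        found.append((match.start(), match.end(), match.group(), "Date"))
    return sorted(found)


sentence = "Sundar Pichai spoke at Google in California on Thursday."
for start, end, surface, label in detect_entities(sentence):
    print(f"{label}: {surface} [{start}:{end}]")
```

A sketch like this only finds entities it already knows about, which is why real systems add the other components listed above.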

For example:

[Image: What is Named Entity Recognition]

In the above sentence, the entities are:

Date: Thursday, Time: night, Location: Chateau Marmont, Person: Cate Blanchett

Now, we can start our discussion on Named Entity Recognition (NER):

1. Named Entity Recognition is one of the key entity detection methods in NLP.

2. Named Entity Recognition is a natural language processing technique that automatically scans entire articles, pulls out the fundamental entities in the text, and classifies them into predefined categories. Entities may be:

  • People’s names
  • Organizations and company names
  • Geographic locations (both physical and political)
  • Product names
  • Dates and times
  • Names of events
  • Quantities
  • Monetary values
  • Percentages, and more.

3. In simple words, Named Entity Recognition is the process of detecting named entities, such as person names, location names, and company names, in text.

4. It is also known as entity identification, entity extraction, or entity chunking.

For example:

[Image: Named Entity Recognition example]

5. With the help of named entity recognition, we can extract key information to understand a text, or simply extract important information to store in a database.

6. Entity detection is used in many applications, such as:

  • Automated Chatbots,
  • Content Analyzers,
  • Consumer Insights, etc.

Commonly used types of named entity:

[Image: commonly used types of named entity. Image source: Google Images]

Different blocks present in a Typical Named Entity Recognition model

A typical NER model consists of the following three blocks:

Noun Phrase Identification

This step extracts all the noun phrases from a text with the help of dependency parsing and part-of-speech tagging.

Phrase Classification

In this step, we classify all the noun phrases extracted in the previous step into their respective categories. To disambiguate locations, the Google Maps API is a good option, and to identify person names or company names, open databases such as DBpedia and Wikipedia can be used. Apart from this, we can also build lookup tables and dictionaries by combining information from different sources.

Entity Disambiguation

Sometimes entities are misclassified, so creating a validation layer on top of the results is useful. Knowledge graphs can be used for this purpose.
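
To make the three blocks concrete, here is a minimal sketch of such a pipeline in plain Python. It rests on several simplifying assumptions: candidate phrases are found with a crude capitalization pattern instead of dependency parsing and POS tagging, the lookup table stands in for external sources such as DBpedia, and the override table is a hypothetical stand-in for a knowledge-graph validation layer.

```python
import re

# Hypothetical lookup table standing in for external databases.
LOOKUP = {
    "London": "Location",
    "Amazon": "Organization",
    "Jordan": "Person",
}

# Hypothetical validation layer: (phrase, neighbouring word) -> corrected label.
CONTEXT_OVERRIDES = {
    ("Jordan", "in"): "Location",
    ("Amazon", "river"): "Location",
}

# Crude stand-in for noun phrase identification: runs of capitalized words.
CANDIDATE_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def identify_candidates(text):
    """Block 1: return (phrase, start, end) for each candidate phrase."""
    return [(m.group(), m.start(), m.end()) for m in CANDIDATE_PATTERN.finditer(text)]


def classify(phrase):
    """Block 2: look the phrase up; returns None for unknown phrases."""
    return LOOKUP.get(phrase)


def disambiguate(phrase, label, text, start, end):
    """Block 3: correct the label if a neighbouring word triggers an override."""
    before = text[:start].split()
    after = text[end:].split()
    neighbours = []
    if before:
        neighbours.append(before[-1].lower())
    if after:
        neighbours.append(after[0].lower().strip(".,"))
    for word in neighbours:
        corrected = CONTEXT_OVERRIDES.get((phrase, word))
        if corrected:
            return corrected
    return label


def run_pipeline(text):
    """Chain the three blocks and return (phrase, label) pairs."""
    entities = []
    for phrase, start, end in identify_candidates(text):
        label = classify(phrase)
        if label is None:
            continue  # not a known entity, e.g. a sentence-initial pronoun
        entities.append((phrase, disambiguate(phrase, label, text, start, end)))
    return entities
```

For instance, calling run_pipeline on "He works in Jordan." relabels "Jordan" from Person to Location because of the preceding word "in".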

Deep understanding of NER with an Example

Consider the following sentence:

[Image: Named Entity Recognition example sentence]

The blue cells represent the nouns. Some of these nouns describe real things present in the world.

For example, in the sentence above, the following nouns represent physical places on a map:

“London”, “England”, “United Kingdom”

It would be great if we could detect that! With that information, we could use NLP to automatically extract a list of the real-world places mentioned in a document.

Therefore, the goal of NER is to detect and label these nouns with the real-world concepts that they represent.

So, when we run each token in the sentence through a NER tagging model, our sentence looks like this:

[Image: the example sentence with NER tags applied]

Let’s discuss what exactly a NER system does.

NER systems aren’t just doing a simple dictionary lookup. Instead, they use the context in which a word appears in the sentence, together with a statistical model, to guess which type of noun that word represents.
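
To show what per-token output from a tagging model can look like, here is a small sketch that groups token-level tags back into entities. It assumes the BIO tagging convention (B- begins an entity, I- continues it, O is outside any entity), which is common in practice but is not specified in this article. The tokens, tags, and the LOC label below are hand-written for illustration and are not the output of a real model.

```python
def bio_to_spans(tokens, tags):
    """Group per-token BIO tags into (entity_text, label) pairs."""
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the one that was open, if any.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (or an I- tag with no matching open entity) closes any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hand-written tokens and tags for illustration only.
tokens = ["London", "is", "a", "city", "in", "the", "United", "Kingdom"]
tags = ["B-LOC", "O", "O", "O", "O", "O", "B-LOC", "I-LOC"]
print(bio_to_spans(tokens, tags))
```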

Because NER makes it easy to grab structured data out of text, it has tons of uses. It’s one of the easiest ways to quickly get insightful value out of an NLP pipeline.

If you want to try out NER yourself, then refer to the link.

How does Named Entity Recognition work?

After reading a text, we humans can naturally recognize named entities such as people, values, locations, and so on.

For example, consider the following sentence:

Sentence: Sundar Pichai, the CEO of Google Inc. is walking in the streets of California. 

From the above sentence, we can identify three types of named entities:

  • (“person”: “Sundar Pichai”),
  • (“org”: “Google Inc.”),
  • (“location”: “California”).

But for computers to do the same thing, we first need to help them recognize entities so that they can categorize them. To do so, we can use machine learning and Natural Language Processing (NLP).

Let’s discuss the role each of these plays when implementing NER:

  • NLP: It studies the structure and rules of language and forms intelligent systems that are capable of deriving meaning from text and speech.
  • Machine Learning: It helps machines learn and improve over time.

To learn what an entity is, a NER model needs to be able to detect a word or string of words that form an entity (e.g. California) and decide which entity category it belongs to.

So, in summary, the heart of any NER model is a two-step process:

  • Detect a named entity
  • Categorize the entity

So first, we need to create entity categories, like Name, Location, Event, Organization, etc., and feed a NER model relevant training data.

Then, by tagging some samples of words and phrases with their corresponding entities, we’ll eventually teach our NER model to detect the entities and categorize them.
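
As a sketch of what tagged training samples might look like, the snippet below converts the entities from the example sentence above into character-offset annotations. The dictionary layout with "text" and "entities" keys is a hypothetical format chosen for illustration; each NER library defines its own expected training format.

```python
SENTENCE = (
    "Sundar Pichai, the CEO of Google Inc. is walking "
    "in the streets of California."
)

# Tagged samples: entity text paired with its category.
TAGGED = [
    ("Sundar Pichai", "person"),
    ("Google Inc.", "org"),
    ("California", "location"),
]


def to_training_example(text, tagged):
    """Convert (entity_text, label) pairs into character-offset annotations.

    Uses the first occurrence of each entity text in the sentence.
    """
    spans = []
    for entity_text, label in tagged:
        start = text.find(entity_text)
        if start == -1:
            raise ValueError(f"{entity_text!r} not found in text")
        spans.append((start, start + len(entity_text), label))
    return {"text": text, "entities": spans}


example = to_training_example(SENTENCE, TAGGED)
for start, end, label in example["entities"]:
    print(label, "->", example["text"][start:end])
```

Storing offsets rather than bare strings covers both steps at once: the span says where the entity is (detection) and the label says what it is (categorization).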

Use-Cases of Named Entity Recognition 

As discussed in the previous section, Named Entity Recognition (NER) helps us easily identify the key components in a text, such as names of people, places, brands, monetary values, and more.

Extracting the main entities from a text helps us sort unstructured data and detect important information, which is crucial when dealing with large datasets.

Let’s discuss some interesting use cases of Named Entity Recognition:

Customer Support

Consider customer support, where we deal with a rising number of tickets. Here we can use named entity recognition techniques to handle customer requests faster.

From a business perspective, automating repetitive customer service tasks, such as categorizing customers’ issues and queries, saves valuable time. As a result, it helps improve resolution rates and boost customer satisfaction.

Here, we can also use entity extraction to pull out relevant pieces of information, like product names or serial numbers, which makes it easier to route tickets to the most suitable agent or team for handling that issue.
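
A minimal sketch of this routing idea follows. The product names, team names, and the "SN-" serial number format are all hypothetical, and a real system would use a trained NER model rather than substring matching and a single regular expression.

```python
import re

# Hypothetical mapping from product name to the team that supports it.
PRODUCT_TEAMS = {
    "router x200": "networking",
    "smart lamp": "smart-home",
}

# Hypothetical serial number format: "SN-" followed by six digits.
SERIAL_PATTERN = re.compile(r"\bSN-\d{6}\b")


def route_ticket(ticket_text):
    """Extract the product and serial number, then pick a team for the ticket."""
    lowered = ticket_text.lower()
    # First known product name mentioned in the ticket, if any.
    product = next((name for name in PRODUCT_TEAMS if name in lowered), None)
    serial_match = SERIAL_PATTERN.search(ticket_text)
    return {
        "product": product,
        "serial": serial_match.group() if serial_match else None,
        # Tickets without a recognized product fall back to a general queue.
        "team": PRODUCT_TEAMS.get(product, "general-support"),
    }


print(route_ticket("My Router X200 (SN-123456) keeps rebooting."))
```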

Gain Insights from Customer Feedback

For almost all product-based companies, online reviews are a great source of customer feedback: they provide rich insights into what customers like and dislike about your products, and into which aspects of your business need improvement.

Here, we can use NER systems to organize all the customer feedback and pinpoint recurring problems.

For example, we can use a NER system to detect the locations that are mentioned most often in negative customer feedback, which might lead you to focus on a particular office branch.
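
The location example can be sketched as a simple count over already-processed reviews. The record layout below (a sentiment label plus a list of extracted entities per review) is a hypothetical format; it assumes that sentiment analysis and NER have already been run upstream, and the sample data is made up.

```python
from collections import Counter


def top_locations_in_negative_feedback(reviews, n=3):
    """Count Location entities in negative reviews and return the n most common."""
    counts = Counter(
        text
        for review in reviews
        if review["sentiment"] == "negative"
        for text, label in review["entities"]
        if label == "Location"
    )
    return counts.most_common(n)


# Made-up records for illustration.
reviews = [
    {"sentiment": "negative", "entities": [("Pune", "Location"), ("Router X200", "Product")]},
    {"sentiment": "negative", "entities": [("Pune", "Location")]},
    {"sentiment": "positive", "entities": [("Delhi", "Location")]},
    {"sentiment": "negative", "entities": [("Delhi", "Location")]},
]
print(top_locations_in_negative_feedback(reviews))
```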

Recommendation System

Many modern applications such as Netflix, YouTube, and Facebook rely on recommendation systems to produce optimal customer experiences. Many of these systems rely on named entity recognition to make suggestions based on the user’s search history.

For example, if you watch a lot of educational videos on YouTube, you’ll get more recommendations for content that has been classified under the entity “education”.

Summarizing Resumes

When recruiting new people, recruiters spend many hours of their day going through resumes to find the right candidate. Each resume contains almost the same type of information, but the organization and formatting differ, so this is a classic example of unstructured data.

Here, with the help of an entity extractor, recruitment teams can instantly extract the most relevant information about candidates, from personal information such as name, address, phone number, date of birth, and email, to information about their training and experience, such as certifications, degrees, company names, and skills.
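
For the most regular of these fields, even simple patterns go a long way. The sketch below pulls email addresses and phone numbers out of resume text with two deliberately simplistic regular expressions; both patterns and the sample resume are assumptions for illustration, and fields such as names, skills, or company names would need a trained NER model instead.

```python
import re

# Simplistic patterns for illustration; real-world formats vary widely.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"\+?\d[\d -]{7,}\d")


def extract_contact_fields(resume_text):
    """Return the email addresses and phone numbers found in a resume."""
    return {
        "emails": EMAIL_PATTERN.findall(resume_text),
        "phones": PHONE_PATTERN.findall(resume_text),
    }


# Made-up resume text.
resume = (
    "Jane Doe\n"
    "Email: jane.doe@example.com\n"
    "Phone: +91 98765-43210\n"
    "Skills: Python, NLP"
)
print(extract_contact_fields(resume))
```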

Some more use-cases of NER are:

  • Optimizing search engine algorithms,
  • Content classification for news channels, etc.

 

How can I use NER?

If you are working on a business problem that could benefit from NER, you can get started fairly easily with open-source NLP libraries.

Each library has its own pros and cons, which you can explore in its documentation.
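
As one possible starting point, here is a minimal sketch using spaCy, a widely used open-source NLP library. The choice of spaCy and of its small English model en_core_web_sm are assumptions made for this example, and both need to be installed separately. The helper names extract_entities and load_spacy_pipeline are hypothetical.

```python
def extract_entities(text, nlp):
    """Run a spaCy-style pipeline on text and return (entity_text, label) pairs."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def load_spacy_pipeline(model_name="en_core_web_sm"):
    """Load a pretrained spaCy pipeline.

    Assumes spaCy and the named model are installed. The import is done
    lazily so that extract_entities can be used with any compatible pipeline.
    """
    import spacy

    return spacy.load(model_name)
```

With spaCy installed, calling extract_entities(sentence, load_spacy_pipeline()) on the Sundar Pichai sentence from earlier returns a list of (text, label) pairs. The exact entities and label names depend on the model that is loaded.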

This ends our Part-10 of the Blog Series on Natural Language Processing!

Other Blog Posts by Me

You can also check my previous blog posts.

Previous Data Science Blog posts.

LinkedIn

Here is my LinkedIn profile in case you want to connect with me. I’ll be happy to connect with you.

Email

For any queries, you can mail me on Gmail.

End Notes

Thanks for reading!

I hope you enjoyed the article. If you liked it, share it with your friends. Is something missing, or do you want to share your thoughts? Feel free to comment below, and I’ll get back to you. 😉

I am currently pursuing my Bachelor of Technology (B.Tech) in Computer Science and Engineering at the Indian Institute of Technology Jodhpur (IITJ). I am very enthusiastic about Machine Learning, Deep Learning, and Artificial Intelligence. Feel free to connect with me on LinkedIn.