How Do Part-of-Speech Tagging, Dependency Parsing, and Constituency Parsing Aid in Understanding Text Data?

Abhishek Sharma 30 May, 2024


Overview

Learn about Part-of-Speech (POS) tagging
Understand dependency parsing and constituency parsing

Introduction

I was amazed that Roger Bacon gave the above quote in the 13th century and that it still holds, doesn't it? I am sure that you all will agree with me.

Today, the way we study languages has changed a lot since the 13th century. We now refer to it as linguistics and natural language processing. But its importance hasn't diminished; instead, it has increased tremendously. Do you know why? Because its applications have skyrocketed, and one of them is the reason you landed on this article.


Each of these applications involves complex NLP techniques, and to understand them, one must have a good grasp of the basics of NLP. Therefore, before moving on to complex topics, it is important to get the fundamentals right.

That's why I have created this article, in which I will cover some basic concepts of NLP: Part-of-Speech (POS) tagging, dependency parsing, and constituency parsing. We will understand these concepts and also implement them in Python. So let's begin!

Part-of-Speech (POS) Tagging

Part-of-Speech (POS) tagging is a natural language processing technique that involves assigning specific grammatical categories or labels (such as nouns, verbs, adjectives, adverbs, pronouns, etc.) to individual words within a sentence. This process provides insights into the syntactic structure of the text, aiding in understanding word relationships, disambiguating word meanings, and facilitating various linguistic and computational analyses of textual data.

In our school days, all of us studied the parts of speech, which include nouns, pronouns, adjectives, verbs, etc. Words belonging to various parts of speech form a sentence. Knowing the part of speech of each word in a sentence is important for understanding it.

That’s the reason for the creation of the concept of POS tagging. I’m sure that by now, you have already guessed what POS tagging is. Still, allow me to explain it to you.

Part-of-Speech (POS) tagging is the process of assigning labels, known as POS tags, to the words in a sentence. Each tag tells us the part of speech of the word it is attached to.

Broadly, there are two types of POS tags:

1. Universal POS Tags:

These tags are used in Universal Dependencies (UD) (latest version 2), a project that is developing cross-linguistically consistent treebank annotation for many languages. These tags are based on the type of word, e.g., NOUN (common noun), ADJ (adjective), ADV (adverb).

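List of Universal POS Tags

ADJ: adjective
ADP: adposition
ADV: adverb
AUX: auxiliary
CCONJ: coordinating conjunction
DET: determiner
INTJ: interjection
NOUN: noun
NUM: numeral
PART: particle
PRON: pronoun
PROPN: proper noun
PUNCT: punctuation
SCONJ: subordinating conjunction
SYM: symbol
VERB: verb
X: other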

You can read more about each one of them here.

2. Detailed POS Tags:

These tags result from dividing the universal POS tags into finer-grained tags. For example, in English, NNS marks plural common nouns and NN marks singular common nouns, whereas the universal tag set uses NOUN for both. These tags are language-specific. You can take a look at the complete list here.
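To make the relationship between the two tag sets concrete, here is a minimal sketch in plain Python. The sentence annotations and the `PTB_TO_UPOS` table are hand-written for illustration (they are my assumptions, not the output of any tagger), and the table covers only a handful of detailed tags. A real mapping also depends on context, for instance IN can correspond to ADP or SCONJ.

```python
# Hand-written (word, detailed tag) pairs; illustrative only, not tagger output.
TAGGED = [
    ("It", "PRP"), ("took", "VBD"), ("me", "PRP"), ("two", "CD"),
    ("hours", "NNS"), ("to", "TO"), ("translate", "VB"), ("a", "DT"),
    ("few", "JJ"), ("pages", "NNS"), ("of", "IN"), ("English", "NNP"),
    (".", "."),
]

# Simplified, partial mapping from detailed (Penn Treebank style) tags
# to universal POS tags. Assumption: one universal tag per detailed tag.
PTB_TO_UPOS = {
    "NN": "NOUN", "NNS": "NOUN",
    "NNP": "PROPN", "NNPS": "PROPN",
    "VB": "VERB", "VBD": "VERB", "VBG": "VERB",
    "VBN": "VERB", "VBP": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "JJR": "ADJ", "JJS": "ADJ",
    "RB": "ADV", "PRP": "PRON", "DT": "DET",
    "IN": "ADP", "CD": "NUM", "TO": "PART", ".": "PUNCT",
}


def to_universal(detailed_tag):
    """Return the universal tag for a detailed tag, or X (other) if unknown."""
    return PTB_TO_UPOS.get(detailed_tag, "X")


for word, tag in TAGGED:
    print(f"{word:10} {tag:5} {to_universal(tag)}")
```

Notice that NN and NNS both collapse into NOUN, which is exactly the loss of detail described above.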

Now you know what POS tags are and what POS tagging is. So let's write the code in Python for POS tagging sentences. For this purpose, I have used spaCy here, but there are other libraries, like NLTK and Stanza, that can do the same.


In the above code sample, I have loaded spaCy's en_core_web_sm model and used it to get the POS tags. You can see that pos_ returns the universal POS tag and tag_ returns the detailed POS tag for each word in the sentence.

Dependency Parsing

Dependency parsing is the process of analyzing the grammatical structure of a sentence based on the dependencies between the words in a sentence.

In dependency parsing, various tags represent the relationship between two words in a sentence. These tags are the dependency tags. For example, in the phrase 'rainy weather,' the word rainy modifies the meaning of the noun weather. Therefore, a dependency exists from weather -> rainy, in which weather acts as the head and rainy acts as the dependent or child. This dependency is represented by the amod tag, which stands for adjectival modifier.

Similarly, there exist many dependencies among the words in a sentence, but note that a dependency involves only two words, in which one acts as the head and the other acts as the child. As of now, there are 37 universal dependency relations used in Universal Dependencies (version 2). You can take a look at all of them here. Apart from these, there also exist many language-specific tags.
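The head and child idea can be sketched with a tiny data structure. This is a plain-Python illustration, not a parser: the `Arc` class and both helper functions are hypothetical names I made up, and the arcs for the phrase "the rainy weather" are written by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One dependency: exactly two words, a head and a child, plus a tag."""
    head: str
    child: str
    label: str


# Hand-written arcs for the phrase "the rainy weather" (illustrative).
arcs = [
    Arc(head="weather", child="rainy", label="amod"),  # adjectival modifier
    Arc(head="weather", child="the", label="det"),     # determiner
]


def children_of(word, arcs):
    """Return (child, label) pairs for every arc whose head is `word`."""
    return [(arc.child, arc.label) for arc in arcs if arc.head == word]


def head_of(word, arcs):
    """Return the head of `word`, or None if no arc points at it."""
    for arc in arcs:
        if arc.child == word:
            return arc.head
    return None


print(children_of("weather", arcs))
print(head_of("rainy", arcs))
```

A word can be the head of several arcs, as weather is here, but it appears as the child in at most one.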


Now let's use spaCy and find the dependencies in a sentence.

In the above code example, dep_ returns the dependency tag for a word, and head.text returns the respective head word. As you may have noticed in the above image, the word took has a dependency tag of ROOT. This tag is assigned to the word that acts as the head of many words in a sentence but is not a child of any other word. Generally, it is the main verb of the sentence, like 'took' in this case.

Now you know what dependency tags are and what the head, child, and root words are. But doesn't parsing mean generating a parse tree?

Yes, we're generating the tree here, but we're not visualizing it. The tree generated by dependency parsing is known as a dependency tree. There are multiple ways of visualizing it, but for the sake of simplicity, we'll use displaCy, which is used for visualizing the dependency parse.

In the above image, the arrows represent the dependency between two words: the word at the arrowhead is the child, and the word at the tail of the arrow is the head. The root word can act as the head of multiple words in a sentence but is not a child of any other word. You can see above that the word 'took' has multiple outgoing arrows but none incoming. Therefore, it is the root word. One interesting thing about the root word is that if you start from any word and keep moving from child to head, you will always reach the root word.

Now you know about dependency parsing, so let's learn about another type of parsing known as constituency parsing.
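The claim that every word leads back to the root can be checked with a few lines of plain Python. The `HEADS` list below is a hand-written parse of the example sentence (my assumption, it may differ from what a real parser produces), and the root is marked by pointing at itself, a convention chosen for this sketch.

```python
WORDS = ["It", "took", "me", "more", "than", "two", "hours", "to",
         "translate", "a", "few", "pages", "of", "English", "."]

# HEADS[i] is the index of the head of WORDS[i] (hand-written, illustrative).
# The root word points at itself.
HEADS = [1, 1, 1, 5, 5, 6, 1, 8, 1, 11, 11, 8, 11, 12, 1]


def find_roots():
    """Return every word that is its own head, i.e. has no incoming arrow."""
    return [WORDS[i] for i, head in enumerate(HEADS) if head == i]


def path_to_root(index):
    """Follow head links from WORDS[index] until the root is reached."""
    path = [WORDS[index]]
    while HEADS[index] != index:
        index = HEADS[index]
        path.append(WORDS[index])
    return path


print(find_roots())
print(" -> ".join(path_to_root(WORDS.index("English"))))
```

Starting from any index gives a path that ends in took, which is why the root can be thought of as the top of the dependency tree.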

Constituency Parsing

Constituency parsing is the process of analyzing a sentence by breaking it down into sub-phrases, also known as constituents. These sub-phrases belong to a specific category of grammar, like NP (noun phrase) and VP (verb phrase).

Let's understand it with the help of an example. Suppose I have the same sentence that I used in the previous examples, i.e., "It took me more than two hours to translate a few pages of English." and I have performed constituency parsing on it. Then, the constituency parse tree for this sentence is given below.

In the above tree, the words of the sentence are written in purple, and the POS tags are written in red. Everything else is written in black and represents the constituents. You can clearly see how the whole sentence is divided into sub-phrases until only the words remain at the terminals. Also, there are different tags for denoting constituents, like

VP for verb phrase
NP for noun phrase

These are the constituent tags. You can read about different constituent tags here.

Now you know what constituency parsing is, so it's time to code in Python. spaCy does not provide an official API for constituency parsing, so we will be using the Berkeley Neural Parser. It is a Python implementation of the parsers described in "Constituency Parsing with a Self-Attentive Encoder" from ACL 2018.

You can also use StanfordParser with Stanza or NLTK for this purpose, but here I have used the Berkeley Neural Parser. To use it, we first need to install it. You can do that by running the following command.

!pip install benepar

Then you have to download the benepar_en2 model.

You might have noticed that I am using TensorFlow 1.x here. That is because the benepar version used here does not support TensorFlow 2.0. Now, it's time to do constituency parsing.

Here, _.parse_string generates the parse tree in the form of a string.
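A parse string of this kind is a bracketed expression, so it can be turned back into a tree with a short recursive reader. The sketch below uses only the standard library. The `PARSE` string is hand-written for a shortened sentence (it is not actual parser output), and `parse_brackets`, `leaves`, and `phrases` are hypothetical helper names.

```python
import re

# Hand-written bracketed parse for a shortened sentence (illustrative).
PARSE = "(S (NP (PRP It)) (VP (VBD took) (NP (PRP me)) (NP (CD two) (NNS hours))) (. .))"


def parse_brackets(text):
    """Turn a bracketed parse string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # a word at a terminal
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the parse")
    return tree


def leaves(tree):
    """Return the words under a node, in sentence order."""
    _, children = tree
    words = []
    for child in children:
        words.extend([child] if isinstance(child, str) else leaves(child))
    return words


def phrases(tree, wanted):
    """Return the text of every constituent whose label equals `wanted`."""
    label, children = tree
    found = [" ".join(leaves(tree))] if label == wanted else []
    for child in children:
        if not isinstance(child, str):
            found.extend(phrases(child, wanted))
    return found


tree = parse_brackets(PARSE)
print(phrases(tree, "NP"))
print(phrases(tree, "VP"))
```

Once the string is a tree, pulling out all the noun phrases or verb phrases of a sentence is a simple traversal.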

What are the use cases of POS tagging?

Here are some use cases of POS tagging:

Syntactic Analysis: By identifying the grammatical role of each word (e.g., noun, verb), POS tagging helps analyze the sentence structure and the relationships between words. This is achieved using hidden Markov models and other algorithms that predict the most likely sequence of POS tags for the given text.

Disambiguation: Words like “play” can be a noun or verb. POS tagging helps identify the correct meaning based on context, using tagsets that define the possible tags for each word type and their contexts.

Language Modeling: POS tags provide valuable information about the relationships between words, which is useful for building statistical models of language. These models can be enhanced with deep learning techniques to improve their accuracy and handling of complex linguistic patterns.

Preprocessing for Other NLP Tasks: POS tagging is often a preliminary step for tasks like named entity recognition and information extraction. By identifying the part of speech of each word, we can better understand the structure of the text and extract relevant information more accurately. For example, prepositions and other parts of speech help determine the relationships between entities in a sentence.

Lemmatization and Stemming: These techniques reduce words to their base forms (e.g., “running” to “run”). POS tags can help identify the correct base form depending on the word’s function in the sentence, distinguishing between different uses such as nouns, verbs, or interjections.

Grammar Checking: POS information can be used to flag potential grammatical errors, like using a verb in the wrong tense. This is particularly useful in applications such as grammar checking software, where the POS tagger's output helps identify mistakes.
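The hidden Markov model idea from the syntactic analysis point, and the "play" ambiguity from the disambiguation point, can be sketched with a tiny Viterbi decoder. Everything numeric here is an assumption: the `START`, `TRANS`, and `EMIT` probabilities are invented by hand for a four-word vocabulary, whereas a real tagger estimates them from a tagged corpus.

```python
TAGS = ["DET", "NOUN", "PRON", "VERB"]

# Invented probabilities, for illustration only.
START = {"DET": 0.4, "NOUN": 0.1, "PRON": 0.4, "VERB": 0.1}

# TRANS[prev][tag] = P(tag | previous tag)
TRANS = {
    "DET":  {"DET": 0.01, "NOUN": 0.90, "PRON": 0.01, "VERB": 0.08},
    "NOUN": {"DET": 0.10, "NOUN": 0.20, "PRON": 0.10, "VERB": 0.60},
    "PRON": {"DET": 0.05, "NOUN": 0.05, "PRON": 0.05, "VERB": 0.85},
    "VERB": {"DET": 0.40, "NOUN": 0.30, "PRON": 0.20, "VERB": 0.10},
}

# EMIT[tag][word] = P(word | tag); "play" is possible as NOUN and as VERB.
EMIT = {
    "DET":  {"the": 1.0},
    "NOUN": {"play": 0.5, "children": 0.5},
    "PRON": {"they": 1.0},
    "VERB": {"play": 1.0},
}


def viterbi(words):
    """Return the most likely tag sequence for `words` under the toy HMM."""
    # best[t][tag] = (probability of the best path ending in tag, previous tag)
    best = [{tag: (START[tag] * EMIT[tag].get(words[0], 0.0), None)
             for tag in TAGS}]
    for t in range(1, len(words)):
        column = {}
        for tag in TAGS:
            emit = EMIT[tag].get(words[t], 0.0)
            column[tag] = max(
                (best[t - 1][prev][0] * TRANS[prev][tag] * emit, prev)
                for prev in TAGS
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    path = [max(TAGS, key=lambda tag: best[-1][tag][0])]
    for t in range(len(words) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    return list(reversed(path))


print(viterbi(["the", "play"]))
print(viterbi(["they", "play"]))
```

The same word gets a different tag depending on what comes before it: after a determiner, NOUN wins, and after a pronoun, VERB wins. A word that none of the tags can emit gives every path a probability of zero, so this sketch has no sensible answer for unknown words.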

Taken together, these use cases show that POS tagging plays a critical role in many aspects of natural language processing and syntactic analysis.
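The lemmatization use case is easy to see with a small lookup. This is only a sketch: `LEMMA_TABLE` is a hand-written, hypothetical table with a few entries, whereas a real lemmatizer relies on a full dictionary or on rules.

```python
# (word, universal POS tag) -> base form. Hand-written, illustrative entries.
LEMMA_TABLE = {
    ("running", "VERB"): "run",
    ("running", "NOUN"): "running",
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
}


def lemmatize(word, pos):
    """Look up the base form for a word given its POS tag.

    Falls back to the lowercased word when the pair is not in the table.
    """
    return LEMMA_TABLE.get((word.lower(), pos), word.lower())


print(lemmatize("leaves", "NOUN"), lemmatize("leaves", "VERB"))
print(lemmatize("saw", "NOUN"), lemmatize("saw", "VERB"))
```

Without the POS tag, there would be no way to choose between leaf and leave for the word leaves.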

Why is POS tagging hard?

Here are some reasons why POS tagging is challenging:

Word ambiguity: Many words in a corpus have multiple meanings and parts of speech depending on the context. For instance, "bat" can be a noun (a flying mammal) or a verb (to hit something). A part-of-speech tagger needs to consider the surrounding words to assign the correct tag.

Unknown words and complex grammar: Part-of-speech taggers are trained on large amounts of training data, but they can struggle with words they haven't encountered before (out-of-vocabulary words) or with languages that have complex grammatical structures.
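Both problems show up clearly in the simplest possible tagger, one that gives each word its most frequent tag from the training data. The sketch below is my own illustration: the three-sentence `TRAIN` set is made up, and the NOUN fallback for unseen words is an assumption, not a recommendation.

```python
from collections import Counter, defaultdict

# Tiny made-up training set of (word, tag) sentences.
TRAIN = [
    [("the", "DET"), ("bat", "NOUN"), ("flew", "VERB")],
    [("a", "DET"), ("bat", "NOUN"), ("slept", "VERB")],
    [("they", "PRON"), ("bat", "VERB"), ("well", "ADV")],
]


def train_baseline(sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def tag_baseline(words, model, fallback="NOUN"):
    """Tag words one at a time, ignoring context; unseen words get `fallback`."""
    return [(word, model.get(word.lower(), fallback)) for word in words]


model = train_baseline(TRAIN)
print(tag_baseline(["they", "bat", "quokkas"], model))
```

Here bat is tagged NOUN even after they, because the tagger never looks at the surrounding words, and the unseen word quokkas simply receives the fallback tag. Context-aware models exist precisely to fix the first problem, and features such as suffixes are commonly used to handle the second.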

Here are some additional factors that make POS tagging tricky:

Idioms and slang: Informal language constructs often don’t follow standard grammar rules, making them difficult to tag accurately.
Domain dependence: A part-of-speech tagger trained on a general dataset might not perform well on very specific domains, like legal documents or medical reports.
Perception: The interpretation of a text can vary depending on individual perception, which can affect how parts of speech are tagged.
Cardinal numbers: Numbers can be challenging as they can function as nouns, adjectives, or even other parts of speech depending on their use in a sentence.
Transformation-based methods: These methods refine initial tagging decisions based on a set of learned rules, improving accuracy but adding complexity to the tagging process.
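The transformation-based idea from the last point can be sketched as "tag first, then patch". In the snippet below, the initial tags and the single rule are written by hand for illustration. In a real transformation-based tagger, the rules are learned from a tagged corpus, and `apply_rule` and `refine` are hypothetical names.

```python
def apply_rule(tags, from_tag, to_tag, prev_tag):
    """Change from_tag to to_tag wherever the previous tag is prev_tag."""
    out = list(tags)
    for i in range(1, len(out)):
        if out[i] == from_tag and out[i - 1] == prev_tag:
            out[i] = to_tag
    return out


def refine(tags, rules):
    """Apply each (from_tag, to_tag, prev_tag) rule in order."""
    for from_tag, to_tag, prev_tag in rules:
        tags = apply_rule(tags, from_tag, to_tag, prev_tag)
    return tags


# One hand-written rule: a NOUN right after a PRON is more likely a VERB.
RULES = [("NOUN", "VERB", "PRON")]

words = ["they", "bat", "well"]
initial = ["PRON", "NOUN", "ADV"]   # e.g. from a context-free first pass
print(list(zip(words, refine(initial, RULES))))
```

The first pass stays simple, and each rule corrects one kind of systematic mistake. The price is the extra bookkeeping of learning, ordering, and applying the rules.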

End Notes

Now you know what POS tagging, dependency parsing, and constituency parsing are and how they help you understand text data: POS tags tell you the part of speech of each word in a sentence, dependency parsing tells you about the dependencies between the words in a sentence, and constituency parsing tells you about the sub-phrases, or constituents, of a sentence. You are now ready to move on to more complex parts of NLP. As your next steps, you can read the following articles on information extraction.

In these articles, you'll learn how to use POS tags and dependency tags for extracting information from a corpus. Also, if you want to learn about spaCy, you can read this article: spaCy Tutorial to Learn and Master Natural Language Processing (NLP). Apart from these, if you want to learn natural language processing through a course, I can highly recommend the following, which includes everything from projects to one-on-one mentorship:


Natural Language Processing using Python

Frequently Asked Questions

Q1. What is POS tagging a preprocessing step for?

A. Part-of-Speech (POS) tagging is a preprocessing step in natural language processing (NLP) that involves assigning a grammatical category or part-of-speech label (such as noun, verb, adjective, etc.) to each word in a sentence. It serves several purposes as a preprocessing step:
1. Syntactic Analysis: POS tagging helps in understanding the grammatical structure of a sentence. It provides information about the roles of words in forming phrases and sentences, aiding in syntactic analysis.
2. Feature Extraction: POS tags can be useful as features for various NLP tasks, such as text classification, named entity recognition, and machine translation. Different POS tags often convey different semantic or contextual information about words.
3. Disambiguation: Many words in natural language have multiple possible interpretations (polysemy). POS tagging helps disambiguate word senses based on their grammatical context.
4. Language Modeling: POS tagging is often used as a building block for language models, providing information about the relationships between words and their likely syntactic roles.
5. Rule-Based Processing: POS tags can be used in rule-based processing to identify patterns and grammatical structures in text.
6. Lemmatization and Stemming: POS information is valuable for lemmatization and stemming, where words are reduced to their base forms.
7. Parsing and Grammar Checking: POS tagging aids in syntactic parsing and grammar checking by providing information about how words function within sentences.
In summary, POS tagging is a fundamental preprocessing step that helps enhance the accuracy and effectiveness of various NLP tasks by providing insights into the grammatical and syntactic structure of textual data.

Q2. What is an example of POS tagging?

A. In the sentence “She quickly reads a book,” POS tagging assigns tags like “PRON” (pronoun) to “She,” “ADV” (adverb) to “quickly,” “VERB” to “reads,” “DET” (determiner) to “a,” and “NOUN” to “book.” This tagging clarifies the roles and grammatical functions of words, aiding syntactic and semantic analysis in natural language processing tasks.

Q3. What is the use of POS tagging in sentiment analysis?

A. POS tagging in sentiment analysis helps identify the words that carry the emotional tone of a sentence. By labeling words with their parts of speech, it makes it easier to focus on words such as adjectives and adverbs, which often influence sentiment.

Q4. What is POS tagging in text classification?

A. POS tagging in text classification involves assigning parts of speech (like nouns, verbs, etc.) to the words in a given text. This tagging adds linguistic context, which can improve the accuracy of algorithms that categorize text content.

If you found this article informative, then share it with your friends. You can also post your queries in the comments below.


He is a data science aficionado who loves diving into data and generating insights from it. He is always ready to make machines learn through code and to write technical blogs. His areas of interest include Machine Learning and Natural Language Processing, and he is still open to something new and exciting.
