Text Summarisation

UPPU RAJESH KUMAR 15 Mar, 2022 • 5 min read

This article was published as a part of the Data Science Blogathon.

Introduction

Text summarisation is an important Natural Language Processing (NLP) task. It involves condensing a longer text into a shorter one while preserving its meaning, so that the core message still comes through. In other words, it reduces a long text to its main points so that we can read it in short form.

Automatic text summarisation is used in many domains, and one of the most common is news. Every day we read many news articles, and they often contain details that are not essential or not important to us. What we want is the crux of the article, and text summarisation techniques condense the article to give us exactly that. Applications such as Inshorts use text summarisation techniques to serve us short news.

Machine learning models are trained on documents to learn which information is useful, so that they can produce a summarised version of any new text we provide. A huge amount of data is generated every day, and businesses need to go through it to work efficiently. For this, we need efficient machine learning models that condense long texts into short, accurate, and fluent summaries that can be read in less time. There are two main approaches to summarising texts:

  1. Extraction-based Summarization
  2. Abstraction-based Summarization

Extraction-based Summarization works by extracting key phrases from the given text and joining them to form a summary, whereas Abstraction-based Summarization works by generating new phrases and sentences that convey the meaning of the original text. Both can be framed as supervised machine learning problems, although extraction can also be done with unsupervised methods.
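
To make the extraction-based idea concrete, here is a minimal sketch in plain Python. It is not the method used later in this article: it scores whole sentences (rather than phrases) by average word frequency and keeps the top ones, with no trained model involved. The function names and the tiny stop-word list are assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in",
              "it", "on", "for", "that", "this", "was", "with"}


def split_sentences(text: str) -> list[str]:
    """Split on sentence-ending punctuation followed by whitespace (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def content_words(text: str) -> list[str]:
    """Lower-case the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        tokens = content_words(sentence)
        if not tokens:
            return 0.0
        # Average frequency, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens)

    # Rank sentence indices by score, keep the top ones, then restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in keep)
```

Calling `extractive_summary(article_text, num_sentences=3)` returns three sentences copied verbatim from the input. An abstraction-based model, by contrast, generates new wording of its own.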

In this project, we will create an interactive text summarizer using a pre-trained model from Facebook that is available on the Hugging Face model hub. We will use Gradio to create the user interface and host the app on Hugging Face Spaces.

Overview

  1. Gradio
  2. Hugging Face Spaces
  3. Building the application
  4. Deployment
  5. Conclusion

Gradio

Gradio is an open-source Python library for quickly creating web interfaces to prototype the machine learning models that we build. It has become popular among machine learning practitioners and was recently acquired by Hugging Face. With Gradio, we can quickly see a demo of a machine learning model in action. It is easy to use, as it takes just a few lines of code to create an interface.

Gradio | Text Summarization
Image-1
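
As a rough illustration of the "few lines of code" claim, the sketch below wraps an ordinary Python function in a Gradio interface. The `preview` function is a toy stand-in for a model, and `MAX_WORDS`, `preview`, and `build_demo` are hypothetical names chosen for this example. It assumes the gradio package is installed.

```python
MAX_WORDS = 20  # hypothetical cut-off for the toy "summary"


def preview(text: str) -> str:
    """Toy stand-in for a model: keep only the first MAX_WORDS words."""
    words = text.split()
    if len(words) <= MAX_WORDS:
        return " ".join(words)
    return " ".join(words[:MAX_WORDS]) + " ..."


def build_demo():
    """Wrap preview() in a Gradio text-in, text-out interface."""
    import gradio as gr  # imported here so preview() can be used without Gradio installed

    return gr.Interface(fn=preview, inputs="text", outputs="text", title="Preview demo")
```

Calling `build_demo().launch()` should start a local web page with one input box and one output box. Swapping `preview` for a real model's prediction function is all it takes to demo that model instead.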

Hugging Face Spaces

Hugging Face Spaces is a free platform for hosting machine learning web apps, so that we can demo how a model might perform in production. It is easy to deploy to and robust. We can create and deploy as many applications as we want. We can also write code directly on the platform, without using a local machine. In this project, we will write our code on Spaces and host the application there as well.

Text Summarization | Spaces
Image-2

Building the Application

We will create a code repository on Hugging Face Spaces itself, without using a local machine. Go to this website and create an account if you don’t have one. After creating your account, click the ‘create space’ button at the top right. You will see a new page asking for the name of your repo. Enter a name, choose an appropriate license, select ‘Gradio’ under the SDK option, and click the ‘create space’ button. You have now created a space that will host your application. Next, create an ‘app.py’ file: click the add file option at the top right and choose to create a new file. A code editor will open. Type the following code into it.

import gradio as gr

# Title shown at the top of the app
title = 'Text Summarization'
# Example passage that users can load with one click
text_ = "The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
# Load the facebook/bart-large-cnn summarizer from the Hugging Face model hub
interface = gr.Interface.load(
    "huggingface/facebook/bart-large-cnn",
    title=title,
    theme="peach",
    examples=[[text_]],
)
interface.launch()  # start the web app

Explanation of the above code:

First, we import the gradio library. Hugging Face Spaces has some libraries pre-installed, so we can import them directly without installing them. After importing, we set the title of our app in the ‘title’ variable. We also want to give the user an example article to try, so we store that passage in the ‘text_’ variable. Next, we create the interface for our application directly with ‘gr.Interface.load()’. Here we use facebook/bart-large-cnn, a summarization model published by Facebook: a BART model fine-tuned on the CNN/Daily Mail news dataset. It is available on the Hugging Face model hub, and Gradio can load it directly, without us having to install any other library. We pass the title, theme, and examples that we want to show, and finally we launch the app using ‘.launch()’. Note that newer Gradio releases expose this loader as ‘gr.load()’, so the call may need adjusting depending on the installed version.

We should also create a ‘requirements.txt’ file. Add the following to that file and commit the changes.

tensorflow
transformers
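
As a complement to the Gradio loader, the sketch below shows one way the same checkpoint could be run locally through the `pipeline` API of the transformers library listed above. This is not part of the app built in this article. The helper `summary_length_bounds`, its ratio, and its floor are hypothetical choices made for illustration, and the bounds are derived from word counts even though the model measures length in tokens. The first call to `summarize_locally` downloads the model weights.

```python
def summary_length_bounds(text: str, ratio: float = 0.3, floor: int = 10) -> tuple[int, int]:
    """Pick (min_length, max_length) from the input word count (a hypothetical heuristic)."""
    n_words = len(text.split())
    max_len = max(floor, int(n_words * ratio))
    min_len = max(1, max_len // 3)
    return min_len, max_len


def summarize_locally(text: str) -> str:
    """Summarize text with facebook/bart-large-cnn running on the local machine."""
    from transformers import pipeline  # imported lazily; needs transformers and a backend installed

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    min_len, max_len = summary_length_bounds(text)
    # do_sample=False keeps the output deterministic for a given input.
    result = summarizer(text, min_length=min_len, max_length=max_len, do_sample=False)
    return result[0]["summary_text"]
```

In a longer-running program, the pipeline would normally be created once and reused, since loading the model is the slow part.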

Deployment

Once we complete the steps above, Hugging Face Spaces takes care of the rest and deploys the application. Click the ‘App’ button at the top left to see your app running. A glimpse of what your application might look like is shown below:

Text Summarization

It looks neat and elegant. You can enter your text in the input box and click the submit button to get a summary in the output box on the right. To see an instant example, click the text under ‘Examples’ and then click the submit button; a summarized version of the article will appear in the output box. Click the ‘Clear’ button to start afresh. With that, we have created an interactive text summarizer using the gradio library with just a few lines of code.

Conclusion

We used the gradio library and a model available on the Hugging Face model hub to create a simple text summarizer application. Such applications are useful for reducing the time spent reading articles and for speeding up research. If you have any doubts regarding the code, please comment below so that I can clear them up for you.


Image-1 source: Gradio

Image-2 source: Spaces – Hugging Face

Hope you liked my article on Text Summarisation.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
