How to Summarize Text with Transformer-based Models?

Janvi Kumari 30 May, 2024


Text summarization is one of the most important tasks in natural language processing: it reduces long texts to brief summaries while preserving the key information. Transformers, a family of deep learning models, have reshaped this field by delivering strong performance in both extractive and abstractive summarization. Their ability to model context powers a wide range of applications, from document management to news aggregation. With Transformers and a few Python libraries, text summarization is straightforward to implement, which opens up new opportunities for efficient information processing and decision-making.


What is Text Summarization?

Text summarization takes a long document and produces a shorter version that captures its most important points. The goal is to extract the key information in a clear and concise manner. Its uses include news aggregation, content analysis, and information retrieval.

How Text Summarization is Performed Using Transformers?

There are two ways to summarize text using transformers:

Extractive Summarization: Extractive summarization identifies the most important sections of a text and reproduces them verbatim, so the summary is a subset of sentences from the original. Transformers improve this procedure by processing the text to extract features and then ranking sentences according to those features. The main steps are:

  • Text Processing: Transformers examine the text to determine its context and the connections among its various sections.
  • Feature Extraction: The model extracts key words and phrases, along with other significant properties of the text.
  • Sentence Ranking: Sentences are ranked by how closely they relate to the main idea of the document.
  • Summary Generation: A logical summary is created by combining the sentences that scored highest.
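
The four steps above can be sketched in a few lines of plain Python. This is only an illustration: it scores sentences by word frequency, a simple stand-in for the contextual features a Transformer would provide. The function names, the stop-word list, and the sample text are assumptions made for this example.

```python
import re
from collections import Counter

# A tiny, illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "on", "for", "that", "was"}


def content_words(text):
    """Lowercase the text and return its words, minus stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Text processing: split the text into sentences after ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, num_sentences=2):
    """Return the top-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)

    # Feature extraction: count how often each content word appears.
    freq = Counter(content_words(text))

    # Sentence ranking: score each sentence by the average frequency of its words.
    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)

    # Summary generation: take the best sentences and restore document order.
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)


text = (
    "The bank raised interest rates on Monday. "
    "Analysts said the bank acted because inflation stayed high. "
    "The weather was sunny. "
    "Higher interest rates usually slow inflation."
)
print(extractive_summary(text, num_sentences=2))
```

In a Transformer-based extractive system, the `score` function would typically be replaced by a learned relevance score, while the select-and-reorder logic stays much the same.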

Abstractive Summarization: Abstractive summarization uses natural language generation techniques to interpret the important aspects of a text and produce a more human-friendly summary, much as a person would write one. It typically relies on encoder-decoder models, where:

  • Encoder: Processes the input text to understand and extract its features.
  • Decoder: Generates the summary by creating new sentences that encapsulate the essence of the original text.

In this architecture, transformers can function as the encoder, the decoder, or both. In addition to offering greater freedom, this approach frequently results in summaries that are simpler to read and seem more natural.

Transformers are trained on enormous volumes of text for both extractive and abstractive summarization. This extensive training makes them especially adept at summarization tasks, since it teaches them intricate patterns and relationships between words, sentences, and entire documents.

Why Should You Use Transformers to Summarize Text?

The volume of information keeps growing, whether it comes from news articles, research papers, or other sources. Text summarization helps by condensing large amounts of information into a short, readable format.

High Accuracy and Context Awareness

Transformers are designed to understand context at a deep level. Unlike traditional methods, they don’t just pick out keywords; they grasp the nuances and meaning of the entire text. This means the summaries they produce are more accurate and retain the essential information without losing the context.

Handling Complex and Varied Content

Whether you’re dealing with news stories, customer feedback, legal documents, or academic papers, transformers can handle it all. They are versatile and capable of summarizing various types of content effectively. This makes them ideal for applications across different fields, from marketing and research to corporate and legal settings.

Efficiency and Time-Saving

Manually summarizing documents can take a lot of time and labor. Transformers automate this process, delivering concise summaries in seconds. This allows you to quickly grasp the main points and make informed decisions without reading the entire document.

Improved Information Retrieval

In the digital age, search engines and digital libraries are essential tools. By summarizing search results, transformers help users find the most relevant information faster. This improves the overall effectiveness of information retrieval systems and enhances user experience.

Enhanced Document Management

Managing long documents, especially in corporate, legal, and academic environments, can be difficult. Transformers help by condensing long documents into manageable summaries, making them easier to organize and reference. This streamlines workflow and boosts productivity.

Better Customer Insights

For businesses, understanding customer feedback is crucial. Transformers can summarize vast amounts of feedback to highlight common themes and issues. This helps companies quickly identify areas for improvement and enhance their products and services.

Legal contracts can be dense and difficult to understand. Transformers can summarize these documents, providing a clear overview of key terms and conditions. This makes it easier for stakeholders to comprehend and compare different contracts.

Streamlined Customer Service

In customer service, quickly identifying the root cause of an issue is vital. Transformers can summarize customer support requests, helping service teams resolve problems more efficiently. This leads to faster response times and improved customer satisfaction.

Transformers are quite useful for text summarization since they provide a number of important benefits.

  • Contextual Understanding: To comprehend the context of words, sentences, and documents, transformers make use of attention mechanisms. Accurately determining the most significant information within a text document depends on this. Transformers’ self-attention mechanism enables them to concentrate on various textual elements and comprehend the connections between disparate sections. 
  • Large Language Models: Transformers have a deep grasp of linguistic relationships and patterns because they have been trained on enormous volumes of text. This extensive training helps them perform well on summarization tasks that call for a thorough command of language.
  • Scalability: Transformers are well suited to summarizing lengthy documents or large volumes of text because they can process large amounts of text in parallel. This parallel processing capacity speeds up summarization considerably.
  • End-to-End Training: By training transformers on text summarizing tasks from beginning to end, we can tailor their performance to the particular task at hand. Thus, they can acquire the ability to produce 
  • State-of-the-Art: Text summarization is just one of many natural language processing tasks on which Transformers have achieved state-of-the-art results. Their reputation for producing high-quality summaries has made them the preferred choice in many summarization applications.

Summary of the Coding Procedure

Let’s now examine the code!

The first step in putting these ideas into practice is to obtain the BBC news dataset. Its long articles make excellent candidates for summarization. We will go over each stage of preparing the data, creating summaries, and training a Transformer model.

A high-level summary of the coding procedure is as follows:

  • Download the Dataset: Access the BBC news dataset, which contains a number of long stories that can be summarized.
  • Preprocess the Data: Tokenize and eliminate any extraneous information from the text data in order to make it clean and ready for training.
  • Train the Model: To learn from the dataset, apply a Transformer model. For abstractive summarization, this entails configuring the encoder-decoder architecture; for extractive summarization, it requires feature extraction and sentence ranking.
  • Create Summaries: Use the model to create summaries for newly published articles after training, and assess the coherence and quality of the created summaries.
  • Evaluate and Improve: Using metrics like ROUGE scores, evaluate the summarization model’s performance and make necessary adjustments to improve it. 
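
To make the evaluation step more concrete, here is a minimal sketch of a simplified ROUGE-1 score, which measures unigram overlap between a reference summary and a generated one. The function name `rouge1` and the sample sentences are assumptions for this example; real evaluations usually rely on a dedicated ROUGE library that also handles stemming and other ROUGE variants.

```python
from collections import Counter


def rouge1(reference, candidate):
    """Simplified ROUGE-1: unigram overlap between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1(
    reference="the bank raised interest rates",
    candidate="the bank raised rates again",
)
print(scores)
```

Recall shows how much of the reference the summary covers, while precision shows how much of the summary is supported by the reference; the F1 score balances the two.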

Let’s dive into the coding part and see how we can implement text summarization using Transformers with the BBC news dataset.

The command will download the file from the URL.

Steps to Summarize Text with Transformer-based Models

Let us now dive deeper into the steps that we need to follow to summarize text with a transformer-based model.

Step1: Install Transformers

!pip install transformers

Step2: Importing the pipeline Module from the transformers Library

from transformers import pipeline

Step3: Importing the textwrap Library

import textwrap

The textwrap library is a standard Python library used for text formatting. It provides functionalities to format and manipulate text, such as wrapping text to a certain width, indenting text, and filling text paragraphs. This is particularly useful when you need to display text in a more readable format, especially when working with long strings of text data.

Step4: Importing the numpy Library

import numpy as np

numpy is a fundamental package for numerical computing in Python. It provides support for arrays, matrices, and many mathematical functions to operate on these data structures. In the context of NLP and data manipulation, numpy is often used to handle numerical operations, create arrays for data processing, and perform statistical analysis.

Step5: Importing the pandas Library

import pandas as pd

Step6: Importing the pprint Function from the pprint Library

from pprint import pprint

The pprint module stands for “pretty-print” and is used to display data structures in a more readable and organized way. This is particularly helpful when you need to print large dictionaries or nested data structures in a human-readable format.

Step7: Loading the Dataset into a DataFrame

After importing the necessary libraries, the next step is to load the dataset into a pandas DataFrame. Here’s how you can do it:

df = pd.read_csv('bbc_text_cls.csv?dl=0')

Step8: Display the first few rows of the DataFrame to ensure it loaded correctly


In this section of the code:

The pd.read_csv() function from the pandas library reads the dataset from the specified file path and loads it into a DataFrame, parsing the CSV contents into a structured, tabular format.

We use the df.head() method to display the first few rows of the DataFrame. This is a quick way to verify that the dataset has been loaded correctly. The pprint function is used here to print the DataFrame in a more readable format.
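
The call for this step is not shown above; going by the description, it is presumably df.head(). Here is a minimal sketch that uses a small in-memory CSV as a stand-in for the BBC file. The two sample rows are invented for illustration, and the text and labels column names are taken from the later steps.

```python
import io

import pandas as pd

# Stand-in for the downloaded CSV file; the rows are made up for illustration.
csv_data = io.StringIO(
    'text,labels\n'
    '"Bank lifts rates\n\nThe central bank raised rates on Monday.",business\n'
    '"Film wins award\n\nThe drama took the top prize.",entertainment\n'
)

# pd.read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(csv_data)

# head() returns the first five rows by default (fewer if the frame is shorter).
print(df.head())
```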

Step9: Selecting a Business News Article from the DataFrame

doc = df[df.labels == 'business']['text'].sample(random_state=42)
  • DataFrame Filtering: df[df.labels == 'business'] filters the DataFrame to include only the rows where the 'labels' column is equal to 'business'.
  • Selecting the 'text' Column: ['text'] extracts the 'text' column from the filtered DataFrame.
  • Random Sampling: .sample(random_state=42) randomly selects one row from the ‘text’ column. Setting the random_state=42 parameter ensures reproducible sampling, meaning we will select the same row each time we run the code with this seed value.
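
The effect of random_state can be checked with a small sketch. The toy DataFrame below stands in for the BBC data, and its rows are invented for illustration.

```python
import pandas as pd

# Toy data standing in for the BBC DataFrame (rows invented for illustration).
df = pd.DataFrame(
    {
        "text": ["Rates rise", "Shares fall", "Profits up", "Film wins award"],
        "labels": ["business", "business", "business", "entertainment"],
    }
)

# Keep only the business rows, then take their text column.
business = df[df.labels == "business"]["text"]

# With the same seed, sample() picks the same row on every run.
first = business.sample(random_state=42)
second = business.sample(random_state=42)

print(first.iloc[0])
print(first.equals(second))
```

Because sample() returns one row by default, the result is a Series with a single element, which is why the later steps read it with iloc[0].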

Step10: Defining the Text Wrapping Function

def wrap(x):
  return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True)
  • Function Definition: def wrap(x): defines a function named wrap that takes a single parameter x.
  • Text Wrapping with textwrap.fill: return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True) calls the textwrap.fill function on x with specific parameters to format the text.
  • replace_whitespace Parameter: We set this boolean parameter to False, so whitespace characters in the input string x, such as the newlines between paragraphs, are kept as they are rather than each being replaced with a single space.
  • fix_sentence_endings Parameter: We set this boolean parameter to True, so the function tries to detect sentence endings and makes sure that sentences are separated by exactly two spaces.

The wrap function inserts line breaks into the input string x, ensuring each line is no longer than a specified number of characters (default is 70), and returns the modified version.
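
To see what these two parameters do in practice, here is a small check that uses only the standard library. The sample strings are made up for illustration.

```python
import textwrap


def wrap(x):
    return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True)


sample = "Headline here\n\nFirst sentence. Second sentence."

# Default settings: each newline is replaced by a space.
print(repr(textwrap.fill(sample)))

# Settings used in wrap(): the newlines survive, and the single space after
# "First sentence." is widened to two spaces.
print(repr(wrap(sample)))

# Long text is still broken into lines of at most 70 characters.
long_text = "Transformers summarize long articles quickly. " * 5
print(wrap(long_text))
```

Keeping the newlines is what preserves the blank line between the headline and the body when an article is printed.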

Step11: Printing the Wrapped News Article

  • To access the selected article, we use doc.iloc[0] to retrieve the first (and in this case, the only) element from the doc Series. We use iloc to access elements by their integer-location based index.
  • Applying the wrap Function: wrap(doc.iloc[0]) calls the wrap function with the selected article text as its argument. This formats the text according to the specified wrapping rules.
  • Printing the Formatted Text: print(wrap(doc.iloc[0])) prints the wrapped text, making it more readable by ensuring that each line does not exceed the maximum width.

Step12: Creating the Summarization Pipeline

summarizer = pipeline('summarization')

This line creates a summarization pipeline using the pipeline function from the transformers library. The argument 'summarization' specifies the task we will use the pipeline for.

By default, the pipeline uses the distilbart-cnn-12-6 model for abstractive summarization.

Step13: Selecting an Article and Generating a Summary

doc = df[df.labels == 'business']['text'].sample(random_state=42)


The first line randomly selects an article from the ‘business’ category in the DataFrame df.

The second line applies the summarization pipeline to the selected article. We split the article text into two parts using the split method with '\n' as the separator. We then pass the second part, representing the main body of the article, to the summarization pipeline.

The summarization pipeline generates a condensed summary of the article.
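
The exact call is not shown above; a sketch consistent with the description might look like the following. The helper names article_body, summarize_article, and load_summarizer are assumptions for this example. The transformers import sits inside load_summarizer so that the other helpers can be tried with any callable that returns pipeline-style output.

```python
def article_body(article):
    """Split off the first line (the headline) and return the rest."""
    parts = article.split("\n", 1)
    # Fall back to the full text if the article has no line break.
    return parts[1] if len(parts) == 2 else article


def summarize_article(summarizer, article):
    """Run a summarization pipeline on the article body and return the summary text."""
    result = summarizer(article_body(article))
    # The summarization pipeline returns a list with one dict per input text.
    return result[0]["summary_text"]


def load_summarizer():
    """Create the default summarization pipeline (downloads the model on first use)."""
    from transformers import pipeline

    return pipeline("summarization")
```

With the real model, this would be used as summarize_article(load_summarizer(), doc.iloc[0]). Very long articles can exceed the model's input limit, so it may be necessary to shorten or truncate the text first.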

Step14: Printing the Summarized Text


This line prints the summarized text generated by the summarization pipeline.
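
The print call itself is not shown above. Assuming the pipeline output was stored in a variable named summary (a hypothetical name), a sketch of this step could be the following; the summary text here is a made-up stand-in, not real model output.

```python
import textwrap


def wrap(x):
    return textwrap.fill(x, replace_whitespace=False, fix_sentence_endings=True)


# Stand-in for the pipeline output: a list with one dict per input text.
summary = [
    {
        "summary_text": "The central bank raised interest rates on Monday. "
        "Analysts had expected the move after months of high inflation."
    }
]

# Pull out the generated text and wrap it for display.
print(wrap(summary[0]["summary_text"]))
```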

Step15: Repeating the Process for Another Article

doc = df[df.labels == 'entertainment']['text'].sample(random_state=50)


These lines select and summarize an article from the ‘entertainment’ category in a similar manner as above.
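
Since the same steps repeat for each category, they can be collected into one helper. The function name summarize_sample and its parameters are assumptions for this sketch; the summarizer argument is expected to be the pipeline created in Step12 or any callable with the same output format.

```python
import pandas as pd


def summarize_sample(df, label, seed, summarizer):
    """Sample one article with the given label and return its summary text."""
    doc = df[df.labels == label]["text"].sample(random_state=seed)

    # Drop the headline (first line) when the article has one.
    parts = doc.iloc[0].split("\n", 1)
    body = parts[1] if len(parts) == 2 else parts[0]

    # The pipeline returns a list with one dict per input text.
    return summarizer(body)[0]["summary_text"]
```

Calling summarize_sample(df, 'entertainment', 50, summarizer) would then correspond to this step, and changing the label or seed selects a different article.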


Transformer-powered text summarization marks a substantial development in natural language processing, making it possible to extract crucial information from massive amounts of text accurately and efficiently. The adaptability of Transformers across extractive and abstractive methods has opened up new applications in content analysis, news aggregation, and information retrieval, among other fields. By using Python libraries such as `pandas` and `transformers`, organizations can improve decision-making, streamline information processing workflows, and extract new insights from textual data. We expect the influence of Transformers in this area to grow as deep learning and NLP advance, offering interesting directions for further study.

Frequently Asked Questions

Q1. What is text summarization?

A. Text summarization is the process of condensing a large text document into a shorter version while preserving its key information and meaning.

Q2. What are Transformers in the context of text summarization?

A. Transformers are advanced deep learning models that have demonstrated strong performance in various natural language processing tasks, including text summarization. They use attention mechanisms to understand the context of words, sentences, and documents, making them well suited to summarization tasks.

Q3. What are the two main approaches to text summarization using Transformers?

A. The two main approaches are extractive summarization and abstractive summarization. Extractive summarization involves selecting and combining important sentences or phrases from the original text, while abstractive summarization generates new sentences to convey the main ideas of the text.

Q4. What are some common applications of text summarization?

A. Text summarization has various applications, including news aggregation, content analysis, information retrieval, document management, meeting minutes, customer feedback analysis, legal contract summarization, and customer service optimization.

Q5. Why are Transformers preferred for text summarization tasks?

A. Transformers are preferred for text summarization because they understand context, are trained extensively on large datasets, scale effectively, allow for end-to-end training, and consistently deliver state-of-the-art results.

Q6. How can I implement text summarization with Transformers in Python?

A. You can implement text summarization with Transformers by using libraries such as transformers and pandas in Python. These libraries provide high-level APIs for loading pre-trained models, preprocessing data, training summarization models, and generating summaries.


Responses From Readers


Martin Dougiamas 30 May, 2024

This is clearly AI-written, completely unnecessary and overly complex in 2024. You can summarise any text by simply prompting any LLM with “summarise the following text with [insert your specific requirements here if you have any]”