Aman Preet Gulati — June 22, 2021

This article was published as a part of the Data Science Blogathon

What is OCR?

OCR stands for Optical Character Recognition, a technology that has become central to the digital world. OCR is a complete process in which digital images or documents are analysed and the text they contain is extracted as normal, editable text.

Purpose of OCR

OCR is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera into editable and searchable data.

What is EasyOCR?

EasyOCR is a Python package that uses PyTorch as its backend. Like any other OCR tool (Google's Tesseract, for example), EasyOCR detects text in images, but in my experience it is the most straightforward way to do so, and having a high-end deep learning library (PyTorch) behind it makes its accuracy more credible. EasyOCR supports 42+ languages for detection. It was created by the company Jaided AI.

 

EasyOCR What
Image Source: Github

Table of contents

  1. Install core dependencies
  2. Importing libraries
  3. Reading images
    • Through URL
    • Locally
  4. Extracting text from the image
    • With GPU
    • Without GPU
    • English text
    • Turkish text and other languages
  5. Drawing results on images
    • Example 1
    • Example 2
    • Dealing with multiple lines of text

1. Install core dependencies

  • Pytorch

Installing PyTorch as a complete package can be a little tricky, so I recommend going through the official PyTorch site. When you open it, you will see the interface shown in the image below.

Install core dependencies easyocr
Image Source: PyTorch

If you look closely at the above image, you will find numerous options to choose from, and the site produces the install command that matches your choices.

Let me show you what I mean!

Install core dependencies pytorch
Image Source: PyTorch

In the above representation, I have chosen Package: pip and Compute platform: CPU, and based on those choices I got the command pip install torch torchvision torchaudio. From here it is a piece of cake: simply run this command in your command prompt and the PyTorch library will be installed.

  • EasyOCR

After installing PyTorch successfully, installing EasyOCR is easy. Just run the following command:

pip3 install easyocr

Your command prompt will then look like this:

command prompt easyocr
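
Before moving on, it can help to confirm that both packages are importable from the Python environment you plan to use. The following is a minimal sketch that relies only on the standard library; the helper name check_packages is my own and is not part of PyTorch or EasyOCR.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return {package_name: True/False} depending on whether it can be found."""
    return {name: find_spec(name) is not None for name in names}


# "torch" and "easyocr" are the import names of the two packages installed above.
for name, found in check_packages(["torch", "easyocr"]).items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

If either package is reported as missing, re-run the matching install command in the same environment.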

2. Importing Libraries

import os
import easyocr
import cv2
from matplotlib import pyplot as plt
import numpy as np

3. Reading images

  • Taking an online image: Here we will take an image from a URL (online)
IMAGE_PATH = 'https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'

In the above code snippet, one can notice that the IMAGE_PATH holds the URL of the image.

  • Taking image as input locally: Here we will take an image from the local system.
IMAGE_PATH = 'Perform-OCR.jpg'

In the above code snippet, one can notice that I have taken the image locally i.e. from the local system.
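
One practical detail: cv2.imread() only reads local files, so an IMAGE_PATH that holds a URL cannot be passed to it directly when drawing results later. The sketch below, using only the standard library, shows one possible way to tell the two cases apart and download a URL to a local file first. The helper names is_url and to_local_file are hypothetical, not part of EasyOCR or OpenCV.

```python
import os
import urllib.request
from urllib.parse import urlparse


def is_url(path):
    """True if path looks like an http(s) URL rather than a local file path."""
    return urlparse(path).scheme in ("http", "https")


def to_local_file(path, download_dir="."):
    """Return a local file path, downloading the image first if path is a URL."""
    if not is_url(path):
        return path
    # Use the last part of the URL path as the file name.
    filename = os.path.basename(urlparse(path).path) or "downloaded_image"
    local_path = os.path.join(download_dir, filename)
    urllib.request.urlretrieve(path, local_path)  # needs network access
    return local_path


print(is_url('https://blog.aspose.com/wp-content/uploads/sites/2/2020/05/Perform-OCR-using-C.jpg'))
print(is_url('Perform-OCR.jpg'))
```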

4. Extracting text from the image

  • English text detection
reader = easyocr.Reader(['en'])
# paragraph=True merges neighbouring text boxes; each entry is then [box, text]
result = reader.readtext(IMAGE_PATH, paragraph=True)
result

Output:

[[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]

For reference, here is the image used.

Extracting text from the image EasyOCR
Image Source: LaptrinhX

Now, finally, we have extracted the text from the given image.

Let’s break down the code line by line:

  1. Here we use the Reader class from the easyocr module and pass ['en'] as the language list, which means it will only recognise English text in the image. It will not read text in other languages such as Chinese or Japanese.
  2. With the language set, we pass IMAGE_PATH to the readtext() function along with the paragraph parameter. paragraph expects a Boolean: when it is true, EasyOCR combines neighbouring pieces of text into one entry and returns only the box and the text. Pass the Boolean False (not the string "False", which Python treats as true) to get every piece of text separately with a confidence score, as in the later examples.
  3. The result is a nested Python list with one [bounding box, text] entry per detection.
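
The outputs in this article come in two shapes: [box, text] pairs like the one above, and (box, text, confidence) tuples in the GPU and CPU examples later on. A minimal sketch for handling both uniformly could look like this; the helper name normalize is my own and is not part of EasyOCR.

```python
def normalize(result):
    """Convert EasyOCR output into a list of dicts with box, text and confidence."""
    rows = []
    for item in result:
        box, text = item[0], item[1]
        # Entries without a third element carry no confidence score.
        confidence = item[2] if len(item) > 2 else None
        rows.append({"box": box, "text": text, "confidence": confidence})
    return rows


# Sample values copied from the outputs shown in this article.
pair_style = [[[[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR']]
tuple_style = [([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)]

print(normalize(pair_style)[0]["confidence"])   # None
print(normalize(tuple_style)[0]["confidence"])  # 0.990493426051807
```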
  • Turkish text detection
# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
# Same code as before, only the language list changes from ['en'] to ['tr']
reader = easyocr.Reader(['tr'])
# paragraph=True merges neighbouring boxes, which matches the output below
result = reader.readtext(IMAGE_PATH, paragraph=True)
result

Output:

[[[[89, 7], [717, 7], [717, 108], [89, 108]],
  'Most Common Texting Slang in Turkish'],
 [[[392, 234], [446, 234], [446, 260], [392, 260]], 'test'],
 [[[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak'],
 [[[394, 380], [446, 380], [446, 410], [394, 410]], 'link'],
 [[[351, 409], [489, 409], [489, 453], [351, 453]], 'bağlantı'],
 [[[373, 525], [469, 525], [469, 595], [373, 595]], 'tag etiket'],
 [[[353, 674], [483, 674], [483, 748], [353, 748]], 'follov takip et']]

For reference, here is the image on which I ran the Turkish text detection:

turkish class 101
Image Source: TurkishClass101

Fact

EasyOCR currently supports 42 languages. I have listed all of them below with their codes. Have fun with it, guys!

Afrikaans (af), Azerbaijani (az), Bosnian (bs), Czech (cs), Welsh (cy), Danish (da), German (de), English (en), Spanish (es), Estonian (et), French (fr), Irish (ga), Croatian (hr), Hungarian (hu), Indonesian (id), Icelandic (is), Italian (it), Japanese (ja), Korean (ko), Kurdish (ku), Latin (la), Lithuanian (lt), Latvian (lv), Maori (mi), Malay (ms), Maltese (mt), Dutch (nl), Norwegian (no), Polish (pl), Portuguese (pt), Romanian (ro), Slovak (sk), Slovenian (sl), Albanian (sq), Swedish (sv), Swahili (sw), Thai (th), Tagalog (tl), Turkish (tr), Uzbek (uz), Vietnamese (vi), Chinese (zh) (Source: JaidedAI)
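
A small sketch of how one might sanity-check language codes before creating a reader. The set below is copied from the list in this article only; the library's own documentation is the authority on which codes the installed version accepts, and they may differ from this list. The names ARTICLE_LANGUAGE_CODES and unknown_codes are hypothetical.

```python
# Language codes exactly as listed in this article (not queried from the library).
ARTICLE_LANGUAGE_CODES = set(
    "af az bs cs cy da de en es et fr ga hr hu id is it ja ko ku la lt lv "
    "mi ms mt nl no pl pt ro sk sl sq sv sw th tl tr uz vi zh".split()
)


def unknown_codes(codes):
    """Return the codes that do not appear in the article's list, in input order."""
    return [code for code in codes if code not in ARTICLE_LANGUAGE_CODES]


print(len(ARTICLE_LANGUAGE_CODES))  # 42
print(unknown_codes(['en', 'tr']))  # []
print(unknown_codes(['en', 'xx']))  # ['xx']
```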

EasyOCR provides enough flexibility to choose Text detection with GPU or without.

  • Extracting text from image with GPU

# Changing the image path
IMAGE_PATH = 'Turkish_text.png'
reader = easyocr.Reader(['en'])
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[89, 7], [717, 7], [717, 75], [89, 75]],
  'Most Common Texting Slang',
  0.8411301022318493),
 ([[296, 60], [504, 60], [504, 108], [296, 108]],
  'in Turkish',
  0.9992136162168752),
 ([[392, 234], [446, 234], [446, 260], [392, 260]], 'text', 0.955612246445849),
 ([[353, 263], [488, 263], [488, 308], [353, 308]],
  'yazmak',
  0.8339281200424168),
 ([[394, 380], [446, 380], [446, 410], [394, 410]],
  'link',
  0.8571656346321106),
 ([[351, 409], [489, 409], [489, 453], [351, 453]],
  'baglanti',
  0.9827189297769966),
 ([[393, 525], [446, 525], [446, 562], [393, 562]], 'tag', 0.999996145772132),
 ([[373, 559], [469, 559], [469, 595], [373, 595]],
  'etiket',
  0.9999972515293261),
 ([[378, 674], [460, 674], [460, 704], [378, 704]],
  'follow',
  0.9879666041306504),
 ([[353, 703], [483, 703], [483, 748], [353, 748]],
  'takip et',
  0.9987622244733467)]
  • Extracting text from image without GPU

# Changing the image path
IMAGE_PATH = 'Perform-OCR.jpg'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[95, 71], [153, 71], [153, 107], [95, 107]], 'OCR', 0.990493426051807)]
# Where 0.9904.. is the confidence level of detection
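
Since each detection carries a confidence value, a common next step is to drop low-confidence detections. The sketch below assumes the (box, text, confidence) tuple shape shown in the outputs above. The helper name filter_by_confidence and the 0.9 threshold are my own illustrative choices, not EasyOCR defaults.

```python
def filter_by_confidence(result, threshold=0.9):
    """Keep only (box, text, confidence) detections at or above threshold."""
    return [det for det in result if det[2] >= threshold]


# A few detections copied from the GPU output shown above.
sample = [
    ([[89, 7], [717, 7], [717, 75], [89, 75]], 'Most Common Texting Slang', 0.8411301022318493),
    ([[296, 60], [504, 60], [504, 108], [296, 108]], 'in Turkish', 0.9992136162168752),
    ([[353, 263], [488, 263], [488, 308], [353, 308]], 'yazmak', 0.8339281200424168),
]

kept = filter_by_confidence(sample, threshold=0.9)
print([text for _, text, _ in kept])  # ['in Turkish']
```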

Note: If you don’t have a GPU and do not set gpu=False, you will get the following warning:

GPU
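
One way to avoid hard-coding the gpu flag is to ask PyTorch whether CUDA is usable and pass the answer to the reader. This is a sketch under the assumption that PyTorch is the backend, as described earlier; the helper name cuda_available is my own.

```python
def cuda_available():
    """Return True only if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to use.
        return False
    return bool(torch.cuda.is_available())


use_gpu = cuda_available()
print(use_gpu)

# Possible usage with the reader from this article:
# reader = easyocr.Reader(['en'], gpu=use_gpu)
```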

5.1. Draw results for single-line text – Example 1

top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX

In the above code snippet:

  1. We collect the coordinates and the text needed to draw the bounding box and label on the image.
  2. top_left is the first corner of the first detection's box (result[0][0][0]), converted to a tuple. Similarly, bottom_right is the third corner (result[0][0][2]).
  3. text is the recognised string of that detection (result[0][1]).
  4. font is set to FONT_HERSHEY_SIMPLEX from the cv2 package.
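
OpenCV drawing functions expect integer pixel coordinates, and a detected box is not guaranteed to be axis-aligned. As a hedged alternative to indexing corners 0 and 2 directly, the sketch below derives the two corners from the minimum and maximum of all four points and casts them to int. The helper name box_corners is my own and is not part of EasyOCR or OpenCV.

```python
def box_corners(box):
    """Return (top_left, bottom_right) as tuples of ints for a 4-point box."""
    xs = [int(round(x)) for x, _ in box]
    ys = [int(round(y)) for _, y in box]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Box copied from the 'OCR' detection shown earlier in this article.
top_left, bottom_right = box_corners([[95, 71], [153, 71], [153, 107], [95, 107]])
print(top_left, bottom_right)  # (95, 71) (153, 107)
```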
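img = cv2.imread(IMAGE_PATH)  # OpenCV loads the image in BGR channel order
img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
img = cv2.putText(img, text, bottom_right, font, 0.5, (0, 255, 0), 2, cv2.LINE_AA)
plt.figure(figsize=(10, 10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # convert to RGB so Matplotlib shows true colours
plt.show()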

Now that we have the coordinates, let’s plot them!

  1. Read the image with the cv2 imread() function.
  2. Draw the rectangle from top_left to bottom_right in green ((0,255,0)) with a thickness of 3.
  3. Draw the text on the image. The code above anchors it at bottom_right; pass top_left instead to place the label just above the bounding box.
  4. Show the image.
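
cv2.putText() treats the point it receives as the bottom-left corner of the text, so a label anchored at a box corner near the top of the image can end up partly outside it. The following sketch shows one possible way to choose a label position just above the box and fall back to just inside it when there is no room. The helper name label_origin and the default margin and text_height values are my own assumptions.

```python
def label_origin(top_left, image_height, margin=5, text_height=15):
    """Pick the bottom-left point for a label near the top of a bounding box."""
    x, y = top_left
    y_label = y - margin  # default: a few pixels above the box
    if y_label - text_height < 0:
        # Not enough room above the box, so place the label just inside its top edge.
        y_label = min(y + text_height + margin, image_height - 1)
    return (x, y_label)


print(label_origin((95, 71), image_height=400))  # (95, 66)
print(label_origin((89, 7), image_height=800))   # (89, 27)
```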

Output:

OCR output

5.2. Draw Results for single-line text – Example 2

IMAGE_PATH = 'sign.png'
reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext(IMAGE_PATH)
result

Output:

[([[19, 181], [165, 181], [165, 201], [19, 201]],
  'HEAD PROTECTION',
  0.9778256296390029),
 ([[31, 201], [153, 201], [153, 219], [31, 219]],
  'MUST BE WORN',
  0.9719649866726915),
 ([[39, 219], [145, 219], [145, 237], [39, 237]],
  'ON THIS SITE',
  0.9683973478739152)]
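
When a sign is split across several detections like this, the separate texts can be joined back into one string. The sketch below sorts detections by the top-left corner of each box (top to bottom, then left to right) and joins the texts. The helper name lines_top_to_bottom is my own, and the ordering rule is a simple assumption that suits stacked lines like these.

```python
def lines_top_to_bottom(result):
    """Return detected texts ordered by the y, then x, of each box's first corner."""
    ordered = sorted(result, key=lambda det: (det[0][0][1], det[0][0][0]))
    return [det[1] for det in ordered]


# Detections copied from the output above.
sign = [
    ([[19, 181], [165, 181], [165, 201], [19, 201]], 'HEAD PROTECTION', 0.9778256296390029),
    ([[31, 201], [153, 201], [153, 219], [31, 219]], 'MUST BE WORN', 0.9719649866726915),
    ([[39, 219], [145, 219], [145, 237], [39, 237]], 'ON THIS SITE', 0.9683973478739152),
]

print(" ".join(lines_top_to_bottom(sign)))  # HEAD PROTECTION MUST BE WORN ON THIS SITE
```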

Getting the coordinates

top_left = tuple(result[0][0][0])
bottom_right = tuple(result[0][0][2])
text = result[0][1]
font = cv2.FONT_HERSHEY_SIMPLEX

Drawing text and bounding boxes

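img = cv2.imread(IMAGE_PATH)  # OpenCV loads the image in BGR channel order
img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
# (0, 0, 255) is red in BGR; the label is anchored at the top-left corner of the box
img = cv2.putText(img, text, top_left, font, 0.5, (0, 0, 255), 2, cv2.LINE_AA)
plt.figure(figsize=(10, 10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # convert to RGB so Matplotlib shows true colours
plt.show()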

Output:

 

Draw Results for single-line text

But hold on! What if we want to draw all the detected text on the image itself?

That’s what I’ll do in this section!

5.3. Draw results for multiple lines

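img = cv2.imread(IMAGE_PATH)
spacer = 100  # y-coordinate of the first text label
for detection in result:
    top_left = tuple(detection[0][0])
    bottom_right = tuple(detection[0][2])
    text = detection[1]
    img = cv2.rectangle(img, top_left, bottom_right, (0, 255, 0), 3)
    # Write each detected text in a column on the left, one line per detection
    img = cv2.putText(img, text, (20, spacer), font, 0.5, (0, 255, 0), 2, cv2.LINE_AA)
    spacer += 15  # move the next label 15 pixels down
plt.figure(figsize=(10, 10))
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))  # OpenCV uses BGR, Matplotlib expects RGB
plt.show()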

In the above code snippet, we just need to focus on a few points:

  1. Instead of handling a single detection, we loop through all the detections because we want to plot multiple lines of text.
  2. The coordinates passed to cv2.putText use an extra variable, spacer, which is incremented by 15 on every iteration so that the text labels do not overlap.
  3. The spacer variable also keeps the labels in order and equally spaced.
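
The effect of the spacer can be seen by computing the label positions on their own, without any image. This is a minimal sketch: the defaults mirror the values used in the loop above (x of 20, starting y of 100, step of 15), while the helper name label_positions is my own.

```python
def label_positions(count, x=20, start_y=100, step=15):
    """Return the (x, y) point used for each label when y grows by step per line."""
    return [(x, start_y + i * step) for i in range(count)]


# Three detections, as in the sign example.
print(label_positions(3))  # [(20, 100), (20, 115), (20, 130)]
```

A larger step may be needed if the font scale passed to cv2.putText is increased, since taller text needs more vertical room.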

Output:

 

Draw results for multiple lines

The conclusion of the model also concludes my discussion for today 🙂

Endnotes

Thank you for reading my article 🙂

I hope you have enjoyed the practical implementation and line-by-line explanation in this EasyOCR hands-on guide.

I’m providing the code link here so that you can also learn from and contribute to this project to make it even better.

Don’t miss my previous article, “PAN card fraud detection”, published on Analytics Vidhya as a part of the Data Science Blogathon-9. Refer to this link.

Also see “Drug discovery using machine learning”. Refer to this link.

If you have any queries, you can connect with me on LinkedIn. Refer to this link.

About me

Greetings to everyone! I’m currently working as a Data Science Associate Analyst at Zorba Consulting India. Along with part-time work, I have an immense interest in Data Science and in other subsets of Artificial Intelligence such as Computer Vision, Machine Learning, and Deep Learning. Feel free to collaborate with me on any project in the above-mentioned domains (LinkedIn).

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

