A Basic Introduction to OpenCV in Deep Learning

Pranshu Sharma 12 Mar, 2024 • 8 min read

Introduction

OpenCV is a large open-source library for computer vision, machine learning, and image processing, and it plays a critical role in real-time operations, which are fundamental to today’s systems. It is used to detect objects, faces, diseases, lesions, number plates, and even handwriting in images and videos. Combined with deep learning, it lets us represent images as vectors of features and apply mathematical operations to those features to identify visual patterns.

OpenCV in Deep Learning — Source: Credera.com

This article was published as a part of the Data Science Blogathon.

What is Computer Vision?

Computer vision is the field concerned with how images and videos are stored and with manipulating and extracting information from them. Many artificial intelligence applications depend on it: self-driving cars, robotics, and photo-editing apps all rely heavily on computer vision.

Computer vision resembles human vision. Humans learn from experience to distinguish objects, judge the distance between them, and estimate their relative positions.

With cameras, data, and algorithms, computer vision trains machines to accomplish these jobs in much less time.

Computer vision allows computers and systems to extract useful data from digital images and video inputs.

Installing and Importing the OpenCV Image Preprocessing Package

OpenCV is an important part of many machine learning pipelines. It is an open-source library (package) for computer vision, machine learning, and image processing. The prebuilt opencv-python package runs on the CPU. OpenCV works with many programming languages, including Python. It can be installed with a single command, as shown below:

pip install opencv-python

A package in Python is a collection of modules that contain pre-written code. You can import a package as a whole or import its modules separately. Importing OpenCV is as simple as importing the “cv2” module, as seen below:

import cv2

Reading an Input Image

Colour, grayscale, binary, and multispectral images are all examples of digital images. In a colour image, each pixel carries its own colour information. Binary images have only two colours, usually black and white, and grayscale images contain only shades of grey. Multispectral images capture image data within specific wavelength ranges across the electromagnetic spectrum.

To read the image, we use the “imread” method from the cv2 package, where the first parameter is the image’s path, including filename and extension, and the second parameter is a flag that determines how to read in the image.

By changing the path, you can read any image stored on your local computer. If the image is already in your current working directory, you only need to specify the file name and extension. Set the second parameter to 0 to read the image as grayscale, -1 to read it unmodified (keeping the alpha, or transparency, channel if it exists), and 1 to read it as a colour image.

OpenCV Functions to Start your Computer Vision journey

Reading the picture that will be used as input:

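import cv2
# Read the image from disk with cv2.imread
img = cv2.imread("pythonlogo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
cv2.imshow("Python Logo", img)
# Keep the window open until a key is pressed, then close it
cv2.waitKey(0)
cv2.destroyAllWindows()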

Output:

Image Data Type

To discover the image’s data type, use the “dtype” attribute. It tells us how the visual data is represented and what range of values each pixel can take.

An image loaded with OpenCV is a NumPy array: a multidimensional container for items of the same type and size.

Pixel values for the image

An image can be thought of as a collection of small samples, which are referred to as pixels. To get a better understanding of an image, try zooming in as much as possible: the image breaks up into many small squares. These are pixels, and when all of them are combined, they form an image. One of the simplest ways to represent an image is as a matrix.
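To make the matrix view concrete, here is a minimal sketch using a made-up 3x3 grayscale image. The pixel values are arbitrary, and repeating each pixel is only a simple stand-in for zooming in.

```python
import numpy as np

# A hypothetical 3x3 grayscale image: each entry is one pixel's intensity.
image = np.array(
    [[0, 128, 255],
     [64, 192, 32],
     [255, 0, 16]],
    dtype=np.uint8,
)

# Pixels are addressed as image[row, column].
top_right = image[0, 2]
print(top_right)

# Repeating every pixel 2x2 mimics zooming in: each pixel becomes a visible square.
zoomed = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
print(zoomed.shape)
```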

Code:

print("The data type of the image is", img.dtype)

Output:
The data type of the image is uint8
uint8 means that each pixel value is an unsigned 8-bit integer. This data type ranges from 0 to 255.
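The 0 to 255 range matters in practice because arithmetic on uint8 arrays cannot leave it. The sketch below uses plain NumPy and one made-up pixel value to show the range and what happens when a result would exceed it. Clipping in a wider type is one possible way to handle this, not the only one.

```python
import numpy as np

# The representable range of an unsigned 8-bit integer.
info = np.iinfo(np.uint8)
print(info.min, info.max)

# A hypothetical bright pixel, and an amount we want to add to it.
pixel = np.array([250], dtype=np.uint8)
increment = np.array([10], dtype=np.uint8)

# Plain uint8 addition wraps around modulo 256.
wrapped = pixel + increment
print(wrapped)

# Doing the arithmetic in a wider type and clipping keeps the value at the maximum.
clipped = np.clip(pixel.astype(np.int16) + 10, 0, 255).astype(np.uint8)
print(clipped)
```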

Image Resolution

Image resolution is defined as the number of pixels in an image. As the number of pixels rises, the image can generally show more detail. The image’s shape gives the number of rows and columns. Common image resolutions:

320 x 240 pixels (mostly suitable for small-screen devices)
1024 x 768 pixels (appropriate for viewing on standard computer monitors)
720 x 576 pixels (good for viewing on standard-definition TV sets with a 4:3 aspect ratio)
1280 x 720 pixels (for viewing on widescreen monitors)
1280 x 1024 pixels (for viewing on full-screen monitors)
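As a quick worked example, the total pixel count for each resolution listed above is simply width times height. The dictionary name below is chosen only for this sketch.

```python
# Resolutions mentioned above, as (width, height) in pixels.
resolutions = {
    "320 x 240": (320, 240),
    "1024 x 768": (1024, 768),
    "720 x 576": (720, 576),
    "1280 x 720": (1280, 720),
    "1280 x 1024": (1280, 1024),
}

# Total number of pixels = width * height.
pixel_counts = {label: w * h for label, (w, h) in resolutions.items()}

for label, count in pixel_counts.items():
    print(f"{label}: {count:,} pixels")
```

Note that a NumPy image of width W and height H has shape (H, W), because rows come first.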


Image Pixel Values

An image can be thought of as a collection of small samples, called pixels. To see them, zoom in on a picture as far as possible: it breaks up into many small squares. These are the pixels that, when combined, make up the image.

The level of detail an image can show generally increases with the number of pixels. The image’s shape determines the number of rows and columns.

Viewing the Images

Let’s look at how to display the image in a window. To do so, we create a graphical user interface (GUI) window with the cv2.imshow() method. Its first parameter is the window title, which must be a string, and its second parameter is the image. On its own, however, imshow may show the window only briefly or leave it unresponsive, because OpenCV still needs to process GUI events. We use the “waitKey” method to handle this.

Passing 0 to “waitKey” keeps the window open until a key is pressed. (You can pass a time in milliseconds instead of 0, indicating how long to wait.)

import cv2
# Read the image from disk with cv2.imread
img = cv2.imread("python logo.png", cv2.IMREAD_COLOR)
# Create a GUI window to display the image on screen
# First parameter is the window title (must be a string)
# Second parameter is the image array
cv2.imshow("The Logo", img)
# cv2.waitKey holds the window on screen;
# if 0 is passed as the parameter, it waits
# until the user presses a key.
cv2.waitKey(0)
# Remove the created GUI windows from the screen
# and from memory
cv2.destroyAllWindows()

Output:

Output: GUI Window, Source: Author

Extracting and Reconstructing Image Bit Planes

An image can be divided into several levels of bit planes. An 8-bit image splits into eight planes (0 to 7), and the highest-order planes (the most significant bits) contain most of the visually significant information.
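The idea can be sketched with NumPy bit operations. The example below is an illustration on a small random image (hypothetical data), and the helper name `reconstruct` is introduced only for this sketch. It extracts all eight planes, rebuilds the image from them, and then rebuilds it again from only the four highest planes.

```python
import numpy as np

# Hypothetical 4x4 8-bit image filled with random intensities.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)

# Plane k holds bit k of every pixel (0 = least significant, 7 = most significant).
planes = [(image >> np.uint8(k)) & np.uint8(1) for k in range(8)]


def reconstruct(planes, keep):
    """Rebuild an 8-bit image from the bit planes whose indices are in `keep`."""
    out = np.zeros_like(planes[0], dtype=np.uint8)
    for k in keep:
        out |= planes[k].astype(np.uint8) << np.uint8(k)
    return out


# Using all eight planes recovers the original image exactly.
full = reconstruct(planes, range(8))
print(np.array_equal(full, image))

# Using only planes 4 to 7 drops the four low bits of every pixel.
top4 = reconstruct(planes, range(4, 8))
max_error = int(np.max(image.astype(int) - top4.astype(int)))
print(max_error)
```

Dropping the four lowest planes changes each pixel by at most 15 out of 255, which is why the high-order planes are said to carry most of the image.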

Image Operations Using OpenCV and Python

Checking Properties of the Input Image

Input Image:

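import cv2
import numpy as np
import matplotlib.pyplot as plt
# Read the image into a NumPy array and display it
img = plt.imread("my pic.jpg")
plt.imshow(img)
plt.show()
# Shape: (rows, columns, channels)
print(img.shape)
# Size: total number of values in the array
print(img.size)
# Data type of each value
print(img.dtype)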

Output:

(1921, 1921, 3)
11070723
uint8
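The three values in this output are related: the size is the product of the entries in the shape. The sketch below checks this on a zero-filled array with the same shape as the output above, used here as a stand-in for the actual photo.

```python
import numpy as np

# Stand-in array with the shape reported above (not the real image data).
img = np.zeros((1921, 1921, 3), dtype=np.uint8)

rows, cols, channels = img.shape
total_values = rows * cols * channels

print(total_values)
print(img.size == total_values)

# With uint8, each value takes one byte, so the memory use equals the size.
print(img.nbytes)
```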

Basic Image Processing

Input Image:

import matplotlib.pyplot as plt
import cv2
import numpy as np
image = cv2.imread("baby yoda.jpg")
# cv2.imshow("Example - Show image in window", image)
# OpenCV loads images in BGR order; convert to RGB for Matplotlib
img2 = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
plt.imshow(img2)
plt.show()

Output:

Dilation and Erosion of the Input Image

Input Image:

import cv2
import numpy as np
import matplotlib.pyplot as plt
img = plt.imread("baby yoda.jpg")
# Use a 5x5 matrix of ones as the kernel
kernel = np.ones((5, 5), np.uint8)
# The first parameter is the original image,
# the kernel is the structuring element applied to the image,
# and the third parameter is the number of iterations, which determines how much
# you want to erode/dilate the image.
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)
# Show the original, eroded, and dilated images side by side
fig, axes = plt.subplots(1, 3)
axes[0].imshow(img)
axes[1].imshow(img_erosion)
axes[2].imshow(img_dilation)
plt.show()

Output:

Normal image — Source: PopularMechanic.com

OpenCV Applications

Face recognition
Counting the number of people (foot traffic in a mall, for example)
Counting the number of automobiles on motorways and measuring their speeds
Interaction-based art installations
Detecting anomalies (defects) during the production process (the odd defective product)
Street view image stitching
Video/image search and retrieval
Robot and autonomous car navigation and control
Object recognition
Medical image analysis
Movies – 3D structure from motion

Functionality of OpenCV

I/O, processing and display of images and videos
Detection of objects and features
Computer vision based on geometry
Computational photography

Conclusion

In this article, we covered a basic introduction to the OpenCV library and its applications in real-time scenarios. We also covered key terminology and the field in which OpenCV is deployed (computer vision), and implemented Python code for some basic image operations (dilation, erosion, and changing image colours) with the help of the OpenCV library. Beyond these examples, OpenCV finds application in a variety of industries.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Frequently Asked Questions

Q1. What is OpenCV and what are its main applications?

A. OpenCV stands for Open Source Computer Vision. It is a vast open-source library utilized in fields such as computer vision, machine learning, and image processing. Its applications include object detection, facial recognition, medical image analysis, and more.

Q2. How does OpenCV contribute to real-time operations?

A. OpenCV plays a critical role in real-time systems by providing algorithms and tools for processing images and videos swiftly. It enables tasks such as object detection, face recognition, and handwriting recognition in real-time scenarios.

Q3. How does Computer Vision relate to human vision?

A. Computer vision mimics human vision by interpreting visual data from images and videos. Similar to how humans learn from experiences to recognize objects and estimate distances, computer vision uses algorithms to analyze visual data and extract useful information.

Q4. What programming languages can be used with OpenCV?

A. OpenCV is compatible with various programming languages, including Python, C++, and Java. However, Python is widely used due to its simplicity and ease of integration with other libraries.

Q5. What are some key features of OpenCV’s Python image processing capabilities?

A. OpenCV provides functionalities for reading and manipulating images, including reading different image types (color, grayscale, binary), extracting pixel values, viewing images in graphical user interfaces, and performing basic image processing operations like dilation and erosion.