A Basic Introduction to OpenCV in Deep Learning
This article was published as a part of the Data Science Blogathon.
OpenCV is a large open-source library for computer vision, machine learning, and image processing, and it plays a critical role in the real-time operations that today’s systems depend on. It is used to detect objects, faces, diseases, lesions, number plates, and even handwriting in images and videos. With OpenCV in deep learning, we represent images as vectors of features and perform mathematical operations on them to identify visual patterns.
Table of Contents
- What is Computer Vision?
- Installing and Importing the OpenCV Image Preprocessing Package
- Reading an Input Image
- Image Data Type
- Image Resolution
- Image Pixel Values
- Viewing the Images
- Image Operations Using OpenCV and Python
- OpenCV Applications
- Functionality of OpenCV
What is Computer Vision?
Computer vision is the field concerned with understanding how images and videos are stored, and with manipulating and extracting information from them. Many artificial intelligence applications are built on computer vision: self-driving cars, robotics, and photo-editing apps all rely heavily on it.
Computer vision resembles human vision. Human vision learns from life experience and uses it to distinguish objects, judge the distance between them, and estimate their relative positions.
Source: Analytics Vidhya
With cameras, data, and algorithms, computer vision trains machines to accomplish these jobs in much less time.
Computer vision allows computers and systems to extract useful data from digital images and video inputs.
Installing and Importing the OpenCV Image Preprocessing Package
OpenCV is an important part of many machine learning and deep learning pipelines. It is an open-source library (package) for computer vision, machine learning, and image processing applications. The prebuilt opencv-python package used here runs on the CPU; CUDA support requires a custom build. OpenCV works with many programming languages, including Python, and can be installed with a single command, as shown below:
pip install opencv-python
A package in Python is a collection of modules containing pre-written code. You can import a package’s modules individually or as a whole. Importing OpenCV is as simple as importing the “cv2” module, as seen below:
import cv2
Reading an Input Image
Colour images, grayscale images, binary images, and multispectral images are all examples of digital images. In a colour image, each pixel carries its own colour information. Binary images have only two colours, usually black and white, and grayscale images contain only shades of grey. Multispectral images capture image data within specific wavelength ranges across the electromagnetic spectrum.
To read the image, we use the “imread” method from the cv2 package, where the first parameter is the image’s path, including filename and extension, and the second parameter is a flag that determines how to read in the image.
By changing the path of the image, you can test reading any image stored on your local computer. If the image is already in your current working directory, you only need to specify the file name and extension. Set the second parameter to 0 to read the image as grayscale, 1 to read it as a colour image, or -1 to read it unchanged (keeping the alpha, or transparency, channel if one exists).
The features of a picture that is being utilised as an input
import cv2

# Read the image from disk with the cv2.imread function
img = cv2.imread("pythonlogo.png", cv2.IMREAD_COLOR)

# Create a GUI window to display the image on screen
cv2.imshow("Cute Kittens", img)
OUTPUT: PYTHON LOGO, Source: Medium.com
Image Data Type
To find the image’s data type, use the “dtype” attribute. It tells us how the image data is represented and what range of values a pixel can take.
The image itself is a NumPy array: a multidimensional container for items of the same type and size.
Pixel values for the image
An image can be thought of as a collection of small samples, referred to as pixels. To understand this better, try zooming in on an image as far as possible: you will see it divided into many small squares. These are the pixels, and together they form the image. One of the simplest ways to represent an image is as a matrix.
print("The data type of the image is", img.dtype)
Output: The data type of the image is uint8
uint8 means that each pixel value is an unsigned 8-bit integer, so values range from 0 to 255.
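The 0 to 255 range has a practical consequence for arithmetic on pixels. The following is a minimal NumPy-only sketch (the array names are illustrative) showing that plain uint8 addition wraps around, along with one possible way to saturate at 255 instead:

```python
import numpy as np

# Range of values a uint8 pixel can hold
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255

bright = np.full((2, 2), 250, dtype=np.uint8)
boost = np.full((2, 2), 10, dtype=np.uint8)

# Plain uint8 addition wraps around: 250 + 10 = 260 -> 260 - 256 = 4
wrapped = bright + boost

# Widen to a larger type, add, then clip back into the 0..255 range
saturated = np.clip(bright.astype(np.int16) + boost, 0, 255).astype(np.uint8)

print(wrapped[0, 0], saturated[0, 0])  # 4 255
```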
Image Resolution
Image resolution is defined as the number of pixels in an image. As the number of pixels rises, the image quality improves. The image’s shape gives the number of rows and columns. Common image resolutions:
• 320 x 240 pixels (mostly suitable for small-screen devices)
• 1024 x 768 pixels (appropriate for viewing on standard computer monitors)
• 720 x 576 pixels (good for viewing on standard-definition TV sets with a 4:3 aspect ratio)
• 1280 x 720 pixels (for viewing on widescreen monitors)
• 1280 x 1024 pixels (for viewing on full-screen monitors)
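Resolutions are usually written as width x height, while an image array’s shape lists rows first. A minimal sketch, using a blank array as a stand-in for a real 320 x 240 image, shows how to recover the resolution from the shape:

```python
import numpy as np

# A blank colour image at 320 x 240 resolution (width x height).
# NumPy shape order is (rows, columns, channels) = (height, width, channels).
img = np.zeros((240, 320, 3), dtype=np.uint8)

height, width = img.shape[:2]
total_pixels = height * width

print(f"{width} x {height} pixels, {total_pixels} in total")
# 320 x 240 pixels, 76800 in total
```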
Image Pixel Values
An image can be thought of as a collection of small samples, measured in pixels. Zoom in on a picture as far as possible and you will see it divided into many small squares. These are the pixels that, combined, make up the image.
The quality of an image increases with the number of pixels it contains. The image’s shape gives the number of rows and columns.
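Because an image is stored as a matrix, individual pixel values can be read and changed by index. Below is a minimal sketch with a hand-written 2 x 2 image, assuming OpenCV’s blue, green, red channel order:

```python
import numpy as np

# A 2 x 2 colour image written out as a matrix of (B, G, R) pixel values
img = np.array(
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 255]]],
    dtype=np.uint8,
)

# Index with [row, column] to get one pixel; add a third index for a channel
top_left = img[0, 0]               # blue pixel in B, G, R order
red_of_bottom_left = img[1, 0, 2]  # red channel of the bottom-left pixel

print(top_left.tolist())        # [255, 0, 0]
print(int(red_of_bottom_left))  # 255

# Pixels can be modified in place: paint the top-right pixel black
img[0, 1] = (0, 0, 0)
print(img[0, 1].tolist())       # [0, 0, 0]
```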
Viewing the Images
Let’s look at how to display the image in a window. To do so, we create a graphical user interface (GUI) window with the cv2.imshow() method. Its first parameter is the window title, which must be a string; the second is the image. On its own, however, the window may close immediately or become unresponsive. We can use the “waitKey” method to prevent this.
Passing 0 to “waitKey” keeps the window open until a key is pressed. (You can pass a time in milliseconds instead of 0 to set how long it should wait.)
import cv2

# Read the image from disk with the cv2.imread function
img = cv2.imread("python logo.png", cv2.IMREAD_COLOR)

# Create a GUI window to display the image on screen.
# The first parameter is the window title (must be a string);
# the second parameter is the image array.
cv2.imshow("The Logo", img)

# Hold the window on screen. If 0 is passed as the parameter,
# the window stays open until a key is pressed.
cv2.waitKey(0)

# Remove the created GUI windows from the screen and from memory
cv2.destroyAllWindows()
Output: GUI Window, Source: Author
Reconstructing the image bit planes after extracting the image bit planes
An image can be divided into several bit planes. An 8-bit image splits into eight planes (0-7), with the last few (higher-order) planes containing the majority of the image’s visual information.
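As a minimal sketch of this idea, the NumPy code below extracts the eight bit planes of a small random grayscale array (a stand-in for a real image) and rebuilds the image from them. The helper name reconstruct is made up for this example. Using only planes 4 to 7 gives an approximation that is off by at most 15 per pixel.

```python
import numpy as np

# Small random 8-bit grayscale image standing in for a real one
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)

# Extract the eight bit planes: plane k holds bit k of every pixel (0 or 1)
planes = [(img >> k) & 1 for k in range(8)]


def reconstruct(bit_planes, which):
    """Rebuild an image from the selected bit planes."""
    out = np.zeros_like(bit_planes[0])
    for k in which:
        out |= bit_planes[k] << k
    return out


full = reconstruct(planes, range(8))         # all planes: exact copy
top_four = reconstruct(planes, range(4, 8))  # planes 4-7 only: approximation

print(np.array_equal(full, img))  # True

# Largest per-pixel error when the four low-order planes are dropped
max_error = int(np.max(img.astype(np.int16) - top_four))
print(max_error <= 15)            # True
```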
Image Operations Using OpenCV and Python
Checking Properties of the Input Image
import numpy as np
import matplotlib.pyplot as plt

img = plt.imread("my pic.jpg")
plt.imshow(img)
print(img.shape)
print(img.size)
print(img.dtype)

Output:
(1921, 1921, 3)
11070723
uint8
Basic Image Processing
import matplotlib.pyplot as plt
Dilation and Erosion of the Input Image
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = plt.imread("baby yoda.jpg")

# Take a 5x5 matrix of ones as the kernel
kernel = np.ones((5, 5), np.uint8)

# The first parameter is the original image, the kernel is the
# matrix that is slid over the image, and iterations determines
# how strongly the image is eroded or dilated.
img_erosion = cv2.erode(img, kernel, iterations=1)
img_dilation = cv2.dilate(img, kernel, iterations=1)

# Show each result in its own figure so they do not overwrite each other
for result in (img, img_erosion, img_dilation):
    plt.figure()
    plt.imshow(result)
plt.show()
Normal Image Source: PopularMechanic.com
Source: Author, Image after erosion effect
Source: Author, Image after dilation effect
OpenCV Applications
• Face recognition
• Counting the number of people (foot traffic in a mall, for example)
• Counting the number of automobiles on motorways and their speeds
• Interaction-based art installations
• Detecting anomalies (defects) during the production process (the odd defective product)
• Street view image stitching
• Video/image search and retrieval
• Robot and autonomous car navigation and control
• Object recognition
• Medical image analysis
• Movies – 3D structure from motion
Functionality of OpenCV
• I/O, processing and display of images and videos
• Detection of objects and features
• Computer vision based on geometry
• Computer-assisted photography
In this article, we covered a basic introduction to the OpenCV library and its applications in real-time scenarios. We also covered key terminology and the field in which OpenCV in deep learning is deployed (computer vision), and implemented Python code for some basic image operations (dilation, erosion, and changing image colours) with the help of the OpenCV library. Beyond that, OpenCV in deep learning finds application in a variety of industries.
About the Author
My name is Pranshu Sharma and I am a Data Science Enthusiast
Thank you so much for taking the time to read this blog. Feel free to point out any mistakes (I’m a learner, after all) and to leave feedback or a comment.
For any feedback Email me at [email protected]
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.