Devashree Madhugiri — April 4, 2022

This article was published as a part of the Data Science Blogathon.


In my previous article, we explored image augmentation using AugLy, a recently introduced library from Facebook. In this article, we explore three popular image augmentation libraries in Python.

An image classifier usually performs better when trained on significantly more images. A common problem in image classification models occurs when the model fails to correctly classify an image only because it was not trained on a different orientation of the same image. This can be overcome by feeding multiple possible image orientations and transformations to the model for training. However, in reality, gathering such diverse data might require more time, resources, and expertise and could be costly for a company. In such cases, image data augmentation is a popular choice for adding diversity to the existing dataset by using one or more augmentation techniques to generate various images for training. Although several Python libraries support multiple augmentation techniques, not all techniques are relevant and appropriate to train the model. A user needs to know which augmentations would help generate realistic additional data for training the model.

We can augment image data using various techniques, including:

  • Geometric transformations such as flipping, cropping, rotating, zooming, etc.
  • Color transformations such as adjusting brightness, darkness, sharpness, saturation, etc.
  • Random erasing, mixing images, etc.


Imgaug is an open-source Python package that allows you to augment images in machine learning experiments. It supports a variety of augmentation techniques. It has a simple yet powerful interface and can augment images, landmarks, bounding boxes, heatmaps, and segmentation maps.

Let’s start by installing this library first using pip from PyPI.

pip install imgaug

Next, we will install the Python package ‘IPyPlot’ using the pip command:

pip install ipyplot

IPyPlot is a Python tool that allows for the fast and efficient display of images within Python Notebook cells. This package combines IPython with HTML to provide a quicker, richer, and more interactive way to show images. This package’s ‘plot_images’ command will be used to plot all of the images in a grid-like structure.

Also, we will import all the necessary packages needed to augment the data.

import imageio
import imgaug as ia
import imgaug.augmenters as iaa
import ipyplot

The image path for augmentation is defined here. We’ll use a bird image as an example.

input_img = imageio.imread('../input/image-bird/bird.jpg')

Image Flipping

We can flip the image horizontally and vertically using the code shown below. The ‘Fliplr’ augmenter flips the image horizontally, and the ‘Flipud’ augmenter flips it vertically.

#Horizontal Flip
hflip= iaa.Fliplr(p=1.0)
input_hf= hflip.augment_image(input_img)
#Vertical Flip
vflip= iaa.Flipud(p=1.0) 
input_vf= vflip.augment_image(input_img)
images_list=[input_img, input_hf, input_vf]
labels = ['Original', 'Horizontally flipped', 'Vertically flipped']

The probability of each image getting flipped is represented by p. The default value of p depends on the imgaug version (older releases default to 0.0, which means no image is flipped), so it is safest to pass it explicitly. To flip every input image horizontally, use Fliplr(1.0) rather than just Fliplr(). Similarly, to flip every image vertically, use Flipud(1.0) rather than just Flipud().
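To make the role of p concrete, here is a minimal NumPy sketch of a probabilistic horizontal flip. It is not imgaug's implementation; the helper name random_hflip and the tiny stand-in image are assumptions made for illustration.

```python
import numpy as np

def random_hflip(image, p, rng):
    """Flip `image` left-right with probability `p` (hypothetical helper, not imgaug code)."""
    if rng.random() < p:
        return image[:, ::-1]  # reverse the width axis
    return image

rng = np.random.default_rng(0)
image = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)  # tiny 2x2 RGB stand-in

always = random_hflip(image, p=1.0, rng=rng)  # flipped on every call
never = random_hflip(image, p=0.0, rng=rng)   # returned unchanged on every call
```

With p=1.0 the flip always happens and with p=0.0 it never does; values in between flip only some of the images in a dataset.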

Image Rotation

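We can rotate the image by specifying the rotation in degrees. Passing a tuple, as below, samples a random angle from that range for each image, here between -50 and 20 degrees.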

rot1 = iaa.Affine(rotate=(-50,20))
input_rot1 = rot1.augment_image(input_img)
images_list=[input_img, input_rot1]
labels = ['Original', 'Rotated Image']

Image Cropping

Cropping images includes removing columns or rows of pixels from the image’s sides. This augmenter enables the extraction of smaller-sized subimages from full-sized input images. The number of pixels to be removed can be specified in absolute numbers or as a fraction of the image size.

In this case, we crop each side of the image by a random fraction sampled uniformly from the continuous interval [0.0, 0.3], once per image and side. For example, if the sampled fraction for the top side is 0.3, the top of the image is cropped by 0.3*H pixels, where H is the height of the input image.

crop1 = iaa.Crop(percent=(0, 0.3)) 
input_crop1 = crop1.augment_image(input_img)
images_list=[input_img, input_crop1]
labels = ['Original', 'Cropped Image']
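The fraction-to-pixel arithmetic can be checked with a small NumPy sketch. The helper crop_by_fraction and the per-side fractions are hypothetical and fixed by hand here, whereas imgaug samples them randomly.

```python
import numpy as np

def crop_by_fraction(image, top, right, bottom, left):
    """Remove a fraction of the height/width from each side (hypothetical helper)."""
    h, w = image.shape[:2]
    t, b = int(round(top * h)), int(round(bottom * h))
    l, r = int(round(left * w)), int(round(right * w))
    return image[t:h - b, l:w - r]

image = np.zeros((100, 200, 3), dtype=np.uint8)  # stand-in image with H=100, W=200

# Top fraction 0.3 removes 0.3 * 100 = 30 rows; left fraction 0.25 removes 50 columns
cropped = crop_by_fraction(image, top=0.3, right=0.0, bottom=0.1, left=0.25)
print(cropped.shape)
```

Here 30 rows are removed from the top and 10 from the bottom, leaving 60 rows, while 50 columns are removed from the left, leaving 150 columns.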

Adding Noise to Images

This augmenter adds gaussian noise to the input image. The scale value is the standard deviation of the normal distribution that generates the noise.

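noise = iaa.AdditiveGaussianNoise(scale=(10, 40))  # assumed values; the original listing omits this line
input_noise = noise.augment_image(input_img)
images_list=[input_img, input_noise]
labels = ['Original', 'Gaussian Noise Image']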
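What "scale is the standard deviation" means can be seen in a plain NumPy sketch. This is an illustration under assumptions (a flat gray stand-in image, a fixed scale of 20, and a hypothetical helper name), not the library's own code.

```python
import numpy as np

def add_gaussian_noise(image, scale, rng):
    """Add zero-mean Gaussian noise with standard deviation `scale` (hypothetical helper)."""
    noise = rng.normal(loc=0.0, scale=scale, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Keep values in the valid 8-bit range before converting back
    return np.clip(noisy, 0, 255).round().astype(np.uint8)

rng = np.random.default_rng(42)
image = np.full((64, 64, 3), 128, dtype=np.uint8)  # flat gray stand-in image
noisy = add_gaussian_noise(image, scale=20.0, rng=rng)

# The spread of the added noise should be close to the requested scale
residual_std = (noisy.astype(np.float64) - 128).std()
```

A larger scale spreads pixel values further from their original intensity, which is why high values make the image look grainier.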

Image Shearing

This augmenter shears the image by random amounts ranging from -40 to 40 degrees.

shear = iaa.Affine(shear=(-40,40))
input_shear = shear.augment_image(input_img)
images_list=[input_img, input_shear]
labels = ['Original', 'Image Shearing']

Image Contrast

This augmenter adjusts the image contrast by scaling pixel values.

contrast=iaa.GammaContrast((0.5, 2.0))
contrast_sig = iaa.SigmoidContrast(gain=(5, 10), cutoff=(0.4, 0.6))
contrast_lin = iaa.LinearContrast((0.4, 0.6))
input_contrast = contrast.augment_image(input_img)
sigmoid_contrast = contrast_sig.augment_image(input_img)
linear_contrast = contrast_lin.augment_image(input_img)
images_list=[input_img, input_contrast,sigmoid_contrast,linear_contrast]
labels = ['Original', 'Gamma Contrast','SigmoidContrast','LinearContrast']

The GammaContrast augmenter here adjusts image contrast using the formula 255*((v/255)**gamma), where v is a pixel value and gamma is sampled uniformly from the range [0.5, 2.0]. SigmoidContrast adjusts image contrast using the formula 255*1/(1+exp(gain*(cutoff-v/255))), where v is a pixel value, the gain is sampled uniformly from the interval [5, 10] (once per image), and the cutoff is sampled uniformly from the interval [0.4, 0.6]. LinearContrast, on the other hand, alters image contrast using the formula 127 + alpha*(v-127), where v is a pixel value and alpha is sampled uniformly from the range [0.4, 0.6].
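The three formulas can be written directly in NumPy to see how they move pixel values. This is a sketch of the formulas as stated in the text, with the parameters fixed by hand instead of sampled. The function names are my own, and the results are left as floats without clipping or casting back to 8-bit.

```python
import numpy as np

def gamma_contrast(v, gamma):
    # 255 * ((v / 255) ** gamma)
    return 255 * (v / 255) ** gamma

def sigmoid_contrast(v, gain, cutoff):
    # 255 * 1 / (1 + exp(gain * (cutoff - v / 255)))
    return 255 * 1 / (1 + np.exp(gain * (cutoff - v / 255)))

def linear_contrast(v, alpha):
    # 127 + alpha * (v - 127)
    return 127 + alpha * (v - 127)

v = np.array([0.0, 127.0, 255.0])  # dark, mid-gray, and bright pixel values

g = gamma_contrast(v, gamma=2.0)
s = sigmoid_contrast(v, gain=5.0, cutoff=0.5)
l = linear_contrast(v, alpha=0.5)
```

With alpha=0.5 the linear formula pulls every value halfway toward 127, which lowers contrast, while gamma=2.0 keeps black and white fixed and darkens the mid-tones.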

Image Transformations

The ‘Elastic Transformation’ augmenter transforms images by shifting pixels around locally using displacement fields. The augmenter’s parameters are alpha and sigma. The strength of the displacement is controlled by alpha, wherein greater values indicate that pixels are shifted further. The smoothness of the displacement is controlled by sigma, in which larger values result in smoother patterns.

elastic = iaa.ElasticTransformation(alpha=60.0, sigma=4.0)
polar = iaa.WithPolarWarping(iaa.CropAndPad(percent=(-0.2, 0.7)))
jigsaw = iaa.Jigsaw(nb_rows=20, nb_cols=15, max_steps=(3, 7))
input_elastic = elastic.augment_image(input_img)
input_polar = polar.augment_image(input_img)
input_jigsaw = jigsaw.augment_image(input_img)
images_list=[input_img, input_elastic,input_polar,input_jigsaw]
labels = ['Original', 'elastic','polar','jigsaw']

While using the ‘Polar Warping’ Augmenter, cropping and padding are applied in polar representation first, then warped back to cartesian representation. This augmenter can add additional pixels to the image. These will be filled with black pixels. In addition, the ‘Jigsaw’ augmentation moves cells inside pictures in a manner similar to jigsaw patterns.

Bounding Box on Image

imgaug also provides bounding box support for images. If an image is rotated during augmentation, the library can apply the same rotation to all bounding boxes on that image.

from imgaug.augmentables.bbs import BoundingBox, BoundingBoxesOnImage
bbs = BoundingBoxesOnImage([
 BoundingBox(x1=40, x2=550, y1=40, y2=780)
], shape=input_img.shape)
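Why boxes have to follow the image is easiest to see with a horizontal flip, which is simpler than a rotation. The sketch below is plain Python rather than imgaug code; the helper hflip_box and the image width of 600 pixels are assumptions for illustration.

```python
def hflip_box(x1, y1, x2, y2, image_width):
    """Mirror a box's x-coordinates around the vertical center line (hypothetical helper)."""
    # The old right edge becomes the new left edge and vice versa
    return image_width - x2, y1, image_width - x1, y2

# Same box coordinates as above, on an assumed 600-pixel-wide image
flipped = hflip_box(x1=40, y1=40, x2=550, y2=780, image_width=600)
print(flipped)
```

If the image were flipped but the box left untouched, the box would no longer cover the object it annotates, so the same geometric change has to be applied to both.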


Albumentations is a fast and well-known library that integrates with popular deep learning frameworks such as PyTorch and TensorFlow. It is also a part of the PyTorch ecosystem.

Albumentations supports all typical computer vision tasks, including classification, semantic segmentation, instance segmentation, object detection, and pose estimation. This library includes over 70 different augmentations for creating new training samples from existing data. It is commonly used in industry, deep learning research, machine learning competitions, and open-source projects.

Let’s start by installing the library first using the pip command.

pip install albumentations

We will import all the necessary packages needed for augmenting data with Albumentations:

import albumentations as A
import cv2
import matplotlib.pyplot as plt

In addition to the Albumentations package, we use the OpenCV package, an open-source computer vision library that supports a wide range of image formats. Albumentations depends on OpenCV, so installing Albumentations also installs OpenCV.

Image Flipping

The ‘A.HorizontalFlip’ and ‘A.VerticalFlip’ transforms are used to flip the image horizontally and vertically. p is a special parameter that is supported by almost all augmentations. It controls the probability of the augmentation being applied.

# Horizontal flip, applied with probability 0.5
transform = A.HorizontalFlip(p=0.5)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)

# Vertical flip, always applied
transform = A.VerticalFlip(p=1)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)


Image Scale and Rotate

This augmenter uses affine transformations at random to translate, scale, and rotate the input image.

transform = A.ShiftScaleRotate(p=0.5)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)

Image ChannelShuffle

This augmenter randomly rearranges the RGB channels of the input image.

transform = A.ChannelShuffle(p=1.0)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)

Image Solarize

This augmenter inverts all pixel values greater than a certain threshold in the input image.

transform = A.Solarize(threshold=200, p=1.0)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)
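The thresholded inversion described above can be sketched in a few lines of NumPy. This is not the Albumentations implementation, and the helper name is my own. In particular, leaving a pixel that sits exactly at the threshold unchanged is an assumption based on the "greater than" wording.

```python
import numpy as np

def solarize(image, threshold):
    """Invert only the pixels brighter than `threshold` (hypothetical helper)."""
    out = image.copy()
    mask = out > threshold
    out[mask] = 255 - out[mask]  # bright pixels become dark
    return out

image = np.array([[10, 200, 201, 255]], dtype=np.uint8)
result = solarize(image, threshold=200)
print(result)
```

Pixels at or below 200 are left alone, while 201 becomes 54 and 255 becomes 0. Setting the threshold to 0 would invert almost every pixel, which is close to what the Invert Image augmenter in the next section does.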

Invert Image

By subtracting pixel values from 255, this augmenter inverts the input image.

transform = A.InvertImg(p=1.0)
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)

Augmentation pipeline using Compose

To define an augmentation pipeline, first, create a Compose instance. You must provide a list of augmentations as an argument to the Compose class. In this example, we’ll utilize a variety of augmentations such as transposition, blur, distortion, etc.

Calling Compose returns a transform function that performs the image augmentation.

transform = A.Compose([
    A.ShiftScaleRotate(shift_limit=0.08, scale_limit=0.5, rotate_limit=5, p=.8),
    # the remaining augmentations of the original pipeline are not shown in this listing
])
augmented_image = transform(image=input_img)['image']
plt.figure(figsize=(4, 4))
plt.imshow(augmented_image)
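Conceptually, a composed pipeline just applies each step in order, each with its own probability. The sketch below shows that idea in plain Python and NumPy. It is not how Albumentations is implemented, and compose, hflip, and invert are hypothetical names.

```python
import random
import numpy as np

def compose(steps, seed=None):
    """Chain (function, probability) pairs into one callable (hypothetical helper)."""
    rng = random.Random(seed)

    def transform(image):
        for fn, p in steps:
            if rng.random() < p:  # apply this step with probability p
                image = fn(image)
        return image

    return transform

def hflip(img):
    return img[:, ::-1]

def invert(img):
    return 255 - img

pipeline = compose([(hflip, 1.0), (invert, 1.0)])
image = np.array([[0, 100, 255]], dtype=np.uint8)
out = pipeline(image)  # flipped first, then inverted
```

Because every step has its own probability, calling the same pipeline repeatedly on one image can yield many different variants when the probabilities are below 1.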


SOLT is a Deep Learning data augmentation library that supports images, segmentation masks, labels, and key points. SOLT is also fast and has OpenCV in its backend. Complete auto-generated documentation and examples are available on the project's documentation site.

We will start with the installation of SOLT by using the pip command:

pip install solt

Then we will import all the necessary packages of SOLT required for augmenting the image data.

import solt
import solt.transforms as slt
h, w, c = input_img.shape
img = input_img[:w]

Here we will create a Stream instance for an augmentation pipeline. You must provide a list of augmentations as an argument to the Stream class.

stream = solt.Stream([
    slt.Rotate(angle_range=(-90, 90), p=1, padding='r'),
    slt.Flip(axis=1, p=0.5),
    slt.Flip(axis=0, p=0.5),
    slt.Shear(range_x=0.3, range_y=0.8, p=0.5, padding='r'),
    slt.Scale(range_x=(0.8, 1.3), padding='r', range_y=(0.8, 1.3), same=False, p=0.5),
    slt.Pad((w, h), 'r'),
    slt.Crop((w, w), 'r'),
    slt.Blur(k_size=7, blur_type='m'),
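    # reconstructed wrapper: its opening line is missing from the original listing
    solt.SelectiveStream([
        slt.CutOut(40, p=1),
        slt.CutOut(50, p=1),
        slt.CutOut(10, p=1),
    ], n=3),
], ignore_fast_mode=True)
fig = plt.figure(figsize=(17,17))
n_augs = 10
for i in range(n_augs):
    img_aug = stream({'image': img}, return_torch=False).data[0].squeeze()
    ax = fig.add_subplot(1,n_augs,i+1)
    # Show the original image first, then the augmented samples
    if i == 0:
        ax.imshow(img)
    else:
        ax.imshow(img_aug)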


Image augmentations can help in increasing the existing dataset. There are several Python libraries currently available for image augmentations. In this article, we have explored different image augmentation techniques using three Python libraries – Imgaug, Albumentations, and Solt.

I hope you enjoyed reading this article! The next time you train a machine learning or a deep learning model, do try one of these three libraries and the techniques shared in this article to generate additional image data quickly.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
