A Classic Computer Vision Project – How to Add an Image Behind Objects in a Video

Prateek Joshi 15 Jun, 2020

Overview

Adding an image behind a moving object is a classic computer vision project
Learn how to add a logo in a video using traditional computer vision techniques

Introduction

One of my colleagues threw me a challenge: build a computer vision model that could insert any image in a video without distorting the moving object. As you can imagine, this was quite an intriguing project, and I had a blast working on it.

Working with videos is notoriously difficult because of their dynamic nature. Unlike images, we don’t have static objects that we can easily identify and track. The complexity goes up several notches, and that’s where a solid grasp of image processing and computer vision techniques comes to the fore.

I decided to go with a logo in the background. The challenge, which I will elaborate on later, was to insert a logo in a way that wouldn’t impede the dynamic nature of the object in any given video. I used Python and OpenCV to build this computer vision system – and have shared my approach in this article.

We will be using image processing concepts and OpenCV in this article, so a basic familiarity with both will help you follow along.

Understanding the Problem Statement

This is going to be quite an uncommon use case of computer vision. We will be embedding a logo in a video. Now you must be thinking – what’s the big deal in that? We can simply paste the logo on top of the video, right?

However, that logo might just hide some interesting action in the video. What if the logo covers the moving object in front? That doesn’t make a lot of sense and makes the editing look amateurish.

Therefore, we have to figure out how we can add the logo somewhere in the background such that it doesn’t block the main action going on in the video. Check out the video below – the left half is the original video and the right half has the logo appearing on the wall behind the dancer:

This is the idea we’ll be implementing in this article.

Getting the Data for this Project

I have taken this video from pexels.com, a website for free stock videos. As I mentioned earlier, our objective is to put a logo in the video so that it appears behind a certain moving object. For the time being, we will use the logo of OpenCV itself. You can use any logo you want (perhaps your favorite sports team?).

[Image: the OpenCV logo]

You can download both the video and the logo from here.

Setting the Blueprint for our Computer Vision Project

Let’s first understand the approach before we implement this project. To perform this task, we will take the help of image masking. Let me show you some illustrations to understand the technique.

Let’s say we want to put a rectangle (Fig 1) in an image (Fig 2) in such a way that the circle in the second image appears on top of the rectangle:

[Image: Fig 1 (the rectangle) and Fig 2 (the image with the pink circle)]

So, the desired outcome should look like this:

[Image: the desired outcome, with the circle in front of the rectangle]

However, it is not that straightforward. When we take the rectangle from Fig 1 and insert it in Fig 2, it will appear on top of the pink circle:

[Image: the rectangle pasted on top of the pink circle]

This is not what we want. The circle should have been in front of the rectangle. So, let’s understand how we can solve this problem.

These images are essentially arrays of pixel values, and every color has its own pixel value. So, we would set the pixel values of the rectangle to 1 wherever it overlaps the circle (in Fig 5), while leaving the rest of the rectangle’s pixel values as they are.

In Fig 6, the region enclosed by blue-dotted lines is the region where we would put the rectangle. Let’s denote this region by R. We would set all the pixel values of R to 1 as well. However, we would leave the pixel values of the entire pink circle unchanged:

opencv

Our next step is to multiply the pixel values of the rectangle with the pixel values of R, element by element. Multiplying any number by 1 gives that number back, so wherever R is 1 the product is the rectangle’s pixel, and wherever the rectangle is 1 the product is the pixel from Fig 6 (the circle). The final output will turn out to be something like this:

[Image: the final output after masking]
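The multiplication trick is easy to try on toy arrays. The sketch below is only an illustration, not the code used for the video: the small single-channel arrays, the pixel values (50 for the wall, 200 for the circle, 90 for the rectangle) and all variable names are made up for this example.

```python
import numpy as np

WALL, CIRCLE, RECT = 50, 200, 90   # made-up pixel values for the toy example

# Toy single-channel scene (like Fig 2): a wall with a "circle" blob on it.
scene = np.full((4, 6), WALL, dtype=np.uint8)
scene[1:3, 3:6] = CIRCLE

# The rectangle to insert (like Fig 1) and the region R that will receive it.
rect = np.full((4, 3), RECT, dtype=np.uint8)
rows, cols = slice(0, 4), slice(1, 4)
region = scene[rows, cols].copy()

# True where the circle sits inside R.
is_circle = region == CIRCLE

# The rectangle becomes 1 under the circle; R becomes 1 everywhere else.
rect_masked = np.where(is_circle, 1, rect)
region_masked = np.where(is_circle, region, 1)

# Element-wise product: wherever one factor is 1, the other one survives.
# A wider integer type is used so the product cannot overflow uint8.
blended = (rect_masked.astype(np.uint16) * region_masked).astype(np.uint8)

result = scene.copy()
result[rows, cols] = blended
print(result)
```

In the printed array, the rectangle value fills region R except where the circle value was already present, which is the "circle in front" effect we are after.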

This is the technique we are going to use to embed the OpenCV logo behind the dancing guy in the video. Let’s do it!

Implementing the Technique in Python – Let’s Add the Logo!

You can use a Jupyter Notebook or any IDE of your choice and follow along. We will first import the necessary libraries.

Import Libraries

Note: The version of the OpenCV library used for this tutorial is 4.0.0.

Load Images

Next, we will specify the path to the working directory where the logo and video are kept. Please note that you are supposed to specify the “path” in the code snippet below:

So, we have loaded the logo image and the first frame of the video. Now let’s look at the shape of these images or arrays:

logo.shape, frame.shape

Output: ((240, 195, 3), (1080, 1920, 3))

Both the outputs are 3-dimensional. The first dimension is the height of the image, the second dimension is the width of the image and the third dimension is the number of channels in the image, i.e., blue, green, and red.

Now, let’s plot and see the logo and the first frame of the video:

plt.imshow(cv2.cvtColor(logo,cv2.COLOR_BGR2RGB))
plt.show()

plt.imshow(cv2.cvtColor(frame,cv2.COLOR_BGR2RGB))
plt.show()

Technique to Create Image Mask

The frame is much bigger than the logo, so we could place the logo in a number of places. However, placing the logo at the center of the frame seems perfect to me, as most of the action will happen around that region in the video. So, we will put the logo in the frame as shown below:
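The centered position can be worked out from the two shapes we printed earlier. This is a small sketch; the helper name `centered_box` is made up for this example and is not part of OpenCV.

```python
def centered_box(frame_shape, logo_shape):
    """Return (top, bottom, left, right) of a logo-sized box centered in the frame."""
    frame_h, frame_w = frame_shape[:2]
    logo_h, logo_w = logo_shape[:2]
    if logo_h > frame_h or logo_w > frame_w:
        raise ValueError("logo is larger than the frame")
    top = (frame_h - logo_h) // 2
    left = (frame_w - logo_w) // 2
    return top, top + logo_h, left, left + logo_w


# Shapes reported earlier: frame (1080, 1920, 3) and logo (240, 195, 3).
top, bottom, left, right = centered_box((1080, 1920, 3), (240, 195, 3))
print(top, bottom, left, right)
```

With these numbers, `frame[top:bottom, left:right]` is the logo-sized patch of the frame that the rest of the steps operate on.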

Don’t worry about the black background in the logo. We will set the pixel values in the black region to 1 later in the code. Now the problem we have to solve is that of dealing with the moving object appearing in the same region where we have placed the logo.
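Setting the black region to 1 is a one-liner in NumPy. The snippet below is a sketch on a tiny stand-in array rather than the real logo, and it assumes the background is pure black (all three channels exactly 0); a real logo with soft edges might need a small threshold instead.

```python
import numpy as np

# Stand-in for the logo: a 2x3 BGR image, black background, two colored pixels.
logo = np.zeros((2, 3, 3), dtype=np.uint8)
logo[0, 1] = (255, 0, 0)
logo[1, 2] = (0, 128, 255)

# A pixel counts as background only if all three channels are 0.
is_black = np.all(logo == 0, axis=2)

logo_ones = logo.copy()
logo_ones[is_black] = 1   # writes 1 into B, G and R of every black pixel
print(logo_ones)
```

A value of 1 matters because of the multiplication step described earlier: a background pixel of 1 leaves the frame pixel untouched, whereas a 0 would turn it black.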

As discussed earlier, we need the moving object to occlude the logo whenever the two overlap.

Right now, the area where we will put the logo has a wide range of pixel values. Ideally, all the pixel values should be the same in this area. So how can we do that?

We will have to make the pixels of the wall enclosed by the green dotted box have the same value. We can do this with the help of HSV (hue, saturation, value) colorspace:

OpenCV loads our frame in BGR channel order. We will convert it into an HSV image. The image below is the HSV version:

The next step is to find the range of the HSV values of only the part that is inside the green dotted box. It turns out that most of the pixels in the box range from [6, 10, 68] to [30, 36, 122]. These are the lower and upper HSV ranges, respectively.
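One possible way to arrive at such a range is to look at per-channel percentiles of the HSV pixels inside the box. To be clear, this is an assumption on my part for illustration: the helper name `hsv_range` and the 2nd/98th percentile cut-offs are arbitrary choices, and the patch below is synthetic data standing in for the real box.

```python
import numpy as np


def hsv_range(hsv_patch, low_pct=2, high_pct=98):
    """Per-channel percentile bounds of an HSV patch of shape (H, W, 3)."""
    pixels = hsv_patch.reshape(-1, 3)
    lower = np.percentile(pixels, low_pct, axis=0)
    upper = np.percentile(pixels, high_pct, axis=0)
    return np.floor(lower).astype(np.uint8), np.ceil(upper).astype(np.uint8)


# Synthetic stand-in for the HSV pixels inside the green dotted box.
rng = np.random.default_rng(0)
patch = np.stack(
    [
        rng.integers(6, 31, size=(40, 40)),    # hue
        rng.integers(10, 37, size=(40, 40)),   # saturation
        rng.integers(68, 123, size=(40, 40)),  # value
    ],
    axis=-1,
).astype(np.uint8)

lower, upper = hsv_range(patch)
print(lower, upper)
```

On a real frame, the patch would be the box region of the HSV image, and trimming a few percent at each end keeps stray pixels (shadows, highlights) from widening the range too much.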

Now, using this range of HSV values, we can create a binary mask. This mask is nothing but an image with pixel values of either 0 or 255. The pixels whose HSV values fall between the lower and upper bounds will be equal to 255 and the rest of the pixels will be 0.

Given below is the mask prepared from the HSV image. All the pixels in the yellow region have a pixel value of 255 and the rest have a pixel value of 0:

Now we can easily set the pixel values inside the green dotted box to 1 as and when required. Let’s go back to the code:

The code snippet above will load the frames from the video, pre-process them, create the HSV images and masks, and finally insert the logo into the video. And there you have it!

End Notes

In this article, we covered a very interesting use case of computer vision and implemented it from scratch. In the process, we also learned about working with image arrays and how to create masks from these arrays.

This is something that would help you when you work on other computer vision tasks. Feel free to reach out to me if you have any doubts or feedback to share. I would be glad to help you.

Data Scientist at Analytics Vidhya with a multidisciplinary academic background. Experienced in machine learning, NLP, graphs & networks. Passionate about learning and applying data science to solve real-world problems.

Responses From Readers

Mahtab 16 Jun, 2020

It's a great article. I have learned something new, thank you.

Prateek Joshi 17 Jun, 2020

You're welcome Mahtab!

Nisha 18 Jun, 2020

This article is very useful for me.... So keep sharing articles related to CV.

Prateek Joshi 19 Jun, 2020

Thanks Nisha!

Rajesh 20 Jun, 2020

Great article. If you are using a Google Colab notebook, you need to change cv2.imshow to cv2_imshow for the code to work. Thanks for the great details.

Prateek Joshi 20 Jun, 2020

Thanks Rajesh for your input.

Aquish 15 Jul, 2020

Thanks for your sharing

Phi 05 Dec, 2020

Please help me. Could you explain how to find the HSV range of the green box area, from [6, 10, 68] to [30, 36, 122]? How do you get [6, 10, 68], and how do you get [30, 36, 122]?