“Computer vision and machine learning have really started to take off, but for most people, the whole idea of what a computer is seeing when it’s looking at an image is relatively obscure.” – Mike Krieger
The wonderful field of Computer Vision has soared into a league of its own in recent years. An impressive number of applications are already in wide use around the world, and we are just getting started!
One of my favorite things in this field is the idea of our community embracing the concept of open source. Even the big tech giants are willing to share new breakthroughs and innovations with everyone so that the techniques do not remain a “thing of the rich”.
One such technology is face detection, which offers a plethora of potential applications in real-world use cases (if used correctly and ethically). In this article, I will show you how to build a capable face detection algorithm using open source tools. Here is a demo to get you excited and set the stage for what will follow:
So, are you ready? Read on then!
Note: If you want to understand the intricacies of computer vision, this course – Computer Vision using Deep Learning – is the perfect place to start.
Let me pull up some awesome examples of applications where face detection techniques are widely used. I’m sure you must have come across these use cases at some point and not realized what technique was being used behind the scenes!
For instance, Facebook replaced manual image tagging with automatically generated tag suggestions for each picture that was uploaded to the platform. Facebook uses a simple face detection algorithm to analyze the pixels of faces in the image and compare them with relevant users. We’ll learn how to build a face detection model ourselves, but before we get into the technical details of that, let’s discuss some other use cases.
We are becoming used to unlocking our phones with the latest ‘face unlock’ feature. This is a very small example of how a face detection technique is being used to maintain the security of personal data. The same can be implemented on a larger scale, enabling cameras to capture images and detect faces.
There are a few other lesser-known applications of face detection in advertising, healthcare, banking, etc. In most companies, and even at many conferences, you are supposed to carry an ID card to gain entry. But what if we could figure out a way so that you don’t need to carry an ID card at all? Face detection helps make this process smooth and easy. The person just looks at the camera, and the system automatically determines whether they should be allowed to enter.
Another interesting application of face detection could be to count the number of people attending an event (like a conference or concert). Instead of manually counting the attendees, we install a camera which can capture the images of the attendees and give us the total head count. This can help to automate the process and save a ton of manual effort. Pretty useful, isn’t it?
You can come up with many more applications like these – feel free to share them in the comments section below.
In this article, I will focus on the practical application of face detection and only gloss over how the underlying algorithms actually work. If you want to know more about them, you can go through this article.
Now that you know the potential applications you can build with face detection techniques, let’s see how we can implement this using the open source tools available to us. That’s the advantage we have with our community – the willingness to share and open source code is unparalleled across any industry.
For this article specifically, here’s what I have used and recommend using:
Let’s explore these points in a bit more detail to ensure everything is set up properly before we build our face detection model.
The first thing you have to do is check whether the webcam is set up correctly. A simple trick in Ubuntu is to see whether the device has been registered by the OS. You can follow the steps given below:
The code in this article is built using Python version 3.5. Although there are multiple ways to install Python, I would recommend using Anaconda – the most popular Python distribution for data science. Here is a link to install Anaconda on your system.
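One way to run this check from Python is sketched below. It assumes a Linux system where webcams are exposed as device nodes such as /dev/video0 or /dev/video1; the helper name list_video_devices is my own and not part of any library.

```python
import glob


def list_video_devices(pattern="/dev/video*"):
    """Return the paths matching the pattern, sorted lexicographically."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    devices = list_video_devices()
    if devices:
        print("Registered video devices:", ", ".join(devices))
    else:
        print("No video device found; check the webcam connection.")
```

If a path such as /dev/video1 shows up in the output, that is the value to pass to cv2.VideoCapture later on.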
OpenCV (Open Source Computer Vision) is a library aimed at building computer vision applications. It has numerous pre-written functions for image processing tasks. To install OpenCV, do a pip install of the library:
pip3 install opencv-python
The face_recognition library, which depends on dlib, can be installed the same way:
pip install dlib
pip install face_recognition
Now that you have set up your system, it’s finally time to dive into the actual implementation. First, we will quickly build our program, then break it down to understand what we did.
First, create a file face_detector.py and then copy the code given below:
# import libraries
import cv2
import face_recognition

# Get a reference to the webcam (adjust the device path to match your system)
video_capture = cv2.VideoCapture("/dev/video1")

# Initialize variables
face_locations = []

while True:
    # Grab a single frame of video
    ret, frame = video_capture.read()

    # Stop if no frame could be read from the webcam
    if not ret:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Find all the faces in the current frame of video
    face_locations = face_recognition.face_locations(rgb_frame)

    # Display the results
    for top, right, bottom, left in face_locations:
        # Draw a box around the face
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)

    # Display the resulting image
    cv2.imshow('Video', frame)

    # Hit 'q' on the keyboard to quit!
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release handle to the webcam
video_capture.release()
cv2.destroyAllWindows()
Then, run this Python file by typing:
python face_detector.py
If everything works correctly, a new window will pop up with real-time face detection running.
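To summarize, the code above opens the webcam, grabs one frame at a time, converts each frame from BGR to RGB, finds the location of every face in it, draws a red box around each face, and shows the result in a window until you press ‘q’.
Simple, isn’t it? If you want to go into more granular details, I have included comments in each code section. You can always go back and review what we have done.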
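One detail in the drawing step is easy to trip over: face_recognition returns each face as a (top, right, bottom, left) tuple, while cv2.rectangle expects two corner points in (x, y) order. The small helper below makes that reordering explicit; box_to_corners is a name I made up for illustration.

```python
def box_to_corners(box):
    """Convert a (top, right, bottom, left) face box into the two
    (x, y) corner points that cv2.rectangle expects."""
    top, right, bottom, left = box
    top_left = (left, top)
    bottom_right = (right, bottom)
    return top_left, bottom_right


if __name__ == "__main__":
    # A face spanning rows 50..150 and columns 100..200
    print(box_to_corners((50, 200, 150, 100)))
```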
The fun doesn’t stop there! Another cool thing we can do – build a complete use case around the above code. And you don’t need to start from scratch. We can make just a few small changes to the code and we’re good to go.
Suppose, for example, you want to build an automated camera-based system to track where the speaker is in real time. Based on the speaker’s position, the system rotates the camera so that the speaker is always in the middle of the video.
How do we go about this? The first step is to build a system which identifies the person(s) in the video, and focuses on the location of the speaker.
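To make the tracking idea concrete, here is a rough sketch of the decision the system would have to make once the speaker’s face box is known. It assumes the box is in the (top, right, bottom, left) order used by face_recognition; the function name pan_direction and the dead_zone parameter are hypothetical, and actually driving a camera motor is out of scope here.

```python
def pan_direction(box, frame_width, dead_zone=0.1):
    """Decide which way to pan so the face moves toward the frame center.

    box is (top, right, bottom, left) in pixels. dead_zone is the fraction
    of the frame width the face may drift from the center before the
    camera reacts (an assumed tuning value).
    """
    top, right, bottom, left = box
    face_center_x = (left + right) / 2
    # Horizontal offset from the frame center, as a fraction of the width
    offset = (face_center_x - frame_width / 2) / frame_width
    if offset < -dead_zone:
        return "left"   # face is left of center in the image
    if offset > dead_zone:
        return "right"  # face is right of center in the image
    return "hold"


if __name__ == "__main__":
    print(pan_direction((50, 200, 150, 100), frame_width=640))
```

The dead zone keeps the camera from jittering when the speaker makes small movements near the center of the frame.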
Let’s see how we can implement this. For this article, I have taken a video from YouTube which shows a speaker talking during the DataHack Summit 2017 conference.
First, we import the necessary libraries:
import cv2
import face_recognition
Then, read the video and get its length in frames:
input_movie = cv2.VideoCapture("sample_video.mp4")
length = int(input_movie.get(cv2.CAP_PROP_FRAME_COUNT))
After that, we create an output file with a resolution and frame rate similar to those of the input file.
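A minimal sketch of this step, assuming the output is written with OpenCV’s VideoWriter. The file name output.avi, the XVID codec, the 25 fps fallback, and both helper names are my assumptions, so adjust them to your setup. The writer is meant to be stored in output_movie, the name the loop further down writes to.

```python
def output_settings(fps, width, height, fallback_fps=25.0):
    """Turn the float values reported by a capture into writer settings.

    Some files report a frame rate of 0; fallback_fps (an assumed default)
    is used in that case.
    """
    if not fps or fps <= 0:
        fps = fallback_fps
    return float(fps), (int(width), int(height))


def open_output_like(input_movie, path="output.avi"):
    """Create a cv2.VideoWriter matching the input capture's size and rate."""
    import cv2  # imported here so output_settings stays usable without OpenCV

    fps, size = output_settings(
        input_movie.get(cv2.CAP_PROP_FPS),
        input_movie.get(cv2.CAP_PROP_FRAME_WIDTH),
        input_movie.get(cv2.CAP_PROP_FRAME_HEIGHT),
    )
    fourcc = cv2.VideoWriter_fourcc(*"XVID")
    return cv2.VideoWriter(path, fourcc, fps, size)
```

With the capture from the previous step, this would be called as output_movie = open_output_like(input_movie).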
Load a sample image of the speaker to identify him in the video:
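image = face_recognition.load_image_file("sample_image.jpeg")
face_encoding = face_recognition.face_encodings(image)[0]

known_faces = [
    face_encoding,
]
With all this in place, we run a loop that reads each frame of the video, finds the faces in it and computes their encodings, compares each encoding with the known face, draws a labeled box around any match, and writes the frame to the output file.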
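It helps to know what such an encoding is used for. As far as I know, face_recognition represents each face as a vector of 128 numbers, and compare_faces treats two faces as the same person when the Euclidean distance between their encodings is within the tolerance. The sketch below reproduces that idea with NumPy on toy 3-number vectors; the function name matches is my own, not the library’s.

```python
import numpy as np


def matches(known_encodings, candidate, tolerance=0.5):
    """Return one boolean per known encoding: True when its Euclidean
    distance to the candidate encoding is within the tolerance."""
    known = np.asarray(known_encodings, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    distances = np.linalg.norm(known - candidate, axis=1)
    return [bool(d <= tolerance) for d in distances]


if __name__ == "__main__":
    known = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    # Distances to the candidate are about 0.3 and 0.7
    print(matches(known, [0.3, 0.0, 0.0], tolerance=0.5))
```

A lower tolerance makes matching stricter, which is why the loop in this article passes tolerance=0.50 when calling compare_faces.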
Let’s see the code for this:
# Initialize variables
face_locations = []
face_encodings = []
face_names = []
frame_number = 0

while True:
    # Grab a single frame of video
    ret, frame = input_movie.read()
    frame_number += 1

    # Quit when the input video file ends
    if not ret:
        break

    # Convert the image from BGR color (which OpenCV uses) to RGB color (which face_recognition uses)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Find all the faces and face encodings in the current frame of video
    # (the "cnn" model is more accurate but much slower without a GPU)
    face_locations = face_recognition.face_locations(rgb_frame, model="cnn")
    face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

    face_names = []
    for face_encoding in face_encodings:
        # See if the face is a match for the known face(s)
        match = face_recognition.compare_faces(known_faces, face_encoding, tolerance=0.50)

        name = None
        if match[0]:
            name = "Phani Srikant"

        face_names.append(name)

    # Label the results
    for (top, right, bottom, left), name in zip(face_locations, face_names):
        if not name:
            continue

        # Draw a box around the face
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)

        # Draw a label with a name below the face
        cv2.rectangle(frame, (left, bottom - 25), (right, bottom), (0, 0, 255), cv2.FILLED)
        font = cv2.FONT_HERSHEY_DUPLEX
        cv2.putText(frame, name, (left + 6, bottom - 6), font, 0.5, (255, 255, 255), 1)

    # Write the resulting image to the output video file
    print("Writing frame {} / {}".format(frame_number, length))
    output_movie.write(frame)

# All done! Release the input and output files
input_movie.release()
output_movie.release()
cv2.destroyAllWindows()
The code would then give you an output like this:
What a terrific thing face detection truly is. 🙂
Now, it’s time to take the plunge and actually play with some other real datasets. So are you ready to take on the challenge? Accelerate your deep learning journey with the following Practice Problems:
Practice Problem: Identify the Apparels | Identify the type of apparel for given images
Practice Problem: Identify the Digits | Identify the digit in given images
Practice Problem: Face Counting Challenge | Predict the headcount given a group selfie/photo
Congratulations! You now know how to build a face detection system for a number of potential use cases. Deep learning is such a fascinating field and I’m so excited to see where we go next.
In this article, we learned how you can leverage open source tools to build real-time face detection systems that have real-world usefulness. I encourage you to build plenty of such applications and try this on your own. Trust me, there’s a lot to learn and it’s just so much fun!
As always, feel free to reach out if you have any queries/suggestions in the comment section below!