- A group of researchers has developed an algorithm that takes one input video and transfers its facial expressions, head pose and eye motions onto another person’s face
- At the core of the approach is a generative neural network
- The results are strikingly realistic, and previous efforts in this field pale in comparison
Artificial Intelligence is a wonderful thing, if applied correctly. It has diverse applications, in healthcare for example, and it is well and truly transforming our lives in a positive way. But certain applications, like the one you will read about below, are a mix of genius and scary. They have the potential to be game changing, and only time will tell whether that will be a good or a bad thing.
These researchers are the first to have successfully transferred the full three-dimensional head pose, expressions, eye motions, etc. of one face onto the face of a different actor.
How does this approach work? When a video is given as the input, the algorithm first tracks the source and target actor(s) using a facial reconstruction approach. The resulting output is a set of parameters describing the pose of the head, the facial expressions and the motion of the eyes. The level of detail is simply staggering.
At the core of the algorithm is a generative neural network with a space-time architecture. The realism in the end result has been achieved through adversarial training. This approach only requires a few minutes of training on the input video source. With the ability to freely recombine source and target parameters, the developers have been able to demonstrate a large variety of video rewrite applications without explicitly modeling hair, body or background.
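To make the idea of recombining source and target parameters a little more concrete, here is a minimal sketch in Python. It is not the researchers' code: the `FaceParams` class, its field names and the `reenact` function are hypothetical names chosen for illustration, and the real system works with far richer parameters and then feeds a rendering of them through the trained network. The sketch only shows the recombination step, assuming the tracker has already produced one set of parameters per actor.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class FaceParams:
    """Hypothetical per-frame face parameters (illustrative names only)."""
    identity: Tuple[float, ...]    # who the person is: face shape and appearance
    head_pose: Tuple[float, ...]   # head rotation and position
    expression: Tuple[float, ...]  # facial expression coefficients
    eye_gaze: Tuple[float, ...]    # eye direction


def reenact(source: FaceParams, target: FaceParams) -> FaceParams:
    """Keep the target's identity, take all motion from the source."""
    return replace(
        target,
        head_pose=source.head_pose,
        expression=source.expression,
        eye_gaze=source.eye_gaze,
    )


# Made-up numbers, purely to show which fields come from which actor.
source = FaceParams(identity=(0.1, 0.2), head_pose=(10.0, 0.0, 5.0),
                    expression=(0.5, -0.25), eye_gaze=(0.0, 1.0))
target = FaceParams(identity=(0.9, 0.8), head_pose=(0.0, 0.0, 0.0),
                    expression=(0.0, 0.0), eye_gaze=(0.0, 0.0))

combined = reenact(source, target)
```

In this sketch `combined` still describes the target person, but with the source actor's head pose, expression and gaze. Swapping only some of the fields (for example, only `expression`) would correspond to a more limited kind of video rewrite.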
The researchers then compared their algorithm to previous approaches and showed how well it works. Previous efforts in this regard have involved recreating facial expressions, but they pale in comparison to this study. Until now, no one had managed to blend in the background and reconstruct head poses along with minute details like eye blinking.
Additionally, the algorithm allows you to interactively edit the input video as well. You can change the shape of the face, exaggerate the facial expressions, remove the hair or close one eye, among other things.
To see how this algorithm works, check out the video below, released by the developers:
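As a rough illustration of what such an edit could look like at the parameter level, the Python sketch below exaggerates an expression by scaling its coefficients. This is an assumption made for illustration, not the developers' editing tool: it supposes that an expression is stored as a list of coefficients in which all zeros means a neutral face, and `exaggerate` is a hypothetical helper name.

```python
from typing import List


def exaggerate(expression: List[float], factor: float) -> List[float]:
    """Scale expression coefficients away from neutral (assumed to be all zeros)."""
    return [factor * coefficient for coefficient in expression]


# Made-up coefficients: a factor above 1 amplifies the expression,
# a factor of 0 resets the face to neutral.
smile = [0.5, -0.25, 0.0]
bigger_smile = exaggerate(smile, 2.0)
neutral = exaggerate(smile, 0.0)
```

The edited coefficients would then be rendered and passed through the network in the same way as any other set of parameters.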
Our take on this
There are two sides to consider here: one from the machine learning perspective and one from the ethical point of view. From the ML view, this is a novel and impressive approach. The fact that machines can now reconstruct full 3D head pose and facial expressions down to details like eye blinks is remarkable. From film making to medical imaging, this can potentially be a very useful tool.
From the ethics side, this is a scary prospect. With the amount of fake news and fake videos doing the rounds recently, this could make things worse. Time will tell how this approach is received and applied in practical real-life cases. What is your take on this? Are you excited or scared by its implications? Use the comments section to tell us your thoughts.