NVIDIA’s Deep Learning AI Trains Robots to Copy and Execute Human Actions

Pranav Dar 21 May, 2018 • 2 min read


  • Researchers at NVIDIA have developed a deep learning system that enables robots to learn from humans
  • The algorithm is powered by several neural networks that perceive objects, understand and train themselves, and then execute the actions they saw the human performing
  • These neural networks were trained on NVIDIA’s Titan X GPUs
  • The research paper and video are also available for the community (links included below)



Hollywood, meet deep learning. With each passing week, AI edges closer to human-level capability on specific tasks. Those Terminator movies no longer look like pure fantasy. Robots are getting smarter, thanks to breakthroughs driven by deep learning.

NVIDIA has developed a deep learning system that enables robots to learn and teach themselves simply by observing human actions. In the initial demonstration, robots detected objects (coloured boxes and a toy car in this case), picked them up, and moved them.

The deep learning framework consists of a series of neural networks built to handle object perception, program generation, and program execution. In other words, the system observes a human's actions, learns from them, and then replicates them on its own. It's like a sci-fi movie come to life!

These neural networks were trained on NVIDIA's Titan X GPUs. The flow chart below illustrates how the system learns and executes the actions it sees:

  • First, a human demonstrates a simple task that the robot is to learn
  • The robot observes the task via a camera and infers the positions and relationships of the objects in the scene
  • The neural network then generates a plan describing how to recreate those perceptions
  • Finally, the execution network carries the task out
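The three-stage pipeline above (perceive, generate a program, execute) can be sketched in miniature. The snippet below is a toy illustration of the control flow only, not NVIDIA's actual networks: the "camera" is a dictionary of object positions, and all function and object names (`perceive`, `generate_program`, `execute`, `DetectedObject`) are hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    name: str
    position: tuple  # (x, y) scene coordinates


def perceive(scene):
    """Perception stage: map raw observations to objects and positions.
    The 'camera' here is simulated as a dict of object -> position."""
    return [DetectedObject(name, pos) for name, pos in scene.items()]


def generate_program(objects, goal):
    """Program-generation stage: emit a human-readable plan whose
    execution would recreate the demonstrated arrangement (`goal`)."""
    plan = []
    for obj in objects:
        target = goal[obj.name]
        if obj.position != target:
            plan.append(("pick", obj.name))
            plan.append(("place", obj.name, target))
    return plan


def execute(plan, scene):
    """Execution stage: apply each plan step to the simulated scene."""
    for step in plan:
        if step[0] == "place":
            scene[step[1]] = step[2]
    return scene


# A human "demonstrates" moving the red box; the toy car stays put.
demo_goal = {"red_box": (2, 0), "toy_car": (0, 1)}
scene = {"red_box": (0, 0), "toy_car": (0, 1)}
plan = generate_program(perceive(scene), demo_goal)
final = execute(plan, scene)
```

Note that the plan is produced as explicit, inspectable steps before anything is executed; in the actual system this human-readable intermediate program is what lets the researchers verify what the robot intends to do.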

The researchers have also made their research paper available to the public, and you can view it here. The paper will be presented this week at a robotics and automation conference in Brisbane, Australia.

Watch the video below, released by NVIDIA, demonstrating this remarkable system in action:


Our take on this

I feel we are at a real turning point in AI right now. Thanks to GPUs and TPUs, organizations (at least the big ones) can use enormous amounts of data to train and test deep learning algorithms. The gaming community has been instrumental in training many popular algorithms recently, and this system might find a use there as well.

This latest study has applications not just for household tasks (like picking up and moving things around the house), but also for assisting people in elderly care homes, for the manufacturing industry, and even for environmentally friendly tasks. As a data scientist (or an aspiring one), you should go through the above-mentioned research paper to understand the structure and logic of how NVIDIA built this model.





Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
