DeepMind’s Latest Research – Building a Neural Network that Navigates without a Map
- DeepMind is using images from Google Street View to train its neural network agent
- The agent does not need a map to navigate; it learns by itself using visual cues
- The neural network consists of three parts, which are described in detail below
In today’s digitally connected world, we have come to rely on Google Maps and other GPS tools to guide us to our destination. If we are familiar with a place, we instead use our memories and visual cues (familiar objects we recognise from previous visits) to reach the end point. But what if there was no GPS and the place was completely new?
DeepMind’s research team has built an interactive navigation environment that uses first-person point-of-view pictures from Google Street View to train its AI model. The team built a deep neural network “artificial agent” that continuously learns to navigate multiple cities using information it gathers from Street View images.
The agent learns through rewards: it is trained with reinforcement learning, so behaviour that leads to successful navigation is reinforced. The more it is exposed to the visual environment, the better it gets. Once it gets the hang of a few cities, it can adapt to a new city very quickly. The neural network is made up of three parts:
- a convolutional network: used for processing images and extracting visual features
- a locale-specific recurrent neural network: to memorise and learn about the environment
- a locale-invariant recurrent network: to produce the navigation policy over the agent’s actions
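To make the division of labour between the three parts concrete, here is a minimal sketch in plain Python. It is not DeepMind’s implementation: the class names, layer sizes, action set and the simple tanh recurrent cell are all assumptions chosen for illustration, the weights are random and untrained, and details such as how the destination is fed to the agent are left out. What it does show is the structure described above: one shared convolutional encoder, one recurrent module per city, and one shared recurrent policy module.

```python
import math
import random

ACTIONS = ["forward", "turn_left", "turn_right"]  # hypothetical action set


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def dense(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ConvEncoder:
    """Part 1: convolution + ReLU + average pooling, one feature per kernel."""

    def __init__(self, n_kernels, ksize, rng):
        self.kernels = [random_matrix(ksize, ksize, rng) for _ in range(n_kernels)]

    def encode(self, image):
        features = []
        for kernel in self.kernels:
            k = len(kernel)
            responses = []
            for i in range(len(image) - k + 1):
                for j in range(len(image[0]) - k + 1):
                    s = sum(kernel[a][b] * image[i + a][j + b]
                            for a in range(k) for b in range(k))
                    responses.append(max(0.0, s))  # ReLU
            features.append(sum(responses) / len(responses))
        return features


class RecurrentCell:
    """Simple tanh recurrent cell, standing in for the paper's recurrent units."""

    def __init__(self, in_dim, hid_dim, rng):
        self.w_in = random_matrix(hid_dim, in_dim, rng)
        self.w_hid = random_matrix(hid_dim, hid_dim, rng)
        self.hid_dim = hid_dim

    def initial_state(self):
        return [0.0] * self.hid_dim

    def step(self, x, h):
        return [math.tanh(a + b)
                for a, b in zip(dense(self.w_in, x), dense(self.w_hid, h))]


class NavigationAgent:
    def __init__(self, cities, feat_dim=4, hid_dim=8, seed=0):
        rng = random.Random(seed)
        self.encoder = ConvEncoder(feat_dim, 3, rng)              # shared
        self.locale_rnns = {city: RecurrentCell(feat_dim, hid_dim, rng)
                            for city in cities}                   # Part 2: one per city
        self.policy_rnn = RecurrentCell(hid_dim, hid_dim, rng)    # Part 3: shared
        self.policy_head = random_matrix(len(ACTIONS), hid_dim, rng)

    def reset(self, city):
        """Start an episode in the given city with empty memory."""
        self.city = city
        self.h_locale = self.locale_rnns[city].initial_state()
        self.h_policy = self.policy_rnn.initial_state()

    def act(self, image):
        """Return a probability for each action given one Street View-like frame."""
        features = self.encoder.encode(image)
        self.h_locale = self.locale_rnns[self.city].step(features, self.h_locale)
        self.h_policy = self.policy_rnn.step(self.h_locale, self.h_policy)
        return softmax(dense(self.policy_head, self.h_policy))


# Usage: a random 6x6 grid stands in for a Street View image.
agent = NavigationAgent(cities=["london", "paris"])
agent.reset("paris")
frame_rng = random.Random(1)
frame = [[frame_rng.random() for _ in range(6)] for _ in range(6)]
action_probs = agent.act(frame)
```

In this sketch, only the entry in `locale_rnns` is tied to a particular city. That layout is one way to picture how an agent could adapt to a new city quickly: a fresh city-specific module is added while the encoder and policy module are reused.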
The researchers are keen to stress that this study is about navigation in general rather than self-driving. If you read their research paper (link below), you’ll notice that they have not modelled or manipulated vehicle controls, nor used any information about traffic.
Our take on this
The aim of this study was to train the neural network to navigate the way that humans do, and it has produced excellent results so far. The model was trained using deep reinforcement learning, taking cues from recent work on learning to navigate in complex 3D mazes and on reinforcement learning with unsupervised auxiliary tasks.
Those earlier studies, however, were conducted on a relatively small scale, whereas DeepMind is able to use real-life visual environments here (hence the images from Google Street View). Do go through their paper and let us know your views in the comments below!