- Researchers from the University of Montreal and Facebook’s AI department (FAIR) have curated and open sourced the “Talk the Walk” dataset
- The dataset consists of 3 elements – maps of certain parts of New York, 360 degree images of these locations, and sample conversations between humans guiding each other (10k dialogues)
- They have also released baseline models to help you get started on this challenge
Existing natural language processing (NLP) techniques focus mostly on transcribing what humans say, rather than understanding what is being said. Even with the release of advanced chatbot technologies like Google Duplex and Microsoft’s Xiaoice, this is a challenge that has eluded researchers so far.
This has prompted a group of researchers from the University of Montreal and Facebook’s AI department (FAIR) to curate a dataset called “Talk the Walk” that aims to teach machines to understand language the way a human does. The researchers have open sourced the dataset and opened up the challenge to the wider machine learning community.
The dataset is essentially made up of three elements:
- Maps of certain parts of New York
- 360 degree images of locations on the map captured through camera sensors
- The NLP task – a sample of conversations between people guiding each other to specific locations (10k dialogues)
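One way to picture how these three elements fit together is as a single record per neighborhood. The sketch below is purely illustrative: the class names, field names and sample values are assumptions made for this example and do not reflect the actual file format of the released dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical types: none of these names come from the released dataset.
Location = Tuple[int, int]  # (x, y) position on a neighborhood grid


@dataclass
class Utterance:
    speaker: str  # "tourist" or "guide", the two roles in the article
    text: str


@dataclass
class Neighborhood:
    name: str
    # Element 1: the map, reduced here to landmark labels per location.
    landmarks: Dict[Location, List[str]] = field(default_factory=dict)
    # Element 2: 360 degree imagery, here one image path per location.
    panoramas: Dict[Location, str] = field(default_factory=dict)
    # Element 3: recorded dialogues, each a list of utterances.
    dialogues: List[List[Utterance]] = field(default_factory=list)


def count_utterances(hood: Neighborhood) -> int:
    """Total number of utterances across all dialogues in a neighborhood."""
    return sum(len(dialogue) for dialogue in hood.dialogues)


# Made-up sample values, for illustration only.
hood = Neighborhood(name="example-neighborhood")
hood.landmarks[(0, 0)] = ["coffee shop"]
hood.panoramas[(0, 0)] = "images/0_0.jpg"
hood.dialogues.append([
    Utterance("tourist", "I see a coffee shop on the corner."),
    Utterance("guide", "Good, head north from there."),
])
print(count_utterances(hood))  # prints 2
```

Keeping the map, the imagery and the dialogues keyed by the same location makes it easy to line up what was said with what was visible at that point, which is the kind of grounding the challenge is about.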
The idea behind this research is to get two agents talking to each other – a “tourist” and a “guide”. The “tourist” has access to the 360 degree images of the locations but not the map, and the “guide” has access to the map but not the images. Can you distinguish which is the human and which is the machine in this case? Below is a sample screenshot from the dataset:
The researchers have also released baseline results of the experiments they ran. Watch the video below, released by Facebook, which illustrates their approach to the problem:
Below are a few resources to get you started on this challenge:
Our take on this
This is one of the most difficult challenges you’ll see anywhere. It combines so many machine learning tasks that it can become daunting. One of the authors of the research paper himself admitted that breakthroughs in this study might be a few years away. But when they do happen, they have the potential to be a game changer in both the NLP and navigational guidance domains.
But don’t let that deter you! Download the dataset, and try to understand everything it contains. If you don’t understand something, use the comments section below to ask. Play around with parts of the data and publish your findings and analysis online. You never know where inspiration might strike.