DataHack Radio #15: Exploring the Applications & Potential of Reinforcement Learning with Xander Steenbrugge

pranavdar Last Updated : 13 Jun, 2019
5 min read

Introduction

“If intelligence was a cake, unsupervised learning would be the cake, supervised learning would be the icing on the cake, and reinforcement learning would be the cherry on the cake.” – Yann LeCun, pioneer of convolutional neural networks

Reinforcement learning algorithms have been knocking on the door of industrial applications in recent years. Will they finally blow the door wide open in 2019? What are some of the biggest obstacles holding back reinforcement learning? And is there a ceiling we can put on where RL will take us in the future?

We welcome 2019 on DataHack Radio with a stellar episode #15 featuring Xander Steenbrugge, who guides us through the wide-ranging and intricate world of reinforcement learning. And yes, the questions above are expertly handled in this episode.

Xander has a knack for taking the most complex topics and breaking them down into easy-to-understand concepts, a truly invaluable skill. I came across Xander thanks to his popular YouTube channel ‘Arxiv Insights’ and truly appreciated his presentation style when I saw him live at DataHack Summit 2018. His ability to explain challenging subjects is on full display in this episode as well.

This article aims to highlight the key aspects discussed in this episode, including Xander’s thoughts about reinforcement learning and related topics. I encourage you to listen to the full episode where Xander elaborates on his RL theories and ideas in much more detail. Happy listening!

You can subscribe to DataHack Radio on any of the below platforms to receive notifications every time a new episode is published or to trawl through our archives.

 

Xander Steenbrugge’s Background

Xander studied civil engineering at the University of Ghent, Belgium. His educational background was focused on electronics: building transistors, microcircuits, and the like. He picked up coding to make the journey from idea to execution much faster than it was when working purely with electronics.

Not surprisingly, Xander’s final thesis for his Master’s degree (finished in 2015) was on brain-computer interfaces that could perform brainwave (EEG) classification. You might have seen an application of such a system on YouTube: when a patient puts on a headset wired with this mechanism, they can move the cursor on a connected computer screen with their thoughts.

There was a great deal of pre-processing work involved since the EEG signal data was extremely noisy. Once the data was cleaned, Xander performed manual feature extraction before feeding the data into a machine learning classifier. Neural networks were still in their relative infancy back when Xander was working on his project. Given the same data now, he would love to apply CNNs (convolutional neural networks) directly to the EEG signals. Fascinating stuff!
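To make that contrast concrete, here is a minimal sketch (using NumPy and entirely synthetic data, not Xander's actual pipeline) of what a single 1-D convolutional filter does to a raw EEG-like signal. Instead of hand-crafting band-power features, a CNN would learn kernels like this one directly from the data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "EEG" recording: a 10 Hz alpha rhythm buried in noise,
# sampled at 128 Hz for 2 seconds (all values are illustrative).
fs, secs = 128, 2
t = np.arange(fs * secs) / fs
signal = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)

# One "filter" of a 1-D convolutional layer: in a real CNN these
# weights would be learned; here we hand-set a short sine kernel
# tuned to 10 Hz purely for illustration.
kernel = np.sin(2 * np.pi * 10 * np.arange(13) / fs)
feature_map = np.convolve(signal, kernel, mode="valid")

# The filter responds strongly wherever the alpha rhythm is present,
# which is roughly what manual band-power features used to capture.
print(feature_map.shape)  # (244,)
```

A trained network would stack many such filters and learn their weights from labelled recordings, removing the manual feature-engineering step entirely.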

 

Foray into the World of Reinforcement Learning

Xander, working as a machine learning consultant, came across the 2015 paper by DeepMind that introduced the DQN algorithm. The fact that a single algorithm could learn to play a whole range of games really intrigued him and led him to explore this wonderfully complex field of reinforcement learning. Here’s his take on this line of work at a very simplistic level:

“It’s not as difficult as it seems. It’s supervised learning, but with a few tweaks.” – Xander

Is it really that simple, you ask? Here’s a summary of Xander’s thought process, contrasting these two types of learning:

The difference between supervised learning and reinforcement learning is that in RL, we have an agent moving around in an environment with the ability to take actions (like moving in a specific direction). This agent could be an algorithm, a person, or an object. The actions it takes affect the inputs that come from the environment. Only once the agent is put through a few iterations can we tell how far away it is from achieving the end goal. In supervised learning, by contrast, the input and output are already well defined from the start.
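The contrast can be sketched in a few lines of Python. Everything below is a hypothetical toy (a one-dimensional "environment" with a made-up reward), just to show that in RL the feedback arrives through interaction rather than as pre-labelled pairs:

```python
import random

random.seed(0)

# Supervised learning: inputs and labels are fixed pairs, known up front.
dataset = [(x, 2 * x) for x in range(5)]  # (input, label) pairs

# Reinforcement learning: an agent acts in an environment, and only the
# reward signal, observed over many steps, tells it how well it is doing.
# Toy setup: the agent walks on a number line and is rewarded only when
# it happens to stand on position +3.
position, total_reward = 0, 0
for step in range(20):
    action = random.choice([-1, +1])    # the agent's action...
    position += action                  # ...changes the next observation
    reward = 1 if position == 3 else 0  # feedback is sparse and delayed
    total_reward += reward

print(position, total_reward)
```

The supervised dataset tells the learner exactly what the right answer is for every input; the RL agent only ever sees the consequences of its own actions.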

“A reinforcement learning system can learn to do something that we as humans don’t know how to do.” – Xander

 

Current State of Reinforcement Learning in the Industry

It’s no secret that progress in reinforcement learning has been slower than in other domains. The idea-to-execution cycle Xander referred to earlier takes a long time in RL. In academia, these agents are trained in simulations (like the Atari game environment) because the algorithms are very “sample inefficient”. In other words, we need to show these agents a whole host of examples before they learn anything substantial.
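A tiny tabular Q-learning experiment illustrates this sample inefficiency. The corridor environment and all hyperparameters below are made up for illustration; note how many environment steps even a five-state task consumes before the greedy policy becomes optimal:

```python
import random

random.seed(0)

# Tabular Q-learning on a five-state corridor: the agent starts at
# state 0 and is rewarded only upon reaching state 4. Even this
# trivial task takes many episodes of trial and error.
N_STATES, GOAL = 5, 4
q = {(s, a): 0.0 for s in range(N_STATES) for a in (-1, +1)}
alpha, gamma, epsilon = 0.5, 0.9, 0.2

steps_used = 0
for episode in range(200):
    s = 0
    while s != GOAL:
        # Epsilon-greedy action selection.
        if random.random() < epsilon:
            a = random.choice([-1, +1])
        else:
            a = max((-1, +1), key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), GOAL)          # walls clamp the agent
        r = 1.0 if s2 == GOAL else 0.0         # reward only at the goal
        best_next = 0.0 if s2 == GOAL else max(q[(s2, -1)], q[(s2, +1)])
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
        s = s2
        steps_used += 1

# After training, the greedy policy should always step right (+1).
greedy = [max((-1, +1), key=lambda act: q[(s, act)]) for s in range(GOAL)]
print(greedy, steps_used)
```

Even here the agent burns thousands of environment steps before the value estimates settle; scale the state space up to Atari frames or a physical robot and the appeal of cheap simulated experience becomes obvious.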

In a real-world setting, that amount of data is more often than not unavailable (as many data scientists will relate to!). Additionally, we would need the algorithm to generalize to different settings depending on the requirement. These are the two major challenges that have held up reinforcement learning’s penetration into commercial products and services.

Having said that, Xander mentioned a really cool use case where reinforcement learning has been successfully applied: robotic farming. Listen to the podcast to understand the granular aspects of how this technology works.

“We are at the start of a very big revolution, where we could go from hard-coded robots to smart learning robots.” – Xander

Another interesting nugget from the podcast – most of the research is still focused on single-agent reinforcement learning (in comparison to multi-agent RL) as there are still a plethora of problems left to solve there.

 

Challenges in Reinforcement Learning

Here are two major obstacles we face with the current state of reinforcement learning:

  1. Plenty of frameworks and toolkits exist for supervised learning. Things like pretrained models make life a whole lot easier for anyone wanting to understand how a particular technique works. Nothing comparable exists yet for reinforcement learning. Can you imagine transfer learning in the context of RL? Everyone is currently using their own custom libraries and tools in research.
  2. As Xander mentioned above, most of the research is being done in simulated environments. The question of training an agent on far less data in a practical environment still needs to be solved.

 

Resources to Get Started with Reinforcement Learning

Reinforcement learning is a huge field encompassing multiple topics and subjects. Right now, there’s no single platform that offers you a straight path into this space. According to Xander, understanding supervised learning from scratch first is a good idea since reinforcement learning builds upon that foundation. So familiarize yourself with how an image classifier works before jumping into RL concepts.

Xander’s learning journey started from this blog post by Andrej Karpathy, called ‘Pong from Pixels’. It’s a slightly lengthy read, but clearly illustrates how one can go from supervised learning to reinforcement learning. If you’re looking for a more visually appealing guide, check out Xander’s ‘Introduction to Reinforcement Learning’ video:

Here’s another excellent introduction to RL for beginners by Faizan Shaikh. You should also check out OpenAI’s educational resource on RL titled Spinning Up. It’s a comprehensive list of resources and topics – it has personally been super helpful for me.

 

End Notes

A very pleasant and crisp introduction to reinforcement learning in under 50 minutes. I had very little idea about multi-agent reinforcement learning before this podcast, so that was a really cool section for me. Xander’s list of resources above is good enough to get your hands dirty, so I hope to see a lot more of you from our community taking up RL in the near future.

An exquisite podcast to kick off 2019. There’s a lot more coming on DataHack Radio this year so keep your learning hats on and till then, happy listening!
