- OpenAI Five is a team of 5 neural networks designed and developed to beat human opponents at the game Dota 2
- The system plays 180 years' worth of games against itself every single day!
- Training the neural networks requires 256 GPUs and 128,000 CPU cores
Artificial Intelligence competing against humans in a game has become the norm these days. At this point in the evolution of AI, we assume that AI is advanced enough to figure out every single move in a game, find loopholes, and then set world-record scores.
But so far these games have been geared more towards the strategic side – like chess and Go. The team at OpenAI, a venture co-founded by Elon Musk, has developed a team of algorithms (called OpenAI Five) that competes against humans in the popular game Dota 2. What makes their approach different is that Dota 2 requires real-time decision making, rather than pondering over the next move. According to a blog post by OpenAI, the game runs at 30 frames per second for an average of 45 minutes, resulting in roughly 80,000 ticks per game.
OpenAI Five is basically a group of 5 neural networks. It plays 180 years' worth of games against itself every single day! Of course, this amount and level of training doesn't come without serious computational resources. OpenAI Five trains itself using a scaled-up version of a class of reinforcement learning algorithms called Proximal Policy Optimization (PPO). This training process runs on 256 GPUs and 128,000 CPU cores.
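To get a feel for what PPO does, here is a minimal sketch of its core idea – the clipped surrogate objective from the PPO paper (Schulman et al., 2017). This is a toy NumPy illustration of the loss, not OpenAI's actual scaled-up implementation:

```python
import numpy as np

def ppo_clip_objective(ratios, advantages, epsilon=0.2):
    """Clipped surrogate objective at the heart of PPO.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantages: advantage estimates for those state-action pairs
    epsilon:    clipping range (0.2 is the commonly used default)
    """
    unclipped = ratios * advantages
    clipped = np.clip(ratios, 1.0 - epsilon, 1.0 + epsilon) * advantages
    # Taking the minimum removes any incentive to push the policy
    # ratio far outside [1 - epsilon, 1 + epsilon] in a single update,
    # which is what keeps the policy updates stable at huge scale.
    return np.minimum(unclipped, clipped).mean()
```

The clipping is what lets PPO take many gradient steps on the same batch of self-play experience without the policy collapsing – a property that matters when you are burning through 180 years of games per day.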
So how does this group of neural networks recognize and build strategies in real-time? Each of the five neural networks contains a single-layer, 1,024-unit LSTM (Long Short-Term Memory) that analyzes the real-time state of the game and then performs actions. In OpenAI's words: "OpenAI Five views the world as a list of 20,000 numbers, and takes an action by emitting a list of 8 enumeration values."
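The shape of that mapping – a big observation vector in, recurrent state carried across ticks, 8 discrete action choices out – can be sketched as a single LSTM step in NumPy. The dimensions below are shrunk toy values (the real system uses 20,000 observation numbers and 1,024 LSTM units), the weights are random, and `HEAD_DIM` is a hypothetical enumeration size; this only illustrates the data flow, not OpenAI's architecture details:

```python
import numpy as np

OBS_DIM = 200   # real system: "a list of 20,000 numbers"
HIDDEN = 64     # real system: single-layer, 1,024-unit LSTM
N_HEADS = 8     # action = "a list of 8 enumeration values"
HEAD_DIM = 16   # size of each enumeration -- hypothetical, not stated

rng = np.random.default_rng(0)

# LSTM gate weights act on [observation, previous hidden state].
W = rng.standard_normal((4 * HIDDEN, OBS_DIM + HIDDEN)) * 0.01
b = np.zeros(4 * HIDDEN)
# One small linear head per enumeration value to emit.
heads = [rng.standard_normal((HEAD_DIM, HIDDEN)) * 0.01 for _ in range(N_HEADS)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def step(obs, h, c):
    """One game tick: observation + recurrent state -> 8 discrete choices."""
    z = W @ np.concatenate([obs, h]) + b
    i, f, g, o = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell memory
    h = sigmoid(o) * np.tanh(c)                   # new hidden state
    # Each head picks one value of its enumeration (greedy, for brevity).
    actions = [int(np.argmax(Wh @ h)) for Wh in heads]
    return actions, h, c

obs = rng.standard_normal(OBS_DIM)
actions, h, c = step(obs, np.zeros(HIDDEN), np.zeros(HIDDEN))
```

The recurrent state `h`/`c` is what lets the network remember what happened on earlier ticks, which is essential in a game where each of the ~80,000 frames only shows a partial view of the match.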
OpenAI Five has a few restrictions placed on it, which you can read about in their blog post. OpenAI will be playing a match against top Dota 2 players on July 28th and then participating in a tournament in August to benchmark their progress.
Check out the below video released by the OpenAI team, where they show OpenAI Five in action:
Our take on this
While it’s definitely a good sign that reinforcement learning is moving towards real-time decision making applications, expectations should be tempered at this point. As the team repeatedly mentions in the post, Dota is an incredibly complex game with dozens of decisions possible in a single frame. It’s going to take a lot more learning and tweaking before OpenAI Five (or whatever comes next) can comprehensively beat top human players.
I recommend (again) going through OpenAI’s post to read about the algorithm in detail.
Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!