DeepMind is Using ‘Neuron Deletion’ to Understand Deep Neural Networks

Pranav Dar 26 Mar, 2018 • 2 min read


  • DeepMind’s latest research attempts to demystify deep neural networks
  • Researchers developed image classification models and then removed neurons systematically to measure how each deletion affected the output
  • The results show that networks which correctly classify unseen images are more resilient to neuron deletion



Neural networks have always been tricky to understand, and deep neural networks are harder still. They consist of many interconnected neurons and are used for a wide range of applications in industry.

But these many hidden neurons are what have given them the ‘black box’ stigma.

In a blog post, researchers at DeepMind have explained how they studied the behaviour of a neural network by deleting neurons, both one at a time and in groups. The researchers trained image classification models, removed neurons from them, and then measured how each deletion affected the model's performance.
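To make the idea concrete, here is a minimal sketch of neuron deletion in NumPy. It is not DeepMind's code: the tiny one-hidden-layer network, its random weights, the synthetic inputs, and the helper names (`forward`, `accuracy`, `single_deletion_drops`, `cumulative_deletion_curve`) are all assumptions made for illustration. "Deleting" a neuron is assumed here to mean clamping its activation to zero, and the labels are simply the intact network's own predictions, so the baseline accuracy is 1.0 by construction.

```python
import numpy as np

def forward(x, params, deleted=()):
    """One-hidden-layer ReLU classifier; `deleted` hidden units are clamped to zero."""
    w1, b1, w2, b2 = params
    hidden = np.maximum(0.0, x @ w1 + b1)
    if len(deleted) > 0:
        hidden[:, list(deleted)] = 0.0  # assumed form of "deletion": zero the activation
    return hidden @ w2 + b2

def accuracy(x, y, params, deleted=()):
    """Fraction of inputs whose predicted class matches the label."""
    logits = forward(x, params, deleted)
    return float(np.mean(np.argmax(logits, axis=1) == y))

def single_deletion_drops(x, y, params):
    """Accuracy drop caused by deleting each hidden neuron on its own."""
    baseline = accuracy(x, y, params)
    n_hidden = params[0].shape[1]
    return [baseline - accuracy(x, y, params, [i]) for i in range(n_hidden)]

def cumulative_deletion_curve(x, y, params, rng):
    """Accuracy as neurons are deleted in growing groups, in a random order."""
    n_hidden = params[0].shape[1]
    order = rng.permutation(n_hidden)
    return [accuracy(x, y, params, order[:k]) for k in range(n_hidden + 1)]

# Toy setup (hypothetical): random weights stand in for a trained model.
rng = np.random.default_rng(0)
n_in, n_hidden, n_classes = 8, 16, 3
params = (
    rng.normal(size=(n_in, n_hidden)), np.zeros(n_hidden),
    rng.normal(size=(n_hidden, n_classes)), np.zeros(n_classes),
)
x = rng.normal(size=(200, n_in))
y = np.argmax(forward(x, params), axis=1)  # labels = predictions of the intact network

drops = single_deletion_drops(x, y, params)           # one value per hidden neuron
curve = cumulative_deletion_curve(x, y, params, rng)  # accuracy after deleting 0..16 neurons
print("largest single-neuron drop:", max(drops))
print("accuracy vs. number of deleted neurons:", curve)
```

With a real trained model, you would replace the random weights and synthetic labels with the model's parameters and a held-out test set. A curve that falls slowly as more neurons are removed would indicate a network that is robust to deletion.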

According to DeepMind, the study produced two main findings:

  • Although many previous studies have focused on understanding easily interpretable individual neurons (e.g. “cat neurons”, or neurons in the hidden layers of deep networks which are only active in response to images of cats), we found that these interpretable neurons are no more important than confusing neurons with difficult-to-interpret activity.
  • Networks which correctly classify unseen images are more resilient to neuron deletion than networks which can only classify images they have seen before. In other words, networks which generalise well are much less reliant on single directions than those which memorise.

In the above image, the topmost neuron (which is greyed out) has been deleted. In DeepMind’s blog post, you can delete each neuron and see how it affects the output.

You can read DeepMind’s official research paper on the topic here. They are scheduled to present this work next month at the International Conference on Learning Representations (ICLR).


Our take on this

Deep neural networks have been hard to interpret, so this work by one of the leading research companies is a nice start towards demystifying them. Their results imply that individual neurons are much less important than we would have initially thought.

I highly encourage you to go through their blog post and the research paper to understand each step that was taken in this study.




Pranav Dar 26 Mar 2018

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.

Responses From Readers


V V CHAKRADHAR 26 Mar, 2018

Hey Pranav, has there been any research into finding a generalized neural network which can help with any type of classification (audio, image, text)?