‘Eye in the Sky’ is a Machine Learning Project that Detects Violent People in Crowds

Pranav Dar 10 May, 2019 • 2 min read


  • Researchers have developed a deep learning model that analyzes and identifies violent individuals in crowds
  • The framework is fitted into a drone; it uses human pose estimation to identify and predict actions
  • Initial results are promising – the model predicts violent poses with 94% accuracy



Human pose estimation is a trending topic in machine learning these days. Currently, most data scientists work on the basic side of this field, estimating the poses of just a few people at a time. But a group of researchers has now come up with a wonderful real-life use case for it – a drone that identifies violent individuals in crowds.

This project has been developed by researchers from the University of Cambridge, India’s National Institute of Technology and the Indian Institute of Science. They have even released a research paper, called ‘Eye in the Sky: Real-Time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network’, that details the deep learning framework that went into making this system.

The concept is pretty straightforward – you take a drone that is equipped with a camera and insert the deep learning model into it. This gives the drone the ability to scan a crowded place and attempt to spot if any individuals are going to turn violent.

So how does the technology work behind the scenes? First, the model uses the Feature Pyramid Network to identify humans from aerial shots. Then the region in the image where the human is identified is used by the ScatterNet Hybrid Deep Learning network for human pose estimation. This is where it gets really awesome – according to the paper, “the orientations between the limbs of the estimated pose are finally used to identify the violent individuals”!
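To make that last step a little more concrete, here is a minimal sketch of how limb orientations could be derived from an estimated pose. It is not the authors' code: the keypoint names, the choice of joint triples and the function names are all assumptions made for illustration, and it only computes the angle at a joint from 2D keypoint coordinates. In a full system, features like these would be passed to a classifier that decides whether a pose is violent.

```python
import math

# Hypothetical joint triples (end point, joint, end point).
# The names are illustrative and not taken from the paper.
LIMB_TRIPLES = [
    ("left_shoulder", "left_elbow", "left_wrist"),
    ("right_shoulder", "right_elbow", "right_wrist"),
    ("left_hip", "left_knee", "left_ankle"),
    ("right_hip", "right_knee", "right_ankle"),
]


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("two keypoints coincide, so the angle is undefined")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def limb_orientation_features(keypoints):
    """Build a feature vector of joint angles from a dict of (x, y) keypoints."""
    return [
        joint_angle(keypoints[a], keypoints[b], keypoints[c])
        for a, b, c in LIMB_TRIPLES
    ]
```

Joint angles are a convenient choice here because they do not change when the person is shifted or scaled in the frame, which matters for aerial footage where people appear at different sizes.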

The initial results show a lot of promise. When asked to identify violent poses, the model posted an accuracy of 94%. But when there are 10 people in the camera frame at a time, the accuracy drops to 79%. There is still work to be done, and I expect the results to improve in the next update.



Our take on this

This is quite a significant breakthrough in crowd detection and surveillance. It’s relatively inexpensive because the technology can be added to a simple drone and it’s good to go. It’s still very much at a nascent stage (as the above accuracy results showed) but the applications are HUGE. I can see it being used at concerts, school and college events, and sports stadiums, among other places.

From a data science point of view, this is a wonderful example of computer vision and pattern recognition. The algorithm has been very well explained in the research paper and even a newcomer to this field will be able to grasp most of the steps. Take out a few minutes and go through it – you’ll benefit immensely from the experience!




