Twitter has revealed that it is using machine learning to crop photo previews so that they show the most interesting part of each image. Under the existing behaviour, photos are cropped with little regard for what is in them or what should be shown on the user’s timeline.
In a blog post, researcher Lucas Theis and machine learning lead Zehan Wang explained that they initially relied only on face detection to crop the photos. As you can imagine, this had its limitations. Pictures of cats, dogs, scenery and other subjects without faces were not picked up by the algorithm, which reduced the accuracy of the crops.
To account for everything in a picture, the team decided to crop images using “saliency”, a measure of how likely each region is to catch the eye. To define which features are salient, they drew on data from eye-tracking studies, which record the parts of a picture people look at first. They then trained a neural network on this data to predict the salient regions of new images.
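To make the idea concrete, here is a minimal sketch of the cropping step, assuming the network has already produced a saliency map as a grid of scores (one per pixel). The function name `crop_box_from_saliency` and the "centre on the single highest score" rule are illustrative assumptions, not Twitter's published implementation.

```python
def crop_box_from_saliency(saliency, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop window centred on the saliency peak.

    `saliency` is a list of rows, each a list of scores (hypothetical model output).
    """
    height = len(saliency)
    width = len(saliency[0])
    if crop_w > width or crop_h > height:
        raise ValueError("crop window is larger than the image")

    # Locate the most salient pixel (the first one found wins on ties).
    peak_y, peak_x = max(
        ((y, x) for y in range(height) for x in range(width)),
        key=lambda pos: saliency[pos[0]][pos[1]],
    )

    # Centre the window on the peak, then clamp it so it stays inside the image.
    left = min(max(peak_x - crop_w // 2, 0), width - crop_w)
    top = min(max(peak_y - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h


# A toy 4x6 saliency map whose peak sits at row 1, column 4.
toy_map = [
    [0.0, 0.1, 0.1, 0.2, 0.3, 0.1],
    [0.0, 0.1, 0.2, 0.4, 0.9, 0.2],
    [0.0, 0.0, 0.1, 0.2, 0.3, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.1, 0.0],
]
box = crop_box_from_saliency(toy_map, crop_w=2, crop_h=2)
```

The clamping step matters in practice: a salient point near the edge of a photo would otherwise push the crop window outside the image.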
But the real challenge was doing this in real time. Millions of images are uploaded to Twitter, so how do you use this neural network to crop photos without affecting the load time for the end user?
The team used two techniques to reduce the size of the neural network and its computational requirements. The first is knowledge distillation, which they used to train a smaller network to imitate the more powerful one. In this approach, an ensemble of large networks generates predictions on a set of images. These predictions, together with some third-party saliency data, are then used to train a smaller, faster network.
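The distillation idea can be sketched in a few lines, assuming a deliberately tiny setup: the "teachers" are plain Python functions standing in for large networks, and the "student" is a two-parameter linear model. The names `ensemble_predict`, `distill` and `extra_pairs` are hypothetical, and the learning rate is chosen for inputs in the range 0 to 1; a real system would use a deep learning framework and image data.

```python
def ensemble_predict(teachers, x):
    """Average the predictions of several (large, slow) teacher models."""
    return sum(teacher(x) for teacher in teachers) / len(teachers)


def distill(teachers, inputs, extra_pairs=(), epochs=500, lr=0.5):
    """Fit a tiny linear student, y = w * x + b, to imitate the teacher ensemble."""
    # Soft targets come from the ensemble; extra_pairs stands in for any
    # additional labelled (input, target) data mixed into training.
    data = [(x, ensemble_predict(teachers, x)) for x in inputs] + list(extra_pairs)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in data:
            error = (w * x + b) - target
            grad_w += 2 * error * x / len(data)
            grad_b += 2 * error / len(data)
        # Plain gradient descent on the mean squared error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Two toy teachers; their average is 0.6 * x + 0.2, which the student should learn.
teachers = [lambda x: 0.5 * x + 0.1, lambda x: 0.7 * x + 0.3]
w, b = distill(teachers, inputs=[0.0, 0.25, 0.5, 0.75, 1.0])
```

The student never sees the original labels here, only the ensemble's outputs, which is the core of the technique: the small model learns to reproduce what the large ones already know.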
The second technique was pruning, which iteratively removes features that contribute little to the network’s performance but are costly to compute. Used together, the two techniques enabled the model to analyse and crop a picture 10 times faster than before, which is fast enough to work in real time as soon as an image is uploaded to Twitter.
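As a rough illustration of iterative pruning, the sketch below repeatedly drops the feature whose removal hurts a toy linear model the least, and stops once the loss would rise by more than a set budget. This greedy, loss-based criterion and the names `mse` and `prune_features` are assumptions made for the example; it is not necessarily the criterion the team used, and it skips the retraining and compute-cost weighting a production pipeline would likely include.

```python
def mse(weights, active, samples):
    """Mean squared error of a linear model that only uses the `active` features."""
    total = 0.0
    for features, target in samples:
        prediction = sum(weights[i] * features[i] for i in active)
        total += (prediction - target) ** 2
    return total / len(samples)


def prune_features(weights, samples, max_loss_increase=0.01):
    """Greedily remove features while the loss stays within budget of the baseline."""
    active = set(range(len(weights)))
    baseline = mse(weights, active, samples)
    while len(active) > 1:
        # Score each remaining feature by the loss we would get without it.
        candidate, loss = min(
            ((i, mse(weights, active - {i}, samples)) for i in active),
            key=lambda pair: pair[1],
        )
        if loss - baseline > max_loss_increase:
            break  # every remaining feature matters too much to drop
        active.remove(candidate)
    return sorted(active)


# Features 1 and 3 barely contribute, so they are the ones expected to go.
weights = [2.0, 0.001, -1.5, 0.0]
samples = [
    ([1, 1, 0, 1], 2.0),
    ([0, 1, 1, 1], -1.5),
    ([1, 0, 1, 0], 0.5),
    ([1, 1, 1, 1], 0.5),
]
kept = prune_features(weights, samples)
```

Each removed feature is one less value to compute per image, which is where the speed-up comes from.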
In the above images, you can see the contrast between the crops produced before the neural network was applied (on the left) and the same images after the model does its work and focuses on the important parts.
Our take on this
Not every machine learning technique has to lead to a breakthrough, as this tweak from Twitter shows. It will help users focus immediately on what stands out in an image rather than spend time enlarging it. It’s a welcome addition, and Twitter is currently rolling out the change to everyone.