18 Minutes, $40 – Fast.ai’s Algorithm Beat Google’s Model in a Win for Every Data Scientist!
- Students from the popular platform Fast.ai, along with Jeremy Howard, designed an algorithm that beat Google’s code according to a popular benchmark
- The benchmark used was DAWNBench, and the dataset the students used was Imagenet
- The cost of training the model and using publicly available machines (16 AWS cloud instances) came out to be just $40!
Anyone familiar with deep learning has heard of Fast.ai. It’s an open platform built by Rachel Thomas and Jeremy Howard with the aim of helping coders pick up deep learning concepts through a series of videos.
And now students from Fast.ai, along with Jeremy, have designed an algorithm that has outperformed Google’s code according to the DAWNBench benchmark. This is a very popular benchmark that measures the training time, cost and other aspects of deep learning models. In this particular case, the Imagenet database was used by the Fast.ai researchers.
The researchers managed to train their model on Imagenet to 93% accuracy in an impressive 18 minutes. The hardware they used, detailed in their blog post, was 16 public AWS cloud instances, each with 8 NVIDIA V100 GPUs. They built the algorithm using the fastai and PyTorch libraries.
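One of the training tricks fastai popularized is the one-cycle learning-rate schedule. As a rough illustration only (this is not Fast.ai’s actual DAWNBench code, and the tiny model and fake data below are placeholders), here is what a single PyTorch training step with a one-cycle schedule looks like:

```python
# Hypothetical sketch, NOT Fast.ai's DAWNBench implementation.
# The model, batch, and hyperparameters are toy placeholders; only the
# one-cycle scheduler reflects a technique the fastai library popularized.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy stand-in for a ResNet
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# One-cycle schedule: LR ramps up toward max_lr, then anneals back down
scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.4, total_steps=100)

images = torch.randn(8, 3, 32, 32)       # fake batch standing in for Imagenet images
labels = torch.randint(0, 10, (8,))      # fake class labels

optimizer.zero_grad()
loss = loss_fn(model(images), labels)    # forward pass
loss.backward()                          # backward pass
optimizer.step()                         # weight update
scheduler.step()                         # advance the learning-rate schedule
```

In practice the fastai library wraps this loop (and the distributed, multi-GPU plumbing) so users rarely write it by hand.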
Fast.ai entered Stanford’s DAWNBench competition in the first place because they wanted to show the community that you don’t need to have tons of resources to drive innovation. In fact, four months ago, a group of researchers from Fast.ai performed very impressively on both the CIFAR-10 and Imagenet datasets, coming second only to Google’s TPU Pod cluster (which is, unsurprisingly, not available to the general public).
So this time around, Jeremy and his team used multiple publicly available machines. The total cost of putting the whole thing together came out to be just $40! Jeremy has described their approach, including techniques, in much more detail here.
Our take on this
Why is Fast.ai’s win so important? Because it dispels the notion (at least for now) that you need tons of computational power to build workable deep learning models. There is a perception that the likes of Google, with their almost unlimited access to GPUs and TPUs, are the only ones who can truly rule the roost when it comes to ML innovation.
But the question is: how does this translate into real-world scenarios? It’s great to see algorithms outperforming other competitors, but until that can be put to a practical use case, it remains stuck in the research stage. I’m curious to see how Jeremy and his team plan to utilize this.
Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!