IBM’s Machine Learning Library is 46 Times Faster than TensorFlow!

Pranav Dar 25 Mar, 2018 • 2 min read

Overview

  • IBM’s new Machine Learning library ran a logistic regression model 46 times faster than TensorFlow
  • The model was trained on 4.2 billion examples and 1 million variables
  • Google’s TensorFlow took 70 minutes, IBM’s library took 91.5 seconds

 

Introduction

The race to become the quickest and most efficient machine learning library is now in full flight. IBM claims that running machine learning workloads on its POWER servers is an incredible 46 times faster than on TensorFlow running in Google Cloud.

Earlier this year, a Google software engineer wrote a blog post describing how the team used Google Cloud Machine Learning and TensorFlow to tackle click prediction. They trained their deep neural network model “to predict display ad clicks on Criteo Labs clicks logs. These logs are over 1TB in size and include feature values and click feedback from millions of display ads”.

For them, data preprocessing took about an hour, followed by 70 minutes of training; the evaluation loss was reported to be 0.13. They did manage to reduce this evaluation loss and get more accurate results, but only at the cost of longer training time.

But IBM blew those results out of the water. Its training algorithm, running on POWER9 servers with GPUs, outperformed Google Cloud Platform's in the initial training phase.

The IBM researchers trained their model on the Criteo Labs click logs, the same data source used by Google earlier, containing 4.2 billion training examples and 1 million variables. Like Google, they used logistic regression. However, IBM used a different ML library – Snap Machine Learning (Snap ML).

IBM’s model completed the same logistic regression in 91.5 seconds! That’s a remarkable 46 times faster than Google’s previous attempt.

IBM posted the below comparisons between its Snap ML library and competing libraries:

You can read more about the Snap Machine Learning library in IBM’s research paper here.

 

Our take on this

A 46 times improvement over TensorFlow is truly impressive. One caveat, though: the two models were not run on comparable hardware configurations, so we can't validate IBM's results until the company publicly releases more information.

Having said that, IBM has definitely caught the machine learning community's attention and has created an opportunity to introduce its POWER9 servers and the Snap ML library to the general public.

 

Subscribe to AVBytes here to get regular data science, machine learning and AI updates in your inbox!

 

Also, go ahead and participate in our Hackathons, including the DataHack Premier League and Lord of the Machines!

 


Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.


Responses From Readers


Bhasha 23 Mar, 2018

What's the name of the IBM Machine Learning Library?

Hrishi 24 Mar, 2018

If even a linear classifier with logistic loss worked for this problem, comparing with TensorFlow is not fair, since a neural network is clearly not needed and TensorFlow is therefore not the right tool. Benchmarking against Vowpal Wabbit or BIDMach would've been more appropriate. You should have brought this point out to help your readers see the results in the right context, instead of repeating what IBM says in their paper.

Yk 24 Mar, 2018

These are not comparable. Why compare an ML library to a deep learning library, assuming each is used for its most common purposes? So you have more data on? Solver hyperparameter TPA-SCD, inter-socket CoCoA – a very odd way of solving that specific problem. Please read section 4.1, specifically meta+RDD and persistent memory, and see if you can suppress a smile.

Dnyandeo Patil 24 Mar, 2018

Google is using Nvidia GPU server technology, whereas IBM has its own new brain chip, which is the best at analysis modeled on cognitive processes, and the US defence sector is using the same technology for military AI purposes. IBM Watson is remarkably successful in the USA.

asdfasf 24 Mar, 2018

Where can I download the library?

Patrice 28 Mar, 2018

Great. IBM's is 46 times faster, but it'll cost $1M per year to try it. Google, on the other hand, will make theirs available for everyone to try. Who's got 1 million variables anyway?

Peter Hogon 05 Apr, 2018

Sorry for being negative, but looking at the paper, this looks like typical IBM manipulative marketing stuff. First of all, they compare a V100 to sklearn's CPU implementation – is it really surprising that the V100 will be faster? Not to mention that the TF example processes the data completely differently from the other two, so it ends up running slower than sklearn. The paper even somewhat mentions this, but in separate parts, so it's not really visible at first glance: Snap ML copies the data into GPU memory in full, while TensorFlow processes data in batches, reading from disk each time. Am I missing something, or is this intentionally misleading? If you want to compare the performance of two libraries, measure their actual performance.