Top 7 Data Science & Machine Learning GitHub Repositories in March 2018
I love GitHub! Not only can you follow the work happening in different domains, but you can also collaborate on multiple open source projects. Tech companies, from Google to Facebook, publish their open source code on GitHub so the wider coding / ML community can benefit from it.
But, if you are too busy, or find following GitHub difficult, we bring you a summary of top repositories month on month. You can keep yourself updated with the latest breakthroughs and even replicate the code on your own machine!
This month’s list includes some awesome libraries. From Google Brain’s AstroNet to an artificial neural network visualizer, we have curated a list of unique repositories that will expand your machine learning horizons.
Are you ready? Let’s look at last month’s top 7 then!
‘Person Blocker’ is a Python library that automatically blocks out entire people in images using a pre-trained neural network. The algorithm uses a Mask R-CNN model pre-trained on the MS COCO dataset. And the cherry on top? No GPU required!
And it is not limited to people: the algorithm can block out other objects as well. It recognizes 80 different types of objects, including vehicles, animals, and electronic gadgets.
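To make the "blocking out" step concrete, here is a minimal NumPy sketch of what happens once a segmentation model has produced a mask for a detected object. It is not Person Blocker's own code: the function name `block_out`, the grey fill color, and the toy image are assumptions made for illustration, and the mask is built by hand rather than predicted by Mask R-CNN.

```python
import numpy as np

def block_out(image, mask, color=(128, 128, 128)):
    """Return a copy of `image` with every pixel under `mask` painted `color`.

    image: (H, W, 3) uint8 array
    mask:  (H, W) boolean array, True where the object was detected
    """
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image's height and width")
    blocked = image.copy()   # leave the caller's image untouched
    blocked[mask] = color    # boolean indexing selects only the masked pixels
    return blocked

# Toy 4x4 white image with a hand-made 2x2 "person" mask in the centre.
image = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

result = block_out(image, mask)
```

In a real pipeline the mask would come from the segmentation model, with one mask per detected object, so the same function could be applied once per object you want to hide.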
You can read more about this library on Analytics Vidhya’s blog here.
Back in December 2017, the Google Brain team revealed it had discovered two new planets by applying AstroNet, its deep neural network model for working on astronomical data. It was a monumental discovery that went to show the far-reaching impacts of machine learning in today’s world.
Now, Google Brain has released the code behind that technology and made it available to everyone. The model is based on a convolutional neural network (CNN).
We have you covered in this AVBytes article on AstroNet.
ANN Visualizer is a Python library that enables us to visualize an Artificial Neural Network using just a single line of code. It works with Keras and makes use of Python’s ‘graphviz’ library to create a neat and presentable graph of the neural network you’re building.
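To give a feel for what a Graphviz-based network diagram involves, here is a minimal sketch that builds Graphviz DOT source for a small fully connected network in plain Python. It is not ANN Visualizer's own code or API: the function name `dense_network_to_dot` and the layer sizes are made up for illustration.

```python
def dense_network_to_dot(layer_sizes, title="Neural network"):
    """Build Graphviz DOT source for a fully connected network.

    layer_sizes: number of neurons per layer, e.g. [3, 4, 1].
    """
    lines = ["digraph G {", "  rankdir=LR;", f'  label="{title}";']
    # One node per neuron, named by layer index and position in the layer.
    for layer, size in enumerate(layer_sizes):
        for unit in range(size):
            lines.append(f'  n{layer}_{unit} [shape=circle, label=""];')
    # Connect every neuron to every neuron in the following layer.
    for layer in range(len(layer_sizes) - 1):
        for src in range(layer_sizes[layer]):
            for dst in range(layer_sizes[layer + 1]):
                lines.append(f"  n{layer}_{src} -> n{layer + 1}_{dst};")
    lines.append("}")
    return "\n".join(lines)

dot_source = dense_network_to_dot([3, 4, 1], title="My first neural network")
print(dot_source)
```

Saving the output to a file and rendering it with Graphviz's `dot` tool (for example `dot -Tpng network.gv -o network.png`) should produce the diagram.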
Check out Analytics Vidhya’s detailed coverage of this awesome library here.
Any Python novice will tell you how flexible and powerful the pandas library is. As a data scientist, you need to be equally flexible and think of different ways to approach a problem. The ‘Fast Pandas’ repository aims to benchmark the different methods available in such situations.
This is a very useful library and one we highly recommend trying out at least once.
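As a rough illustration of this kind of benchmark, the sketch below times two ways of adding two columns: a row-wise `apply` and a vectorized column operation. It is not taken from the Fast Pandas repository; the data, the column names `a` and `b`, and the helper `benchmark` are assumptions chosen for the example.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data; the column names and size are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.random(10_000), "b": rng.random(10_000)})

def with_apply(frame):
    # Calls a Python function once per row.
    return frame.apply(lambda row: row["a"] + row["b"], axis=1)

def vectorized(frame):
    # Performs a single column-wise operation.
    return frame["a"] + frame["b"]

def benchmark(func, frame, repeat=3):
    # Best-of-`repeat` wall-clock time for one call, in seconds.
    return min(timeit.repeat(lambda: func(frame), number=1, repeat=repeat))

timings = {f.__name__: benchmark(f, df) for f in (with_apply, vectorized)}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Both functions return the same values, so the timings compare like with like. The exact numbers depend on your machine and pandas version, which is why it pays to measure rather than guess.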
It’s available with GPU acceleration and also automatically supports WebGL. You can import existing pre-trained models and also re-train entire existing ML models within your web browser.
Check out our coverage of this here.
Caffe64 is a simple, small yet incredibly functional neural network library. We all know how onerous it is to install a neural network library. According to the developers, Caffe64 ditches all the hard work and is the “easiest to compile and most lightweight neural network library, period”.
If you’ve used Caffe before, this will be a piece of cake for you!
TensorFlow Hub is a library to foster the publication, discovery, and consumption of reusable parts of machine learning models. In particular, it provides modules, which are pre-trained pieces of TensorFlow models that can be reused on new tasks. By reusing a module on a related task, you can:
- train a model with a smaller dataset
- improve generalization
- significantly speed up training
Have you used any of these libraries before? How was your experience? Let us know in the comments section below!