Top 7 Data Science & Machine Learning GitHub Repositories in March 2018

Pranav Dar 31 May, 2020 • 4 min read

Introduction

I love GitHub! Not only can you follow the work happening in different domains, but you can also collaborate on multiple open source projects. Major tech companies, from Google to Facebook, upload the code for their open source projects to GitHub so the wider coding / ML community can benefit from it.

But if you are too busy, or find following GitHub difficult, we bring you a summary of the top repositories every month. You can keep yourself updated with the latest breakthroughs and even replicate the code on your own machine!

This month’s list includes some awesome libraries. From Google Brain’s AstroNet to an artificial neural network visualizer, we have curated a list of unique repositories that will expand your machine learning horizons.

Are you ready? Let’s look at last month’s top 7 then!

You can check out the top 5 repositories that we picked out in January here and February here.

 

Person Blocker

‘Person Blocker’ is a Python library that automatically blocks out entire people in images using a pre-trained neural network. The algorithm uses a Mask R-CNN model that is pre-trained on the MS COCO dataset. And the cherry on top? No GPU required!

And it is not limited to people: the algorithm can block out other objects as well. It recognizes 80 different types of objects, including vehicles, animals, and electronic gadgets, among other things.
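To make the idea concrete, here is a minimal sketch of the "blocking" step only. It assumes a segmentation model such as Mask R-CNN has already produced a per-pixel mask for the detected person or object. The names `block_out`, `image` and `mask` are made up for illustration and are not taken from the Person Blocker code.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[bool]]        # True where the detected object is


def block_out(image: Image, mask: Mask, fill: Pixel = (0, 0, 0)) -> Image:
    """Return a copy of `image` with every masked pixel replaced by `fill`."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [fill if hit else pixel for pixel, hit in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


# Hypothetical 2x3 white image and a mask covering the middle column.
image = [[(255, 255, 255)] * 3 for _ in range(2)]
mask = [[False, True, False], [False, True, False]]
blocked = block_out(image, mask)
```

In a real pipeline the mask would come from the neural network, one mask per detected object, and the fill could be any color.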

You can read more about this library on Analytics Vidhya’s blog here.

 

AstroNet

Source: Yahoo

Back in December 2017, the Google Brain team revealed it had discovered two new planets by applying AstroNet, its deep neural network model for working on astronomical data. It was a monumental discovery that went to show the far-reaching impact of machine learning in today’s world.

Now, Google Brain has released the entire code that went into making that technology and they’ve made it available for everyone. The model is based on a convolutional neural network (CNN).

We covered AstroNet in more detail in this AVBytes article.

 

ANN Visualizer

ANN Visualizer is a Python library that enables us to visualize an Artificial Neural Network using just a single line of code. It works with Keras and makes use of Python’s ‘graphviz’ library to create a neat and presentable graph of the neural network you’re building.
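If you are curious what a graph description of a network looks like, here is a rough sketch that builds Graphviz DOT text for a small fully connected network in plain Python. This is not how ANN Visualizer itself is implemented; the function name `network_to_dot` and the node naming scheme are assumptions made for this example.

```python
from typing import List


def network_to_dot(layer_sizes: List[int]) -> str:
    """Build Graphviz DOT text for a fully connected network.

    layer_sizes[i] is the number of units in layer i (input layer first).
    """
    lines = ["digraph ann {", "  rankdir=LR;"]
    # One node per unit, named l<layer>_<index>.
    for layer, size in enumerate(layer_sizes):
        for unit in range(size):
            lines.append(f"  l{layer}_{unit};")
    # Connect every unit to every unit in the next layer.
    for layer in range(len(layer_sizes) - 1):
        for src in range(layer_sizes[layer]):
            for dst in range(layer_sizes[layer + 1]):
                lines.append(f"  l{layer}_{src} -> l{layer + 1}_{dst};")
    lines.append("}")
    return "\n".join(lines)


dot_text = network_to_dot([3, 4, 1])  # 3 inputs, 4 hidden units, 1 output
```

Saving `dot_text` to a file and rendering it with the Graphviz `dot` tool would give a left-to-right drawing of the layers.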

Check out Analytics Vidhya’s detailed coverage of this awesome library here.

 

Fast Pandas

Any Python novice will tell you how flexible and powerful the pandas library is. Being a data scientist, you need to be equally flexible and think of different ways to approach a problem. The ‘Fast Pandas’ repository aims to benchmark the different available methods in such situations.

This is a very useful repository and one we highly recommend trying out at least once.
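The benchmarking idea is easy to try for yourself. The sketch below times three ways of squaring a list with the standard `timeit` module, checking first that all of them return the same answer. Plain Python lists stand in for pandas operations here so the example runs without extra installs; the helper `benchmark` and the three candidate functions are made up for illustration and are not part of Fast Pandas.

```python
import timeit
from typing import Callable, Dict, List

values = list(range(10_000))


def square_with_loop(xs: List[int]) -> List[int]:
    out = []
    for x in xs:
        out.append(x * x)
    return out


def square_with_comprehension(xs: List[int]) -> List[int]:
    return [x * x for x in xs]


def square_with_map(xs: List[int]) -> List[int]:
    return list(map(lambda x: x * x, xs))


def benchmark(candidates: Dict[str, Callable[[List[int]], List[int]]],
              data: List[int], repeats: int = 20) -> Dict[str, float]:
    """Time each candidate on the same data; return average seconds per call."""
    expected = next(iter(candidates.values()))(data)
    timings = {}
    for name, func in candidates.items():
        # Only compare methods that produce the same answer.
        if func(data) != expected:
            raise ValueError(f"{name} disagrees with the baseline")
        timings[name] = timeit.timeit(lambda f=func: f(data), number=repeats) / repeats
    return timings


timings = benchmark(
    {
        "loop": square_with_loop,
        "comprehension": square_with_comprehension,
        "map": square_with_map,
    },
    values,
)
fastest = min(timings, key=timings.get)
```

The actual numbers depend on your machine and Python version, so treat any single run as a rough guide and repeat the measurement before drawing conclusions.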

 

TensorFlow.js

TensorFlow.js is an open-source JavaScript library that you can use to build and train machine learning models in your web browser. If you’re familiar with Keras, the high-level layers API will seem very familiar to you.

It uses WebGL to provide GPU acceleration automatically when a GPU is available. You can import existing pre-trained models and also re-train existing ML models within your web browser.

Check out our coverage of this here.

 

Caffe64

Caffe64 is a simple, small, yet incredibly functional neural network library. We all know how onerous it can be to install a neural network library. According to the developers, Caffe64 ditches all the hard work and is the “easiest to compile and most lightweight neural network library, period”.

If you’ve used Caffe before, this will be a piece of cake for you!

 

TensorFlow Hub

TensorFlow Hub is a library to foster the publication, discovery, and consumption of reusable parts of machine learning models. In particular, it provides modules, which are pre-trained pieces of TensorFlow models that can be reused on new tasks. By reusing a module on a related task, you can:

  • train a model with a smaller dataset
  • improve generalization
  • significantly speed up training
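The reuse idea can be sketched in a few lines of plain Python. In this toy example, `pretrained_module` is a hypothetical stand-in for a frozen, pre-trained module (it is not a TensorFlow Hub API), and only a small linear "head" is trained on top of its output using five data points. The function names, learning rate and dataset are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def pretrained_module(x: float) -> List[float]:
    """Hypothetical frozen feature extractor; nothing in it is ever updated."""
    return [x, x * x]


def train_head(features: Sequence[Sequence[float]], targets: Sequence[float],
               lr: float = 0.1, epochs: int = 500) -> Tuple[List[float], float]:
    """Fit a linear layer on top of fixed features with plain SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for feats, target in zip(features, targets):
            error = sum(w * f for w, f in zip(weights, feats)) + bias - target
            weights = [w - lr * error * f for w, f in zip(weights, feats)]
            bias -= lr * error
    return weights, bias


# A deliberately tiny dataset for the new task: y = 2x^2 + 1.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [2 * x * x + 1 for x in xs]

# Run the frozen module once, then train only the small head.
features = [pretrained_module(x) for x in xs]
weights, bias = train_head(features, ys)


def predict(x: float) -> float:
    return sum(w * f for w, f in zip(weights, pretrained_module(x))) + bias
```

Because the module already supplies useful features, the head has only three numbers to learn, which is why a handful of examples is enough in this toy setting.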

 

Have you used any of these libraries before? How was your experience? Let us know in the comments section below!

 


Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.


Responses From Readers


Frank francisco 12 Apr, 2018

Check out etherscan ml for a solid blockchain machine learning repo built on ethereum. Not related but a big fan.

Data Science Training In Pune 12 Apr, 2018

In this article shows multiple domains...plz moreover information about data science.

Sanil 12 Apr, 2018

Analytics Vidhya is doing great job of making this information easily available. Requesting to post more R related stuff too. Thanks.

Jacob 13 Apr, 2018

How did you select these as the "top 5"? Is this data driven and if so how precisely is it data driven?

Rahul 13 Apr, 2018

Is there python library available to analyse high-dimensional hyperspectral data? I know about spectral-python, but it is not that good.

Don Carpenter 13 Apr, 2018

This is garbage. Sorry but you say it yourself, Person blocker is JUST MaskRCNN with COCO and a filter. It achieves nothing, brings nothing new and is honestly useless as-is.

Varsha Kulkarni 18 Apr, 2018

Nice article
