Top 5 Data Science & Machine Learning Repositories on GitHub in Jan 2018
Breakthroughs in data science and machine learning are happening at a breakneck pace. If you are working in this field, it’s extremely important to keep yourself updated with what’s new.
Following GitHub repositories is one way to do so. You can see the latest developments, interesting projects and their applications. I cannot overstate how much you can learn this way.
You can download the code and run it on your own machine, or simply keep it as a reference point for your project. Whatever the application, GitHub communities are invaluable resources.
In this post, we look at 5 GitHub repositories created in January 2018 that you must follow. This is part of a series from Analytics Vidhya that will run every month.
Detectron is a software system developed by Facebook’s AI Research team (FAIR) that “implements state-of-the-art object detection algorithms”. It is written in Python and built on the Caffe2 deep learning framework.
Along with the Python code, FAIR has also released performance baselines for over 70 pre-trained models. Once trained, the models can be deployed on the cloud and even on mobile devices.
Detectron has been covered by us here.
This is a Python replica of the AlphaZero methodology. The author has written the code to train an algorithm to play the Connect4 game. It’s not quite as complex as the famed game of Go, but with 4,531,985,219,092 possible game positions it is still a demanding test case for the approach.
The repository is useful in two ways. It shows:
- How you can build a replica of the AlphaZero methodology to play the game Connect4
- How you can adapt the code to plug in other games
Run it and you will see the beauty of AlphaZero!
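To give a feel for what "plugging in a game" involves, here is a minimal sketch of a Connect4 game state in plain Python. The class and method names (`Connect4State`, `legal_moves`, `play`, `winner`, `is_terminal`) are hypothetical and chosen for illustration; they are not the repository's actual interface. A self-play learner generally needs at least these operations from any game it trains on.

```python
from typing import List, Optional

ROWS, COLS = 6, 7
EMPTY, PLAYER_ONE, PLAYER_TWO = 0, 1, -1


class Connect4State:
    """Hypothetical minimal game state; not the repository's actual class."""

    def __init__(self, board: Optional[List[List[int]]] = None,
                 to_move: int = PLAYER_ONE):
        # Row 0 is the top of the board, row ROWS - 1 is the bottom.
        self.board = board if board is not None else [[EMPTY] * COLS for _ in range(ROWS)]
        self.to_move = to_move

    def legal_moves(self) -> List[int]:
        # A column is playable while its top cell is still empty.
        return [c for c in range(COLS) if self.board[0][c] == EMPTY]

    def play(self, col: int) -> "Connect4State":
        """Return the state reached after the current player drops a piece in `col`."""
        if col not in self.legal_moves():
            raise ValueError(f"column {col} is full or out of range")
        new_board = [row[:] for row in self.board]
        # The piece falls to the lowest empty row of the column.
        for r in range(ROWS - 1, -1, -1):
            if new_board[r][col] == EMPTY:
                new_board[r][col] = self.to_move
                break
        return Connect4State(new_board, -self.to_move)

    def winner(self) -> int:
        """Return the player holding four in a row, or EMPTY if there is none."""
        directions = [(0, 1), (1, 0), (1, 1), (1, -1)]
        for r in range(ROWS):
            for c in range(COLS):
                piece = self.board[r][c]
                if piece == EMPTY:
                    continue
                for dr, dc in directions:
                    end_r, end_c = r + 3 * dr, c + 3 * dc
                    if not (0 <= end_r < ROWS and 0 <= end_c < COLS):
                        continue
                    if all(self.board[r + i * dr][c + i * dc] == piece for i in range(4)):
                        return piece
        return EMPTY

    def is_terminal(self) -> bool:
        # The game ends on a win or when no column can be played.
        return self.winner() != EMPTY or not self.legal_moves()


# Usage: player one stacks four pieces in column 0 while player two uses column 1.
state = Connect4State()
for col in [0, 1, 0, 1, 0, 1, 0]:
    state = state.play(col)
```

Swapping in another game would then mean writing a different class with the same four operations, which is roughly the kind of adaptation the second bullet above refers to.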
Caire is a content-aware image resizing library. Currently, most applications either give you the option of cropping an image or changing its aspect ratio. This often leads to either the main parts being left out or the image becoming distorted. This is where Caire comes into play.
It supports both shrinking and enlarging any image, resizing it horizontally or vertically, and does not require any third-party library. It uses edge detection to generate an energy map of the image. Based on that map, it finds seams in the image and applies its resizing algorithm along them. The process of how this works has been illustrated in the three images below:
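The energy-map-and-seam idea can be sketched in a few lines of plain Python. This is a simplified illustration of seam carving under stated assumptions, not Caire's own code: it works on a grayscale image stored as a list of rows, uses a simple neighbour-difference gradient as the energy (an assumption made here for brevity, in place of a proper edge detector), and only shrinks the width. All function names are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def energy_map(img: Image) -> Image:
    """Energy per pixel: |dx| + |dy| from neighbouring pixels, clamped at borders."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


def find_vertical_seam(energy: Image) -> List[int]:
    """Return one column index per row for the lowest-energy connected seam."""
    h, w = len(energy), len(energy[0])
    # Dynamic programming: cost[y][x] is the cheapest seam ending at (y, x).
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(w), key=lambda c: cost[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img: Image, seam: List[int]) -> Image:
    """Delete one pixel per row, at the column given by the seam."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]


def shrink_width(img: Image, n: int) -> Image:
    """Shrink the image by n columns, recomputing the energy after each seam."""
    for _ in range(n):
        img = remove_vertical_seam(img, find_vertical_seam(energy_map(img)))
    return img


# Usage: a bright two-pixel stripe on a dark background survives the resize,
# because the seams run through the flat, low-energy background instead.
picture = [[0, 0, 9, 9, 0, 0] for _ in range(4)]
smaller = shrink_width(picture, 2)
```

Enlarging works on the same principle in reverse: instead of deleting the lowest-energy seams, the resizer duplicates them.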
Covered by Analytics Vidhya here, this is an open-source Python implementation inspired by DeepMind’s AlphaGo. It’s a neural-network-based Go AI, developed using TensorFlow.
The goals of this project, as described by the authors, are listed below:
- Provide a clear set of learning examples using Tensorflow, Kubernetes, and Google Cloud Platform for establishing Reinforcement Learning pipelines on various hardware accelerators.
- Reproduce the methods of the original DeepMind AlphaGo papers as faithfully as possible, through an open-source implementation and open-source pipeline tools.
- Provide our data, results, and discoveries in the open to benefit the Go, machine learning, and Kubernetes communities.
You can access the entire Python code on this GitHub repository.
Alpha Pose is a remarkably accurate tool for estimating the poses of multiple people (you can see this in the GIFs on its GitHub page). It’s the first open-source system to achieve 70+ mAP on the COCO dataset and 80+ mAP on the MPII dataset. Additionally, the authors have developed ‘Pose Flow’, which is an online pose tracker.
And here are two bonus repositories for you!
VisualDL is a tool that can visualize the entire deep learning process for us. It’s an incredibly powerful visualization tool that helps us design deep learning jobs. VisualDL natively supports Python: by adding just a few lines of Python code to our neural network model, we can generate plenty of visualizations to understand what the model is doing. At a lower level, VisualDL is also written in C++.
Currently, VisualDL provides four components (more will be added soon):
You can read more about these components, and how VisualDL works, in our post here.
There are a ton of things to do when starting a TensorFlow project. The underlying idea behind this repository is to wrap up those things into a simple and well-defined structure. The TensorFlow Project Template combines simplicity, best practices for creating and maintaining a folder structure, and good OOP design.
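As a rough picture of what such a structure buys you, here is a minimal sketch of the base-class pattern in plain Python. It deliberately avoids TensorFlow so that it runs anywhere, and every name in it (`BaseModel`, `BaseTrainer`, `MeanModel`, the config keys) is hypothetical and chosen for illustration, not taken from the repository. The point is only that the training loop lives in one place while each project supplies its own model.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseModel(ABC):
    """Hypothetical base class: every model receives a config and exposes one step."""

    def __init__(self, config: Dict[str, float]):
        self.config = config

    @abstractmethod
    def train_step(self, batch: List[float]) -> float:
        """Update the parameters on one batch and return the batch loss."""


class BaseTrainer:
    """Hypothetical trainer: owns the epoch loop so that models do not repeat it."""

    def __init__(self, model: BaseModel, data: List[List[float]],
                 config: Dict[str, float]):
        self.model = model
        self.data = data
        self.config = config

    def train_epoch(self) -> float:
        batch_losses = [self.model.train_step(batch) for batch in self.data]
        return sum(batch_losses) / len(batch_losses)

    def train(self) -> List[float]:
        return [self.train_epoch() for _ in range(int(self.config["num_epochs"]))]


class MeanModel(BaseModel):
    """Toy model: fits a single number to the data with gradient descent."""

    def __init__(self, config: Dict[str, float]):
        super().__init__(config)
        self.weight = 0.0

    def train_step(self, batch: List[float]) -> float:
        errors = [self.weight - x for x in batch]
        loss = sum(e * e for e in errors) / len(errors)    # mean squared error
        gradient = 2.0 * sum(errors) / len(errors)         # d(loss) / d(weight)
        self.weight -= self.config["learning_rate"] * gradient
        return loss


# Usage: only the model class changes from project to project.
config = {"learning_rate": 0.1, "num_epochs": 20}
batches = [[1.0, 2.0], [3.0, 4.0]]
losses = BaseTrainer(MeanModel(config), batches, config).train()
```

In a real project each of these classes would typically sit in its own folder, alongside the data loader and the configuration files.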
Do you know of any other repositories created last month that we should be aware of? Feel free to let us know in the comments below.