Pranav Dar — February 23, 2018
AVbytes

Overview

  • Feature Labs has launched a set of tools to make machine learning algorithms train quicker
  • Automated feature engineering is at the heart of it
  • Tested in three competitions, where it needed only 1/10th of the time taken by human competitors
  • Designed to work with common frameworks like Pandas for data preparation and scikit-learn for ML
  • Works with both Python 2 and 3

 

Introduction

Feature engineering has been at the core of many hackathon-winning solutions. It has become the de facto option when you’re looking to differentiate your solution from the competition. But it’s often difficult to engineer new features from the dataset you’ve been given, and the process consumes both time and energy.

This is where the tool set from Feature Labs comes into play. ‘Featuretools’ is the company’s open-source framework for automating feature engineering.

The company built it around a process called Deep Feature Synthesis (DFS). According to Feature Labs CEO Max Kanter, DFS creates features from raw relational and transactional datasets, such as visits to a website or abandoned cart items, and automatically converts them into predictive signals. The above image gives you a general idea of how the tool works.
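To make the idea concrete, here is a minimal hand-rolled sketch in plain Pandas of the kind of stacked aggregation that DFS is described as automating. It does not use the Featuretools API. The tables, column names and chosen aggregations are all hypothetical toy values picked for illustration.

```python
import pandas as pd

# Toy relational data (hypothetical): customers -> sessions -> transactions.
customers = pd.DataFrame({"customer_id": [1, 2]})
sessions = pd.DataFrame({
    "session_id": [10, 11, 12],
    "customer_id": [1, 1, 2],
})
transactions = pd.DataFrame({
    "transaction_id": [100, 101, 102, 103, 104],
    "session_id": [10, 10, 11, 12, 12],
    "amount": [20.0, 30.0, 10.0, 5.0, 15.0],
})

# Depth 1: aggregate each session's transactions into session-level features.
session_feats = (
    transactions.groupby("session_id")["amount"]
    .agg(["sum", "mean", "count"])
    .add_prefix("amount_")
    .reset_index()
)
sessions_with_feats = sessions.merge(session_feats, on="session_id", how="left")

# Depth 2: aggregate the session-level features again, up to the customer.
# Stacking aggregations across relationships is what makes the features "deep".
customer_feats = (
    sessions_with_feats.groupby("customer_id")
    .agg(
        num_sessions=("session_id", "count"),
        mean_session_total=("amount_sum", "mean"),
        max_session_total=("amount_sum", "max"),
    )
    .reset_index()
)

# One row per customer, ready to be used as model input.
feature_matrix = customers.merge(customer_feats, on="customer_id", how="left")
print(feature_matrix)
```

Writing these group-and-merge steps by hand for every table and every aggregation is the repetitive part that an automated tool aims to remove.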

It works with both Python 2 and 3, and has been designed to work with common frameworks like Pandas for data preparation and scikit-learn for machine learning.
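As a rough illustration of what that compatibility means in practice, the sketch below assumes the engineered features end up in an ordinary Pandas DataFrame, which can then be passed directly to a scikit-learn estimator. The feature values, the `made_purchase` label and the choice of a decision tree are hypothetical and not taken from Feature Labs.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature matrix: one row per customer, one column per feature.
feature_matrix = pd.DataFrame(
    {
        "num_sessions": [1, 2, 8, 9, 1, 10],
        "mean_session_total": [5.0, 8.0, 40.0, 55.0, 6.0, 60.0],
    },
    index=pd.Index([1, 2, 3, 4, 5, 6], name="customer_id"),
)

# Hypothetical target: whether each customer went on to make a purchase.
labels = pd.Series(
    [0, 0, 1, 1, 0, 1], index=feature_matrix.index, name="made_purchase"
)

# scikit-learn estimators accept DataFrames directly, so no conversion is needed.
model = DecisionTreeClassifier(random_state=0)
model.fit(feature_matrix, labels)

# Predictions on the training rows, only to show the round trip works.
predictions = model.predict(feature_matrix)
print(predictions)
```

In a real project you would evaluate on held-out data rather than on the rows used for training.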

According to their official website, the tool was “tested against 1000 data scientists in three world wide competitions. On average, Feature Labs performed as well as top human competitors and only required 1/10th of the time”.

Early customers of the company include Spanish bank BBVA and developers at MIT. In fact, they’ve published a case study on how BBVA used Featuretools to create a credit card fraud detection system. You can view it here.

 

Our take on this

Feature engineering is one of the most important steps in any machine learning pipeline. Whether it’s differentiating your ML algorithm in a hackathon, or creating features to mine the most out of your data as an organization, it’s a critical technique.

This release will not only save the user (or company) a lot of time, it will also let them shift their focus to other areas of the data science life cycle. The fact that it’s available for Python and can be used with common frameworks is a huge plus.

 


 

About the Author

Pranav Dar

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.


3 thoughts on "Perform Automated Feature Engineering in Python with Featuretools"

Fawad Mahdi says: February 26, 2018 at 3:20 pm
This is truly awesome. Will save a whole lot of time, but will be interesting to see its practical implementation. Has it been released already?
Pranav Dar says: February 26, 2018 at 6:26 pm
Hi Fawad, Yes it's available on Feature Lab's website (link is in the article above).
Nick Bernini says: March 18, 2018 at 5:21 pm
Can you guys write a demo post on this? I’ve gone through their examples on git but am looking for more information.
