Failing Fast with Deep Learning | DataHack Summit 2018

Failing Fast with Deep Learning

The Problem

As technology becomes cheaper and more available, we start taking it for granted. Nowhere is this more true than in machine learning. As machines become cheaper and data becomes more and more voluminous, our approach to specific machine learning problems often, and understandably, becomes haphazard. Since GPUs are much cheaper and more widely available than ever before, we implicitly believe that throwing enough artificial neurons at a problem will eventually solve it.

While this by itself may be true, it is not uncommon for ML practitioners to realize – unfortunately only in hindsight – that most of the iterations required to build a successful predictive model were unnecessary. Ironically, these ‘missteps’ are often what lead us to the correct answer. Solving a machine learning problem is like traversing a minefield, where the safest path can only be determined by blowing up a large number of mines. You can only figure out the right approach after making a bunch of mistakes. Since there is no general rule for determining a ‘best model’, most problems in deep learning can only be solved by trial and error. To a large extent, this ‘see what sticks’ approach cannot be avoided. However, it can be curbed significantly with a structured approach to running machine learning experiments. This structured approach is what this talk is about.

The Solution

The building blocks of neural networks and the science behind them, including that of their efficiency and trainability, are already very well understood. The heuristics required to ascertain reasonable convergence and predictive accuracy have also been studied in detail. On a very high level, these best practices are simply a result of studying and understanding the underlying mathematics of neural networks. However, the lack of a structured approach prevents us from fully utilizing these best practices. The ideal way of managing machine learning experiments is with a lab journal. Each machine learning experiment can be reasonably characterized by a hypothesis, a procedure, and the inferences drawn from its results. A well-kept journal would keep practitioners from repeating mistakes and help them narrow down to the right approach.
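As a rough illustration of what one journal entry could hold, here is a minimal sketch built around the three parts named above: hypothesis, procedure and inference. The `JournalEntry` class, its field names and the example text are assumptions made for illustration, not the structure used by the tool presented in the talk.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class JournalEntry:
    """Hypothetical record for one experiment (names are illustrative)."""

    hypothesis: str      # what we expect to happen, and why
    procedure: str       # what we will actually run
    inference: str = ""  # filled in only after the results are in
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def is_complete(self):
        # An entry counts as complete once an inference has been written down.
        return bool(self.inference.strip())


# Illustrative entry; the wording is made up for this example.
entry = JournalEntry(
    hypothesis="Adding dropout reduces overfitting on the validation set",
    procedure="Train the same network with and without dropout and compare validation loss",
)
```

Writing the hypothesis before the run and the inference after it is what separates a journal from a plain results log.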

The Tools

This talk will introduce a lab journal powered by Python and optimized for deep learning experiments. It will allow users to log experiments carried out on sklearn estimators and keras models. The journal also behaves like a hyperparameter grid manager that alerts the user when they accidentally re-run the same experiment on the same data with the same parameters. It will have some meta-learning features which allow for an end-to-end approach to machine learning experiments.
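One way such duplicate detection could work is to fingerprint each experiment by hashing the model name, its hyperparameters and the data together. The sketch below uses only the Python standard library and is not the API of the journal presented in the talk: `fingerprint`, `ExperimentJournal`, `has_run` and `log` are hypothetical names, the data is a toy list, and the score is a placeholder value.

```python
import hashlib
import json
from itertools import product


def fingerprint(model_name, params, data):
    """Hash the model name, hyperparameters and data into one hex digest."""
    payload = json.dumps(
        {"model": model_name, "params": params, "data": data},
        sort_keys=True,  # key order must not change the fingerprint
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ExperimentJournal:
    """Hypothetical in-memory journal keyed by experiment fingerprint."""

    def __init__(self):
        self.entries = {}

    def has_run(self, model_name, params, data):
        return fingerprint(model_name, params, data) in self.entries

    def log(self, model_name, params, data, score):
        key = fingerprint(model_name, params, data)
        if key in self.entries:
            raise ValueError("duplicate experiment: same model, data and parameters")
        self.entries[key] = {"model": model_name, "params": params, "score": score}
        return key


journal = ExperimentJournal()
X = [[0.0, 1.0], [1.0, 0.0]]  # toy data standing in for a real training set

# Log one run; the score is a placeholder, not a real result.
journal.log("LogisticRegression", {"C": 1.0, "penalty": "l2"}, X, score=0.5)

# Walk a small hyperparameter grid and keep only combinations not yet run.
grid = {"C": [0.1, 1.0], "penalty": ["l2"]}
pending = []
for values in product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    if not journal.has_run("LogisticRegression", params, X):
        pending.append(params)
```

In practice the data would more likely be hashed from its raw bytes than serialized to JSON, but the idea is the same: an identical model, parameters and data always produce the same key.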

Speaker

Jaidev Deshpande

Jaidev Deshpande is a Senior Data Scientist at Gramener. With six years of experience in the data science field, he specializes in building end-to-end machine learning pipelines for data-driven products. Additionally, he is an active contributor to the scientific Python software stack.
