Hack Session: Deploy DL models in production using PyTorch

Nov 15, 2019


Auditorium 1

60 minutes

DL Deployment

Delivering AI predictions comes with a wide variety of challenges, such as:

  • Data may have to stay on the client side, which requires the model to run on devices such as mobile phones and IoT devices
  • Handling multiple user requests at the same time
  • Serving applications with near real-time requirements when model inference can take a few seconds

PyTorch has been extremely popular among researchers, but production teams have had a tough time converting the latest research into a production-friendly form. Since PyTorch 1.0, the community and teams from companies like Facebook and Microsoft have made significant efforts to make production use easier and more seamless. In this hack session, we will look at different approaches teams can take to put their models in production.

Key Takeaways:

  1. Deploy a PyTorch model using Flask
  2. Handle multiple user requests
  3. Understand how to use TorchScript to save the trained model as a graph and load it in another language like C++
  4. Reduce inference time using quantization techniques
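
As a rough illustration of the quantization idea in the last takeaway, the sketch below maps float weights to 8-bit integers using a scale and a zero point, then maps them back. It is plain Python written for this summary, not PyTorch's quantization API; the function names `quantize` and `dequantize` and the sample weights are hypothetical.

```python
def quantize(values, num_bits=8):
    """Affine-quantize a list of floats to unsigned integers (illustrative only)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The representable range must include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(qmin - lo / scale)
    zero_point = max(qmin, min(qmax, zero_point))
    # Round each value to the nearest integer step and clamp to the valid range.
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical sample weights, chosen only for demonstration.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q_weights, scale, zero_point = quantize(weights)
restored = dequantize(q_weights, scale, zero_point)
print(q_weights, scale, zero_point)
print(restored)
```

Storing each weight in 8 bits instead of 32 cuts memory roughly fourfold, and integer arithmetic is typically cheaper than floating point on CPUs, which is where most of the inference-time savings tend to come from. The cost is a small rounding error per weight, on the order of one `scale` step.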

  • Vishnu Subramanian, AI Researcher and Consultant

Copyright 2019 Analytics Vidhya. All rights reserved