Hack Session: Video Encoding & Classification using Deep Learning

Nov 13, 2019


Auditorium 1

55 minutes

Computer Vision

Working with increasingly complex datasets is becoming part of a Machine Learning Engineer's everyday life. Since the first YouTube-8M Kaggle competition, video classification has been gaining momentum. Learning a robust representation of a video is also not an easy task, as a video is a complex mix of sequential data (like time series) and images (RGB tensors).
I will cover both unsupervised and supervised tasks using different types of Deep Learning algorithms (CNN, RNN, …). Finally, I will conclude with some insights on multi-modal video classification (using images and textual information such as descriptions and titles).
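
To make the idea of a video as a sequence of RGB images concrete, here is a minimal sketch of encoding a video into a single vector and comparing two videos. It is only an illustration under stated assumptions, not the method presented in the session: `encode_frame` is a hypothetical stand-in for a CNN feature extractor (it simply averages pixel colors), temporal averaging replaces any sequence model, and `solid_video` is a made-up helper that builds toy data. Nested Python lists stand in for a (frames, height, width, 3) tensor so that the example has no dependencies.

```python
import math

# A video is a sequence of frames; each frame is an H x W grid of (R, G, B) pixels.
# Nested lists stand in for a (T, H, W, 3) tensor to keep the sketch dependency-free.

def encode_frame(frame):
    """Hypothetical stand-in for a CNN: the mean R, G, B value over all pixels."""
    pixels = [pixel for row in frame for pixel in row]
    return [sum(p[c] for p in pixels) / len(pixels) for c in range(3)]

def encode_video(video):
    """Average the per-frame vectors over time (a simple substitute for an RNN)."""
    frame_vectors = [encode_frame(frame) for frame in video]
    return [sum(v[c] for v in frame_vectors) / len(frame_vectors) for c in range(3)]

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def solid_video(color, num_frames=4, height=2, width=2):
    """Toy data (assumption): a video in which every pixel has the same color."""
    return [[[color for _ in range(width)] for _ in range(height)]
            for _ in range(num_frames)]

red = solid_video((255, 0, 0))
dark_red = solid_video((128, 0, 0))
blue = solid_video((0, 0, 255))

# Videos with similar content should get a higher similarity score.
sim_red = cosine_similarity(encode_video(red), encode_video(dark_red))
sim_blue = cosine_similarity(encode_video(red), encode_video(blue))
```

In this toy setup the two red videos end up far more similar to each other than the red and blue ones. In practice the per-frame encoder would be a pretrained network and the temporal pooling could be replaced by a recurrent model.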


Key Takeaways:

  • Learn the key concepts of the different deep learning architectures (RNN, CNN, …) 
  • Transfer Learning using state-of-the-art (SOTA) models (InceptionV3, …)
  • Video encoding for video similarity and video classification
  • Multi-modal classification

Axel de Romblay
Machine Learning Engineer


