There are only a handful of machine learning conferences in the world that attract the top brains in this field. One such conference, which I am an avid follower of, is the International Conference on Machine Learning (ICML).
Folks from top machine learning research companies, like Google AI, Facebook, and Uber, come together and present their latest research. It’s a conference no data scientist would want to miss.
ICML 2019, held last week in Southern California, USA, saw records tumble in astounding fashion. Both the number of papers submitted and the number of papers accepted broke all previous records.
[Figure: ICML 2019 submission and acceptance numbers. Source: Medium]
A panel of hand-picked judges is charged with picking out the best papers from this list. Receiving this best paper award is quite a prestigious achievement – everyone in the research community strives for it!
And decoding these best papers from ICML 2019 has been an eye-opener for me. I love going through these papers and breaking them down so our community can also partake in the hottest happenings in machine learning.
In this article, we’ll look at Google AI’s best paper from the ICML 2019 conference. There is a heavy focus on unsupervised learning so there’s a lot to unpack. Let’s dive right in.
You can also check out my articles on the best papers from ICLR 2019 here.
Our main focus is on the first paper from the Google AI team. So let’s check out what Google has put forward for our community.
Note: There are certain unsupervised deep learning concepts you should be aware of before diving into this article. I suggest going through the below guides first in case you need a quick refresher:
Let’s first understand what disentangled representations are. Here is Google AI’s succinct and simple definition of the concept:
The ability to understand high-dimensional data, and to distill that knowledge into useful representations in an unsupervised manner, remains a key challenge in deep learning. One approach to solving these challenges is through disentangled representations, models that capture the independent features of a given scene in such a way that if one feature changes, the others remain unaffected. – Google AI
As the paper says, in representation learning, it is often assumed that real-world observations x, like images or videos, are generated by a two-step generative process:
In other words, a high-dimensional observation can be explained by a lower-dimensional entity that is mapped to the higher-dimensional observation space.
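To make the idea concrete, here is a minimal sketch of such a two-step process: first draw a low-dimensional latent variable z from P(z), then draw the observation x from P(x|z). Everything specific in it is my own assumption for illustration and not taken from the paper: the standard normal prior, the random `tanh` mapping standing in for the decoder, the Gaussian noise, and the names `sample_observations`, `latent_dim`, `obs_dim` and `noise_std`.

```python
import numpy as np

def sample_observations(n_samples, latent_dim=3, obs_dim=64, noise_std=0.1, seed=0):
    """Toy two-step generative process: z ~ P(z), then x ~ P(x|z)."""
    rng = np.random.default_rng(seed)

    # Step 1: sample the low-dimensional latent factors z from P(z).
    # Assumption: P(z) is a standard normal distribution.
    z = rng.standard_normal((n_samples, latent_dim))

    # Step 2: sample the high-dimensional observation x from P(x|z).
    # Assumption: a fixed random nonlinear map plus Gaussian noise.
    weights = rng.standard_normal((latent_dim, obs_dim))
    x = np.tanh(z @ weights) + noise_std * rng.standard_normal((n_samples, obs_dim))
    return z, x

z, x = sample_observations(n_samples=5)
print(z.shape, x.shape)
```

Each 64-dimensional row of `x` is fully explained by just three latent numbers in `z` plus noise, which is exactly the kind of structure a representation learner hopes to recover.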
The objective of this research is to identify where unsupervised disentanglement methods fall short, so that future work can improve them.
The authors have released a reproducible large-scale experimental study on seven different datasets, in which more than 12,000 models were trained, covering the most prominent methods and evaluation metrics.
There is currently no single formalized notion of disentanglement that is widely accepted. However, the key intuition is that a disentangled representation should separate the distinct, informative factors of variation in the data.
The current state-of-the-art approaches for unsupervised disentanglement learning are largely based on Variational Autoencoders (VAEs). A specific distribution P(z) is assumed on a latent space and then a deep neural network is used to parameterize the conditional probability P(x|z).
Similarly, the distribution P(z|x) is approximated using a variational distribution Q(z|x). The model is then trained by minimizing a suitable approximation to the negative log-likelihood.
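A common choice for that "suitable approximation" is the negative evidence lower bound (ELBO): a reconstruction term plus the KL divergence between Q(z|x) and P(z). The sketch below computes it with NumPy under assumptions of my own: a diagonal Gaussian Q(z|x), a standard normal P(z) and a Bernoulli decoder. The function name `negative_elbo` and its arguments are hypothetical, and in a real model the encoder and decoder outputs would come from neural networks.

```python
import numpy as np

def negative_elbo(x, x_recon_prob, mu, log_var):
    """Negative ELBO for a batch, averaged over examples.

    x            : binary observations, shape (batch, obs_dim)
    x_recon_prob : decoder Bernoulli means for P(x|z), same shape as x
    mu, log_var  : mean and log-variance of Q(z|x), shape (batch, latent_dim)
    """
    eps = 1e-7
    p = np.clip(x_recon_prob, eps, 1.0 - eps)

    # Reconstruction term: -log P(x|z) under a Bernoulli decoder.
    recon = -np.sum(x * np.log(p) + (1.0 - x) * np.log(1.0 - p), axis=1)

    # KL(Q(z|x) || P(z)) in closed form for a diagonal Gaussian vs N(0, I).
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)

    return np.mean(recon + kl), np.mean(recon), np.mean(kl)

# Tiny usage example with made-up encoder/decoder outputs.
x_demo = np.array([[1.0, 0.0, 1.0, 0.0]])
loss, recon_term, kl_term = negative_elbo(
    x_demo,
    x_recon_prob=np.full((1, 4), 0.5),
    mu=np.zeros((1, 2)),
    log_var=np.zeros((1, 2)),
)
print(loss, recon_term, kl_term)
```

When Q(z|x) equals the prior, the KL term vanishes and only the reconstruction error remains.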
Google AI researchers have challenged the commonly held assumptions in this field. I have summarized their contributions below:
Visualization of the ground-truth factors of the Shapes3D data set: Floor color (upper left), wall color (upper middle), object color (upper right), object size (bottom left), object shape (bottom middle), and camera angle (bottom right)
I have taken this section from within the paper itself. If you have any queries, you can reach out to me in the comments section below the article and I’ll be happy to clarify them.
Considered methods:
All the considered methods augment the VAE loss with some regularizer.
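As one illustration of "VAE loss plus a regularizer", β-VAE can be written as the usual VAE loss with an extra penalty of (β - 1) times the KL term. The sketch below is a simplification under my own assumptions: `beta_vae_loss` is a hypothetical helper, the default `beta=4.0` is an arbitrary illustrative value, and the reconstruction and KL terms are passed in as precomputed numbers. The other methods in the study use different regularizers, which are not shown here.

```python
def beta_vae_loss(recon_loss, kl, beta=4.0):
    """VAE loss augmented with a beta-VAE style regularizer.

    recon_loss : reconstruction term, -log P(x|z)
    kl         : KL(Q(z|x) || P(z))
    beta       : regularization strength (beta = 1 recovers the plain VAE loss)
    """
    vae_loss = recon_loss + kl          # standard VAE objective
    regularizer = (beta - 1.0) * kl     # extra pressure on the latent code
    return vae_loss + regularizer

print(beta_vae_loss(10.0, 2.0, beta=1.0))  # plain VAE loss
print(beta_vae_loss(10.0, 2.0, beta=4.0))  # stronger regularization
```

The regularization strength is the knob that the result plots in this article vary along their x-axis.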
Considered metrics:
Datasets:
The Scream Painting
This is the part that will get every data scientist out of their seats! The researchers have showcased their results by answering a set of questions.
Total correlation based on a fitted Gaussian of the sampled (left) and the mean representation (right) plotted against regularization strength for Color-dSprites and all approaches (except AnnealedVAE). The total correlation of the sampled representation decreases, while the total correlation of the mean representation increases, as the regularization strength is increased.
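For readers wondering how a total correlation "based on a fitted Gaussian" can be computed, here is a minimal sketch. It fits a Gaussian to the representation by estimating its covariance matrix, then returns the KL divergence between that Gaussian and the product of its marginals, which has a closed form. The function name `gaussian_total_correlation` and the toy data are my own, and the paper's exact estimation procedure may differ in its details.

```python
import numpy as np

def gaussian_total_correlation(representations):
    """Total correlation of a Gaussian fitted to the rows of `representations`.

    representations : array of shape (n_samples, n_dims)
    Returns 0.5 * (sum_i log(cov_ii) - log det(cov)), which is zero when the
    dimensions are uncorrelated and grows as they become more correlated.
    """
    cov = np.cov(representations, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)

rng = np.random.default_rng(0)

# Two independent dimensions: total correlation close to zero.
independent = rng.standard_normal((100_000, 2))

# Second dimension is almost a copy of the first: large total correlation.
a = rng.standard_normal(100_000)
correlated = np.stack([a, a + 0.1 * rng.standard_normal(100_000)], axis=1)

tc_independent = gaussian_total_correlation(independent)
tc_correlated = gaussian_total_correlation(correlated)
print(tc_independent, tc_correlated)
```

Applying a function like this once to sampled codes and once to mean codes gives the two quantities compared in the plot.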
(left) FactorVAE score for each method on Cars3D. Models are abbreviated (0=β-VAE, 1=FactorVAE, 2=β-TCVAE, 3=DIP-VAE-I, 4=DIP-VAE-II, 5=AnnealedVAE). The scores are heavily overlapping. (right) Distribution of FactorVAE scores for FactorVAE model for different regularization strengths on Cars3D.
Statistical efficiency of the FactorVAE Score for learning a GBT downstream task on dSprites.
The Google AI team continues to nail its machine learning research. They stayed on top of the latest advancements at this year’s International Conference on Machine Learning.
The second best paper focuses on improving the results of Gaussian process regression. You can check out the paper through the link provided in this article.
Let me know about your views on the Google AI research paper in the comments section below. Keep learning!