An Introduction to Few-Shot Learning
This article was published as a part of the Data Science Blogathon.
In this article, I will give a short introduction to few-shot learning. Few-shot learning means performing classification or regression based on a very small number of samples. Before getting started, let's play a game.
Consider the above support set. The left two images are armadillos and the right two are pangolins. You may never have heard of armadillos or pangolins, but it doesn't matter. You just need to pay attention to their differences and try to distinguish the two animals. If you don't know the difference, I can give you a hint: look at their ears and the size of their scales. Now I give you a query image.
Do you think it is an armadillo or a pangolin? Now, you can say that this is a pangolin. Humans can learn to distinguish the two animals using merely 4 training samples. For a human, making a prediction based on 4 training samples is not hard. But can computers do this as well? If a class has only two samples, can computers make the correct prediction? This is harder than the standard classification problem.
The number of samples is too small for training a deep neural network. This is where few-shot learning plays a role.
Table of contents
- Few-shot learning
- Support set vs Training set
- Few-shot learning vs Supervised learning
- Terminologies in few-shot learning
- Prediction accuracy of few-shot learning
- Basic idea behind few-shot learning
- Applications of few-shot learning
- Datasets for few-shot learning
- Frequently Asked Questions
- End Notes
Few-shot learning
Few-shot learning is the problem of making predictions based on a limited number of samples. It is different from standard supervised learning. The goal of few-shot learning is not to let the model recognize the images in the training set and then generalize to the test set. Instead, the goal is to learn to learn. "Learn to learn" sounds hard to understand; you can think of it in this way.
I train the model on a big training set. The goal of training is not to know what an elephant is and what a tiger is. Instead, the goal is to know the
similarity and difference between objects.
After training, you can show two images to the model and ask whether the two are the same kind of animal. The model has learned the similarities and differences between objects. So, the model is able to tell that the contents of the two images are the same kind of object. Take a look at our training data again.
The training data has 5 classes that do not include the squirrel class. Thus, the model is unable to recognize squirrels. When the model sees the two images, it does not know they are squirrels. However, the model knows they look alike. The model can tell you with high confidence that they are the same kind of objects.
For the same reason, the model has never seen a rabbit during training. So, it does not know the two images are rabbits. But the model knows the similarities and differences between things. The model knows that the contents of the two images are very alike. So, the model can tell that they are the same kind of object.
Then I show the above two images to the model. While the model has never seen a pangolin or a dog, it knows the two animals look quite different. The model believes they are different objects.
Support set vs Training set
"Support set" is meta-learning jargon: the small set of labeled images is called a support set. Note the differences between the training set and the support set. The training set is big. Every class in the training set has many samples. The training set is big enough for learning a deep neural network. In contrast, the support set is small. Every class has at most a few samples. If every class had only one sample, it would be impossible to train a deep neural network on it. The support set can only provide additional information at test time. Here is the basic idea of few-shot learning.
We do not train a big model using a big training set. Rather than training the model to recognize specific objects such as tigers and elephants in the training set, we train the model to know the similarity and differences between objects.
Let’s see what few-shot learning and meta-learning are.
You may have heard of meta-learning. Few-shot learning is a kind of meta-learning. Meta-learning is different from traditional supervised learning. Traditional supervised learning asks the model to recognize the training data and then generalize to unseen test data. In contrast, meta-learning's goal is to learn to learn.
How should we understand "learn to learn"?
You bring a kid to the zoo. He's excited to see a fluffy animal in the water which he has never seen before. He asks you, "What's this?" Although he has never seen this animal before, he is a smart kid and can learn by himself. Now, you give the kid a set of cards. On every card, there is an animal and its name. The kid has never seen the animal in the water. He has never seen the animals on the cards, either. But the kid is so smart that by taking a look at all the cards, he can identify the animal in the water: it is most similar to the animal on one of the cards. Teaching the kid to learn by himself is called meta-learning.
Before going to the zoo, the kid was already able to learn by himself. He knew the similarities and differences between animals. In meta-learning, the unknown animal is called a query. The set of cards you give him so he can learn by himself is the support set. Learning to learn by himself is called meta-learning. If there is only one card for every species, the kid must learn to recognize the animal using only one card; this is called one-shot learning.
Few-shot learning vs Supervised learning
Here I compare traditional supervised learning with few-shot learning. Traditional supervised learning is to train a model using a big training set.
After the model is trained, we can use it to make predictions: we show it a test sample and it recognizes the sample's class. Few-shot learning is a different problem: the query sample comes from an unknown class that the model has never seen before. This is the main difference from traditional supervised learning.
Terminologies in few-shot learning
Let's look at a few important terms.
k-way means the support set has k classes.
n-shot means every class has n samples.
Such a support set is called k-way n-shot.
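The way/shot terminology can be made concrete with a tiny sketch. The class and file names below are made up for illustration, not taken from a real dataset:

```python
# Illustrative support set: a mapping from class name to its few samples.
support_set = {
    "armadillo": ["armadillo_1.jpg", "armadillo_2.jpg"],
    "pangolin": ["pangolin_1.jpg", "pangolin_2.jpg"],
}

k_way = len(support_set)                        # k classes -> "k-way"
n_shot = len(next(iter(support_set.values())))  # n samples per class -> "n-shot"

print(f"This is a {k_way}-way {n_shot}-shot support set.")  # 2-way 2-shot
```

With four classes of three samples each, the same code would report a 4-way 3-shot support set.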
Prediction accuracy of few-shot learning
When performing few-shot learning, the prediction accuracy depends on the number of ways and the number of shots.
As the number of ways increases, the prediction accuracy drops.
You may ask why this happens.
Let's look at the same example. Now, the kid is given 3 cards and asked to choose one out of three. This is 3-way 1-shot learning. What if the kid were given 6 cards? This would be 6-way 1-shot learning. Which one do you think is easier, 3-way or 6-way?
Obviously, 3-way is easier than 6-way. Choosing one out of three is easier than choosing one out of six.
Thus, 3-way has higher accuracy than 6-way.
As the number of shots increases, the prediction accuracy improves. The phenomenon is easy to interpret. With more samples, the prediction becomes easier. Thus, 2-shot is easier than 1-shot.
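One way to see why more ways make the problem harder: even the random-guess baseline falls as the number of ways grows, since guessing uniformly among k classes succeeds with probability 1/k. A quick illustration:

```python
# Random guessing in a k-way episode succeeds with probability 1/k,
# so even the chance baseline drops as the number of ways grows.
for k in (3, 6):
    print(f"{k}-way chance accuracy: {1 / k:.1%}")
# 3-way chance accuracy: 33.3%
# 6-way chance accuracy: 16.7%
```

A trained model does far better than chance, but its accuracy follows the same downward trend as k increases.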
Basic idea behind few-shot learning
The basic idea of few-shot learning is to train a function that predicts similarity.
Denote the similarity function by sim(x, x’).
It measures the similarity between the two samples, x, and x’.
If two samples are of the same kind, the similarity function returns 1, i.e., sim(x, x') = 1.
If the samples are different, it returns 0, i.e., sim(x, x') = 0.
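One common way to supervise such a function (a hedged sketch, not a prescription from the article) is to build pairs of training samples and label each pair with the target sim should output: 1 when the two samples share a class, 0 otherwise:

```python
from itertools import combinations

# Labeled training samples: (image placeholder, class name).
# The names are illustrative, not from a real dataset.
samples = [
    ("tiger_1", "tiger"),
    ("tiger_2", "tiger"),
    ("elephant_1", "elephant"),
]

# Every pair gets target 1 if the classes match, else 0 --
# these (pair, target) examples are what sim(x, x') would be trained on.
pairs = [((xa, xb), 1 if ca == cb else 0)
         for (xa, ca), (xb, cb) in combinations(samples, 2)]

for (xa, xb), target in pairs:
    print(xa, xb, target)
```

Here the tiger-tiger pair gets target 1 and both tiger-elephant pairs get target 0, matching the definition of sim above.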
After training, the learned similarity function can be used for making predictions for unseen queries. We can use the similarity function to compare the query with every sample in the support set and calculate the similarity scores. Then, find the sample with the highest similarity score and use it as the prediction.
Now I demonstrate how few-shot learning makes a prediction.
Given the query image, I want to know what the image is.
We can compare the query with every sample in the support set.
Comparing the query with the cat, the similarity function outputs a similarity score of 0.6.
The similarity score between the query and the monkey is 0.4.
Similarly, the score for the lion is 0.2 and for the dog it is 0.9.
Among those similarity scores, 0.9 is the biggest. Thus, the model predicts the query is a dog. One-shot learning can be performed in this way. Given a support set, we can compute the similarity between the query and every sample in the support set to find the most similar sample.
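The selection step above can be sketched in a few lines. The scores below are the ones quoted in the text, standing in for the outputs of a learned sim(x, x'):

```python
# Similarity scores between the query and each support-set class,
# as quoted above; a trained sim(x, x') would compute these.
scores = {"cat": 0.6, "monkey": 0.4, "lion": 0.2, "dog": 0.9}

# Predict the class whose support sample is most similar to the query.
prediction = max(scores, key=scores.get)
print(prediction)  # dog
```

This argmax over similarity scores is exactly the one-shot prediction rule described in the paragraph above.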
Applications of few-shot learning
Few-shot learning has a wide range of applications in trending fields of data science such as computer vision, NLP, and robotics. It can be used for character recognition, image recognition, and classification. It also performs well in NLP tasks such as translation, text classification, and sentiment analysis. In robotics, it can be used to train robots with only a small number of training samples.
Datasets for few-shot learning
If you do research on meta-learning, then you will need datasets for evaluating your model.
Here I introduce two datasets that are most widely used in research papers.
Omniglot is the most frequently used dataset. It is a dataset of hand-written characters.
Another commonly used dataset is Mini-ImageNet.
Frequently Asked Questions
Q1. What is few-shot learning?
A. Few-shot learning refers to a machine learning paradigm where a model is trained to make accurate predictions with only a small number of examples per class. This approach enables the model to generalize well to new, unseen data despite having limited training data. It's particularly useful for tasks where collecting abundant labeled data is challenging or time-consuming.
Q2. What is zero-shot learning?
A. Zero-shot learning is a concept where a model can generalize to new classes it has never seen during training. For instance, if a model is trained to recognize various dog breeds but hasn't encountered a specific breed like "Xoloitzcuintli," it can still make accurate predictions about it by leveraging shared attributes and information from related classes. This ability to infer without explicit training on a class characterizes the essence of zero-shot learning.
End Notes
By now, I am sure you have an idea of the few-shot learning and meta-learning concepts. Take up problems, apply them in code, and see the fun. Keep learning!
The media shown in this article are not owned by Analytics Vidhya and are used at the Author's discretion.