Frequently Asked Interview Questions on Naive Bayes Classifier

Aman Preet Gulati 08 Nov, 2022

This article was published as a part of the Data Science Blogathon.

Introduction

As data science and machine learning enthusiasts, we have learned about various algorithms such as Logistic Regression, Linear Regression, Decision Trees, and Naive Bayes. But are we also preparing for interviews? After all, the end goal is to land our dream job at the companies we are aiming for. Knowing how interviewers twist and turn their questions is therefore very important for answering in the most efficient manner. With that in mind, I’m starting a series on the Top 10 most frequently asked interview questions on various machine learning algorithms.

In this article, we will cover the top 10 interview questions on the Naive Bayes classifier. We are not going to jump straight into those tricky questions, though; let’s first build a high-level understanding of the algorithm so that the concepts behind the answers are clear.

Naive Bayes classifier

Naive Bayes is a popular choice for classification problems, and it is rooted in the concept of probability. Specifically, the algorithm is a direct application of Bayes’ theorem. But you may be wondering: if it is based on Bayes’ theorem, why does it carry the prefix “Naive”, a word that suggests “dumb”? So is this algorithm dumb or useful?

The answer is simple: the algorithm is not dumb at all. It is often quite useful, and it is simple compared to more complex algorithms. It is called naive because of the simplifying assumption it makes, which takes us to our very first interview question:

Interview Questions on Naive Bayes

1. What is the basic assumption in the case of the Naive Bayes classifier?

If you want to give a short answer, you can simply say: “Features are independent.” But this will not be sufficient, so we need to elaborate a little: Naive Bayes assumes beforehand that all the features are independent of each other given the class, and it treats each of them separately, so every feature contributes its own separate factor to the final result. This assumption is known as the conditional independence assumption (not to be confused with the I.I.D. assumption, which concerns the data samples rather than the features).
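In symbols, the independence assumption can be sketched as follows. The notation is introduced here for illustration only: y stands for the class and x_1, ..., x_n for the features.

```latex
P(x_1, \dots, x_n \mid y) = \prod_{i=1}^{n} P(x_i \mid y)
\quad\Longrightarrow\quad
P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The classifier then picks the class y with the largest value of the right-hand side.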

2. What are the possible advantages of choosing the Naive Bayes classifier?

Because it works with each feature independently, it scales well to large datasets and can be used to build generalized models.
It is not very sensitive to irrelevant features, i.e., it is not much affected by other components because of its naive nature.
It can work with both continuous and discrete data and handles categorical features well.
When we have a dataset with very little training data, Naive Bayes is a good candidate, since in this scenario it can outperform more complex models.

3. What disadvantages of Naive Bayes can make you remove it from your analysis?

Every coin has two sides, and the advantage of Naive Bayes can also be a disadvantage. Because it treats all the predictors as independent, which rarely holds exactly in practice, we cannot rely on it in all real-world cases.
The algorithm faces a major problem known as the “Zero Frequency problem”: it assigns zero probability to any category of a categorical variable that was not present in the training dataset, which zeroes out the whole product for that class and biases the model. Smoothing techniques such as Laplace (add-one) smoothing are commonly used to counter this.
When the features are highly correlated, model performance is affected negatively.
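The Zero Frequency problem, and the usual remedy of Laplace (add-one) smoothing, can be illustrated with a minimal sketch. The function name `category_likelihoods`, the `alpha` parameter, and the toy weather data are all hypothetical and chosen only for this example.

```python
from collections import Counter

def category_likelihoods(values, categories, alpha=0.0):
    """Estimate P(category | class) from the values observed for one class.

    alpha=0 gives raw relative frequencies; alpha=1 is Laplace (add-one) smoothing.
    """
    counts = Counter(values)
    total = len(values) + alpha * len(categories)
    return {c: (counts[c] + alpha) / total for c in categories}

# Hypothetical toy data: the "weather" feature for the training rows of one class.
weather_for_class = ["sunny", "sunny", "rainy", "sunny"]
all_categories = ["sunny", "rainy", "overcast"]  # "overcast" never occurs in training

raw = category_likelihoods(weather_for_class, all_categories, alpha=0.0)
smoothed = category_likelihoods(weather_for_class, all_categories, alpha=1.0)

print(raw["overcast"])       # 0.0, which would zero out the whole product for this class
print(smoothed["overcast"])  # 1 / (4 + 3), roughly 0.143
```

With smoothing, an unseen category gets a small non-zero probability, so a single unseen value no longer eliminates a class outright.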

4. Is feature scaling required in Naive Bayes?

A short answer is: no. The Naive Bayes classifier depends on probabilities, not on distances, and for that reason feature scaling is not required. In general, algorithms that do not depend on distances between data points tend not to need feature scaling.
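A small experiment can make this concrete. The sketch below is a hand-rolled Gaussian Naive Bayes on a single numeric feature, not any library's implementation; the `fit` and `predict` helpers and the height data are assumptions made for illustration. It checks that converting the feature from centimetres to metres leaves the predictions unchanged.

```python
import math
from statistics import mean, pvariance

def fit(xs, labels):
    """Per class: (prior, mean, variance) of the single numeric feature."""
    model = {}
    for c in sorted(set(labels)):
        vals = [x for x, y in zip(xs, labels) if y == c]
        model[c] = (len(vals) / len(xs), mean(vals), pvariance(vals))
    return model

def predict(model, x):
    """Return the class with the highest log prior + Gaussian log-likelihood."""
    def log_score(prior, mu, var):
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mu) ** 2 / (2 * var))
    return max(model, key=lambda c: log_score(*model[c]))

# Hypothetical heights in centimetres for two classes.
heights_cm = [150, 155, 160, 175, 180, 185]
labels = ["a", "a", "a", "b", "b", "b"]
queries_cm = [158, 170, 182]

model_cm = fit(heights_cm, labels)
preds_cm = [predict(model_cm, q) for q in queries_cm]

# Same data rescaled to metres.
model_m = fit([h / 100 for h in heights_cm], labels)
preds_m = [predict(model_m, q / 100) for q in queries_cm]

print(preds_cm, preds_m)
```

Rescaling a feature multiplies its density by the same constant for every class, so the ranking of the classes does not change.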

5. Impact of missing values on naive Bayes?

Naive Bayes is one of the algorithms that can handle missing data on its own. The reason is that all the attributes are handled separately, both during model construction and at prediction time. If a data point is missing the value of a certain feature, that feature can simply be ignored when the probability is calculated for each class, so missing data is dealt with in the model-building phase itself.
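As a rough sketch of what ignoring a missing feature looks like at prediction time, the example below uses a hand-written model. The priors, the likelihood tables, the feature names, and the use of `None` to mark a missing value are all hypothetical.

```python
import math

# Hypothetical, hand-written model: class priors and per-feature likelihood tables.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": {True: 0.8, False: 0.2},
             "has_greeting": {True: 0.3, False: 0.7}},
    "ham": {"has_link": {True: 0.2, False: 0.8},
            "has_greeting": {True: 0.7, False: 0.3}},
}

def posterior(sample):
    """Return normalized class posteriors, skipping features whose value is None."""
    scores = {}
    for c, prior in priors.items():
        log_p = math.log(prior)
        for feature, value in sample.items():
            if value is None:  # missing value: this feature contributes no factor
                continue
            log_p += math.log(likelihoods[c][feature][value])
        scores[c] = math.exp(log_p)
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

full = posterior({"has_link": True, "has_greeting": False})
partial = posterior({"has_link": True, "has_greeting": None})

print(full["spam"])     # uses both features
print(partial["spam"])  # uses only has_link
```

Because each feature enters the product as its own factor, dropping one factor still leaves a valid, if less informed, score for every class.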

6. Impact of outliers?

Naive Bayes can be strongly affected by outliers and is not completely robust to them (how much depends on the use case we are working on). The reason is that the NB classifier assigns zero probability to feature values it has not seen in the training set, which creates an issue at prediction time. The same goes for outliers, since they are effectively data that the classifier has not seen before.
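For numerical features modelled with a Gaussian, another way outliers can matter is through the fitted parameters themselves. The sketch below uses made-up values to show how a single extreme training point shifts the estimated mean and inflates the standard deviation for a class.

```python
from statistics import mean, pstdev

# Hypothetical training values of one numeric feature for one class.
clean = [48, 50, 51, 49, 52]
with_outlier = clean + [500]  # a single extreme value

print(mean(clean), pstdev(clean))                # centred near 50 with a small spread
print(mean(with_outlier), pstdev(with_outlier))  # mean and spread both pulled far away
```

A Gaussian fitted to the second list would assign low likelihood to typical values around 50, which is one reason to inspect outliers before training.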

7. What are different problem statements you can solve using Naive Bayes?

Naive Bayes is a probabilistic machine learning algorithm, and it is widely used in many classification tasks:

Sentiment Analysis
Spam classification
Twitter sentiment analysis
Document categorization

8. Does Naive Bayes fall under the category of the discriminative or generative classifier?

The straightforward answer is: Naive Bayes is a generative classifier. But this information is not enough; we should also know what a generative classifier is.

Generative: This type of classifier learns a model of how the data is generated behind the scenes by estimating the distribution of the data for each class, and then uses that model to predict unseen data. The same goes for the NB classifier: it learns the class priors and the per-class feature distributions, rather than learning a decision boundary directly as a discriminative classifier does.

9. What do you know about posterior and prior probability in Naive Bayes?

Prior probability: This can also be called the initial probability. In Bayesian statistics, it is the probability assigned before any data is collected, which is why it is known as the “prior” probability. In Naive Bayes, it is the probability of each outcome (class) before the predictors are taken into account.

Posterior probability: In simple words, this is the probability that we get after observing the data. It is derived from the prior probability by updating it with the evidence, and for that reason it is also known as the updated probability.
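A small worked example of Bayes' theorem may help here. The numbers below are invented for illustration: a 20% prior that a message is spam, and a word that appears in 60% of spam messages but only 5% of other messages.

```python
def posterior(prior, likelihood, likelihood_other):
    """Bayes' theorem for two classes: P(class | evidence)."""
    evidence = prior * likelihood + (1 - prior) * likelihood_other
    return prior * likelihood / evidence

prior_spam = 0.2          # hypothetical P(spam) before looking at the message
p_word_given_spam = 0.6   # hypothetical P(word | spam)
p_word_given_ham = 0.05   # hypothetical P(word | not spam)

posterior_spam = posterior(prior_spam, p_word_given_spam, p_word_given_ham)
print(round(posterior_spam, 2))  # 0.75
```

Seeing the word moves the probability of spam from the prior of 0.2 to a posterior of 0.75.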

10. How does Naive Bayes treat categorical and numerical values?

Naive Bayes uses separate, dedicated distributions for categorical and numerical values. They are mentioned below:

Categorical values: In this case, we can get the probability for categorical variables by using the Multinomial or Bernoulli distribution.
Numerical values: In this situation, we can estimate the probability by using the Normal (Gaussian) distribution.
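The two cases can be sketched with Python's standard library. The feature values are made up, and using the sample standard deviation for the Gaussian is an assumption of this sketch; libraries may estimate the variance differently.

```python
from collections import Counter
from statistics import NormalDist, mean, stdev

# Hypothetical training values for one class.
ages_in_class = [22, 25, 27, 30, 31]                      # numerical feature
colors_in_class = ["red", "blue", "red", "red", "green"]  # categorical feature

# Numerical: fit a Gaussian and evaluate its density at the new value.
age_dist = NormalDist(mean(ages_in_class), stdev(ages_in_class))
p_age = age_dist.pdf(26)

# Categorical: use the relative frequency of the category within the class.
color_counts = Counter(colors_in_class)
p_red = color_counts["red"] / len(colors_in_class)

print(p_age)  # density of age 26 under the fitted Gaussian
print(p_red)  # 0.6
```

Both numbers then enter the same product of per-feature likelihoods for the class.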

Conclusion

We have reached the last section of this article after working through the top 10 interview questions on the NB classifier. Let’s briefly recap everything so that we can list our learnings in a nutshell.

Firstly, we started this small journey by introducing the concept behind the Naive Bayes algorithm, and straight after that, we discussed the assumptions, advantages, and disadvantages of the Naive Bayes classifier.
Then we moved on to tricky questions like: will this algorithm be affected by outliers and missing values? Is feature scaling a required step when working with this classifier?
At last, we covered some more questions based on the mathematical intuition behind this algorithm, like how Naive Bayes treats categorical and numerical values, what posterior and prior probability are, and whether the NB classifier falls under the generative or discriminative category.

I hope you liked my article on the Top 10 most frequently asked interview questions on the Naive Bayes classifier. If you have any opinions or questions, then comment below.

Connect with me on LinkedIn for further discussion.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.