Linear regression is still the most prominent statistical technique used in the data science industry and academia to explain relationships between features. So we at Analytics Vidhya designed this comprehensive set of interview questions for our Skill Test participants to test their knowledge of linear regression techniques.

If you are one of those who missed out on this skill test in real-time, here are the questions and solutions for you to try answering and grading yourself. Note that these are important linear regression interview questions for data analyst and data scientist jobs.

A total of 1,355 people registered for the linear regression skill test. It was specially designed to include many of the most important linear regression interview questions covering various related topics, such as linear models, coefficients, intercepts, etc. Below is the distribution of the scores of the participants:

More than 800 people participated in the skill test. The highest score obtained was 28. Here is the leaderboard for the participants who took the test, so you can see where you stand.

Here are some resources to get in-depth knowledge of the subject.

- 5 Questions which can teach you Multiple Regression (with R and Python)
- Going Deeper into Regression Analysis with Assumptions, Plots & Solutions
- 7 Types of Regression Techniques you should know!


A) TRUE

B) FALSE

**Solution: (A)**

Yes, linear regression is a supervised learning algorithm because it uses true labels for training. A supervised machine learning model needs an input variable (X) and an output variable (Y) for each example.

A) TRUE

B) FALSE

**Solution: (A)**

**Linear Regression** has dependent variables that have continuous values.

A) TRUE

B) FALSE

**Solution: (A)**

True. A neural network, which is a component of deep learning, can be used as a *universal* approximator, so it can definitely implement a linear regression algorithm.

A) Least Square Error

B) Maximum Likelihood

C) Logarithmic Loss

D) Both A and B

**Solution: (A)**

In linear regression, we minimize the sum of squared errors of the model to identify the line of best fit.

A) AUC-ROC

B) Accuracy

C) Logloss

D) Mean-Squared-Error

**Solution: (D)**

Since linear regression outputs continuous values, we use metrics such as mean squared error or R-squared to evaluate model performance. The remaining options apply to classification problems, which can be solved by logistic regression or decision trees.
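As a minimal sketch of the two regression metrics named above, the snippet below computes mean squared error and R-squared in plain Python. The function names and the sample values are illustrative assumptions, not part of the original skill test.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical actual values and model predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.5, 7.0, 9.5]

mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MSE = {mse:.4f}, R-squared = {r2:.4f}")
```

Both metrics work directly on continuous predictions, which is why they suit regression, whereas accuracy, AUC-ROC, and log loss need class labels or class probabilities.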

A) TRUE

B) FALSE

**Solution: (A)**

True. Lasso regression applies an absolute-value (L1) penalty, which makes some of the coefficients exactly zero.

A) Lower is better

B) Higher is better

C) A or B depending on the situation

D) None of these

**Solution: (A)**

Residuals are the error values of the model. Therefore, residuals that are small in magnitude and normally distributed are desired.

Now imagine that you are applying linear regression by fitting the best-fit line using the least square error on this data. You found that the correlation coefficient for one of its variables (say X1) with Y is -0.95.

**Which of the following is true for X1?**

A) Relation between the X1 and Y is weak

B) Relation between the X1 and Y is strong

C) Relation between the X1 and Y is neutral

D) Correlation can’t judge the relationship

**Solution: (B)**

The absolute value of the correlation coefficient denotes the strength of the relationship. Since the absolute correlation is very high, we infer that the relationship is strong between X1 and Y.

If you are given two variables, V1 and V2, which follow the two characteristics below:

1. If V1 increases, then V2 also increases

2. If V1 decreases, then V2 behavior is unknown

A) Pearson correlation will be close to 1

B) Pearson correlation will be close to -1

C) Pearson correlation will be close to 0

D) None of these

**Solution: (D)**

We cannot comment on the correlation coefficient using statement 1 alone; we need to consider both statements. Consider V1 as x and V2 as |x|: the correlation coefficient would not be close to 1 in such a case.

A) TRUE

B) FALSE

**Solution: (B)**

The Pearson correlation coefficient between two variables can be zero even when they are related. A zero coefficient only means that there is no linear relationship between them. Examples are y=|x| and y=x^2 with x values symmetric about zero.
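The y=|x| and y=x^2 examples can be checked numerically. The sketch below implements the Pearson coefficient in plain Python and evaluates it on x values chosen to be symmetric about zero; the `pearson` helper and the data are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# x values symmetric about zero (an assumption that makes the effect exact).
xs = [-3, -2, -1, 0, 1, 2, 3]

r_square = pearson(xs, [x ** 2 for x in xs])      # y = x^2: related, not linearly
r_abs = pearson(xs, [abs(x) for x in xs])         # y = |x|: related, not linearly
r_linear = pearson(xs, [2 * x + 1 for x in xs])   # y = 2x + 1: perfectly linear

print(r_square, r_abs, r_linear)
```

In both non-linear cases y is fully determined by x, yet the coefficient is zero up to floating-point rounding, because the positive and negative halves cancel.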

A) Vertical offset

B) Perpendicular offset

C) Both, depending on the situation

D) None of above

**Solution: (A)**

We always consider residuals as vertical offsets. We calculate the direct differences between the predicted values and the actual Y labels. Perpendicular offsets are useful in the case of dimensionality reduction techniques like PCA.

A) TRUE

B) FALSE

**Solution: (B)**

With a small training dataset, it’s easier to find a hypothesis to fit the training data exactly, i.e., overfitting.

1. We don’t have to choose the learning rate.
2. It becomes slow when the number of features is very large.
3. There is no need to iterate.

A) 1 and 2

B) 1 and 3

C) 2 and 3

D) 1,2 and 3

**Solution: (D)**

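Instead of gradient descent, the normal equation from linear algebra can also be used to find the coefficients. It solves for them in closed form, so there is no learning rate to choose and no need to iterate, but it requires inverting a matrix whose size grows with the number of features, which becomes slow when the number of features is very large. Refer to this article to read more about the normal equation.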
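To make the contrast concrete, here is a small sketch for the one-feature case, where the normal equation reduces to a 2x2 linear system that can be solved directly. The function names, the data, and the gradient descent settings (`learning_rate=0.05`, `n_iterations=5000`) are assumptions chosen for illustration.

```python
def normal_equation(xs, ys):
    """Solve (X^T X) theta = X^T y for a model y = intercept + slope * x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_xx - sum_x * sum_x  # determinant of X^T X
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / det
    slope = (n * sum_xy - sum_x * sum_y) / det
    return intercept, slope


def gradient_descent(xs, ys, learning_rate, n_iterations):
    """Minimize mean squared error iteratively; needs a rate and a loop."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iterations):
        errors = [intercept + slope * x - y for x, y in zip(xs, ys)]
        grad_intercept = 2.0 * sum(errors) / n
        grad_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return intercept, slope


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

exact = normal_equation(xs, ys)
approx = gradient_descent(xs, ys, learning_rate=0.05, n_iterations=5000)
print("normal equation :", exact)
print("gradient descent:", approx)
```

The closed-form solver has no hyperparameters, while the iterative one only reaches the same answer if the learning rate and iteration count are chosen sensibly. With many features, the single determinant here becomes a full matrix inversion, which is where the normal equation gets expensive.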

The graphs below show two fitted regression lines (A and B) on randomly generated data. Now, I want to find the sum of residuals in both cases, A and B.

**Note:**

- Scale is the same in both graphs for both axes.
- X-axis is the independent variable, and Y-axis is the dependent variable.

A) A has a higher sum of residuals than B

B) A has a lower sum of residuals than B

C) Both have the same sum of residuals

D) None of these

**Solution: (C)**

For a least-squares line fitted with an intercept, the sum of residuals is always zero; therefore, both have the same sum of residuals.
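The zero-sum property can be verified with a short sketch. It relies on the model including an intercept term, so the example also fits a line forced through the origin for comparison. The helper names and the data are illustrative assumptions.

```python
def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_through_origin(xs, ys):
    """Least squares for y = slope * x (no intercept term)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.5, 5.0, 8.5, 9.0]

intercept, slope = fit_with_intercept(xs, ys)
residual_sum = sum(y - (intercept + slope * x) for x, y in zip(xs, ys))

slope_origin = fit_through_origin(xs, ys)
residual_sum_origin = sum(y - slope_origin * x for x, y in zip(xs, ys))

print("with intercept   :", residual_sum)
print("through origin   :", residual_sum_origin)
```

With the intercept, the residuals cancel up to rounding error regardless of how scattered the data is; without it, they generally do not.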

**Context for Questions 15-17:**

Suppose you have fitted a complex regression model on a dataset. Now, you are using Ridge regression with penalty x.

A) In the case of a very large x, bias is low

B) In the case of a very large x, bias is high

C) We can’t say about bias

D) None of these

**Solution: (B)**

If the penalty is very large, it means the model is less complex; therefore, the bias would be high.

A) Some of the coefficients will become absolute zero

B) Some of the coefficients will approach zero but not absolute zero

C) Both A and B depending on the situation

D) None of these

**Solution: (B)**

In Lasso, some of the coefficient values become zero, but in the case of Ridge, the coefficients become close to zero but not zero.

A) Some of the coefficients will become zero

B) Some of the coefficients will be approaching zero but not absolute zero

C) Both A and B depending on the situation

D) None of these

**Solution: (A)**

As already discussed, lasso applies an absolute penalty, so some of the coefficients will become zero.
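The difference between the two penalties is easiest to see in a simplified setting: a single feature and no intercept, where both estimators have a closed form. The sketch below is written under that assumption, with the objective conventions stated in the comments; the function names and the data are illustrative.

```python
import math


def ridge_coefficient(xs, ys, penalty):
    """Minimizes 0.5 * sum((y - w*x)^2) + 0.5 * penalty * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


def lasso_coefficient(xs, ys, penalty):
    """Minimizes 0.5 * sum((y - w*x)^2) + penalty * |w| (soft-thresholding)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - penalty, 0.0)
    return math.copysign(shrunk, sxy) / sxx


xs = [1.0, 2.0, 3.0]
ys = [0.5, 1.0, 1.5]  # unpenalized least squares gives w = 0.5

for penalty in [0.0, 1.0, 5.0, 10.0, 100.0]:
    ridge_w = ridge_coefficient(xs, ys, penalty)
    lasso_w = lasso_coefficient(xs, ys, penalty)
    print(f"penalty={penalty:6.1f}  ridge={ridge_w:.4f}  lasso={lasso_w:.4f}")
```

As the penalty grows, the ridge coefficient keeps shrinking toward zero without reaching it, while the lasso coefficient becomes exactly zero once the penalty exceeds the feature's correlation with the target. A heavily shrunk coefficient also means a simpler model, which is the source of the higher bias under a very large penalty.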

A) Linear regression is sensitive to outliers

B) Linear regression is not sensitive to outliers

C) Can’t say

D) None of these

**Solution: (A)**

The slope of the regression line will change due to outliers in most cases. So Linear Regression is sensitive to outliers.
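A quick way to see this sensitivity is to fit the same line twice, once on clean data and once with a single extreme point added. The sketch below does that in plain Python; the `fit_line` helper and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Clean data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
clean_slope, clean_intercept = fit_line(xs, ys)

# Same data plus one outlier: at x = 6 the trend suggests y = 12, not 40.
outlier_slope, outlier_intercept = fit_line(xs + [6.0], ys + [40.0])

print("slope without outlier:", clean_slope)
print("slope with outlier   :", outlier_slope)
```

Because the errors are squared, one distant point contributes disproportionately to the loss and pulls the line toward itself; here a single outlier triples the fitted slope.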

A) Since there is a relationship, our model is not good

B) Since there is a relationship, our model is good

C) Can’t say

D) None of these

**Solution: (A)**

There should not be any relationship between predicted values and residuals. If there exists any relationship between them, it means that the model has not perfectly captured the information in the data points.

**Context for Questions 20-22:**

Suppose that you have a dataset D1 and you design a linear model of degree 3 polynomial and find that the training and testing error is “0” or, in other words, it perfectly fits the data.

A) There is a high chance that degree 4 polynomial will overfit the data

B) There is a high chance that degree 4 polynomial will underfit the data

C) Can’t say

D) None of these

**Solution: (A)**

Since the degree 4 model is more complex than the degree 3 model, it will again fit the training data perfectly and is likely to overfit. In such a case, the training error will be zero, but the test error may not be zero. Polynomial regression is useful for non-linear data.

A) There is a high chance that degree 2 polynomial will overfit the data

B) There is a high chance that degree 2 polynomial will underfit the data

C) Can’t say

D) None of these

**Solution: (B)**

If a degree 3 polynomial fits the data perfectly, it’s highly likely that a simpler model (degree 2 polynomial) might underfit the data.

A) Bias will be high, and variance will be high

B) Bias will be low, and variance will be high

C) Bias will be high, and variance will be low

D) Bias will be low, and variance will be low

**Solution: (C)**

Since a degree 2 polynomial will be less complex as compared to degree 3, the bias will be high, and the variance will be low.

**Context for Question 23:**

Below are three graphs, A, B, and C, showing the cost function against the number of iterations for learning rates l1, l2, and l3, respectively.

A) l2 < l1 < l3

B) l1 > l2 > l3

C) l1 = l2 = l3

D) None of these

**Solution: (A)**

In the case of a high learning rate, the steps are large: the objective function decreases quickly at first, but it does not reach the global minimum, and it starts increasing after a few iterations. In the case of a low learning rate, the steps are small, so the objective function decreases slowly.
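The effect of the learning rate on the cost curve can be reproduced with a small sketch. It uses a deliberately simplified one-parameter model (y = w * x, no intercept), and the three rates are assumptions picked to show slow, fast, and unstable behavior. In this simplified setting the too-large rate diverges from the first step, rather than after an initial decrease.

```python
def cost(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def run_gradient_descent(xs, ys, learning_rate, n_iterations=20):
    """Return the cost after each update, starting from w = 0."""
    w = 0.0
    history = [cost(w, xs, ys)]
    for _ in range(n_iterations):
        gradient = 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
        history.append(cost(w, xs, ys))
    return history


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated from y = 2x

slow = run_gradient_descent(xs, ys, learning_rate=0.001)      # small steps
good = run_gradient_descent(xs, ys, learning_rate=0.1)        # well-chosen rate
diverging = run_gradient_descent(xs, ys, learning_rate=0.25)  # steps too large

for name, history in [("slow", slow), ("good", good), ("diverging", diverging)]:
    print(f"{name:10s} start={history[0]:.3f}  end={history[-1]:.3e}")
```

Plotting each `history` list against the iteration index gives curves of the kind the question describes: a gentle slope for the small rate, a rapid drop for the moderate rate, and a rising curve for the rate that overshoots the minimum.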

**Context for Questions 24-25:**

We have been given a dataset with n records in which we have an input attribute x and an output attribute y. Suppose we use a linear regression method to model this data. To test our linear regressor, we randomly split the data into a training set and a test set.

A) Increase

B) Decrease

C) Remain constant

D) Can’t Say

**Solution: (D)**

Training error may increase or decrease depending on the values that are used to fit the model. For example, if the additional training points include more outliers, the error may increase.

A) Bias increases, and Variance increases

B) Bias decreases, and Variance increases

C) Bias decreases, and Variance decreases

D) Bias increases, and Variance decreases

E) Can’t Say

**Solution: (D)**

As we increase the size of the training data, the bias would increase while the variance would decrease.

**Context for Question 26:**

Consider the following data where one input (X) and one output (Y) are given.

A) Less than 0

B) Greater than zero

C) Equal to 0

D) None of these

**Solution: (C)**

We can fit a straight line perfectly to this data, so the mean error will be zero.

**Context for Questions 27-28:**

Suppose you have been given the following scenarios for training and validation error for Linear Regression.

| Scenario | Learning Rate | Number of iterations | Training Error | Validation Error |
|---|---|---|---|---|
| 1 | 0.1 | 1000 | 100 | 110 |
| 2 | 0.2 | 600 | 90 | 105 |
| 3 | 0.3 | 400 | 110 | 110 |
| 4 | 0.4 | 300 | 120 | 130 |
| 5 | 0.4 | 250 | 130 | 150 |

A) 1

B) 2

C) 3

D) 4

**Solution: (B)**

Option B (scenario 2) is the best choice because it gives the lowest training error as well as the lowest validation error.

**Which of the following would you observe in such a case?**

A) Training Error will decrease, and Validation error will increase

B) Training Error will increase, and Validation error will increase

C) Training Error will increase, and Validation error will decrease

D) Training Error will decrease, and Validation error will decrease

E) None of the above

**Solution: (D)**

If the added feature is important, the training and validation error would decrease.

**Context for Questions 29-30:**

Suppose you got a situation where you find that your linear regression model is underfitting the data.

1. Add more variables
2. Start introducing polynomial degree variables
3. Remove some variables

A) 1 and 2

B) 2 and 3

C) 1 and 3

D) 1, 2 and 3

**Solution: (A)**

In case of underfitting, you need to introduce more variables into the variable space, or you can add some polynomial degree variables, to make the model complex enough to fit the data better.

A) L1

B) L2

C) Any

D) None of these

**Solution: (D)**

I won’t use any regularization methods because regularization is used in case of overfitting.

Hope this comprehensive guide on linear regression interview questions has helped you assess yourself and also taught you a few new things. Taking such skill tests before your data analytics job interview is always helpful to keep you prepared and confident. Apart from these questions, it is also important to brush up your skills in MS Excel, SQL, Python, NumPy, pandas, scikit-learn, data mining, data visualization, etc., before your interview to ensure you ace it.


For question 4, isn't (D) the right answer? Can't we use OLS or MLE to find best fit line in Linear Regression? I had thought MLE would be better for complex data.

7) Which of the following is true about Residuals ? A) Lower is better B) Higher is better C) A or B depend on the situation D) None of these The correct answer is D. Lower Residuals SQUARES are better than higher residuals squares!

A good place to test yourself ! Great effort! Really helped.

Hey Ankit Thanks for all these questions. If possible can you please post more question on Linear as well as Multiple regression and on Hypothesis theory as well. Thanks in advance!!

Thanks for making it possible to train our knowledge regarding regression techniques. "Question Context 20-22: Suppose that you have a dataset D1 and you design a linear regression model of degree 3 polynomial and you found that the training and testing error is “0” or in other terms it perfectly fits the data." But one question, a degree 3 polynomial regression isn't considered as a linear regression model, right? Cheers, Lena

Awesome write-up. I’m a normal visitor of your web site and appreciate you taking the time to maintain the nice site. I will be a regular visitor for a really long time. Thanks a lot.