This article was published as a part of the Data Science Blogathon

Regression is a supervised learning technique that supports finding the correlation among variables. A regression problem is when the output variable is a real or continuous value.

In Regression, we plot a graph between the variables which best fit the given data points. The machine learning model can deliver predictions regarding the data. In naÃ¯ve words, * “Regression shows a line or curve that passes through all the data points on a target-predictor graph in such a way that the vertical distance between the data points and the regression line is minimum.”* It is used principally for prediction, forecasting, time series modeling, and determining the causal-effect relationship between variables.

- Linear Regression
- Polynomial Regression
- Logistics Regression

Linear regression is a quiet and simple statistical regression method used for predictive analysis and shows the relationship between the continuous variables. Linear regression shows the linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis), consequently called linear regression*. If there is a single input variable (x), such linear regression is called simple linear regression*

The above graph presents the linear relationship between the dependent variable and independent variables. When the value of x (**independent variable**) increases, the value of y (**dependent variable**) is likewise increasing. The red line is referred to as the best fit straight line. Based on the given data points, we try to plot a line that models the points the best.

*To calculate best-fit line linear regression uses a traditional slope-intercept form.*

**y= Dependent Variable.**

x= Independent Variable.

a0= intercept of the line.

a1 = Linear regression coefficient.

**Need of a Linear regression**

As mentioned above, Linear regression estimates the relationship between a dependent variable and an independent variable. Letâ€™s understand this with an easy example:

Letâ€™s say we want to estimate the salary of an employee based on year of experience. You have the recent company data, which indicates that the relationship between experience and salary. Here year of experience is an independent variable, and the salary of an employee is a dependent variable, as the salary of an employee is dependent on the experience of an employee. Using this insight, we can predict the future salary of the employee based on current & past information.

*A regression line can be a Positive Linear Relationship or a Negative Linear Relationship.*

If the dependent variable expands on the Y-axis and the independent variable progress on X-axis, then such a relationship is termed a Positive linear relationship.

If the dependent variable decreases on the Y-axis and the independent variable increases on the X-axis, such a relationship is called a negative linear relationship.

The goal of the linear regression algorithm is to get the best values for a0 and a1 to find the best fit line. The best fit line should have the least error means the error between predicted values and actual values should be minimized.

The cost function helps to figure out the best possible values for a0 and a1, which provides the best fit line for the data points.

Cost function optimizes the regression coefficients or weights and measures how a linear regression model is performing. The cost function is used to find the accuracy of the **mapping function** that maps the input variable to the output variable. This mapping function is also known as **the Hypothesis function**.

In Linear Regression, **Mean Squared Error (MSE)** cost function is used, which is the average of squared error that occurred between the predicted values and actual values.

*By simple linear equation y=mx+b we can calculate MSE as:*

*Letâ€™s y = actual values, y _{i }= predicted values*

Using the MSE function, we will change the values of a0 and a1 such that the MSE value settles at the minima. Model parameters *xi, b ***(a _{0},a_{1})**

Gradient descent is a method of updating a0 and a1 to minimize the cost function (MSE). A regression model uses gradient descent to update the coefficients of the line (a0, a1 => xi, b) by reducing the cost function by a random selection of coefficient values and then iteratively update the values to reach the minimum cost function.

Imagine a pit in the shape of U. You are standing at the topmost point in the pit, and your objective is to reach the bottom of the pit. There is a treasure, and you can only take a discrete number of steps to reach the bottom. If you decide to take one footstep at a time, you would eventually get to the bottom of the pit but, this would take a longer time. If you choose to take longer steps each time, you may get to sooner but, there is a chance that you could overshoot the bottom of the pit and not near the bottom. In the gradient descent algorithm, the number of steps you take is the learning rate, and this decides how fast the algorithm converges to the minima.

To update a_{0} and a_{1}, we take gradients from the cost function. To find these gradients, we take partial derivatives for a_{0} and a_{1}.

The partial derivates are the gradients, and they are used to update the values of a_{0} and a_{1}. Alpha is the learning rate.

Source : mygreatleaning.com

The blue line represents the optimal value of the learning rate, and the cost function value is minimized in a few iterations. The green line represents if the learning rate is lower than the optimal value, then the number of iterations required high to minimize the cost function. If the learning rate selected is very high, the cost function could continue to increase with iterations and saturate at a value higher than the minimum value, that represented by a red and black line.

In this, I will take random numbers for the dependent variable (salary) and an independent variable (experience) and will predict the impact of a year of experience on salary.

import matplotlib.pyplot as plt import pandas as pd import numpy as np

x= np.array([2.4,5.0,1.5,3.8,8.7,3.6,1.2,8.1,2.5,5,1.6,1.6,2.4,3.9,5.4]) y = np.array([2.1,4.7,1.7,3.6,8.7,3.2,1.0,8.0,2.4,6,1.1,1.3,2.4,3.9,4.8]) n = np.size(x)

**Python Code:**

The main function to calculate values of coefficients

- Initialize the parameters.
- Predict the value of a dependent variable by given an independent variable.
- Calculate the error in prediction for all data points.
- Calculate partial derivative w.r.t a0 and a1.
- Calculate the cost for each number and add them.
- Update the values of a0 and a1.

#initialize the parameters a0 = 0 #intercept a1 = 0 #Slop lr = 0.0001 #Learning rate iterations = 1000 # Number of iterations error = [] # Error array to calculate cost for each iterations. for itr in range(iterations): error_cost = 0 cost_a0 = 0 cost_a1 = 0 for i in range(len(experience)): y_pred = a0+a1*experience[i] # predict value for given x error_cost = error_cost +(salary[i]-y_pred)**2 for j in range(len(experience)): partial_wrt_a0 = -2 *(salary[j] - (a0 + a1*experience[j])) #partial derivative w.r.t a0 partial_wrt_a1 = (-2*experience[j])*(salary[j]-(a0 + a1*experience[j])) #partial derivative w.r.t a1 cost_a0 = cost_a0 + partial_wrt_a0 #calculate cost for each number and add cost_a1 = cost_a1 + partial_wrt_a1 #calculate cost for each number and add a0 = a0 - lr * cost_a0 #update a0 a1 = a1 - lr * cost_a1 #update a1 print(itr,a0,a1) #Check iteration and updated a0 and a1 error.append(error_cost) #Append the data in array

*At approximate iteration 50- 60, we got the value of a0 and a1.*

print(a0) print(a1)

plt.figure(figsize=(10,5)) plt.plot(np.arange(1,len(error)+1),error,color='red',linewidth = 5) plt.title("Iteration vr error") plt.xlabel("iterations") plt.ylabel("Error")

pred = a0+a1*experience print(pred)

plt.scatter(experience,salary,color = 'red') plt.plot(experience,pred, color = 'green') plt.xlabel("experience") plt.ylabel("salary")

error1 = salary - pred se = np.sum(error1 ** 2) mse = se/n print("mean squared error is", mse)

from sklearn.linear_model import LinearRegression from sklearn.metrics import mean_squared_error experience = experience.reshape(-1,1) model = LinearRegression() model.fit(experience,salary) salary_pred = model.predict(experience) Mse = mean_squared_error(salary, salary_pred) print('slop', model.coef_) print("Intercept", model.intercept_) print("MSE", Mse)

A. Linear regression is a fundamental machine learning algorithm used for predicting numerical values based on input features. It assumes a linear relationship between the features and the target variable. The model learns the coefficients that best fit the data and can make predictions for new inputs. It is widely employed in various fields for tasks like forecasting, trend analysis, and correlation exploration.

A. The function of linear regression is to model and analyze the relationship between a dependent variable (target) and one or more independent variables (features). It helps us understand how the independent variables influence the dependent variable and allows us to make predictions based on the learned relationship. Linear regression estimates the best-fitting line that represents this relationship, enabling us to analyze and make predictions for new data points.

In Regression, we plot a graph between the variables which best fit the given data points. Linear regression shows the linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis).To calculate best-fit line linear regression uses a traditional slope-intercept form. A regression line can be a Positive Linear Relationship or a Negative Linear Relationship.

The goal of the linear regression algorithm is to get the best values for a0 and a1 to find the best fit line and the best fit line should have the least error. In Linear Regression, **Mean Squared Error (MSE)** cost function is used, which helps to figure out the best possible values for a0 and a1, which provides the best fit line for the data points. Using the MSE function, we will change the values of a0 and a1 such that the MSE value settles at the minima. Gradient descent is a method of updating a0 and a1 to minimize the cost function (MSE)

*The media shown in this article are not owned by Analytics Vidhya and are used at the Authorâ€™s discretion.*

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Become a full stack data scientist
##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

##

Understanding Cost Function
Understanding Gradient Descent
Math Behind Gradient Descent
Assumptions of Linear Regression
Implement Linear Regression from Scratch
Train Linear Regression in Python
Implementing Linear Regression in R
Diagnosing Residual Plots in Linear Regression Models
Generalized Linear Models
Introduction to Logistic Regression
Odds Ratio
Implementing Logistic Regression from Scratch
Introduction to Scikit-learn in Python
Train Logistic Regression in python
Multiclass using Logistic Regression
How to use Multinomial and Ordinal Logistic Regression in R ?
Challenges with Linear Regression
Introduction to Regularisation
Implementing Regularisation
Ridge Regression
Lasso Regression

Introduction to Stacking
Implementing Stacking
Variants of Stacking
Implementing Variants of Stacking
Introduction to Blending
Bootstrap Sampling
Introduction to Random Sampling
Hyper-parameters of Random Forest
Implementing Random Forest
Out-of-Bag (OOB) Score in the Random Forest
IPL Team Win Prediction Project Using Machine Learning
Introduction to Boosting
Gradient Boosting Algorithm
Math behind GBM
Implementing GBM in python
Regularized Greedy Forests
Extreme Gradient Boosting
Implementing XGBM in python
Tuning Hyperparameters of XGBoost in Python
Implement XGBM in R/H2O
Adaptive Boosting
Implementing Adaptive Boosing
LightGBM
Implementing LightGBM in Python
Catboost
Implementing Catboost in Python

Introduction to Clustering
Applications of Clustering
Evaluation Metrics for Clustering
Understanding K-Means
Implementation of K-Means in Python
Implementation of K-Means in R
Choosing Right Value for K
Profiling Market Segments using K-Means Clustering
Hierarchical Clustering
Implementation of Hierarchial Clustering
DBSCAN
Defining Similarity between clusters
Build Better and Accurate Clusters with Gaussian Mixture Models

Introduction to Machine Learning Interpretability
Framework and Interpretable Models
model Agnostic Methods for Interpretability
Implementing Interpretable Model
Understanding SHAP
Out-of-Core ML
Introduction to Interpretable Machine Learning Models
Model Agnostic Methods for Interpretability
Game Theory & Shapley Values

Deploying Machine Learning Model using Streamlit
Deploying ML Models in Docker
Deploy Using Streamlit
Deploy on Heroku
Deploy Using Netlify
Introduction to Amazon Sagemaker
Setting up Amazon SageMaker
Using SageMaker Endpoint to Generate Inference
Deploy on Microsoft Azure Cloud
Introduction to Flask for Model
Deploying ML model using Flask

hey for Negative Linear Relationship the slope will be -ve and the equation can be written as y = a0 - a1*x