Machine learning has become one of the most popular and exciting fields of study. Machine learning models can now learn from data and make increasingly accurate predictions, even on unseen data. The ideas in machine learning overlap with, and borrow from, **Artificial Intelligence** and many other related technologies. Machine learning evolved from **Pattern Recognition** and the concept that computers can learn without being explicitly programmed to perform specific tasks. We can use machine learning algorithms (e.g., **Logistic Regression**, **Naive Bayes**, etc.) to recognize spoken words, mine data, build applications that learn from data, and more. Moreover, the accuracy of these algorithms typically improves as they are trained on more data.

In this article, we will look closely at generative and discriminative ML models, along with the differences between them.

**Learning Objectives**

- Understand the fundamental discriminative and generative models
- Understand the differences between discriminative and generative models and when to use each one
- Explore the approach of the models
- Explore some examples of discriminative and generative models

This article was published as a part of the Data Science Blogathon.

Machine learning models can be classified into two types: discriminative and generative. In simple words, a discriminative model makes predictions on unseen data based on conditional probability and can be used for either classification or regression problems. In contrast, a generative model focuses on the distribution of a dataset to return a probability for a given example.

We, as humans, can adopt either of these two approaches when learning an artificial language. The two approaches have not previously been explored in human learning, although they are related to known effects of causal direction, classification vs. inference learning, and observational vs. feedback learning. In this article, our focus is on the two types of machine learning models, **Generative** and **Discriminative**, and on their importance, comparisons, and differences.

Suppose we are working on a classification problem where our task is to decide if an email is spam or not spam based on the words present in a particular email. To solve this problem, we have a joint model over:

- Labels: **Y = y**, and
- Features: **X = {x1, x2, …, xn}**

Therefore, the joint distribution of the model can be represented as

P(Y, X) = P(y, x1, x2, …, xn)

Now, our goal is to estimate the probability of spam email i.e., **P(Y=1|X)**. Both generative and discriminative models can solve this problem but in different ways.

Let’s see why and how they are different!

In the case of generative models, to find the conditional probability **P(Y|X)**, they estimate the prior probability **P(Y)** and the likelihood **P(X|Y)** with the help of the training data and use Bayes' theorem to calculate the posterior probability **P(Y|X)**:

P(Y|X) = P(X|Y) · P(Y) / P(X)
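As a rough illustration of the generative route, the sketch below estimates the prior and the likelihood from simple frequency counts and then applies Bayes' theorem. The tiny dataset, the single binary feature (whether the email contains the word "free"), and the function name `posterior_spam` are all assumptions made up for this example.

```python
# Toy training set of (contains_word_free, is_spam) pairs; the data is invented.
emails = [
    (1, 1), (1, 1), (1, 1), (0, 1),
    (1, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0),
]

def posterior_spam(x, data):
    """Return P(Y=1 | X=x) using Bayes' theorem and frequency estimates."""
    n = len(data)
    n_spam = sum(1 for _, y in data if y == 1)
    n_ham = n - n_spam
    # Priors P(Y=1) and P(Y=0).
    prior_spam = n_spam / n
    prior_ham = n_ham / n
    # Likelihoods P(X=x | Y=1) and P(X=x | Y=0).
    lik_spam = sum(1 for xi, y in data if y == 1 and xi == x) / n_spam
    lik_ham = sum(1 for xi, y in data if y == 0 and xi == x) / n_ham
    # Evidence P(X=x) by the law of total probability.
    evidence = lik_spam * prior_spam + lik_ham * prior_ham
    return lik_spam * prior_spam / evidence

p_free = posterior_spam(1, emails)
p_no_free = posterior_spam(0, emails)
print(p_free, p_no_free)
```

With a single feature, the result matches the direct count of spam emails among those containing the word, which is a handy sanity check on the arithmetic.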

In the case of discriminative models, to find the probability, they directly assume some functional form for **P(Y|X)** and then estimate the parameters of **P(Y|X)** with the help of the training data.

The discriminative model refers to a class of models used in **Statistical Classification**, mainly used for supervised machine learning. These types of models are also known as **conditional models** since they learn the boundaries between classes or labels in a dataset.

Discriminative models focus on modeling the decision boundary between classes in a classification problem. The goal is to learn a function that maps inputs to binary outputs, indicating the class label of the input. Maximum likelihood estimation is often used to estimate the parameters of the discriminative model, such as the coefficients of a logistic regression model or the weights of a neural network.

Discriminative models (just as in the literal meaning) separate classes instead of modeling how the data itself is distributed, and they make few assumptions about the data points. However, these models are not capable of generating new data points. Therefore, the ultimate objective of discriminative models is to separate one class from another.

If outliers are present in the dataset, discriminative models tend to work better than generative models, i.e., discriminative models are more robust to outliers. However, one major drawback of these models is the **misclassification problem**, i.e., wrongly classifying a data point.

Training discriminative classifiers or discriminant analysis involves estimating a function **f: X -> Y**, or probability **P(Y|X)**:

- Assume some functional form for the probability, such as **P(Y|X)**
- With the help of training data, we estimate the parameters of **P(Y|X)**
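The two steps above can be sketched with logistic regression. The code assumes the functional form P(Y=1|x) = sigmoid(w·x + b) and estimates `w` and `b` by plain gradient ascent on the log-likelihood. The one-dimensional feature (a count of suspicious words), the data, the learning rate, and the number of epochs are all illustrative assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(Y=1|x) = sigmoid(w*x + b) by gradient ascent on the mean log-likelihood."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals y - P(Y=1|x) under the current parameters.
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w += lr * grad_w
        b += lr * grad_b
    return w, b

# Hypothetical feature: number of suspicious words in each email.
xs = [0, 1, 1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
p_low = sigmoid(w * 1 + b)   # estimated P(spam | 1 suspicious word)
p_high = sigmoid(w * 5 + b)  # estimated P(spam | 5 suspicious words)
print(w, b, p_low, p_high)
```

Note that nothing here models how the feature itself is distributed; only the mapping from the feature to the label is learned.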

Some examples of discriminative models are:

- Logistic regression
- Support vector machines (SVMs)
- Traditional neural networks
- Nearest neighbor
- Conditional Random Fields (CRFs)
- Decision Trees and Random Forest

Generative models are machine learning models that learn to generate new data samples similar to the training data they were trained on. They capture the underlying distribution of the data and can produce novel instances. Generative models find applications in image synthesis, data augmentation, and generating realistic content like images, music, and text.

Generative models are considered a class of statistical models that can generate new data instances. These models are used in unsupervised machine learning as a means to perform tasks such as:

- Probability and likelihood estimation
- Modeling data points
- Describing the phenomenon in data
- Distinguishing between classes based on these probabilities

Since these models often rely on the Bayes theorem to find the joint probability, generative models can tackle a more complex task than analogous discriminative models.

So, the generative approach focuses on the distribution of individual classes in a dataset, and the learning algorithms tend to model the underlying patterns or distribution of the data points (e.g., Gaussian). These models use the concept of joint probability and create instances where a given **feature (x)** or input and the desired output or label (y) occur together.

These models use **probability estimates** and **likelihood** to model data points and differentiate between different class labels present in a dataset. Unlike discriminative models, these models can also generate new data points.
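To make the "generate new data points" idea concrete, here is a minimal sketch that fits one Gaussian per class and then samples from it. The feature (message length in words), the numbers, and the choice of a Gaussian for P(X|Y) are assumptions for illustration only.

```python
import random
import statistics

# Made-up 1-D feature values (e.g. message length in words) for two classes.
data = {
    "spam": [12.0, 15.0, 11.0, 14.0, 13.0],
    "ham": [40.0, 38.0, 45.0, 42.0, 35.0],
}

def fit_gaussians(data):
    """Estimate a mean and sample standard deviation for P(X | Y = label)."""
    return {
        label: (statistics.mean(xs), statistics.stdev(xs))
        for label, xs in data.items()
    }

def sample(params, label, n, rng):
    """Draw n synthetic feature values from the fitted class-conditional Gaussian."""
    mu, sigma = params[label]
    return [rng.gauss(mu, sigma) for _ in range(n)]

params = fit_gaussians(data)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
new_spam = sample(params, "spam", 3, rng)
print(params["spam"], new_spam)
```

A discriminative model has no equivalent of `sample`, because it never learns P(X|Y).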

However, they also have a major drawback: if outliers are present in the dataset, they affect these types of models to a significant extent.


Training generative classifiers involves estimating a function **f: X -> Y**, or probability **P(Y|X)**:

- Assume some functional form for the probabilities, such as **P(Y)**, **P(X|Y)**
- With the help of training data, we estimate the parameters of **P(X|Y)**, **P(Y)**
- Use Bayes' theorem to calculate the posterior probability **P(Y|X)**
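The three steps above can be sketched as a small Bernoulli naive Bayes classifier. This is a minimal illustration, not a reference implementation: the two binary word features, the toy data, and the Laplace smoothing constant `alpha` are assumptions introduced for the example.

```python
import math

def fit_naive_bayes(X, y, alpha=1.0):
    """Steps 1-2: estimate P(Y) and P(x_j = 1 | Y) with Laplace smoothing."""
    priors, likelihoods = {}, {}
    n_features = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        priors[c] = len(rows) / len(X)
        likelihoods[c] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return priors, likelihoods

def predict_proba(x, priors, likelihoods):
    """Step 3: Bayes' theorem, P(Y=c | x) proportional to P(Y=c) * prod_j P(x_j | Y=c)."""
    log_joint = {}
    for c in priors:
        total = math.log(priors[c])
        for xj, theta in zip(x, likelihoods[c]):
            total += math.log(theta if xj == 1 else 1.0 - theta)
        log_joint[c] = total
    # Normalise by P(x), the sum of the joint over all classes.
    m = max(log_joint.values())
    unnorm = {c: math.exp(v - m) for c, v in log_joint.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

# Hypothetical features per email: [contains "free", contains "meeting"].
X = [[1, 0], [1, 0], [1, 1], [0, 0], [0, 1], [0, 1], [1, 1], [0, 1]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = spam, 0 = not spam

priors, likelihoods = fit_naive_bayes(X, y)
probs = predict_proba([1, 0], priors, likelihoods)
print(probs)
```

Working in log space before normalising is a common design choice that avoids numerical underflow when many features are multiplied together.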

Some examples of generative models are:

- Naïve Bayes
- Bayesian networks
- Markov random fields
- Hidden Markov Models (HMMs)
- Latent Dirichlet Allocation (LDA)
- Generative Adversarial Networks (GANs)
- Autoregressive Model

Let’s see some of the differences between the Generative vs Discriminative Models:

Aspect | Generative Models | Discriminative Models |
---|---|---|
Purpose | Model data distribution | Model conditional probability of labels given data |
Use Cases | Data generation, denoising, unsupervised learning | Classification, supervised learning tasks |
Common Examples | Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs) | Logistic Regression, Support Vector Machines, Deep Neural Networks |
Training Focus | Maximize likelihood of observed data, capture data structure | Learn decision boundary, differentiate between classes |
Example Task | Image generation, inpainting (e.g., GANs, VAEs) | Text classification, object detection (e.g., Deep Neural Networks) |

Now, let's look at the concrete differences between the two models:

Discriminative models draw boundaries in the data space, while generative models try to model how data is placed throughout the space. A generative model explains how the data was generated, while a discriminative model focuses on predicting the labels of the data.

In mathematical terms, a discriminative model is trained by learning parameters that maximize the conditional probability **P(Y|X)**. On the other hand, a generative model learns parameters by maximizing the joint probability **P(X, Y)**.
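The two objectives are linked by the identity log P(X, Y) = log P(Y|X) + log P(X). The sketch below checks this numerically for a small, hypothetical joint distribution over one binary feature and a binary label. The probability table and the data are invented for illustration.

```python
import math

# Hypothetical joint distribution P(X, Y) over a binary feature and a binary label.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
data = [(0, 0), (0, 0), (1, 1), (1, 1), (1, 0), (0, 1)]

def p_x(x):
    """Marginal P(X=x), obtained by summing the joint over the label."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def joint_log_likelihood(data):
    """Objective of a generative model: sum of log P(x, y)."""
    return sum(math.log(joint[(x, y)]) for x, y in data)

def conditional_log_likelihood(data):
    """Objective of a discriminative model: sum of log P(y | x)."""
    return sum(math.log(joint[(x, y)] / p_x(x)) for x, y in data)

def marginal_log_likelihood(data):
    """The extra term only the generative objective accounts for: sum of log P(x)."""
    return sum(math.log(p_x(x)) for x, _ in data)

jl = joint_log_likelihood(data)
cl = conditional_log_likelihood(data)
ml = marginal_log_likelihood(data)
print(jl, cl + ml)
```

In other words, the generative objective spends part of its effort explaining the inputs themselves, while the discriminative objective ignores that term entirely.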

Discriminative models recognize existing data: discriminative modeling identifies tags, sorts data, and can be used to classify it, while generative modeling produces new data.

Since these models use different approaches to machine learning, each is suited to specific tasks: generative models are useful for unsupervised learning tasks, while discriminative models are useful for supervised learning tasks. GANs (Generative Adversarial Networks) can be thought of as a competition between a generator, which is the generative component, and a discriminator, so a GAN is essentially a generative model pitted against a discriminative one.

Outliers have more impact on generative models than on discriminative models.

Discriminative models are computationally cheap compared to generative models.

Let’s see some of the comparisons based on the following criteria between Generative vs Discriminative Models:

Generative models need less data to train than discriminative models, since generative models are more biased: they make stronger assumptions, such as the **assumption of conditional independence**.

In general, if we have missing data in our dataset, then Generative models can work with these missing data, while discriminative models can’t. This is because, in generative models, we can still estimate the posterior by marginalizing the unseen variables. However, discriminative models usually require all the features X to be observed.
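Here is a small sketch of marginalizing an unseen variable, assuming a naive Bayes model. Because the features are treated as independent given the class, summing over both values of a missing feature contributes a factor of 1, so the feature can simply be left out of the product. The parameter values and the function name `posterior` are hypothetical.

```python
# Hypothetical, already-estimated parameters for a two-word naive Bayes spam model.
priors = {"spam": 0.5, "ham": 0.5}
likelihoods = {  # P(word present | class)
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham": {"free": 0.2, "meeting": 0.6},
}

def posterior(observed, priors, likelihoods):
    """P(class | observed features); words absent from `observed` are marginalised out."""
    joint = {}
    for c, prior in priors.items():
        p = prior
        for word, present in observed.items():
            theta = likelihoods[c][word]
            p *= theta if present else 1.0 - theta
        joint[c] = p
    z = sum(joint.values())
    return {c: p / z for c, p in joint.items()}

full = posterior({"free": True, "meeting": False}, priors, likelihoods)
partial = posterior({"free": True}, priors, likelihoods)  # "meeting" is unknown
print(full, partial)
```

The model still returns a valid posterior when "meeting" is unobserved; it is just less confident than when both features are known.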

If the assumption of conditional independence is violated, generative models are less accurate than discriminative models.
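One way to see the effect of a violated independence assumption is to feed a naive Bayes model the same feature twice. A perfect duplicate carries no new information, so the true posterior should not change, yet the model counts the evidence twice. The parameter values below are assumptions chosen for illustration.

```python
def naive_bayes_posterior(n_copies, prior_spam=0.5, p_word_spam=0.8, p_word_ham=0.2):
    """P(spam | word present) when the same word is counted n_copies times as if independent."""
    spam = prior_spam * p_word_spam ** n_copies
    ham = (1.0 - prior_spam) * p_word_ham ** n_copies
    return spam / (spam + ham)

single = naive_bayes_posterior(1)   # the feature is used once
doubled = naive_bayes_posterior(2)  # a perfectly correlated duplicate is treated as independent
print(single, doubled)
```

The posterior becomes overconfident as correlated copies are added. A discriminative model such as logistic regression can compensate by splitting the weight across the duplicated features.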

In conclusion, discriminative and generative models are two basic approaches to machine learning that have been used to solve various tasks. The discriminative approach focuses on learning the decision boundary between classes, while generative models are used to model the underlying data distribution. Understanding the difference between discriminative and generative models helps us to make better decisions about which approach to use for a particular task to build a more accurate machine-learning solution.

- Discriminative models learn the decision boundary between classes, while generative models aim to model the underlying data distribution.
- Discriminative models are often simpler and faster to train than generative models but may not perform as well on tasks where the underlying data distribution is complex or uncertain.
- Generative models can be used for a wider range of tasks, including image and text generation, but may require more training data and computational resources.

A. Discriminative models focus on modeling the decision boundary between classes, while probabilistic models focus on modeling the underlying probability distribution of the data.

A. Discriminative models support classification tasks, where the goal is to predict the class label of an input based on some features. They model the decision boundary between classes rather than modeling the distribution of the data.

A. Generative model examples include Variational Autoencoders (VAEs) for image generation and Generative Adversarial Networks (GANs) for creating realistic data like images and text.

A. No, CNN (Convolutional Neural Network) is not a generative model. It’s a type of neural network used mainly for tasks like image classification, not for generating data.

A. An example of a generative AI model is a language model like OpenAI’s GPT-3, which generates human-like text. A discriminative AI model example is logistic regression used for binary classification tasks like spam detection.

*The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.*
