An Overview of Deep Belief Network (DBN) in Deep Learning

Debasish Last Updated : 27 May, 2024

Introduction

A Deep Belief Network (DBN) is a sophisticated generative model that employs a deep architecture. In this article, we are going to learn all about it. After reading this article, you will have a better understanding of what a Deep Belief Network is, how it works, where to use it, and how to code your own Deep Belief Network.

This article was published as a part of the Data Science Blogathon.

Introduction
What is a Deep Belief Network?
How did Deep Belief Neural Networks Evolve?
The Architecture of DBN
How does DBN work?
Creating a Deep Belief Network
Learning a Deep Belief Network
Applications
Basic Python Implementation
How do you train a Deep Belief Network effectively?
Conclusion
Frequently Asked Questions

What is a Deep Belief Network?

We create Deep Belief Networks (DBNs) to address issues that classic neural networks face in deep layered architectures, for example slow learning, getting stuck in local minima owing to poor parameter selection, and requiring large amounts of training data.

A DBN is made up of several layers of stochastic latent variables. These latent variables are binary and are often called hidden units or feature detectors.
DBN is a hybrid generative graphical model. The connections between the top two layers are undirected. The remaining layers have directed, top-down links to the layers below them.
DBN is an algorithm for unsupervised probabilistic deep learning.

Deep Belief Networks are machine learning models that resemble deep neural networks but are not the same. Deep neural networks are feedforward networks with a deep architecture, i.e., many hidden layers. DBNs, by contrast, are built from simple, unsupervised networks such as Restricted Boltzmann Machines (RBMs) or autoencoders, with the hidden layer of each sub-network serving as the visible layer for the next.

How did Deep Belief Neural Networks Evolve?

The First Generation of neural networks used Perceptrons, which identify an object by computing a weighted combination of its inputs. Perceptrons are useful for basic tasks but not for more sophisticated ones. To address these problems, the Second Generation of neural networks introduced Backpropagation, which compares the received output to the desired output and adjusts the weights to reduce the error. Then came directed acyclic graphs known as belief networks, which aided in the solution of inference and learning problems. Finally, Deep Belief Networks were introduced to help construct unbiased values that we can store in leaf nodes.

Restricted Boltzmann Machines

A Restricted Boltzmann Machine (RBM) is a type of generative stochastic artificial neural network that can learn a probability distribution from its inputs. Deep learning networks can also use RBM. Deep belief networks, in particular, can be created by “stacking” RBMs and fine-tuning the resulting deep network via gradient descent and backpropagation.
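
To make the phrase "learn a probability distribution" concrete, the standard formulation of a binary RBM can be written as below. This is the usual textbook form rather than anything specific to this article; the symbols $\mathbf{a}$ and $\mathbf{b}$ (visible and hidden biases) and $\sigma$ (the logistic sigmoid) are notation introduced here for illustration, while $W$ is the symmetric weight matrix between the two layers.

```latex
% Energy of a joint configuration of visible units v and hidden units h
E(\mathbf{v}, \mathbf{h}) = -\mathbf{a}^{\top}\mathbf{v} - \mathbf{b}^{\top}\mathbf{h} - \mathbf{v}^{\top} W \mathbf{h}

% Joint distribution, with Z the normalizing constant (partition function)
P(\mathbf{v}, \mathbf{h}) = \frac{1}{Z}\, e^{-E(\mathbf{v}, \mathbf{h})}

% Because there are no connections within a layer, the conditionals factorize
P(h_j = 1 \mid \mathbf{v}) = \sigma\Big(b_j + \sum_i v_i W_{ij}\Big), \qquad
P(v_i = 1 \mid \mathbf{h}) = \sigma\Big(a_i + \sum_j W_{ij} h_j\Big)
```

The two conditionals are what make Gibbs sampling in an RBM cheap: all hidden units can be sampled at once given the visible units, and vice versa.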

The Architecture of DBN

A Deep Belief Network is a series of Restricted Boltzmann Machines connected in a specific order. The output (hidden) layer of one RBM is fed as input to the next RBM in sequence. We train each RBM until it converges, then repeat the same procedure for the next one until the whole network is complete.

The undirected and symmetric connections between the top two levels of a DBN form an associative memory. The connections between all lower layers are directed, with arrows pointing towards the layer closest to the data. These directed acyclic connections translate the state of the associative memory into observable variables. The lowest layer of visible units receives the input data, which can be binary or real-valued. As in a Restricted Boltzmann Machine (RBM), there are no intralayer connections. The hidden units represent features that capture the correlations in the data. A matrix of symmetrical weights W connects two adjacent layers, and every unit in each layer is linked to every unit in the layer above it.

How does DBN work?

The first stage is to train a layer of features that receives its input signals directly from the pixels. Then, in a second hidden layer, we learn features of the previously obtained features by treating the values of the first layer as if they were pixels. Every time a fresh layer of features is added to the network, the lower bound on the log-likelihood of the training data set improves.

Deep Belief Network’s operational pipeline is as follows:

We use the greedy learning algorithm to pre-train the DBN. The greedy method learns the top-down generative weights one layer at a time. These generative weights determine the relationship between variables in one layer and variables in the layer above.
We run several steps of Gibbs sampling on the top two hidden layers. These two layers define an RBM, so this stage effectively draws a sample from that RBM.
We then generate a sample from the visible units using a single pass of ancestral sampling through the rest of the model.
We use a single bottom-up pass to infer the values of the latent variables in each layer. Greedy pre-training begins with an observed data vector in the bottom layer. Fine-tuning then adjusts the generative weights in the opposite direction.
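
The sampling steps in the pipeline above can be sketched in NumPy as follows. This is a minimal illustration, not a trained model: the weights are random stand-ins, and the function name `dbn_generate`, the layer sizes, and the bias layout are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_bernoulli(p, rng):
    # Draw binary states whose "on" probabilities are given by p.
    return (rng.random(p.shape) < p).astype(np.float64)

def dbn_generate(weights, lower_biases, top_hidden_bias, n_gibbs, rng):
    """Draw one visible sample from a DBN (hypothetical helper).

    weights[k] has shape (units in layer k, units in layer k + 1).
    lower_biases[k] is the bias of layer k (the lower side of weights[k]).
    The last weight matrix is the undirected RBM formed by the top two layers.
    """
    W_top = weights[-1]
    # Start the Gibbs chain from a random binary state.
    v = sample_bernoulli(np.full(W_top.shape[0], 0.5), rng)
    # Alternating Gibbs sampling inside the top-level RBM.
    for _ in range(n_gibbs):
        h = sample_bernoulli(sigmoid(v @ W_top + top_hidden_bias), rng)
        v = sample_bernoulli(sigmoid(h @ W_top.T + lower_biases[-1]), rng)
    # Single top-down ancestral pass through the directed layers.
    x = v
    for W, a in zip(reversed(weights[:-1]), reversed(lower_biases[:-1])):
        x = sample_bernoulli(sigmoid(x @ W.T + a), rng)
    return x

# Toy network: 6 visible units, then hidden layers of 4 and 3 units.
rng = np.random.default_rng(0)
sizes = [6, 4, 3]
weights = [0.1 * rng.standard_normal((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
lower_biases = [np.zeros(m) for m in sizes[:-1]]
top_hidden_bias = np.zeros(sizes[-1])

sample = dbn_generate(weights, lower_biases, top_hidden_bias, n_gibbs=50, rng=rng)
print(sample)
```

With untrained weights the output is just noise; the point is the order of operations: Gibbs steps at the top, then one pass downward to the visible layer.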

It’s necessary to remember that constructing a Deep Belief Network requires training each RBM layer. We begin by initializing the units and parameters. The Contrastive Divergence algorithm used to train each RBM has two phases: positive and negative. In the positive phase, we calculate the binary states of the hidden units by computing their probabilities from the weights and the visible units. It is known as the positive phase because it increases the likelihood of the training data set. The negative phase decreases the probability of the samples that the model itself generates. To train a complete Deep Belief Network, we employ the greedy learning technique, which trains one RBM at a time until all of the RBMs are trained.
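
As a rough illustration of the two phases, here is a minimal NumPy sketch of a single-step Contrastive Divergence (CD-1) update for a binary RBM. The function name `cd1_update`, the toy data, and the hyperparameters are assumptions for the example; the sketch also uses reconstruction probabilities rather than sampled states in the negative phase, which is a common simplification.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr, rng):
    """One CD-1 step on a batch v0 of shape (batch, n_visible)."""
    # Positive phase: hidden probabilities and sampled binary states from the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(np.float64)
    # Negative phase: reconstruct the visible units, then recompute the hidden probabilities.
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Update: data-driven statistics minus model-driven statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    a = a + lr * (v0 - pv1).mean(axis=0)
    b = b + lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

def reconstruction_error(v, W, a, b):
    # Mean squared error of a deterministic visible -> hidden -> visible pass.
    recon = sigmoid(sigmoid(v @ W + b) @ W.T + a)
    return float(np.mean((v - recon) ** 2))

# Toy data: two complementary binary patterns, repeated.
rng = np.random.default_rng(0)
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=np.float64)
W = 0.01 * rng.standard_normal((6, 2))
a = np.zeros(6)
b = np.zeros(2)

err_before = reconstruction_error(data, W, a, b)
for _ in range(1000):
    W, a, b = cd1_update(data, W, a, b, lr=0.1, rng=rng)
err_after = reconstruction_error(data, W, a, b)
print(err_before, err_after)
```

Reconstruction error is only a convenient proxy for progress here; it is not the quantity that Contrastive Divergence optimizes.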

Creating a Deep Belief Network

A Deep Belief Network is made up of several RBMs, and this is reflected in its Clojure record structure. We use a fully unsupervised DBN to initialize a Deep Neural Network, whereas we use a classification DBN (CDBN) as a classification model in its own right.

Each record type includes the RBMs that make up the network’s layers, as well as a vector indicating the layer sizes and, in the case of the classification DBN, the number of classes in the representative dataset. The DBN record represents a model made up entirely of stacked RBMs, whereas the CDBN record also contains the top-level associative memory, which is a CRBM record. This enables the top layer to be trained to generate class labels for input data vectors and to classify unknown data vectors.

Learning a Deep Belief Network

Because a DBN is built from RBMs, which are trained in an unsupervised way, it is comparatively easy to train. Training the RBMs accounts for most of the DBN training time, but the code is simpler. Because CDBNs require the observation labels to be available during top-layer training, a training session entails first training the bottom layer, then propagating the dataset through the learned RBM and using the transformed dataset as the training data for the next RBM. We repeat this process until the dataset has passed through the penultimate trained RBM. The labels are then concatenated with the transformed dataset and used to train the top-layer associative memory. Like the RBM model, the DBN has hyperparameters to set and offers sensible default values.
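
The training procedure just described can be sketched in NumPy as follows. This is an illustrative reimplementation under simplifying assumptions (full-batch CD-1, sigmoid units, one-hot labels), not the Clojure library's code; `train_rbm` and `train_classification_dbn` are hypothetical names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs, lr, rng):
    """Train one RBM with full-batch CD-1; returns (W, visible bias, hidden bias)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a = np.zeros(n_visible)
    b = np.zeros(n_hidden)
    for _ in range(epochs):
        ph0 = sigmoid(data @ W + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(np.float64)
        pv1 = sigmoid(h0 @ W.T + a)
        ph1 = sigmoid(pv1 @ W + b)
        W += lr * (data.T @ ph0 - pv1.T @ ph1) / len(data)
        a += lr * (data - pv1).mean(axis=0)
        b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

def train_classification_dbn(X, y, hidden_sizes, n_classes, epochs=50, lr=0.1, seed=0):
    """Greedy layer-wise training; labels are only used for the top layer."""
    rng = np.random.default_rng(seed)
    layers = []
    data = X
    # Train every RBM except the top one, propagating the dataset upward each time.
    for n_hidden in hidden_sizes[:-1]:
        W, a, b = train_rbm(data, n_hidden, epochs, lr, rng)
        layers.append((W, a, b))
        data = sigmoid(data @ W + b)  # transformed dataset for the next RBM
    # Concatenate one-hot labels with the transformed data to train the top layer.
    one_hot = np.eye(n_classes)[y]
    top_input = np.hstack([data, one_hot])
    layers.append(train_rbm(top_input, hidden_sizes[-1], epochs, lr, rng))
    return layers

# Toy run: 20 random binary vectors with 10 features and 3 classes.
rng = np.random.default_rng(1)
X = (rng.random((20, 10)) < 0.5).astype(np.float64)
y = rng.integers(0, 3, size=20)
layers = train_classification_dbn(X, y, hidden_sizes=[8, 6, 4], n_classes=3)
print([W.shape for W, _, _ in layers])
```

Note that the top RBM's visible layer is wider than the layer below it, because it sees the six propagated features plus the three label units.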

Applications

In more sophisticated setups, we employ deep belief networks in place of deep feedforward networks or even convolutional neural networks. They have the benefit of being less computationally costly (with greedy layer-wise training, the cost grows roughly linearly with the number of layers) and of being less susceptible to the vanishing gradients problem.

Applications of DBN are as follows:

Image recognition.
Video sequences.
Motion-capture (mocap) data.
Speech recognition.

Basic Python Implementation

We’ll begin by importing Python libraries. For learning purposes, there are countless datasets available. We’re going to use https://www.kaggle.com/c/digit-recognizer for this article.

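# Note: `dbn` is not part of scikit-learn or TensorFlow; it is a separate
# third-party package (the deep-belief-network project on GitHub).
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from dbn.tensorflow import SupervisedDBNClassification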

Then we’ll load the CSV file, standardize the features with sklearn, and split the data into a training set (75%) and a test set (25%). We create the DBN model, fit it on the training set, and store its predictions for the test set in y_pred. Finally, we calculate the accuracy score and display it on the screen.


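# Load the Kaggle Digit Recognizer training data (one row per image).
digits = pd.read_csv("train.csv")
from sklearn.preprocessing import StandardScaler
# Separate the pixel columns (features) from the digit labels.
X = np.array(digits.drop(["label"], axis=1))
Y = np.array(digits["label"])
# Standardize each pixel column to zero mean and unit variance.
ss = StandardScaler()
X = ss.fit_transform(X)
# Hold out 25% of the rows for testing.
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.25)
# Two hidden layers of 256 units: RBM pre-training, then backpropagation.
classifier = SupervisedDBNClassification(hidden_layers_structure=[256, 256], learning_rate_rbm=0.05, learning_rate=0.1, n_epochs_rbm=10, n_iter_backprop=100, batch_size=32, activation_function='relu', dropout_p=0.2)
classifier.fit(x_train, y_train)
# Predict on the held-out set and compare against the true labels.
y_pred = classifier.predict(x_test)
print('\nAccuracy of Prediction: %f' % accuracy_score(y_test, y_pred))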
Output

Accuracy of Prediction: 93.3%
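
If the `dbn` package is not available, a rough stand-in can be assembled from scikit-learn alone. The sketch below is an assumption-laden alternative, not the code above: it uses scikit-learn's bundled 8x8 digits dataset instead of the Kaggle CSV, stacks two `BernoulliRBM` layers in a `Pipeline` (which fits them one after another, i.e. greedily), and puts a logistic regression on top. Unlike a full DBN, the RBM weights are not fine-tuned jointly with the classifier, and the layer sizes are arbitrary.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Small built-in digits dataset (8x8 images), used here instead of train.csv.
X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = Pipeline([
    # BernoulliRBM expects inputs in [0, 1], so scale the pixel values first.
    ("scale", MinMaxScaler()),
    # Each RBM is fitted on the output of the previous step (greedy stacking).
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=10, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10, random_state=0)),
    # Supervised classifier trained on the features from the top RBM.
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(x_train, y_train)

y_pred = model.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy of prediction: %.3f" % accuracy)
```

The accuracy from this stand-in is not comparable to the figure reported above, since both the dataset and the model differ.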

How do you train a Deep Belief Network effectively?

Training a Deep Belief Network (DBN) involves a two-stage process: pre-training and fine-tuning. Here’s a breakdown of effective techniques for each stage:

Pre-training (Unsupervised):

Greedy Layer-wise Training:
- DBNs (Deep Belief Networks) are constructed from stacked Restricted Boltzmann Machines (RBMs).
- Each RBM is trained one at a time in a greedy fashion. This approach uses the output from the previously trained RBM as the input for the next one, enhancing the learning models incrementally.
Contrastive Divergence:
- A common algorithm used to train RBMs.
- It involves a positive phase, where hidden-unit probabilities are computed from the data, and a negative phase, where the data is reconstructed from the hidden units. The difference between the two drives the weight update, which helps the RBM learn good representations of the input data without using labels.
Choice of Learning Rate:
- The learning rate determines how much the weights of the network are adjusted during training.
- Experimenting with different rates is essential to find the optimal rate that maximizes learning for your specific dataset.

Fine-tuning (Supervised):

Supervised Learning Algorithm:
- Once the RBMs are pre-trained, stack them together to form a DBN.
- Use a supervised learning algorithm like backpropagation to fine-tune the entire network for your specific task, whether it be classification, regression, etc. This step transitions from unsupervised to supervised learning.
Regularization Techniques:
- Regularization helps prevent overfitting by penalizing the complexity of the model.
- Techniques like dropout and weight decay can be employed during fine-tuning to ensure that the model generalizes well to new data.
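
To illustrate the fine-tuning stage, here is a minimal NumPy sketch of backpropagation through one pre-trained layer plus a newly added softmax output layer, with weight decay as the regularizer. Everything in it is an assumption for the example: `fine_tune` is a hypothetical helper, the "pre-trained" weights are random stand-ins, the data is synthetic, and dropout is omitted to keep the sketch short.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fine_tune(X, y, W1, b1, n_classes, epochs=200, lr=0.5, weight_decay=1e-4, seed=0):
    """Supervised fine-tuning of one pre-trained layer plus a new softmax layer."""
    rng = np.random.default_rng(seed)
    W1, b1 = W1.copy(), b1.copy()
    W2 = 0.01 * rng.standard_normal((W1.shape[1], n_classes))
    b2 = np.zeros(n_classes)
    T = np.eye(n_classes)[y]  # one-hot targets
    losses = []
    for _ in range(epochs):
        # Forward pass.
        H = sigmoid(X @ W1 + b1)
        P = softmax(H @ W2 + b2)
        losses.append(float(-np.mean(np.sum(T * np.log(P + 1e-12), axis=1))))
        # Backward pass for the mean cross-entropy loss.
        dZ2 = (P - T) / len(X)
        dW2 = H.T @ dZ2 + weight_decay * W2  # weight decay = L2 penalty gradient
        dH = dZ2 @ W2.T
        dZ1 = dH * H * (1.0 - H)
        dW1 = X.T @ dZ1 + weight_decay * W1
        # Gradient descent updates.
        W2 -= lr * dW2
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * dZ1.sum(axis=0)
    return (W1, b1, W2, b2), losses

# Toy problem: the class depends on whether the first feature exceeds 0.5.
rng = np.random.default_rng(0)
X = rng.random((30, 5))
y = (X[:, 0] > 0.5).astype(int)
W1_pretrained = 0.1 * rng.standard_normal((5, 4))  # stand-in for RBM weights
b1_pretrained = np.zeros(4)

params, losses = fine_tune(X, y, W1_pretrained, b1_pretrained, n_classes=2)
print(losses[0], losses[-1])
```

In a real DBN, `W1` and `b1` would come from the pre-trained RBM stack, and the same backward pass would extend through every layer.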

Conclusion

A DBN is sometimes described as a stack of Restricted Boltzmann Machines (RBMs) placed on top of one another.
We create Deep Belief Networks (DBNs) to address issues with classic neural networks in deep layered networks.
A Deep Belief Network is made up of a number of smaller unsupervised neural networks. In the positive phase, we calculate the binary states of the hidden units by computing their probabilities from the weights and the visible units.
Although the layers are connected to each other, the network has no connections between units inside a single layer, which is a common feature of deep belief networks.
It’s necessary to remember that constructing a Deep Belief Network necessitates training each RBM layer. The greedy learning algorithm trains one RBM at a time until all of the RBMs are trained.
In neural computation, these algorithms operate on multilayer networks.

Frequently Asked Questions

Q1. What is DBN used for?

A. A Deep Belief Network (DBN) is a type of artificial neural network used for unsupervised learning tasks such as feature learning, dimensionality reduction, and generative modeling. It consists of multiple layers of hidden units that learn to represent data in a hierarchical manner. DBNs have applications in various fields, including image recognition, natural language processing, and collaborative filtering.

Q2. What is DBN in artificial intelligence?

A. DBN stands for Deep Belief Network, a type of artificial neural network in AI. It comprises multiple layers of hidden units that learn to represent data hierarchically. DBNs are used for unsupervised learning tasks, like feature learning, dimensionality reduction, and generative modeling, finding applications in image recognition, natural language processing, and collaborative filtering.

Please feel free to leave a remark below if you have any queries or concerns about the blog. Head on to our blog for the latest articles.

The media shown in this article is not owned by Analytics Vidhya and are used at the Author’s discretion

You should probably give credit to the person who wrote that code. It seems to be a direct use of albertup's Github example of DBNs...The dbn.tensorflow module does not exist, it was created by albertup. As is, the code doesn't run and is missing most of the code needed to train a DBN

suchithra

how to install module dbn
