Deep Learning 101: Beginners Guide to Neural Network

Himanshi Singh Last Updated : 27 Jul, 2023

Introduction

If one area of data science has driven the growth of Machine Learning and Artificial Intelligence in the last few years, it is Deep Learning. Once confined to university research labs with little success in industry, Deep Learning and Neural Networks now power a huge range of smart devices and have started a revolution.

In this article, we will be introducing you to the components of neural networks.

Introduction
Building Blocks of a Neural Network: Layers and Neurons
- 1. What are Layers in a Neural Network?
- 2. What are Neurons in a Neural Network?
What is a Firing of a neuron?
Activation Functions in a Neural Network
Frequently Asked Questions
End Notes

Building Blocks of a Neural Network: Layers and Neurons

A neural network has two building blocks: layers and neurons. Let's look at each of them in detail.

1. What are Layers in a Neural Network?

A neural network is made up of vertically stacked components called layers. In the accompanying diagram, each dotted line represents a layer. There are three types of layers in a neural network (NN):

Input Layer– First is the input layer. This layer will accept the data and pass it to the rest of the network.

Hidden Layer– The second type of layer is called the hidden layer. A neural network has one or more hidden layers; in the example above, there is one. Hidden layers are the ones actually responsible for the excellent performance and complexity of neural networks. They perform several functions at the same time, such as data transformation and automatic feature creation.

Output layer– The last type of layer is the output layer. The output layer holds the result or the output of the problem. Raw images get passed to the input layer and we receive output in the output layer. For example-

In this case, we provide an image of a vehicle, and after the image has passed through the input and hidden layers, the output layer tells us whether it is an emergency or a non-emergency vehicle.
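To make the flow of data through the three layer types concrete, here is a minimal sketch in plain Python. The layer sizes and all numeric values are made up for illustration, and the weights and biases it uses are concepts explained later in this article. Treat it as a rough sketch under those assumptions, not as how any particular library builds a network.

```python
def dense_layer(inputs, weights, biases):
    """Compute one output per neuron: a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]

# Input layer: simply holds the data (3 hypothetical features).
input_layer = [0.5, -1.0, 2.0]

# Hidden layer: 2 neurons, each with 3 weights (illustrative values).
hidden_weights = [[0.1, 0.2, 0.3],
                  [-0.3, 0.1, 0.2]]
hidden_biases = [0.0, 0.1]

# Output layer: 1 neuron with 2 weights, one per hidden neuron.
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

hidden = dense_layer(input_layer, hidden_weights, hidden_biases)
output = dense_layer(hidden, output_weights, output_biases)
print("hidden layer values:", hidden)
print("output layer value:", output)
```

The data enters at the input layer, is transformed by the hidden layer, and the output layer produces the final value.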

Now that we know about layers and their function, let's talk in detail about what each of these layers is made up of.

2. What are Neurons in a Neural Network?

A layer consists of small individual units called neurons. An artificial neuron is easiest to understand by comparing it with a biological neuron: it receives input from other neurons, performs some processing, and produces an output.

Now let’s see an artificial neuron-

Here, X1 and X2 are inputs to the artificial neurons, f(X) represents the processing done on the inputs and y represents the output of the neuron.

What is a Firing of a neuron?

In real life, we all have heard the phrase- “Fire up those neurons” in one form or another. The same applies to artificial neurons as well. Every neuron has a tendency to fire but only in certain conditions. For example-

If f(X) is simple addition, one neuron may fire only when the sum of its inputs is greater than, say, 100, while another neuron may fire when the sum is greater than 10.

These conditions, which differ from neuron to neuron, are called thresholds. For example, suppose the input X1 into the first neuron is 30 and X2 is 0.

The first neuron will not fire, since the sum 30 + 0 = 30 is not greater than its threshold of 100. If the other neuron receives the same inputs, it will fire, since the sum of 30 is greater than its threshold of 10.
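The firing rule described above can be sketched in a few lines of Python. The function name `fires` is an illustrative choice; the inputs and thresholds are the ones from the example.

```python
def fires(inputs, threshold):
    """Return True when the sum of the inputs is greater than the threshold."""
    return sum(inputs) > threshold

inputs = [30, 0]  # X1 = 30, X2 = 0

first_neuron = fires(inputs, threshold=100)   # is 30 greater than 100?
second_neuron = fires(inputs, threshold=10)   # is 30 greater than 10?

print("first neuron fires:", first_neuron)
print("second neuron fires:", second_neuron)
```

The same inputs give different results only because the two neurons have different thresholds.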

The negative of the threshold is called the bias of a neuron. Let us represent this mathematically. The firing and non-firing conditions of a neuron can be written as a pair of inequalities:

X1 + X2 > threshold (the neuron fires)
X1 + X2 <= threshold (the neuron does not fire)

If the sum of the inputs is greater than the threshold, the neuron fires. Otherwise, it does not. Let's simplify this a bit by moving the threshold to the left side:

X1 + X2 - threshold > 0 (the neuron fires)
X1 + X2 - threshold <= 0 (the neuron does not fire)

The term -threshold is what we call the bias, so the firing condition becomes X1 + X2 + bias > 0.
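As a quick check that moving the threshold to the left side changes nothing, the sketch below compares the two forms of the firing condition on a few inputs. The function names and the test inputs are illustrative assumptions; the threshold of 100 comes from the earlier example.

```python
def fires_with_threshold(x1, x2, threshold):
    """Original form: the sum of the inputs must exceed the threshold."""
    return x1 + x2 > threshold

def fires_with_bias(x1, x2, bias):
    """Rearranged form: the sum of the inputs plus the bias must exceed zero."""
    return x1 + x2 + bias > 0

threshold = 100
bias = -threshold  # the bias is the negative of the threshold

# Compare both forms on a few sample inputs, including the boundary case.
samples = [(30, 0), (60, 50), (100, 0)]
agreement = [
    fires_with_threshold(x1, x2, threshold) == fires_with_bias(x1, x2, bias)
    for x1, x2 in samples
]
print("both forms agree on every sample:", all(agreement))
```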

One thing to note is that diagrams often draw a single bias input per layer, but in a typical artificial neural network each neuron has its own bias value, which is learned during training. Now that we have a good understanding of bias and how it represents the condition for a neuron to fire, let's move to another aspect of an artificial neuron called weights.

So far in our calculations, we have assigned equal importance to all the inputs. For example:

Here X1 has a weight of 1, X2 has a weight of 1, and the bias has a weight of 1. But what if we want to attach different weights to different inputs?

Let's look at an example to understand this better. Suppose there is a college party today and you have to decide whether to go, based on a few input conditions: Is the weather good? Is the venue near? Is your crush coming?

If the weather is good, it is represented by a value of 1, otherwise 0. Similarly, if the venue is near, it is represented by 1, otherwise 0. The same goes for whether your crush is coming to the party or not.

Now suppose that, being a college teenager, you absolutely adore your crush and would go to any lengths to see him or her. You will definitely go to the party no matter how the weather is or how far the venue is. In that case you will want to assign more weight to X3, which represents the crush, than to the other two inputs.

Such a situation can be represented by assigning weights to the inputs like this:

We can assign a weight of 3 to the weather, a weight of 2 to the venue, and a weight of 6 to the crush. If the weighted sum of these three factors (weather, venue, and crush) is greater than a threshold of 5, you decide to go to the party; otherwise you do not.

Note: X0 is the bias value

In this example, we started from the condition that the crush is more important than the weather or the venue.

Say the weather (X1) is bad, represented by 0, and the venue (X2) is far off, represented by 0, but your crush (X3) is coming to the party, represented by 1. Multiplying each X by its weight gives 0 for weather (X1), 0 for venue (X2), and 6 for crush (X3), so the weighted sum is 6. Since 6 is greater than the threshold of 5, you decide to go to the party. Hence the output (y) is 1.
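The party decision above can be reproduced with a small weighted-sum sketch. The function name `decide` is an illustrative assumption; the weights (3, 2, 6) and the threshold of 5 are the values from the example.

```python
def decide(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

weights = [3, 2, 6]   # weather, venue, crush
threshold = 5

# Bad weather, far venue, crush is coming: weighted sum is 0 + 0 + 6 = 6.
crush_only = decide([0, 0, 1], weights, threshold)

# Good weather, near venue, no crush: weighted sum is 3 + 2 + 0 = 5.
no_crush = decide([1, 1, 0], weights, threshold)

print("go when only the crush is coming:", crush_only)
print("go when only weather and venue are good:", no_crush)
```

With these weights, good weather and a nearby venue together reach exactly 5, which is not greater than the threshold, so the crush input alone decides the outcome.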

Let's imagine a different scenario now. Suppose you are sick today and will not attend the party no matter what. This situation can be represented by assigning an equal weight of 1 to weather, venue, and crush, with a threshold of 4.

Here we change the threshold to 4, so even if the weather is good, the venue is near, and your crush is coming, you won't go to the party, since the sum 1 + 1 + 1 = 3 is less than the threshold of 4.

These values w0, w1, w2, and w3 are called the weights of a neuron, and they differ from neuron to neuron. The weights are what a neural network has to learn in order to make good decisions.
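To see that this choice of weights and threshold really rules out the party in every case, the sketch below tries all eight combinations of the three inputs. The helper `decide` and the use of `itertools.product` are illustrative choices; the equal weights of 1 and the threshold of 4 follow the scenario above.

```python
from itertools import product

def decide(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

weights = [1, 1, 1]   # equal importance for weather, venue, crush
threshold = 4         # higher than the largest possible sum of 3

# Evaluate every combination of 0/1 values for the three inputs.
outcomes = {
    combo: decide(combo, weights, threshold)
    for combo in product([0, 1], repeat=3)
}

for combo, decision in outcomes.items():
    print(combo, "->", decision)
```

Because the largest possible sum is 3, no combination of inputs can exceed the threshold, so the output is always 0.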

Activation Functions in a Neural Network

Now that we know how a neural network combines different inputs using weights, let's move to the last aspect of a neuron: the activation function. So far we have simply added up some weighted inputs to calculate an output, and this output can range from minus infinity to infinity.

But this can be a problem in many circumstances. Assume we first want to estimate the age of a person from their height, weight, and cholesterol level, and then classify the person as old or not based on whether the age is greater than 60.

If we use the neuron as described so far, its output can range from -∞ to ∞, so even an age of -20 is possible. Despite this absurd range, we can still apply our criterion to decide whether a person is old: a person is old only if the age is greater than 60. So even if the age comes out as -20, the criterion classifies the person as not old.

But it would be much better if the age made more sense, for example if the output of the neuron representing the age were in the range of, say, 0 to 120. So how can we solve this problem when the output of a neuron is not in a particular range?

One way to clip the age on the negative side is to use a function such as max(0, X).

Now let’s first note the original condition, before using any function. For the positive X, we had a positive Y, and for negative X we had a negative Y. Here x-axis represents the actual values and y represents the transformed values-

But now if you want to get rid of the negative values what we can do is use a function like max(0, X). Using this function anything which is on the negative side of the x-axis gets clipped to 0.

This function is called ReLU (Rectified Linear Unit), and functions of this kind, which transform the combined input, are called activation functions. So, ReLU is an activation function.
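A minimal sketch of ReLU in Python follows. The sample values are hypothetical neuron outputs chosen to echo the age example; only the max(0, X) rule comes from the text.

```python
def relu(x):
    """Clip negative values to 0 and leave non-negative values unchanged."""
    return max(0, x)

# Hypothetical raw outputs of the age-estimating neuron.
raw_ages = [-20, -1, 0, 35, 120]
clipped_ages = [relu(age) for age in raw_ages]

print("raw:    ", raw_ages)
print("clipped:", clipped_ages)
```

Note that ReLU only fixes the negative side: it does not put any upper limit on the output.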

Depending on the type of transformation needed there can be different kinds of activation functions. Let’s have a look at some of the popular activation functions-

Sigmoid activation function– This function transforms the combined input to a value between 0 and 1. For example, if the combined input can range from minus infinity to infinity (the x-axis), the sigmoid function squeezes this infinite range into values between 0 and 1.
Tanh activation function– This function transforms the combined input to a value between -1 and 1. Its curve looks very similar in shape to the sigmoid, but its output lies between -1 and 1.
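The two ranges described above can be checked with a short sketch using Python's standard `math` module. The sigmoid is written with its usual formula 1 / (1 + e^(-x)), which the article does not spell out, and the sample inputs are arbitrary.

```python
import math

def sigmoid(x):
    """Squash x into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squash x into the open interval (-1, 1)."""
    return math.tanh(x)

# Arbitrary sample inputs on both sides of zero.
samples = [-5.0, -1.0, 0.0, 1.0, 5.0]
sigmoid_values = [sigmoid(x) for x in samples]
tanh_values = [tanh(x) for x in samples]

for x, s, t in zip(samples, sigmoid_values, tanh_values):
    print(f"x={x:+.1f}  sigmoid={s:.4f}  tanh={t:+.4f}")
```

At an input of 0, the sigmoid returns 0.5 (the middle of its range) while tanh returns 0, which is the main visible difference between the two curves.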

Different activation functions perform differently on different data distributions. So sometimes you have to try several activation functions and check which one works better for a particular problem.

Frequently Asked Questions

Q1. How many layers are there in a neural network?

A. The number of layers in a neural network can vary depending on the architecture. A typical neural network consists of an input layer, one or more hidden layers, and an output layer. The depth of a neural network refers to the number of hidden layers. Deep neural networks may have multiple hidden layers, hence the term “deep learning.”

Q2. What are the 3 layers of deep learning?

A. In deep learning, the three essential layers of a neural network are:
1. Input Layer: The first layer that receives the input data, such as images or text.
2. Hidden Layers: One or more layers in between the input and output layers where complex patterns and representations are learned.
3. Output Layer: The final layer that produces the model’s predictions or outputs based on the learned representations from the hidden layers.

End Notes

So far, we have seen that a neural network is composed of different types of layers stacked together, and each of these layers is composed of individual units called neurons. Every neuron has three properties: the first is the bias, the second is the weights, and the third is the activation function.

The bias is the negative of the threshold above which you want the neuron to fire. The weights define which inputs are more important than the others. The activation function transforms the combined weighted input into a range that suits the need at hand.
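Putting the three properties together, a single neuron can be sketched as below. The function names are illustrative assumptions, the step function stands in for the plain fire/not-fire rule, and the inputs, weights, and bias reuse the party example (threshold 5, so bias -5).

```python
import math

def step(z):
    """Fire (1) when the combined input is above zero, otherwise 0."""
    return 1 if z > 0 else 0

def relu(z):
    """Clip negative combined inputs to 0."""
    return max(0, z)

def sigmoid(z):
    """Squash the combined input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs, plus the bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

inputs = [0, 0, 1]    # bad weather, far venue, crush is coming
weights = [3, 2, 6]
bias = -5             # negative of the threshold of 5

print("step:   ", neuron(inputs, weights, bias, step))
print("relu:   ", neuron(inputs, weights, bias, relu))
print("sigmoid:", neuron(inputs, weights, bias, sigmoid))
```

The weighted sum and bias are identical in all three calls; only the activation function changes how the combined input is turned into the neuron's output.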

I highly recommend you check out our Certified AI & ML BlackBelt Plus Program to begin your journey into the fascinating world of data science and learn these and many more topics.

I hope this article works as a starting point for learning about neural networks and deep learning.

Reach out to us in the comments below in case you have any doubts.
