The Place of Data Science in the Data Universe

Ayoub Abid 23 Feb, 2021 • 5 min read

This article was published as a part of the Data Science Blogathon.

According to Barton Poulson, it is quite difficult to separate Data Science, Machine Learning, and Artificial Intelligence. That is why there’s no consistent definition, and why there’s so much debate over what one thing is, and what the other one is.

To illustrate, he visualizes the relationships among Data Science, Machine Learning, Neural Networks, and Artificial Intelligence in the graph above.

He explains the graph as follows:

Between Data Science and ML, there’s a lot of overlap.
Within ML, there’s a specific approach called Neural Networks (NN).
AI is not a well-defined category; it mostly overlaps with NN and ML.

Artificial Intelligence

There are many statements about what AI is, but none of them is taken as definitive. Examples of these statements:

Things that computers cannot do.
Tasks that normally require humans.
Programs that learn from data.
The simulation of human intelligence in machines that are programmed to think and act like humans.

Examples of Artificial Intelligence applications:

Categorizing photos.
Translations.
Games.

Artificial Intelligence Types

General AI: A machine that has something like a mind and is able to make independent decisions.

Narrow AI: Algorithms that are pre-programmed by humans for specific tasks; it is closer to a simulation of human behavior.

AI and Data Science are different concepts, but not in a way that makes them directly comparable. Explaining what each of them is will clarify how they differ.

Artificial Intelligence: Algorithms that learn from data.

Data Science: Skills and techniques for dealing with challenging data.

There’s an enormous amount of overlap between the two and they are not exclusive.

Back in the day, a machine was just a machine that did whatever it was supposed to do: things like stamping metal, or washing your clothes with a fair amount of help on your part. A machine was defined as a device that uses mechanical power to perform a particular task.

But nowadays, there is a big change in the way we think about machines. They are required to do more than just the given mechanical functions. They are supposed to be smart, to learn about us, and to adjust their functions according to their sensors in order to meet our different desires.

Examples of useful tasks machines can learn to do easily:

Email Classification (spam / not spam)
Image Identification (human face / not)

Machine Learning

It is the ability of algorithms to learn from data in a way that improves how they perform in the future.

How do machines learn?

They learn by searching for patterns in large amounts of data, and once they find one, they adjust the program to reflect the “truth” of what they found. In general, the more data you expose the machine to, the smarter it gets.
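As a rough illustration of "adjusting the program" when a pattern is found, here is a minimal sketch of a perceptron, one of the simplest learning algorithms. It is not taken from the course; the toy data (the logical AND of two inputs) and the names `predict` and `train_perceptron` are assumptions made for this example.

```python
# Toy labeled data (an assumption for illustration): two inputs and the
# answer we want the machine to learn, here the logical AND of the inputs.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, max_epochs=20):
    """Repeatedly scan the data and nudge the weights after each mistake."""
    weights, bias = [0, 0], 0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Adjust the "program" (its weights) toward the right answer.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


weights, bias = train_perceptron(examples)
for inputs, target in examples:
    print(inputs, "->", predict(weights, bias, inputs), "(expected", target, ")")
```

Nobody tells the algorithm what the rule is. It starts with all weights at zero and only changes them when a prediction disagrees with a label.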

How Humans Learn vs. How Machines Learn

The great thing about machine learning is that it needs only a small amount of human intervention. You don’t need to specify all the criteria or create a huge flow chart of if-this-then-that statements. That would be an Expert System, an older approach that has been found to have limited utility.
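To make the contrast concrete, here is a small sketch under stated assumptions: the single feature (the number of exclamation marks in an email), the hard-coded cutoff of 5, and the six labeled examples are all hypothetical. The first function is an expert-system-style rule written by a person; the second derives its cutoff from labeled data instead.

```python
def rule_based_is_spam(exclamations):
    """Expert-system style: a human chose the cutoff of 5 by hand."""
    return exclamations >= 5


def learn_threshold(examples):
    """Pick the cutoff that makes the fewest mistakes on the labeled data."""
    candidates = sorted({count for count, _ in examples})
    best_threshold, best_errors = None, len(examples) + 1
    for threshold in candidates:
        errors = sum((count >= threshold) != is_spam for count, is_spam in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


# Hypothetical labeled examples: (exclamation marks, is the email spam?)
examples = [(0, False), (1, False), (2, False), (3, True), (4, True), (6, True)]

learned = learn_threshold(examples)
rule_errors = sum(rule_based_is_spam(count) != is_spam for count, is_spam in examples)
learned_errors = sum((count >= learned) != is_spam for count, is_spam in examples)
print("learned cutoff:", learned)
print("mistakes with hand-written rule:", rule_errors)
print("mistakes with learned cutoff:", learned_errors)
```

With one feature the difference looks small, but a hand-written system needs a rule like this for every criterion, while the learned version only needs more labeled data.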

The approach to teaching a machine is to train it.

How to train it:

Show the algorithm millions of labeled examples.
The algorithm finds its own distinctive features. The features it bases its classifications on may not be relevant or visible to humans.
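The two steps above can be sketched on a very small scale. The example below is an assumption-laden toy, not the course's method: it uses six made-up messages instead of millions, and a naive-Bayes-style word score (log-odds with add-one smoothing) as the learning technique. The names `train` and `classify` are hypothetical.

```python
import math
from collections import Counter

# Step 1: labeled examples (made up for illustration).
labeled = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("claim free tickets now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday with the team", "ham"),
    ("please review the team report", "ham"),
]


def train(examples):
    """Step 2: score each word by how strongly it points to spam vs. ham."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    vocab = set(counts["spam"]) | set(counts["ham"])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocab:
        # Add-one smoothing keeps words unseen in one class from dividing by zero.
        p_spam = (counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
        p_ham = (counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
        weights[word] = math.log(p_spam / p_ham)
    return weights


def classify(weights, text):
    """Sum the word scores; unknown words contribute nothing."""
    score = sum(weights.get(word, 0.0) for word in text.split())
    return "spam" if score > 0 else "ham"


weights = train(labeled)
print(classify(weights, "free prize"))
print(classify(weights, "team meeting monday"))
# The words the algorithm found most indicative of spam:
print(sorted(weights, key=weights.get, reverse=True)[:3])
```

No one told the algorithm which words matter; the weights come entirely from the labels. In a real system the learned features are far more numerous and often much harder for a person to read than single words.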

The relationship between Machine Learning and Data Science

Data Science can definitely be done without Machine Learning.

Examples: any traditional classification task (Logistic Regression, Decision Tree), as well as Predictive Models and Sentiment Analysis. These are usually not considered Machine Learning.

Machine learning without data science is not very useful.

It can, however, be done without extensive domain expertise.

It is better to think about Machine Learning as a subdiscipline of Data Science.

One Machine Learning algorithm that has been responsible for nearly all of the amazing recent developments in Machine Learning is the Neural Network, or Artificial Neural Network (ANN).

Artificial Neural Network

It is an information processing model inspired by the way biological nervous systems, such as the brain, process information. ANNs are loosely modeled after the neuronal structure of the mammalian cerebral cortex, but on a much smaller scale.

Circles represent the neurons and the lines represent connections like the connections between neurons in a biological brain.

The theory has existed for years, but computing power has only recently caught up to the demands that the theory places on it.
Also, thanks to social media, the availability of labeled data has recently caught up too.

Now, using the combination of theory, computing power, and raw data, it is possible to do computations that resemble what goes on in the human brain.

How ANNs work

The idea is to take very basic pieces of information and feed them into the nodes of an ANN; by connecting these nodes with many other nodes, you can give rise to very high-level cognitive decisions and classifications.

Input Layer: This is where your raw data comes in. The network then starts to combine the entered information and passes it to one or more hidden layers.

Hidden Layer: This is where the nonlinear transformations of the entered inputs are performed.

Output Layer: This is where you get the final classification or decision about what is happening.
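The three layers described above can be sketched as a tiny forward pass. This is a minimal illustration under stated assumptions: the network has two inputs, two hidden nodes, and one output; the weights are hand-picked for the example (a real network would learn them from data); and the sigmoid function is just one common choice of nonlinear transformation. With these weights the network computes XOR, something no single straight-line rule on the raw inputs can do.

```python
import math


def sigmoid(z):
    """A common nonlinear transformation that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Each node takes a weighted sum of its inputs, then applies the sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hand-picked weights (an assumption for illustration, not learned values).
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]  # node 1 acts like OR, node 2 like NAND
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]                  # output node acts like AND
OUTPUT_B = [-30.0]


def forward(inputs):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer(inputs, HIDDEN_W, HIDDEN_B)   # nonlinear transformations
    output = layer(hidden, OUTPUT_W, OUTPUT_B)   # final decision
    return output[0]


for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, "->", round(forward(pair)))
```

Even in this tiny network, the numbers stored in the weights say little about why a particular answer came out, which hints at why larger networks are hard to interpret.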

Just like a human brain, things can get a little complicated in a neural network, or really massively complicated. And it can be hard to know exactly what is happening inside.

Black Box Model

A black box model limits your ability to interpret what’s going on.

A Neural Network is considered to be a black box: data goes in and an output or decision comes out, but we cannot be sure of what happened in between.

P. S.

In this story, I shared what I summarized and understood from a chapter of a LinkedIn course (Data Science Foundations: Fundamentals, by Barton Poulson), in addition to other things I researched and studied on my own.
