Aytan Hajiyeva — November 9, 2021

This article was published as a part of the Data Science Blogathon

In this article, we will build several neural network models to solve a regression problem. We will start with a simple model, improve it step by step to get better predictions, and follow that progress visually in plots. There are multiple definitions of a regression problem, but for our purposes we will simplify it: we are going to predict a number.

Let’s start from the very beginning.

## What is regression?

Regression means predicting a numeric value, typically a continuous one. For example, if the model we build should predict a person’s age, earnings, or years of experience, or if we want to find out how such values relate to other characteristics of the person, we are facing a regression problem.

## What is a neural network?

A neural network is a series of algorithms, loosely inspired by the human brain, that detects underlying patterns in a set of data. A “neuron” in a neural network is a mathematical function that collects and classifies information according to a specific architecture.

Each of these topics deserves a long and detailed discussion, but my goal in this article is to touch briefly on the important points and then build a model and work on it together. If you write the code along with me and get the results yourself, everything will be more fun and memorable. So let’s get our hands dirty.

First, let’s start with importing some libraries that we will use at the beginning:

```
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

print(tf.__version__)
```

We are dealing with a regression problem, and we will create our dataset:

```
X = np.arange(-110, 110, 3)
y = np.arange(-100, 120, 3)
```
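
As a quick sanity check on this dataset, the sketch below fits a straight line to X and y with NumPy's `np.polyfit`. It is not part of the original workflow; it only illustrates that the relationship the network has to learn here is y = X + 10.

```python
import numpy as np

# Recreate the dataset from the article.
X = np.arange(-110, 110, 3)
y = np.arange(-100, 120, 3)

# Least-squares fit of a degree-1 polynomial: y ≈ slope * X + intercept.
slope, intercept = np.polyfit(X, y, deg=1)

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

Knowing the true relationship makes it easier to judge the predictions of the models we build later.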

One important point in neural networks is the input and output shapes. The input shape is the shape of the data that we train the model on, and the output shape is the shape of the data that we expect to come out of our model. Here we will use X to predict y, so X is our input and y is our output.

```
X.shape, y.shape
# >> ((74,), (74,))
```

Here we can see that X and y have the same shape. In real life that is not always the case, so we should check the shapes and fix them if needed before we build a model. Let’s start building our model with TensorFlow. There are 3 typical steps to creating a model in TensorFlow:

• Creating a model – connect the layers of the neural network yourself, using either the Sequential or the Functional API, or import a previously built model (an approach known as transfer learning).
• Compiling a model – define how the model’s performance is measured (loss and metrics) and which optimizer should be used.
• Fitting a model – introduce the model to the data and let it find patterns.

We have already created our dataset, so we can start modeling right away, but first we need to split it into training and test sets.

```
len(X)
# >> 74

X_train = X[:60]
y_train = y[:60]
X_test = X[60:]
y_test = y[60:]

len(X_train), len(X_test)
# >> (60, 14)
```
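
One detail of this split is worth noticing: because the arrays are sliced in order, every test input is larger than every training input, so the model will be evaluated on values outside the range it was trained on. The small NumPy check below is only an illustration of that observation, not a step from the original workflow.

```python
import numpy as np

# Same dataset and split as in the article.
X = np.arange(-110, 110, 3)
X_train, X_test = X[:60], X[60:]

print("train range:", X_train.min(), "to", X_train.max())
print("test range:", X_test.min(), "to", X_test.max())

# True when no test input falls inside the training range.
test_is_outside_training_range = bool(X_test.min() > X_train.max())
print(test_is_outside_training_range)
```

With real-world data, a random split is a common alternative, but the ordered split keeps this example easy to visualize.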

The best way of getting more insight into our data is by visualizing it! So, let’s do it!

```
plt.figure(figsize=(12, 6))
plt.scatter(X_train, y_train, c='b', label='Training data')
plt.scatter(X_test, y_test, c='g', label='Testing data')
plt.legend()
```

## Building Regression model with Neural Network

Now we can start to build some models.

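```
tf.random.set_seed(42)  # set the random seed for reproducibility

# 1. Create a model with a single Dense layer of one unit
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])

# 2. Compile the model
model.compile(loss=tf.keras.losses.mae,  # mae stands for mean absolute error
              optimizer=tf.keras.optimizers.SGD(),  # stochastic gradient descent
              metrics=['mae'])

# 3. Fit the model
# Note: newer TensorFlow releases may require a 2-D input here,
# e.g. tf.expand_dims(X_train, axis=-1)
model.fit(X_train, y_train, epochs=10)
```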
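
To make concrete what this model can represent: a single `Dense(1)` layer with no activation computes `w * x + b` for a one-feature input. The NumPy sketch below uses hand-picked, hypothetical values for `w` and `b` (not trained weights) and a helper named `dense_forward` that I introduce only for illustration. It shows that such a layer could fit our data exactly with w = 1 and b = 10.

```python
import numpy as np

def dense_forward(x, w, b):
    """Forward pass of one linear layer: output = x @ w + b."""
    x = np.asarray(x, dtype=np.float32).reshape(-1, 1)  # one column of samples
    return x @ w + b

# Hypothetical weights chosen by hand for illustration (not learned values).
w = np.array([[1.0]], dtype=np.float32)  # shape (in_features, units)
b = np.array([10.0], dtype=np.float32)   # shape (units,)

X = np.arange(-110, 110, 3)
y = np.arange(-100, 120, 3)

out = dense_forward(X, w, b)
print(out.shape)                      # (74, 1)
print(np.allclose(out.squeeze(), y))  # True
```

Training is the process of finding values of `w` and `b` close to these, starting from a random initialization.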

We have just built a model and trained it! What about checking its predictions?

```
# The input 130 is inferred from the expected output of 140 (y = X + 10)
model.predict(np.array([130.0]))
# >> array([[126.92796]], dtype=float32)
```

The output should have been close to 140. It is not close enough, so let’s start to improve our model.

```
tf.random.set_seed(42)

model_1 = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])

model_1.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_1.fit(X_train, y_train, epochs=100, verbose=0)
```

Here we have increased the number of epochs to train the model for longer. Also, with “verbose = 0” the training progress is no longer printed. Let’s check the predictions again.

```
preds = model_1.predict(X_test)
preds
# >> array([[ 81.19372 ],
#           [ 84.61617 ],
#           [ 88.03863 ],
#           [ 91.46108 ],
#           [ 94.88354 ],
#           [ 98.306   ],
#           [101.728455],
#           [105.15091 ],
#           [108.573364],
#           [111.99582 ],
#           [115.418274],
#           [118.84073 ],
#           [122.26318 ],
#           [125.68564 ]], dtype=float32)
```

As I mentioned above, visualization always helps us understand the data better, so let’s build a function and use it every time we want to plot predictions.

```
def plot_preds(traindata=X_train,
               trainlabels=y_train,
               testdata=X_test,
               testlabels=y_test,
               predictions=preds):
    # Plot training data, test data, and predictions on one figure
    plt.figure(figsize=(12, 6))
    plt.scatter(traindata, trainlabels, c="b", label="Training data")
    plt.scatter(testdata, testlabels, c="g", label="Testing data")
    plt.scatter(testdata, predictions, c="r", label="Predictions")
    plt.legend()
```

```
plot_preds(traindata=X_train,
           trainlabels=y_train,
           testdata=X_test,
           testlabels=y_test,
           predictions=preds)
```

As we can see from the plot, our predictions are not perfect, but they are quite good so far. Let’s evaluate the predictions, then see if we can do better.

## Evaluate the Regression Model with Neural Network

For regression problems, two common evaluation metrics are MAE (mean absolute error) and MSE (mean squared error). We trained our model with MAE, so let’s compare the predictions to the real values:

```
mae = tf.metrics.mean_absolute_error(y_true=y_test,
                                     y_pred=preds)
mae
# >> a tensor with 14 values instead of a single number
```

What? MAE should be a single value, but we got 14 values instead. What is the reason for that? It is the result of mismatched shapes. Let’s prove it.

```
y_test.shape, preds.shape
# >> ((14,), (14, 1))
```
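
The 14 values come from broadcasting. The plain NumPy sketch below uses made-up placeholder predictions rather than the real ones. It shows that subtracting a (14, 1) array from a (14,) array gives a (14, 14) grid of differences, and averaging over the last axis then leaves 14 numbers. That the TensorFlow metric function averages over the last axis is my understanding of its behavior, so treat that part as an assumption.

```python
import numpy as np

# Same values as y_test in the article, shape (14,).
y_true = np.arange(80, 120, 3, dtype=np.float32)

# Placeholder predictions (true value + 4), shaped (14, 1) like model output.
y_pred = (y_true + 4.0).reshape(-1, 1)

diff = np.abs(y_true - y_pred)  # broadcasts to shape (14, 14)
per_row = diff.mean(axis=-1)    # shape (14,): one value per row

# After squeezing, the shapes match and we get a single number.
fixed = float(np.abs(y_true - y_pred.squeeze()).mean())

print(diff.shape, per_row.shape, fixed)
```

Removing the extra dimension from the predictions, as done next, avoids the unintended broadcast.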

Yes, y_test and preds have different shapes. Fortunately, we can fix it:

```
preds.squeeze().shape
# >> (14,)
```

Voila! Let’s calculate the metrics again:

```
mae = tf.metrics.mean_absolute_error(y_true=y_test,
                                     y_pred=preds.squeeze()).numpy()
mse = tf.metrics.mean_squared_error(y_true=y_test,
                                    y_pred=preds.squeeze()).numpy()
mae, mse
# >> (3.9396794, 18.42119)
```
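
For reference, both metrics are simple enough to write by hand. The sketch below is a plain NumPy version with small made-up numbers. The function names are mine, chosen for illustration, and this is not how TensorFlow implements them internally.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of |y_true - y_pred| over all samples."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    return float(np.mean(np.abs(y_true - y_pred)))

def mean_squared_error(y_true, y_pred):
    """Average of (y_true - y_pred)^2 over all samples."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up example: the errors are 2, 1 and 3.
y_true_demo = [10.0, 20.0, 30.0]
y_pred_demo = [12.0, 19.0, 33.0]

print(mean_absolute_error(y_true_demo, y_pred_demo))  # 2.0
print(mean_squared_error(y_true_demo, y_pred_demo))   # 14 / 3, about 4.667
```

Because MSE squares each error, it penalizes large errors more heavily than MAE does.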

You may ask why we called “numpy()” at the end of each metric calculation. I did this on purpose: it converts the resulting tensors into plain NumPy values, which we will collect in a DataFrame at the end.
So, let’s keep working!

## Improve the Regression model with neural network

```
tf.random.set_seed(42)

model_2 = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tf.keras.layers.Dense(1)
])

model_2.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_2.fit(X_train, y_train, epochs=100, verbose=0)
```

Here we just replicated the first model and added an extra layer to see how it works.

```
preds_2 = model_2.predict(X_test)
plot_preds(predictions=preds_2)
```

```
mae_2 = tf.metrics.mean_absolute_error(y_true=y_test,
                                       y_pred=preds_2.squeeze()).numpy()
mse_2 = tf.metrics.mean_squared_error(y_true=y_test,
                                      y_pred=preds_2.squeeze()).numpy()
mae_2, mse_2
# >> (41.150764, 1738.0294)
```

It seems the extra layer didn’t make our model better. Let’s try changing the optimizer: model_3 below omits the optimizer argument, so Keras falls back to its default (RMSprop).

```
tf.random.set_seed(42)

model_3 = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tf.keras.layers.Dense(1)
])

model_3.compile(loss=tf.keras.losses.mae,
                metrics=['mae'])

model_3.fit(X_train, y_train, epochs=100, verbose=0)
```

```
preds_3 = model_3.predict(X_test)

mae_3 = tf.metrics.mean_absolute_error(y_true=y_test,
                                       y_pred=preds_3.squeeze()).numpy()
mse_3 = tf.metrics.mean_squared_error(y_true=y_test,
                                      y_pred=preds_3.squeeze()).numpy()
mae_3, mse_3
# >> (39.64795, 1587.9797)
```

```
tf.random.set_seed(42)

model_4 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_4.compile(loss=tf.keras.losses.mae,
                metrics=['mae'])

model_4.fit(X_train, y_train, epochs=100, verbose=0)
```

This time we have added one extra layer and some extra neurons to make our predictions better. Let’s check it out.

```
preds_4 = model_4.predict(X_test)
plot_preds(predictions=preds_4)
```

We are getting close!

```
mae_4 = tf.metrics.mean_absolute_error(y_true=y_test,
                                       y_pred=preds_4.squeeze()).numpy()
mse_4 = tf.metrics.mean_squared_error(y_true=y_test,
                                      y_pred=preds_4.squeeze()).numpy()
mae_4, mse_4
# >> (8.184728, 67.23798)
```
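
A side note on the models so far: none of them uses an activation function, and a stack of Dense layers without activations is mathematically still one linear function of the input. The NumPy sketch below uses random, hypothetical weights (not the trained ones) to show that two linear layers collapse into a single equivalent layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random weights for two stacked linear layers (1 -> 10 -> 1).
W1, b1 = rng.normal(size=(1, 10)), rng.normal(size=10)
W2, b2 = rng.normal(size=(10, 1)), rng.normal(size=1)

x = np.arange(-110, 110, 3, dtype=np.float64).reshape(-1, 1)

# Two layers applied one after the other, with no activation in between.
two_layers = (x @ W1 + b1) @ W2 + b2

# The equivalent single linear layer.
W = W1 @ W2
b = b1 @ W2 + b2
one_layer = x @ W + b

print(np.allclose(two_layers, one_layer))  # True
```

So any difference between these purely linear models comes from how training went (optimizer, initialization, number of parameters to adjust), not from what the architecture is able to express.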

Neural networks can use many activation functions; two of the most common are Sigmoid and ReLU. Let’s try ReLU to see whether it works for our model:

```
tf.random.set_seed(42)

model_5 = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(1)
])

model_5.compile(loss=tf.keras.losses.mae,
                metrics=['mae'])

model_5.fit(X_train, y_train, epochs=100, verbose=0)
```

```
preds_5 = model_5.predict(X_test)

mae_5 = tf.metrics.mean_absolute_error(y_true=y_test,
                                       y_pred=preds_5.squeeze()).numpy()
mse_5 = tf.metrics.mean_squared_error(y_true=y_test,
                                      y_pred=preds_5.squeeze()).numpy()
mae_5, mse_5
# >> (8.17538, 72.70614)
```
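
For reference, the two activation functions mentioned above are short enough to write out. This is a minimal NumPy sketch of their standard definitions, not the Keras implementation.

```python
import numpy as np

def relu(x):
    """ReLU: keeps positive values, replaces negative values with 0."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))     # [0. 0. 2.]
print(sigmoid(x))  # roughly [0.119 0.5 0.881]
```

Both functions are non-linear, which is what lets a network with hidden layers model more than a straight line.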

It is always good to check more possible combinations, because, I promise, this is the best way to both make better predictions and learn more each time.

```
tf.random.set_seed(42)

model_6 = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_6.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_6.fit(X_train, y_train, epochs=100, verbose=0)
```

Here we have switched the optimizer back to SGD() and checked its performance together with the ReLU activation function.

```
preds_6 = model_6.predict(X_test)

mae_6 = tf.metrics.mean_absolute_error(y_true=y_test,
                                       y_pred=preds_6.squeeze()).numpy()
mse_6 = tf.metrics.mean_squared_error(y_true=y_test,
                                      y_pred=preds_6.squeeze()).numpy()
mae_6, mse_6
# >> (1.4528008, 3.1021771)

plot_preds(predictions=preds_6)
```

And we have just made almost perfect predictions! Let’s compare the evaluation results of all our models:

```
model_results = [['model_1', mae, mse],
                 ['model_2', mae_2, mse_2],
                 ['model_3', mae_3, mse_3],
                 ['model_4', mae_4, mse_4],
                 ['model_5', mae_5, mse_5],
                 ['model_6', mae_6, mse_6]]
```

```
import pandas as pd

all_results = pd.DataFrame(model_results, columns=["model", "mae", "mse"])
all_results
```
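
To read the comparison at a glance, the models can be ranked by MAE. The sketch below uses plain Python and the metric values copied from the outputs shown earlier in the article; your own numbers may differ from run to run.

```python
# MAE and MSE values copied from the outputs shown earlier in the article.
model_results = [
    ("model_1", 3.9396794, 18.42119),
    ("model_2", 41.150764, 1738.0294),
    ("model_3", 39.64795, 1587.9797),
    ("model_4", 8.184728, 67.23798),
    ("model_5", 8.17538, 72.70614),
    ("model_6", 1.4528008, 3.1021771),
]

# Sort from lowest (best) to highest (worst) MAE.
ranked = sorted(model_results, key=lambda row: row[1])

for name, mae_value, mse_value in ranked:
    print(f"{name}: mae={mae_value:.3f}, mse={mse_value:.3f}")

best_name = ranked[0][0]
print("best model by MAE:", best_name)
```

With these numbers, model_6 comes out on top and the two-layer models trained without activations (model_2 and model_3) come last.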

## Conclusion

Together we created 6 different models, visualized them, and improved them step by step. The most important point I want to make in this article is that not every solution works for every model and problem. To find the optimal solution we need to practice and experiment. A neural network learns from examples, loosely like the human brain does, so there is no need to be afraid of looking for the optimal solution by evaluating many possible options. After building models for similar problems several times, you will be able to anticipate which API and which combination of parameters works best for which problem.

I hope the article was useful to you. 