Neural Network for Regression with TensorFlow
In this article, I am going to build multiple neural network models to solve a regression problem. Before we start working on the models, here is a brief overview of what we will cover and the steps we will follow. We will build a neural network model together, then improve it step by step to achieve better predictions, and visualize that progress in graphs along the way. There are many definitions of a regression problem, but in our case we will simplify it: we are going to predict a number.
Let’s start from the very beginning.
What is regression?
For example, if the model we build should predict a discrete or continuous numeric value, such as a person's age, earnings, or years of experience, or should find out how these values relate to the person, then we are facing a regression problem.
What is a neural network?
Loosely inspired by the human brain, a neural network is a series of algorithms that detects underlying patterns in a set of data. A "neuron" in a neural network is a mathematical function that searches for and classifies patterns according to a specific architecture.
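To make that concrete, here is a minimal sketch in plain NumPy (my illustration, not part of the original article) of what a single neuron computes: a weighted sum of its inputs plus a bias, passed through an activation function:

import numpy as np

def neuron(x, w, b):
    # weighted sum of the inputs plus a bias, passed through a ReLU activation
    z = np.dot(w, x) + b
    return max(z, 0.0)

# illustrative weights and bias for a neuron with 3 inputs
print(neuron(x=[1.0, 2.0, 3.0], w=[0.2, -0.5, 0.1], b=0.4))  # 0.0, ReLU clips the negative sum

Training a network simply means adjusting all the w and b values so that outputs like this get closer to the targets.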
Each of these topics deserves a long, detailed discussion of its own, but my goal in this article is to build a model and work on it together after briefly touching on the important points. If you write the code along with me and get the results yourself, everything will be more fun and memorable. So let's get our hands dirty.
First, let’s start with importing some libraries that we will use at the beginning:
import tensorflow as tf
print(tf.__version__)
import numpy as np
import matplotlib.pyplot as plt
We are dealing with a regression problem, and we will create our dataset:
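The exact dataset-creation code is not shown here, so as a stand-in, here is a hypothetical sketch that produces 74 points with a simple linear relationship, matching the shapes used below (the article's actual values may differ):

# hypothetical dataset; the article's exact values are not shown
X = np.arange(-100.0, 196.0, 4.0)  # 74 evenly spaced inputs
y = X + 10.0                       # a simple linear relationship for the model to learn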
One important point in neural networks is the input and output shapes. The input shape is the shape of the data we train the model on, and the output shape is the shape of the data we expect the model to produce. Here we will use X to predict y, so X is our input and y is our output.
X.shape, y.shape
>> ((74,), (74,))
Here we can see that our tensors have the same shape, but in real life it may not always be that way, so we should check and fix the shapes if needed before building a model. Let's start building our model with TensorFlow. There are three typical steps to creating a model in TensorFlow:
- Creating a model – connect the layers of the neural network yourself, using either the Sequential or the Functional API, or import a previously built model (this is called transfer learning).
- Compiling a model – at this step, we define how to measure the model's performance and which optimizer to use.
- Fitting a model – at this step, we introduce the model to the data and let it find patterns.
We've already created our dataset, so we can start modeling right away, but first we need to split it into training and test sets.
len(X)
>> 74

X_train = X[:60]
y_train = y[:60]
X_test = X[60:]
y_test = y[60:]

len(X_train), len(X_test)
>> (60, 14)
The best way to get more insight into our data is to visualize it! So, let's do it!
plt.figure(figsize=(12, 6))
plt.scatter(X_train, y_train, c='b', label='Training data')
plt.scatter(X_test, y_test, c='g', label='Testing data')
plt.legend()
[Plot: training data (blue) and testing data (green)]
Building a Regression Model with a Neural Network
Now we can start to build some models.
tf.random.set_seed(42)  # first we set a random seed

model = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])

model.compile(loss=tf.keras.losses.mae,             # mae stands for mean absolute error
              optimizer=tf.keras.optimizers.SGD(),  # stochastic gradient descent
              metrics=['mae'])

model.fit(X_train, y_train, epochs=10)
[Image: epoch-by-epoch training output]
We have just built a model and trained it! What about checking its predictions?
model.predict([130])
>> array([[126.92796]], dtype=float32)
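A small caveat: depending on your TensorFlow/Keras version, passing a plain Python list to predict() may raise an error; if it does, wrap the value in a NumPy array with an explicit batch dimension:

# equivalent call that is safe across Keras versions
model.predict(np.array([[130.0]]))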
tf.random.set_seed(42)

model_1 = tf.keras.Sequential([
    tf.keras.layers.Dense(1)
])

model_1.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_1.fit(X_train, y_train, epochs=100, verbose=0)
Here we have increased the number of epochs to train the model for longer. We have also added verbose=0, which hides the per-epoch training output (the kind shown in the picture above). Let's check the predictions again.
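As a side note, fit() returns a History object; capturing it lets us plot how the loss develops over the epochs (an optional sketch, not in the original code):

# note: calling fit() again continues training from the current weights
history = model_1.fit(X_train, y_train, epochs=100, verbose=0)
plt.plot(history.history['loss'])
plt.xlabel('Epoch')
plt.ylabel('MAE loss')
plt.show()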
preds = model_1.predict(X_test)
preds
>> array([[ 81.19372 ],
          [ 84.61617 ],
          [ 88.03863 ],
          [ 91.46108 ],
          [ 94.88354 ],
          [ 98.306   ],
          [101.728455],
          [105.15091 ],
          [108.573364],
          [111.99582 ],
          [115.418274],
          [118.84073 ],
          [122.26318 ],
          [125.68564 ]], dtype=float32)
def plot_preds(traindata=X_train, trainlabels=y_train,
               testdata=X_test, testlabels=y_test,
               predictions=preds):
    plt.figure(figsize=(12, 6))
    plt.scatter(traindata, trainlabels, c="b", label="Training data")
    plt.scatter(testdata, testlabels, c="g", label="Testing data")
    plt.scatter(testdata, predictions, c="r", label="Predictions")
    plt.legend()
plot_preds(traindata=X_train, trainlabels=y_train,
           testdata=X_test, testlabels=y_test,
           predictions=preds)
[Plot: training data, testing data, and model_1 predictions]
Evaluate the Regression Model with a Neural Network
mae = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds)
mae

Instead of a single number, this returns a tensor of 14 values, because y_test and preds are broadcast against each other. Something is off, so let's compare the shapes:
y_test.shape, preds.shape
>> ((14,), (14, 1))
Yes, y_test and preds have different shapes; fortunately, we can fix that:
preds.squeeze().shape
>> (14,)
mae = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds.squeeze()).numpy()
mse = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds.squeeze()).numpy()
mae, mse
>> (3.9396794, 18.42119)
You may ask why we call the numpy() function at the end of each line. I have done this on purpose because, at the end, we will collect our calculations into a DataFrame.
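Since we will compute these two metrics for every model that follows, a small helper function (my own addition, not part of the original walkthrough) can save some repetition:

def evaluate_preds(y_true, y_pred):
    # squeeze the predictions so both tensors have shape (n,)
    mae = tf.metrics.mean_absolute_error(y_true=y_true, y_pred=tf.squeeze(y_pred)).numpy()
    mse = tf.metrics.mean_squared_error(y_true=y_true, y_pred=tf.squeeze(y_pred)).numpy()
    return mae, mse

I will keep the explicit calls below so every step stays visible, but evaluate_preds(y_test, preds) gives the same pair of numbers.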
So, let’s keep working!
Improve the Regression Model with a Neural Network
tf.random.set_seed(42)

model_2 = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tf.keras.layers.Dense(1)
])

model_2.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_2.fit(X_train, y_train, epochs=100, verbose=0)
Here we just replicated the first model and added an extra layer to see how it performs.
preds_2 = model_2.predict(X_test)
plot_preds(predictions=preds_2)
[Plot: model_2 predictions against training and testing data]
mae_2 = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds_2.squeeze()).numpy()
mse_2 = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds_2.squeeze()).numpy()
mae_2, mse_2
>> (41.150764, 1738.0294)
It seems the extra layer didn't make our model any better. Let's try changing the optimizer.
tf.random.set_seed(42)

model_3 = tf.keras.Sequential([
    tf.keras.layers.Dense(1),
    tf.keras.layers.Dense(1)
])

model_3.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['mae'])

model_3.fit(X_train, y_train, epochs=100, verbose=0)
We have used the Adam() optimizer instead of SGD().
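Note that both optimizers also accept a learning rate, which is often the most important hyperparameter to tune, and their defaults differ:

# default learning rates, shown explicitly; tune these if training stalls
tf.keras.optimizers.Adam(learning_rate=0.001)
tf.keras.optimizers.SGD(learning_rate=0.01)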
preds_3 = model_3.predict(X_test)

mae_3 = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds_3.squeeze()).numpy()
mse_3 = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds_3.squeeze()).numpy()
mae_3, mse_3
>> (39.64795, 1587.9797)
tf.random.set_seed(42)

model_4 = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_4.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['mae'])

model_4.fit(X_train, y_train, epochs=100, verbose=0)
This time we have added an extra layer and some extra neurons to improve our predictions. Let's check it out.
preds_4 = model_4.predict(X_test)
plot_preds(predictions=preds_4)
[Plot: model_4 predictions against training and testing data]
We are getting close!
mae_4 = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds_4.squeeze()).numpy()
mse_4 = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds_4.squeeze()).numpy()
mae_4, mse_4
>> (8.184728, 67.23798)
Two of the most commonly used activation functions in neural networks are sigmoid and ReLU. Let's check whether they work for our model:
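Before trying them in a model, a quick illustrative check (not from the original article) shows what each function does: ReLU zeroes out negative inputs, while sigmoid squashes everything into the range (0, 1):

inputs = tf.constant([-2.0, 0.0, 2.0])
print(tf.keras.activations.relu(inputs).numpy())     # [0. 0. 2.]
print(tf.keras.activations.sigmoid(inputs).numpy())  # approx. [0.119 0.5 0.881]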
tf.random.set_seed(42)

model_5 = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(1)
])

model_5.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['mae'])

model_5.fit(X_train, y_train, epochs=100, verbose=0)
preds_5 = model_5.predict(X_test)

mae_5 = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds_5.squeeze()).numpy()
mse_5 = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds_5.squeeze()).numpy()
mae_5, mse_5
>> (8.17538, 72.70614)
It is always good to check more combinations, because, I promise, this is the best way both to get better predictions and to learn more each time.
tf.random.set_seed(42)

model_6 = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation=tf.keras.activations.relu),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1)
])

model_6.compile(loss=tf.keras.losses.mae,
                optimizer=tf.keras.optimizers.SGD(),
                metrics=['mae'])

model_6.fit(X_train, y_train, epochs=100, verbose=0)
Here we changed the optimizer back to SGD() and checked its performance together with the ReLU activation function.
preds_6 = model_6.predict(X_test)

mae_6 = tf.metrics.mean_absolute_error(y_true=y_test, y_pred=preds_6.squeeze()).numpy()
mse_6 = tf.metrics.mean_squared_error(y_true=y_test, y_pred=preds_6.squeeze()).numpy()
mae_6, mse_6
>> (1.4528008, 3.1021771)
[Plot: model_6 predictions closely matching the testing data]
And we have just made almost perfect predictions! Let's compare the evaluation results of all our models:
import pandas as pd

model_results = [['model_1', mae, mse],
                 ['model_2', mae_2, mse_2],
                 ['model_3', mae_3, mse_3],
                 ['model_4', mae_4, mse_4],
                 ['model_5', mae_5, mse_5],
                 ['model_6', mae_6, mse_6]]

all_results = pd.DataFrame(model_results, columns=["model", "mae", "mse"])
all_results
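To spot the best model at a glance, we can also sort the table by error (a small optional step):

all_results.sort_values(by="mae")  # lowest error first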
[Table: mae and mse for model_1 through model_6]
Conclusion
Together we created six different models, visualized them, and improved them step by step. The most important point I want to show in this article is that no single solution works for every model and problem; to find the optimal setup, we need to practice and experiment. There is no need to be afraid of searching for the best solution by evaluating the possible options. After building and tuning models on similar problems several times, you will be able to anticipate which API and which combination of parameters works best for which problem.
I hope the article was useful to you.