Predicting House Prices Using Keras Functional API

Vansh Saluja 09 May, 2022


This article was published as a part of the Data Science Blogathon.

Introduction

The Keras Functional API is a way of creating models that are more flexible than those built with the Sequential API. It can handle models with non-linear topology, shared layers, and multiple inputs or outputs. The Sequential API, by contrast, builds a model layer by layer as a single stack. That is enough for many problems, but it does not allow shared layers or multiple inputs or outputs.

In the functional API, a model is defined by creating layer instances, connecting each one directly to the layer before it, and then defining a Model that specifies which layers act as its input and output.

There are three main aspects of a functional model –

1. Defining the input – Unlike the Sequential model, a standalone Input layer must be created that specifies the shape of the input data. The Input layer takes a shape argument, a tuple that gives the dimensionality of one sample (the batch dimension is left out).

2. Connecting layers – The layers are connected in pairs: each new layer is called on the output of the previous layer, which is passed in parentheses right after the layer is created.

3. Creating the model – The Model takes two arguments, inputs and outputs, which must be the input and output tensors of the connected layers.

The Task

The problem statement is to predict house prices from the images and attributes of the houses. The dataset covers 535 houses, with 4 images of each house (bathroom, bedroom, frontal view and kitchen) and the house attributes (number of bedrooms, number of bathrooms, area, zipcode and price).


There are many ways to solve this problem, but the approach here uses the Keras Functional API. Each part of the house (e.g. the bedroom) gets its own convolutional neural network. The last hidden layers of these networks are concatenated to form the image branch, and the final hidden layer of that branch is in turn concatenated with the final hidden layer of the attribute branch and fed into the output layer.

So let's begin.

First, we will load the libraries. A functional model needs the following Keras imports in addition to the common libraries (NumPy, pandas, OpenCV and scikit-learn).

import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, MaxPooling2D, Conv2D, GlobalAveragePooling2D, Input, concatenate, Dropout

Loading house attributes –

# Read the whitespace-separated attribute file into a list of float rows
with open('HousesInfo.txt', 'r') as f:
    df = [[float(value) for value in line.split()] for line in f if line.strip()]
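
To make the expected file layout concrete, here is a small sketch that parses an in-memory sample in the same way and checks the number of fields per row. The two sample rows are made-up values, not taken from the dataset, and `parse_rows` is a hypothetical helper name.

```python
import io

def parse_rows(handle, n_fields=5):
    """Parse whitespace-separated numeric rows, skipping blank lines."""
    rows = []
    for line_number, line in enumerate(handle, start=1):
        parts = line.split()
        if not parts:
            continue  # ignore empty lines
        if len(parts) != n_fields:
            raise ValueError(
                f"line {line_number}: expected {n_fields} fields, got {len(parts)}"
            )
        rows.append([float(part) for part in parts])
    return rows

# Made-up sample in the assumed order: bedrooms, bathrooms, area, zipcode, price
sample = io.StringIO("3 2 1500 90001 300000\n4 3.5 2400 90002 550000\n\n")
rows = parse_rows(sample)
print(rows)
```

Failing early on a malformed row is a design choice for this sketch; it makes a truncated or misaligned line visible before it reaches the DataFrame.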

Converting the attributes into a DataFrame –

import pandas as pd

df = pd.DataFrame(df, columns=['Bedrooms','Bathrooms','Area','Zipcode','Price'])
df

The output is a DataFrame with one row per house and the columns Bedrooms, Bathrooms, Area, Zipcode and Price.

Now we create the input and output for the attributes, standardize them and split them into train and test sets –

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

y = df['Price']
X = df.drop('Price', axis=1)
columns = ['Area','Zipcode']
# Standardizing the non-categorical features and the prices
scaler = StandardScaler()
X = np.array(X.join(pd.DataFrame(scaler.fit_transform(X[columns]))).drop(columns, axis=1))
# The scaler is refit on the prices here, so afterwards it holds the price statistics
y = scaler.fit_transform(np.array(y).reshape(-1,1)).reshape(-1)

data_train, data_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
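
StandardScaler rescales a column to zero mean and unit variance. The arithmetic behind it can be sketched with the standard library alone, as below. The prices are made-up values, and `standardize` and `invert` are hypothetical helper names, not part of scikit-learn.

```python
import statistics

def standardize(values):
    """Return z-scores plus the mean and standard deviation used."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population std, which is what StandardScaler uses
    return [(value - mean) / std for value in values], mean, std

def invert(z_scores, mean, std):
    """Map standardized values back to the original scale."""
    return [z * std + mean for z in z_scores]

prices = [200000.0, 300000.0, 400000.0, 500000.0]  # made-up example prices
scaled, mean, std = standardize(prices)
restored = invert(scaled, mean, std)
print(scaled)
print(restored)
```

Because the model is trained on standardized prices, its predictions come out on that scale too. In the code above the scaler was last fit on the prices, so calling its `inverse_transform` method on a 2D array of predictions should bring them back to actual prices.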

Now we create the attribute branch of the model up to its hidden layer with the Keras Functional API, applying the aspects described above –

# The shape is a tuple: 4 attribute columns per house
input_data = Input(shape=(4,))
# Creating the branch for the attribute data; it is not trained on its own because it is combined with the image data later
model1 = Dense(4, activation='relu', kernel_initializer='uniform')(input_data)
model1 = Dense(12, activation='relu')(model1)

Now we load the images into an array of length 535, one entry per house, with each entry holding that house's 4 images.

import os
import cv2

# creating image dataset
DATADIR = r"C:\Users\KIIT\Downloads\House Images Dataset"
rooms = ['bathroom','bedroom','frontal','kitchen']
# One list of four file stems per house, e.g. ['1_bathroom', '1_bedroom', '1_frontal', '1_kitchen']
CATEGORIES = [[str(j) + '_' + room for room in rooms] for j in range(1, 536)]
data = []
for category in CATEGORIES:
    house_images = []
    for name in category:
        im_array = cv2.imread(os.path.join(DATADIR, name + '.jpg'))
        img_array = cv2.cvtColor(im_array, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
        new_array = cv2.resize(img_array, (100, 100))
        house_images.append(new_array)
    data.append(np.array(house_images))

# Converting the list into an array and dividing it by 255 to keep the values between 0 and 1.
X = np.array(data)
X = X/255
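
The loading loop relies on a fixed naming scheme: the house number, an underscore, and the room name. The sketch below builds the expected file names without touching the disk, which is a cheap way to check the scheme before reading any images. The `images` directory is a placeholder, not the author's path.

```python
import os

rooms = ['bathroom', 'bedroom', 'frontal', 'kitchen']
n_houses = 535

# One list of four file names per house, in the same room order for every house
file_names = [
    [f"{house}_{room}.jpg" for room in rooms]
    for house in range(1, n_houses + 1)
]

# 'images' is a placeholder directory used only for illustration
first_path = os.path.join('images', file_names[0][0])

print(len(file_names), len(file_names[0]))
print(file_names[0])
print(file_names[-1][-1])
```

Since every house contributes its images in the same room order, the final array has shape (535, 4, 100, 100, 3), and index 0 to 3 along the second axis always selects the bathroom, bedroom, frontal and kitchen image respectively.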

Now splitting the data into the 4 categories and creating training data for each of them –

# Fetching the different categories from the image data (axis 1 indexes the room)
bathroom = X[:, 0]
bedroom = X[:, 1]
frontal = X[:, 2]
kitchen = X[:, 3]

# The same test_size and random_state keep every split aligned with data_train and data_test
bathroom_train, bathroom_test, y_train, y_test = train_test_split(bathroom, y, test_size=0.15, random_state=42)

bedroom_train, bedroom_test, y_train, y_test = train_test_split(bedroom, y, test_size=0.15, random_state=42)

frontal_train, frontal_test, y_train, y_test = train_test_split(frontal, y, test_size=0.15, random_state=42)

kitchen_train, kitchen_test, y_train, y_test = train_test_split(kitchen, y, test_size=0.15, random_state=42)
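
The separate splits line up row for row only because every call uses the same number of samples, the same test_size and the same random_state. One way to make that alignment explicit is to split a single list of indices once and reuse it for every array. The following is a standard-library sketch of that idea, not scikit-learn's implementation; `split_indices` is a hypothetical helper and the prices are made up.

```python
import random

def split_indices(n_samples, test_size=0.15, seed=42):
    """Shuffle the indices once and cut them into a train and a test part."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_size)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(535)

# Any per-house sequence can now be split with the same indices
house_ids = list(range(1, 536))
prices = [1000.0 * house for house in house_ids]  # made-up prices tied to the house id
ids_train = [house_ids[i] for i in train_idx]
prices_train = [prices[i] for i in train_idx]

print(len(train_idx), len(test_idx))
```

With scikit-learn itself, the same effect can be had by passing all arrays to one `train_test_split` call, since the function accepts any number of arrays of equal length and returns a train and test part for each.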

Finally, we build the image part of the model with the functional API, following the three aspects described earlier.

1. Defining the Input for each category - 
Input_bath = Input(shape=(100,100,3))
Input_bed = Input(shape=(100,100,3))
Input_front = Input(shape=(100,100,3))
Input_kitchen = Input(shape=(100,100,3))

2. Now we connect the layers for each category and later combine them with the attributes –

bath = Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu')(Input_bath)
bath = MaxPooling2D(pool_size=(2,2))(bath)
bath = Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu')(bath)
bath = MaxPooling2D(pool_size=(2,2))(bath)
bath_final = GlobalAveragePooling2D()(bath)

bed = Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu')(Input_bed)
bed = MaxPooling2D(pool_size=(2,2))(bed)
bed = Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu')(bed)
bed = MaxPooling2D(pool_size=(2,2))(bed)
bed_final = GlobalAveragePooling2D()(bed)

front = Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu')(Input_front)
front = MaxPooling2D(pool_size=(2,2))(front)
front = Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu')(front)
front = MaxPooling2D(pool_size=(2,2))(front)
front_final = GlobalAveragePooling2D()(front)

kitchen = Conv2D(filters=32, kernel_size=(3,3), padding='same', activation='relu')(Input_kitchen)
kitchen = MaxPooling2D(pool_size=(2,2))(kitchen)
kitchen = Conv2D(filters=16, kernel_size=(3,3), padding='same', activation='relu')(kitchen)
kitchen = MaxPooling2D(pool_size=(2,2))(kitchen)
kitchen_final = GlobalAveragePooling2D()(kitchen)

# combining the four image branches
combined = concatenate([bath_final, bed_final, front_final, kitchen_final])

# adding regularization and creating the final layers for the image branch
dropout = Dropout(0.5)(combined)
hidden1 = Dense(64, activation='relu')(dropout)
hidden2 = Dense(32, activation='relu')(hidden1)

# combining the output of the image branch with the attribute branch
combined2 = concatenate([model1, hidden2])
output = Dense(1, activation='linear')(combined2)
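
It helps to trace the tensor shapes through one branch by hand. The sketch below does that arithmetic in plain Python, assuming stride-1 convolutions with 'same' padding and non-overlapping 2x2 pooling, which matches the layer arguments used above. The helper functions are hypothetical and only compute shapes; they are not Keras layers.

```python
def conv_same(shape, filters):
    """A stride-1 convolution with 'same' padding keeps height and width."""
    height, width, _ = shape
    return (height, width, filters)

def max_pool(shape, pool=2):
    """Non-overlapping pooling divides height and width by the pool size."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)

def global_average_pool(shape):
    """Global average pooling leaves one value per channel."""
    return shape[-1]

shape = (100, 100, 3)                          # one resized RGB image
shape = max_pool(conv_same(shape, 32))         # after the first conv + pool
shape = max_pool(conv_same(shape, 16))         # after the second conv + pool
branch_features = global_average_pool(shape)   # length of each *_final vector

image_features = 4 * branch_features           # four branches concatenated
final_features = 12 + 32                       # attribute branch + hidden2
print(shape, branch_features, image_features, final_features)
```

Each room therefore contributes only 16 numbers, the four rooms together 64, and the output layer sees 44 features: 12 from the attribute branch and 32 from the image branch.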

3. Finally, creating the model –
model = Model(inputs=[Input_bath, Input_bed, Input_front, Input_kitchen, input_data], outputs=output)

Now we compile the model and fit it with early-stopping and learning-rate callbacks:

callbacks = [
             tf.keras.callbacks.EarlyStopping(patience=60, restore_best_weights=False),
             tf.keras.callbacks.ReduceLROnPlateau(patience=35, factor=0.1)
            ]
model.compile(loss='mse', optimizer=Adam(learning_rate=0.001))
model.fit([bathroom_train, bedroom_train, frontal_train, kitchen_train, data_train], y_train, epochs=200, 
           callbacks=callbacks, batch_size=96, validation_split=0.15)
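
By default, EarlyStopping watches the validation loss and stops training once it has gone `patience` epochs without improving. The bookkeeping can be sketched in plain Python as follows. This is a simplified illustration that ignores options such as min_delta, not Keras's implementation; `stopping_epoch` is a hypothetical helper and the loss values are made up.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.69, 0.70]  # made-up validation losses
stop_short = stopping_epoch(losses, patience=3)
stop_long = stopping_epoch(losses, patience=4)
print(stop_short, stop_long)
```

In the callbacks above, ReduceLROnPlateau has the shorter patience (35 against 60), so the learning rate is cut by a factor of 10 before early stopping can trigger. With restore_best_weights=False the model keeps the weights from the last epoch rather than those from the best one.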

Conclusion

We have seen that the functional API offers a way to solve problems where the layers are not arranged sequentially, such as this model with four image inputs and one attribute input. Feel free to check the whole code on my Github – https://github.com/salujav4/Keras-Functional-API

The results can be improved further with image augmentation and hyperparameter tuning of the model.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.