Age and Gender Detection Using Deep Learning

Sonia Singla 31 Jul, 2023 • 7 min read

This article was published as a part of the Data Science Blogathon


The main aim of this article is to detect age and gender through the given data set. We will use simple python and Keras methods for detecting age and gender. The data set can be downloaded from AGE, GENDER AND ETHNICITY (FACE DATA) CSV | Kaggle.


The upgrading of image pictures taken from the camera sources, from satellites, aeroplanes, and the images caught in everyday lives is called picture processing. Processing of the image based on analysis undergoes many different techniques and calculations. Digital formed pictures need to be carefully imagined and studies.

Image processing has two main steps followed by simple steps. The improvement of an image with the end goal of more good quality pictures; that can be adopted by other programs are called picture upgrades. The other procedure is the most pursued strategy utilized for the extraction of data from a picture. The division of images into certain parts is called segmentation.

The location of the information accessible in the pictures is much-needed information. The information that the image contains is to be changed and adjusted for discovery purposes.

There are different sorts of procedures required for, just as the expulsion of the issue. In a Facial identification strategy: The articulations that the faces contain hold a great deal of data. At whatever point the individual associates with the other individual, there is an association of a ton of ideas.

The evolving of ideas helps in figuring certain boundaries. Age assessment is a multi-class issue in which the years; are categorized into classes. Individuals of various ages have various facials, so it is hard to assemble the pictures.

To identify the age and gender of several faces’ procedures, are followed by several methods. From the neural network, features are taken by the convolution network. In light of the prepared models, the image is processed into one of the age classes. The highlights are handled further and shipped off the preparation frameworks.

The Dataset

UTK Dataset comprises age, gender, images, and pixels in .csv format. Age and gender detection according to the images have been researched for a long time. Different methodologies have been assumed control over the years to handle this issue. Presently we start with the assignment of recognizing age and gender utilizing the Python programming language.

Keras is the interface for the TensorFlow library. Use Keras on the off chance that you need a profound learning library that allows simple and quick prototyping (through ease of use, seclusion, and extensibility). Support both convolutional networks and repetitive organizations, just as blends of the two. Run flawlessly on CPU and GPU.


#Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
def plot_data(rows, cols, lower_value, upper_value):
    fig = plt.figure(figsize=(cols*3,rows*4))
    for i in range(1, cols*rows + 1):
        k = np.random.randint(lower_value,upper_value)
        fig.add_subplot(rows, cols, i) # adding sub plot
        gender = gender_values_to_labels[df.gender[k]]
        ethnicity = eth_values_to_labels[df.ethnicity[k]]
        age = df.age[k]
        im = df.pixels[k]
        plt.imshow(im, cmap='gray')
plot_data(rows=1, cols=7, lower_value=0, upper_value=len(df))
data sample images | Age and Gender Detection

Fig 1 Age and gender detection by a simple python


Keras is an open-source Neural Network library. It is written in Python and is sufficiently fit to run on Theano, TensorFlow, or CNTK, developed by one of the Google engineers, Francois Chollet. It is made easy to understand, extensible, and particular for quicker experimentation with profound neural organizations.

First of all, we will upload all libraries required for the data set. We will convert all the columns into an array by using the np.array and into dtype float. We will then split the data set into xTrain, yTrain, yTest, and xtest. In the end, we will apply the model sequential and test the predictions.

In detail, first, we read the CSV file containing five columns age, ethnicity, gender, img_name, and pixels by using pandas, read_csv function. The first five rows got by using DataFrame.head() method. We converted the column-named pixels into an array by using the NumPy library and Reshaping them into dimensions 48, 48 by using the lambda function. We also converted the values in the float by the same lambda function.

We divided the values further by 255.

We assigned the variable name to get the first row of the column of the pixel. We further checked the image by using matplotlib if it is seen or not.

#importing libraries

import keras
import json
import sys
import tensorflow as tf
from keras.layers import Input
import numpy as np
import argparse
from keras_applications.resnext import ResNeXt50
from keras.utils.data_utils import get_file
import face_recognition
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
from PIL import Image
df=pd.read_csv("age_gender.csv")df.head()df1= pd.DataFrame(df)
df1['pixels'] = df1.pixels.apply(lambda x: np.reshape(np.array(x.split(' '),dtype='float32'),(48,48)))
df1['pixels']= df1['pixels']/255
im = df1['pixels'][0]
plt.imshow(im, cmap='gray')
Reshaped image | Age and Gender Detection

Fig 2 Image after reshaping

To convert all the values into float and reshape it, we used function for and NumPy. To have the age and gender stored in the list, we will use the other variable labels_f. A later model will be used to fit the data and validate it.

#To collect all the images and reshape them and check the dtype.

X = np.zeros(shape=(23705,48,48))
for i in range(len(df1["pixels"])):
    X[i] = df1["pixels"][i]
Output - dtype('float64')
ag = df1['age']
ag= np.array(ag)

output- (23705,)


g= df1['gender']
labels_f =[]


while i <len(a):






Both age and gender are combined and stored in labels_f, we will further convert the list into array.

labels_f =np.array(labels_f)
(23705, 2, 1)

Using the most used machine learning library, sklearn, the data is split into train and test.
#Splitting the data taking data set
import tensorflow as tf

from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test= train_test_split(X,a,test_size=0.25)




train test shape | Age anUsing the most used machine learning library, sklearn, the data is split into train and test.d Gender Detection

Fig 3 Shape output of X_train, X_test, Y_train, and Y_test



from tensorflow.keras.layers import Dropout

from tensorflow.keras.layers import Flatten,BatchNormalization

from tensorflow.keras.layers import Dense, MaxPooling2D,Conv2D

from tensorflow.keras.layers import Input,Activation,Add

from tensorflow.keras.models import Model

from tensorflow.keras.regularizers import l2

from tensorflow.keras.optimizers import Adam

import tensorflow as tf

def Convolution(input_tensor,filters):

    x = Conv2D(filters=filters,kernel_size=(3, 3),padding = 'same',strides=(1, 1),kernel_regularizer=l2(0.001))(input_tensor)

    x = Dropout(0.1)(x)

    x= Activation('relu')(x)

    return x

def model(input_shape):

  inputs = Input((input_shape))

  conv_1= Convolution(inputs,32)

  maxp_1 = MaxPooling2D(pool_size = (2,2)) (conv_1)

  conv_2 = Convolution(maxp_1,64)

  maxp_2 = MaxPooling2D(pool_size = (2, 2)) (conv_2)

  conv_3 = Convolution(maxp_2,128)

  maxp_3 = MaxPooling2D(pool_size = (2, 2)) (conv_3)

  conv_4 = Convolution(maxp_3,256)

  maxp_4 = MaxPooling2D(pool_size = (2, 2)) (conv_4)

  flatten= Flatten() (maxp_4)

  dense_1= Dense(64,activation='relu')(flatten)

  dense_2= Dense(64,activation='relu')(flatten)



  output_1= Dense(1,activation="sigmoid",name='sex_out')(drop_1)

  output_2= Dense(1,activation="relu",name='age_out')(drop_2)

  model = Model(inputs=[inputs], outputs=[output_1,output_2])

  model.compile(loss=["binary_crossentropy","mae"], optimizer="Adam",


  return model
model summary

Fig 4 Model summary in details,Y_train_2,batch_size=64,validation_data=(X_test,Y_test_2),epochs=5,callbacks=[callback_list])
Training the model
evaluate model

#Plot the image

def test_image(ind,X,Model):







  print("Predicted Age: "+ str(age))

  print("Predicted Sex: "+ sex_f[sex])

test_image(1980,X, Model)
Final prediction | Age and Gender Detection

Fig 5 Age and gender detection by model.

Frequently Asked Questions

Q1. What are the uses of age and gender prediction?

A. Age and gender prediction finds applications in various fields, including targeted advertising, market research, customer segmentation, and personalized user experiences. It helps businesses tailor their products and services to specific demographics and analyze consumer behavior. Additionally, it aids in age and gender-based content recommendations, social media marketing strategies, and public health initiatives, allowing for more effective and targeted campaigns.

Q2. How accurate is age and gender prediction?

A. Age and gender prediction algorithms can be reasonably accurate, but their performance can vary depending on the dataset used for training and the specific model employed. While they often achieve high accuracy rates, there may be occasional mis-classifications due to factors like diverse appearances, age-related changes, and cultural variations. Continuous improvement in technology and data quality can enhance their precision further.


The task of recognizing age and gender, nonetheless, is an innately troublesome issue, more so than numerous other PC vision undertakings. The fundamental justification for this trouble hole lies in the information needed to prepare these kinds of frameworks. While general article discovery errands can regularly approach many thousands or even large numbers of pictures for preparing, datasets with age and gender names are extensively more modest, as a rule in the large numbers or, best case scenario, several thousand. Python obtained images and the Model did not do well much in the accuracy rate, further, improvement is required in the model algorithm.

A small introduction about myself-

I, Sonia Singla have done MSc in Biotechnology from Bangalore University, India and an MSc in Bioinformatics from University of Leicester, U.K. I have also done a few projects on data science from CSIR-CDRI. Currently is an advisory editorial board member at IJPBS. Have reviewed and published few research papers in Springer, IJITEE and various other Publications. You can contact me or reach to me on Linkedin. Thanks

Linkedin –

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Sonia Singla 31 Jul 2023

Frequently Asked Questions

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Responses From Readers


yashank gupta
yashank gupta 09 Dec, 2021

can you give advantages of this project? what was the real world application of age and gender detection

Related Courses
0 Hrs 72 Lessons

A Comprehensive Learning Path for Deep Learning in 2023

0 Hrs 72 Lessons

A Comprehensive Learning Path for Deep Learning in 2020

0 Hrs 76 Lessons

A Comprehensive Learning Path for Deep Learning in 2019


  • [tta_listen_btn class="listen"]