Revolutionizing Fashion: Future of AI with GANs

Amrutha K 06 Jun, 2024

Introduction

Imagine a world where fashion designers never run out of new ideas and every outfit we wear is a work of art. Sounds interesting, right? Generative Adversarial Networks (GANs) bring that world closer. GANs have blurred the line between reality and imagination, a bit like a genie in a bottle that grants our creative wishes. They can even render scenes that could never exist in real life, such as a sun sitting on the surface of the Earth.

In 2014, Ian Goodfellow and his colleagues introduced this framework. They aimed to address a challenge in unsupervised learning: building a model that learns from unlabelled data and generates new samples. GANs have since influenced a number of industries with their capacity to produce lifelike content, and the fashion industry is among those embracing this potential. Now we will explore the potential of GANs and understand how they work.

Learning Objectives

What Generative Adversarial Networks (GANs) are and how they work
The role of GANs in the fields of ML and AI
Some challenges of using GANs and their future potential
The power and potential of GANs
An implementation of a GAN on the Fashion MNIST dataset

This article was published as a part of the Data Science Blogathon.

Generative Adversarial Networks (GANs)

Generative Adversarial Networks are a class of machine learning models used to generate new, realistic data. They can produce highly realistic images, videos, and more. A GAN consists of two neural networks: a generator and a discriminator.

Generator

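The generator is a neural network that learns to turn random noise into data samples. For images it is often built from upsampling (transposed-convolution) layers, although the implementation later in this article uses fully connected layers. Its goal is to produce samples that the discriminator cannot tell apart from real data, so it always tries to fool the discriminator.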

Discriminator

The discriminator is a classifier network (often convolutional when working with images) that tries to separate real samples from the fake samples produced by the generator. It takes both real data and generated data as input and learns to distinguish between them. For each input image it outputs a score between 0 and 1, where values near 0 indicate the image is judged fake and values near 1 indicate it is judged real.
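
To make the two roles concrete, here is a minimal NumPy sketch with untrained, randomly initialised weights. The names toy_generator and toy_discriminator, the single-layer design, and the sizes are assumptions made for illustration; they are not the models built later in this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: 100-dimensional noise, 28 x 28 = 784 pixel output.
NOISE_DIM, IMG_PIXELS = 100, 784

# Randomly initialised (untrained) weights, for illustration only.
W_gen = rng.normal(0, 0.02, (NOISE_DIM, IMG_PIXELS))
W_disc = rng.normal(0, 0.02, (IMG_PIXELS, 1))

def toy_generator(noise):
    """Map noise vectors to fake 'images' with pixel values in [-1, 1]."""
    return np.tanh(noise @ W_gen)

def toy_discriminator(images):
    """Map images to a score in (0, 1): near 1 means 'real', near 0 means 'fake'."""
    logits = images @ W_disc
    return 1.0 / (1.0 + np.exp(-logits))

noise = rng.normal(0, 1, (4, NOISE_DIM))
fake_images = toy_generator(noise)
scores = toy_discriminator(fake_images)
print(fake_images.shape, scores.shape)  # (4, 784) (4, 1)
```

Because nothing has been trained here, the scores carry no meaning yet. The point is only the data flow: noise goes into the generator, and the generator's output goes into the discriminator.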

Adversarial Training

In adversarial training, the generator produces fake data and the discriminator tries to identify it correctly. Each iteration has two stages, discriminator training and generator training, and both networks are optimized along the way. The goal of the generator is to produce data that cannot be distinguished from real data, and the goal of the discriminator is to tell real and fake data apart. Ideally, training reaches a balance in which the generated samples are realistic enough that the discriminator can do no better than guessing. Both networks are trained using backpropagation: the error of each network is propagated back through it and used to update its weights.

Training of GAN typically has the following steps:

Define the problem statement
Choose the architecture
Train the Discriminator on real data
Generate fake data with the Generator from random noise inputs
Train the Discriminator on fake data
Train the Generator with the output of the Discriminator
Iterate and refine
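
The training steps above can be sketched on a deliberately tiny problem. The example below is an assumption-heavy toy, not the article's model: the "data" are numbers drawn from a normal distribution with mean 4, the generator is the line a*z + b, the discriminator is a single sigmoid unit, and the gradients are written out by hand.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# Toy setup (all names and values are illustrative assumptions):
# real data ~ N(4, 1), generator G(z) = a*z + b, discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0      # generator parameters
w, c = 0.1, 0.0      # discriminator parameters
lr, batch_size, steps = 0.05, 64, 500

for step in range(steps):
    # Train the discriminator on a real batch and a freshly generated fake batch.
    x_real = rng.normal(4.0, 1.0, batch_size)
    z = rng.normal(0.0, 1.0, batch_size)
    x_fake = a * z + b
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    # Gradients of -log D(x_real) - log(1 - D(x_fake)) with respect to w and c.
    grad_w = np.mean((d_real - 1.0) * x_real) + np.mean(d_fake * x_fake)
    grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # Train the generator using the discriminator's output; w and c stay fixed here.
    z = rng.normal(0.0, 1.0, batch_size)
    d_fake = sigmoid(w * (a * z + b) + c)
    # Gradients of -log D(G(z)) with respect to a and b.
    grad_a = np.mean((d_fake - 1.0) * w * z)
    grad_b = np.mean((d_fake - 1.0) * w)
    a -= lr * grad_a
    b -= lr * grad_b

print(f"generator offset b = {b:.2f} (real data mean is 4.0)")
```

In this toy setup the offset b tends to drift toward the real mean, though even small GANs like this one can oscillate rather than settle, which is a first taste of the training instability discussed later.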

Loss Function

The loss function used in GANs has two components, one for each network in the architecture. The generator’s loss measures how well it produces realistic data that the discriminator cannot distinguish from real data. In effect, the generator tries to maximize the discriminator’s error on generated samples. The discriminator’s loss, on the other hand, measures how well it classifies real and fake samples, so it tries to minimize misclassification.

During training, the generator and discriminator are updated alternately, and each tries to minimize its own loss. The generator reduces its loss by producing samples that the discriminator scores as real, and the discriminator reduces its loss by classifying fake and real samples accurately. This process continues until the GAN reaches the desired level of convergence.
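
As a rough illustration, both losses can be written as binary cross-entropy over the discriminator's scores. This sketch assumes the commonly used "non-saturating" generator loss and sums the real and fake terms for the discriminator; the function names are hypothetical, and the code later in this article averages the two discriminator terms instead.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-7):
    """Binary cross-entropy: real samples labelled 1, generated samples labelled 0."""
    d_real = np.clip(d_real, eps, 1 - eps)
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return -np.mean(np.log(d_real)) - np.mean(np.log(1 - d_fake))

def generator_loss(d_fake, eps=1e-7):
    """Non-saturating generator loss: generated samples scored against label 1."""
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return -np.mean(np.log(d_fake))

# A confident, correct discriminator: real scored 0.9, fake scored 0.1.
d_real = np.array([0.9, 0.9])
d_fake = np.array([0.1, 0.1])
print(discriminator_loss(d_real, d_fake))  # about 0.21: the discriminator is doing well
print(generator_loss(d_fake))              # about 2.30: the generator is being caught
```

The same scores give a low loss to one network and a high loss to the other, which is exactly the tug-of-war that drives adversarial training.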

Role of GANs in Machine Learning and Artificial Intelligence

Due to their ability to generate new, realistic data, GANs have become increasingly important in machine learning and artificial intelligence. They have a wide variety of applications, such as video generation, image generation, and text-to-image synthesis, and these are changing many industries. Let’s see some reasons why GANs are important in this field.

Data Generation: We know that data is the most important thing for building models. We need a large number of datasets to train and build better models. Sometimes data is scarce, or maybe it is expensive. In such cases, GANs can be used to generate more new data using the existing ones.
Data Privacy: Sometimes we need to use data for training models, but it may affect the privacy of individuals. In such cases, we can use GANs to create similar data to the original one and train the models to protect the privacy of individuals.
Realistic Simulations: GANs enable realistic simulations of real-world situations, which can be used to train and test machine learning models. For instance, since testing robots in the real world can be risky or expensive, simulated environments can be used to test them first.
Adversarial Attacks: GANs can be used to create adversarial examples that test the robustness of machine learning models. This helps identify vulnerabilities, which in turn helps in developing better models and improving security.
Creative Applications: GANs can power creative applications of AI. They can be used to create games, music, artwork, films, animations, photographs, and much more. They can also be applied to original writing, like stories and poems.

As research on GANs continues, we can expect many more advances from this technology in the future.

Challenges and Limitations

Even though GANs have shown their ability to generate realistic and diverse data, they still have some challenges and limitations that need to be considered. Let’s look at some of them.

GANs depend heavily on their training data. The generated data resembles the data used for training, so if the training set is limited in diversity, the generated data will also be limited in diversity and quality.
GANs are difficult to train because they are highly sensitive to the network architecture and the choice of hyperparameters. They are prone to training instability, as the generator and the discriminator can get stuck in a cycle in which neither improves. This leads to poor convergence and poor-quality samples.
If the generator finds a small set of outputs that reliably fool the discriminator, it may keep producing only those. This failure, known as mode collapse, yields samples that are highly similar to each other, so the generator fails to cover the full range of possibilities in the dataset.
Training GANs can be computationally expensive, especially when working with large datasets and complex architectures.
One of the most concerning challenges of GANs is the impact on society in creating realistic fake data. This may lead to privacy concerns, bias, or misuse. For example, these can generate fake images or videos, leading to misinformation and fraud.
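
One simple way to watch for the loss of diversity described above is to measure how far apart a batch of generated samples are from each other. The sketch below is only a rough heuristic under our own assumptions (the function name and the synthetic stand-in batches are hypothetical), not a standard GAN evaluation metric.

```python
import numpy as np

def mean_pairwise_distance(samples):
    """Average Euclidean distance between all distinct pairs of flattened samples."""
    flat = samples.reshape(len(samples), -1)
    diffs = flat[:, None, :] - flat[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    n = len(flat)
    return dists.sum() / (n * (n - 1))  # the diagonal (self-distances) is zero

rng = np.random.default_rng(0)
diverse = rng.uniform(-1, 1, (16, 28, 28, 1))                 # stand-in for varied outputs
collapsed = np.repeat(diverse[:1], 16, axis=0)                # 16 copies of one image
collapsed = collapsed + rng.normal(0, 0.01, collapsed.shape)  # tiny variation only

print(mean_pairwise_distance(diverse) > mean_pairwise_distance(collapsed))  # True
```

A value that shrinks sharply during training can hint that the generator is collapsing onto a few outputs, although a proper evaluation would need more than this single number.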

Future Potential

Though they have some challenges and limitations, GANs have a potentially bright future. Numerous industries, including healthcare, finance, and entertainment, are expected to be changed by GANs.

One potential development is generative medicine. GANs could generate personalized medical images and support treatment planning, which could help doctors develop more effective treatments for their patients.
GANs could be used to create highly realistic virtual reality environments, which have many applications, such as entertainment.
GANs could create more realistic simulated environments for testing autonomous vehicles, helping us develop safer and more effective self-driving cars.
GANs are not limited to image-related tasks. They can also be used in Natural Language Processing (NLP) tasks, including text generation and translation. They could generate contextually relevant text, which is essential when building virtual assistants and chatbots.
GANs could be very helpful for architects. They could generate new designs for buildings or other structures, helping architects and designers create more innovative designs.
GANs could also be used for scientific research, as they can generate data that mimics real-world phenomena. They can create synthetic data for testing and validation in scientific investigations, help with drug development and molecular design, and simulate complex physical processes.
GANs could also be used in crime investigation, for example by generating images of suspects from descriptions of their features. This could support faster and more successful investigations.

Fashion MNIST Dataset

Fashion MNIST is a popular dataset used in machine learning for various purposes. It was designed as a drop-in replacement for the original MNIST dataset, which contains handwritten digits from 0 to 9. In Fashion MNIST, we have images of various fashion items instead of digits. The dataset contains 70,000 images, of which 60,000 are training images and 10,000 are testing images. Each image is in greyscale with 28 x 28 pixels. The dataset has 10 classes of fashion items. They are:

T-shirt
Dress
Coat
Pullover
Shirt
Trouser
Bag
Sandal
Sneaker
Ankle Boot

Initially, this dataset was created to develop machine learning models for classification, and it is widely used as a benchmark for evaluating machine learning algorithms. It is easy to access and can be downloaded from various sources, including the TensorFlow and PyTorch libraries. Compared to the original MNIST digits dataset, it is more challenging, because models must distinguish between fashion products that may have similar shapes or patterns. This makes it suitable for testing the robustness of various algorithms.
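
In the dataset itself, each image carries an integer label from 0 to 9 rather than a name. The small helper below (label_to_name is our own hypothetical function) maps those labels to the class names, using the label order documented for Fashion MNIST.

```python
# Label indices as documented for Fashion MNIST (0-9).
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the class name for an integer label in the range 0-9."""
    return CLASS_NAMES[int(label)]

print(label_to_name(9))  # Ankle boot
```

The GAN in this article ignores the labels entirely, but the mapping is handy when inspecting real training images next to generated ones.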

Applications of GANs in the Fashion Industry

The fashion industry has undergone a significant transition because of GANs, which have opened up new room for creativity and change in the way we design, produce, and experience fashion. Let’s see some real-world applications of Generative Adversarial Networks (GANs) in the fashion industry.

Fashion Design and Generation: GANs are capable of generating new designs and new fashion concepts. This helps designers create innovative and attractive styles, since a wide range of combinations, patterns, and colors can be explored with GANs. For instance, the clothing retailer H&M has reportedly used GANs to develop fresh outfits for its products.
Virtual Try-on: Virtual try-on is a virtual trial room. Here, GANs can generate realistic images of customers wearing chosen garments, so customers can see how they would look in those garments without physically putting them on.
Fashion Forecasting: GANs are also used for forecasting. They can generate candidate designs that anticipate future fashion trends, which helps fashion brands create new styles and keep up with trends.
Fabric and Texture Synthesis: GANs help designers generate high-resolution fabric textures by experimenting with various materials and patterns virtually rather than physically. This saves time and resources and supports innovative design processes.

Implementation on the Fashion MNIST Dataset

We will now use a Generative Adversarial Network (GAN) to generate fashion samples using the Fashion MNIST dataset. Start by importing all the necessary libraries.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras.layers import (
    BatchNormalization,
    Dense,
    Flatten,
    Input,
    LeakyReLU,
    Reshape,
)
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam

First we load the dataset. Fashion MNIST is built into TensorFlow, so we can load it directly through tf.keras. The dataset is mainly used for classification tasks and, as discussed earlier, contains 28 x 28 greyscale images. The loader returns separate training and testing splits; since a GAN needs only images, we keep the training images and discard the labels and the test split.

The loaded data is then scaled to the range -1 to 1. Normalizing inputs usually improves the stability and convergence of deep learning models during training, and this range also matches the tanh output of the generator defined later. Finally, we add an extra dimension to the data array so that each image has an explicit channel axis. The resulting 4D tensor has the shape (number of images, height, width, channels), which matches the image shape the discriminator expects.

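# Load fashion dataset (training images only; labels and test split are discarded)
(X_train, _), (_, _) = tf.keras.datasets.fashion_mnist.load_data()
# Scale pixel values from [0, 255] to [-1, 1]
X_train = X_train / 127.5 - 1.
# Add a channel axis: (60000, 28, 28) -> (60000, 28, 28, 1)
X_train = np.expand_dims(X_train, axis=3)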
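
To see what the scaling and the extra axis do without downloading anything, here is the same arithmetic on a tiny synthetic batch. The array batch is a made-up stand-in, not the real dataset.

```python
import numpy as np

# Synthetic stand-in for three uint8 images (assumption: not the real dataset).
batch = np.zeros((3, 28, 28), dtype=np.uint8)
batch[1] = 128
batch[2] = 255

scaled = batch / 127.5 - 1.0             # [0, 255] -> [-1, 1]
scaled = np.expand_dims(scaled, axis=3)  # (3, 28, 28) -> (3, 28, 28, 1)
restored = 0.5 * scaled + 0.5            # [-1, 1] -> [0, 1], as done before plotting

print(scaled.shape)                # (3, 28, 28, 1)
print(scaled.min(), scaled.max())  # -1.0 1.0
```

The last line of arithmetic is the inverse step used later in the article, where generated images are rescaled to the 0 to 1 range for display.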

Set the dimensions for the generator and discriminator. Here gen_input_dim is the size of the generator’s input noise vector, and img_shape is the shape of the images the generator produces: 28 x 28 pixels with a single greyscale channel.

gen_input_dim = 100
img_shape = (28, 28, 1)

Define Generator Model

Now we will define the generator model. It takes a single argument, the input dimension, and uses the Keras Sequential API to build the model. It has three fully connected hidden layers, each followed by a LeakyReLU activation and batch normalization. The final layer uses the tanh activation function to produce the output image, which is then reshaped to img_shape. The function returns a Keras model object that takes a noise vector as input and gives a generated image as output.

def build_generator(input_dim):
    model = Sequential()
    model.add(Dense(256, input_dim=input_dim))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(1024))
    model.add(LeakyReLU(alpha=0.2))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(np.prod(img_shape), activation='tanh'))
    model.add(Reshape(img_shape))

    noise = Input(shape=(input_dim,))
    img = model(noise)

    return Model(noise, img)

Define Discriminator Model

The next step is to build the discriminator. It is similar to the generator model, but it has only two fully connected hidden layers and uses a sigmoid activation function in the last layer. It returns a Keras model object that takes an image as input and outputs the probability that the image is real.

def build_discriminator(img_shape):
    model = Sequential()
    model.add(Flatten(input_shape=img_shape))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(256))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(1, activation='sigmoid'))

    img = Input(shape=img_shape)
    validity = model(img)

    return Model(img, validity)
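
As a sanity check on the two architectures, we can count their parameters by hand. The helper names below are our own; the layer sizes are the ones used in the generator and discriminator code above, and the batch normalization count assumes Keras tracks four values per feature.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Keras BatchNormalization tracks gamma, beta, moving mean and moving variance."""
    return 4 * n_features

noise_dim, n_pixels = 100, 28 * 28 * 1

generator_total = (
    dense_params(noise_dim, 256) + batchnorm_params(256)
    + dense_params(256, 512) + batchnorm_params(512)
    + dense_params(512, 1024) + batchnorm_params(1024)
    + dense_params(1024, n_pixels)
)
discriminator_total = (
    dense_params(n_pixels, 512) + dense_params(512, 256) + dense_params(256, 1)
)
print(generator_total)      # 1493520
print(discriminator_total)  # 533505
```

These totals should agree with what model.summary() reports for each network, and they show that the generator is roughly three times the size of the discriminator in this setup.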

Compile Models

Now we have to compile them. We use binary cross-entropy loss and the Adam optimizer with a learning rate of 0.0002 and beta_1 (the exponential decay rate for the first-moment estimates) set to 0.5. The discriminator is built and compiled with binary cross-entropy, which is the standard loss for binary classification tasks. An accuracy metric is also defined to evaluate the discriminator.

Next, the generator is built. We don’t compile the generator on its own as we do for the discriminator, because it is trained in an adversarial manner through a combined model. z is an input layer representing random noise; the generator takes z as input and produces img as output. The discriminator’s weights are frozen during the training of the combined model. The generator’s output is fed to the discriminator, which produces validity, its estimate of how real the generated image looks. The combined model is then created with z as input and validity as output, and it is this model that is used to train the generator.

optimizer = Adam(learning_rate=0.0002, beta_1=0.5)
discriminator = build_discriminator(img_shape)
discriminator.compile(loss='binary_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])
generator = build_generator(gen_input_dim)
z = Input(shape=(gen_input_dim,))
img = generator(z)
discriminator.trainable = False
validity = discriminator(img)
combined = Model(z, validity)
combined.compile(loss='binary_crossentropy',
                 optimizer=optimizer)
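
To show what the two optimizer settings control, here is a single Adam update written out in NumPy. This is a sketch of the textbook Adam rule under our own naming (adam_step is hypothetical), not the exact code Keras runs internally.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.0002, beta_1=0.5, beta_2=0.999, eps=1e-7):
    """One Adam update; returns the new parameter and moment estimates."""
    m = beta_1 * m + (1 - beta_1) * grad        # running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta_2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

param, m, v = 1.0, 0.0, 0.0
param, m, v = adam_step(param, grad=2.0, m=m, v=v, t=1)
print(param)  # about 0.9998: the first step has a size close to lr
```

A lower beta_1 such as 0.5 makes the running mean of gradients forget older gradients faster than the default of 0.9, a choice that is common in GAN training recipes.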

Training

It’s time to train our GAN. The loop runs for epochs iterations; note that each iteration here processes a single batch rather than a full pass over the dataset. In each iteration, a random batch of real images is drawn from the training set, and a batch of fake images is produced by passing noise through the generator.

The discriminator is trained on the real batch and the fake batch, and the two losses are averaged. The generator is then trained through the combined model on fresh noise, with target labels of 1 so that it is rewarded when the discriminator judges its images real. We have set sample_interval to 1000, so the losses are printed every 1000 iterations.

# Train GAN
epochs = 5000  # number of training iterations (one batch each), not full passes over the data
batch_size = 32
sample_interval = 1000
d_losses = []
g_losses = []

for epoch in range(epochs):
    idx = np.random.randint(0, X_train.shape[0], batch_size)
    real_images = X_train[idx]

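    # Train discriminator on one real batch (label 1) and one fake batch (label 0)
    noise = np.random.normal(0, 1, (batch_size, gen_input_dim))
    fake_images = generator.predict(noise, verbose=0)  # verbose=0 hides the per-call progress bar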
    d_loss_real = discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    d_loss_fake = discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
    d_losses.append(d_loss[0])

    # Train generator
    noise = np.random.normal(0, 1, (batch_size, gen_input_dim))
    g_loss = combined.train_on_batch(noise, np.ones((batch_size, 1)))
    g_losses.append(g_loss)

    # Print progress
    if epoch % sample_interval == 0:
        print(f"Epoch {epoch}, Discriminator loss: {d_loss[0]}, Generator loss: {g_loss}")
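
The loop stores every loss in d_losses and g_losses, and GAN losses are usually noisy from one batch to the next. A simple sliding-window mean makes the trend easier to read. The helper below is our own sketch (moving_average is a hypothetical name), shown on a made-up list rather than real training output.

```python
import numpy as np

def moving_average(values, window):
    """Smooth a 1-D sequence with a simple sliding-window mean."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

# Synthetic stand-in for the recorded losses (assumption: not real training output).
example_losses = [1.0, 2.0, 3.0, 4.0, 5.0]
print(moving_average(example_losses, window=3))  # [2. 3. 4.]
```

After training, something like plt.plot(moving_average(d_losses, 100)) alongside the same call for g_losses should give a readable picture of how the two networks traded off over time.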

Generate Sample Images

Now let’s see some generated samples. We plot them with matplotlib in a grid of 5 rows and 10 columns. The generated samples resemble the dataset we used for training, and training for more iterations may improve their quality.

# Generate sample images
r, c = 5,10
noise = np.random.normal(0, 1, (r * c, gen_input_dim))
gen_imgs = generator.predict(noise)

# Rescale images 0 - 1
gen_imgs = 0.5 * gen_imgs + 0.5

# Plot images
fig, axs = plt.subplots(r, c)
cnt = 0
for i in range(r):
    for j in range(c):
        axs[i,j].imshow(gen_imgs[cnt,:,:,0], cmap='gray')
        axs[i,j].axis('off')
        cnt += 1
plt.show()
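
An alternative to one subplot per image is to tile the whole batch into a single array and show it with one imshow call. The sketch below assumes the same (number of images, height, width, 1) layout as gen_imgs; make_grid is our own hypothetical helper, demonstrated on synthetic data.

```python
import numpy as np

def make_grid(images, rows, cols):
    """Tile (rows*cols, H, W, 1) images into a single (rows*H, cols*W) array."""
    n, h, w, _ = images.shape
    assert n == rows * cols, "need exactly rows * cols images"
    grid = images[..., 0].reshape(rows, cols, h, w)
    grid = grid.transpose(0, 2, 1, 3)   # -> (rows, H, cols, W)
    return grid.reshape(rows * h, cols * w)

# Synthetic stand-in for generator output (assumption: image i is filled with the value i).
fake = np.stack([np.full((28, 28, 1), i, dtype=float) for i in range(50)])
grid = make_grid(fake, rows=5, cols=10)
print(grid.shape)  # (140, 280)
```

With real output, plt.imshow(make_grid(gen_imgs, 5, 10), cmap='gray') would draw the same 5 x 10 layout in a single image, which is also convenient for saving snapshots during training.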

Conclusion

Generative Adversarial Networks (GANs) are a popular choice for many applications because of their unique architecture, their training process, and their ability to generate data. As with any technology, GANs have some challenges and limitations, and researchers are working to reduce them and build better GANs. Overall, we have explored the power and potential of GANs and how they work. We have also built a GAN to generate fashion samples using the Fashion MNIST dataset.

Key Takeaways

GANs are powerful tools for generating new data samples for a variety of applications. As discussed in this article, they can transform many industries, and fashion is one of them.
There are different types of GANs, distinguished by the kind of data they generate and by their features. For example, we have DCGANs for generating images, Conditional GANs for tasks such as image-to-image translation, StyleGANs, etc.
One welcome advantage of GANs is that synthetic samples can ease data scarcity when training and building machine learning models, although their quality remains bounded by the original training data.
GANs open up a great deal of creative room and are likely to keep shaping the future of artificial intelligence and machine learning. Let’s see what they will create in the future.

Hope you found this article useful. Connect with me on LinkedIn.

Frequently Asked Questions

Q1. What are GANs used for?

A. GANs, or Generative Adversarial Networks, generate synthetic data that closely resembles real data. They have applications in various fields, including image generation, video synthesis, text generation, and data augmentation.

Q2. What are the different types of GANs?

A. There are several types of GANs, including Conditional GANs (cGANs) that generate outputs based on specific conditions, CycleGANs that learn mappings between two domains, and Progressive GANs that increase the image resolution gradually during training.

Q3. What are GANs composed of?

A. GANs have two main components: generator and discriminator networks. The generator generates synthetic data, while the discriminator distinguishes between real and fake data. Both networks are trained simultaneously in a competitive fashion.

Q4. What is the advantage of GANs?

A. The advantage of GANs is their ability to generate realistic and diverse synthetic data. They can capture complex patterns and generate new samples that exhibit similar characteristics to the training data. GANs have broad applications in various creative and data-driven domains.

Q5. What are the limitations of GANs?

A. GANs can be challenging to train and stabilize. They are sensitive to hyperparameters and may suffer from mode collapse, where the generator fails to explore the entire data distribution. Evaluating GAN performance objectively is also a challenge, making it difficult to assess the quality of generated samples.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.