Non-Generalization Principles and Transfer Learning Technique

Mobarak Inuwa Last Updated : 15 Jun, 2023

This article was published as a part of the Data Science Blogathon.

Introduction

Transfer learning refers to carrying a skill or piece of knowledge over from its original bearer to someone new who needs it. The skill could be in math, music, or cookery. In this post, we will discover how this idea has been applied to building machine learning models.

Some data science background is needed to follow the details of how transfer learning (TL) works. The discussion relies on ideas such as neural networks and computer vision, which are terms frequently used in the industry. The reader should already be familiar with training and developing machine learning models. Note that transfer learning is a technique applied to machine learning models, not a separate kind of machine learning.

What is Transfer Learning?

Transfer learning is the reuse of a model that has already been trained on one problem as the starting point for a new, comparable problem. The new model is trained and tested on top of the pre-trained model and then used in the related setting, without having to start the process from scratch. It is crucial that the two problems be as closely related as possible.

One illustration is applying the skills learned in identifying vehicles to the task of identifying trucks. Another is applying the skills learned in identifying phones to identifying tablets, or those learned in recognizing novels to recognizing textbooks. Instead of starting from scratch, we keep using the patterns we have already mastered on a comparable activity. In short, transfer learning tries to reduce the need to solve old problems all over again, allowing for a few minor differences between tasks.

The principle of Transfer Learning and Non-generalization of Models

According to some experts, transfer learning succeeds only when the initial model has learned generalizable properties that can be applied to the second task. They also assert that the datasets used for the two models must be comparable.

Data scientists commonly assume that a trained model should not be generalized across domains: a model developed using data from one domain cannot be used in production in another. For instance, a model trained on a weather dataset to predict the weather cannot be used to predict sales. This assumption is well established, both in theory and in practice, and following it would be the norm for getting the best results, were it not that a lack of data is one of the main obstacles in data science. In some fields, a paucity of data has made it difficult to complete data science initiatives. Reusing a previously trained model then becomes an option for making predictions in such data-poor fields. With a few specialized strategies, transfer learning offers a means of accomplishing this.

Beyond scarce data, transfer learning can also be motivated by other constraints. For instance, building models without adequate computing resources or a suitable environment has long been a problem.

Transfer learning makes it possible to relax this prohibition against generalization slightly. The pursuit of generalization is at the core of transfer learning. Rather than strictly forbidding the reuse of previously trained models on different problem domains, we adapt them and reap greater benefit from the work already done.

A hypothesis called the Theory of Generalization of Experience was put forth by Charles Judd. It claims that what is learned in task “A” is transferable to task “B” because while studying “A,” the learner learns a general concept that applies partially or entirely in both “A” and “B.” Similarly, two models from two distinct problem areas may have independently learned the same behavior of variables and constraints. Transfer learning is still limited to situations where the two problems are related, which means that we cannot combine completely unrelated models.

When to Use Transfer Learning?

Time

Obtaining new data and training new models from scratch can take a lot of time and money. Pre-trained models that are ready for reuse reduce this cost.

Availability of Datasets

Training machine learning models from scratch requires a lot of data, and it is common not to have enough of it. Transfer learning can produce efficient and effective models that perform comparably to a conventionally trained model even when little data from the target domain is available, because only a small amount of data is needed to train the neural network’s final layers.

Improved performance

In addition to saving time, TL can enhance model performance. Transfer learning may lead to better-performing, more efficient systems: a model built with TL may outperform one trained from scratch on abundant data, even when none of the constraints that usually motivate its use is present. In this way, it can strengthen conventional machine learning models.

Memory

Some fields, such as computer vision, can require high levels of computational power and memory. Many students and startups do not have access to such resources. Transfer learning techniques can ease the difficulties of limited memory. Because of this, TL is widely applied in memory-intensive domains like natural language processing, computer vision, and image processing.

As noted above, memory constraints have driven the use of TL in memory-intensive fields like Natural Language Processing (NLP). NLP involves creating tools for processing and comprehending human language, which narrows the communication gap between people and computers. It has made possible technologies like voice-to-text converters, personal assistants like Google Assistant, and language translators.

TL is also used with artificial neural networks (ANNs), which attempt to model how the human nervous system functions. Deep learning builds on this technology, and several pre-trained deep learning models have been developed that can be reused for transfer learning to speed up training.

Lastly, computer vision is another area where TL is useful. Computer vision is, quite literally, the ability of computers to see. It typically involves media such as images and video, sometimes processed in real time. Media can quickly fill up enormous amounts of memory, which demands a lot of computing power. This is one of transfer learning’s most prevalent uses.

How does Transfer Learning Work?

The first step toward building a transfer learning model is usually finding a suitable pre-trained model for the new problem. The layers we want to reuse, typically the early ones, must then be frozen to prevent the loss of the knowledge that initially drew us to the model. After that, we add a new trainable layer to the end of the network and train it. Finally, we evaluate the resulting model and fine-tune it until it meets its objectives.
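A minimal sketch of this workflow in plain Python, assuming (as is usual in practice) that the reused feature-extraction layer is the one kept frozen while a newly added output layer is trained. The weights in FROZEN_WEIGHTS, the toy dataset, and all function names are hypothetical stand-ins for a real pre-trained network and a real task.

```python
import math

# Hypothetical "pre-trained" weights; a tuple so they cannot be modified.
FROZEN_WEIGHTS = ((1.0, -1.0), (0.5, 0.5))


def extract_features(x):
    """Frozen layer: a fixed linear map followed by tanh. Never updated."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=200, lr=0.5):
    """Train only the new head: logistic regression on the frozen features."""
    head_w = [0.0, 0.0]
    head_b = 0.0
    for _ in range(epochs):
        for x, y in data:
            feats = extract_features(x)
            p = sigmoid(sum(w * f for w, f in zip(head_w, feats)) + head_b)
            err = p - y  # gradient of the log loss w.r.t. the pre-activation
            head_w = [w - lr * err * f for w, f in zip(head_w, feats)]
            head_b -= lr * err
    return head_w, head_b


def predict(x, head_w, head_b):
    feats = extract_features(x)
    score = sum(w * f for w, f in zip(head_w, feats)) + head_b
    return 1 if sigmoid(score) >= 0.5 else 0


# Small dataset for the new task: label 1 when the first input is larger.
small_dataset = [
    ([2.0, 0.0], 1), ([0.0, 2.0], 0),
    ([1.5, 0.5], 1), ([0.5, 1.5], 0),
    ([3.0, 1.0], 1), ([1.0, 3.0], 0),
]

head_w, head_b = train_head(small_dataset)
predictions = [predict(x, head_w, head_b) for x, _ in small_dataset]
print(predictions)
```

Only head_w and head_b change during training; the reused layer is left exactly as it was, which is why a handful of examples is enough here.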

Various libraries have been created to aid transfer learning, depending on the type of dataset or the method used to build the models. Neural networks in computer vision typically detect edges in the earliest layers, then shapes, and then more task-specific features in the later layers. This is why the source and target problems must be comparable: the early, general-purpose layers can be reused unchanged, and we train only the final layers of the network. This avoids retraining the entire model, altering the earlier layers, and losing the advantages of transfer learning.
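The idea of training only the final layers can be sketched as follows. This is an illustration rather than any particular library's API: the Layer class, the layer names, and the parameter counts are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    n_params: int
    trainable: bool = True


def freeze_all_but_last(layers, n_trainable=1):
    """Freeze every layer except the last n_trainable ones."""
    cutoff = len(layers) - n_trainable
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return layers


def trainable_params(layers):
    """Count the parameters that training would still update."""
    return sum(layer.n_params for layer in layers if layer.trainable)


# Hypothetical vision network, ordered from general to task-specific.
network = [
    Layer("edges", 1_000),
    Layer("shapes", 5_000),
    Layer("object_parts", 20_000),
    Layer("classifier_head", 500),
]

total = trainable_params(network)          # everything trainable: 26500
freeze_all_but_last(network, n_trainable=1)
remaining = trainable_params(network)      # only the head: 500
print(f"training {remaining} of {total} parameters")
```

In this toy network, freezing the first three layers leaves under 2% of the parameters to train, which is where the savings in time, data, and memory come from.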

Types of Transfer Learning

TL may be divided into various categories according to different studies. For the purposes of this article, we’re mainly interested in three different learning behaviors.

Positive Transfer

The first type is positive transfer, where learning one thing indirectly improves the learning of another, so that we essentially accomplish two goals at once. For instance, learning how to play the drums tends to make it easier to play the bass guitar, and learning the keyboard makes it easier to sing in tune.

Negative Transfer

Negative transfer occurs when learning one thing diminishes, or interferes with, knowledge previously gained on something else.

Neutral Transfer (Zero transfer)

This sits between positive and negative transfer: learning one thing neither adds to nor takes away from past knowledge of the other.
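For machine learning models, one simple way to tell these three cases apart is to compare a model built with transfer against a baseline trained from scratch on the same task. The sketch below assumes a higher score is better; the function name, the tolerance, and the example scores are hypothetical.

```python
def classify_transfer(baseline_score, transfer_score, tolerance=0.01):
    """Label the effect of transfer relative to a from-scratch baseline."""
    gain = transfer_score - baseline_score
    if gain > tolerance:
        return "positive"
    if gain < -tolerance:
        return "negative"
    return "neutral"


# Made-up accuracy values, for illustration only.
print(classify_transfer(0.80, 0.90))   # positive
print(classify_transfer(0.80, 0.70))   # negative
print(classify_transfer(0.80, 0.805))  # neutral
```

The tolerance keeps small, noisy differences from being reported as a real gain or loss.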

Examples of Pretrained Models

Many pre-trained models have been created and made available for reuse in TL, covering areas such as computer vision (media) and NLP. They include:

AlexNet
VGG
Inception
Xception
ResNet
Word2Vec
GloVe
FastText
ImageNet (strictly a dataset rather than a model; the vision models above are commonly pre-trained on it)

Conclusion

Transfer learning uses the knowledge acquired in solving one problem to train a model for a different but related problem. For example, training a neural network from scratch can be time- and resource-consuming, but many pre-trained models can serve as a jumping-off point. Transfer learning can be applied to many machine learning problems.

Key takeaways:

Transfer learning is the reuse of a model that has already been trained on one problem as the starting point for a new, comparable problem.
The Theory of Generalization of Experience, put forth by Charles Judd, claims that what is learned in task “A” is transferable to task “B” because while studying “A,” the learner learns a general concept that applies partially or entirely in both “A” and “B.”
Two models from two distinct problem areas may have independently learned the same behavior of variables and constraints. This is beneficial to transfer learning.
Transfer learning can be used when a pre-trained model suits a new problem that faces some constraints.
Constraints that may motivate transfer learning include time, hardware and software, memory, and dataset availability.
Transfer can be positive, negative, or neutral.

