Mastering Language Models: From Concepts to Code in PyTorch

10 AUGUST 2024 | 09:30AM - 05:30PM

About the workshop

Right now, people all over the world are going bonkers over something called ChatGPT. In this workshop, we'll learn the basic concepts behind how ChatGPT works and then code, train, and use our own version of it from scratch in PyTorch. From this coding experience, we'll learn about the strengths and weaknesses of models like ChatGPT and discuss alternative design strategies. Then we'll learn how to fine-tune a production language model on a custom dataset, which gives us more control over how the model behaves and can make it more reliable.

NOTE: This workshop will be done in “StatQuest Style,” meaning every little detail will be clearly explained. We’ll also start each module with a silly song.


Instructor


Joshua Starmer, PhD

Founder and CEO


Modules

In this module, we’ll cover the basic concepts of neural networks and transformers, which provide the backbone for ChatGPT-style language models. Specifically, we will discuss:

  • The basics of how neural networks can fit any shape to any dataset.
  • The basics of how neural networks are trained with backpropagation.
  • The basics of how transformers work, including:
    • Word Embedding
    • Position Encoding
    • Attention
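
The attention step listed above can be sketched in a few lines of PyTorch. This is a minimal illustration with made-up sizes and variable names, not the workshop's actual code:

```python
import torch
import torch.nn.functional as F

# Toy scaled dot-product attention: the heart of a transformer.
torch.manual_seed(0)
d_model = 4
tokens = torch.randn(3, d_model)        # embeddings for 3 tokens

W_q = torch.randn(d_model, d_model)     # learned Query weights
W_k = torch.randn(d_model, d_model)     # learned Key weights
W_v = torch.randn(d_model, d_model)     # learned Value weights

q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
scores = q @ k.T / (d_model ** 0.5)     # how similar is each token to the others?
weights = F.softmax(scores, dim=-1)     # attention percentages (each row sums to 1)
output = weights @ v                    # weighted sum of the Values
print(output.shape)                     # torch.Size([3, 4])
```

Each output row is a mix of all the Value vectors, weighted by how much each token "attends" to the others.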

In this module, we’ll cover the essential matrix algebra that is required when coding neural networks in PyTorch. Specifically, we will discuss:

  • Matrix addition and multiplication.
  • Why matrix multiplication is so funky.
  • Matrix concepts that help us read PyTorch documentation and error messages.
  • A walkthrough of all the matrix math required to code a transformer.

In this module, we will code a ChatGPT-like language model from scratch. Specifically, we will:

  • Code Position Encoding.
  • Code Attention.
  • Code a Decoder-Only Transformer from scratch.
  • Train our model.
  • Use our model.
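
The key ingredient that makes a transformer "Decoder-Only" is the causal mask, which lets each token attend only to itself and the tokens before it. A minimal sketch of that idea (our own toy example, with arbitrary sizes):

```python
import torch

# Masked self-attention: each token can only "look at" earlier tokens.
torch.manual_seed(0)
seq_len, d_model = 4, 8
x = torch.randn(seq_len, d_model)

scores = x @ x.T / (d_model ** 0.5)
mask = torch.tril(torch.ones(seq_len, seq_len)).bool()  # lower triangle = allowed
scores = scores.masked_fill(~mask, float('-inf'))       # block future tokens
weights = torch.softmax(scores, dim=-1)
print(weights[0])  # tensor([1., 0., 0., 0.]) -- the first token sees only itself
```

Masking the future is what lets the model generate text one token at a time, ChatGPT-style.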

Training a large-scale language model from scratch is crazy expensive and time-consuming, so pretty much nobody does it. Instead, people take a pre-trained model and fine-tune it to perform specific tasks. In this module, we’ll learn how to fine-tune a production-grade large language model ourselves, using GPUs in the cloud. Specifically, we will:

  • Load and use a large language model in the cloud and run it on a GPU.
  • Fine-tune a large language model on a custom dataset.
  • Use the fine-tuned model.
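
Fine-tuning is just "keep training, but on our data." The sketch below shows the loop in miniature with a tiny stand-in model and made-up data; in the module we'll do the real thing with a production language model on a cloud GPU:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# A tiny stand-in for a pre-trained model (real fine-tuning starts
# from downloaded pre-trained weights instead of random ones).
model = nn.Linear(8, 8).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

custom_x = torch.randn(16, 8, device=device)   # our "custom dataset"
custom_y = torch.randn(16, 8, device=device)

losses = []
for step in range(50):                          # the fine-tuning loop
    loss = nn.functional.mse_loss(model(custom_x), custom_y)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(losses[-1] < losses[0])   # the loss went down on our data
```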

*Note: These are tentative details and are subject to change.
