A Comprehensive Guide to Vision Language Models

About

This talk comprehensively introduces Vision-Language Models (VLMs), their importance, and their wide range of applications. It delves into the technical aspects of pre-training VLMs, covering both common techniques and recent advancements. Attendees will gain hands-on experience through live demonstrations using open-source VLMs and minimal reproducible Colab notebooks. The talk will also provide a step-by-step guide to fine-tuning PaliGemma, Google's latest VLM, for specific tasks.

Key Takeaways:

  • In-depth Understanding of VLMs: Participants will learn the fundamentals of VLMs, their significance, and their diverse use cases across domains.
  • Technical Know-how: The talk will equip attendees with knowledge of pre-training techniques, spanning both established methods and cutting-edge research directions.
  • Practical Skills: Through live code demonstrations using open-source VLMs and Colab notebooks, participants will gain hands-on experience working with these models effectively (a minimal inference sketch follows this list).
  • Fine-tuning Expertise: The talk will provide a detailed walkthrough of fine-tuning PaliGemma, Google's latest VLM, enabling attendees to adapt the model to their own tasks (see the fine-tuning sketch below).
  • Combination of Theory and Practice: The session will balance conceptual depth with hands-on technique, ensuring participants grasp both the theoretical underpinnings and the real-world applications of VLMs.
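
As a taste of the hands-on portion, the following is a minimal sketch of loading an open-source VLM and generating a caption. It assumes the Hugging Face transformers integration of PaliGemma and a hypothetical local image file example.jpg; treat it as an illustration of the workflow, not the talk's exact demo code.

```python
# Minimal sketch: zero-shot captioning with an open-source VLM.
# Assumes the Hugging Face `transformers` PaliGemma integration and a
# hypothetical local image file `example.jpg`.
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-pt-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("example.jpg")
prompt = "caption en"  # PaliGemma is prompted with short task prefixes

inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=30)

# Strip the prompt tokens and decode only the newly generated caption.
caption = processor.decode(
    generated[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(caption)
```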

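Likewise, the fine-tuning walkthrough can be previewed with a compact sketch of one plausible recipe: parameter-efficient LoRA tuning via the peft library. The toy in-memory dataset, target modules, and hyperparameters below are illustrative assumptions, not the talk's actual material.

```python
# Sketch: LoRA fine-tuning of PaliGemma on a captioning task.
# Assumes Hugging Face `transformers` + `peft`; the toy dataset and all
# hyperparameters are illustrative placeholders.
import torch
from PIL import Image
from peft import LoraConfig, get_peft_model
from transformers import (AutoProcessor, PaliGemmaForConditionalGeneration,
                          Trainer, TrainingArguments)

model_id = "google/paligemma-3b-pt-224"
processor = AutoProcessor.from_pretrained(model_id)
model = PaliGemmaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16
)

# Train only low-rank adapters on the attention projections, leaving
# the base weights frozen.
model = get_peft_model(model, LoraConfig(
    r=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))

# Toy stand-in dataset: (image, task prompt, target caption) records.
train_data = [{"image": Image.new("RGB", (224, 224), "gray"),
               "prompt": "caption en",
               "caption": "a plain gray square"}] * 16

def collate(examples):
    # The processor's `suffix` argument builds labels from the target text.
    return processor(text=[e["prompt"] for e in examples],
                     images=[e["image"] for e in examples],
                     suffix=[e["caption"] for e in examples],
                     return_tensors="pt", padding="longest")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="paligemma-ft",
                           per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-5,
                           bf16=True,  # assumes bfloat16-capable hardware
                           remove_unused_columns=False),
    train_dataset=train_data,
    data_collator=collate,
)
trainer.train()
```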