Reinforcement Learning for LLM Agents: Training, Fine-Tuning & Deployment

About the Workshop

In this hands-on workshop, participants will learn how reinforcement learning (RL) is used to train large language model–based agents that can make sequential decisions, interact with environments, call tools autonomously, and improve performance through experience. 
 
We will cover RL fundamentals for LLM agents, extend Markov Decision Processes (MDPs) to agent settings, explore modular RL frameworks, and dive into practical implementations using OpenPipe’s Agent Reinforcement Trainer (ART). By the end, attendees will understand how to design, train, and evaluate RL-based LLM agents for real-world tasks. 

*Note: Workshop details are tentative and subject to change.


Prerequisites

  • Familiarity with Large Language Models (LLMs) and Python

  • Basic understanding of Reinforcement Learning concepts (policies, rewards, environments)

  • Prior exposure to agent frameworks is helpful but not required

Workshop Modules

RL Fundamentals for LLM Agents

  • What makes an LLM an agent vs. a predictor 
  • Markov Decision Processes (MDPs) in the context of LLM actions 
  • States, actions, rewards, and environment interactions 
  • Challenges in RL for LLMs (instability, reward design, scaling)
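
The MDP framing above can be sketched as a minimal interaction loop, where the state is the conversation so far and actions are generated messages. All names here (`toy_policy`, `toy_env_step`, the `|`-delimited state encoding) are illustrative stand-ins, not part of any specific framework:

```python
from dataclasses import dataclass

# A minimal MDP-style interaction loop for an LLM agent.
# The "policy" is a stand-in for an LLM call; the state is the
# dialogue history; actions are generated messages/tool calls.

@dataclass
class Transition:
    state: str
    action: str
    reward: float

def toy_policy(state: str) -> str:
    # Placeholder for an LLM: emits a canned action per step count.
    step = state.count("|")
    return f"action_{step}"

def toy_env_step(state: str, action: str) -> tuple[str, float, bool]:
    # Placeholder environment: appends the action to the state and
    # terminates with a terminal reward after 3 steps.
    next_state = state + "|" + action
    done = next_state.count("|") >= 3
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def rollout(initial_state: str) -> list[Transition]:
    state, done, traj = initial_state, False, []
    while not done:
        action = toy_policy(state)
        next_state, reward, done = toy_env_step(state, action)
        traj.append(Transition(state, action, reward))
        state = next_state
    return traj

trajectory = rollout("task: summarize")
```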

RL Algorithms and Frameworks

  • Overview of PPO, GRPO, and policy optimization methods 
  • End-to-end RL workflows for LLM agents 
  • Understanding the Agent-R1 framework and structured RL pipelines 
  • Crafting reward functions for multi-step tasks

Hands-On with OpenPipe ART

  • Overview of the OpenPipe ART architecture 
  • Installation and environment setup 
  • Training loop walkthrough 
  • Experiment tracking with Weights & Biases
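
The shape of such a training loop can be sketched generically. Note this is NOT OpenPipe ART's actual API; `Policy`, `collect_rollouts`, and `score` are hypothetical stand-ins for the model, the rollout sampler, and the reward model:

```python
import random

class Policy:
    def __init__(self):
        self.weight = 0.0  # stand-in for model parameters

    def update(self, scored):
        # Toy "policy update": nudge weight toward high-reward rollouts.
        self.weight += 0.01 * sum(r for _, r in scored) / len(scored)

def collect_rollouts(policy, n=4):
    # Stand-in for sampling agent trajectories from the environment.
    return [f"rollout_{i}" for i in range(n)]

def score(rollout) -> float:
    # Stand-in reward model / task checker (deterministic per rollout).
    random.seed(rollout)
    return random.random()

def train(steps=3):
    policy = Policy()
    history = []
    for _ in range(steps):
        rollouts = collect_rollouts(policy)
        scored = [(r, score(r)) for r in rollouts]
        policy.update(scored)
        history.append(policy.weight)  # log to e.g. W&B in practice
    return history

history = train()
```

In a real run, each `history.append` would instead be a metrics log call to an experiment tracker such as Weights & Biases.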

Designing RL-Based LLM Agents

  • Task and environment design 
  • Reward shaping and policy objectives 
  • Tool use and hierarchical decision making 
  • Case study and implementation walkthrough
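
Tool use usually reduces to a dispatch step: the agent emits a tool name plus arguments, and a registry routes the call. The tool names and the `"tool: argument"` action format below are illustrative assumptions:

```python
# Minimal tool-calling dispatch sketch. In training, the environment
# would execute the call and fold the result back into the state.

TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "echo": lambda text: text,
}

def run_action(action: str) -> str:
    # Expected action format: "tool_name: argument"
    name, _, arg = action.partition(":")
    tool = TOOLS.get(name.strip())
    if tool is None:
        return f"error: unknown tool {name!r}"
    return tool(arg.strip())

result = run_action("calculator: 6 * 7")  # → "42"
```

A hierarchical agent layers this: a high-level policy picks which tool (or sub-policy) to invoke, and lower-level steps fill in the arguments.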

Evaluation, Safety, and Deployment

  • Evaluation metrics (success rate, trajectory efficiency, robustness) 
  • Human-in-the-loop evaluation 
  • Reward hacking and safety risks 
  • Deployment considerations 
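
The first two metrics above can be computed from logged episodes. The `Episode` record and the specific efficiency formula (steps used against a fixed budget) are assumptions for illustration:

```python
from dataclasses import dataclass

@dataclass
class Episode:
    success: bool
    steps: int

def success_rate(episodes):
    # Fraction of episodes that reached the goal.
    return sum(e.success for e in episodes) / len(episodes)

def trajectory_efficiency(episodes, step_budget=10):
    # Successful episodes score higher the fewer steps they used;
    # failed episodes score 0.
    scores = [(step_budget - e.steps + 1) / step_budget if e.success else 0.0
              for e in episodes]
    return sum(scores) / len(scores)

episodes = [Episode(True, 3), Episode(True, 5), Episode(False, 10)]
sr = success_rate(episodes)
```

Robustness is typically measured separately, by re-running the same tasks under perturbed prompts or environments and comparing these metrics across conditions.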


Instructor

Workshop Details