Fine-tune LLM with Reinforcement Learning
14 May 2025, 13:05 - 14:05
About the Event
Large Language Models are powerful, but they are not always aligned with human intent. In this session, we explore Reinforcement Learning from AI Feedback (RLAIF), a scalable alternative to Reinforcement Learning from Human Feedback (RLHF) that uses AI-based evaluators to train safer, more helpful models. We’ll compare RLAIF with RLHF and Direct Preference Optimization (DPO), outlining their trade-offs and practical applications. Through a hands-on walkthrough, you'll learn how to implement RLAIF using public datasets to reduce toxicity in model outputs, a practical step toward ethical, aligned AI development.
Key Takeaways:
- Understand the limitations of prompt engineering and supervised fine-tuning (SFT) in aligning LLMs with human values.
- Explore Reinforcement Learning from AI Feedback (RLAIF) as a scalable alternative to human-guided alignment.
- Learn how Constitutional AI and LLM-based evaluators can reduce toxicity and improve model behavior.
- Get hands-on insights into implementing RLAIF using public datasets and evaluation pipelines.
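As a rough illustration of the RLAIF idea in the takeaways above, the sketch below shows how an automated evaluator could label preference pairs and produce a reward that favors less toxic text. It is not the speaker's implementation: `TOXIC_WORDS`, `toxicity_score`, `PreferencePair`, `label_pair`, and `reward` are hypothetical names, and the word-list scorer is only a stand-in for the LLM-based or classifier-based evaluator a real pipeline would use.

```python
from dataclasses import dataclass

# Hypothetical stand-in for an AI evaluator. In a real RLAIF pipeline this
# would be an LLM or a trained toxicity classifier, not a word list.
TOXIC_WORDS = {"idiot", "stupid", "hate"}


def toxicity_score(text: str) -> float:
    """Return the fraction of words found in TOXIC_WORDS (0.0 means clean)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in TOXIC_WORDS for w in words) / len(words)


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def label_pair(prompt: str, response_a: str, response_b: str) -> PreferencePair:
    """Let the evaluator mark the less toxic response as 'chosen'."""
    if toxicity_score(response_a) <= toxicity_score(response_b):
        return PreferencePair(prompt, response_a, response_b)
    return PreferencePair(prompt, response_b, response_a)


def reward(text: str) -> float:
    """Scalar reward for an RL update: higher means less toxic."""
    return 1.0 - toxicity_score(text)


pair = label_pair(
    "What do you think of my essay?",
    "It is stupid and you are an idiot.",
    "The argument is clear, but the ending could be stronger.",
)
print(pair.chosen)
```

Pairs labeled this way could feed a preference-based method such as DPO, while the scalar `reward` is the kind of signal an RL-based fine-tuning loop would optimize. Either way, the quality of the result depends on how good the evaluator is.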
Who is this DataHour for?
About the Speaker
Sainath Reddy Sankepally is an AI researcher currently working as a Data Scientist in India. He completed his undergraduate studies in Data Science and Artificial Intelligence at IIIT Raipur. Sainath has contributed to AI research at several prestigious institutions and organizations around the world, including MIT, Harvard, NUS Singapore, IIT Jodhpur, IIT Patna, and IIIT Delhi. He has also held applied research roles in industry, working on Data Science and AI initiatives at companies such as Swiggy and Siemens. You can reach him on LinkedIn.
