DataHour: Efficient Fine-Tuning of LLMs on single T4 GPU using Ludwig

12 Oct 2023, 12:10 pm – 1:10 pm

About the Event

Numerous recent studies have demonstrated the substantial performance gains that fine-tuning base large language models can bring to domain-specific tasks. Writing fine-tuning code from scratch, however, requires integrating a variety of libraries and can be highly error-prone and time-consuming. Iterating on that code for each fine-tuning experiment compounds the chances of such errors and slows the velocity of testing changes. Even once your code runs, you are likely to hit CPU and GPU out-of-memory errors for a variety of reasons, especially when fine-tuning models like Llama-2 on cheap commodity hardware such as a single T4 GPU.

This presentation will center on Ludwig, an open-source, low-code declarative machine learning framework designed for supervised fine-tuning of large language models. We will delve into the mechanics of fine-tuning these models, identify situations where fine-tuning is beneficial, explore best practices, and provide a comprehensive, step-by-step guide on how to efficiently fine-tune Llama-2-13B (and even GPTNeoX-20B) using just a single T4 GPU, leveraging Ludwig's array of optimizations for fine-tuning.
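To make the declarative approach concrete, here is a minimal sketch of the kind of Ludwig config used for this style of fine-tuning: 4-bit quantization plus LoRA adapters to fit a large base model on a single T4. The exact field names follow Ludwig's LLM fine-tuning documentation from around the time of this event and should be treated as assumptions to verify against the version you install; the dataset columns (`prompt`, `response`) are illustrative.

```python
# Sketch of a Ludwig declarative config for parameter-efficient fine-tuning
# of Llama-2 on a single T4 (4-bit quantization + LoRA adapters).
# Field names are based on Ludwig 0.8-era docs; verify against your version.
config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-7b-hf",  # requires approved HF access
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "adapter": {"type": "lora"},       # train only small low-rank adapter weights
    "quantization": {"bits": 4},       # load the frozen base model in 4-bit
    "trainer": {
        "type": "finetune",
        "batch_size": 1,                        # tiny batches fit T4 memory
        "gradient_accumulation_steps": 16,      # keep the effective batch size up
        "epochs": 3,
    },
}

# Training itself is then roughly two lines (needs `pip install ludwig[llm]`,
# a GPU, and a dataset with `prompt`/`response` columns):
#   from ludwig.api import LudwigModel
#   LudwigModel(config=config).train(dataset="train.csv")
```

The point of the declarative style is that memory-saving choices (quantization bits, adapter type, gradient accumulation) are config toggles rather than code changes, so each fine-tuning experiment is a small diff instead of a rewrite.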


Prerequisites:

  1. Access to Google Colab: https://colab.research.google.com/ 
  2. Request access to Llama-2 weights on HuggingFace: https://huggingface.co/meta-llama/Llama-2-7b-hf 

About the Speaker

Arnav Garg

Senior Machine Learning Engineer at Predibase

Meet Arnav Garg, a seasoned Senior Machine Learning Engineer at Predibase, where he works on applied machine learning and large-scale training. With a relentless focus on optimizing fine-tuning processes and scaling distributed training and inference, Arnav is at the forefront of harnessing the power of open-source large language models of all sizes.

In his daily pursuits, Arnav delves into the intricacies of Ludwig's distributed hyperparameter optimization experience for deep learning models, ensuring that they perform at their peak. But his expertise doesn't stop there: he is also working to automate the right-sizing of compute resources for distributed training jobs, building robust reliability mechanisms that guarantee cost-effective and efficient training on the Predibase platform, so users can concentrate on their tasks without the hassle of managing compute resources.

Registration Details: 8158 registered
