DataHour: Efficient Fine-Tuning of LLMs on single T4 GPU using Ludwig

12 Oct 2023, 12:10 pm – 1:10 pm

About the Event

Numerous recent studies have demonstrated the substantial performance gains that fine-tuning base large language models can bring to domain-specific tasks. Writing fine-tuning code from scratch, however, requires integrating a variety of libraries and can be highly error-prone and time-consuming. Iterating on that code for each fine-tuning experiment compounds the chances of such errors and slows the velocity of testing changes. Even once your code runs, you are likely to hit CPU and GPU out-of-memory errors for a variety of reasons, especially when fine-tuning models like Llama-2 on cheap commodity hardware such as a single T4 GPU.

This presentation will center on Ludwig, an open-source, low-code declarative machine learning framework designed for supervised fine-tuning of large language models. We will delve into the mechanics of fine-tuning these models, identify situations where fine-tuning is beneficial, explore best practices, and provide a comprehensive, step-by-step guide on how to efficiently fine-tune Llama-2-13B (and even GPTNeoX-20B) using just a single T4 GPU, leveraging Ludwig's array of optimizations for fine-tuning.
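To make the declarative approach concrete, here is a minimal sketch of the kind of Ludwig config used for this style of fine-tuning: 4-bit quantization plus LoRA adapters to fit a large base model on a single T4. The exact field names follow Ludwig's LLM fine-tuning documentation from around the time of this event and should be treated as assumptions to verify against the version you install; the dataset columns (`prompt`, `response`) are illustrative.

```python
# Sketch of a Ludwig declarative config for parameter-efficient fine-tuning
# of Llama-2 on a single T4 (4-bit quantization + LoRA adapters).
# Field names are based on Ludwig 0.8-era docs; verify against your version.
config = {
    "model_type": "llm",
    "base_model": "meta-llama/Llama-2-7b-hf",  # requires approved HF access
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "adapter": {"type": "lora"},       # train only small low-rank adapter weights
    "quantization": {"bits": 4},       # load the frozen base model in 4-bit
    "trainer": {
        "type": "finetune",
        "batch_size": 1,                        # tiny batches fit T4 memory
        "gradient_accumulation_steps": 16,      # keep the effective batch size up
        "epochs": 3,
    },
}

# Training itself is then roughly two lines (needs `pip install ludwig[llm]`,
# a GPU, and a dataset with `prompt`/`response` columns):
#   from ludwig.api import LudwigModel
#   LudwigModel(config=config).train(dataset="train.csv")
```

The point of the declarative style is that memory-saving choices (quantization bits, adapter type, gradient accumulation) are config toggles rather than code changes, so each fine-tuning experiment is a small diff instead of a rewrite.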


Prerequisites:

  1. Access to Google Colab: https://colab.research.google.com/ 
  2. Request access to Llama-2 weights on HuggingFace: https://huggingface.co/meta-llama/Llama-2-7b-hf 

About the Speaker

Arnav Garg

Senior Machine Learning Engineer at Predibase

Meet Arnav Garg, a seasoned Senior Machine Learning Engineer at Predibase, where he works on applied machine learning and large-scale training. With a relentless focus on optimizing fine-tuning processes and scaling distributed training and inference, Arnav is at the forefront of harnessing the power of open-source large language models of all sizes.

In his daily pursuits, Arnav delves into the intricacies of Ludwig's distributed hyperparameter optimization experience for deep learning models, ensuring that they perform at their peak. But his expertise doesn't stop there: he is also working to automate the right-sizing of compute resources for distributed training jobs, building robust reliability mechanisms that guarantee cost-effective and efficient training on the Predibase platform, so users can concentrate on their tasks without the hassle of managing compute resources.

Registration Details: 8158 registered
