{"id":1646,"date":"2023-06-07T11:11:19","date_gmt":"2023-06-07T05:41:19","guid":{"rendered":"https:\/\/www.analyticsvidhya.com\/datahack-summit-2023\/?page_id=1646"},"modified":"2023-07-19T19:08:00","modified_gmt":"2023-07-19T13:38:00","slug":"parameter-efficient-fine-tuning-doing-more-with-less","status":"publish","type":"page","link":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/","title":{"rendered":"Parameter-Efficient Fine-Tuning: Doing more with less"},"content":{"rendered":"<p><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale data, followed by fine-tuning to downstream tasks. Fine-tuning these pretrained LLMs on downstream datasets results in huge performance gains when compared to using the pretrained LLMs out-of-the-box (zero-shot inference, for example).\\nHowever, as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!\\nParameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to fine-tune a large pretrained model on a specific downstream task while requiring significantly fewer parameters than full fine-tuning. The goal is to achieve comparable or even better performance than full fine-tuning, while requiring less computation and memory resources.\\nPEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of\u00a0catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g.,\u00a0image classification\u00a0and\u00a0stable diffusion dreambooth.\\nIt also helps in portability wherein users can tune models using PEFT methods to get tiny checkpoints worth a few MBs compared to the large checkpoints of full fine-tuning, e.g.,\u00a0bigscience\/mt0-xxl\u00a0takes up 40GB of storage and full fine-tuning will lead to 40GB checkpoints for each downstream dataset whereas using PEFT methods it would be just a few MBs for each downstream dataset all the while achieving comparable performance to full fine-tuning. The small trained weights from PEFT approaches are added on top of the pretrained LLM. So the same LLM can be used for multiple tasks by adding small weights without having to replace the entire model.\\n\ud83e\udd17 PEFT library provides the latest Parameter-Efficient Fine-tuning techniques seamlessly integrated with \ud83e\udd17 Transformers and \ud83e\udd17 Accelerate. This enables using the most popular and performant models from Transformers coupled with the simplicity and scalability of Accelerate.\\nKey Takeaways:\\n1. What, What and how of PEFT approaches\\n2. Using PEFT and quantization for finetuning LLMs with tens of billions of parameters on consumer hardware.\\n3. Showcasing application to PEFT to various downstream tasks and modalities.&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:17405,&quot;3&quot;:{&quot;1&quot;:0},&quot;5&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;6&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;7&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;8&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;9&quot;:1,&quot;10&quot;:1,&quot;11&quot;:3,&quot;12&quot;:0,&quot;17&quot;:1}\">Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale data, followed by fine-tuning to downstream tasks. Fine-tuning these pretrained LLMs on downstream datasets results in huge performance gains when compared to using the pretrained LLMs out-of-the-box (zero-shot inference, for example).<br \/>\nHowever, as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!<br \/>\nParameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to fine-tune a large pretrained model on a specific downstream task while requiring significantly fewer parameters than full fine-tuning. The goal is to achieve comparable or even better performance than full fine-tuning, while requiring less computation and memory resources.<br \/>\nPEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of\u00a0catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g.,\u00a0image classification\u00a0and\u00a0stable diffusion dreambooth.<br \/>\nIt also helps in portability wherein users can tune models using PEFT methods to get tiny checkpoints worth a few MBs compared to the large checkpoints of full fine-tuning, e.g.,\u00a0bigscience\/mt0-xxl\u00a0takes up 40GB of storage and full fine-tuning will lead to 40GB checkpoints for each downstream dataset whereas using PEFT methods it would be just a few MBs for each downstream dataset all the while achieving comparable performance to full fine-tuning. The small trained weights from PEFT approaches are added on top of the pretrained LLM. So the same LLM can be used for multiple tasks by adding small weights without having to replace the entire model.<br \/>\n\ud83e\udd17 PEFT library provides the latest Parameter-Efficient Fine-tuning techniques seamlessly integrated with \ud83e\udd17 Transformers and \ud83e\udd17 Accelerate. This enables using the most popular and performant models from Transformers coupled with the simplicity and scalability of Accelerate.<br \/>\n<strong><br \/>\nKey Takeaways:<\/strong><br \/>\n<\/span><\/p>\n<ol>\n<li><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale data, followed by fine-tuning to downstream tasks. Fine-tuning these pretrained LLMs on downstream datasets results in huge performance gains when compared to using the pretrained LLMs out-of-the-box (zero-shot inference, for example).\\nHowever, as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!\\nParameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to fine-tune a large pretrained model on a specific downstream task while requiring significantly fewer parameters than full fine-tuning. The goal is to achieve comparable or even better performance than full fine-tuning, while requiring less computation and memory resources.\\nPEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of\u00a0catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g.,\u00a0image classification\u00a0and\u00a0stable diffusion dreambooth.\\nIt also helps in portability wherein users can tune models using PEFT methods to get tiny checkpoints worth a few MBs compared to the large checkpoints of full fine-tuning, e.g.,\u00a0bigscience\/mt0-xxl\u00a0takes up 40GB of storage and full fine-tuning will lead to 40GB checkpoints for each downstream dataset whereas using PEFT methods it would be just a few MBs for each downstream dataset all the while achieving comparable performance to full fine-tuning. The small trained weights from PEFT approaches are added on top of the pretrained LLM. So the same LLM can be used for multiple tasks by adding small weights without having to replace the entire model.\\n\ud83e\udd17 PEFT library provides the latest Parameter-Efficient Fine-tuning techniques seamlessly integrated with \ud83e\udd17 Transformers and \ud83e\udd17 Accelerate. This enables using the most popular and performant models from Transformers coupled with the simplicity and scalability of Accelerate.\\nKey Takeaways:\\n1. What, What and how of PEFT approaches\\n2. Using PEFT and quantization for finetuning LLMs with tens of billions of parameters on consumer hardware.\\n3. Showcasing application to PEFT to various downstream tasks and modalities.&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:17405,&quot;3&quot;:{&quot;1&quot;:0},&quot;5&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;6&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;7&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;8&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;9&quot;:1,&quot;10&quot;:1,&quot;11&quot;:3,&quot;12&quot;:0,&quot;17&quot;:1}\">What, What and how of PEFT approaches<br \/>\n<\/span><\/li>\n<li><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale data, followed by fine-tuning to downstream tasks. Fine-tuning these pretrained LLMs on downstream datasets results in huge performance gains when compared to using the pretrained LLMs out-of-the-box (zero-shot inference, for example).\\nHowever, as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!\\nParameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to fine-tune a large pretrained model on a specific downstream task while requiring significantly fewer parameters than full fine-tuning. The goal is to achieve comparable or even better performance than full fine-tuning, while requiring less computation and memory resources.\\nPEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of\u00a0catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g.,\u00a0image classification\u00a0and\u00a0stable diffusion dreambooth.\\nIt also helps in portability wherein users can tune models using PEFT methods to get tiny checkpoints worth a few MBs compared to the large checkpoints of full fine-tuning, e.g.,\u00a0bigscience\/mt0-xxl\u00a0takes up 40GB of storage and full fine-tuning will lead to 40GB checkpoints for each downstream dataset whereas using PEFT methods it would be just a few MBs for each downstream dataset all the while achieving comparable performance to full fine-tuning. The small trained weights from PEFT approaches are added on top of the pretrained LLM. So the same LLM can be used for multiple tasks by adding small weights without having to replace the entire model.\\n\ud83e\udd17 PEFT library provides the latest Parameter-Efficient Fine-tuning techniques seamlessly integrated with \ud83e\udd17 Transformers and \ud83e\udd17 Accelerate. This enables using the most popular and performant models from Transformers coupled with the simplicity and scalability of Accelerate.\\nKey Takeaways:\\n1. What, What and how of PEFT approaches\\n2. Using PEFT and quantization for finetuning LLMs with tens of billions of parameters on consumer hardware.\\n3. Showcasing application to PEFT to various downstream tasks and modalities.&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:17405,&quot;3&quot;:{&quot;1&quot;:0},&quot;5&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;6&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;7&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;8&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;9&quot;:1,&quot;10&quot;:1,&quot;11&quot;:3,&quot;12&quot;:0,&quot;17&quot;:1}\">Using PEFT and quantization for finetuning LLMs with tens of billions of parameters on consumer hardware.<br \/>\n<\/span><\/li>\n<li><span data-sheets-value=\"{&quot;1&quot;:2,&quot;2&quot;:&quot;Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale data, followed by fine-tuning to downstream tasks. Fine-tuning these pretrained LLMs on downstream datasets results in huge performance gains when compared to using the pretrained LLMs out-of-the-box (zero-shot inference, for example).\\nHowever, as models get larger and larger, full fine-tuning becomes infeasible to train on consumer hardware. In addition, storing and deploying fine-tuned models independently for each downstream task becomes very expensive, because fine-tuned models are the same size as the original pretrained model. Parameter-Efficient Fine-tuning (PEFT) approaches are meant to address both problems!\\nParameter-Efficient Fine-Tuning (PEFT) is a technique that allows us to fine-tune a large pretrained model on a specific downstream task while requiring significantly fewer parameters than full fine-tuning. The goal is to achieve comparable or even better performance than full fine-tuning, while requiring less computation and memory resources.\\nPEFT approaches only fine-tune a small number of (extra) model parameters while freezing most parameters of the pretrained LLMs, thereby greatly decreasing the computational and storage costs. This also overcomes the issues of\u00a0catastrophic forgetting, a behaviour observed during the full finetuning of LLMs. PEFT approaches have also shown to be better than fine-tuning in the low-data regimes and generalize better to out-of-domain scenarios. It can be applied to various modalities, e.g.,\u00a0image classification\u00a0and\u00a0stable diffusion dreambooth.\\nIt also helps in portability wherein users can tune models using PEFT methods to get tiny checkpoints worth a few MBs compared to the large checkpoints of full fine-tuning, e.g.,\u00a0bigscience\/mt0-xxl\u00a0takes up 40GB of storage and full fine-tuning will lead to 40GB checkpoints for each downstream dataset whereas using PEFT methods it would be just a few MBs for each downstream dataset all the while achieving comparable performance to full fine-tuning. The small trained weights from PEFT approaches are added on top of the pretrained LLM. So the same LLM can be used for multiple tasks by adding small weights without having to replace the entire model.\\n\ud83e\udd17 PEFT library provides the latest Parameter-Efficient Fine-tuning techniques seamlessly integrated with \ud83e\udd17 Transformers and \ud83e\udd17 Accelerate. This enables using the most popular and performant models from Transformers coupled with the simplicity and scalability of Accelerate.\\nKey Takeaways:\\n1. What, What and how of PEFT approaches\\n2. Using PEFT and quantization for finetuning LLMs with tens of billions of parameters on consumer hardware.\\n3. Showcasing application to PEFT to various downstream tasks and modalities.&quot;}\" data-sheets-userformat=\"{&quot;2&quot;:17405,&quot;3&quot;:{&quot;1&quot;:0},&quot;5&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;6&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;7&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;8&quot;:{&quot;1&quot;:[{&quot;1&quot;:2,&quot;2&quot;:0,&quot;5&quot;:{&quot;1&quot;:2,&quot;2&quot;:0}},{&quot;1&quot;:0,&quot;2&quot;:0,&quot;3&quot;:3},{&quot;1&quot;:1,&quot;2&quot;:0,&quot;4&quot;:1}]},&quot;9&quot;:1,&quot;10&quot;:1,&quot;11&quot;:3,&quot;12&quot;:0,&quot;17&quot;:1}\">Showcasing application to PEFT to various downstream tasks and modalities.<\/span><\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1648,"parent":1126,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"session-details.php","meta":[],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.7 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023\" \/>\n<meta property=\"og:description\" content=\"Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale [&hellip;]\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/\" \/>\n<meta property=\"og:site_name\" content=\"DataHack Summit 2023\" \/>\n<meta property=\"article:modified_time\" content=\"2023-07-19T13:38:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-content\/uploads\/2023\/06\/Efficient-Fine.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"500\" \/>\n\t<meta property=\"og:image:height\" content=\"250\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/\",\"url\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/\",\"name\":\"Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023\",\"isPartOf\":{\"@id\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/#website\"},\"datePublished\":\"2023-06-07T05:41:19+00:00\",\"dateModified\":\"2023-07-19T13:38:00+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Session\",\"item\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Parameter-Efficient Fine-Tuning: Doing more with less\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/#website\",\"url\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/\",\"name\":\"DataHack Summit 2023\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.analyticsvidhya.com\/dhs-2023\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/","og_locale":"en_US","og_type":"article","og_title":"Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023","og_description":"Large Language Models (LLMs) based on the transformer architecture, like GPT, T5, and BERT have achieved state-of-the-art results in various Natural Language Processing (NLP) tasks. They have also started foraying into other domains, such as Computer Vision (CV) (VIT, Stable Diffusion, LayoutLM) and Audio (Whisper, XLS-R). The conventional paradigm is large-scale pretraining on generic web-scale [&hellip;]","og_url":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/","og_site_name":"DataHack Summit 2023","article_modified_time":"2023-07-19T13:38:00+00:00","og_image":[{"width":500,"height":250,"url":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-content\/uploads\/2023\/06\/Efficient-Fine.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/","url":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/","name":"Parameter-Efficient Fine-Tuning: Doing more with less - DataHack Summit 2023","isPartOf":{"@id":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/#website"},"datePublished":"2023-06-07T05:41:19+00:00","dateModified":"2023-07-19T13:38:00+00:00","breadcrumb":{"@id":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/parameter-efficient-fine-tuning-doing-more-with-less\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/"},{"@type":"ListItem","position":2,"name":"Session","item":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/session\/"},{"@type":"ListItem","position":3,"name":"Parameter-Efficient Fine-Tuning: Doing more with less"}]},{"@type":"WebSite","@id":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/#website","url":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/","name":"DataHack Summit 2023","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"}]}},"_links":{"self":[{"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/pages\/1646"}],"collection":[{"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/comments?post=1646"}],"version-history":[{"count":3,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/pages\/1646\/revisions"}],"predecessor-version":[{"id":2371,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/pages\/1646\/revisions\/2371"}],"up":[{"embeddable":true,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/pages\/1126"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/media\/1648"}],"wp:attachment":[{"href":"https:\/\/www.analyticsvidhya.com\/dhs-2023\/wp-json\/wp\/v2\/media?parent=1646"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}