LLMs like GPT and Llama have completely transformed how we tackle language tasks, from creating intelligent chatbots to generating complex pieces of code. Cloud platforms like Hugging Face simplify using these models, but there are times when running an LLM locally on your own computer is the smarter choice. Why? Because it offers greater privacy, allows for customizations tailored to your specific needs, and can significantly reduce costs. Running LLMs locally gives you full control, letting you leverage their power on your own terms.
Let me show you how to run an LLM on your system in just a few simple steps using Ollama and Hugging Face!
Step 1: Download Ollama
First, go to the official Ollama website (ollama.com), download the installer for your operating system, and install it.
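Once it's installed, you can confirm the CLI works from a terminal. This is just a quick sanity check; the version number you see will differ:

```bash
# Verify the Ollama CLI is installed and on your PATH
ollama --version

# List locally downloaded models (empty on a fresh install)
ollama list
```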
Step 2: Find the Best Open-Source LLMs
Next, search for the Hugging Face Open LLM Leaderboard, which ranks the top open-source language models.
Step 3: Filter the Models for Your Device
Once you see the list, apply filters, such as parameter count, so you only see models your hardware can actually run. For example, click a top-ranked model such as Qwen/Qwen2.5-32B, then click “Use this model” in the top-right corner of the screen. However, you won’t find Ollama listed there as an option.
That’s because Ollama uses a specialized file format called GGUF, which stores the model in a quantized form that is smaller and faster to run locally.
(Note: Quantization slightly reduces quality but makes it more efficient for local use.)
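As a rough back-of-the-envelope check (the exact figures vary by quantization scheme): a 32B-parameter model at 16-bit precision needs about 32 billion × 2 bytes ≈ 64 GB of memory, while a common 4-bit quantization such as Q4_K_M shrinks that to roughly 18–20 GB, small enough for a well-equipped desktop.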
Step 4: Find a GGUF Version of the Model
To get the model in GGUF format:
Back on Hugging Face, look for GGUF conversions of your chosen model, repositories with “GGUF” in their name. Those published by community uploaders such as bartowski are a good choice.
Step 5: Download and Start Using the Model
On the GGUF repository’s page, click “Use this model” again; this time Ollama appears as an option. Copy the command it provides, paste it into your terminal, hit “Enter,” and wait for the download to complete.
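The command will look something like the one below. The repository name and quantization tag (Q4_K_M) are just illustrative examples of a typical choice; substitute whatever your model page gives you:

```bash
# Download and run a GGUF model directly from Hugging Face
# (repository and quant tag are examples; use the ones from your model page)
ollama run hf.co/bartowski/Qwen2.5-32B-Instruct-GGUF:Q4_K_M

# When the interactive prompt appears, type a message to chat;
# enter /bye to end the session.
```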
Once it’s downloaded, you can start chatting with the model just like you would with any other LLM. Simple and fun!
And there you go! You’re now running a powerful LLM locally on your device. Let me know if these steps worked for you in the comment section below.