Caching in Python: The LRU Algorithm

Shikha Gupta Last Updated : 12 Jun, 2025

12 min read

To use LRU caching in Python, you just need to add two lines – import and declaration of the @lru_cache decorator. We show examples of how and why to use it.

Caching is one approach that, if used correctly, significantly speeds up work and reduces the load on computational resources. A decorator is implemented in the Python standard library module that makes it possible to cache the output of functions using the Least Recently Used (LRU) strategy. This is a simple yet powerful technique that allows you to leverage caching capabilities in your code. functions @lru_cache

This article was published as a part of the Data Science Blogathon

In this guide, we’ll cover:

How to use decorators to implement all caching strategies that are available;
what is LRU and how does this approach work;
how to improve program performance using a decorator; @lru_cache
how to extend decorator functionality and stop caching after a certain time interval. @lru_cache

Python caching: what’s the use
Implementing caching in Python with a dictionary
- OUTPUT
Caching strategies
Diving into the idea of LRU caching
Looking Behind the Scenes of LRU Cache
Using @lru_cache to Implement LRU Cache in Python
A visual representation of the algorithm: jumping over the steps
Timing the execution time of the program code
Using memoisation to improve the solution
Other @lru_cache features
Add cache expiration
Criteria for excluding entries from the cache
Conclusion

Python caching: what’s the use

Caching is a storage optimization technique in which data is manipulated more efficiently than at its source.

Imagine we are creating a newsreader application that aggregates news from various sources. The user navigates through the list, and the application downloads the articles and displays them on the screen.

What will the program do if the reader decides to compare a couple of articles and begins to move between them many times? Without caching, the application will have to receive the same content every time. In this case, both the user’s system and the server with articles, on which additional load is created, are ineffectively used.

The best approach after receiving the article would be to store the content locally. The next time the user opens the article, the app will be able to open the content from the saved copy, instead of reloading the material from the source, and this is a procedure called caching in data science.

Implementing caching in Python with a dictionary

Python can implement caching using a dictionary. Instead of going to the server every time, you can check to see if the content is in the cache and query the server only if there is no content. You can use the URL of the article as a key, and its content as a value:

import requests
link = "https://www.analyticsvidhya.com/blog/2021/08/edge-contour-detection-an-application-of-computer-vision/"
cache = dict()
def get_article_from_server(url):
    response = requests.get (url)
    return response.text
def get_article (url):
     print ("Getting the article ...")
     if url not in cache:
        cache[url] = get_article_from_server(url)
        return cache[url]
print(get_article(link)[:1000])

OUTPUT

We get the article…

We take the article from the server…

We get the article..

Note– To run this example, you must have the library installed requests:

pip install requests

Although the call get_article()is made twice, the article is downloaded from the server only once. After the first access to the article, we put its URL and content into a dictionary cache. The second time, the code does not need to retrieve the item from the server again.

Caching strategies

In this simple implementation of caching, an obvious problem crept in: the content of the dictionary will grow indefinitely: the more entries a user opens, the more memory space was used.

To work around this problem, we need a strategy that allows the program to decide which articles to delete. There are several different strategies that you can use to remove items from the cache and prevent it from exceeding their maximum size. The five most popular are listed in the table.

Strategy	Which record do we delete	These records are most often reused.
First-In / First-Out (FIFO)	The oldest	New
Last-In / First-Out (LIFO)	Most recent	Old
Least Recently Used (LRU)	Used the longest	Recently read
Most Recently Used (MRU)	Used last	Read First
Least Frequently Used (LFU)	Used most rarely	Used frequently

Diving into the idea of LRU caching

The cache, implemented through the LRU strategy, orders the items in the order in which they are used. Each time we access a record, the LRU algorithm moves it to the top of the cache. Thus, the algorithm can quickly identify the entry that has not been used the longest by checking the end of the list.

The following figure shows the cache view after a user has requested an article from the web.

LRU Cache Population Process Step 1

The article is saved in the last cache slot before being sent to the user. The following figure shows what happens when the user requests the next article.

Lru caching python Population Process Step 2 LRU Cache Population Process Step 2

The second article takes up the last slot, moving the first article down the list.

This LRU strategy works as the later the object was used, the more likely it will be needed in the future. The algorithm keeps such an object in the cache for as long as possible.

Looking Behind the Scenes of LRU Cache

A combination of a doubly linked list and a hash table is a common method to implement an LRU cache in Python. The head of a doubly-linked list points to the last requested entry, and the tail to the most recently used.

The figure below shows a possible structure for an LRU cache implementation.

Caching implementation diagram

Using a hash table, we provide access to each item in the cache by mapping each entry to a specific location in the doubly linked list. At the same time, access to the recently used item and updating the cache are operations performed in constant time (that is, with the time complexity of the algorithm 𝑂 ( 1 ).

Since version 3.2, Python includes a decorator to implement the LRU strategy. @lru_cache

Using @lru_cache to Implement LRU Cache in Python

The decorator behind the scenes uses a dictionary. The result of the function execution is cached under the key corresponding to the function call and the supplied arguments. That is, for the decorator to work, the arguments must be hashable. @lru_cache

A visual representation of the algorithm: jumping over the steps

Imagine that we want to determine the number of ways in which we can reach a certain rung on a staircase. How many ways are there, for example, to get to the fourth step, if we can step over – jump over 1, 2, 3 (but no more) steps? The figure below shows the corresponding combinations.

A path is shown under each of the figures with an indication of the number of steps covered in one jump. Moreover, the number of ways to reach the fourth step is equal to the total number of ways in which you can get to the third, second, and first steps:

It turns out that the solution to the problem can be decomposed into smaller subtasks. To define the different paths to the fourth step, we can add four ways to reach the third step, two ways to reach the second step, and the only way to reach the first. That is, a recursive approach can be used.

Let’s describe the programmatically recursive solution exactly as we see it now:

      def steps_to (stair):
    if stair == 1:
        # The first step can be reached from the floor
        # unique way
        return 1
    elif stair == 2
        #to reach 2nd step, just stepping one at a time or crossing two at once
        return 2
    elif stair == 3:
        # For getting the 3rd step:
        # 1. Jump immediately to the third
        # 2. Jump over two, then one
        # 3. Jump over one then two
        # 4. One at a time
        return 4
    else:
        # All intermediate steps are different
        # so the total number of options is the sum
        # of such combinations
        return (
            steps_to (stair - 3)
            + steps_to (stair - 2)
            + steps_to (stair - 1)
        )
>>> print(steps_to(4))

OUTPUT

The code works for 4 rungs. Let’s check how he calculates the number of options for a staircase of 30 steps.

>>> steps_to(30)

OUTPUT

53798080

More than 53 million combinations turned out. However, when we were looking for a solution for Step 30, the scenario could take quite a long time.

Timing the execution time of the program code

Let’s measure how long it takes to execute the code.

For this, we can use the Python module ‘timeit’ or as the corresponding command in the Jupyter notebook.

%%timeit
>>> steps_to(30)

OUTPUT

53798080

The number of seconds depends on the specifications of the computer being used. On my system, the calculation took 3 seconds, which is quite slow for only thirty steps. This solution can be greatly improved with memoisation.

Using memoisation to improve the solution

Our recursive implementation solves the problem by breaking it down into smaller steps that complement each other. The following figure shows a tree for seven rungs, with each node representing a specific challenge: steps_to()

The step_to () function call tree

You may notice that the algorithm has to be called with the same argument multiple times. For example, it is evaluated twice, – four times, – seven times, etc. Calling the same function several times starts computations that are not necessary – the result is always the same. steps_to() steps_to(5) steps_to(4) steps_to(3)

To solve this problem, we can use memoisation: we store the result obtained for the same input values in memory and then return it on the next similar request. Great opportunity to apply as a decorator! @lru_cache

We import the decorator from the module and apply it to the main function. functools

from functools import lru_cache
@lru_cache()
def steps_to(stair):
    if stair == 1:
        return 1
    elif stair == 2:
        return 2
    elif stair == 3:
        return 4
    else:
        return (steps_to(stair - 3)
                + steps_to(stair - 2)
                + steps_to(stair - 1))
%%timeit
>>> steps_to(30)

OUTPUT

53798080

From units of seconds to tens of nanoseconds is a tremendous improvement due to the fact that behind the scenes, the decorator stores the call results for each unique input value. @lru_cache steps_to()

Other @lru_cache features

By attaching a decorator, we store each call and response in memory for later access if needed again. But how many of these combinations can we keep until memory runs out? @lru_cache

As soon as the cache starts deleting old elements, the decorator returns an attribute that defines the maximum number of entries till that time. The default is 128. If we assign a value, then the cache will grow without any deletion of entries. This can become a problem if we store too many different calls in memory.

Let’s apply using an attribute and add a method call: @lru_cache max size cache_info()

@lru_cache(maxsize=16)
def steps_to(stair):
    if stair == 1:
        return 1
    elif stair == 2:
        return 2
    elif stair == 3:
        return 4
    else:
        return (steps_to(stair - 3)
                + steps_to(stair - 2)
                + steps_to(stair - 1))
>>> steps_to(30)

OUTPUT

53798080

>>> steps_to.cache_info()

OUTPUT

CacheInfo(hits=52, misses=30, maxsize=16, currsize=16)

We can use the information returned to understand how the cache works and tweak it to find the right balance between speed and memory: cache_info()

hits=52 – the number of calls returned directly from memory since they were present in the cache; @lru_cache
misses=30 – the number of calls that were not taken from memory, but were calculated (in the case of our problem, this is each new stage);
max_size=16 – this is the size of the cache, which we determined by passing it to the decorator;
curr_size=16 – the current size of the cache, in this case, the cache is full.

Add cache expiration

Let’s move from a case study to a more realistic one. Imagine that we want to track the appearance on the Real Python resource of news articles containing the word in the title – display the title, download the article and display its length (number of characters). python

Real Python provides the Atom protocol so that we can use the pipe parser library and the article content library as we did before. Feed parser requests

import feedparser   # pip install feedparser
import requests
import ssl
import time
if hasattr(ssl, "_create_unverified_context"):
    ssl._create_default_https_context = ssl._create_unverified_context
def get_article_from_server(url):
     response = requests.get(url)
    return response.text
def monitor(url):
    maxlen = 45
    while True:
        try:
           print (" nChecking the feed ...")
             feed = feedparser.parse (url)
             for entry in feed.entries [: 5]:
                 if "python" in entry.title.lower ():
                     truncated_title = (
                         entry.title [: maxlen] + "..."
                         if len (entry.title)> maxlen
                         else entry.title
                     )
                     print (
                         "Coincidence:",
                        truncated_title,
                        len(get_article_from_server(entry.link)),
                    )
            time.sleep(5)
        except KeyboardInterrupt:
            break

The script will run continuously until we stop it by clicking [Ctrl + C]in the terminal window (or interrupt the execution in Jupyter Notepad).

>>> monitor("https://realpython.com/atom.xml")

OUTPUT

Check the tape ...
Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 35: Securi ... 28704
Retrieving an article from the server ...
Match: PyPy: Faster Python With Minimal Effort 67387
Retrieving an article from the server ...
Match: Handling Missing Keys With the Python default ... 33224
Retrieving an article from the server ...
Match: Use Sentiment Analysis With Python to Classif ... 158401
Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 34: The Py ... 29576
Check the tape ...
Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 35: Securi ... 28704
Retrieving an article from the server ...
Match: PyPy: Faster Python With Minimal Effort 67389
Retrieving an article from the server ...
Match: Handling Missing Keys With the Python default ... 33224
Retrieving an article from the server ...
Match: Use Sentiment Analysis With Python to Classif ... 158400
Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 34: The Py ... 29576

The code downloads and parses the XML file from Real Python. Then the loop iterates over the first five entries in the list. If the word is part of the title, the code prints the title and length of the article. Then the code “falls asleep” for 5 seconds, after which monitoring starts again.

Each time the script downloads an article, the message “Getting article from server…” is displayed on the console. If we let the script run long enough, we will see this message re-appear even when loading the same link.

We can use a decorator, but the content of the article may change over time. The decorator returns the same data every time that it saves during the first opening of the article. If the message is updated, the monitoring script will never know about it. To solve this problem, we need to set the expiration date for the entries in the cache. @lru_cache

Criteria for excluding entries from the cache

We can implement the described idea in a new decorator that extends. The cache should return a result per request only if the write caching time has not yet expired – otherwise, the result should be fetched from the server. Here’s a possible implementation of the new decorator: @lru_cache

from functools import lru_cache, wraps
from datetime import datetime, timedelta
def timed_lru_cache(seconds: int, maxsize: int = 128):
    def wrapper_cache(func):
        func = lru_cache(maxsize=maxsize)(func)
# instrumentation of the decorator with two attributes,
         #for lifetime representation and its expiration date expiration
        func.lifetime = timedelta(seconds=seconds)
        func.expiration = datetime.utcnow() + func.lifetime
        @wraps(func)
        def wrapped_func(*args, **kwargs):
            if datetime.utcnow() >= func.expiration:
                func.cache_clear()
                func.expiration = datetime.utcnow() + func.lifetime
            return func(*args, **kwargs)
        return wrapped_func
    return wrapper_cache

The decorator implements functionality to operate on the lifetime of entries in the cache (in seconds) and the maximum cache size. @timed_lru_cache

The code wraps the function with a decorator. This allows us to use the already familiar caching functionality. @lru_cache

Before accessing a cache entry, the decorator checks to see if the expiration date has passed. If so, the decorator clears the cache and recalculates the lifetime and expiration date.

Article caching with the new decorator

We can now use a new decorator with a function to prevent the article content from being downloaded from the server on every new request. Collecting the code in one place, we get the following result: @timed_lru_cache monitor()

import feedparser
import requests
import ssl
import time
from functools import lru_cache, wraps
from datetime import datetime, timedelta
if hasattr(ssl, "_create_unverified_context"):
    ssl._create_default_https_context = ssl._create_unverified_context
def timed_lru_cache(seconds: int, maxsize: int = 128):
    def wrapper_cache(func):
        func = lru_cache(maxsize=maxsize)(func)
        func.lifetime = timedelta(seconds=seconds)
        func.expiration = datetime.utcnow() + func.lifetime
        @wraps(func)
        def wrapped_func(*args, **kwargs):
            if datetime.utcnow() >= func.expiration:
                func.cache_clear()
                func.expiration = datetime.utcnow() + func.lifetime
            return func(*args, **kwargs)
        return wrapped_func
    return wrapper_cache
@timed_lru_cache(60)
def get_article_from_server(url):
    print("Получение статьи с сервера...")
    response = requests.get(url)
    return response.text
def monitor(url):
    maxlen = 45
    while True:
        try:
            print("n lets take the feed as.")
            feed = feedparser.parse(url)
            for entry in feed.entries[:5]:
                if "python" in entry.title.lower():
                    truncated_title = (
                        entry.title[:maxlen] + "..."
                        if len(entry.title) > maxlen
                        else entry.title
                    )
                    print(
                        "Соnfidence:",
                        truncated_title,
                        len(get_article_from_server(entry.link)),
                    )
            time.sleep(5)
        except KeyboardInterrupt:
            break
>>> monitor ("https://realpython.com/atom.xml")

OUTPUT

Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 35: Securi ... 28704
Retrieving an article from the server ...
Match: PyPy: Faster Python With Minimal Effort 67387
Retrieving an article from the server ...
Match: Handling Missing Keys With the Python default ... 33224
Retrieving an article from the server ...
Match: Use Sentiment Analysis With Python to Classif ... 158400
Retrieving an article from the server ...
Coincidence: The Real Python Podcast - Episode # 34: The Py ... 29576
Checking the tape ...
Coincidence: The Real Python Podcast - Episode # 35: Securi ... 28704
Match: PyPy: Faster Python With Minimal Effort 67387
Match: Handling Missing Keys With the Python default ... 33224
Match: Use Sentiment Analysis With Python to Classif ... 158400
Coincidence: The Real Python Podcast - Episode # 34: The Py ... 29576

Notice how the code prints the message “Getting an article from the server…” the first time you access the corresponding articles. After that, depending on the speed of the network, the script will fetch articles from the cache several times before going back to the server.

Conclusion

Caching is an important optimization technique that improves the performance of any software system. Understanding how caching works is a fundamental step towards effectively incorporating it into your code.

In this tutorial, we briefly covered:

what are the caching strategies;
how LRU caching works in Python;
how to use the decorator; @lru_cache
how the recursive approach in combination with caching helps to solve the problem quickly enough.
The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Shikha Gupta

My full name is Shikha Gupta , pursuing B.tech in computer science from Banasthali vidhyapeeth Rajasthan.
I am from East Champaran in Bihar.
My area of interest ,Deep learning,NLP,Java,Data Structure,DBMS and many more.

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.6

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

Reading list

Caching in Python: The LRU Algorithm

Table of contents

Python caching: what’s the use

Implementing caching in Python with a dictionary

OUTPUT

Caching strategies

Diving into the idea of LRU caching

Looking Behind the Scenes of LRU Cache

Using @lru_cache to Implement LRU Cache in Python

A visual representation of the algorithm: jumping over the steps

Timing the execution time of the program code

Using memoisation to improve the solution

Other @lru_cache features

Add cache expiration

Criteria for excluding entries from the cache

Conclusion

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Become an Author

Flagship Programs

Free Courses

Popular Categories

Generative AI Tools and Techniques

Popular GenAI Models

AI Development Frameworks

Data Science Tools and Techniques

Reading list

Intoduction to Python

Variables and data types

OOPs Concepts

Conditional statement

Looping Constructs

Data Structures

String Manipulation

Functions

Modules, Packages and Standard Libraries

Python Libraries for Data Science

Reading Data Files in Python

Preprocessing, Subsetting and Modifying Pandas Dataframes

Sorting and Aggregating Data in Pandas

Visualizing Patterns and Trends in Data

Programming

Caching in Python: The LRU Algorithm

Table of contents

Python caching: what’s the use

Implementing caching in Python with a dictionary

OUTPUT

Caching strategies

Diving into the idea of ​​LRU caching

Looking Behind the Scenes of LRU Cache

Using @lru_cache to Implement LRU Cache in Python

A visual representation of the algorithm: jumping over the steps

Timing the execution time of the program code

Using memoisation to improve the solution

Other @lru_cache features

Add cache expiration

Criteria for excluding entries from the cache

Conclusion

Login to continue reading and enjoy expert-curated content.

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Become an Author

Flagship Programs

Free Courses

Popular Categories

Generative AI Tools and Techniques

Popular GenAI Models

AI Development Frameworks

Data Science Tools and Techniques

Diving into the idea of LRU caching