An Overview of Python Memory Management

Mayank Rijhwani 23 Jan, 2024 • 8 min read

This article was published as a part of the Data Science Blogathon

1. Introduction:

Python is a high-level, interpreted, general-purpose programming language. You may be wondering, “Do I really need to manage memory in a high-level language like Python?” The short answer is no: Python takes care of memory management for you. Even so, you should understand how variables and objects are managed internally. Knowing how chunks of memory are allocated, reused, and deallocated for Python objects helps you write more efficient code and diagnose programs that consume more memory than expected.

Python’s memory management is a large part of what makes the language popular and adaptable. How so? The memory manager is designed to support many conveniences that make our lives easier, and dynamic typing is the best example. Python lets you create a variable without any type information, and you can later assign it another object regardless of the new object’s size or type.

Have you ever wondered how this is possible and how Python handles it internally? Let’s dive in.

2. Python does not have variables; instead, it has ‘names’:

If you are coming from C/C++ or Java, you know that you must declare a variable with its type before you can use it. Based on the declared type, a fixed-size block of memory is reserved, and the value is stored in that block on assignment. Assigning a new value overwrites the contents of the same block, and a value that is too large for the block may overflow. In Python, you never declare the type of a variable. The interpreter allocates memory at run time based on the object being assigned and manages that memory automatically. Because each object gets as much memory as its value needs (Python integers, for example, grow to arbitrary size), assigning a larger value does not overflow a fixed-size slot.
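As a small illustration of the last point, the sketch below compares a small and a very large integer. It assumes CPython, where sys.getsizeof() reports an object's size in bytes; the names small and large are only illustrative.

```python
import sys

small = 1
large = 2 ** 100  # far beyond the range of a 64-bit integer

# The object grows with its value instead of overflowing a fixed-size slot.
print(sys.getsizeof(small), sys.getsizeof(large))

# Arithmetic stays exact at any magnitude.
print(large + 1 - large)  # 1
```

The exact byte counts depend on the Python version and platform, so only the comparison between the two numbers is meaningful.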

Python works differently. Technically, it does not have ‘variables’ in that sense; instead, it uses ‘names’. Please note that –

· A Python ‘name’ is just a label for an object and is used to refer to its value.

· We never specify ‘type’ information when creating an object.

· A Python ‘name’ can be rebound to an object of a different type.

· A single Python object can have many names.

· Two names point to the same object if the id() function returns the same value for both.
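The points above can be checked with a few lines of code. This is only a minimal sketch; the names x and y are arbitrary.

```python
x = [1, 2, 3]   # bind the name x to a list object
y = x           # y becomes a second name for the same object

print(id(x) == id(y))  # True
print(x is y)          # True

x = "hello"     # rebind x to an object of a different type
print(type(x).__name__)  # str
print(y)                 # [1, 2, 3]
```

The `is` operator compares identities directly, so it is usually preferred over comparing id() values by hand.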

3. Values are not updated; instead, the name points to a new object:

Suppose two names, ‘x’ and ‘y’, point to the same object. What happens to ‘y’ if we change the value of ‘x’? Will ‘y’ return the updated value? Absolutely not! For immutable objects such as numbers and strings, assigning a new value to ‘x’ does not modify the original object; it rebinds ‘x’ to a different object, while ‘y’ keeps pointing to the original one. For some values, such as small integers and certain strings, CPython already holds a cached object and ‘x’ simply starts pointing to it; otherwise, a new object with the updated value is created.

Reusing cached objects in this way saves memory when the same small value is needed in many places. It is an implementation detail of CPython, so code should not rely on two equal values being the same object.
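The rebinding behaviour can be observed directly. The sketch below uses plain integers; the last three lines depend on CPython's small-integer cache, which is an implementation detail, so no particular result is promised for them.

```python
x = 10
y = x            # both names refer to the same int object
before = id(x)

x = x + 1        # x is rebound to the object for 11; the object 10 is untouched

print(y)                # 10
print(id(x) == before)  # False

# CPython implementation detail: small integers are cached, so equal values
# obtained independently are typically the same object. Do not rely on this.
a = int("200")
b = int("200")
print(a is b)
```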

4. So, where is the ‘type’?

Unlike in C/C++ or Java, a Python name is neither tied to a specific memory location nor fixed to a ‘type’. As we have seen, a name starts referring to another object as soon as a new value is assigned to it. The ‘type’ information lives in the object, so a name effectively takes on the type of whatever object it currently refers to. We can get that type by calling the type(x) function.
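A short sketch of this: the same name is bound to three different objects in turn, and type() reports the type of the current one. The list observed is only there to record the results.

```python
observed = []

for value in (200, "two hundred", [200]):
    x = value                          # rebind the same name each time
    observed.append(type(x).__name__)  # the type comes from the object

print(observed)  # ['int', 'str', 'list']
```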

5. How is memory allocated to new objects?

Python uses dynamic memory allocation (DMA): objects are created at run time on a heap. All Python objects live in a private heap that is managed by the Python memory manager, so you have no direct control over it. The memory manager handles allocation, and a garbage collector automatically reclaims the memory of objects that are no longer in use. Let us look at DMA in more detail and compare it with static memory allocation (SMA) –

Static memory allocation | Dynamic memory allocation
Memory is allocated at compile time. | Memory is allocated at run time, while the program executes.
It is a faster way of allocating memory. | It is a slower way of allocating memory.
Once static memory is allocated, its size cannot be changed and it cannot be reused. Hence, it is less efficient. | The memory size can be changed after allocation, and the memory can be reused. Hence, it is more efficient.
Variables are allocated permanently, and the allocated memory remains blocked until the program terminates. | Variables are allocated only when their program unit becomes active, and the memory is released when they go out of scope.
Uses the stack for memory management. | Uses the heap for memory management.


6. A Python Object:

Python is an object-oriented language, and everything in Python is an object. In CPython, every object is built on the ‘PyObject’ structure. Conceptually, each object carries the following three pieces of information –

· type

· ref-count

· value

The reference implementation of Python is CPython, an interpreter written in C. When we create a Python object (X = 200), internally –

· CPython creates a PyObject in memory.

· The type of the PyObject is set to integer.

· The value of the PyObject is set to 200.

· A name ‘X’ is created and set to point to the PyObject.

· The PyObject’s ref-count is set to 1.
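The steps above can be mimicked with a toy model. Everything in this sketch (ToyObject, namespace, bind) is hypothetical and exists only for illustration; it is not CPython's real data layout or API.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyObject:
    """Hypothetical stand-in for a PyObject; not CPython's real layout."""
    type_name: str
    value: Any
    ref_count: int = 0


namespace = {}  # maps names to the objects they refer to


def bind(name, obj):
    """Point a name at an object and adjust the toy reference counts."""
    old = namespace.get(name)
    if old is not None:
        old.ref_count -= 1   # the name no longer refers to the old object
    namespace[name] = obj
    obj.ref_count += 1


# Model of the statement: X = 200
obj = ToyObject(type_name=type(200).__name__, value=200)
bind("X", obj)
print(obj)  # ToyObject(type_name='int', value=200, ref_count=1)
```

In real CPython, small integers such as 200 are shared across the interpreter, so their actual reference count is much higher than 1; the model ignores that detail.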

Python objects can be divided into three categories –

· Simple objects (Integer, Float, Boolean, String, etc.)

· Container objects (Tuple, List, Set, Dictionary, etc.)

· User-defined custom classes (Employee class etc.)

7. How objects are removed from memory: Garbage Collection

We have seen that a Python ‘name’ can start pointing to another object (new or existing, of the same or a different type) when we assign a new value to it. This raises a crucial question: what happens to the older object that the ‘name’ used to refer to? Will it still be available in memory? The answer is: it may or may not be. Whether the older object survives depends on whether anything else still refers to it, which Python tracks through reference counts and its garbage collection mechanisms.

Python allocates memory for objects dynamically, as the program creates them. Similarly, Python deallocates the memory occupied by unused objects through ‘garbage collection’. When no references to an object remain, the object can be safely removed from memory. Python uses the following two algorithms to perform garbage collection –

· Reference Counting

· Tracing

7.1. Reference Counting:

Python keeps track of all the references (names, container slots, and so on) that currently point to a particular object. The total number of references to an object is called its ‘reference count’, and Python stores this count in the ‘ref-count’ field of the object (PyObject).

Please note –

· The reference count of an object can increase or decrease dynamically.

· We can call sys.getrefcount(X) to get the current ref-count value of an object ‘X’.

· Passing X to the getrefcount() function temporarily adds one extra reference to it, so the reported value is generally one higher than you might expect.

Every time a new reference to a Python object is created, its reference count is increased. Similarly, every time a reference to a Python object is removed, its reference count is decreased. When the reference count reaches 0, the object can be safely removed from memory.
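The counting can be observed with sys.getrefcount(). The sketch below looks only at differences between readings, because the absolute numbers include the temporary reference mentioned above and vary between CPython versions. The names data and alias are illustrative.

```python
import sys

data = ["payload"]              # one name refers to this list
base = sys.getrefcount(data)    # baseline reading

alias = data                    # create a second reference
after_alias = sys.getrefcount(data)

del alias                       # remove it again
after_del = sys.getrefcount(data)

print(after_alias - base)  # 1
print(after_del - base)    # 0
```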

7.1.1 What makes a reference?

Below are some cases in which a reference to a Python object (a new or existing PyObject) is created and its ref-count is increased by 1 –

· Binding a new object to a name:

x = 200

· Re-using an already available object and giving a new reference:

y = 200

· Adding an object to a container:

z = [200, 200]

· Passing it to a function:

my_fun(200)
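The first three cases can be measured on a fresh object, which avoids the shared small integer 200 used in the listing. This is a sketch that assumes CPython; the names obj, other, and box are illustrative. The function-call case is left out because the exact count seen during a call differs between CPython versions.

```python
import sys

obj = object()                  # binding a new object to a name
base = sys.getrefcount(obj)

other = obj                     # a second name for the same object
with_name = sys.getrefcount(obj) - base

box = [obj, obj]                # each container slot holds its own reference
with_box = sys.getrefcount(obj) - base

print(with_name)  # 1
print(with_box)   # 3
```

Note that a list containing the object twice adds two references, one per slot.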

7.1.2 What removes a reference?

The following are cases in which a reference to an object (an existing PyObject) is removed and its ref-count is decreased by 1 –

· Assigning a new object to a name:

x = True

· Removing the reference or its container:

del y

· When a variable goes out of scope:

A local variable is created when its function (or method) starts executing and holds a reference to its object for the duration of the call. Once the function finishes executing, the variable goes ‘out of scope’ and the reference it held is removed.

Please note that the global namespace never goes out of scope; it stays alive until the program completes its execution.
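All three cases can be observed in one short sketch. It assumes CPython, where an object is freed as soon as its count reaches 0. The Resource class and the use_local() function are hypothetical helpers; a weak reference is used as a probe because it does not keep its target alive.

```python
import sys
import weakref


class Resource:
    """Hypothetical class; instances can be weakly referenced."""


first = Resource()
x = first
y = first
base = sys.getrefcount(first)

x = True                                      # rebinding x removes one reference
after_rebind = sys.getrefcount(first) - base

del y                                         # deleting a name removes another
after_del = sys.getrefcount(first) - base


def use_local():
    local = Resource()          # referenced only by a local name
    return weakref.ref(local)   # the probe does not count as a reference


probe = use_local()

print(after_rebind, after_del)  # -1 -2
print(probe() is None)          # True: freed when the function returned
```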

7.1.3 Cascading effect:

If a removed object O1 was pointing to another object O2, the reference count of O2 is also decreased by 1, and if the ref-count of O2 reaches 0, O2 is removed from memory as well. In other words, one reference count reaching 0 can cause many objects to be cleared from memory. This is called the cascading effect in Python garbage collection.
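A minimal sketch of the cascade, assuming CPython: the hypothetical Node class records its name in the freed list when it is destroyed, so the effect of each del can be inspected.

```python
freed = []


class Node:
    """Hypothetical object that records its own destruction."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child

    def __del__(self):
        freed.append(self.name)


o2 = Node("O2")
o1 = Node("O1", child=o2)

del o2                       # O2 survives: O1 still refers to it
after_first_del = list(freed)

del o1                       # O1 is freed, which drops the last reference to O2

print(after_first_del)       # []
print(sorted(freed))         # ['O1', 'O2']
```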

7.1.4 Reference counting has good, bad, and ugly sides:

Good: The good part of the reference counting algorithm is that –

· It is easy to implement.

· It immediately clears the memory once the ref-count reaches 0.

Bad: Reference counting also has costs –

· It has space overhead, as every object needs extra space to store its count.

· It also has execution overhead, as reference counts change at every assignment. Hence, a single assignment statement can trigger several operations internally.

Ugly: Reference counting sometimes shows serious issues, such as –

· It is not inherently thread-safe: problems arise when multiple threads try to update a reference count at the same time. CPython relies on its global interpreter lock (GIL) to keep these updates consistent.

· Reference counting does not detect cyclical references.

In the given diagram (Figure 1.10), three objects ‘x’, ‘y’, and ‘z’ form a cyclical reference, and another reference ‘A’ points to ‘y’. Once ‘A’ starts referring to a new object, ‘x’, ‘y’, and ‘z’ are no longer required in memory. However, as each of them is still referenced by another member of the cycle, their reference counts never reach zero, and reference counting alone will not remove them from memory. Hence, reference counting cannot handle this case.
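The situation in the diagram can be reproduced in a few lines. In this sketch the cyclic collector is switched off for the duration of the demo so that only reference counting is at work; the Node class and the probe are hypothetical helpers.

```python
import gc
import weakref


class Node:
    """Hypothetical node holding a single outgoing reference."""

    def __init__(self):
        self.next = None


was_enabled = gc.isenabled()
gc.disable()                  # rely on reference counting alone for this demo
try:
    x, y, z = Node(), Node(), Node()
    x.next, y.next, z.next = y, z, x   # x -> y -> z -> x
    a = y                              # the external reference 'A'
    probe = weakref.ref(y)             # observes y without keeping it alive

    del x, y, z
    a = None                           # 'A' now refers to something else
    still_alive = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(still_alive)  # True: the cycle keeps every count above zero
```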

7.2. Tracing:

To handle this type of situation, CPython uses ‘tracing’ alongside reference counting: a cyclic garbage collector, exposed through the gc module, has been part of the language since Python 2.0. Conceptually, tracing works on the ‘mark and sweep’ principle, which is performed in two phases –
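The collector can also be run on demand with gc.collect(). The sketch below builds a two-object cycle and checks that it is gone after a collection; Node and probe are hypothetical helpers.

```python
import gc
import weakref


class Node:
    """Hypothetical node that can refer to a partner node."""

    def __init__(self):
        self.partner = None


left, right = Node(), Node()
left.partner, right.partner = right, left   # a two-object cycle
probe = weakref.ref(left)

del left, right         # unreachable now, but the ref-counts are still non-zero
found = gc.collect()    # run a full collection right away

print(found)            # number of unreachable objects the collector found
print(probe() is None)  # True
```

The value of found varies with the Python version and with whatever other garbage happened to be pending, so it is printed without an expected value.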

7.2.1 Mark Phase:

When the number of objects in memory reaches a threshold value, the collector starts marking all reachable objects by following references through the reference graph. For example, nodes r, 1, 4, 6, 7, and 8 are marked as reachable during the mark phase.
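In CPython, the thresholds mentioned above can be inspected through the gc module. This is only a sketch; the values printed depend on the Python version and on what the program has allocated so far.

```python
import gc

thresholds = gc.get_threshold()  # collection thresholds, one per generation
counts = gc.get_count()          # current collection counts

print(thresholds)
print(counts)
```

If needed, gc.set_threshold() changes these values, although the defaults are adequate for most programs.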

7.2.2 Sweep Phase:

Once all reachable objects have been identified, all remaining objects are removed from memory in the sweep phase. For example, nodes 2, 3, and 5 are removed during the sweep phase.
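The two phases can be sketched on a small graph. The edges below are an assumption: the text names only which nodes are reachable, so this is just one layout that is consistent with it. It is a toy model of the mark-and-sweep idea, not CPython's actual collector.

```python
# Hypothetical edges: one possible graph matching the node sets in the text.
graph = {
    "r": ["1", "4"],
    "1": ["6"],
    "4": ["7"],
    "6": [],
    "7": ["8"],
    "8": [],
    "2": ["3"],
    "3": ["5"],
    "5": ["2"],   # 2 -> 3 -> 5 -> 2 is a cycle that nothing reachable points to
}


def mark(graph, root):
    """Mark phase: collect every node reachable from the root."""
    marked = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node not in marked:
            marked.add(node)
            stack.extend(graph[node])
    return marked


def sweep(graph, marked):
    """Sweep phase: keep only the nodes that were marked."""
    return {node: edges for node, edges in graph.items() if node in marked}


reachable = mark(graph, "r")
heap_after = sweep(graph, reachable)
removed = set(graph) - reachable

print(sorted(reachable))  # ['1', '4', '6', '7', '8', 'r']
print(sorted(removed))    # ['2', '3', '5']
```

Nodes 2, 3, and 5 refer to one another, so reference counting would never free them; the mark phase ignores them because nothing reachable from r points to them.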

8. Conclusion:

In this article, we explored Python memory management and how Python internally manages its objects. We also examined the mechanisms Python uses for garbage collection. Most garbage collection is carried out through reference counting, over which programmers have limited control. Memory management adds execution overhead that can make Python slower than some other languages, yet Python remains one of the most widely used languages because of its extensive features, which are highly beneficial to programmers.

Frequently Asked Questions

Q1. How is memory managed in Python?

Python manages memory through dynamic memory allocation (DMA) and a garbage collector. The private heap stores objects, and the garbage collector automatically frees up memory by reclaiming unreferenced objects.

Q2. Does Python automatically free memory?

Yes, Python automatically frees memory using its garbage collection mechanism, which identifies and releases memory associated with unreferenced objects. Developers typically don’t need to manually free memory in Python.


The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.
