Everything You Should Know About Data Structures in Python

Aniruddha Bhandari 28 Aug, 2023 • 14 min read

Introduction

A Data Structure sounds like a very straightforward topic, yet many data science and analytics newcomers have no idea what it is. When I quiz these folks about the different data structures in Python and how they work, I’m met with a blank stare. Not good!

Python is an easy programming language to learn, but we need to get our basics clear before diving into the attractive machine learning coding bits. That’s because behind every data exploration task we perform and every analytics step we take, there is a basic element of storage and organization of the data.

And this is a no-brainer – extracting information when we store our data efficiently is so much easier. We save ourselves a ton of time thanks to our code running faster – who wouldn’t want that?

And that’s why I implore you to learn about data structures in Python.

Python Data Structures

In this article, we will explore the basic in-built data structures in Python that will come in handy when dealing with data in the real world. So whether you’re a data scientist or an analyst, this article is equally relevant.

If you’re new to this awesome programming language, go through our comprehensive FREE Python course.

What Are Data Structures?

Data structures are a way of storing and organizing data efficiently. This will allow you to access and perform operations on the data easily.

There is no one-size-fits-all kind of model when it comes to data structures. You will want to store data in different ways to cater to the needs of the hour. Maybe you want to store all types of data together, or you want something for faster searching of data, or maybe something that stores only distinct data items.

Luckily, Python has a host of in-built data structures that help us organize our data easily. It pays to get acquainted with these first so that, when dealing with data, we know exactly which data structure will serve our purpose.

Data Structure #1: Lists in Python

Lists in Python are the most versatile data structure. They are used to store heterogeneous data items, from integers to strings or even another list! They are also mutable, meaning their elements can be changed even after creating the list.

Creating Lists

Lists are created by enclosing elements within [square] brackets, and each item is separated by a comma:

Python Code:
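The original embedded example is not reproduced here, so the following is only a minimal sketch of list creation. The variable names and values are illustrative assumptions.

```python
# A list can hold mixed types: an int, a string, a float, and another list.
mixed = [1, "two", 3.0, [4, 5]]

# An empty list, and a list built from another iterable with list().
empty = []
letters = list("abc")

print(mixed)    # [1, 'two', 3.0, [4, 5]]
print(letters)  # ['a', 'b', 'c']
```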

Since each element in a list has its own distinct position, having duplicate values in a list is not a problem:

Duplicate values in Python list
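A small sketch of a list holding repeated values; the `votes` list is a made-up example.

```python
# Duplicates are fine because each element is identified by its position.
votes = ["yes", "no", "yes", "yes"]

print(votes)               # ['yes', 'no', 'yes', 'yes']
print(votes.count("yes"))  # 3
```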

Accessing List Elements

To access elements of a list, we use Indexing. Each element in a list has an index related to it depending on its position in the list. The first element of the list has the index 0, the next element has index 1, and so on. The last element of the list has an index of one less than the length of the list.

Indexing in Python lists

But indexes don’t always have to be positive; they can be negative too. What do you think negative indexes indicate?

While positive indexes return elements from the start of the list, negative indexes return values from the end of the list. This saves us from the trivial calculation we would have to perform otherwise if we wanted to return the nth element from the end of the list. So instead of trying to return List_name[len(List_name)-1] element, we can simply write List_name[-1].

Using negative indexes, we can return the nth element from the end of the list easily. The last element has index -1, the second-last element has index -2, and so on. Remember, index 0 still refers to the very first element in the list.

Indexing in Python Lists
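Positive and negative indexing can be sketched as follows. The `colors` list is an assumed example, not the one from the original figure.

```python
colors = ["red", "green", "blue", "yellow"]

print(colors[0])                # red    (first element)
print(colors[len(colors) - 1])  # yellow (last element, the long way)
print(colors[-1])               # yellow (last element)
print(colors[-2])               # blue   (second-last element)
```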

But what if we wanted to return a range of elements between two positions in the lists? This is called Slicing. All we have to do is specify the start and end index within which we want to return all the elements – List_name[start : end].

Python lists indexing

One important thing to remember is that the element at the end index is never included. Only elements from the start index to the index equaling end-1 will be returned.
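Here is a short sketch of slicing that shows the end index being excluded. The `numbers` list is illustrative.

```python
numbers = [10, 20, 30, 40, 50]

print(numbers[1:4])  # [20, 30, 40]  indexes 1, 2, 3; index 4 is excluded
print(numbers[:2])   # [10, 20]      start defaults to 0
print(numbers[3:])   # [40, 50]      end defaults to len(numbers)
```

Leaving out the start or the end is a handy shortcut when you want everything from the beginning or up to the end.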

Appending Values in Lists

We can add new elements to an existing list using the append() or insert() methods:

  • append() – Adds an element to the end of the list
  • insert() – Adds an element to a specific position in the list that needs to be specified along with the value
Adding elements to Lists
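The two methods above can be sketched like this, using a made-up `tasks` list:

```python
tasks = ["load", "clean"]

tasks.append("model")       # add to the end of the list
tasks.insert(1, "explore")  # insert at index 1; later items shift right

print(tasks)  # ['load', 'explore', 'clean', 'model']
```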

Removing Elements from Lists

Removing elements from a list is as easy as adding them and can be done using the remove() or pop() methods:

  • remove() – Removes the first occurrence from the list that matches the given value
  • pop() – This is used when we want to remove an element at a specified index from the list. However, if we don’t provide an index value, the last element will be removed from the list
Removing elements from Lists
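A minimal sketch of both removal methods; the `values` list and the variable names are assumptions for illustration.

```python
values = [5, 3, 8, 3, 1]

values.remove(3)        # removes only the first 3; values is now [5, 8, 3, 1]
last = values.pop()     # removes and returns the last element (1)
second = values.pop(1)  # removes and returns the element at index 1 (8)

print(values)  # [5, 3]
print(last)    # 1
print(second)  # 8
```

Note that pop() hands the removed element back to you, while remove() does not return it.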

Sorting Lists

You will often need to sort the elements of a list, so it is essential to know about the sort() method. It sorts the list elements in place, in ascending order by default or in descending order when you pass reverse=True:

Sorting lists
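A quick sketch of in-place sorting in both directions, with an assumed `scores` list:

```python
scores = [42, 7, 19]

scores.sort()              # ascending; modifies the list itself
print(scores)              # [7, 19, 42]

scores.sort(reverse=True)  # descending
print(scores)              # [42, 19, 7]
```

Because sort() works in place, it returns None. If you need a sorted copy and want to keep the original untouched, the built-in sorted() function does that.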

But where things get a bit tricky is when you want to sort a list containing string elements. How do you compare two strings? Well, strings are compared using the Unicode code points of their characters (for plain English text, these are the same as the ASCII values). Each character has an integer value associated with it, which you can inspect with ord(). We use these values to sort the strings.

On comparing two strings, we compare the integer values of each character from the beginning. If we encounter the same characters in both strings, we compare the next character until we find two differing characters. It is, of course, done internally, so you don’t have to worry about it!

Sorting Python Lists
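The character-by-character comparison can be seen in a small sketch. The `words` list is illustrative.

```python
words = ["banana", "Apple", "cherry", "apple"]
words.sort()

# Uppercase letters have smaller integer values than lowercase ones,
# so "Apple" sorts before "apple".
print(ord("A"), ord("a"))  # 65 97
print(words)               # ['Apple', 'apple', 'banana', 'cherry']
```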

Concatenating Lists

We can even concatenate two or more lists using the + symbol. This will return a new list containing elements from both the lists:

Concatenating Lists
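A minimal sketch of concatenation with two assumed lists, showing that the originals are left unchanged:

```python
first = [1, 2]
second = [3, 4]

combined = first + second  # builds a new list

print(combined)  # [1, 2, 3, 4]
print(first)     # [1, 2]
```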

List Comprehensions

A very interesting application of Lists is List Comprehension, which provides a neat way of creating new lists. These new lists are created by applying an operation on each element of an existing list. It will be easy to see their impact if we first check out how it can be done using the good old for-loops:

Python for-loops

Now, we will see how we can concisely perform this operation using list comprehensions:

List comprehensions
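To make the comparison concrete, here is a sketch that squares each number twice: once with a for-loop and once with a list comprehension. The squaring operation and the names are assumptions; the original figures may have used a different example.

```python
numbers = [1, 2, 3, 4]

# For-loop version: build the new list step by step.
squares_loop = []
for n in numbers:
    squares_loop.append(n ** 2)

# List comprehension version: the same result in one line.
squares_comp = [n ** 2 for n in numbers]

# A comprehension can also filter elements with an if clause.
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(squares_loop)  # [1, 4, 9, 16]
print(squares_comp)  # [1, 4, 9, 16]
print(even_squares)  # [4, 16]
```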

See the difference? List comprehensions are a useful asset for any data scientist because you have to write concise and readable code on a daily basis!

Stacks & Queues using Lists

A list is an in-built data structure in Python. But we can use it to create user-defined data structures. Two very popular user-defined data structures built using lists are Stacks and Queues.

Stacks are a list of elements in which elements are added or deleted from the end of the list. Think of it as a stack of books. You do it from the top whenever you need to add or remove a book from the stack. It uses the simple concept of Last-In-First-Out.

Stack

Queues, on the other hand, are a list of elements in which elements are added at the end of the list, but the deletion of elements takes place from the front of the list. You can think of it as a queue in the real world. The queue becomes shorter when people at the front exit the queue. The queue becomes longer when someone new joins the queue at the end. It uses the concept of First-In-First-Out.

Queue
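Both behaviors can be sketched with a plain list. The element values are made up for illustration.

```python
# Stack: push and pop both happen at the end of the list (Last-In-First-Out).
stack = []
stack.append("book 1")
stack.append("book 2")
top = stack.pop()      # the most recently added item comes off first

# Queue: add at the end, remove from the front (First-In-First-Out).
queue = []
queue.append("person 1")
queue.append("person 2")
served = queue.pop(0)  # the earliest added item leaves first

print(top)     # book 2
print(served)  # person 1
```

One caveat: pop(0) has to shift every remaining element one place to the left, so it gets slower as the list grows. For large queues, collections.deque is usually the better choice.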

Now, as a data scientist or an analyst, you might not be employing this concept every day, but knowing it will surely help you when you have to build your own algorithm!

Data Structure #2: Tuples in Python

Tuples are another very popular in-built data structure in Python. These are quite similar to Lists except for one difference – they are immutable. This means that no value can be added, deleted, or edited once a tuple is generated.

We will explore this further, but let’s first see how to create a Python Tuple!

Creating Tuples in Python

Tuples can be generated by writing values within (parentheses), and each element is separated by a comma. But even if you write many values without any parentheses and assign them to a variable, you will still have a tuple! Have a look for yourself:

Python Tuple
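A minimal sketch of the different ways to write a tuple. The `planets` tuple and the other names are assumed examples.

```python
planets = ("Mercury", "Venus", "Earth")  # with parentheses
numbers = 1, 2, 3                        # without parentheses: still a tuple

# A one-element tuple needs a trailing comma; without it you just get the value.
single = (42,)
not_a_tuple = (42)

print(type(planets).__name__)      # tuple
print(type(numbers).__name__)      # tuple
print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
```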

Now that we know how to create tuples, let’s talk about immutability.

Immutability of Tuples

An object that cannot be modified after creation is called immutable. Objects in Python can be divided into mutable and immutable ones.

Lists, dictionaries, and sets (we will explore these in further sections) are mutable objects, meaning they can be modified after creation. On the other hand, integers, floating values, boolean values, strings, and even tuples are immutable objects. But what makes them immutable?

Everything in Python is an object. So, we can use the in-built id() function, which returns an object’s identity: an integer that is unique to the object for its lifetime (in CPython, this is the object’s memory address). Let’s create a list and determine the location of the list and its elements:

Python mutable

As you can see, both the list and its element have different locations in memory. Since we know lists are mutable, we can alter the value of its elements. Let’s do that and see how it affects the location values:

Mutable in Python

The location of the list did not change, but that of the element did. The list is still the same object; the position we assigned to now simply refers to a different object. This is what is meant by mutable. A mutable object can change its state or contents after creation, but an immutable object cannot.
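The two steps described above can be reproduced with a short sketch. The actual id() numbers differ on every run, so the code compares identities instead of printing them. The list contents are assumptions.

```python
items = [10, 20, 30]

list_id_before = id(items)
elem_id_before = id(items[0])

items[0] = 99  # make the first position refer to a different object

list_id_after = id(items)
elem_id_after = id(items[0])

print(list_id_before == list_id_after)  # True: still the same list object
print(elem_id_before == elem_id_after)  # False: the position holds another object
```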

But we can call tuples pseudo-immutable because even though they are immutable, they can contain mutable objects whose values can be modified!

Tuple immutability
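A sketch of this behavior, using a made-up `record` tuple that holds a name and a list of scores:

```python
record = ("Alice", [90, 85])

# The list inside the tuple is mutable, so it can be changed in place.
record[1].append(77)
print(record)  # ('Alice', [90, 85, 77])

# The tuple itself still refuses to have an element reassigned.
reassignment_blocked = False
try:
    record[1] = [0]
except TypeError:
    reassignment_blocked = True

print(reassignment_blocked)  # True
```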

As you can see from the example above, we could change the values of a mutable object, a list, contained within a tuple.

Tuple Assignment

Tuple packing and unpacking are useful operations that let you group several values into a tuple, or assign the elements of a tuple to several variables, in a single line.

We already saw tuple packing when we made our planet tuple. Tuple unpacking is just the opposite: assigning values to variables from a tuple:

Tuple packing and unpacking
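Packing and unpacking can be sketched as follows; the `point` tuple and the variable names are illustrative.

```python
point = 3, 4, 5  # packing: three values become one tuple
x, y, z = point  # unpacking: one tuple becomes three variables

print(x, y, z)   # 3 4 5

# A starred name collects whatever is left over into a list.
first, *rest = point
print(first, rest)  # 3 [4, 5]
```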

It is handy for swapping values in a single line. Honestly, this was one of the first things that got me excited about Python: being able to do so much with so little code!

Python swapping
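The one-line swap looks like this, with two assumed variables:

```python
a = 1
b = 2

# The right-hand side is packed into a tuple, then unpacked into a and b.
a, b = b, a

print(a, b)  # 2 1
```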

Changing Tuple Values

Although I said that tuple values cannot be changed, you can work around this by converting the tuple to a list using list(). When you are done making the changes, you can convert it back to a tuple using tuple().

Altering values
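A minimal sketch of the round trip, using a made-up `point` tuple:

```python
point = (1, 2, 3)

temp = list(point)   # copy the tuple into a list
temp[0] = 100        # lists can be edited
point = tuple(temp)  # build a new tuple from the edited list

print(point)  # (100, 2, 3)
```

Strictly speaking, the original tuple was never modified; the name `point` now refers to a brand new tuple.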

This change, however, is expensive as it involves making a copy of the tuple. But tuples come in handy when you don’t want others to change the content of the data structure.

Data Structure #3: Dictionary in Python

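A dictionary is another Python data structure that stores heterogeneous objects. Dictionaries are mutable, although their keys must be hashable (immutable types such as strings, numbers, and tuples are typical). Before Python 3.7, dictionaries were unordered, meaning that when you accessed the elements, they might not be in the same order in which you inserted them. From Python 3.7 onwards, dictionaries preserve insertion order.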

But what sets dictionaries apart from lists is how elements are stored. Elements in a dictionary are accessed via their key values instead of their index, as we did in a list. So, dictionaries contain key-value pairs instead of just single elements.

Generating Dictionary

Dictionaries are generated by writing keys and values within { curly } brackets, with each key separated from its value by a colon. Each key-value pair is separated by a comma:

Python Dictionary

Using the key of the item, we can easily extract the associated value of the item:

Dictionary values
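A short sketch of creating a dictionary and reading values by key. The `prices` dictionary is an assumed example.

```python
prices = {"apple": 1.2, "banana": 0.5}

print(prices["apple"])       # 1.2
print(prices.get("cherry"))  # None (get() avoids a KeyError for missing keys)

prices["cherry"] = 3.0       # assigning to a new key adds a pair
print(len(prices))           # 3
```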

Keys are unique. If you write a dictionary in which the same key appears more than once, no error is raised; the key simply keeps the last value assigned to it:

Dictionary multiple values
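A sketch of a repeated key, using a made-up `config` dictionary:

```python
# "mode" appears twice; the later value silently replaces the earlier one.
config = {"mode": "fast", "retries": 3, "mode": "safe"}

print(config)       # {'mode': 'safe', 'retries': 3}
print(len(config))  # 2
```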

Dictionaries are handy for accessing items quickly because, unlike a search through a list or tuple, a dictionary does not have to iterate over all the items to find a value. It computes a hash of the key and uses that to locate the value directly. This concept is called hashing.

Accessing Keys and Values

You can access the keys from a dictionary using the keys() method and the values using the values() method. These we can view using a for-loop or turn them into a list using list():

Dictionary accessing values

We can even access these values simultaneously using the items() method, which returns the respective key and value pair for each element of the dictionary.

Dictionary items
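The three methods can be sketched together. The `stock` dictionary is illustrative.

```python
stock = {"pens": 12, "books": 4}

print(list(stock.keys()))    # ['pens', 'books']
print(list(stock.values()))  # [12, 4]

# items() yields (key, value) pairs, which unpack neatly in a for-loop.
for item, count in stock.items():
    print(item, count)

pairs = list(stock.items())
print(pairs)                 # [('pens', 12), ('books', 4)]
```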

Data Structure #4: Sets in Python

Sometimes, you don’t want multiple occurrences of the same element in your list or tuple. It is here that you can use a set data structure. A Set is an unordered but mutable collection of elements that contains only unique values.

Python Sets
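A minimal sketch of set creation that shows duplicates being dropped. The values and names are assumptions.

```python
readings = {3, 1, 3, 2, 1}
print(len(readings))  # 3 (each value is kept only once)

# set() builds a set from any iterable, which is a quick way to deduplicate.
unique_names = set(["ann", "bob", "ann"])
print(len(unique_names))  # 2

# Careful: {} creates an empty dictionary, so use set() for an empty set.
empty_set = set()
print(type({}).__name__)  # dict
```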

You will see that the values are not necessarily in the same order as entered in the set. This is because sets are unordered.

Add and Remove Elements from a Set

To add values to a set, use the add() method. It lets you add any hashable value, which rules out mutable objects such as lists, dictionaries, and other sets:

Set add elements
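A sketch of add() with an assumed `tags` set, including what happens with a mutable value:

```python
tags = {"python", "data"}

tags.add("sets")
tags.add("python")    # already present, so nothing changes
tags.add(("a", "b"))  # a tuple is allowed

# A list is mutable, so adding it raises a TypeError.
list_rejected = False
try:
    tags.add(["x", "y"])
except TypeError:
    list_rejected = True

print(len(tags))      # 4
print(list_rejected)  # True
```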

To remove values from a set, you have two options to choose from:

  • The first is the remove() method, which gives an error if the element is not present in the Set
  • The second is the discard() method, which removes elements but gives no error when the element is not present in the Set
Set remove

If the value does not exist, remove() will give an error, but discard() won’t.

Set discard
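The difference between the two methods can be sketched with a made-up `letters` set:

```python
letters = {"a", "b", "c"}

letters.remove("a")   # present, so it is removed
letters.discard("b")  # present, so it is removed
letters.discard("z")  # absent, and discard() stays silent

# remove() raises a KeyError when the value is absent.
remove_failed = False
try:
    letters.remove("z")
except KeyError:
    remove_failed = True

print(letters)        # {'c'}
print(remove_failed)  # True
```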

Set Operations

Using Python Sets, you can perform operations like union, intersection, and difference between two sets, just like you would in mathematics.

The Union of two sets gives values from both sets. But the values are unique. So if both the sets contain the same value, only one copy will be returned:

Sets Union

The Intersection of two sets returns only those values that are common to both sets:

Sets Intersection

The Difference of a set and another gives only those values that are present in the first set but not in the second:

Sets Difference
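All three operations can be sketched with two assumed sets. The results are passed through sorted() only to make the printed order predictable.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(sorted(a | b))  # [1, 2, 3, 4, 5]  union
print(sorted(a & b))  # [3, 4]           intersection
print(sorted(a - b))  # [1, 2]           in a but not in b
print(sorted(b - a))  # [5]              in b but not in a
```

The same operations are also available as methods: a.union(b), a.intersection(b), and a.difference(b). Note that difference depends on the order of the operands.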

User-Defined Data Structures

User-defined data structures are data structures that the programmer creates to meet specific requirements. They are not built into the programming language; the programmer designs and implements them to store and organize data in a way that suits the application and the problem at hand. Let’s look at some common ones in Python. Note that a few entries below, such as lists and hashmaps (dictionaries), are actually built into Python and are included here for comparison.

Arrays

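Arrays are a fundamental data structure that stores elements of the same data type in contiguous memory locations. In many languages they have a fixed size, and they provide constant-time access to elements by index. Python has no array literal of its own; the standard library’s array module and NumPy provide typed arrays, and the sample below uses a plain list to show the same indexing behavior.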

Sample Code:

# Using a Python list to stand in for an array
numbers = [10, 20, 30, 40, 50]

# Accessing elements of an array
print(numbers[2]) # Output: 30

# Modifying an element
numbers[1] = 25
print(numbers) # Output: [10, 25, 30, 40, 50]

Lists

Lists, also known as dynamic arrays, are similar to arrays but can grow or shrink in size dynamically. They’re implemented using arrays and provide more flexibility.

Sample Code:

# Creating a list in Python
names = ["Alice", "Bob", "Charlie"]

# Adding an element to the end of the list
names.append("David")
print(names) # Output: ["Alice", "Bob", "Charlie", "David"]

# Removing an element from the list
names.remove("Bob")
print(names) # Output: ["Alice", "Charlie", "David"]

Stack

A stack is a linear data structure that follows the Last In First Out (LIFO) principle. Elements are added and removed from the top of the stack.

Sample Code:

# Implementing a stack using Python's list
stack = []

# Pushing elements onto the stack
stack.append(10)
stack.append(20)
stack.append(30)

# Popping elements from the stack
print(stack.pop()) # Output: 30
print(stack.pop()) # Output: 20

Queue

A queue is a linear data structure that follows the First In First Out (FIFO) principle. Elements are added at the rear and removed from the front.

Sample Code:

# Implementing a queue using Python's collections module
from collections import deque

queue = deque()

# Enqueue elements
queue.append(5)
queue.append(10)
queue.append(15)

# Dequeue elements
print(queue.popleft()) # Output: 5
print(queue.popleft()) # Output: 10

Trees

A tree is a hierarchical data structure consisting of nodes connected by edges. Each node has a parent (except the root) and zero or more children.

Sample Code:

# Defining a simple binary tree node
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

# Creating a binary tree
root = TreeNode(10)
root.left = TreeNode(5)
root.right = TreeNode(15)

Linked Lists

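A linked list is a linear data structure where each element (node) points to the next element. Nodes do not need to sit in contiguous memory, so inserting or removing a node does not require shifting other elements. The trade-off is that each node uses extra memory for its pointer, and reaching the nth element means walking the list from the head.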

Sample Code:

# Defining a linked list node
class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None

# Creating a linked list
head = ListNode(10)
head.next = ListNode(20)
head.next.next = ListNode(30)

Graphs

A graph is a collection of nodes (vertices) connected by edges. Graphs can be directed (edges have a direction) or undirected.

Sample Code:

# Using Python's NetworkX library to create a simple undirected graph
import networkx as nx
import matplotlib.pyplot as plt

G = nx.Graph()
G.add_nodes_from([1, 2, 3])
G.add_edges_from([(1, 2), (2, 3)])

nx.draw(G, with_labels=True, font_weight='bold')
plt.show()

HashMaps (Dictionaries)

A hashmap (or dictionary) is a data structure that stores key-value pairs. It provides fast access to values using keys.

Sample Code:

# Creating a dictionary in Python
phonebook = {
    "Alice": "123-456-7890",
    "Bob": "987-654-3210",
    "Charlie": "555-123-4567"
}

# Accessing values using keys
print(phonebook["Alice"]) # Output: 123-456-7890

Conclusion

Isn’t Python a beautiful language? It provides you with many different options to handle your data more efficiently. Learning about data structures in Python is a key aspect of your own learning journey. This article should serve as a good introduction to the in-built data structures in Python. If it got you interested in Python, and you are itching to know more about it in detail and how to use it in your everyday data science or analytics work, I recommend continuing with further Python articles and courses.

Frequently Asked Questions

Q1. What are the Python data structures?

Ans. The built-in data structures in Python include lists, tuples, sets, and dictionaries. Apart from these, programmers commonly build or import other structures such as arrays, stacks, queues, trees, linked lists, graphs, and hashmaps.

Q2. What are the three types of data structures in Python?

Ans. Lists, tuples, and dictionaries are the three most commonly used data structures in Python. Sets are the fourth built-in collection type.

Q3. What are the 4 different data types in Python?

Ans. Four commonly used built-in data types in Python are Integer (int), Float (float), String (str), and Boolean (bool). Python has other built-in types as well, such as complex, bytes, and NoneType.

Q4. What are the 2 main types of data structures?

Ans. The 2 main types of data structures are primitive data structures and non-primitive or composite data structures.

