# An Intuitive and Easy Guide to Python Sets: A Must for Becoming a Data Science Professional

This article was published as a part of the Data Science Blogathon

**Introduction**

In almost every domain today, storing and organizing data is a central concern. Python provides several data structures for organizing your data, so learning them is an essential part of learning Python, whether you approach it from a **Software Engineering** perspective or a **Data Science** perspective. Some of these data structures are mutable and some are immutable. In this article, our main focus is on sets.


Now, a question comes to mind: **“When do we use sets in Python?”** Sets are used in Python when:

- The order of data does not matter.
- We do not want any repetitions in the data elements.
- We have to perform mathematical operations such as union, intersection, etc.

**So, without any further delay, let’s get started! 😎**

**Table of Contents**

**1.** Introduction: What is a Python Set?

**2.** How do you create a set in Python?

**3.** Basic Functionalities on Python sets

- Finding the length of a set
- Accessing the elements of a set
- Adding the elements to an existing set
- Removing elements from a set or deleting a set completely

**4.** Mathematical Operations on Python sets

- Union of sets
- Intersection of sets
- Difference of sets
- Symmetric Difference of sets

**5.** Alternative Container: Frozen Set

- Create a Frozen Set in Python
- Accessing the elements of a Frozen set


**What is a Python set?**

A set is a mutable (changeable), unordered collection of unique elements, i.e., it holds no repeated copies of an element. The elements of a set can be of mixed data types, unlike arrays, which are type-specific, as long as each element is hashable (for example, a list cannot be an element of a set). The values of a set are unindexed, so indexing operations cannot be performed on sets.

Sets are written with curly brackets (**{}**), with the elements separated by commas.

**How do you create a set in Python?**

We can create the sets in Python using either of the following two ways:

- Enclosing elements within **curly braces ({})**
- By using the **set()** function

**Using curly braces**

Sets in Python can be created with the help of **curly braces({})**.

**For Example-**

**Python Code:**
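A minimal sketch of this approach, assuming the same four strings that appear in the output and in the later examples. The order printed on your machine may differ, because sets are unordered.

```python
# Create a set by enclosing comma-separated elements in curly braces.
# The four strings are assumed from the output shown for this example.
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
print(my_set)

# Duplicates written in the literal are silently dropped.
with_duplicates = {'Data Analyst', 'Data Analyst', 'Data Engineer'}
print(len(with_duplicates))  # 2
```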

__Output:__

{'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}

**Using set() function**

Sets in Python can also be created using the built-in function **set([iterable])**. This function takes an iterable (i.e., any type of sequence, collection, or iterator) as an argument and returns a set that contains the unique items from the input, i.e., duplicate values are removed.

**For Example-**

    # Create a set using the set() function
    my_set = set(['Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'])
    print(my_set)

    # Check the data type of my_set
    print(type(my_set))

__Output:__

    {'Data Scientist', 'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
    <class 'set'>

**Creating an Empty Python Set**

We can also create an empty set using the **set()** function.

**For Example-**

    # Creating an empty set using the set() function
    empty_set = set()
    print(empty_set)

__Output:__

set()
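One related point is worth a quick check: empty curly braces create a dictionary, not a set, which is why **set()** is needed for an empty set. A small sketch (the variable names are only illustrative):

```python
empty_braces = {}   # empty curly braces give a dict, not a set
empty_set = set()   # set() gives an empty set

print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>
```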

**Basic Functionalities on Sets in Python**

We can perform a number of operations on sets, such as adding elements, deleting elements, finding the length of a set, etc. To see which methods are available on sets, we can use the **dir()** function. Let’s try it on a given set.

    # Check all functionalities which we can perform on a Python set
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    print(list(dir(my_set)))

__Output:__
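The full listing is long because it also includes special methods such as `__len__`. A small sketch that filters it down to the public method names, assuming names that start with an underscore are not of interest here:

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# Keep only the names that do not start with an underscore.
public_methods = [name for name in dir(my_set) if not name.startswith('_')]
print(public_methods)
```

The resulting list contains the methods used in the rest of this article, such as `add`, `remove`, and `union`.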

**1. Finding the Length of a Python Set**

To find the length of a set in Python, we use the **len()** function. This function takes the set as an argument and returns an integer equal to the number of elements present in the set.

**For Example-**

    # Finding the length of a set
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    print("The length of a given set is", len(my_set))

__Output:__

The length of a given set is 4

**2. Accessing the Elements of a Set**

We cannot access set elements by index number because, as mentioned earlier, the elements of a set are not indexed. To access the elements of a set, we can instead iterate over it with a **for loop**.

**For Example-**

    # Printing all the elements of a set using a for loop
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    for element in my_set:
        print(element)

__Output:__

    Data Scientist
    Data Engineer
    Data Analyst
    Analytics Vidhya
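Besides looping, two other read-only operations are often useful: testing membership with `in`, and calling `sorted()` when a repeatable order is needed. The sketch below is only an illustration and reuses the same four strings:

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

# Membership tests do not depend on element order.
print('Data Analyst' in my_set)      # True
print('Business Analyst' in my_set)  # False

# sorted() returns a new list, so the order is the same on every run.
for element in sorted(my_set):
    print(element)

# Indexing is not supported on sets and raises TypeError.
try:
    my_set[0]
except TypeError as error:
    print("TypeError:", error)
```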

**3. Adding the elements to an existing Set**

We can add new elements to a set using either of two methods:

- Adding a single element: the **add()** method
- Adding more than one element: the **update()** method

**For Example-**

    # Adding a single element 'Business Analyst' to an existing set
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.add('Business Analyst')
    print(my_set)

__Output:__

{'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}

**For Example-**

    # Adding more than one element ('Business Analyst' and 'Data Mastermind') to an existing set
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.update(['Business Analyst', 'Data Mastermind'])
    print(my_set)

__Output:__

{'Data Mastermind', 'Data Engineer', 'Data Analyst', 'Business Analyst', 'Data Scientist', 'Analytics Vidhya'}
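Two details of these methods are easy to check with a short sketch: adding an element that is already present changes nothing, and **update()** accepts any iterable (or several of them), not only a list. The values below are only illustrative.

```python
my_set = {'Analytics Vidhya', 'Data Scientist'}

# Adding an element that is already present leaves the set unchanged.
my_set.add('Data Scientist')
print(len(my_set))  # 2

# update() accepts one or more iterables: here a list and a tuple.
my_set.update(['Data Analyst'], ('Data Engineer', 'Data Analyst'))
print(len(my_set))  # 4

# A string is also an iterable, so update() adds its characters one by one.
letters = set()
letters.update('abc')
print(sorted(letters))  # ['a', 'b', 'c']
```

To add a whole string as one element, use `add()` or wrap the string in a list before passing it to `update()`.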

**4. Removing Elements from a Set or Deleting a Set Completely**

We can remove elements from a set, or delete a set completely, in the following ways:

- Using the **set.remove(x)** method
- Using the **set.discard(x)** method
- Using the **set.pop()** method
- Using the **set.clear()** method
- Using the **del** keyword

**The remove function**

The **set.remove(x)** method takes one parameter **x** and removes that element from the set. If the element does not exist in the set, the method raises an exception (**KeyError**).

In the example below, **‘Analytics Vidhya’** is removed from the set using remove(). But when we pass **‘Business Analyst’**, an element that does not exist in the set, remove() raises an error.

**For Example-**

    # Remove an element using the remove() method
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.remove('Analytics Vidhya')
    print(my_set)
    my_set.remove('Business Analyst')
    print(my_set)

__Output:__
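The second `remove()` call raises `KeyError`, so the final `print` is never reached. A minimal sketch that catches the exception, assuming we want the script to keep running:

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

my_set.remove('Analytics Vidhya')      # present, so it is removed
print(len(my_set))                     # 3

try:
    my_set.remove('Business Analyst')  # not present in the set
except KeyError as error:
    print("KeyError:", error)          # KeyError: 'Business Analyst'
```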

**The discard function**

The **set.discard(x)** method takes one parameter **x** and removes that element from the set if it is present. It is useful when we want to remove an element but are not sure whether it is actually present in the set. Unlike the **remove method**, the **discard method** does not raise an exception (**KeyError**) if the element to be removed does not exist.

**For Example-**

    # Remove an element using the discard() method
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.discard('Analytics Vidhya')
    print(my_set)
    my_set.discard('Business Analyst')
    print(my_set)

__Output:__

    {'Data Scientist', 'Data Engineer', 'Data Analyst'}
    {'Data Scientist', 'Data Engineer', 'Data Analyst'}

In the above example, you can see that **‘Analytics Vidhya’** has been removed from **my_set**, and that discard() did not raise an error for **my_set.discard(‘Business Analyst’)** even though **‘Business Analyst’** is not present in the set.

**The pop function**

The **set.pop()** method removes an element from the set and returns it, but since a set is unordered, we cannot know in advance which element will be removed.

In the example below, the **pop()** method removes an arbitrary element, which in this run is **‘Data Scientist’**.

**For Example-**

    # Remove an element using the pop() method
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.pop()
    print(my_set)

__Output:__

{'Data Engineer', 'Data Analyst', 'Analytics Vidhya'}
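Since `pop()` returns the element it removes, one way to find out which element was taken is to store the return value. A small sketch, which also shows what happens when the set is empty:

```python
my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}

removed = my_set.pop()     # which element comes out is arbitrary
print("Removed:", removed)
print(removed in my_set)   # False
print(len(my_set))         # 3

# pop() on an empty set raises KeyError.
try:
    set().pop()
except KeyError as error:
    print("KeyError:", error)
```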

**The clear function**

The **set.clear()** method deletes all the elements present in a given set.

**For Example-**

    # Remove all the elements using the clear() method
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    my_set.clear()
    print(my_set)

__Output:__

set()

In the above example, we can see that after the **clear()** operation, **my_set** becomes an empty set.

**The del Keyword**

When we want to completely delete the set, we can use the **del** keyword to do so.

**For Example-**

    # Delete the complete set
    my_set = {'Analytics Vidhya', 'Data Scientist', 'Data Analyst', 'Data Engineer'}
    del my_set
    print(my_set)

__Output:__

In the above example, running the code raises an error (**NameError**) at the print statement, because the name **my_set** no longer exists after the **del** statement.
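The same behavior can be sketched with a try/except so that the error message is printed instead of stopping the script. The `raised` flag is a hypothetical helper added only for illustration.

```python
my_set = {'Analytics Vidhya', 'Data Scientist'}
del my_set  # removes the name my_set, not just its elements

raised = False
try:
    print(my_set)
except NameError as error:
    raised = True
    print("NameError:", error)
```

In contrast, `clear()` keeps the name and leaves an empty set behind.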

**Mathematical operations on Python Sets**

We can use sets in Python to compute mathematical operations such as:

- **Union**
- **Intersection**
- **Difference**
- **Symmetric difference**

These operations can be represented with a diagram known as a **Venn diagram**. Venn diagrams are widely used in **Mathematics**, **Statistics**, and **Computer Science** to visualize the differences and similarities between sets.

Image Showing all the Mathematical Operations which we can perform on Sets


**1. Union of Sets**

The **union** of two sets A and B is defined as the set containing the elements that are in A, B, or both, and is denoted by **A ∪ B**.

Figure Showing the Union of Two Sets


To compute this operation with Python, we can use either of the following two ways:

- Using the **‘|’** operator
- Using the **set.union()** method

**Using the ‘|’ operator**

We can combine two sets in Python using the **‘|’** operator.

**For Example-**

    # Union of two sets using the '|' operator
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The union of two sets is given by\n", my_set_1 | my_set_2)

__Output:__

    The union of two sets is given by
     {1, 'Analytics Vidhya', 3.45, '1'}

**Using set.union() method**

To combine two or more sets, we can also use the **union()** method.

**For Example-**

    # Union of two sets using the set.union() method
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The union of two sets is given by\n", my_set_1.union(my_set_2))

__Output:__

    The union of two sets is given by
     {1, 3.45, 'Analytics Vidhya', '1'}
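The operator and the method differ in one practical respect, which the sketch below illustrates with small integer sets (chosen here only for readability): the method accepts any iterable, while the operator requires both operands to be sets.

```python
set_a = {1, 2}
set_b = {2, 3}
set_c = {3, 4}

# Both forms accept more than two sets.
print(sorted(set_a | set_b | set_c))      # [1, 2, 3, 4]
print(sorted(set_a.union(set_b, set_c)))  # [1, 2, 3, 4]

# The method accepts any iterable, for example a list.
print(sorted(set_a.union([5, 6])))        # [1, 2, 5, 6]

# The operator requires both operands to be sets.
try:
    set_a | [5, 6]
except TypeError as error:
    print("TypeError:", error)

# Neither form modifies the original set.
print(sorted(set_a))                      # [1, 2]
```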

**2. Intersection of Sets**

The **intersection** of two sets A and B is defined as the set that consists of the elements that are common to both sets and is denoted by **A ∩ B**.

Figure Showing the Intersection of Two Sets


In Python, we can compute the intersection of two sets using either of the following two ways:

- Using the **‘&’** operator
- Using the **set.intersection()** method

**Using the ‘&’ operator**

We can determine the intersection of two or more sets using the **‘&’** operator.

**For Example-**

    # Intersection of two sets using the '&' operator
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The intersection of two sets is given by\n", my_set_1 & my_set_2)

__Output:__

    The intersection of two sets is given by
     {1, 3.45, 'Analytics Vidhya'}

**Using set.intersection() method**

We can also determine the intersection of two or more sets using the **intersection()** function.

**For Example-**

    # Intersection of two sets using the set.intersection() method
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The intersection of two sets is given by\n", my_set_1.intersection(my_set_2))

__Output:__

    The intersection of two sets is given by
     {1, 3.45, 'Analytics Vidhya'}
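Both forms also work with more than two sets. A short sketch with illustrative integer sets, including the case where nothing is shared:

```python
set_a = {1, 2, 3, 4}
set_b = {2, 3, 4, 5}
set_c = {3, 4, 5, 6}

# Only elements present in all three sets survive.
print(sorted(set_a & set_b & set_c))             # [3, 4]
print(sorted(set_a.intersection(set_b, set_c)))  # [3, 4]

# If no element is shared, the result is the empty set.
print({1, 2} & {3, 4})                           # set()
```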

**3. Difference of Sets**

The **difference** between two sets A and B is defined as the set of all elements of A that are not contained in B, and is denoted by **A - B**. The operation is not symmetric: in general, A - B and B - A are different sets.

Figure Showing the Difference Between the Two Sets


To compute the difference between two sets in Python, we can use either of the following two ways:

- Using the **‘-’** operator
- Using the **set.difference()** method

**Using the ‘-‘ operator**

To find the difference between the two sets, we can use the **‘-’** operator.

**For Example-**

    # Difference of two sets using the '-' operator
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The difference of set 1 from set 2 is given by\n", my_set_1 - my_set_2)
    print("The difference of set 2 from set 1 is given by\n", my_set_2 - my_set_1)

__Output:__

    The difference of set 1 from set 2 is given by
     set()
    The difference of set 2 from set 1 is given by
     {'1'}

**Using set.difference() method**

The difference of sets can also be determined using the built-in **difference()** method.

**For Example-**

    # Difference of two sets using the set.difference() method
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The difference of set 1 from set 2 is given by\n", my_set_1.difference(my_set_2))
    print("The difference of set 2 from set 1 is given by\n", my_set_2.difference(my_set_1))

__Output:__

    The difference of set 1 from set 2 is given by
     set()
    The difference of set 2 from set 1 is given by
     {'1'}

**4. Symmetric difference of Sets**

The **symmetric difference** of two sets A and B is defined as the set of elements that are in either of the sets A and B, but not in both, and is denoted by **A △ B**.

Figure Showing the Symmetric Difference Between Two Sets


In Python, we can find the **symmetric difference** of two sets using either of the following two ways:

- Using the **‘^’** operator
- Using the **set.symmetric_difference()** method

**Using the ‘^‘ operator**

To find the symmetric difference between two sets, we can use the **‘^’** operator.

**For Example-**

    # Symmetric difference of two sets using the '^' operator
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The symmetric difference of two sets is given by\n", my_set_1 ^ my_set_2)
    print("The symmetric difference of two sets is given by\n", my_set_2 ^ my_set_1)

__Output:__

    The symmetric difference of two sets is given by
     {'1'}
    The symmetric difference of two sets is given by
     {'1'}

**Using set.symmetric_difference() method**

The symmetric difference of sets can also be determined using the built-in **symmetric_difference()** method.

**For Example-**

    # Symmetric difference of two sets using the set.symmetric_difference() method
    my_set_1 = {1, 'Analytics Vidhya', 3.45}
    my_set_2 = {'1', 3.45, 'Analytics Vidhya', 1}
    print("The symmetric difference of two sets is given by\n", my_set_1.symmetric_difference(my_set_2))
    print("The symmetric difference of two sets is given by\n", my_set_2.symmetric_difference(my_set_1))

__Output:__

    The symmetric difference of two sets is given by
     {'1'}
    The symmetric difference of two sets is given by
     {'1'}
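The symmetric difference can also be expressed through the other three operations, which is a handy way to check the definition. A minimal sketch with illustrative integer sets:

```python
set_a = {1, 2, 3}
set_b = {3, 4, 5}

sym_diff = set_a ^ set_b
print(sorted(sym_diff))                               # [1, 2, 4, 5]

# Same result built from union, intersection, and difference.
print(sym_diff == (set_a | set_b) - (set_a & set_b))  # True
print(sym_diff == (set_a - set_b) | (set_b - set_a))  # True
```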


**Alternative container: Frozenset**

Sometimes we create a set whose elements should never change. In that case, we can use a frozen set, which is discussed in this section.

A **frozenset** object is a Python set that cannot be modified once it has been created. This means that it is immutable, unlike the normal sets discussed previously. Because a frozenset is immutable and hashable, one of its applications is to serve as a key in a dictionary.
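The dictionary-key use can be sketched as follows. The lookup table, its keys, and its values are hypothetical and chosen only for illustration.

```python
# Hypothetical lookup table keyed by a combination of skills.
role_by_skills = {
    frozenset({'Python', 'SQL'}): 'Data Analyst',
    frozenset({'Python', 'Spark'}): 'Data Engineer',
}

# Element order does not matter when looking up a frozenset key.
print(role_by_skills[frozenset({'SQL', 'Python'})])  # Data Analyst

# A regular set is unhashable, so it cannot be used as a key.
regular_set = {'Python', 'SQL'}
try:
    {regular_set: 'Data Analyst'}
except TypeError as error:
    print("TypeError:", error)
```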

Since frozen sets are immutable, you cannot perform operations such as add(), remove(), update(), etc. on them. If you try to add an element to a **frozenset**, it raises an exception (**AttributeError**), as the example below shows.

**For Example-**

    simple_set = {1, 'Analytics Vidhya', 4.6, 'r'}
    frozen_set = frozenset(simple_set)
    print("The frozenset corresponding to a given set is\n", frozen_set)
    frozen_set.add('Business Analyst')
    print(frozen_set)

__Output:__
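The `add()` call fails with `AttributeError` because a frozenset simply has no `add` method. A minimal sketch that catches the exception, and also shows that operations returning a new object still work:

```python
frozen_set = frozenset({1, 'Analytics Vidhya', 4.6, 'r'})

try:
    frozen_set.add('Business Analyst')
except AttributeError as error:
    print("AttributeError:", error)

# Operations that build a new object are still allowed.
bigger = frozen_set | {'Business Analyst'}
print(type(bigger))     # <class 'frozenset'>
print(len(bigger))      # 5
print(len(frozen_set))  # 4, the original is unchanged
```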

**How to Create a Frozenset?**

We create a frozenset in Python using the built-in **frozenset([iterable])** function, providing an iterable as input. It takes the items of any iterable and returns them as an immutable set.

**For Example-**

    simple_set = {1, 'Analytics Vidhya', 4.6, 'r'}
    frozen_set = frozenset(simple_set)
    print("The frozenset corresponding to a given set is\n", frozen_set)

__Output:__

    The frozenset corresponding to a given set is
     frozenset({1, 'r', 4.6, 'Analytics Vidhya'})

The above output consists of the set **frozen_set** which is a frozen version of **simple_set**.

**Accessing Elements of a Frozen Set**

Elements of a frozen set can be accessed by using a **for loop**.

**For Example-**

    frozen_set = frozenset([1, 'Analytics Vidhya', 4.6, 'r'])
    for element in frozen_set:
        print(element)

__Output:__

    1
    r
    4.6
    Analytics Vidhya

The above output shows that using the **for loop**, all the elements of the **frozen_set** have been returned one after the other.

**End Notes**

*Thanks for reading!*

I hope that you have enjoyed the article. If you like it, share it with your friends. Is something not mentioned, or do you want to share your thoughts? Feel free to comment below and I’ll get back to you. 😉

*The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.*