How to Work With CSV Files in Python?

Harika Last Updated : 26 Nov, 2024

CSV is a file format you will frequently come across while working in the field of Data Science. It is a plain-text format that stores tabular data in a form that is easy to read and simple to process. CSV files can be converted from other formats such as JSON, or created directly using a language such as Python or Java.

In this article, we’ll cover the basics of CSV files and how to handle them in Python. You’ll learn what a CSV file is, how to read one using Python’s built-in csv module, plain file methods, or Pandas, and how to write one with each of these approaches, step by step.

This article was published as a part of the Data Science Blogathon.

What is a CSV?

Understanding how to read CSV files in Python is essential for any data scientist. CSV, which stands for “Comma Separated Values,” serves as the fundamental format for storing tabular data as plain text. As data scientists, we frequently encounter CSV data in our daily workflows. Therefore, mastering the ability to read CSV files in Python is crucial for efficiently handling and analyzing data sets.

Structure of CSV in Python

We have a file named “Salary_Data.csv.” The first line of a CSV file is the header. It contains the names of the fields/features, which are shown on top as the column names in the file.

After the header, each line of the file is an observation/a record. The values of a record are separated by “commas.”
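
To make this layout concrete, here is a minimal sketch that parses a small CSV held in a string. The column names and values are made up for illustration and are not the actual contents of Salary_Data.csv.

```python
import csv
import io

# Hypothetical sample data: one header line followed by three records.
sample = "YearsExperience,Salary\n1.1,39343\n1.3,46205\n1.5,37731\n"

# io.StringIO lets csv.reader treat the string like an open file.
reader = csv.reader(io.StringIO(sample))
header = next(reader)    # first line: the field names
records = list(reader)   # remaining lines: one list of values per record

print(header)
print(records)
```

Note that the csv module returns every value as a string, so numeric columns need an explicit conversion such as float(value) before any arithmetic.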


Read CSV File in Python

There are two main ways to read CSV files in Python:

  • Using the csv module: This is the built-in module for working with CSV files in Python. It provides basic functionality for reading and writing CSV data.

Here’s an example of how to read a CSV file using csv.reader:

import csv

# Open the CSV file in read mode
with open('data.csv', 'r') as csvfile:
  # Create a reader object
  csv_reader = csv.reader(csvfile)
  
  # Iterate through the rows in the CSV file
  for row in csv_reader:
    # Access each element in the row
    print(row)
  • Using the Pandas library: Pandas is a powerful library for data analysis in Python. It offers a more convenient way to read and manipulate CSV data.

Here’s an example of how to read a CSV file using Pandas:

import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Access data in the DataFrame using column names or indexing
print(df['column_name'])
print(df.iloc[0])  # Access first row

List of Methods to Read a CSV File in Python

  • Read CSV file using csv.reader
  • Read CSV file using .readlines() function
  • Read CSV file using Pandas
  • Read CSV file using csv.DictReader

How to Read CSV Files in Python with Procedural Steps?

There are many different ways to read data in a CSV file, which we will now see one by one.

Steps to Read CSV Files in Python Using csv.reader

You can read CSV files using the csv.reader object from Python’s csv module. Steps to read a CSV file using csv reader:

  1. Import the CSV library

    import csv

  2. Open the CSV file

    The built-in open() function opens a file and returns a file object.

    file = open('Salary_Data.csv')
    type(file)

    The type of file is “_io.TextIOWrapper” which is a file object that is returned by the open() method.

  3. Use the csv.reader object to read the CSV file

    csvreader = csv.reader(file)

  4. Extract the field names

    Use the built-in next() function to obtain the header.
    next() returns the current row of the reader and advances it to the next row.
    The first time you call next(), it returns the header, the next time it returns the first record, and so on.

    header = next(csvreader)
    header


  5. Extract the rows/records

    Create an empty list called rows and iterate through the csvreader object and append each row to the rows list.

    rows = []
    for row in csvreader:
        rows.append(row)
    rows

  6. Close the file

    .close() method is used to close the opened file. Once it is closed, we cannot perform any operations on it.

    file.close()

Complete Code for Read CSV Python

import csv
file = open("Salary_Data.csv")
csvreader = csv.reader(file)
header = next(csvreader)
print(header)
rows = []
for row in csvreader:
    rows.append(row)
print(rows)
file.close()

Naturally, we might forget to close an open file. To avoid that, we can use the with statement, which releases the resource automatically. In simple terms, there is no need to call the .close() method if we are using a with statement.

Implementing Code Using the with Statement

Basic Syntax: with open(filename, mode) as alias_filename:

Modes:

  • ‘r’ – to read an existing file (this is the default),
  • ‘w’ – to write to a file, creating it if it doesn’t exist and overwriting its content if it does,
  • ‘a’ – to append to existing file content,
  • ‘+’ – added to another mode (for example ‘r+’ or ‘w+’) to open the file for both reading and writing

import csv
rows = []
with open("Salary_Data.csv", 'r') as file:
    csvreader = csv.reader(file)
    header = next(csvreader)
    for row in csvreader:
        rows.append(row)
print(header)
print(rows)
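
The file modes listed earlier can be tried out with a short sketch like the one below. The file name modes_demo.csv and its contents are hypothetical, and the file is created in a temporary directory so that nothing on disk is overwritten.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "modes_demo.csv")  # hypothetical file name

    # 'w' creates the file (or empties it if it already exists).
    with open(path, "w") as f:
        f.write("Name,Score\n")

    # 'a' keeps the existing content and adds to the end of it.
    with open(path, "a") as f:
        f.write("Alex,62\n")

    # 'r' opens the existing file for reading only.
    with open(path, "r") as f:
        content = f.read()

    # 'r+' opens an existing file for both reading and writing.
    with open(path, "r+") as f:
        first_line = f.readline()

print(content)
```

Opening the same path with 'w' a second time would discard the two lines written so far, which is why 'a' is the safer choice when adding records to an existing file.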


How to Read CSV Files in Python Using .readlines()?

Now the question is – “Is it possible to fetch the header and rows using only open() and the with statement, without the csv library?” Let’s see…

.readlines() method is the answer. It returns all the lines in a file as a list. Each item on the list is a row of our CSV file.

The first row of the file.readlines() is the header, and the rest are the records.

with open('Salary_Data.csv') as file:
    content = file.readlines()
header = content[:1]
rows = content[1:]
print(header)
print(rows)

Note: The trailing ‘\n’ on each line of the output can be removed using the .strip() method.
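
The cleanup mentioned above can be sketched as follows. An in-memory io.StringIO object with made-up values stands in for the opened Salary_Data.csv file, so the snippet runs without the file being present.

```python
import io

# Hypothetical content standing in for the opened Salary_Data.csv file.
file = io.StringIO("YearsExperience,Salary\n1.1,39343\n1.3,46205\n")

content = file.readlines()

# strip() drops the trailing newline; split(',') separates the fields.
header = content[0].strip().split(",")
rows = [line.strip().split(",") for line in content[1:]]

print(header)
print(rows)
```

Splitting on commas by hand works only for simple files. If a field itself contains a comma inside quotes, the csv module is the safer option because it understands quoting.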

What if we have a huge dataset with hundreds of features and thousands of records? Plain lists quickly become unwieldy at that scale.

This is where the pandas library comes into the picture.

How to Read CSV Files in Python Using Pandas?

Let’s have a look at how pandas is used to read data in a CSV file.

Step1: Import pandas library

import pandas as pd

Step2: Load CSV files to pandas using read_csv()

Basic Syntax: pandas.read_csv(filename, delimiter=',')

data = pd.read_csv("Salary_Data.csv")
data

Step3: Extract the field names

.columns is used to obtain the header/field names.

data.columns

Step4: Extract the rows

All the data of a data frame can be accessed using the field names.

data.Salary
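
Accessing a field name returns a whole column. To pull out individual records instead, position-based selection is one option. The sketch below assumes pandas is installed and uses made-up values in place of Salary_Data.csv.

```python
import io
import pandas as pd

# Hypothetical data standing in for Salary_Data.csv.
csv_text = "YearsExperience,Salary\n1.1,39343\n1.3,46205\n1.5,37731\n"
data = pd.read_csv(io.StringIO(csv_text))

first_row = data.iloc[0]    # one record, selected by position
first_two = data.head(2)    # the first two records as a DataFrame

print(first_row)
print(first_two)
```

Here iloc selects by integer position, while head(n) is a shortcut for looking at the first n records of a large file.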

Read CSV File in Python Using csv.DictReader

A Python dictionary is like a hash table: it stores keys and their values. You can create one with curly braces or with the dict() constructor. The csv module’s DictReader class reads each row of a CSV file into a dictionary whose keys are the field names taken from the header. Here’s a simple guide on how to use it:

Step1: Import the csv module

import csv

Step2: Open the CSV file using the open() function with the mode set to ‘r’ for reading.

with open('Salary_Data.csv', 'r') as csvfile:

Step3: Create a DictReader object using csv.DictReader(). This line, like the loop in Step4, belongs inside the with block, so it is indented.

    reader = csv.DictReader(csvfile)

Step4: Use the csv.DictReader object to read the CSV file.

Iterate through the rows of the CSV file using a ‘for’ loop and the DictReader object to see the field names as keys along with their respective values.

    for row in reader:
        print(row)
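
Put together, the steps look roughly like the sketch below. To keep it runnable without the file, an io.StringIO object with made-up values takes the place of the opened Salary_Data.csv.

```python
import csv
import io

# Hypothetical content standing in for the opened Salary_Data.csv file.
csvfile = io.StringIO("YearsExperience,Salary\n1.1,39343\n1.3,46205\n")

reader = csv.DictReader(csvfile)
rows = [dict(row) for row in reader]  # each row maps field name -> value

# The header is available separately once reading has started.
print(reader.fieldnames)

for row in rows:
    # Values are looked up by column name instead of by position.
    print(row["YearsExperience"], row["Salary"])
```

Looking up values by name keeps the code readable and means it still works if the column order in the file changes.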

List of Methods to Write a CSV File in Python

  • Write CSV file using csv.writer
  • Write CSV file using writelines() function
  • Write CSV file using Pandas
  • Write CSV file using csv.DictWriter

How to Write to a CSV File in Python?

We can write to a CSV file in multiple ways.

Write CSV file Using csv.writer

The csv.writer() function returns a writer object that converts the input data into a delimited string.
For example, let’s assume we are recording the data of 3 students (Name, M1 Score, M2 Score)

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]

Now let’s see how this data can be written to a CSV file using csv.writer:

Step1: Import csv library.

import csv

Step2: Define a filename and Open the file using open().
Step3: Create a csvwriter object using csv.writer().
Step4: Write the header.
Step5: Write the rest of the data.

Code for steps 2-5

filename = 'Students_Data.csv'
with open(filename, 'w', newline="") as file:
    csvwriter = csv.writer(file) # 3. create a csvwriter object
    csvwriter.writerow(header) # 4. write the header
    csvwriter.writerows(data) # 5. write the rest of the data

The resulting Students_Data.csv file looks like this:

Name,M1 Score,M2 Score
Alex,62,80
Brad,45,56
Joey,85,98

Write CSV File Using .writelines()

The .writelines() method takes a list of strings and writes them to the file one after another. It does not convert values or add line breaks, so each row is first joined into a single comma-separated string ending in ‘\n’.

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]
filename = 'Student_scores.csv'
# Build one comma-separated, newline-terminated string per row
lines = [','.join(header) + '\n']
lines += [','.join(str(x) for x in row) + '\n' for row in data]
with open(filename, 'w') as file:
    file.writelines(lines)
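
One limitation of building each line by hand is that a value containing a comma is indistinguishable from two separate fields. The sketch below, which uses a made-up record, compares that with what csv.writer produces for the same row.

```python
import csv
import io

# Hypothetical record whose first field contains a comma.
row = ["Smith, Alex", 62, 80]

# Joining by hand yields four comma-separated pieces instead of three.
manual_line = ",".join(str(x) for x in row)

# csv.writer wraps the problematic field in quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written_line = buffer.getvalue()

print(manual_line)
print(written_line)
```

For data that may contain commas, quotes, or line breaks inside a field, csv.writer or csv.DictWriter is therefore the more robust choice.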

Write CSV Using Pandas

Follow these steps to write to a CSV file using pandas:

Step1: Import pandas library

import pandas as pd

Step2: Create a pandas dataframe using pd.DataFrame

Syntax: pd.DataFrame(data, columns)

The data parameter takes the records/observations, and the columns parameter takes the columns/field names.

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]
data = pd.DataFrame(data, columns=header)

Step3: Write to a CSV file using to_csv()

Syntax: DataFrame.to_csv(filename, sep=',', index=False)

The separator is ‘,’ by default, and index=False leaves out the index numbers.

data.to_csv('Stu_data.csv', index=False)

The resulting Stu_data.csv contains the header row followed by the three student records, with no index column.

Write CSV File Using csv.DictWriter

You can write data into a CSV file using the csv module’s DictWriter class by following the steps below.

Step1: Import the csv module

import csv

Step2: Using the open() function, create a new file object, specifying the file name and the mode ‘w’ for writing

with open('Students_Data.csv', 'w', newline='') as csvfile:

Step3: Define the data you want to write to the CSV file as a list of dictionaries (in a script, place this before the with statement)

data = [{'Name': 'Alex', 'M1 Score': 62, 'M2 Score': 80},
        {'Name': 'Brad', 'M1 Score': 45, 'M2 Score': 56},
        {'Name': 'Joey', 'M1 Score': 85, 'M2 Score': 98}]

Step4: Create a csv.DictWriter object, specifying the file object and the fieldnames parameter

Note that the delimiter is ‘,’ by default.

    fieldnames = ['Name', 'M1 Score', 'M2 Score']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)

Step5: Write the header row using the writeheader() method.

    writer.writeheader()

Step6: Use the writerows() method to write the data to the CSV file

    writer.writerows(data)

This will create a new file named ‘Students_Data.csv’ with Name, M1 Score, and M2 Score as the header/column names and the data values under the data variable.
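
Assembled into a single script, the steps above might look like the following sketch. It writes to a temporary directory, which is an assumption made here so that the example does not overwrite an existing file, and then reads the file back to check the result.

```python
import csv
import os
import tempfile

data = [{'Name': 'Alex', 'M1 Score': 62, 'M2 Score': 80},
        {'Name': 'Brad', 'M1 Score': 45, 'M2 Score': 56},
        {'Name': 'Joey', 'M1 Score': 85, 'M2 Score': 98}]
fieldnames = ['Name', 'M1 Score', 'M2 Score']

# Written to a temporary directory here; use your own path in practice.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'Students_Data.csv')

    with open(path, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()     # header row taken from fieldnames
        writer.writerows(data)   # one line per dictionary

    # Read the file back to check what was written.
    with open(path, newline='') as csvfile:
        written = list(csv.reader(csvfile))

print(written)
```

The data and fieldnames are defined before the with block, so everything that needs the open file stays inside it and the file is closed as soon as the writing is done.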

Conclusion

By now, you should be familiar with the various techniques for handling CSV files in Python, including how to read and write them. We hope this article has been informative. Feel free to share it with your study buddies to spread the knowledge and enhance everyone’s Python skills.

Knowing how to read and write CSV files in Python is an essential skill for any data scientist or analyst. It can save time, improve productivity, and make data processing more efficient. Whether you’re just starting out or looking to take your skills to the next level, our Data Science Black Belt program is an excellent resource to enhance your knowledge in data science. The program covers basics of Python programming to advanced machine learning concepts. With hands-on projects and case studies, you’ll gain practical experience and learn how to apply your skills to real-world problems.

Key Takeaways

  • A Comma Separated Values (CSV) file is a simple way of storing tabular data as readable plain text.
  • A file in the CSV format shows you organized tabular data similar to an Excel sheet.
  • You can read a CSV file in Python using csv.reader, .readlines(), or csv.DictReader, and write into one by using csv.writer, csv.DictWriter, or .writelines().
  • Pandas can be used for both reading and writing data in a CSV.

Frequently Asked Questions

Q1. How to write data to a CSV file in Python?

A. You can write data to a CSV file in Python using pandas, the csv module’s csv.writer and csv.DictWriter, or the .writelines() method.

Q2. How to read a CSV file as text in Python?

A. There are many ways to read CSV files as plain text in Python including using csv.reader, .readlines(), pandas, or csv.DictReader.

Q3. How do I read a CSV file in Python rows?

A. To read a CSV file in Python row by row, you can create a csv.reader object and iterate through it using a for loop. Each iteration gives you one row of the file as a list of values.

Q4. How to create CSV in Python?

A. To create a CSV file in Python, you can use the built-in csv module. First, import the module and open a new file using the ‘with open’ statement. Then create a csv writer object and use it to write rows of data to the file. The file is closed automatically when the ‘with’ block ends.

Q5. What is the quickest way to read a CSV file in Python?

A. Use the pandas library’s read_csv function, which reads a CSV file into a DataFrame for further analysis. To write data back out, you can use DataFrame.to_csv or the csv module’s csv.writer.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Hi, my name is Harika. I am a Data Engineer and I thrive on creating innovative solutions and improving user experiences. My passion lies in leveraging data to drive innovation and create meaningful impact.

Responses From Readers

gabriel

Hey! Thank you! But, what if the headers get more than 1 unique row?

George Thomas

What a great article! Your information is very helpful for becoming a better blogger. Keep sharing.

Esteve

thanks a million for the article
