
Best Way to Read a CSV File in Python

Intro: In this article, I will walk you through the different ways of reading and writing CSV files in Python.

Table of Contents:

  1. What is a CSV?
  2. Reading a CSV
  3. Writing to a CSV

1. What is a CSV?

CSV stands for "Comma Separated Values." It is the simplest way of storing tabular data as plain text. It is important to know how to work with CSV files because, as data scientists, we rely on CSV data more often than not in our day-to-day work.

Structure of CSV:

We have a file named "Salary_Data.csv." The first line of a CSV file is the header and contains the names of the fields/features.

After the header, each line of the file is an observation/a record. The values of a record are separated by commas.
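To make this structure concrete, here is a minimal sketch that parses a small in-memory CSV. The column names and values are made up for illustration; they are an assumption and not the actual contents of Salary_Data.csv.

```python
import csv
import io

# Hypothetical sample in the shape described above:
# one header line, then one record per line.
sample_text = "YearsExperience,Salary\n1.1,39343\n2.0,43525\n"

# csv.reader accepts any iterable of lines, so an in-memory buffer works.
parsed = list(csv.reader(io.StringIO(sample_text)))

header_line = parsed[0]   # field names
records = parsed[1:]      # every remaining line is one record

print(header_line)
print(records)
```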

2. Reading a CSV

CSV files can be handled in multiple ways in Python.

2.1 Using csv.reader

Reading a CSV using the csv.reader object from Python's built-in csv module.

Steps to read a CSV file:

1. Import the csv library

import csv

2. Open the CSV file

The open() function in Python opens a file and returns a file object.

file = open('Salary_Data.csv')
type(file)

The type of file is "_io.TextIOWrapper" which is a file object that is returned by the open() method.

3. Use the csv.reader object to read the CSV file

csvreader = csv.reader(file)

4. Extract the field names

Use the next() function to obtain the header.

next() returns the current row and advances the reader to the next row.

The first call to next() returns the header, the second call returns the first record, and so on.

header = next(csvreader)
header
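The behaviour of next() can be sketched on a tiny in-memory file. The sample values are hypothetical, and the second argument to next() is an optional default that the steps above do not use.

```python
import csv
import io

# Hypothetical two-record file held in memory (not the real Salary_Data.csv).
buffer = io.StringIO("YearsExperience,Salary\n1.1,39343\n2.0,43525\n")
reader = csv.reader(buffer)

first = next(reader)    # header row
second = next(reader)   # first record
third = next(reader)    # second record

# With a default value, next() returns it instead of raising StopIteration
# once the reader is exhausted.
after_end = next(reader, None)

print(first, second, third, after_end)
```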

5. Extract the rows/records

Create an empty list called rows, iterate through the csvreader object, and append each row to the rows list.

rows = []
for row in csvreader:
    rows.append(row)
rows
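One detail worth knowing at this step: csv.reader yields every value as a string, so numeric columns need converting before any calculation. A minimal sketch, assuming every column is numeric and using made-up sample values:

```python
import csv
import io

# Hypothetical sample data; every column is assumed to hold numbers.
buffer = io.StringIO("YearsExperience,Salary\n1.1,39343\n2.0,43525\n")
reader = csv.reader(buffer)
columns = next(reader)

# Convert each field from str to float while collecting the rows.
numeric_rows = [[float(value) for value in row] for row in reader]

print(columns)
print(numeric_rows)
```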

6. Close the file

The .close() method is used to close the opened file. Once it is closed, we cannot perform any operations on it.

file.close()

Complete Code:

import csv
file = open("Salary_Data.csv")
csvreader = csv.reader(file)
header = next(csvreader)
print(header)
rows = []
for row in csvreader:
    rows.append(row)
print(rows)
file.close()

It is easy to forget to close an open file. To avoid that, we can use the with statement, which releases the resource automatically. In simple terms, there is no need to call the .close() method if we are using a with statement.

Implementing the above code using the with statement:

Syntax: with open(filename, mode) as alias_filename:

Modes:

'r' – to read an existing file (the default),
'w' – to create a new file, or overwrite the file if it already exists, and write to it,
'a' – to append to existing file content (the file is created if it doesn't exist),
'+' – added to another mode (for example 'r+' or 'w+') to open the file for both reading and writing

import csv
rows = []
with open("Salary_Data.csv", 'r') as file:
    csvreader = csv.reader(file)
    header = next(csvreader)
    for row in csvreader:
        rows.append(row)
print(header)
print(rows)
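The file modes listed above can be tried out with a short sketch. It works on a throwaway file in a temporary directory; the file name modes_demo.txt is hypothetical and chosen only for this illustration.

```python
import os
import tempfile

# Work in a temporary directory so no real file is touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    with open(path, "w") as f:      # 'w' creates the file (or truncates it)
        f.write("first\n")

    with open(path, "a") as f:      # 'a' appends to the existing content
        f.write("second\n")

    with open(path, "r") as f:      # 'r' reads an existing file
        after_append = f.read()

    with open(path, "r+") as f:     # 'r+' reads and writes without truncating
        f.write("FIRST")            # overwrites the first five characters
        f.seek(0)
        after_update = f.read()

    with open(path, "w") as f:      # 'w' again discards the old content
        f.write("only\n")

    with open(path) as f:           # mode defaults to 'r'
        after_rewrite = f.read()

print(repr(after_append), repr(after_update), repr(after_rewrite))
```

Each with block closes its file on exit, which is why the same path can be reopened safely in the next mode.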

2.2 Using .readlines()

Now the question is: "Is it possible to fetch the header and rows using just open() and the with statement, without the csv library?" Let's see…

The .readlines() method is the answer. It returns all the lines in a file as a list. Each item of the list is a row of our CSV file.

The first item returned by file.readlines() is the header and the rest are the records.

with open('Salary_Data.csv') as file:
    content = file.readlines()
header = content[:1]
rows = content[1:]
print(header)
print(rows)

Note: the '\n' at the end of each line in the output can be removed using the .strip() method.
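As a minimal sketch of that clean-up, the lines can be stripped and then split on commas to get the individual values. The file written here is a hypothetical stand-in for Salary_Data.csv with made-up values.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical stand-in for Salary_Data.csv with made-up values.
    path = os.path.join(tmp_dir, "sample.csv")
    with open(path, "w") as f:
        f.write("YearsExperience,Salary\n1.1,39343\n2.0,43525\n")

    with open(path) as f:
        lines = f.readlines()

# strip() drops the trailing newline; split(',') separates the fields.
# Note: a plain split breaks on quoted fields that contain commas.
clean_header = lines[0].strip().split(",")
clean_rows = [line.strip().split(",") for line in lines[1:]]

print(clean_header)
print(clean_rows)
```

For files whose values may themselves contain commas, the csv module is the safer choice because it understands quoting.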

What if we have a huge dataset with hundreds of features and thousands of records? Would it be practical to handle it with lists?

This is where the pandas library comes into the picture.

2.3 Using pandas

Steps of reading CSV files using pandas

1. Import pandas library

import pandas as pd

2. Load CSV files to pandas using read_csv()

Basic Syntax: pandas.read_csv(filename, delimiter=',')

data = pd.read_csv("Salary_Data.csv")
data

3. Extract the field names

.columns is used to obtain the header/field names.

data.columns

4. Extract the rows

All the data in a data frame can be accessed using the field names.

data.Salary
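The pandas steps above can be sketched end to end on an in-memory sample. The YearsExperience column and all values are assumptions made for illustration; only the Salary column name comes from the article. Selecting a row by position with .iloc is an addition not covered in the steps above.

```python
import io
import pandas as pd

# Hypothetical sample standing in for Salary_Data.csv.
csv_text = "YearsExperience,Salary\n1.1,39343\n2.0,43525\n"
df = pd.read_csv(io.StringIO(csv_text))

field_names = list(df.columns)        # header as a plain list
salaries = df["Salary"].tolist()      # one column; same data as df.Salary
first_record = df.iloc[0].tolist()    # one row, selected by position

print(field_names)
print(salaries)
print(first_record)
```

The bracket form df["Salary"] also works for column names that contain spaces, where attribute access such as df.Salary is not possible.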

3. Writing to a CSV file

We can write to a CSV file in multiple ways.

3.1 Using csv.writer

Let's assume we are recording the data of three students (Name, M1 Score, M2 Score).

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]

Steps of writing to a CSV file:

1. Import csv library

import csv

2. Define a filename and open the file using open()

3. Create a csvwriter object using csv.writer()

4. Write the header

5. Write the rest of the data

Code for steps 2-5:

filename = 'Students_Data.csv'
with open(filename, 'w', newline="") as file:
    csvwriter = csv.writer(file)  # 3. create a csvwriter object
    csvwriter.writerow(header)    # 4. write the header
    csvwriter.writerows(data)     # 5. write the rest of the data

Below is how our CSV file looks:

Name,M1 Score,M2 Score
Alex,62,80
Brad,45,56
Joey,85,98
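One reason to prefer csv.writer over building strings by hand is that it quotes values containing the delimiter. The sketch below assumes a hypothetical student name with a comma in it and writes to a temporary file rather than Students_Data.csv.

```python
import csv
import os
import tempfile

# Hypothetical record whose name contains a comma.
columns = ["Name", "M1 Score", "M2 Score"]
records = [["Smith, Alex", 62, 80]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "quoted.csv")

    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(records)

    with open(path, newline="") as f:
        raw_text = f.read()

    with open(path, newline="") as f:
        round_trip = list(csv.reader(f))

print(raw_text)
print(round_trip)
```

The name is wrapped in double quotes in the file, and csv.reader restores it as a single field. Note that the scores come back as strings.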

3.2 Using .writelines()

Iterate through each list, convert the list elements to strings, and write them to the CSV file.

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]
filename = 'Student_scores.csv'
with open(filename, 'w') as file:
    # Join the values with commas so no trailing separator is written
    file.write(','.join(header) + '\n')
    for row in data:
        file.write(','.join(str(x) for x in row) + '\n')
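The same result can be produced with a single file.writelines() call, which takes a list of strings. This is a minimal sketch, not the article's original code; it writes to a temporary directory so that no existing file is overwritten.

```python
import os
import tempfile

columns = ["Name", "M1 Score", "M2 Score"]
records = [["Alex", 62, 80], ["Brad", 45, 56], ["Joey", 85, 98]]

# Build one comma-separated string per line. writelines() does not add
# line endings, so each string carries its own "\n".
lines = [",".join(columns) + "\n"]
lines += [",".join(str(value) for value in row) + "\n" for row in records]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Student_scores.csv")
    with open(path, "w") as f:
        f.writelines(lines)
    with open(path) as f:
        written = f.read()

print(written)
```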

3.3 Using pandas

Steps to write to a CSV using pandas

1. Import pandas library

import pandas as pd

2. Create a pandas dataframe using pd.DataFrame

Syntax: pd.DataFrame(data, columns)

The data parameter takes the records/observations and the columns parameter takes the columns/field names.

header = ['Name', 'M1 Score', 'M2 Score']
data = [['Alex', 62, 80], ['Brad', 45, 56], ['Joey', 85, 98]]
data = pd.DataFrame(data, columns=header)

3. Write to a CSV file using to_csv()

Syntax: DataFrame.to_csv(filename, sep=',', index=False)

Note: the separator is ',' by default.

index=False omits the row index numbers from the output.

data.to_csv('Stu_data.csv', index=False)

Below is how our CSV file looks:

Name,M1 Score,M2 Score
Alex,62,80
Brad,45,56
Joey,85,98
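To see what index=False changes, the sketch below compares the two outputs. It relies on to_csv() returning the CSV text as a string when no file name is given, so nothing is written to disk.

```python
import pandas as pd

columns = ["Name", "M1 Score", "M2 Score"]
records = [["Alex", 62, 80], ["Brad", 45, 56], ["Joey", 85, 98]]
df = pd.DataFrame(records, columns=columns)

# Called without a path, to_csv() returns the CSV text as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])
```

With the index included, every line gains a leading column holding the row number, and the header starts with an empty field name.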

End Notes:

Thank you for reading till the end. By the end of this article, we are familiar with different ways of handling CSV files in Python.

I hope this article is informative. Feel free to share it with your study buddies.

References:

Check out the complete code from the GitHub repo.

Other Blog Posts by me

Feel free to check out my other blog posts from my Analytics Vidhya Profile.

You can find me on LinkedIn and Twitter in case you would like to connect. I would be glad to connect with you.

For immediate exchange of thoughts, please write to me at harikabont[email protected].

The media shown in this article are not owned by Analytics Vidhya and are used at the Author's discretion.


Source: https://www.analyticsvidhya.com/blog/2021/08/python-tutorial-working-with-csv-file-for-data-science/