Pandas Read CSV in Python

Last Updated : 24 Jul, 2023

CSV (Comma-Separated Values) files store tabular data as plain text. To load data from a CSV file, Pandas provides the read_csv() function, which returns the data as a data frame.

Syntax of read_csv()

Here is the Pandas read CSV syntax with its parameters.

Syntax: pd.read_csv(filepath_or_buffer, sep=',', header='infer', index_col=None, usecols=None, engine=None, skiprows=None, nrows=None)

Parameters:

filepath_or_buffer: Location of the CSV file. It accepts a string path, a URL, or a file-like object.

sep: The separator (delimiter) between fields. The default is ','.

header: Row number (int) or list of row numbers to use as the column names; the data starts after the header row. The default, 'infer', uses the first row as the column names. If header=None is passed, no row is treated as a header and the columns are labeled 0, 1, 2, and so on.

usecols: Reads only the selected columns from the CSV file.

nrows: Number of rows to read from the file.

index_col: Column(s) to use as the row labels (index) of the data frame. If None, no column is used and a default integer index (0, 1, 2, ...) is assigned.

skiprows: Line numbers (0-indexed) to skip, or the number of lines to skip at the start of the file.
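The header behaviour described above can be sketched with a small in-memory example. The data here is made up for illustration and is read through io.StringIO rather than from people.csv; the names parameter is a real read_csv() argument that is not listed in the syntax line above.

```python
import io
import pandas as pd

# Hypothetical in-memory data with no header row (not the article's people.csv)
raw = "Shelby,Terrell,Male\nPhillip,Summers,Female\n"

# header=None: every line is treated as data and the columns are labeled 0, 1, 2
no_header = pd.read_csv(io.StringIO(raw), header=None)
print(no_header.columns.tolist())  # [0, 1, 2]

# names supplies explicit column labels for a file that has no header row
named = pd.read_csv(io.StringIO(raw),
                    header=None,
                    names=["First Name", "Last Name", "Sex"])
print(named)
```

Passing an io.StringIO object works because filepath_or_buffer accepts file-like objects as well as paths.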

Read CSV File using Pandas read_csv

Before using this function, we must import the Pandas library. We then load the CSV file with read_csv() and print the first five rows.

Python3

# Import pandas
import pandas as pd
 
# reading csv file 
df = pd.read_csv("people.csv")
print(df.head())

Output:

  First Name Last Name     Sex                       Email Date of birth  Job Title        
0     Shelby   Terrell    Male        elijah57@example.net    1945-10-26  Games developer 
1    Phillip   Summers  Female       bethany14@example.com    1910-03-24  Phytotherapist  
2   Kristine    Travis    Male       bthompson@example.com    1992-07-02  Homeopath  
3    Yesenia  Martinez    Male   kaitlinkaiser@example.com    2017-08-03  Market researcher
4       Lori      Todd    Male  buchananmanuel@example.net    1938-12-01  Veterinary surgeon

Using sep in read_csv()

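In this example, the fields of the CSV file are separated by several different special characters. Passing a regular expression to sep splits on all of them; a regular-expression separator like this one requires engine='python'.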

Python3

# sample = "totalbill_tip, sex:smoker, day_time, size
# 16.99, 1.01:Female|No, Sun, Dinner, 2
# 10.34, 1.66, Male, No|Sun:Dinner, 3
# 21.01:3.5_Male, No:Sun, Dinner, 3
# 23.68, 3.31, Male|No, Sun_Dinner, 2
# 24.59:3.61, Female_No, Sun, Dinner, 4
# 25.29, 4.71|Male, No:Sun, Dinner, 4"
 
# Importing pandas library
import pandas as pd
 
# Load the data of csv
df = pd.read_csv('sample.csv',
                 sep='[:, |_]',
                 engine='python')
 
# Print the Dataframe
print(df)

Output:

        totalbill   tip Unnamed: 2   sex smoker Unnamed: 5     day    time  Unnamed: 8  size 
16.99         NaN  1.01     Female    No    NaN        Sun     NaN  Dinner         NaN     2
10.34         NaN  1.66        NaN  Male    NaN         No     Sun  Dinner         NaN     3
21.01        3.50  Male        NaN    No    Sun        NaN  Dinner     NaN         3.0  None
23.68         NaN  3.31        NaN  Male     No        NaN     Sun  Dinner         NaN     2
24.59        3.61   NaN     Female    No    NaN        Sun     NaN  Dinner         NaN     2
25.29         NaN  4.71       Male   NaN     No        Sun     NaN  Dinner         NaN     4
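For comparison, here is a smaller sketch of sep with one consistent delimiter, which is the more common case. The data is hypothetical and held in memory; the second call uses a regular expression to absorb the spaces around each pipe character.

```python
import io
import pandas as pd

# Hypothetical semicolon-delimited data (not the article's sample.csv)
semi = "total_bill;tip;sex\n16.99;1.01;Female\n10.34;1.66;Male\n"
df_semi = pd.read_csv(io.StringIO(semi), sep=";")
print(df_semi)

# Hypothetical pipe-delimited data with padding spaces around the pipes.
# The regular expression consumes the spaces together with the pipe,
# and regex separators need the Python parsing engine.
piped = "total_bill | tip | sex\n16.99 | 1.01 | Female\n10.34 | 1.66 | Male\n"
df_pipe = pd.read_csv(io.StringIO(piped), sep=r"\s*\|\s*", engine="python")
print(df_pipe.columns.tolist())  # ['total_bill', 'tip', 'sex']
```

When every field is split by the same delimiter, each row yields the same number of columns and no NaN padding appears.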

Using usecols in read_csv()

Here, we load only three columns, "First Name", "Sex", and "Email", and pass header=0 so that the first row of the file is used as the column names (this is also the default behavior).

Python3

df = pd.read_csv('people.csv',
        header=0,
        usecols=["First Name", "Sex", "Email"])
# printing dataframe
print(df.head())

Output:

  First Name     Sex                       Email
0     Shelby    Male        elijah57@example.net
1    Phillip  Female       bethany14@example.com
2   Kristine    Male       bthompson@example.com
3    Yesenia    Male   kaitlinkaiser@example.com
4       Lori    Male  buchananmanuel@example.net
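One detail worth knowing is that usecols selects columns but does not reorder them: the result keeps the order found in the file. The sketch below uses a hypothetical one-row sample in memory to show this, along with selection by integer position.

```python
import io
import pandas as pd

# Hypothetical one-row sample using some of the article's column names
raw = ("First Name,Last Name,Sex,Email\n"
       "Shelby,Terrell,Male,elijah57@example.net\n")

# The order of names in usecols is ignored; file order is kept
df = pd.read_csv(io.StringIO(raw), usecols=["Email", "First Name"])
print(df.columns.tolist())  # ['First Name', 'Email']

# To get a specific column order, index the data frame after loading
reordered = df[["Email", "First Name"]]
print(reordered.columns.tolist())  # ['Email', 'First Name']

# usecols also accepts integer positions (0-indexed)
by_position = pd.read_csv(io.StringIO(raw), usecols=[0, 2])
print(by_position.columns.tolist())  # ['First Name', 'Sex']
```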

Using index_col in read_csv()

Here, we pass two columns to index_col, "Sex" first and then "Job Title", so the rows are labeled by a two-level index built from those columns instead of the default integer index.

Python3

df = pd.read_csv('people.csv',
        header=0,
        index_col=["Sex", "Job Title"],
        usecols=["Sex", "Job Title", "Email"])
 
print(df.head())

Output:

                                                Email
Sex    Job Title                                     
Male   Games developer           elijah57@example.net
Female Phytotherapist           bethany14@example.com
Male   Homeopath                bthompson@example.com
       Market researcher    kaitlinkaiser@example.com
       Veterinary surgeon  buchananmanuel@example.net
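Once columns are used as the index, rows can be looked up by label. The following is a minimal sketch on a hypothetical three-row sample in memory, using the standard .loc and reset_index() data frame operations.

```python
import io
import pandas as pd

# Hypothetical three-row sample in the shape of the article's data
raw = ("Sex,Job Title,Email\n"
       "Male,Games developer,elijah57@example.net\n"
       "Female,Phytotherapist,bethany14@example.com\n"
       "Male,Homeopath,bthompson@example.com\n")

df = pd.read_csv(io.StringIO(raw), index_col=["Sex", "Job Title"])
print(list(df.index.names))  # ['Sex', 'Job Title']

# Look up one row by its (Sex, Job Title) label pair
email = df.loc[("Female", "Phytotherapist"), "Email"]
print(email)  # bethany14@example.com

# reset_index() turns the index levels back into ordinary columns
flat = df.reset_index()
print(flat.columns.tolist())  # ['Sex', 'Job Title', 'Email']
```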

Using nrows in read_csv()

Here, we read only the first 3 rows of data using the nrows parameter.

Python3

df = pd.read_csv('people.csv',
                 header=0,
                 index_col=["Sex", "Job Title"],
                 usecols=["Sex", "Job Title", "Email"],
                 nrows=3)
 
print(df)

Output:

                                        Email
Sex    Job Title                             
Male   Games developer   elijah57@example.net
Female Phytotherapist   bethany14@example.com
Male   Homeopath        bthompson@example.com
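As a small sketch of how nrows counts, the hypothetical in-memory data below shows that the header line is not counted as a row, and that asking for more rows than the file contains simply returns every row.

```python
import io
import pandas as pd

# Hypothetical data: one header line followed by four data rows
raw = "name,score\nA,1\nB,2\nC,3\nD,4\n"

# nrows counts data rows only; the header line is not included in the count
first_two = pd.read_csv(io.StringIO(raw), nrows=2)
print(first_two["name"].tolist())  # ['A', 'B']

# Requesting more rows than exist returns all available rows
everything = pd.read_csv(io.StringIO(raw), nrows=100)
print(len(everything))  # 4
```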

Using skiprows in read_csv()

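The skiprows parameter skips the given lines of the CSV file while reading. Line numbers are 0-indexed and count the header as line 0, so the first data row is line 1. In the output, the skipped rows are missing from the second data frame and the remaining rows are renumbered.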

Python3

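df = pd.read_csv("people.csv")
print("Previous Dataset: ")
print(df)

# using skiprows: values are 0-indexed line numbers of the file,
# with the header on line 0, so lines 2 and 6 are the data rows
# shown at index 1 and 5 in the first data frame
df = pd.read_csv("people.csv", skiprows=[2, 6])
print("Dataset After skipping rows: ")
print(df)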

Output:

Previous Dataset:
  First Name Last Name     Sex                       Email Date of birth           Job Title 
0     Shelby   Terrell    Male        elijah57@example.net    1945-10-26     Games developer
1    Phillip   Summers  Female       bethany14@example.com    1910-03-24      Phytotherapist  
2   Kristine    Travis    Male       bthompson@example.com    1992-07-02           Homeopath  
3    Yesenia  Martinez    Male   kaitlinkaiser@example.com    2017-08-03   Market researcher
4       Lori      Todd    Male  buchananmanuel@example.net    1938-12-01  Veterinary surgeon 
5       Erin       Day    Male         tconner@example.org    2015-10-28  Management officer  
6  Katherine      Buck  Female     conniecowan@example.com    1989-01-22             Analyst
7    Ricardo    Hinton    Male     wyattbishop@example.com    1924-03-26      Hydrogeologist  
Dataset After skipping rows:               
  First Name Last Name     Sex                       Email Date of birth           Job Title 
0     Shelby   Terrell    Male        elijah57@example.net    1945-10-26     Games developer
1   Kristine    Travis    Male       bthompson@example.com    1992-07-02           Homeopath  
2    Yesenia  Martinez    Male   kaitlinkaiser@example.com    2017-08-03   Market researcher
3       Lori      Todd    Male  buchananmanuel@example.net    1938-12-01  Veterinary surgeon      
4  Katherine      Buck  Female     conniecowan@example.com    1989-01-22             Analyst
5    Ricardo    Hinton    Male     wyattbishop@example.com    1924-03-26      Hydrogeologist
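The three forms that skiprows accepts (a list of line numbers, a single integer, or a callable) behave differently, which the following sketch illustrates on hypothetical in-memory data. Note that an integer skips lines from the top of the file, including the header.

```python
import io
import pandas as pd

# Hypothetical data: header on line 0, data rows A-D on lines 1-4
raw = "name,score\nA,1\nB,2\nC,3\nD,4\n"

# List: skip file lines 1 and 3 (rows A and C); the header on line 0 is kept
by_list = pd.read_csv(io.StringIO(raw), skiprows=[1, 3])
print(by_list["name"].tolist())  # ['B', 'D']

# Integer: skip the first 2 lines (the header and row A),
# so the line "B,2" is then read as the header
by_int = pd.read_csv(io.StringIO(raw), skiprows=2)
print(by_int.columns.tolist())  # ['B', '2']

# Callable: called with each line number; return True to skip that line.
# This keeps the header and skips the even-numbered lines 2 and 4.
by_func = pd.read_csv(io.StringIO(raw),
                      skiprows=lambda i: i > 0 and i % 2 == 0)
print(by_func["name"].tolist())  # ['A', 'C']
```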
