How to import an excel file into Python using Pandas?

It is not always possible to get the dataset in CSV format. So, Pandas provides us the functions to convert datasets in other formats to the Data frame. An excel file has a ‘.xlsx’ format. 

Before we get started,  we need to install a few libraries. 

pip install pandas
pip install xlrd

 For importing an Excel file into Python using Pandas we have to use pandas.read_excel() function.

Syntax: pandas.read_excel(io, sheet_name=0, header=0, names=None,….)

Return: DataFrame or dict of DataFrames.



Let’s suppose the Excel file looks like this:

Excel file

Now, we can dive into the code. 

Example 1: Read an Excel file.

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

import pandas as pd
  
df = pd.read_excel("sample.xlsx")
print(df)

chevron_right


Output:

dataframe

Example 2: To select a particular column, we can pass a parameter “index_col“. 



Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

import pandas as pd
  
# Here 0th column will be extracted
df = pd.read_excel("sample.xlsx",
                   index_col = 0)  
  
print(df)

chevron_right


Output:

select a particular column

Example 3: In case you don’t prefer the initial heading of the columns, you can change it to indexes using the parameter “header”.

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

import pandas as pd
  
df = pd.read_excel('sample.xlsx',
                   header = None)
print(df)

chevron_right


Output:

dataframe without header

Example 4: If you want to change the data type of a particular column you can do it using the parameter “dtype“.

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

import pandas as pd
  
df = pd.read_excel('sample.xlsx'
                   dtype = {"Products": str,
                            "Price":float})
print(df)

chevron_right


Output:



data type change

Example 5: In case you have unknown values, then you can handle it using the parameter “na_values“. It will convert the mentioned unknown values into “NaN” 

Python3

filter_none

edit
close

play_arrow

link
brightness_4
code

import pandas as pd
df = pd.read_excel('sample.xlsx'
                   na_values =['item1'
                               'item2'])
print(df)

chevron_right


Output:

Dataframe with NaN value

Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.

To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course.




My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.


Article Tags :

Be the First to upvote.


Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.