Excel sheet is one of the most ubiquitous forms of files in the IT Industry. Everyone who uses a computer at one time or the other have come across and worked with excel spreadsheets. This popularity of excel is due to its vast range of applications in the field of storing and manipulation of data in a tabular and systematic form. Moreover, excel sheets are very instinctive and user-friendly, which makes it ideal for manipulating large datasets even for less technical folks. If you are looking for places to learn to manipulate and automate stuff in excel files using Python, look no more. You are at the right place.

In this article, you will learn how to use Pandas to work with Excel spreadsheets. At the end of the article, you will have the knowledge of:

- Necessary modules needed for this and how to set them up in your system.
- Reading data from excel file into pandas using Python.
- Exploring the data from excel files in Pandas.
- Using functions to manipulate and reshape the data in Pandas.

### Installation

To install pandas in Anaconda, we can use the following command in Anaconda Terminal:

conda install pandas

To install pandas in regular Python (Non-Anaconda), we can use the following command in the command prompt:

pip install pandas

### Getting Started

First of all, we need to import the pandas module which can be done by running the command:

`import` `pandas as pds ` |

*chevron_right*

*filter_none*

**Input File:** Let’s suppose the excel file looks like this

**Sheet 1:**

**Sheet 2:**

Now we can import the excel file using the read_excel function in pandas, as shown below:

`file` `=` `(` `'path_of_excel_file'` `) ` `newData ` `=` `pds.read_excel(` `file` `) ` `newData ` |

*chevron_right*

*filter_none*

**Output:**

The second statement reads the data from excel and stores it into a pandas Data Frame which is represented by the variable `newData`

. If there are multiple sheets in the excel workbook, the command will import data of the first sheet. To make a data frame with all the sheets in the workbook, the easiest method is to create different data frames separately and then concatenate them. The read_excel method takes argument `sheet_name`

and `index_col`

where we can specify the sheet of which the data frame should be made of and `index_col`

specifies the title column.

**Example:**

`sheet1 ` `=` `pds.read_excel(` `file` `, ` ` ` `sheet_name ` `=` `0` `, ` ` ` `index_col ` `=` `0` `) ` ` ` `sheet2 ` `=` `pds.read_excel(` `file` `, ` ` ` `sheet_name ` `=` `1` `, ` ` ` `index_col ` `=` `0` `) ` ` ` `newData ` `=` `pds.concat([sheet1, sheet2]) ` |

*chevron_right*

*filter_none*

The third statement concatenates both the sheets. Now to check the whole data frame, we can simply run the following command:

`newData ` |

*chevron_right*

*filter_none*

**Output:**

To view 5 columns from the top and from the bottom of the data frame, we can run the command:

`newData.head() ` `newData.tail() ` |

*chevron_right*

*filter_none*

**Output:**

This `head()`

and `tail()`

method also take arguments as numbers for the number of columns to show.

The `shape()`

method can be used to view the number of rows and columns in the data frame as follows:

`newData.shape ` |

*chevron_right*

*filter_none*

**Output:**

If any column contains numerical data, we can sort that column using the `sort_values()`

method in pandas as follows:

`sorted_column ` `=` `newData.sort_values([` `'Height'` `], ascending ` `=` `False` `) ` |

*chevron_right*

*filter_none*

Now, lets suppose we want the top 5 values of the sorted column, we can use the `head()`

method here:

`sorted_column[` `'Height'` `].head(` `5` `) ` |

*chevron_right*

*filter_none*

**Output:**

We can do that with any numerical column of the data frame as shown below:

`newData[` `'Weight'` `].head() ` |

*chevron_right*

*filter_none*

**Output:**

Now, suppose our data is mostly numerical. We can get the statistical information like mean, max, min, etc. about the data frame using the `describe()`

method as shown below:

`newData.describe() ` |

*chevron_right*

*filter_none*

**Output:**

This can also be done separately for all the numerical columns using the following command:

`newData[` `'Weight'` `].mean() ` |

*chevron_right*

*filter_none*

**Output:**

Other statistical information can also be calculated using the respective methods.

Like in excel, formulas can also be applied and calculated columns can be created as follows:

`newData[` `'calculated_column'` `]` `=` `newData[“Height”] ` `+` `newData[“Weight”] ` `newData[` `'calculated_column'` `].head() ` |

*chevron_right*

*filter_none*

**Output:**

After operating on the data in the data frame, we can export the data back to an excel file using the method `to_excel`

. For this we need to specify an output excel file where the transformed data is to be written, as shown below:

`newData.to_excel(` `'Output File.xlsx'` `) ` |

*chevron_right*

*filter_none*

**Output:**

## Recommended Posts:

- Joining Excel Data from Multiple files using Python Pandas
- How to Filter and save the data as new files in Excel with Python Pandas?
- Creating a dataframe using Excel files
- Working with wav files in Python using Pydub
- Working with csv files in Python
- Working with zip files in Python
- Working with PDF files in Python
- How to extract Time data from an Excel file column using Pandas?
- Convert CSV to Excel using Pandas in Python
- Find the profit and loss in the given Excel sheet using Pandas
- How to Create a Pivot table with multiple indexes from an excel sheet using Pandas in Python?
- How to import an excel file into Python using Pandas?
- How to sort date in excel using Pandas?
- How to extract Email column from Excel file and find out the type of mail using Pandas?
- How to import excel file and find a specific column using Pandas?
- How to extract date from Excel file using Pandas?
- Find the profit and loss percent in the given Excel sheet using Pandas
- Find the sum and maximum value of the two column in excel file using Pandas
- Python | Working with date and time using Pandas
- Working with database using Pandas

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.