Open In App

Excel Automation with Openpyxl in Python

Last Updated : 08 Jun, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

Excel is a powerful tool that allows you to store, manipulate and analyze data. It is widely used in various industries for data analysis and reporting. However, as the size of the data increases, Excel becomes slow and it takes a lot of time to perform complex operations. This is where Python comes into play. Python is a versatile programming language that can be used for a wide range of applications, including data analysis, machine learning, and automation. In this blog, we will discuss how to automate your Excel reporting with Python.

Concepts Related to the Topic:

  • Excel – Excel is a spreadsheet software developed by Microsoft. It is used to store, manipulate and analyze data.
  • Python – Python is a high-level programming language that is used for a wide range of applications, including web development, data analysis, and automation.
  • Pandas – Pandas is a Python library used for data manipulation and analysis. It provides data structures for efficiently storing and manipulating large datasets.
  • Openpyxl – Openpyxl is a Python library used for reading and writing Excel files. It allows you to perform various operations on Excel files using Python.

Steps Needed:

  1. Install Python and required libraries – First, you need to install Python and the required libraries like Pandas and Openpyxl.
  2. Load Excel file – Next, you need to load the Excel file using Pandas. Pandas provides a function called ‘read_excel’ that allows you to read Excel files.
  3. Perform Data Manipulation – Once you have loaded the data, you can perform various operations like filtering, sorting, and grouping using Pandas.
  4. Write Data to Excel – Once you have performed the required operations, you can write the data back to the Excel file using openpyxl.

Different Possible Approaches:

  • Using Pandas and Openpyxl – You can use Pandas to load the Excel file, perform data manipulation, and then write the data back to the Excel file using openpyxl.
  • Using xlwings – xl-wings is a Python library that allows you to interact with Excel using Python. You can use xlwings to automate Excel tasks like data entry, formatting, and charting.
  • Using PyXLL – PyXLL is a Python library that allows you to write Python functions in Excel. You can use PyXLL to perform complex calculations and data manipulation in Excel using Python.

Example 1:

Load Excel File and Write the Data :

Python3




import pandas as pd
from openpyxl import load_workbook
  
# Load Excel File and give path to your file
df = pd.read_excel('data.xlsx'
  
# Perform Data Manipulation
df = df[df['Sales'] > 1000]
  
# Write Data to Excel File
book = load_workbook('data.xlsx')
writer = pd.ExcelWriter('data.xlsx', engine='openpyxl')
writer.book = book
df.to_excel(writer, index=False)
writer.save()


In this example, we are using two libraries: Pandas and Openpyxl. Pandas is used to load the Excel file and perform data manipulation while Openpyxl is used to write the data back to the Excel file.

First, we import the required libraries using the ‘import’ statement. We then use the ‘pd.read_excel’ function to load the Excel file ‘data.xlsx’ into a Pandas DataFrame called ‘df’. 

File link:- Link

Next, we perform data manipulation on the DataFrame by filtering the data where ‘Sales’ is greater than 1000. 

This is done using the following line of code:    

df = df[df[‘Sales’] > 1000]

Finally, we use Openpyxl to write the filtered data back to the Excel file. We start by loading the Excel file using the ‘load_workbook’ function and then creating an ExcelWriter object using the ‘pd.ExcelWriter’ function. We set the book attribute of the writer object to the loaded workbook and then use the ‘to_excel’ method of the DataFrame object to write the filtered data to the worksheet. Finally, we save the changes to the Excel file using the ‘save’ method of the writer object.
Screenshot:
 

Example 1 Screenshot

Example 2:

Load Excel File, plot charts and Manipulate Data
 

Python3




#import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
  
# Import data from Excel file
sales_data = pd.read_excel('/content/productSales.xlsx') #give file path
  
# Create a bar chart
plt.bar(sales_data['product'], sales_data['sales'])
plt.title('Sales by Product')
plt.xlabel('Product')
plt.ylabel('Sales')
plt.show()
  
# Manipulate data as needed
sales_data = sales_data.groupby('product').sum()
  
# Export data back into Excel file
writer = pd.ExcelWriter('sales_report.xlsx')
sales_data.to_excel(writer, sheet_name='Sales Data')
writer.save()
writer.close()


In this code, we imported the necessary libraries pandas and matplotlib.pyplot, reads data from an Excel file (‘productSales.xlsx’) using pandas’ read_excel() function, creates a bar chart using matplotlib’s plt.bar() function and then manipulates the data by grouping it by ‘product’ and summing the sales for each product using pandas’ group by () and sum() functions. After manipulating the data, we export it back into a new Excel file (‘sales_report.xlsx’) using pandas’ to_excel() function and save and close the file using the save() and close() methods of the ExcelWriter object.
File link of product sales:-  Link
Screenshot:

Example 2 Screenshot



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads