Get values of all rows in a particular column in openpyxl – Python
Last Updated: 23 Jan, 2023
In this article, we will explore how to get the values of all rows in a particular column of a spreadsheet using openpyxl in Python. We will start with how to install and import openpyxl. Then we will walk through examples of how to extract the values of a particular column from a spreadsheet and store them in a list.
To install openpyxl using pip, open a terminal window or command prompt and enter the following command:
pip install openpyxl
Raw Spreadsheet Data
For example, suppose you have a spreadsheet with the following data:
Name | Age | Gender
John | 30 | Male
Jane | 25 | Female
Bob | 35 | Male
Alice | 28 | Female
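To follow along, the sample sheet can be created with openpyxl itself. The snippet below is a minimal sketch, assuming the file should be called data.xlsx (the name used by the examples in this article) and saved in the current working directory. The variable name rows is only illustrative.

```python
import openpyxl

# Rows of the sample table; the first row holds the column names.
rows = [
    ("Name", "Age", "Gender"),
    ("John", 30, "Male"),
    ("Jane", 25, "Female"),
    ("Bob", 35, "Male"),
    ("Alice", 28, "Female"),
]

# A new workbook starts with a single worksheet, available as .active.
workbook = openpyxl.Workbook()
sheet = workbook.active

# append() writes each tuple as the next row, starting at row 1.
for row in rows:
    sheet.append(row)

# Assumption: later examples read data.xlsx from the current directory.
workbook.save("data.xlsx")
```

Ages written as Python integers in this way are read back as integers, so a file produced by this sketch may print 30 where a file produced by another tool prints 30.0.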
Extracting the Values of a Particular Column
Once a workbook is loaded with openpyxl.load_workbook() and a worksheet is selected, you can extract the values of a particular column from the sheet. To do this, iterate over the rows in the sheet and read the value of the cell in the column you are interested in; within a row, index 0 is the first column. You can then store the values in a list or other data structure for further processing.
Example 1
To extract the values of the “Name” column, you could use the following code:
Python3
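import openpyxl

# Load the workbook and select its first worksheet
workbook = openpyxl.load_workbook("data.xlsx")
sheet = workbook.worksheets[0]

# Collect the value of the first cell (column A) of every row
names = []
for row in sheet:
    name = row[0].value
    names.append(name)

print(names)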
Output:
['Name', 'John', 'Jane', 'Bob', 'Alice']
Example 2
Note that the first row in the sheet contains the column names, so it is included in the list. If you want to exclude the column names from the list, you can add a check to skip the first row:
Python3
import openpyxl

# Load the workbook and select its first worksheet
workbook = openpyxl.load_workbook("data.xlsx")
sheet = workbook.worksheets[0]

names = []
for i, row in enumerate(sheet):
    # Index 0 is the header row, so skip it
    if i == 0:
        continue
    name = row[0].value
    names.append(name)

print(names)
Output:
['John', 'Jane', 'Bob', 'Alice']
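As an alternative to skipping the header by hand, the iter_rows() method of a worksheet accepts min_row, min_col, max_col and values_only arguments. The following is a minimal sketch rather than the article's original method. It assumes the sample table from above and builds it in memory so that it runs without data.xlsx.

```python
import openpyxl

# Build the sample sheet in memory so the sketch runs without data.xlsx.
workbook = openpyxl.Workbook()
sheet = workbook.active
for row in [
    ("Name", "Age", "Gender"),
    ("John", 30, "Male"),
    ("Jane", 25, "Female"),
    ("Bob", 35, "Male"),
    ("Alice", 28, "Female"),
]:
    sheet.append(row)

# min_row=2 starts below the header row; min_col=max_col=1 restricts the
# iteration to column A; values_only=True yields plain values, not cells.
names = [
    row[0]
    for row in sheet.iter_rows(min_row=2, min_col=1, max_col=1, values_only=True)
]

print(names)  # ['John', 'Jane', 'Bob', 'Alice']
```

Restricting the range this way avoids touching cells outside the column of interest, which can matter on wide sheets.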
Extracting the Values of a Column by Its Name
In the previous examples, we extracted the values of a particular column by specifying the index of the column. However, in some cases, it may be more convenient to extract the values of a column by its name rather than its index. To do this, you can use the iter_cols() method of the worksheet object to iterate over the columns in the sheet, and check the value of the first cell in each column to determine the column name.
Example 1
To extract the values of the “Age” column by its name, you could use the following code:
Python3
import openpyxl

# Load the workbook and select its first worksheet
workbook = openpyxl.load_workbook("data.xlsx")
sheet = workbook.worksheets[0]

ages = []
for column in sheet.iter_cols():
    # The first cell of each column holds the column name
    column_name = column[0].value
    if column_name == "Age":
        for cell in column:
            ages.append(cell.value)

print(ages)
Output:
['Age', 30.0, 25.0, 35.0, 28.0]
Example 2
To exclude the column name from the list, you can add a check to skip the first cell:
Python3
import openpyxl

# Load the workbook and select its first worksheet
workbook = openpyxl.load_workbook("data.xlsx")
sheet = workbook.worksheets[0]

ages = []
for column in sheet.iter_cols():
    # The first cell of each column holds the column name
    column_name = column[0].value
    if column_name == "Age":
        for i, cell in enumerate(column):
            # Index 0 is the header cell, so skip it
            if i == 0:
                continue
            ages.append(cell.value)

print(ages)
Output:
[30.0, 25.0, 35.0, 28.0]
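The lookup by column name can also be wrapped in a small reusable function. The sketch below is one possible way to do it: column_values is a hypothetical helper, not part of openpyxl, and the sample sheet is built in memory (an assumption) so that the code runs without data.xlsx.

```python
import openpyxl


def column_values(sheet, header):
    """Return the values below the first column whose top cell equals header.

    Hypothetical helper for illustration; it is not part of openpyxl.
    """
    # values_only=True yields each column as a tuple of plain values.
    for column in sheet.iter_cols(values_only=True):
        if column[0] == header:
            # Slice off the header cell and keep the data cells.
            return list(column[1:])
    raise KeyError(f"no column named {header!r}")


# Build the sample sheet in memory so the sketch runs without data.xlsx.
workbook = openpyxl.Workbook()
sheet = workbook.active
for row in [
    ("Name", "Age", "Gender"),
    ("John", 30, "Male"),
    ("Jane", 25, "Female"),
    ("Bob", 35, "Male"),
    ("Alice", 28, "Female"),
]:
    sheet.append(row)

ages = column_values(sheet, "Age")
print(ages)  # [30, 25, 35, 28]
```

Raising KeyError for an unknown header is a design choice made here; returning an empty list would work as well if a missing column is not considered an error.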