Python is a great language for data analysis, primarily because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier. This article shows how to extract rows using Pandas .iloc[] in Python.
Pandas .iloc[] Syntax
Syntax: pandas.DataFrame.iloc[]
Parameters: Integer position of a row, a list of integer positions, or a slice of integer positions.
Return type: DataFrame or Series, depending on the parameters
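How the return type depends on the parameter can be sketched as follows. This is a minimal illustration on a small made-up DataFrame (the column names and values are assumptions, not taken from nba.csv).

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({"Age": [25, 30, 22], "Salary": [50000, 60000, 45000]})

single = df.iloc[0]         # a single integer returns a Series
as_list = df.iloc[[0]]      # a one-element list returns a one-row DataFrame
several = df.iloc[[0, 2]]   # a list of integers returns a DataFrame

print(type(single).__name__)   # Series
print(type(as_list).__name__)  # DataFrame
print(several.shape)           # (2, 2)
```

Wrapping a single position in a list is a convenient way to keep the result two-dimensional when later code expects a DataFrame.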
What is Pandas .iloc[] in Python?
In the Python Pandas library, .iloc[] is an indexer used for integer-location-based indexing of data in a DataFrame. It selects specific rows and columns by their integer positions, which makes it the tool of choice when you want to access or manipulate data by numerical position rather than by label.
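As a rough sketch of selecting both rows and columns by position, the snippet below uses a small made-up DataFrame (all names and values are hypothetical). The first value in the brackets addresses rows and the second addresses columns.

```python
import pandas as pd

# Hypothetical toy data, used only for illustration
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Age": [25, 30, 22, 35],
    "Salary": [50000, 60000, 45000, 70000],
})

cell = df.iloc[1, 2]              # row position 1, column position 2
first_two_cols = df.iloc[:, :2]   # all rows, first two columns
block = df.iloc[[0, 3], [0, 2]]   # rows 0 and 3, columns 0 and 2
last_row = df.iloc[-1]            # negative positions count from the end

print(cell)
print(first_two_cols)
print(block)
print(last_row)
```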
Dataset Used: The first two examples read a CSV file named nba.csv, which must be available in the working directory.
Extracting Rows using Pandas .iloc[] in Python
The Pandas library provides the DataFrame.iloc[] indexer to retrieve rows by position. It is most useful when the index labels of a DataFrame are something other than the default numeric series 0, 1, 2, 3, … or when the user doesn’t know the index labels. Rows are addressed by their integer position, which always starts at 0 and is not displayed in the DataFrame.
There are several ways to extract rows using Pandas .iloc[] in Python. The commonly used ones covered here are:
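The case of a non-numeric index can be sketched like this. The string labels "x", "y", "z" and the data are made up for illustration. Positional lookup works, while looking up the integer 1 as a label fails because no such label exists.

```python
import pandas as pd

# Hypothetical DataFrame whose index labels are strings, not 0, 1, 2
df = pd.DataFrame({"Age": [25, 30, 22]}, index=["x", "y", "z"])

# Position 1 is the second row, whatever its label is
second = df.iloc[1]
print(second.name)   # y

# The integer 1 is not an index label here, so .loc[] raises KeyError
label_lookup_failed = False
try:
    df.loc[1]
except KeyError:
    label_lookup_failed = True
print(label_lookup_failed)
```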
- Selecting rows using Pandas .iloc and loc
- Selecting Multiple Rows using Pandas .iloc[] in Python
- Select Rows by Name or Index using Pandas .iloc[] in Python
Selecting rows using Pandas .iloc and loc
In this example, the row at index 3 is extracted with both .iloc[] and .loc[], and the two results are compared. Because the default index is the integer sequence 0, 1, 2, …, the index labels coincide with the integer positions.
Python3
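import pandas as pd
# Load the dataset (nba.csv must be in the working directory)
data = pd.read_csv('nba.csv')
# Row with index label 3 and row at integer position 3
row1 = data.loc[3]
row2 = data.iloc[3]
# Element-wise comparison of the two rows
print(row1 == row2)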
Output:
Name True
Team True
Number True
Position True
Age True
Height True
Weight True
College True
Salary True
Name: 3, dtype: bool
As the output shows, both indexers return the same row.
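The agreement between .loc[] and .iloc[] holds only while the index labels equal the positions. A minimal sketch on made-up data (not nba.csv) shows how the two diverge once rows are filtered out:

```python
import pandas as pd

# Hypothetical toy data with the default index 0..4
df = pd.DataFrame({"Age": [25, 30, 22, 35, 28]})

# Filtering keeps the original labels: 0, 1, 3, 4
older = df[df["Age"] > 24]

by_label = older.loc[3]      # row whose label is 3
by_position = older.iloc[3]  # fourth remaining row, whose label is 4

print(by_label["Age"])       # 35
print(by_position["Age"])    # 28

# reset_index(drop=True) renumbers the labels so they match positions again
renumbered = older.reset_index(drop=True)
print(renumbered.loc[3]["Age"])  # 28
```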
Selecting Multiple Rows using Pandas .iloc[] in Python
In this example, multiple rows are extracted, first by passing a list of integer positions and then by passing a slice that covers the same range. The end of a slice is exclusive, so 4:8 selects positions 4 through 7. The two results are then compared.
Python3
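import pandas as pd
# Load the dataset (nba.csv must be in the working directory)
data = pd.read_csv('nba.csv')
# Rows at positions 4 to 7, selected with a list and with a slice
row1 = data.iloc[[4, 5, 6, 7]]
row2 = data.iloc[4:8]
# Element-wise comparison of the two selections
print(row1 == row2)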
Output:
   Name  Team  Number  Position   Age  Height  Weight  College  Salary
4  True  True    True      True  True    True    True    False    True
5  True  True    True      True  True    True    True    False    True
6  True  True    True      True  True    True    True     True    True
7  True  True    True      True  True    True    True     True    True
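As the output shows, both approaches return the same rows. All values are True except two in the College column. Those entries are NaN in both selections, and NaN never compares equal to NaN, so the element-wise comparison reports False there even though the rows are identical.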
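When missing values are present, DataFrame.equals() is one way to confirm that two selections match, since it treats NaN values in the same location as equal. The sketch below uses a small made-up DataFrame rather than nba.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with missing values in one column
df = pd.DataFrame({
    "Name": ["P", "Q", "R"],
    "College": [np.nan, "Texas", np.nan],
})

by_list = df.iloc[[0, 1, 2]]
by_slice = df.iloc[0:3]

# Element-wise comparison: False wherever both sides are NaN
elementwise = by_list == by_slice
print(elementwise)

# equals() returns a single bool and counts matching NaNs as equal
same = by_list.equals(by_slice)
print(same)
```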
Select Rows by Name or Index using Pandas .iloc[] in Python
This code creates a DataFrame with the age and salary of five individuals (Geek1 to Geek5) and sets the ‘Name’ column as the index. It prints the original DataFrame, then extracts a single row (Geek1, at position 0) and multiple rows (Geek2 and Geek3, at positions 1 and 2) using .iloc[] integer-location indexing. The extracted rows are printed for verification.
Python3
import pandas as pd
# Build a small DataFrame and use the 'Name' column as the index
data = pd.DataFrame({
    'Name': ['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
})
data.set_index('Name', inplace=True)
print("Original DataFrame:")
print(data)
# Position 0 is the first row (Geek1); ':' keeps all columns
row_geek1 = data.iloc[0, :]
print("\nExtracted Row (Geek1):")
print(row_geek1)
# Positions 1 and 2 (the slice end, 3, is exclusive)
rows_geek2_to_geek3 = data.iloc[1:3, :]
print("\nExtracted Rows (Geek2 to Geek3):")
print(rows_geek2_to_geek3)
Output:
Original DataFrame:
       Age  Salary
Name
Geek1   25   50000
Geek2   30   60000
Geek3   22   45000
Geek4   35   70000
Geek5   28   55000

Extracted Row (Geek1):
Age          25
Salary    50000
Name: Geek1, dtype: int64

Extracted Rows (Geek2 to Geek3):
       Age  Salary
Name
Geek2   30   60000
Geek3   22   45000
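Since .iloc[] accepts only positions, a row name has to be translated into a position before it can be used. One possible way, sketched here as an assumption rather than as the article's method, is Index.get_loc(), which returns the integer position of a label when the index is unique. The alternative is to use .loc[] with the name directly.

```python
import pandas as pd

# Same illustrative data as in the example above
data = pd.DataFrame({
    'Name': ['Geek1', 'Geek2', 'Geek3', 'Geek4', 'Geek5'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
}).set_index('Name')

# Translate the label into its integer position (unique index assumed)
pos = data.index.get_loc('Geek3')
print(pos)  # 2

row_by_position = data.iloc[pos]    # positional access
row_by_label = data.loc['Geek3']    # label-based access to the same row

print(row_by_position.equals(row_by_label))  # True
```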
Conclusion
In conclusion, Pandas .iloc[] is the indexer for extracting rows by integer position. It is most valuable in datasets where numerical positions matter more than labels, for example when the index is non-numeric or unknown. It retrieves individual rows, lists of rows, or slices, which makes it a fundamental tool for data manipulation and analysis with Pandas.
Last Updated: 04 Dec, 2023