Indexing in Pandas means selecting particular rows and columns of data from a DataFrame. You can select all the rows with only some of the columns, some of the rows with all the columns, or a subset of both rows and columns. Indexing is also known as subset selection.
Creating a DataFrame to Select Rows & Columns in Pandas
Pandas offers several ways to select rows and columns. This article explains three commonly used ones:
- Select columns by name using [ ]
- Select rows and columns by label using loc
- Select rows and columns by position using iloc
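As a quick orientation, the three approaches can reach the same column in different ways. The snippet below is a minimal sketch on a two-row stand-in table (the variable names `by_brackets`, `by_loc`, and `by_iloc` are illustrative only), assuming the column order Name, Age, City, Salary used throughout this article.

```python
import pandas as pd

# Small stand-in table with the same column order as the article's data.
df = pd.DataFrame(
    [("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000)],
    columns=["Name", "Age", "City", "Salary"],
)

by_brackets = df["City"]    # [ ]  : column name
by_loc = df.loc[:, "City"]  # loc  : all rows, column label
by_iloc = df.iloc[:, 2]     # iloc : all rows, column position (0-based)

# All three hold the same values with the same row index.
print(by_brackets.equals(by_loc) and by_loc.equals(by_iloc))  # True
```

The sections that follow look at each approach in turn.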
Create a List of Tuples
In this example, the code defines a list of tuples holding employee information and builds a DataFrame from it. The column names are ‘Name’, ‘Age’, ‘City’, and ‘Salary’.
Python3
import pandas as pd
# Each tuple holds one employee: (name, age, city, salary)
employees = [
    ('Stuti', 28, 'Varanasi', 20000),
    ('Saumya', 32, 'Delhi', 25000),
    ('Aaditya', 25, 'Mumbai', 40000),
    ('Saumya', 32, 'Delhi', 35000),
    ('Saumya', 32, 'Delhi', 30000),
    ('Saumya', 32, 'Mumbai', 20000),
    ('Aaditya', 40, 'Dehradun', 24000),
    ('Seema', 32, 'Delhi', 70000),
]
# Build the DataFrame and name its columns
df = pd.DataFrame(employees, columns=['Name', 'Age', 'City', 'Salary'])
df
Output:
Name Age City Salary
0 Stuti 28 Varanasi 20000
1 Saumya 32 Delhi 25000
2 Aaditya 25 Mumbai 40000
3 Saumya 32 Delhi 35000
4 Saumya 32 Delhi 30000
5 Saumya 32 Mumbai 20000
6 Aaditya 40 Dehradun 24000
7 Seema 32 Delhi 70000
Select Columns by Name in Pandas DataFrame using [ ]
The [ ] operator selects a column when given its name, or several columns when given a list of names. Given a boolean condition instead, it selects the rows for which the condition is true. Some commonly used forms are:
- Select a Single Column
- Select Multiple Columns
- Select Rows For Specific Column Value
- Select Rows Based on Condition on Salary
Select a Single Column
Example: This code extracts the “City” column from the DataFrame `df` using square bracket notation and assigns it to the variable `result`. The result is a Series, which is then displayed.
Python3
result = df["City"]
result
Output:
0 Varanasi
1 Delhi
2 Mumbai
3 Delhi
4 Delhi
5 Mumbai
6 Dehradun
7 Delhi
Name: City, dtype: object
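The output above is a Series because a single column name was passed. Passing a list that contains one name returns a one-column DataFrame instead. A minimal sketch on a two-row stand-in table (the names `as_series` and `as_frame` are illustrative only):

```python
import pandas as pd

# Small stand-in for the article's employee table.
df = pd.DataFrame(
    [("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000)],
    columns=["Name", "Age", "City", "Salary"],
)

as_series = df["City"]   # one label -> pandas Series
as_frame = df[["City"]]  # list with one label -> one-column DataFrame

print(type(as_series).__name__)  # Series
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (2, 1)
```

Which form is preferable depends on what the next step expects: a Series for element-wise work on one column, a DataFrame when the tabular shape should be kept.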
Select Multiple Columns
Example: This code creates a new DataFrame (`result`) containing only the columns “Name”, “Age”, and “Salary” from the original DataFrame (`df`), and then displays it. Note the double brackets: the inner pair is a Python list of column names.
Python3
result = df[["Name", "Age", "Salary"]]
result
Output:
Name Age Salary
0 Stuti 28 20000
1 Saumya 32 25000
2 Aaditya 25 40000
3 Saumya 32 35000
4 Saumya 32 30000
5 Saumya 32 20000
6 Aaditya 40 24000
7 Seema 32 70000
Select Rows For Specific Column Value
Example: This code selects the rows of the DataFrame (`df`) where the ‘City’ column equals ‘Delhi’, and then prints the resulting DataFrame of employees from Delhi. The comparison `df['City'] == 'Delhi'` produces a boolean Series, and [ ] keeps the rows where it is True.
Python3
delhi_employees = df[df['City'] == 'Delhi']
print(delhi_employees)
Output :
Name Age City Salary
1 Saumya 32 Delhi 25000
3 Saumya 32 Delhi 35000
4 Saumya 32 Delhi 30000
7 Seema 32 Delhi 70000
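Conditions like the one above can be combined. The sketch below is one possible way to do it, using `&` for "and" (each condition needs its own parentheses) and `Series.isin` to match any of several values. The thresholds and the variable names `delhi_high` and `other_cities` are illustrative assumptions, not part of the original example.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# Rows where BOTH conditions hold; parentheses are required around each one.
delhi_high = df[(df["City"] == "Delhi") & (df["Salary"] >= 30000)]

# Rows whose City matches ANY value in the list.
other_cities = df[df["City"].isin(["Mumbai", "Dehradun"])]

print(delhi_high)
print(other_cities)
```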
Select Rows Based on Condition on Salary
Example: This code filters the original `employees` list of tuples (not the DataFrame) with a list comprehension, keeping the employees whose salary, stored at position 3 of each tuple, exceeds 30000. It then prints the resulting list, named `high_salary_employees`.
Python3
high_salary_employees = [employee for employee in employees if employee[3] > 30000]
print(high_salary_employees)
Output :
[('Aaditya', 25, 'Mumbai', 40000), ('Saumya', 32, 'Delhi', 35000), ('Seema', 32, 'Delhi', 70000)]
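The same filter can also be applied to the DataFrame itself rather than to the list of tuples. The following is a minimal sketch of two equivalent ways to do that, a boolean mask inside [ ] and `DataFrame.query`; the variable names `high_salary` and `high_salary_q` are illustrative only.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# Boolean mask: keep rows whose Salary is strictly greater than 30000.
high_salary = df[df["Salary"] > 30000]

# The same condition written as a query string.
high_salary_q = df.query("Salary > 30000")

print(high_salary)
```

Both forms keep the original row labels (2, 3, and 7 here), so the selected rows can still be traced back to the source table.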
Select Rows by Name in Pandas DataFrame using loc
The .loc[] indexer selects data by the labels of rows or columns. It can select a subset of rows, a subset of columns, or both at once. Some common ways to use it are:
- Select a Single Row
- Select Multiple Rows
- Select Multiple Rows and Particular Columns
- Select all the Rows With Some Particular Columns
Select a Single Row
Example: This code builds `indexed`, a copy of `df` with the “Name” column as its index, then selects and displays the row labeled “Stuti” using the .loc[] indexer. Calling set_index() without inplace=True leaves `df` itself unchanged, so the later examples can still use its default integer index.
Python3
indexed = df.set_index("Name")
result = indexed.loc["Stuti"]
result
Output:
Age 28
City Varanasi
Salary 20000
Name: Stuti, dtype: object
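The result above is a Series because the label “Stuti” occurs once in the index. Several employees in this table share the name “Saumya”, and for a repeated label .loc[] returns every matching row as a DataFrame. A minimal sketch (the names `indexed`, `one_match`, and `many_matches` are illustrative only):

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# set_index returns a new DataFrame; df keeps its integer index.
indexed = df.set_index("Name")

one_match = indexed.loc["Stuti"]      # label occurs once  -> Series
many_matches = indexed.loc["Saumya"]  # label occurs 4 times -> DataFrame

print(type(one_match).__name__, type(many_matches).__name__)
print(many_matches)
```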
Select Multiple Rows
Example: This code builds `indexed`, a copy of `df` with the “Name” column as its index, then selects and displays the rows labeled “Stuti” and “Seema” by passing a list of labels to the .loc[] indexer.
Python3
indexed = df.set_index("Name")
result = indexed.loc[["Stuti", "Seema"]]
result
Output:
Age City Salary
Name
Stuti 28 Varanasi 20000
Seema 32 Delhi 70000
Select Multiple Rows and Particular Columns
Syntax: DataFrame.loc[["row1", "row2", ...], ["column1", "column2", "column3", ...]]
Example: This code builds `indexed`, a copy of `df` with the “Name” column as its index, then selects the “City” and “Salary” columns for the rows labeled “Stuti” and “Seema” and displays the result.
Python3
indexed = df.set_index("Name")
result = indexed.loc[["Stuti", "Seema"], ["City", "Salary"]]
result
Output:
City Salary
Name
Stuti Varanasi 20000
Seema Delhi 70000
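The row selector of .loc[] does not have to be a list of labels; it also accepts a boolean condition, which makes it possible to filter rows and pick columns in one step. The sketch below assumes the default integer index and an illustrative threshold of 30000; the name `well_paid` is hypothetical.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# Rows: where Salary > 30000. Columns: only Name and City.
well_paid = df.loc[df["Salary"] > 30000, ["Name", "City"]]

print(well_paid)
```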
Select all the Rows With Some Particular Columns
We use a single colon ( : ) in the row position to select all rows, followed by the list of columns that we want to select, as given below:
Syntax: DataFrame.loc[:, ["column1", "column2", "column3"]]
Example: This code builds `indexed`, a copy of `df` with the “Name” column as its index, and extracts the “City” and “Salary” columns for every row into a new DataFrame named `result`.
Python3
indexed = df.set_index("Name")
result = indexed.loc[:, ["City", "Salary"]]
result
Output:
City Salary
Name
Stuti Varanasi 20000
Saumya Delhi 25000
Aaditya Mumbai 40000
Saumya Delhi 35000
Saumya Delhi 30000
Saumya Mumbai 20000
Aaditya Dehradun 24000
Seema Delhi 70000
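Instead of listing every column, .loc[] also accepts a slice of column labels. Unlike ordinary Python slices, a label slice includes both endpoints. A minimal sketch, assuming the column order Age, City, Salary that remains after “Name” becomes the index (the name `age_to_city` is illustrative only):

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])
indexed = df.set_index("Name")

# Label slice over columns: "Age" through "City", both included.
age_to_city = indexed.loc[:, "Age":"City"]

print(list(age_to_city.columns))  # ['Age', 'City']
```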
Select Rows and Columns in Pandas DataFrame using iloc
The iloc[ ] indexer selects by position. It is similar to the loc[ ] indexer, but it takes only integer positions (counted from 0) rather than labels. Some common ways to use it are:
- Select a Single Row
- Select Multiple Rows
- Select Multiple Rows With Some Particular Columns
- Select all the Rows With Some Particular Columns
Select a Single Row
Example: This code selects the third row of the DataFrame (`df`), which is at position 2, using integer-location based indexing (.iloc[]) and assigns it to the variable `result`. The last line displays the selected row as a Series.
Python3
result = df.iloc[2]
result
Output:
Name Aaditya
Age 25
City Mumbai
Salary 40000
Name: 2, dtype: object
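Because .iloc[] counts positions and ignores labels, it behaves the same way whatever the index contains. The sketch below illustrates this on a copy of the table indexed by name, and also shows a negative position, which counts from the end. The names `indexed`, `third`, and `last` are illustrative only.

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])
indexed = df.set_index("Name")

third = indexed.iloc[2]   # position 2, regardless of the row labels
last = indexed.iloc[-1]   # negative positions count from the end

print(third.name, third["City"])  # Aaditya Mumbai
print(last.name, last["Salary"])  # Seema 70000
```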
Select Multiple Rows
Example: This code uses the `.iloc[]` indexer to select the rows at positions 2, 3, and 5 from the DataFrame `df` and assigns them to the variable `result`. The last line displays the selected rows.
Python3
result = df.iloc[[2, 3, 5]]
result
Output:
Name Age City Salary
2 Aaditya 25 Mumbai 40000
3 Saumya 32 Delhi 35000
5 Saumya 32 Mumbai 20000
Select Multiple Rows With Some Particular Columns
Example: This code uses the `.iloc[]` indexer to select the rows at positions 2, 3, and 5 and the columns at positions 0 and 1 from the DataFrame `df`, creating a new DataFrame called `result`, and then displays the selected data.
Python3
result = df.iloc[[2, 3, 5], [0, 1]]
result
Output:
Name Age
2 Aaditya 25
3 Saumya 32
5 Saumya 32
Select all the Rows With Some Particular Columns
Example: This code uses the `.iloc[]` indexer to select all rows of the DataFrame (`df`) while keeping only the columns at positions 0 and 1, and stores the result in the variable `result`. Finally, it displays the resulting DataFrame; `df` itself is not modified.
Python3
result = df.iloc[:, [0, 1]]
result
Output:
Name Age
0 Stuti 28
1 Saumya 32
2 Aaditya 25
3 Saumya 32
4 Saumya 32
5 Saumya 32
6 Aaditya 40
7 Seema 32
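Both indexers also accept slices, and this is where they differ most visibly: an .iloc[] slice excludes its end position, as in ordinary Python, while a .loc[] slice includes its end label. A minimal sketch on the default integer index, where labels and positions happen to coincide (the names `by_position` and `by_label` are illustrative only):

```python
import pandas as pd

employees = [
    ("Stuti", 28, "Varanasi", 20000), ("Saumya", 32, "Delhi", 25000),
    ("Aaditya", 25, "Mumbai", 40000), ("Saumya", 32, "Delhi", 35000),
    ("Saumya", 32, "Delhi", 30000), ("Saumya", 32, "Mumbai", 20000),
    ("Aaditya", 40, "Dehradun", 24000), ("Seema", 32, "Delhi", 70000),
]
df = pd.DataFrame(employees, columns=["Name", "Age", "City", "Salary"])

# Positions 1, 2, 3 and columns 0, 1: the end of each slice is excluded.
by_position = df.iloc[1:4, 0:2]

# Labels 1 through 4 and columns Name through Age: both ends are included.
by_label = df.loc[1:4, "Name":"Age"]

print(by_position.shape)  # (3, 2)
print(by_label.shape)     # (4, 2)
```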
Last Updated: 18 Dec, 2023