Slicing Pandas Dataframe

Last Updated : 12 Feb, 2024

Slicing a Pandas DataFrame extracts a subset of its rows and columns, either by integer position or by label. This article walks through examples of row and column slicing, single-cell selection, and filtering with boolean conditions.

Slicing Pandas Dataframe

Pandas lets us slice a DataFrame to extract specific subsets of data. The iloc[] indexer locates and extracts rows and columns based on their integer positions, counted from 0.

To perform slicing with iloc[], you specify the row and column positions you want to include in the sliced DataFrame. The syntax follows standard Python slicing, so the start position is included and the stop position is excluded. For example, df.iloc[1:5, 2:4] extracts the rows at positions 1 to 4 and the columns at positions 2 and 3, that is, the 2nd to 5th rows and the 3rd and 4th columns.
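
The half-open behaviour of df.iloc[1:5, 2:4] can be checked with a small sketch. The frame below is hypothetical (it is not the article's data set); each value encodes its own position as row * 10 + column so the selected block is easy to read.

```python
import pandas as pd

# Hypothetical 6x5 frame: each value is row_position * 10 + column_position
demo = pd.DataFrame(
    [[r * 10 + c for c in range(5)] for r in range(6)],
    columns=["c0", "c1", "c2", "c3", "c4"],
)

# Rows at positions 1..4 and columns at positions 2..3 (stop positions excluded)
block = demo.iloc[1:5, 2:4]
print(block)
```

The result has four rows and two columns, which matches the "start included, stop excluded" rule of ordinary Python slices.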

Slicing a DataFrame in Pandas includes the following steps:

Create a DataFrame
Slice the DataFrame

Let’s import the pandas library and create a DataFrame from a nested list.

Python3

import pandas as pd
 
# Initializing the nested list with Data set
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
 
# creating a pandas dataframe
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])
df # data frame before slicing

Output:

            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
2        V.Kohli   31      70  8428000
3        S.Smith   34      80  4428000
4        C.Gayle   40     100  4528000
5         J.Root   33      72  7028000
6     K.Peterson   42      85  2528000

1. Slicing Using iloc[]

A. Slicing Rows in Dataframe in Python

Python3

# Slicing rows in data frame
df1 = df.iloc[0:4]
 
# data frame after slicing
df1

Output:

            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
2        V.Kohli   31      70  8428000
3        S.Smith   34      80  4428000

In the above example, df.iloc[0:4] returns the rows at positions 0 through 3; the stop position 4 is excluded.
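
Row slices accept the same forms as Python list slices. The sketch below rebuilds the article's data set and shows a negative start, a step, and an explicit list of positions; the variable names are illustrative only.

```python
import pandas as pd

# Same player data as in the article
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])

last_two = df.iloc[-2:]       # negative start: the last two rows
every_other = df.iloc[::2]    # step of 2: positions 0, 2, 4, 6
picked = df.iloc[[0, 3, 6]]   # explicit list of row positions

print(last_two)
print(every_other)
print(picked)
```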

B. Slicing Columns in Dataframe in Python

Python3

# Slicing columns in data frame
df1 = df.iloc[:, 0:2]
 
# data frame after slicing
df1

Output:

            Name  Age
0      M.S.Dhoni   36
1  A.B.D Villers   38
2        V.Kohli   31
3        S.Smith   34
4        C.Gayle   40
5         J.Root   33
6     K.Peterson   42

In the above example, df.iloc[:, 0:2] keeps every row and returns the columns at positions 0 and 1 (Name and Age).
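
Columns can also be picked by a list of positions rather than a contiguous range. The sketch below, using the article's data set and illustrative variable names, also shows that a single integer returns a Series while a one-element slice returns a DataFrame.

```python
import pandas as pd

# Same player data as in the article
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])

name_salary = df.iloc[:, [0, 3]]   # non-adjacent columns by position
last_as_series = df.iloc[:, -1]    # single integer -> Series
last_as_frame = df.iloc[:, -1:]    # slice -> DataFrame with one column

print(name_salary)
print(type(last_as_series).__name__, type(last_as_frame).__name__)
```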

C. Selecting a Specific Cell in Dataframe in Python

Python3

specific_cell_value = df.iloc[2, 3]  # Row position 2, column position 3 (Salary); both are zero-based
print("Specific Cell Value:", specific_cell_value)

Output:

Specific Cell Value: 8428000
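
Row and column positions can be combined to pull out a small block instead of a single cell. As a sketch on the article's data set, the example below also uses the iat[] accessor, which pandas provides for scalar lookups by position; the variable names are illustrative.

```python
import pandas as pd

# Same player data as in the article
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])

# A 2x2 block: rows at positions 1..2, columns at positions 1..2 (Age, Weight)
sub = df.iloc[1:3, 1:3]

# Scalar access by position; returns the same value as df.iloc[2, 3]
salary = df.iat[2, 3]

print(sub)
print(salary)
```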

D. Using Boolean Conditions in Dataframe in Python

Python3

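# Boolean mask: keep the rows where Age is greater than 35.
# The trailing .iloc[:, :] keeps every row and column of the filtered result, so it is optional.
filtered_data = df[df['Age'] > 35].iloc[:, :]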
print("\nFiltered Data based on Age > 35:\n", filtered_data)

Output:

Filtered Data based on Age > 35:
            Name  Age  Weight   Salary
0      M.S.Dhoni   36      75  5428000
1  A.B.D Villers   38      74  3428000
4        C.Gayle   40     100  4528000
6     K.Peterson   42      85  2528000
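
Conditions can be combined with & (and) or | (or), with each condition wrapped in parentheses. The sketch below uses the article's data set; the variable names are illustrative. It also shows one way to pass a boolean mask to iloc[], which is assumed here to require converting the mask to a NumPy array first.

```python
import pandas as pd

# Same player data as in the article
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])

# Two conditions joined with &; each one needs its own parentheses
both = df[(df['Age'] > 35) & (df['Salary'] > 4000000)]

# Filter rows and choose columns in one step, by label ...
names_ages = df.loc[df['Age'] > 35, ['Name', 'Age']]

# ... or by position, passing the mask as a plain NumPy boolean array
by_position = df.iloc[(df['Age'] > 35).to_numpy(), [0, 1]]

print(both)
print(names_ages)
```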

2. Slicing Using loc[]

We can also slice with loc[], which selects by label rather than by position. A few caveats apply:

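loc relies on index labels, so you need to specify them exactly as they appear in the index; looking up a label that is not in the index raises a KeyError.
If the labels are integers, it is easy to confuse integer positions with the actual labels: df.loc[0:3] selects by label and includes the label 3, whereas df.iloc[0:3] selects by position and excludes position 3.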
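
The integer-label caveat can be illustrated with a small sketch. The frame below is hypothetical; its integer labels (10, 20, 30, 40) are deliberately different from the positions (0, 1, 2, 3).

```python
import pandas as pd

# Hypothetical frame whose integer labels do not match the row positions
frame = pd.DataFrame({"value": ["a", "b", "c", "d"]}, index=[10, 20, 30, 40])

by_label = frame.loc[10:30]     # labels 10, 20, 30 (stop label is included)
by_position = frame.iloc[0:2]   # positions 0 and 1 (stop position is excluded)

print(by_label)
print(by_position)
```

The same pair of numbers in a slice therefore means different things to loc[] and iloc[], which is why it helps to state explicitly which indexer is intended.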

To slice by label, we first set the Name column as the index with the following code:

Python3

df_custom = df.set_index('Name')
df_custom

Output:

               Age  Weight   Salary
Name
M.S.Dhoni       36      75  5428000
A.B.D Villers   38      74  3428000
V.Kohli         31      70  8428000
S.Smith         34      80  4428000
C.Gayle         40     100  4528000
J.Root          33      72  7028000
K.Peterson      42      85  2528000

A. Slicing Rows in Dataframe in Python

Python3

# Label-based slice: both endpoints ('A.B.D Villers' and 'S.Smith') are included
sliced_rows_custom = df_custom.loc['A.B.D Villers':'S.Smith']
sliced_rows_custom

Output:

               Age  Weight   Salary
Name
A.B.D Villers   38      74  3428000
V.Kohli         31      70  8428000
S.Smith         34      80  4428000
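
Label slices work on columns as well, and they include both endpoints. As a sketch on the article's data set (variable names are illustrative), the example below slices rows and columns by label and compares the result with the equivalent positional slice.

```python
import pandas as pd

# Same player data as in the article, indexed by player name
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])
df_custom = df.set_index('Name')

# Rows from 'A.B.D Villers' to 'S.Smith' and columns from 'Age' to 'Weight',
# with both end labels included
rows_and_cols = df_custom.loc['A.B.D Villers':'S.Smith', 'Age':'Weight']

# The same block by position needs stop values one past the last row and column
same_by_position = df_custom.iloc[1:4, 0:2]

print(rows_and_cols)
```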

B. Selecting a Specific Cell in Dataframe in Python

Python3

specific_cell_value = df_custom.loc['V.Kohli', 'Salary']
print("\nValue of the Specific Cell (V.Kohli, Salary):", specific_cell_value)

Output:

Value of the Specific Cell (V.Kohli, Salary): 8428000
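
For a single value, pandas also offers the at[] accessor, and loc[] accepts lists of labels when several cells are needed at once. The sketch below uses the article's data set; the variable names are illustrative.

```python
import pandas as pd

# Same player data as in the article, indexed by player name
player_list = [['M.S.Dhoni', 36, 75, 5428000],
               ['A.B.D Villers', 38, 74, 3428000],
               ['V.Kohli', 31, 70, 8428000],
               ['S.Smith', 34, 80, 4428000],
               ['C.Gayle', 40, 100, 4528000],
               ['J.Root', 33, 72, 7028000],
               ['K.Peterson', 42, 85, 2528000]]
df = pd.DataFrame(player_list, columns=['Name', 'Age', 'Weight', 'Salary'])
df_custom = df.set_index('Name')

# Scalar lookup by row label and column label
kohli_salary = df_custom.at['V.Kohli', 'Salary']

# Several rows and columns at once, given as lists of labels
pair = df_custom.loc[['V.Kohli', 'J.Root'], ['Age', 'Salary']]

print(kohli_salary)
print(pair)
```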

Conclusion

In summary, both iloc[] and loc[] provide versatile slicing capabilities in Pandas. iloc[] selects by integer position and excludes the stop position, while loc[] selects by label and includes the stop label, so it requires care when the index uses custom or integer labels.