Pandas provides a rich collection of functions for data analysis in Python. During analysis, we often need to filter the data to remove unnecessary rows or columns.
We have already discussed how to drop rows or columns based on their labels. In this post, we discuss several approaches to dropping rows from a dataframe based on a condition applied to a column: retain all those rows for which the applied condition on the given column evaluates to True.
To download the CSV used in code, click here.
You are given the “nba.csv” dataset. Drop all the players from the dataset whose age is below 25 years.
Solution #1 : We will use a vectorized comparison to keep only those rows of the dataset that satisfy the applied condition.
The dataframe currently has 458 rows and 9 columns. Let’s use a vectorized operation to select the rows whose ‘Age’ value is greater than or equal to 25, which leaves out every player below 25 years.
As we can see in the output, the returned dataframe only contains those players whose age is greater than or equal to 25 years.
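A minimal sketch of this approach is given below. It does not read “nba.csv”; instead it builds a small hypothetical dataframe with made-up player names and only the ‘Name’ and ‘Age’ columns, so the row counts will differ from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv: names and ages are made up for illustration.
df = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Age": [22.0, 25.0, 31.0, 24.0],
})

# The comparison is applied to the whole column at once and yields a boolean Series.
mask = df["Age"] >= 25

# Indexing with the mask keeps only the rows where the mask is True.
filtered = df[mask]

print(filtered)
```

The selected rows keep their original index labels; calling `filtered.reset_index(drop=True)` renumbers them if a fresh index is preferred.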
Solution #2 : We can use the DataFrame.drop() function to drop those rows which do not satisfy the given condition.
As we can see in the output, we have successfully dropped all those rows which do not satisfy the given condition applied to the ‘Age’ column.
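The DataFrame.drop() approach can be sketched as follows, again on a small hypothetical dataframe rather than the real “nba.csv”. One row is given a missing age on purpose, as an assumption for illustration, to show how this approach treats missing values.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv: names and ages are made up for illustration.
df = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Age": [22.0, 25.0, 31.0, float("nan")],
})

# Collect the index labels of the rows that violate the condition (age below 25).
labels_to_drop = df[df["Age"] < 25].index

# drop() returns a new dataframe without those labels; df itself is unchanged.
result = df.drop(labels_to_drop)

print(result)
```

Note that a comparison against a missing value evaluates to False, so the row with a missing age is not dropped here, whereas selecting with `df["Age"] >= 25` would leave it out. If the two solutions should agree, missing ages need to be handled explicitly, for example with `dropna(subset=["Age"])`.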