Let’s discuss how to randomly select rows from a Pandas DataFrame. A random selection of rows can be made in several ways.
Create a simple DataFrame from a dictionary of lists.
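A minimal sketch of this setup step. The column names and values are made up for illustration and are not taken from the original article.

```python
import pandas as pd

# Illustrative data: each dictionary key becomes a column,
# each list holds that column's values.
data = {
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
    "City": ["Delhi", "Kanpur", "Allahabad", "Kannauj", "Noida"],
}

df = pd.DataFrame(data)
print(df)
```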
Method #1: Using sample() method
The sample() method returns a random sample of items from an axis of the object. The result has the same type as the caller, so sampling a DataFrame returns a DataFrame.
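As a rough illustration, calling sample() with no arguments returns a single randomly chosen row. The DataFrame here is a made-up example.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# With no arguments, sample() picks one row at random.
one_row = df.sample()
print(one_row)
print(type(one_row))  # still a DataFrame, same type as the caller
```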
Example 2: Using parameter n, which selects n rows randomly.
Select n rows at random using
sample(n=n). Each run draws a new random selection, so the n rows returned usually differ from one run to the next.
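A short sketch of sampling a fixed number of rows, using an illustrative DataFrame and n=3 as an arbitrary choice.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# Pick 3 distinct rows at random; the selection can change on every run.
three_rows = df.sample(n=3)
print(three_rows)
```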
Example 3: Using the frac parameter
The frac parameter selects a fraction of the axis items instead of a fixed count. For example, if
frac=0.5 then the sample method returns 50% of the rows.
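To make the fraction concrete, here is a small sketch on a made-up six-row DataFrame, where 50% works out to exactly three rows.

```python
import pandas as pd

# Hypothetical example data with 6 rows.
df = pd.DataFrame({"value": [10, 20, 30, 40, 50, 60]})

# frac=0.5 asks for half of the rows, chosen at random.
half = df.sample(frac=0.5)
print(half)
print(len(half), "of", len(df), "rows selected")
```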
First select 70% of the rows of the whole df DataFrame and store them in another DataFrame, df1. Then select 50% of the rows of df1 using frac.
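The two-stage selection can be sketched as follows. The 20-row DataFrame and the name df2 are assumptions chosen so that both fractions come out as whole numbers.

```python
import pandas as pd

# Hypothetical example data with 20 rows.
df = pd.DataFrame({"value": range(20)})

# Stage 1: keep a random 70% of df.
df1 = df.sample(frac=0.7)

# Stage 2: keep a random 50% of what survived stage 1.
df2 = df1.sample(frac=0.5)

print(len(df), len(df1), len(df2))
```

Every row in df2 also appears in df1, since the second sample is drawn only from the first.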
Example 5: Select some rows randomly with
replace=False
The replace parameter controls whether the same row may be selected more than once. The default value of the replace parameter of the
sample() method is False, so each row is selected at most once and you can never select more rows than the DataFrame contains.
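A small sketch of both consequences of replace=False, on made-up data: sampling every row yields a shuffle, and asking for one row too many raises a ValueError.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"]})

# Without replacement, requesting every row returns them in shuffled order.
shuffled = df.sample(n=len(df), replace=False)
print(shuffled)

# Requesting more rows than exist fails when replace=False.
try:
    df.sample(n=len(df) + 1, replace=False)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```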
Example 6: Select more than n rows, where n is the total number of rows, with the help of replace=True
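One way to draw more rows than the DataFrame holds is to allow replacement, so a row can be picked repeatedly. The sketch below assumes a made-up five-row DataFrame and an arbitrary sample size of 8.

```python
import pandas as pd

# Hypothetical example data with 5 rows.
df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"]})

# With replace=True a row can be drawn more than once,
# so the sample may be larger than the DataFrame itself.
oversampled = df.sample(n=8, replace=True)
print(oversampled)
```

Because 8 draws come from only 5 rows, at least one row is guaranteed to appear more than once.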
Example 7: Using weights
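A hedged sketch of weighted sampling: the weights argument gives each row a relative probability of being chosen. The weight values below are invented for illustration, with zeros used to show that such rows are never selected.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# One weight per row; rows with weight 0 can never be picked.
row_weights = [0, 0, 0, 0.5, 0.5]

picked = df.sample(n=2, weights=row_weights)
print(picked)
```

With only two rows carrying non-zero weight and n=2 without replacement, the result always consists of the last two rows, in either order.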
Example 8: Using axis
The axis parameter accepts a number or a name: 0 or 'index' for rows, 1 or 'columns' for columns.
The sample() method also allows users to sample columns instead of rows using the axis argument.
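A minimal sketch of column sampling on a made-up DataFrame, showing both the numeric and the named form of the axis argument.

```python
import pandas as pd

# Hypothetical example data with 3 columns.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
    "City": ["Delhi", "Kanpur", "Allahabad", "Kannauj", "Noida"],
})

# axis=1 samples columns; all rows are kept.
two_cols = df.sample(n=2, axis=1)
print(two_cols)

# The same request using the axis name instead of the number.
one_col = df.sample(n=1, axis="columns")
print(one_col)
```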
Example 9: Using random_state
With a given DataFrame and a fixed random_state, the sample will always fetch the same rows. If random_state is left as None (the default),
no fixed seed is applied, so the selected rows can change from run to run.
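The reproducibility can be checked with a short sketch. The seed value 42 and the DataFrame are arbitrary choices for illustration.

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# The same seed on the same DataFrame yields the same rows every time.
first = df.sample(n=3, random_state=42)
second = df.sample(n=3, random_state=42)

print(first)
print(first.equals(second))
```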
Method #2: Using NumPy
With NumPy, we choose how many row indices to include in the random selection, and we can optionally allow replacement.
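One possible way to do this is with np.random.choice, which picks row positions that are then passed to iloc. This is a sketch on made-up data, not necessarily the exact code the article used. The names chosen_idx and sampled are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj", "Geeku"],
    "Age": [27, 24, 22, 32, 15],
})

# Draw 3 row positions from 0..len(df)-1; replace=False keeps them distinct.
chosen_idx = np.random.choice(len(df), size=3, replace=False)

# Use the positions to pull the matching rows.
sampled = df.iloc[chosen_idx]
print(sampled)
```

Setting replace=True instead would allow the same position to be drawn more than once, and size could then exceed the number of rows.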