Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
The sample() method returns a random sample of rows or columns from the data frame it is called on.
Syntax: DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None)
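n: int value, number of random rows (or columns, depending on axis) to return. Defaults to 1 when frac is not given. n cannot be used with frac.
frac: Float value, fraction of the data frame to return (float value * length of data frame). frac cannot be used with n.
replace: Boolean value, samples with replacement if True, so the same row can be picked more than once.
weights: str or array-like, optional. Sampling weights; the default of None gives every row an equal probability.
random_state: int value or numpy.random.RandomState, optional. If set to a particular integer, the same rows are returned on every call.
axis: 0 or 'index' for rows and 1 or 'columns' for columns.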
Return type: New object of same type as caller.
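The parameters above can be tried out on a small data frame. The following is a minimal sketch, assuming a hypothetical toy data frame (the column names and values are invented for illustration and are not the CSV file used in this article). It shows how random_state makes a sample repeatable, how replace=True allows a sample larger than the data frame, and how axis=1 samples columns instead of rows.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee"],
    "team": ["A", "B", "A", "B"],
    "age": [25, 31, 28, 22],
})

# The same integer seed returns the same rows on every call
first = df.sample(n=2, random_state=42)
second = df.sample(n=2, random_state=42)
print(first.equals(second))  # True

# replace=True allows more rows than the data frame contains
oversampled = df.sample(n=6, replace=True, random_state=0)
print(len(oversampled))  # 6

# axis=1 samples columns instead of rows
one_column = df.sample(n=1, axis=1, random_state=0)
print(one_column.shape)  # (4, 1)
```

Without replace=True, asking for more rows than the data frame holds raises an error, so replacement is the usual choice when oversampling.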
Example #1: Random row from Data frame
In this example, two random rows are generated by the .sample() method and compared later.
Because each call draws its sample independently, the two sampled rows are usually different from each other, although they can occasionally coincide.
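Example #1 can be sketched roughly as follows. This is a minimal illustration that assumes a hypothetical in-memory data frame in place of the article's CSV file; the column names and values are invented. No expected output is shown because the result changes from run to run.

```python
import pandas as pd

# Hypothetical stand-in for the article's CSV data
data = pd.DataFrame({
    "name": ["Ann", "Bob", "Cid", "Dee", "Eli", "Fay"],
    "team": ["A", "B", "A", "B", "C", "C"],
    "age": [25, 31, 28, 22, 35, 27],
})

# Draw one random row twice; each call is an independent draw
row1 = data.sample(n=1)
row2 = data.sample(n=1)

print(row1)
print(row2)

# Compare the index labels to see whether the same row was picked
same_row = row1.index[0] == row2.index[0]
print("Same row picked twice:", same_row)
```

Passing the same random_state to both calls would make the two rows identical, which is useful when a result needs to be reproduced.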
Example #2: Generating 25% sample of data frame
In this example, a random sample containing 25% of the rows is drawn from the data frame by passing frac=0.25.
The length of the sample is 25% of the length of the data frame (rounded to a whole number of rows), and the rows it contains are chosen at random.
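Example #2 can be sketched in the same way. This is a minimal illustration that assumes a hypothetical eight-row data frame (invented values, not the article's CSV file), so a 25% sample contains two rows. It also shows that passing n and frac together is rejected.

```python
import pandas as pd

# Hypothetical eight-row data frame, made up for illustration
data = pd.DataFrame({
    "id": list(range(1, 9)),
    "score": [88, 92, 75, 64, 97, 81, 70, 59],
})

# frac=0.25 returns 25% of the rows, chosen at random
quarter = data.sample(frac=0.25)

print(len(data))     # 8
print(len(quarter))  # 2
print(quarter)

# n and frac cannot be combined; pandas raises a ValueError
try:
    data.sample(n=1, frac=0.25)
    combined_rejected = False
except ValueError:
    combined_rejected = True
print("n and frac together rejected:", combined_rejected)
```

Which two rows end up in the sample varies between runs unless random_state is set.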
Improved By : pacificlion