There are several ways to split a Pandas dataframe, depending on the result we want. Let’s take a dataset of diamonds as an example.
Method 1: Splitting Pandas Dataframe by row index
Here the dataframe is divided into two parts: the first 1000 rows and the remaining rows. Printing the shape of each newly formed dataframe confirms the split.
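A minimal sketch of the row-index split, assuming positional slicing with `iloc`. The dataframe here is a small synthetic stand-in for the diamonds dataset (the column names and the 5000-row size are assumptions for illustration), so the second shape differs from what the real dataset would give.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the diamonds dataset (synthetic values, 5000 rows).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "carat": rng.uniform(0.2, 3.0, size=5000).round(2),
    "color": rng.choice(list("DEFGHIJ"), size=5000),
    "price": rng.integers(300, 19000, size=5000),
})

# First 1000 rows (positions 0..999) and everything after them.
df_1 = df.iloc[:1000, :]
df_2 = df.iloc[1000:, :]

print("Shape of first dataframe:", df_1.shape)   # (1000, 3)
print("Shape of second dataframe:", df_2.shape)  # (4000, 3)
```

`iloc` slices by position, so the split works the same way whatever the index labels are.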
Method 2: Splitting Pandas Dataframe by groups formed from unique column values
Here, we first group the data by the values of the column “color”. The newly formed dataframe consists of the grouped data with color = “E”.
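The grouping step can be sketched with `groupby` and `get_group`. The tiny dataframe below is a hypothetical sample with made-up values, used only so the example runs on its own.

```python
import pandas as pd

# Hypothetical sample with a "color" column (values are illustrative only).
df = pd.DataFrame({
    "carat": [0.23, 0.21, 0.29, 0.31, 0.24, 0.26],
    "color": ["E", "E", "I", "J", "E", "H"],
    "price": [326, 326, 334, 335, 336, 337],
})

# Group rows by the unique values of "color".
grouped = df.groupby("color")

# Pull out the rows belonging to one group.
df_e = grouped.get_group("E")
print(df_e)

# Optionally, split into one dataframe per color.
parts = {color: part for color, part in grouped}
print(sorted(parts))  # ['E', 'H', 'I', 'J']
```

If only one group is needed, a boolean mask such as `df[df["color"] == "E"]` selects the same rows without building the groups.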
Method 3: Splitting Pandas Dataframe in predetermined sized chunks
Here a new dataset is formed with a fraction of 0.6, i.e. 60% of the total rows (the length of the dataset), which gives 32364 rows. These rows are selected randomly.
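One way to take a random 60% of the rows is `DataFrame.sample` with `frac=0.6`; this is a sketch under that assumption, not necessarily the exact call used for the figures above. The synthetic dataframe has 53940 rows, a size chosen so that 60% works out to the 32364 rows mentioned in the text, and `random_state` is set only to make the run repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the diamonds dataset: 53940 synthetic rows,
# chosen so that 60% of the rows equals 32364.
n_rows = 53940
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "carat": rng.uniform(0.2, 3.0, size=n_rows).round(2),
    "price": rng.integers(300, 19000, size=n_rows),
})

# Randomly select 60% of the rows (without replacement).
df_60 = df.sample(frac=0.6, random_state=42)

# The remaining 40% are the rows whose index labels were not sampled.
df_rest = df.drop(df_60.index)

print("Rows in the 60% part:", len(df_60))    # 32364
print("Rows in the remainder:", len(df_rest))  # 21576
```

Dropping the sampled index labels only gives a clean remainder when the index is unique, which holds for the default `RangeIndex` used here.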