Split large Pandas Dataframe into list of smaller Dataframes

Last Updated : 05 Sep, 2020

In this article, we will learn how to split a large DataFrame into a list of smaller DataFrames. There are two main ways to do this:

  1. By splitting each row
  2. Using the concept of groupby

We use a small DataFrame here so the concept is easy to follow; the same approach applies to larger ones. The DataFrame holds each student's ID, name, marks, and grade. Let's create it.

Python3




# importing packages
import pandas as pd
  
# dictionary of data
dct = {'ID': {0: 23, 1: 43, 2: 12,
              3: 13, 4: 67, 5: 89,
              6: 90, 7: 56, 8: 34},
         
       'Name': {0: 'Ram', 1: 'Deep',
                2: 'Yash', 3: 'Aman',
                4: 'Arjun', 5: 'Aditya',
                6: 'Divya', 7: 'Chalsea',
                8: 'Akash'},
         
       'Marks': {0: 89, 1: 97, 2: 45, 3: 78,
                 4: 56, 5: 76, 6: 100, 7: 87,
                 8: 81},
         
       'Grade': {0: 'B', 1: 'A', 2: 'F', 3: 'C',
                 4: 'E', 5: 'C', 6: 'A', 7: 'B',
                 8: 'B'}
       }
  
# create dataframe
df = pd.DataFrame(dct)
  
# view dataframe (print() also works outside a notebook)
print(df)


Output: a DataFrame with 9 rows (index 0 to 8) and the columns ID, Name, Marks, and Grade.

The examples below implement both approaches:

Example 1: By splitting each row

Here, we loop over the index and select each row with DataFrame.loc[]. Passing the label inside a list, as in df.loc[[i]], returns a one-row DataFrame rather than a Series. Each one-row DataFrame is stored in a list, which is the required output. Since the dataset has 9 rows, the list contains 9 smaller DataFrames.

Python3




# split dataframe by row
splits = [df.loc[[i]] for i in df.index]
  
# view the split dataframes
print(splits)
  
# check datatype of smaller dataframe
print(type(splits[0]))
  
# view smaller dataframe
print(splits[0])


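Output: print(splits) shows a list of 9 one-row DataFrames, type(splits[0]) is <class 'pandas.core.frame.DataFrame'>, and splits[0] is the row for Ram (index 0).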
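The per-row split is the special case of cutting the DataFrame into chunks of one row each. As a minimal sketch of the same idea with larger chunks, the snippet below slices by position with DataFrame.iloc[]. The variable chunk_size and its value of 4 are assumptions chosen for illustration, and only two of the columns are reproduced to keep the block short.

```python
import pandas as pd

# reduced copy of the student data (ID and Marks only)
df = pd.DataFrame({
    "ID": [23, 43, 12, 13, 67, 89, 90, 56, 34],
    "Marks": [89, 97, 45, 78, 56, 76, 100, 87, 81],
})

# hypothetical chunk size; 1 would reproduce the per-row split
chunk_size = 4

# slice consecutive blocks of rows by position
chunks = [df.iloc[start:start + chunk_size]
          for start in range(0, len(df), chunk_size)]

# number of rows in each chunk; the last one holds the remainder
print([len(chunk) for chunk in chunks])

# concatenating the chunks gives back the original dataframe
print(pd.concat(chunks).equals(df))
```

Slicing with iloc keeps the original index labels in every chunk, so the pieces can be concatenated back in order without extra bookkeeping.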

 

Example 2: Using Groupby

Here, we use the DataFrame.groupby() method to split the dataset into groups of rows that share the same value in a column. Converting the groupby object to a list yields one (group key, DataFrame) tuple per group, so each smaller DataFrame is the second element of its tuple. In this example, the dataset (9 rows) is grouped on the column “Grade”. There are 5 distinct grades, so the list contains 5 tuples, each holding one smaller DataFrame.

Python3




# split dataframe using groupby
# each element is a (grade, DataFrame) tuple
splits = list(df.groupby("Grade"))
  
# view the split dataframes
print(splits)
  
# check datatype of smaller dataframe
print(type(splits[0][1]))
  
# view smaller dataframe
print(splits[0][1])


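Output: print(splits) shows a list of 5 (grade, DataFrame) tuples, ordered by grade: A, B, C, E, F. type(splits[0][1]) is <class 'pandas.core.frame.DataFrame'>, and splits[0][1] holds the two rows with grade A (Deep and Divya).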
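Because list(df.groupby(...)) returns (key, DataFrame) tuples, a list of plain DataFrames needs one more step. The sketch below shows one way to do it, by unpacking each tuple in a comprehension. The names frames and by_grade are chosen here for illustration, and only the Name and Grade columns are reproduced.

```python
import pandas as pd

# reduced copy of the student data (Name and Grade only)
df = pd.DataFrame({
    "Name": ["Ram", "Deep", "Yash", "Aman", "Arjun",
             "Aditya", "Divya", "Chalsea", "Akash"],
    "Grade": ["B", "A", "F", "C", "E", "C", "A", "B", "B"],
})

# drop the group key and keep only the DataFrame of each group
frames = [group for _, group in df.groupby("Grade")]

# alternatively, keep the key by building a dict keyed by grade
by_grade = {grade: group for grade, group in df.groupby("Grade")}

print(len(frames))
print(by_grade["A"])
```

The dict form is convenient when a specific group is looked up by its grade, while the list form matches the list of smaller DataFrames described in this article.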


