How to get a cartesian product of a huge Dataset using Pandas in Python?

  • Last Updated : 21 Apr, 2022

In this article, we will discuss how to compute the cartesian product of a huge dataset. The cartesian product pairs every row of one DataFrame with every row of another, so two datasets with m and n rows produce m × n rows. We use the Pandas merge function, which is the entry point for all standard database join operations between DataFrame objects.


Syntax:

data1 = pd.DataFrame({'dataset_name_1': [dataset_1]})

data2 = pd.DataFrame({'dataset_name_2': [dataset_2]})

data3 = pd.merge(data1.assign(key=1), data2.assign(key=1), on='key').drop('key', axis=1)


  • dataset_name_1, dataset_name_2: The column names under which the two datasets are stored.
  • dataset_1, dataset_2: The values of the datasets whose cartesian product is to be computed.
  • data1: The first DataFrame object.
  • data2: The second DataFrame object.
  • key: A temporary column that assign sets to 1 in every row of both DataFrames, so that every row of data1 matches every row of data2. It is dropped after the merge.
  • on: The column name(s) to join on; here, the temporary key column.
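The syntax above can be made concrete with a minimal sketch. The helper name cartesian_product and the color/size sample data are hypothetical, chosen only for illustration; the sketch also assumes that neither input already has a column named key.

```python
import pandas as pd


def cartesian_product(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Pair every row of `left` with every row of `right` (hypothetical helper)."""
    # assign() returns copies, so the input DataFrames are not modified.
    # Every row gets key=1, so the join on 'key' matches all row pairs.
    merged = pd.merge(left.assign(key=1), right.assign(key=1), on="key")
    # The temporary join column is no longer needed after the merge.
    return merged.drop("key", axis=1)


# Illustrative sample data (not from the article)
colors = pd.DataFrame({"color": ["red", "green"]})
sizes = pd.DataFrame({"size": ["S", "M", "L"]})

pairs = cartesian_product(colors, sizes)
print(pairs.shape)  # 2 rows x 3 rows -> (6, 2)
```

The row count of the result is always the product of the two input row counts, which is why this operation grows quickly on large inputs.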

Stepwise Implementation:

Step 1: First of all, import the library Pandas.

import pandas as pd

Step 2: Then, obtain the datasets on which you want to perform a cartesian product.

data1 = pd.DataFrame({'column_name': [dataset_1]})
data2 = pd.DataFrame({'column_name': [dataset_2]})

Step 3: Further, use a merge function to perform the cartesian product on the datasets obtained.

data3 = pd.merge(data1.assign(key=1), data2.assign(key=1),
                 on='key').drop('key', axis=1)
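As a side note, pandas 1.2 and later also accept how="cross" in merge, which produces the cartesian product without a temporary key column. The sketch below compares the two approaches on the sample data used later in this article; it assumes pandas 1.2 or newer is installed.

```python
import pandas as pd

data1 = pd.DataFrame({"P": [1, 3, 5]})
data2 = pd.DataFrame({"Q": [2, 4, 6]})

# Approach from this article: join on a constant temporary key
via_key = pd.merge(data1.assign(key=1), data2.assign(key=1),
                   on="key").drop("key", axis=1)

# Built-in cross join (assumes pandas >= 1.2)
via_cross = data1.merge(data2, how="cross")

# Both results contain the same 9 (P, Q) pairs
same_pairs = sorted(map(tuple, via_key.values.tolist())) == \
    sorted(map(tuple, via_cross.values.tolist()))
print(same_pairs)
```

The key-column approach remains useful on older pandas versions, where how="cross" is not available.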

Step 4: Finally, print the cartesian product obtained.

print(data3)

# Python program to get the Cartesian
# product of a huge dataset

# Import the library Pandas
import pandas as pd

# Obtaining the dataset 1
data1 = pd.DataFrame({'P': [1, 3, 5]})

# Obtaining the dataset 2
data2 = pd.DataFrame({'Q': [2, 4, 6]})

# Doing cartesian product of datasets 1 and 2:
# both frames get a constant 'key' column to join on,
# which is dropped again after the merge
data3 = pd.merge(data1.assign(key=1), data2.assign(key=1),
                 on='key').drop('key', axis=1)

# Printing the cartesian product of both datasets
print(data3)

Output:

   P  Q
0  1  2
1  1  4
2  1  6
3  3  2
4  3  4
5  3  6
6  5  2
7  5  4
8  5  6
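For genuinely huge inputs, the full product may not fit in memory, because the result has as many rows as the two row counts multiplied together. One possible workaround, sketched below, is to build the product in pieces and process each piece before producing the next. The generator name cartesian_chunks and the chunk_rows parameter are hypothetical and not part of Pandas.

```python
import pandas as pd


def cartesian_chunks(left, right, chunk_rows=1000):
    """Yield the cartesian product piece by piece (hypothetical helper).

    Each piece pairs at most `chunk_rows` rows of `left` with all rows of `right`.
    """
    for start in range(0, len(left), chunk_rows):
        part = left.iloc[start:start + chunk_rows]
        # Same constant-key merge as in the article, applied to one slice
        yield pd.merge(part.assign(key=1), right.assign(key=1),
                       on="key").drop("key", axis=1)


data1 = pd.DataFrame({"P": [1, 3, 5]})
data2 = pd.DataFrame({"Q": [2, 4, 6]})

total_rows = 0
for chunk in cartesian_chunks(data1, data2, chunk_rows=2):
    # Process or write out each piece here instead of keeping them all
    total_rows += len(chunk)

print(total_rows)  # 6 rows from the first piece + 3 from the second -> 9
```

Only one piece is held in memory at a time, so peak memory depends on chunk_rows times the number of rows in the right DataFrame rather than on the size of the full product.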