Different ways to create Pandas Dataframe

A pandas DataFrame is a two-dimensional labeled data structure, similar to a table with rows and columns. Both its size and its values are mutable, i.e., they can be modified after creation. It is the most commonly used pandas object. A DataFrame can be created in several ways, all of which go through the DataFrame() constructor. Let's discuss them one by one.

Pandas DataFrame() Syntax

Syntax: pandas.DataFrame(data, index, columns)

Parameters:

  • data: The data from which the DataFrame is built. It can be a list, dictionary, scalar value, Series, ndarray, etc.
  • index: Optional row labels. If omitted, the rows are labeled 0 to n-1, where n is the number of rows.
  • columns: Optional column labels. If omitted and the data carries no labels of its own, the columns are labeled 0 to n-1, where n is the number of columns.

Returns: DataFrame object
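
As a minimal sketch of how the three parameters interact, the example below builds a DataFrame from a small NumPy array twice: once with the default labels and once with explicit ones. The array values and the labels 'r1', 'r2', 'x', 'y', 'z' are made up for illustration.

```python
import numpy as np
import pandas as pd

# A 2 x 3 NumPy array used as the data argument (illustrative values)
values = np.array([[1, 2, 3], [4, 5, 6]])

# No index or columns given: rows and columns are both labeled 0..n-1
df_default = pd.DataFrame(values)

# Explicit row labels and column names (hypothetical labels)
df_labeled = pd.DataFrame(values, index=['r1', 'r2'], columns=['x', 'y', 'z'])

print(df_default)
print(df_labeled)
```

The data is identical in both cases; only the row and column labels differ.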

Create Pandas Dataframe in Python

There are several ways to create a DataFrame in pandas. Here are some of the most common methods:

  • Creating an empty DataFrame
  • Creating a DataFrame from a list of lists
  • Creating a DataFrame from a dictionary of NumPy arrays/lists
  • Creating a DataFrame from a list of dictionaries
  • Creating a DataFrame from a dictionary of pandas Series
  • Creating a DataFrame using the zip() function
  • Creating a DataFrame by providing index labels explicitly

Creating an Empty DataFrame

Calling pd.DataFrame() with no arguments creates an empty DataFrame. In this example it is stored in the variable df:

Python3




# Importing Pandas to create DataFrame
import pandas as pd
 
# Creating Empty DataFrame and Storing it in variable df
df = pd.DataFrame()
 
# Printing Empty DataFrame
print(df)


Output: 

Empty DataFrame
Columns: []
Index: []
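
An empty DataFrame can also carry column names before any rows exist. The sketch below assumes two hypothetical column names, 'Name' and 'Age', and checks the result with the empty and shape attributes.

```python
import pandas as pd

# Empty DataFrame with predefined column names (hypothetical names)
df = pd.DataFrame(columns=['Name', 'Age'])

print(df.empty)   # True: there are no rows
print(df.shape)   # (0, 2): zero rows, two columns
```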

Creating Pandas DataFrame from list of lists

To create a pandas DataFrame from a list of lists, pass it to the pd.DataFrame() function. Each inner list becomes one row of the DataFrame, and the position of a value inside its inner list determines its column. Column names can be supplied through the columns parameter.

Python3




# Import pandas library
import pandas as pd
 
# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]
 
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])
 
# print dataframe.
print(df)


Output: 

   Name  Age
0   tom   10
1  nick   15
2  juli   14

Create Pandas DataFrame from Dictionary of numpy array/List

To create a DataFrame from a dictionary of ndarrays or lists, all the arrays must have the same length. If an index is passed, its length must equal the length of the arrays. If no index is passed, the index defaults to range(n), where n is the array length.

Python3




# Python code demonstrating creation of a DataFrame
# from a dict of ndarrays/lists, using the default index.
 
import pandas as pd
 
# initialize data of lists.
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
 
# Create DataFrame
df = pd.DataFrame(data)
 
# Print the output.
print(df)


Output: 

    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18

Note: When a DataFrame is created from a dictionary, the dictionary keys become the column names by default. The columns parameter can additionally be used to select which keys become columns and in what order.
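
The following sketch illustrates how the columns parameter behaves with dictionary input. With a dictionary, columns does not rename anything: it picks keys and fixes their order. The 'City' name is a hypothetical label that is deliberately absent from the dictionary.

```python
import pandas as pd

data = {'Name': ['Tom', 'nick'], 'Age': [20, 21]}

# columns selects and orders the dictionary keys
df = pd.DataFrame(data, columns=['Age', 'Name'])
print(df)

# A name that is not a key ('City', hypothetical) yields a column of NaN
df_extra = pd.DataFrame(data, columns=['Name', 'City'])
print(df_extra)
```

To give columns new names instead, one option is to build the DataFrame first and then call its rename() method.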

Create pandas DataFrame from List of Dictionaries

A pandas DataFrame can be created by passing a list of dictionaries as the input data. Each dictionary becomes one row, and by default the dictionary keys are taken as column names.

Python3




# Python code demonstrating how to create a
# pandas DataFrame from a list of dicts.
import pandas as pd
 
# Initialize the data as a list of dicts.
data = [{'a': 1, 'b': 2, 'c': 3},
        {'a': 10, 'b': 20, 'c': 30}]
 
# Creates DataFrame.
df = pd.DataFrame(data)
 
# Print the data
print(df)


Output: 

    a   b   c
0   1   2   3
1  10  20  30

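The next example creates a pandas DataFrame from a list of dictionaries together with explicit row labels. A key that is missing from one dictionary (here 'a' in the first one) shows up as NaN in that row.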

Python3




# Python code demonstrating how to create a
# pandas DataFrame from a list of
# dictionaries with explicit row labels.
import pandas as pd
 
# Initialize the data as a list of dicts.
data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]
 
# Create the DataFrame, passing the
# list of dictionaries and the row index.
df = pd.DataFrame(data, index=['first', 'second'])
 
# Print the data
print(df)


Output: 

         b   c     a
first    2   3   NaN
second  20  30  10.0

Create pandas Dataframe from dictionary of Pandas Series

To create a DataFrame from a dictionary of Series, pass the dictionary to pd.DataFrame(). Each key becomes a column name, and the resulting index is the union of the indexes of all the Series passed.

Python3




# Python code demonstrating creation of a
# pandas DataFrame from a dict of Series.
 
import pandas as pd
 
# Initialize the data as a dict of Series.
d = {'one': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd']),
     'two': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd'])}
 
# creates Dataframe.
df = pd.DataFrame(d)
 
# print the data.
print(df)


Output: 

   one  two
a   10   10
b   20   20
c   30   30
d   40   40
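
In the example above both Series share the same index, so the union is not visible. The sketch below uses two Series with different indexes (illustrative values) to show how the union is formed and how missing entries are filled with NaN.

```python
import pandas as pd

# 'one' has no entry for label 'd'; 'two' covers all four labels
d = {'one': pd.Series([10, 20, 30], index=['a', 'b', 'c']),
     'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)

# The index is the union a, b, c, d; df.loc['d', 'one'] is NaN
print(df)
```

Because NaN is a floating-point value, the column 'one' is stored as floats rather than integers.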

Creating DataFrame using zip() function

Two lists can be combined element-wise with list(zip(...)), which yields a list of tuples. Passing that list to pd.DataFrame() creates the DataFrame, with each tuple becoming one row.

Python3




# Python program to demonstrate creating
# pandas Dataframe from lists using zip.
 
import pandas as pd
 
# List1
Name = ['tom', 'krish', 'nick', 'juli']
 
# List2
Age = [25, 30, 26, 22]
 
# Pair up the two lists element-wise with zip()
# to get a list of (name, age) tuples.
list_of_tuples = list(zip(Name, Age))
 
# Converting lists of tuples into
# pandas Dataframe.
df = pd.DataFrame(list_of_tuples,
                  columns=['Name', 'Age'])
 
# Print data.
print(df)


Output: 

    Name  Age
0    tom   25
1  krish   30
2   nick   26
3   juli   22
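
One caveat with this approach is that zip() stops at the shortest input, so lists of unequal length lose data silently. A minimal sketch, using made-up lists where one age is missing:

```python
import pandas as pd

names = ['tom', 'krish', 'nick']
ages = [25, 30]  # one value short (illustrative)

# zip() stops after the shorter list, so 'nick' is dropped
rows = list(zip(names, ages))
df = pd.DataFrame(rows, columns=['Name', 'Age'])

print(len(df))  # 2
```

If the lists are expected to match, comparing their lengths before zipping is a simple safeguard.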

Creating a DataFrame by providing index label explicitly

To create a DataFrame by providing the index label explicitly, you can use the index parameter of the pd.DataFrame() constructor. The index parameter takes a list of index labels as input, and the DataFrame will use these labels for the rows of the DataFrame.

Python3




# Python code demonstrating creation of a
# pandas DataFrame with an explicit index.
import pandas as pd
 
# initialize data of lists.
data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}
 
# Creates pandas DataFrame.
df = pd.DataFrame(data, index=['rank1',
                               'rank2',
                               'rank3',
                               'rank4'])
 
# print the data
print(df)


Output: 

       Name  marks
rank1   Tom     99
rank2  Jack     98
rank3  nick     95
rank4  juli     90
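
The number of index labels has to match the number of rows in the data. The sketch below passes too few labels on purpose and catches the resulting ValueError; the flag mismatch_raised is a helper name introduced only for this illustration.

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'nick'],
        'marks': [99, 98, 95]}

# Three rows of data but only two index labels
try:
    pd.DataFrame(data, index=['rank1', 'rank2'])
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('Could not build the DataFrame:', err)
```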

Last Updated : 05 Dec, 2023