Different ways to create Pandas Dataframe


A Pandas DataFrame is a two-dimensional labeled data structure, similar to a table with rows and columns. Both its size and its values are mutable: rows and columns can be added or removed, and values can be changed.

DataFrames are mostly used for data analysis and data manipulation. They store data in tabular form, much like a SQL table or a spreadsheet in MS Excel or Google Sheets, which makes it easier to perform arithmetic operations on the data.

The DataFrame is the most commonly used Pandas object. It is created with the DataFrame() constructor, which accepts input data in several forms.

Pandas DataFrame() Syntax

pandas.DataFrame(data=None, index=None, columns=None)

Parameters:

  • data: The dataset from which the DataFrame is created. It can be a list, dictionary, Series, NumPy array, or a scalar value (only when index and columns are also given), among others.
  • index: Optional. The row labels. If omitted, the rows are labeled 0 to n-1, where n is the number of rows.
  • columns: Optional. The column labels. If omitted and the data carries no column names of its own, the columns are labeled 0 to n-1, where n is the number of columns.

Returns: 

  • DataFrame object
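
As a minimal sketch of how the three parameters fit together (the values and labels below are made up for illustration):

```python
import pandas as pd

# Illustrative data: two rows, two columns
df = pd.DataFrame(
    data=[[1, 2], [3, 4]],
    index=['r1', 'r2'],      # row labels
    columns=['x', 'y'],      # column labels
)

print(df)
print(df.shape)
```

Here data supplies the values, index the row labels, and columns the column labels.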

Now that we have discussed the DataFrame() function, let’s look at different ways to create a DataFrame:

Different Ways to Create DataFrame in Python

There are several ways to create a Pandas DataFrame in Python. You can create a DataFrame with the following methods:

  • Create an empty DataFrame using the DataFrame() function
  • Create a DataFrame from a list of lists
  • Create a DataFrame from a dictionary of ndarrays/lists
  • Create a DataFrame from a list of dictionaries
  • Create a DataFrame from a dictionary of Series
  • Create a DataFrame using the zip() function
  • Create a DataFrame by providing the index label explicitly

Create an Empty DataFrame using DataFrame() Method

A DataFrame can be created with the DataFrame() constructor of the Pandas library. Calling the constructor with no arguments returns an empty DataFrame.

Example: Creating an empty DataFrame using the DataFrame() function in Python

Python3




# Importing Pandas to create DataFrame
import pandas as pd
 
# Creating Empty DataFrame and Storing it in variable df
df = pd.DataFrame()
 
# Printing Empty DataFrame
print(df)


Output: 

Empty DataFrame
Columns: []
Index: []
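
An empty DataFrame can also carry column names before any rows exist. The sketch below assumes two illustrative column names and uses the empty attribute to confirm that there are no rows:

```python
import pandas as pd

# Empty DataFrame with predefined (illustrative) column names
df = pd.DataFrame(columns=['Name', 'Age'])

print(df.empty)          # True while the DataFrame has no rows
print(list(df.columns))
```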

Create DataFrame from list of lists

To create a Pandas DataFrame from a list of lists, pass the list to the pd.DataFrame() function. Each inner list becomes one row, and the columns parameter supplies the column names.

Example: Creating DataFrame from a list of lists using the DataFrame() function

Python3




# Import pandas library
import pandas as pd
 
# initialize list of lists
data = [['tom', 10], ['nick', 15], ['juli', 14]]
 
# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Name', 'Age'])
 
# print dataframe.
print(df)


Output: 

   Name  Age
0   tom   10
1  nick   15
2  juli   14
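
If the columns parameter is omitted, Pandas labels the columns with integers starting at 0. A short sketch using the same data:

```python
import pandas as pd

data = [['tom', 10], ['nick', 15], ['juli', 14]]

# No column names given, so the columns are labeled 0 and 1
df = pd.DataFrame(data)

print(list(df.columns))
print(df.shape)
```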

Create DataFrame from Dictionary of ndarray/Lists

To create a DataFrame from a dictionary of ndarrays/lists, all the arrays must have the same length. If an index is passed, its length must equal the length of the arrays.

If no index is passed, the index defaults to range(n), where n is the array length.

Example: Creating DataFrame from a dictionary of ndarray/lists

Python3




# Python code demonstrating how to create a
# DataFrame from a dict of ndarrays/lists
# with the default index.
 
import pandas as pd
 
# initialize data of lists.
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
 
# Create DataFrame
df = pd.DataFrame(data)
 
# Print the output.
print(df)


Output: 

    Name  Age
0    Tom   20
1   nick   21
2  krish   19
3   jack   18
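
The example above uses lists; the same call works with NumPy arrays. The sketch below, with illustrative values, also shows that arrays of unequal length are rejected with a ValueError:

```python
import numpy as np
import pandas as pd

# Dictionary of ndarrays of equal length
data = {'Name': np.array(['Tom', 'nick', 'krish', 'jack']),
        'Age': np.array([20, 21, 19, 18])}
df = pd.DataFrame(data)
print(df)

# Arrays of different lengths cannot form a DataFrame
try:
    pd.DataFrame({'Name': ['Tom', 'nick'], 'Age': [20]})
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print('ValueError:', err)
```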

Note: When creating a DataFrame from a dictionary, the dictionary keys become the column names by default. The columns parameter can be used to select which keys to include and in what order.
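
With dictionary input, the columns parameter selects and orders keys from the dictionary rather than renaming them. A minimal sketch, assuming an extra illustrative 'City' key:

```python
import pandas as pd

data = {'Name': ['Tom', 'nick'],
        'Age': [20, 21],
        'City': ['Pune', 'Delhi']}   # illustrative extra key

# Keep only 'Age' and 'Name', in that order
df = pd.DataFrame(data, columns=['Age', 'Name'])
print(df)

# To rename columns after creation, use rename()
renamed = df.rename(columns={'Age': 'Years'})
print(list(renamed.columns))
```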

Create DataFrame from List of Dictionaries

A Pandas DataFrame can be created by passing a list of dictionaries as input data. Each dictionary becomes one row, and the dictionary keys are taken as the column names by default.

Python3




# Python code demonstrating how to create a
# Pandas DataFrame from a list of dicts.
import pandas as pd

# Initialize a list of dicts.
data = [{'a': 1, 'b': 2, 'c': 3},
        {'a': 10, 'b': 20, 'c': 30}]
 
# Creates DataFrame.
df = pd.DataFrame(data)
 
# Print the data
print(df)


Output: 

    a   b   c
0   1   2   3
1  10  20  30

A list of dictionaries can also be combined with explicit row labels. Keys missing from a dictionary are filled with NaN.

Python3




# Python code demonstrating how to create a
# Pandas DataFrame from a list of
# dictionaries and row indices.
import pandas as pd

# Initialize a list of dicts
data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]

# Create the DataFrame by passing the
# list of dictionaries and the row index.
df = pd.DataFrame(data, index=['first', 'second'])
 
# Print the data
print(df)


Output: 

         b   c     a
first    2   3   NaN
second  20  30  10.0
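
Because the first dictionary has no 'a' key, that cell is filled with NaN, and the whole 'a' column is stored as floating point. A small sketch for inspecting this on the same data:

```python
import pandas as pd

data = [{'b': 2, 'c': 3}, {'a': 10, 'b': 20, 'c': 30}]
df = pd.DataFrame(data, index=['first', 'second'])

# Which rows are missing a value for 'a'?
print(df['a'].isna())

# Integer columns that contain NaN are stored as float64
print(df.dtypes)
```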

Create DataFrame from a dictionary of Series

To create a DataFrame from a dictionary of Series, pass the dictionary to pd.DataFrame(). Each key becomes a column name, and the resulting index is the union of the indexes of all the passed Series.

Example: Creating a DataFrame from a dictionary of series.

Python3




# Python code demonstrating how to create a
# Pandas DataFrame from a dict of Series.

import pandas as pd

# Initialize a dict of Series.
d = {'one': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd']),
     'two': pd.Series([10, 20, 30, 40],
                      index=['a', 'b', 'c', 'd'])}

# Create the DataFrame.
df = pd.DataFrame(d)

# Print the data.
print(df)


Output: 

   one  two
a   10   10
b   20   20
c   30   30
d   40   40
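
In the example above both Series share the same index. The sketch below uses Series with different (illustrative) indexes to show the union behaviour: labels missing from one Series are filled with NaN.

```python
import pandas as pd

d = {'one': pd.Series([10, 20, 30], index=['a', 'b', 'c']),
     'two': pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])}

df = pd.DataFrame(d)

# The index is the union of both indexes: a, b, c, d
print(df)
```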

Create DataFrame using the zip() function

The zip() function pairs up the elements of two or more lists into tuples. Passing the resulting list of tuples to pd.DataFrame() creates a DataFrame in which each tuple is one row.

Example: Creating DataFrame using zip() function.

Python3




# Python program demonstrating how to create a
# pandas DataFrame from lists using zip().
 
import pandas as pd
 
# List1
Name = ['tom', 'krish', 'nick', 'juli']
 
# List2
Age = [25, 30, 26, 22]
 
# Pair up the elements of the two lists
# with zip() to get a list of tuples.
list_of_tuples = list(zip(Name, Age))

# Convert the list of tuples into
# a pandas DataFrame.
df = pd.DataFrame(list_of_tuples,
                  columns=['Name', 'Age'])
 
# Print data.
print(df)


Output: 

    Name  Age
0    tom   25
1  krish   30
2   nick   26
3   juli   22
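
One caveat: zip() stops at the shortest input, so lists of unequal length are silently truncated. The sketch below uses illustrative data, with itertools.zip_longest from the standard library as one way to keep every row:

```python
from itertools import zip_longest

import pandas as pd

names = ['tom', 'krish', 'nick']
ages = [25, 30]            # one value short

# zip() drops the unmatched 'nick'
short = pd.DataFrame(list(zip(names, ages)), columns=['Name', 'Age'])
print(len(short))

# zip_longest() pads the missing value with None instead
full = pd.DataFrame(list(zip_longest(names, ages)), columns=['Name', 'Age'])
print(full)
```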

Create a DataFrame by providing the index label explicitly

To create a DataFrame with explicit index labels, use the index parameter of the pd.DataFrame() constructor. The index parameter takes a list of labels, one per row, and the DataFrame uses these labels for its rows.

Example: Creating a DataFrame by providing the index label explicitly

Python3




# Python code demonstrating how to create a
# pandas DataFrame with explicit index
# labels from a dict of lists.
import pandas as pd

# Initialize a dict of lists.
data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}

# Create the pandas DataFrame.
df = pd.DataFrame(data, index=['rank1',
                               'rank2',
                               'rank3',
                               'rank4'])
 
# print the data
print(df)


Output: 

       Name  marks
rank1   Tom     99
rank2  Jack     98
rank3  nick     95
rank4  juli     90
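
Once explicit labels are set, rows can be looked up by label with loc. A brief sketch on the same data:

```python
import pandas as pd

data = {'Name': ['Tom', 'Jack', 'nick', 'juli'],
        'marks': [99, 98, 95, 90]}
df = pd.DataFrame(data, index=['rank1', 'rank2', 'rank3', 'rank4'])

# Select a whole row by its label
print(df.loc['rank2'])

# Select a single value by row label and column name
top_name = df.loc['rank1', 'Name']
print(top_name)
```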

Conclusion

A Pandas DataFrame is a two-dimensional data structure, similar to a table with rows and columns, and is very useful for data analysis and data manipulation.

In this tutorial, we have discussed multiple ways of creating a Pandas DataFrame: an empty constructor call, a list of lists, a dictionary of ndarrays or lists, a list of dictionaries, a dictionary of Series, zipped lists, and explicit index labels.



Last Updated : 18 Jan, 2024