
How to plot a dataframe using Pandas?


Pandas is one of the most popular Python packages used in data science. It offers powerful and flexible data structures (DataFrame and Series) for manipulating and analyzing data. Visualizing the data is often the quickest way to interpret it.

Python has many popular plotting libraries that make visualization easy, including Matplotlib, Seaborn, and Plotly. Pandas integrates closely with Matplotlib: a DataFrame can be plotted directly with its plot() method. First, though, we need a DataFrame to plot. We can create one by passing a dictionary to the DataFrame() constructor of the Pandas library.
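As a minimal sketch of this integration, the snippet below plots a small DataFrame and then adjusts the result through Matplotlib. The 'sales' data and its column names are made up purely for illustration and do not come from this article.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data, used only for illustration
sales = pd.DataFrame({'month': [1, 2, 3, 4],
                      'units': [10, 14, 9, 17]})

# plot() draws with Matplotlib and returns a Matplotlib Axes object
ax = sales.plot(x='month', y='units')

# The returned Axes accepts the usual Matplotlib calls
ax.set_title('Units per month')
ax.set_ylabel('units')

plt.show()
```

Because plot() hands back a regular Axes, anything Matplotlib can do to an Axes (titles, labels, limits) can be applied to a Pandas plot as well.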

Plot a Dataframe using Pandas

Making different kinds of plots from a Pandas DataFrame is easy. We first create a simple DataFrame to work with, then cover the following plot types:

  1. Scatter Plot
  2. Area Plot
  3. Bar Plot
  4. Violin Plot
  5. Line Plot
  6. Box Plot
  7. Histogram Plot

Create a Dataframe

Let's create a simple DataFrame. The code below imports the Pandas and Matplotlib libraries, builds a dictionary representing student data, and uses it to create a Pandas DataFrame. The `head()` function displays the first five rows of the DataFrame.

Python3




# importing the required libraries
# In case pandas is not installed on your machine,
# use the command 'pip install pandas'.
import pandas as pd
import matplotlib.pyplot as plt
 
# A dictionary which represents data
data_dict = {'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
             'age': [20, 20, 21, 20, 21, 20],
             'math_marks': [100, 90, 91, 98, 92, 95],
             'physics_marks': [90, 100, 91, 92, 98, 95],
             'chem_marks': [93, 89, 99, 92, 94, 92]
             }
 
# creating a data frame object
df = pd.DataFrame(data_dict)
 
# show the dataframe
# by default, head() shows
# the first five rows from the top
df.head()


Output: 

  name  age  math_marks  physics_marks  chem_marks
0   p1   20         100             90          93
1   p2   20          90            100          89
2   p3   21          91             91          99
3   p4   20          98             92          92
4   p5   21          92             98          94

Create Plots in Pandas Dataframe

Many plot types are available for interpreting data, and each serves a different purpose. The sections below cover some of the most commonly used ways to create plots from a Pandas DataFrame.

Plot Dataframe using Pandas Scatter Plot

To get a scatter plot of a DataFrame, call the plot() method with the following parameters:

kind='scatter', x='some_column', y='some_column', color='some_color'

Example: This code creates a scatter plot from the DataFrame 'df', with 'math_marks' on the x-axis and 'physics_marks' on the y-axis, plotted in red. The plot is titled 'ScatterPlot' and displayed using Matplotlib.

Python3




# scatter plot
df.plot(kind='scatter',
        x='math_marks',
        y='physics_marks',
        color='red')
 
# set the title
plt.title('ScatterPlot')
 
# show the plot
plt.show()


Output:

[Figure: scatter plot titled 'ScatterPlot']

This is only the most basic form; plots can be customized in many ways.
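A few of these customizations are sketched below. The DataFrame is a copy of the student data from the 'Create a Dataframe' section, stored under the name 'students' (a name chosen here for illustration). The marker size, transparency, and label text are arbitrary choices rather than recommended values.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Copy of the student data used earlier in the article
students = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'age': [20, 20, 21, 20, 21, 20],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# s sets the marker size, alpha the transparency;
# title, grid and figsize are general plot() options
ax = students.plot(kind='scatter',
                   x='math_marks',
                   y='physics_marks',
                   color='red',
                   s=80,
                   alpha=0.6,
                   grid=True,
                   figsize=(6, 4),
                   title='Math vs Physics marks')

# Axis labels are set on the returned Matplotlib Axes
ax.set_xlabel('Math marks')
ax.set_ylabel('Physics marks')

plt.show()
```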

Plot a Pandas DataFrame using Area Plot

An area plot displays quantitative data as filled regions between a line and the axis. The filled areas show the magnitude of each series and, when the series are stacked, their cumulative total, which makes trends and patterns easy to see.

Example: This code uses the pandas, NumPy, and Matplotlib libraries to create a sample DataFrame with 'X', 'Y1', and 'Y2' columns, then generates and displays an area plot with 'X' on the x-axis and 'Y1' and 'Y2' on the y-axis, titled 'Area Plot'. Because stacked=False is passed, the two areas overlap instead of being stacked on top of each other.

Python3




import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
 
# Creating a sample DataFrame. It is named area_df so that the
# student DataFrame 'df' stays available for the later examples.
data = {'X': np.arange(1, 11),
        'Y1': np.random.randint(1, 10, size=10),
        'Y2': np.random.randint(1, 10, size=10)}
area_df = pd.DataFrame(data)
 
# Plotting Area Plot (stacked=False overlays the two areas)
area_df.plot(x='X', kind='area', stacked=False)
plt.title('Area Plot')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()


Output:

[Figure: Area Plot]
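To see the cumulative total mentioned in the definition, the series have to be stacked. The sketch below draws the same small DataFrame twice, once with stacked=False and once with stacked=True. The fixed values and the name 'area_demo' are assumptions made for illustration so that the result is reproducible.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Small fixed dataset (illustrative values, not from the article)
area_demo = pd.DataFrame({'X': [1, 2, 3, 4],
                          'Y1': [2, 4, 3, 5],
                          'Y2': [1, 3, 2, 4]})

# Two side-by-side axes, each with its own y-scale
fig, (left, right) = plt.subplots(1, 2, figsize=(10, 4))

# Overlapping areas: each series starts from zero
area_demo.plot(x='X', kind='area', stacked=False,
               ax=left, title='stacked=False')

# Stacked areas: Y2 is drawn on top of Y1, so the upper
# edge shows the running total Y1 + Y2
area_demo.plot(x='X', kind='area', stacked=True,
               ax=right, title='stacked=True')

plt.show()
```

In the stacked panel the top edge reaches Y1 + Y2, while in the unstacked panel each area only reaches its own value.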

Plot a Pandas DataFrame using Bar Plot

Similarly, we specify a few parameters in the plot() method to get a bar plot:

kind='bar', x='some_column', y='some_column', color='some_color'

Example: This code creates a bar plot of the 'physics_marks' column from the student DataFrame 'df', with names on the x-axis, green bars, and the title 'BarPlot'. The plot is displayed using Matplotlib's `show()` function.

Python3




# bar plot
df.plot(kind='bar',
        x='name',
        y='physics_marks',
        color='green')
 
# set the title
plt.title('BarPlot')
 
# show the plot
plt.show()


Output:

[Figure: bar plot titled 'BarPlot']
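Passing a list of columns as y draws one group of bars per row, which can make it easier to compare subjects. This is a sketch using a copy of the student data under the illustrative name 'students'; the title text and rot=0 (horizontal tick labels) are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Copy of the student data used earlier in the article
students = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# A list for y produces grouped bars: three bars per student
ax = students.plot(kind='bar',
                   x='name',
                   y=['math_marks', 'physics_marks', 'chem_marks'],
                   rot=0,
                   title='Marks per student')

plt.show()
```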

Plot a Pandas DataFrame using Violin Plot

A violin plot is a data visualization that combines aspects of a box plot and a kernel density plot, providing insights into the distribution, central tendency, and probability density of a dataset.

Example: Pandas' plot() method does not offer a violin kind, so this example uses Seaborn to draw a violin plot of the distribution of 'Values' in two categories ('A' and 'B') from a sample DataFrame.

Python3




import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
 
# Creating a sample DataFrame. It is named violin_df so that the
# student DataFrame 'df' stays available for the later examples.
data = {'Category': ['A'] * 100 + ['B'] * 100,
        'Values': np.concatenate([np.random.normal(0, 1, 100), np.random.normal(3, 1, 100)])}
violin_df = pd.DataFrame(data)
 
# Plotting Violin Plot
plt.figure(figsize=(8, 6))
sns.violinplot(x='Category', y='Values', data=violin_df)
plt.title('Violin Plot')
plt.xlabel('Category')
plt.ylabel('Values')
plt.show()


Output:

[Figure: Violin Plot]
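If Seaborn is not installed, a similar figure can be drawn with Matplotlib's own violinplot() method. The following is a sketch of that alternative, not the approach used in the example above. The seeded random generator and the names 'violin_demo', 'labels', and 'groups' are assumptions added for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Seeded generator so the sample data is reproducible
rng = np.random.default_rng(0)
violin_demo = pd.DataFrame({
    'Category': ['A'] * 100 + ['B'] * 100,
    'Values': np.concatenate([rng.normal(0, 1, 100),
                              rng.normal(3, 1, 100)]),
})

# Matplotlib expects one array of values per violin
labels = ['A', 'B']
groups = [violin_demo.loc[violin_demo['Category'] == label, 'Values'].to_numpy()
          for label in labels]

fig, ax = plt.subplots(figsize=(8, 6))

# showmedians=True adds a horizontal line at each median
parts = ax.violinplot(groups, showmedians=True)

# Violins are placed at x = 1, 2, ...; label them by category
ax.set_xticks([1, 2])
ax.set_xticklabels(labels)
ax.set_title('Violin Plot')
ax.set_xlabel('Category')
ax.set_ylabel('Values')

plt.show()
```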

Create Plots in Pandas using Line Plot

A line plot of a single column is not always useful. To get more insight, we often plot multiple columns on the same graph, which means reusing the same axes for each plot() call.

kind='line', x='some_column', y='some_column', color='some_color', ax=some_axes

Example: This code calls the plot() method three times to draw lines for the math, physics, and chemistry marks from the student DataFrame 'df'. All three lines are drawn on the same Matplotlib axes ('ax'), and the plot is titled 'LinePlots'.

Python3




# Get current axis
ax = plt.gca()
 
# line plot for math marks
df.plot(kind='line',
        x='name',
        y='math_marks',
        color='green', ax=ax)
 
# line plot for physics marks
df.plot(kind='line', x='name',
        y='physics_marks',
        color='blue', ax=ax)
 
# line plot for chemistry marks
df.plot(kind='line', x='name',
        y='chem_marks',
        color='black', ax=ax)
 
# set the title
plt.title('LinePlots')
 
# show the plot
plt.show()


Output:

[Figure: line plot titled 'LinePlots']
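The same figure can also be produced with a single plot() call by passing lists for y and color, which avoids managing the axes by hand. This is a sketch using a copy of the student data under the illustrative name 'students'.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Copy of the student data used earlier in the article
students = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})

# One line per column in y; colors are matched by position
ax = students.plot(kind='line',
                   x='name',
                   y=['math_marks', 'physics_marks', 'chem_marks'],
                   color=['green', 'blue', 'black'],
                   title='LinePlots')

plt.show()
```

Reusing an explicit ax, as in the example above, remains useful when the lines come from different DataFrames or need different plot kinds.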

Create plots in pandas using Box Plot

A box plot is mainly used to identify outliers. It also shows summary information such as the median, maximum, minimum, and quartiles. Let's see how to plot it.

Example: These two lines of code use Pandas to create a box plot of the numeric columns of a DataFrame (assumed to be named 'df') and then display the plot using Matplotlib.

Python3




df.plot.box()
plt.show()


Output:

[Figure: box plot of the DataFrame's numeric columns]
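The values a box plot summarizes can also be computed directly. The sketch below assumes Matplotlib's default whisker rule (points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers) and uses a copy of the student marks under the illustrative names 'students' and 'marks'.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Copy of the student data used earlier in the article
students = pd.DataFrame({
    'name': ['p1', 'p2', 'p3', 'p4', 'p5', 'p6'],
    'math_marks': [100, 90, 91, 98, 92, 95],
    'physics_marks': [90, 100, 91, 92, 98, 95],
    'chem_marks': [93, 89, 99, 92, 94, 92],
})
marks = students[['math_marks', 'physics_marks', 'chem_marks']]

# Quartiles and interquartile range for each column
q1 = marks.quantile(0.25)
q3 = marks.quantile(0.75)
iqr = q3 - q1

# Flag values beyond 1.5 * IQR from the quartiles
is_outlier = (marks.lt(q1 - 1.5 * iqr, axis='columns')
              | marks.gt(q3 + 1.5 * iqr, axis='columns'))
outlier_counts = is_outlier.sum()

print(marks.median())
print(outlier_counts)

# The box plot draws the same quartiles and flags the same points
ax = marks.plot.box(title='Marks by subject')
plt.show()
```

With this data only 'chem_marks' has flagged values (89 and 99), because its middle half is packed tightly between 92 and 93.75.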

Plotting with Pandas and Matplotlib Histogram Plot

A histogram plot is a graphical representation of the distribution of a dataset, displaying the frequency of values within specified intervals (bins) along a continuous range. It provides a visual summary of the data’s underlying frequency distribution.

Example: This code uses pandas to create a DataFrame with 100 random values from a standard normal distribution, then plots a histogram with 20 bins through Matplotlib, showing the frequency distribution of the values.

Python3




import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
 
# Creating a sample DataFrame
data = {'Values': np.random.randn(100)}
df = pd.DataFrame(data)
 
# Plotting Histogram
df['Values'].plot(kind='hist', bins=20, edgecolor='black')
plt.title('Histogram Plot')
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.show()


Output:

[Figure: Histogram Plot]
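The bar heights in a histogram are simply counts per bin, which NumPy can compute without drawing anything. The sketch below uses a seeded generator and the names 'values', 'counts', and 'edges', all chosen here for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Seeded generator so the sample is reproducible
rng = np.random.default_rng(42)
values = pd.Series(rng.standard_normal(100), name='Values')

# 20 equal-width bins: 20 counts and 21 bin edges
counts, edges = np.histogram(values, bins=20)
print(counts)
print(edges.round(2))

# The plotted bars correspond to the same 20 bins
ax = values.plot(kind='hist', bins=20, edgecolor='black',
                 title='Histogram Plot')
ax.set_xlabel('Values')

plt.show()
```

Changing bins trades detail for smoothness: fewer bins hide small features, while more bins make the counts in each bar noisier.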



Last Updated : 18 Dec, 2023