A 2D histogram is used to analyze the relationship between two variables that each take a wide range of values. It is very similar to a 1D histogram, except that the class intervals (bins) of the data set are laid out along both the x and y axes. Unlike a 1D histogram, it is drawn by counting the observations whose values fall into each combination of an x interval and a y interval, and shading each cell according to that count. It is useful when a data set contains a large number of points, because it simplifies the picture by showing where the values of the two variables are most densely concentrated.
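The counting step described above can be sketched without any plotting by using NumPy's np.histogram2d, which builds the same kind of count grid. The six data points and the two-bins-per-axis layout are made up for illustration only.

```python
import numpy as np

# Six (x, y) observations; the values are illustrative, not real data
x = np.array([0.1, 0.2, 0.8, 0.9, 0.4, 0.7])
y = np.array([0.1, 0.3, 0.9, 0.7, 0.2, 0.8])

# Two bins per axis over [0, 1], so the bin edges are 0, 0.5 and 1
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=2, range=[[0, 1], [0, 1]]
)

# counts[i, j] is the number of points in x-bin i and y-bin j
print(counts)
print(x_edges, y_edges)
```

Three points land in the low-x, low-y cell and three in the high-x, high-y cell, so the grid is dense on one diagonal and empty elsewhere. A 2D histogram plot shows this grid as colored cells.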
Creating a 2D Histogram
The Matplotlib library provides a built-in function, matplotlib.pyplot.hist2d(), which is used to create a 2D histogram. Below is the syntax of the function:
matplotlib.pyplot.hist2d(x, y, bins=(nx, ny), range=None, density=False, weights=None, cmin=None, cmax=None, cmap=value)
Here, x and y specify the coordinates of the data points; the x and y arrays must have the same length. The number of bins is set with bins=(nx, ny), where nx and ny are the numbers of bins to be used in the horizontal and vertical directions respectively. cmap=value sets the color map. range is an optional parameter that sets the rectangular area, given as [[xmin, xmax], [ymin, ymax]], in which data values are counted for the plot; points outside it are ignored. density is an optional boolean parameter; when it is True, the histogram is normalized to a probability density instead of showing raw counts.
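As a minimal sketch of how these parameters interact, the snippet below calls hist2d through an Axes object and inspects what it returns. The seed, the bin counts, the range limits and the non-interactive "Agg" backend are all assumptions chosen so the example runs anywhere; they are not part of the original article.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to view the plot
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary fixed seed for repeatability
x = rng.standard_normal(1000)
y = rng.standard_normal(1000)

fig, ax = plt.subplots()

# hist2d returns the bin values, both sets of bin edges, and the drawn image
h, x_edges, y_edges, image = ax.hist2d(
    x, y,
    bins=(8, 4),                  # 8 bins along x, 4 bins along y
    range=[[-2, 2], [-1, 1]],     # points outside this rectangle are ignored
    density=True,                 # normalize to a probability density
    cmap="viridis",
)

# With density=True, bin value times bin area sums to 1 over the chosen range
bin_area = np.outer(np.diff(x_edges), np.diff(y_edges))
total = np.sum(h * bin_area)

print(h.shape, len(x_edges), len(y_edges), total)
```

The array of bin values has one row per x bin and one column per y bin, and each axis has one more edge than it has bins.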
The code below creates a simple 2D histogram with the matplotlib.pyplot.hist2d() function, using some random values of x and y:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
n = 100
x = np.random.standard_normal(n)
y = 3.0 * x

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.hist2d(x, y)
plt.title("Simple 2D Histogram")

# show plot
plt.show()
Output:
Customizing 2D Histogram
The matplotlib.pyplot.hist2d() function accepts a range of parameters that we can use to customize the plot for a clearer view and better understanding. The example below plots a larger data set with the default settings as a starting point:
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
x = np.random.normal(size=500000)
y = x * 3 + 4 * np.random.normal(size=500000)

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.hist2d(x, y)
plt.title("Simple 2D Histogram")

# show plot
plt.show()
Output:
Some customizations of the above graph are listed below:
Changing the bin scale:-
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
x = np.random.normal(size=500000)
y = x * 3 + 4 * np.random.normal(size=500000)

# Creating bins (each array holds bin edges)
x_min = np.min(x)
x_max = np.max(x)
y_min = np.min(y)
y_max = np.max(y)
x_bins = np.linspace(x_min, x_max, 50)
y_bins = np.linspace(y_min, y_max, 20)

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.hist2d(x, y, bins=[x_bins, y_bins])
plt.title("Changing the bin scale")
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')

# show plot
plt.tight_layout()
plt.show()
Output:
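One detail worth checking when bins are passed as arrays is that the arrays hold bin edges, so N edges produce N - 1 bins. The sketch below verifies this with np.histogram2d on a smaller sample; the seed and the sample size of 1000 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary fixed seed
x = rng.normal(size=1000)
y = 3 * x + 4 * rng.normal(size=1000)

# 50 edges along x and 20 edges along y
x_edges = np.linspace(x.min(), x.max(), 50)
y_edges = np.linspace(y.min(), y.max(), 20)

counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])

# One fewer bin than edges on each axis
print(counts.shape)

# The last bin includes its right edge, so the maximum values are counted too
print(counts.sum())
```

To get exactly 50 by 20 bins from explicit edges, np.linspace would need 51 and 21 points; passing bins=(50, 20) is the shorter alternative.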
Changing the color scale and adding color bar:-
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
x = np.random.normal(size=500000)
y = x * 3 + 4 * np.random.normal(size=500000)

# Creating bins (each array holds bin edges)
x_min = np.min(x)
x_max = np.max(x)
y_min = np.min(y)
y_max = np.max(y)
x_bins = np.linspace(x_min, x_max, 50)
y_bins = np.linspace(y_min, y_max, 20)

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.hist2d(x, y, bins=[x_bins, y_bins], cmap=plt.cm.nipy_spectral)
plt.title("Changing the color scale and adding color bar")

# Adding color bar
plt.colorbar()
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')

# show plot
plt.tight_layout()
plt.show()
Output:
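The same color bar can also be attached explicitly by passing the image returned by hist2d to fig.colorbar, which makes it clear which plot the bar describes when a figure has several axes. This is a sketch of one possible style, not the article's method; the seed, the smaller sample size, the label text and the "Agg" backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to view the plot
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)  # arbitrary fixed seed
x = rng.normal(size=10_000)
y = 3 * x + 4 * rng.normal(size=10_000)

fig, ax = plt.subplots(figsize=(10, 7))

# The fourth return value is the drawn image, which carries the color mapping
counts, x_edges, y_edges, image = ax.hist2d(
    x, y, bins=(50, 20), cmap="nipy_spectral"
)

# Attach the color bar to this specific image and axes
cbar = fig.colorbar(image, ax=ax)
cbar.set_label("Count per bin")  # hypothetical label text

ax.set_title("Changing the color scale and adding color bar")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
fig.tight_layout()
```

Calling plt.show() afterwards displays the figure when an interactive backend is in use.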
Filtering data:-
# Import libraries
import numpy as np
import matplotlib.pyplot as plt
import random

# Creating dataset
x = np.random.normal(size=500000)
y = x * 3 + 4 * np.random.normal(size=500000)

# Creating bins (each array holds bin edges)
x_min = np.min(x)
x_max = np.max(x)
y_min = np.min(y)
y_max = np.max(y)
x_bins = np.linspace(x_min, x_max, 50)
y_bins = np.linspace(y_min, y_max, 20)

# Creating data filter: mark randomly drawn rows with a sentinel, then drop them
data = np.c_[x, y]
for i in range(10000):
    x_idx = random.randrange(500000)  # valid row indices are 0 to 499999
    data[x_idx, 0] = -9999
data = data[data[:, 0] != -9999]

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.hist2d(data[:, 0], data[:, 1], bins=[x_bins, y_bins])
plt.title("Filtering data")
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')

# show plot
plt.tight_layout()
plt.show()
Output:
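The filtering step can also be written without a Python loop or a sentinel value by building a boolean mask. The sketch below is an alternative under stated assumptions: it uses NumPy's Generator API with an arbitrary seed, and it draws the discarded rows without replacement, so exactly 10,000 rows are removed (the loop version can pick the same row twice and remove slightly fewer).

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary fixed seed
n = 500_000
x = rng.normal(size=n)
y = 3 * x + 4 * rng.normal(size=n)

# Pick 10,000 distinct row indices to discard
drop_idx = rng.choice(n, size=10_000, replace=False)

# Boolean mask: True for rows to keep, False for rows to drop
keep = np.ones(n, dtype=bool)
keep[drop_idx] = False

x_kept = x[keep]
y_kept = y[keep]
print(x_kept.size, y_kept.size)
```

The filtered arrays x_kept and y_kept can then be passed to hist2d in place of data[:, 0] and data[:, 1].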
Using matplotlib hexbin function:-
# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# Creating dataset
x = np.random.normal(size=500000)
y = x * 3 + 4 * np.random.normal(size=500000)

fig, ax = plt.subplots(figsize=(10, 7))

# Creating plot
plt.title("Using matplotlib hexbin function")
# gridsize sets the number of hexagons across the x direction;
# in hexbin, bins controls the color scaling, not the grid
plt.hexbin(x, y, gridsize=50)
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')

# show plot
plt.tight_layout()
plt.show()
Output:
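Unlike hist2d, hexbin sets the grid resolution with gridsize, while its bins parameter only affects how counts are mapped to colors. The sketch below shows gridsize together with mincnt, which hides hexagons containing fewer points than the given threshold. The seed, the sample size, the gridsize of 30 and the "Agg" backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove this line to view the plot
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(4)  # arbitrary fixed seed
x = rng.normal(size=10_000)
y = 3 * x + 4 * rng.normal(size=10_000)

fig, ax = plt.subplots(figsize=(10, 7))

# gridsize: number of hexagons across x; mincnt=1 leaves empty hexagons undrawn
hexes = ax.hexbin(x, y, gridsize=30, mincnt=1, cmap="viridis")
fig.colorbar(hexes, ax=ax)

# The returned collection stores one count per drawn hexagon
hex_counts = hexes.get_array()
print(len(hex_counts), hex_counts.min())

ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
```

Because empty hexagons are dropped, every remaining entry in hex_counts is at least 1.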