Data visualization with different Charts in Python

Data Visualization is the presentation of data in graphical format. It helps people understand the significance of data by summarizing and presenting huge amount of data in a simple and easy-to-understand format and helps communicate information clearly and effectively.

Consider this given Data-set for which we will be plotting different charts :

 

Different Types of Charts for Analyzing & Presenting Data

 
1. Histogram :
The histogram represents the frequency of occurrence of specific phenomena which lie within a specific range of values and arranged in consecutive and fixed intervals.



In below code histogram is plotted for Age, Income, Sales. So these plots in the output shows frequency of each unique value for each attribute.

filter_none

edit
close

play_arrow

link
brightness_4
code

# import pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt
  
# create 2D array of table given above
data = [['E001', 'M', 34, 123, 'Normal', 350],
        ['E002', 'F', 40, 114, 'Overweight', 450],
        ['E003', 'F', 37, 135, 'Obesity', 169],
        ['E004', 'M', 30, 139, 'Underweight', 189],
        ['E005', 'F', 44, 117, 'Underweight', 183],
        ['E006', 'M', 36, 121, 'Normal', 80],
        ['E007', 'M', 32, 133, 'Obesity', 166],
        ['E008', 'F', 26, 140, 'Normal', 120],
        ['E009', 'M', 32, 133, 'Normal', 75],
        ['E010', 'M', 36, 133, 'Underweight', 40] ]
  
# dataframe created with
# the above data array
df = pd.DataFrame(data, columns = ['EMPID', 'Gender'
                                    'Age', 'Sales',
                                    'BMI', 'Income'] )
  
# create histogram for numeric data
df.hist()
  
# show plot
plt.show()

chevron_right


Output :

 
2. Column Chart :
A column chart is used to show a comparison among different attributes, or it can show a comparison of items over time.

filter_none

edit
close

play_arrow

link
brightness_4
code

# Dataframe of previous code is used here
  
# Plot the bar chart for numeric values
# a comparison will be shown between
# all 3 age, income, sales
df.plot.bar()
  
# plot between 2 attributes
plt.bar(df['Age'], df['Sales'])
plt.xlabel("Age")
plt.ylabel("Sales")
plt.show()

chevron_right


Output :

 
3. Box plot chart :
A box plot is a graphical representation of statistical data based on the minimum, first quartile, median, third quartile, and maximum. The term “box plot” comes from the fact that the graph looks like a rectangle with lines extending from the top and bottom. Because of the extending lines, this type of graph is sometimes called a box-and-whisker plot. For quantile and median refer to this Quantile and median.

filter_none

edit
close

play_arrow

link
brightness_4
code

# For each numeric attribute of dataframe
df.plot.box()
  
# individual attribute box plot
plt.boxplot(df['Income'])
plt.show()

chevron_right


Output :

 
4. Pie Chart :
A pie chart shows a static number and how categories represent part of a whole the composition of something. A pie chart represents numbers in percentages, and the total sum of all segments needs to equal 100%.

filter_none

edit
close

play_arrow

link
brightness_4
code

plt.pie(df['Age'], labels = {"A", "B", "C",
                             "D", "E", "F",
                             "G", "H", "I", "J"},
                               
autopct ='% 1.1f %%', shadow = True)
plt.show()
  
plt.pie(df['Income'], labels = {"A", "B", "C",
                                "D", "E", "F",
                                "G", "H", "I", "J"},
                                  
autopct ='% 1.1f %%', shadow = True)
plt.show()
  
plt.pie(df['Sales'], labels = {"A", "B", "C",
                               "D", "E", "F",
                               "G", "H", "I", "J"},
autopct ='% 1.1f %%', shadow = True)
plt.show()

chevron_right


Output :

 
5. Scatter plot :
A scatter chart shows the relationship between two different variables and it can reveal the distribution trends. It should be used when there are many different data points, and you want to highlight similarities in the data set. This is useful when looking for outliers and for understanding the distribution of your data.

filter_none

edit
close

play_arrow

link
brightness_4
code

# scatter plot between income and age
plt.scatter(df['income'], df['age'])
plt.show()
  
# scatter plot between income and sales
plt.scatter(df['income'], df['sales'])
plt.show()
  
# scatter plot between sales and age
plt.scatter(df['sales'], df['age'])
plt.show()

chevron_right


Output :



My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.




Article Tags :

1


Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.