Categorical Variable/Data (or Nominal variable):
Such variables take on a fixed and limited number of possible values. For example: grades, gender, blood group type, etc. For categorical variables, the lexical (alphabetical) order of the labels is not necessarily the same as their logical order: e.g. "one", "two", "three" sorts alphabetically as "one", "three", "two". When such variables are sorted, the logical order should be used. Many categorical variables have no order at all. For example, gender is a categorical variable with the categories male and female, and there is no intrinsic ordering to these categories. A purely categorical variable is one that simply allows you to assign categories, but you cannot clearly order the categories.
Terms related to Variability Metrics :
- Mode : Most frequently occurring value in the given data
Data = ["Car", "Bat", "Bat", "Car", "Bat", "Bat", "Bat", "Bike"]
Mode = "Bat"
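As a minimal sketch, the mode of the list above can be found by counting each category with Python's standard collections.Counter. The variable names are chosen here for illustration.

```python
from collections import Counter

data = ["Car", "Bat", "Bat", "Car", "Bat", "Bat", "Bat", "Bike"]

# Count how often each category occurs.
counts = Counter(data)

# most_common(1) returns a one-element list holding the
# (value, count) pair with the highest count.
mode, frequency = counts.most_common(1)[0]

print("Mode =", mode)            # Mode = Bat
print("Frequency =", frequency)  # Frequency = 5
```

If two categories tie for the highest count, most_common returns the one that was seen first, so a tie should be checked separately when it matters.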
- Expected Value : Machine learning algorithms work on numbers, so each category has to be associated with a numeric value. The expected value is the average of these values, weighted by each category's probability of occurrence.
It is calculated by –
-> Multiply each outcome by its probability of occurring.
-> Sum these values.
So, it is the sum of the values times their probabilities of occurrence, and it is often used to summarize the levels of a factor variable.
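The two steps above can be sketched as follows. The numeric codes assigned to each category are hypothetical and chosen only for illustration, and the probabilities are estimated as relative frequencies in the sample list used for the mode.

```python
from collections import Counter

data = ["Car", "Bat", "Bat", "Car", "Bat", "Bat", "Bat", "Bike"]

# Hypothetical numeric value for each category (an assumption for this
# sketch; in practice the values come from the problem, e.g. a price).
codes = {"Bat": 1, "Car": 2, "Bike": 3}

# Estimate each category's probability as its relative frequency.
counts = Counter(data)
total = len(data)
probabilities = {category: n / total for category, n in counts.items()}

# Step 1: multiply each value by its probability. Step 2: sum the products.
expected_value = sum(codes[c] * p for c, p in probabilities.items())

print(probabilities)   # {'Car': 0.25, 'Bat': 0.625, 'Bike': 0.125}
print(expected_value)  # 1.5
```

Here 2 × 0.25 + 1 × 0.625 + 3 × 0.125 = 1.5. A different choice of codes would give a different expected value, so the result is only meaningful when the numeric values themselves are.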
- Bar Charts : Frequency of each category plotted as bars.
Loading Libraries –
import matplotlib.pyplot as plt
import numpy as np
Indexing Data –
Output:
Total Labels : 6
Indexing : [0 1 2 3 4 5]
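Bar Graph –
The surviving fragments of the plotting code show an axis label 'No of Vehicles', tick labels set with plt.xticks(index, label, fontsize ...), and the title 'Market Share for Each Genre 1995-2017'.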
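A minimal sketch of a bar chart of category frequencies is given below. It is not the original listing: the data reuses the sample list from the mode example, and the helper names category_frequencies and plot_bar_chart, along with the axis labels and title, are assumptions made for illustration.

```python
from collections import Counter


def category_frequencies(values):
    """Return (labels, counts), with labels in first-seen order."""
    counter = Counter(values)
    labels = list(counter)
    return labels, [counter[label] for label in labels]


def plot_bar_chart(labels, counts):
    """Draw one bar per category; requires numpy and matplotlib."""
    import numpy as np
    import matplotlib.pyplot as plt

    index = np.arange(len(labels))          # bar positions 0, 1, 2, ...
    plt.bar(index, counts)
    plt.xlabel("Category")
    plt.ylabel("Frequency")
    plt.xticks(index, labels, fontsize=10)  # category names under the bars
    plt.title("Frequency of each category")
    plt.show()


data = ["Car", "Bat", "Bat", "Car", "Bat", "Bat", "Bat", "Bike"]
labels, counts = category_frequencies(data)
print(labels, counts)  # ['Car', 'Bat', 'Bike'] [2, 5, 1]
```

Calling plot_bar_chart(labels, counts) then draws the chart. The counting step is kept separate from the plotting step so that the frequencies can be inspected without opening a figure window.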
- Pie Charts : Frequency of each category plotted as the wedges of a pie. It is a circular graph in which the arc length (and therefore the angle and area) of each slice is proportional to the quantity it represents.
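The proportionality described above can be sketched as follows. The helper names wedge_angles and plot_pie_chart are hypothetical, and the counts are those of the sample list from the mode example.

```python
def wedge_angles(counts):
    """Return the angle in degrees of each wedge, proportional to its count."""
    total = sum(counts)
    return [360.0 * count / total for count in counts]


def plot_pie_chart(labels, counts):
    """Draw the pie chart; requires matplotlib."""
    import matplotlib.pyplot as plt

    # autopct prints each wedge's share of the total as a percentage.
    plt.pie(counts, labels=labels, autopct="%1.1f%%")
    plt.axis("equal")  # keep the pie circular
    plt.title("Share of each category")
    plt.show()


labels = ["Car", "Bat", "Bike"]
counts = [2, 5, 1]
angles = wedge_angles(counts)
print(angles)  # [90.0, 225.0, 45.0]
```

Calling plot_pie_chart(labels, counts) draws the chart. matplotlib computes the wedge sizes itself, so wedge_angles is included only to make the proportionality explicit.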
Improved By : nidhi_biet