R – Statistics
Statistics is a form of mathematical analysis that concerns the collection, organization, analysis, interpretation, and presentation of data. Statistical analysis helps make the best use of the vast amount of data available and improves the efficiency of solutions.
R is a programming language and environment for statistical computing and graphics. This post introduces basic statistical plots: bar charts, pie charts, histograms, and boxplots.
In this post, we will be learning about plotting charts for a single variable. The following software is required to learn and implement statistics in R:
- R software
- RStudio IDE
Functions for plotting graphs in Statistics
The following functions are used to plot graphs that represent statistical data:
- plot() Function:
This function is used to draw a scatter plot with axes and titles.
plot(x, y = NULL, ylim = NULL, xlim = NULL, type = "p", ...)
- data() function:
This function is used to load specified data sets.
data(list = character(), lib.loc = NULL, package = NULL, ...)
- table() Function:
This function is used to build a contingency table of the counts at each combination of factor levels.
table(..., exclude = if (useNA == "no") c(NA, NaN), useNA = c("no", "ifany", "always"), dnn = list.names(...), deparse.level = 1)
- barplot() Function:
This function creates a bar plot with vertical or horizontal bars.
barplot(height, width = 1, names.arg = NULL, space = NULL, ...)
- pie() Function:
This function is used to create a pie chart.
pie(x, labels = names(x), edges = 200, radius = 0.8, clockwise = FALSE, ...)
- hist() Function:
This function creates a histogram of the given data values.
hist(x, breaks = "Sturges", probability = !freq, freq = NULL, ...)
Note: You can look up the documentation for any function by typing “?” before its name, for example ?barplot.
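As a small illustration of the note above, the following sketch looks up help pages and inspects a function's arguments. The choice of pie() as the function to inspect and the variable name pie_defaults are only for illustration; the help calls are wrapped in interactive() because they open a viewer.

```r
# Help pages open in a viewer, so only request them in an interactive session.
if (interactive()) {
  ?barplot        # short form
  help("hist")    # equivalent long form
}

# args() prints a function's argument list directly in the console.
args(pie)

# formals() returns the declared defaults as a list,
# which is handy for checking a single value.
pie_defaults <- formals(pie)
pie_defaults$clockwise
pie_defaults$radius
```

Checking the defaults this way is a quick alternative to reading the full help page when only one argument is in question.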
R's built-in datasets are a useful starting point for developing skills, so we will be using a few of them.
Let’s start by creating a simple bar chart from the chickwts dataset, and learn how to use datasets and a few functions in RStudio along the way.
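Before plotting, it can help to look at the dataset itself. The sketch below loads chickwts and counts the rows per feed type; the variable name feed_counts is an assumption made for this example.

```r
# Load the built-in dataset into the workspace.
data(chickwts)

# Inspect it: first rows and overall structure.
head(chickwts)
str(chickwts)

# Count how many chicks received each feed type.
# table() returns the counts per factor level.
feed_counts <- table(chickwts$feed)
feed_counts
```

The counts produced by table() are what barplot() and pie() take as input.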
A bar chart represents categorical data with rectangular bars, which can be plotted vertically or horizontally.
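A minimal sketch of such a bar chart, using the feed column of chickwts. The colour, titles, and the names feed_counts, sorted_counts, and bar_mids are illustrative choices, not taken from the original post.

```r
# ?barplot   # run this in an interactive session to read the documentation

# Count the observations per feed type.
feed_counts <- table(chickwts$feed)

# Default vertical bar chart.
barplot(feed_counts)

# Sort the counts in decreasing order and draw horizontal bars.
sorted_counts <- feed_counts[order(feed_counts, decreasing = TRUE)]
bar_mids <- barplot(sorted_counts,
                    horiz = TRUE,          # horizontal bars
                    las = 1,               # axis labels drawn horizontally
                    col = "steelblue",
                    main = "Chicks per feed type",
                    xlab = "Number of chicks")
```

barplot() returns the midpoints of the bars, which is useful when adding text labels on top of them.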
Placing ‘?’ in front of a function name, as in ?barplot, displays the documentation for that function along with its syntax. In R, ‘#’ starts a single-line comment; R has no multi-line comment syntax. The bar chart uses chickwts as the dataset, and feed is the attribute of that dataset being plotted.
A pie chart is a circular statistical graph divided into slices, where the size of each slice shows that category's share of the whole.
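A pie chart could be sketched as follows. Reusing the chickwts feed counts here is an assumption, as are the label format, the colours, and the variable names.

```r
# Count the observations per feed type.
feed_counts <- table(chickwts$feed)

# Default pie chart.
pie(feed_counts, main = "Chicks per feed type")

# Add percentage labels and draw the slices clockwise.
pct <- round(100 * prop.table(feed_counts), 1)
slice_labels <- paste0(names(feed_counts), " (", pct, "%)")
pie(feed_counts,
    labels = slice_labels,
    col = rainbow(length(feed_counts)),
    clockwise = TRUE,
    main = "Share of chicks per feed type")
```

prop.table() converts the counts into proportions, so the labels state the same shares that the slices show.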
A histogram represents the distribution of numerical data. It looks similar to a bar chart, but it groups the values into ranges (bins) and shows how many values fall in each range.
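A minimal histogram sketch follows. Using the weight column of chickwts is an assumption made to stay with a dataset already introduced; the titles, colour, and bin count are illustrative.

```r
# Histogram of chick weights with the default "Sturges" bin rule.
h <- hist(chickwts$weight,
          breaks = "Sturges",
          col = "lightblue",
          main = "Distribution of chick weights",
          xlab = "Weight (grams)")

# The returned object holds the bin edges and the count in each bin.
h$breaks
h$counts

# freq = FALSE plots densities instead of counts;
# breaks = 20 is a suggestion for the number of bins, not a guarantee.
hist(chickwts$weight, breaks = 20, freq = FALSE,
     main = "Density of chick weights", xlab = "Weight (grams)")
```

Changing breaks alters how the same values are grouped, so it is worth trying a few settings before drawing conclusions from the shape.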
A box plot is a chart for graphically depicting groups of numerical data through their quartiles. It summarizes the distribution of the data, making the median, the spread between quartiles, and possible outliers easy to read.
USJudgeRatings is a built-in dataset with 12 attributes; RTEN (worthy of retention) is one of them and holds a rating between 0 and 10 inclusive. We use it to plot a boxplot with different arguments of the boxplot() function.
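The boxplot described above might look like the sketch below. The particular arguments shown (horizontal, col, main, xlab) and the variable name bp are illustrative choices, not necessarily the ones used in the original post.

```r
# Load the built-in dataset.
data(USJudgeRatings)

# Default vertical boxplot of the retention ratings.
boxplot(USJudgeRatings$RTEN)

# Horizontal boxplot with a title, an axis label, and a fill colour.
bp <- boxplot(USJudgeRatings$RTEN,
              horizontal = TRUE,
              col = "lightgray",
              main = "Lawyers' ratings of judges: worthy of retention",
              xlab = "RTEN rating")

# bp$stats holds, in order: lower whisker, lower hinge, median,
# upper hinge, upper whisker.
bp$stats

# Any points drawn beyond the whiskers are listed here.
bp$out
```

The middle value of bp$stats is the median, which is the thick line drawn inside the box.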