# Cumulative Frequency and Probability Table in R

In this article, we will see how to compute cumulative frequency and probability tables in the R programming language.

### Functions Used

**table():** The `table()` function organizes and summarizes categorical variables. It takes one or more vectors that can be interpreted as factors and builds a contingency (frequency) table of the counts at each combination of their values. A contingency table is a tabulation of the counts and/or percentages for one or more variables. By default, missing values (`NA`) in the supplied vector are not counted. For a single vector, the result is a one-dimensional table that prints the distinct values on the first row and their respective counts on the row below. The function can be used for cross-tabulation and statistical analysis.

`frequency_table <- table(vec)`
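As a small illustration of the points above, the following sketch counts the values of a short vector that contains one missing value. The vector `colours` and its contents are made up for this example; `useNA` is a documented argument of `table()`.

```r
# Illustrative data with one missing value (not from the article)
colours <- c("red", "blue", "red", NA, "green", "blue", "red")

# Default behaviour: the NA is not counted
freq <- table(colours)
print(freq)

# useNA = "ifany" adds a separate count for NA when any are present
freq_with_na <- table(colours, useNA = "ifany")
print(freq_with_na)

# Individual counts can be looked up by name
print(freq["red"])
```

Because the default table leaves out `NA`, its counts add up to 6 rather than 7 here, which matters later when counts are turned into proportions.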

**cumsum():** The cumulative frequency of a value is its own frequency plus the frequencies of all the values that precede it in the frequency table. The last cumulative frequency therefore equals the total number of observations. Applying `cumsum()` to the frequency table returns a vector whose elements are these cumulative sums.

`cumsum(frequency_table)`
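The running-total behaviour can be checked on a tiny named vector. The counts below are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```r
# Hypothetical counts for four categories
counts <- c(A = 4, B = 9, C = 5, D = 2)

# Running total of the counts; the names are carried over
cum_counts <- cumsum(counts)
print(cum_counts)

# The last cumulative value equals the total number of observations
print(cum_counts[length(cum_counts)] == sum(counts))
```

Here the cumulative sums are 4, 13, 18 and 20, and the final value matches `sum(counts)`.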

**Example 1:** Creating a frequency table and a cumulative frequency table.

```r
set.seed(1)
vec <- sample(c("Geeks", "CSE", "R", "Python"), 50, replace = TRUE)

# generating frequency table
data <- table(vec)

# frequency table
print("Frequency Table")
print(data)

# cumulative frequency table
print("Cumulative Frequency Table")
cumfreq_data <- cumsum(data)
print(cumfreq_data)
```

**Output:**

```
[1] "Frequency Table"
vec
   CSE  Geeks Python      R 
    16     16      7     11 
[1] "Cumulative Frequency Table"
   CSE  Geeks Python      R 
    16     32     39     50 
```

**Example 2:** Creating a probability table.

A probability table gives the fraction of all observations that belong to each class. It is obtained by dividing the frequency table values by the total number of observations, that is, the length of the vector. The result is again a table, printed with the distinct values on the first row and their respective probabilities of occurrence on the row below.

`prob_table <- freq_table / length(vec)`
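The division can also be done with base R's `prop.table()`, which divides each count by the sum of the table. The sketch below compares the two approaches on a small made-up vector; the variable names are illustrative only.

```r
# Illustrative data: 5 "a", 3 "b", 2 "c"
vec <- c("a", "b", "a", "c", "a", "b", "a", "c", "b", "a")
freq_table <- table(vec)

# Manual division by the number of observations
prob_manual <- freq_table / length(vec)
print(prob_manual)

# prop.table() divides by sum(freq_table) instead
prob_builtin <- prop.table(freq_table)
print(prob_builtin)

# Both approaches agree when the vector has no missing values
print(all.equal(as.numeric(prob_manual), as.numeric(prob_builtin)))
```

The two results can differ when the vector contains `NA`: `length(vec)` still counts the missing entries, while the default frequency table does not, so the manually divided proportions would no longer sum to 1.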

**Code:**

```r
set.seed(1)
vec <- sample(c("Geeks", "CSE", "R", "Python"), 50, replace = TRUE)

# generating frequency table
data <- table(vec)

# frequency table
print("Frequency Table")
print(data)

# probability table: counts divided by the number of observations
print("Probability Table")
prob_data <- data / 50
print(prob_data)
```

**Output:**

```
[1] "Frequency Table"
vec
   CSE  Geeks Python      R 
    16     16      7     11 
[1] "Probability Table"
vec
   CSE  Geeks Python      R 
  0.32   0.32   0.14   0.22 
```

**Example 3:** Creating a cumulative frequency and probability table.

All of the results obtained so far can be merged into a single data frame, where each component forms one or more columns. Because `data.frame()` expands each table into a column of values and a column of counts, the column of distinct values appears twice in the result. The data frame column names can be assigned using `colnames(df)`.

```r
set.seed(1)
vec <- sample(c("Geeks", "CSE", "R", "Python"), 50, replace = TRUE)

# generating frequency table
data <- table(vec)

# probability
num_obsrv <- 50
prob_data <- data / num_obsrv

# cumulative frequency
cumfreq_data <- cumsum(data)

# merge the components into one data frame
data_frame <- data.frame(data, cumfreq_data, prob_data)
colnames(data_frame) <- c("data", "frequency",
                          "cumulative_frequency",
                          "data", "probability")
print(data_frame)
```

**Output:**

```
         data frequency cumulative_frequency   data probability
CSE       CSE        16                   16    CSE        0.32
Geeks   Geeks        16                   32  Geeks        0.32
Python Python         7                   39 Python        0.14
R           R        11                   50      R        0.22
```
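The merged data frame above contains the `data` column twice. One possible way to avoid the repetition, shown here as a sketch rather than as the article's method, is to build each column explicitly from plain vectors. The name `summary_df` is chosen for illustration, and `prop.table()` is used in place of dividing by a hard-coded 50.

```r
set.seed(1)
vec <- sample(c("Geeks", "CSE", "R", "Python"), 50, replace = TRUE)

# frequency table of the sampled values
freq <- table(vec)

# Build one column per quantity; as.vector() drops the table
# attributes so each column is a plain vector
summary_df <- data.frame(
  data = names(freq),
  frequency = as.vector(freq),
  cumulative_frequency = as.vector(cumsum(freq)),
  probability = as.vector(prop.table(freq))
)
print(summary_df)
```

Each distinct value then occupies exactly one row with four uniquely named columns, which makes the result easier to pass on to further processing.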