# Analyzing Data in Subsets Using R

Last Updated : 27 Mar, 2024

In this article, we will explore various methods to analyze data in subsets using the R programming language.

## How to analyze data in subsets

Analyzing data means applying a range of methods to gain insight, recognize patterns, and draw meaningful conclusions from datasets. This includes computing summary statistics, visualizing data, and identifying trends within the dataset. R offers several functions for analyzing data in subsets, so that each group of rows can be examined on its own. Using these methods, we can work more efficiently. Some of the methods are:

### Analyzing data in subsets by using subset() Function

```r
subset(x, subset, select, ...)
```

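`subset()` returns the rows of a data frame `x` that satisfy the logical condition given in the `subset` argument and, optionally, only the columns named in `select`. In the example below, we create a data frame and split it into two subsets by category.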

```r
# Example data
data <- data.frame(
  ID = 1:10,
  Category = rep(c("A", "B"), each = 5),
  Value = rnorm(10)
)
print(data)

# Subsetting using subset() function
subset_A <- subset(data, Category == "A")
subset_B <- subset(data, Category == "B")

print("Analyzing the data in subsets")
print(subset_A)  # Print subsets
print(subset_B)
```

Output:

```
   ID Category      Value
1   1        A  1.5658719
2   2        A  0.3142731
3   3        A -1.4552153
4   4        A  0.9014216
5   5        A -0.2758858
6   6        B  1.3345081
7   7        B -1.0618629
8   8        B  1.1188082
9   9        B -1.3202145
10 10        B  1.2453632

[1] "Analyzing the data in subsets"
  ID Category      Value
1  1        A  1.5658719
2  2        A  0.3142731
3  3        A -1.4552153
4  4        A  0.9014216
5  5        A -0.2758858

   ID Category     Value
6   6        B  1.334508
7   7        B -1.061863
8   8        B  1.118808
9   9        B -1.320214
10 10        B  1.245363
```
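Printing the subsets is only the first step; the analysis itself usually happens on each subset afterwards. The sketch below is one possible continuation and is not part of the original example. It assumes a fixed random seed (`set.seed(42)`, an arbitrary value added here for reproducibility) and computes the mean and standard deviation of `Value` for each subset.

```r
# Recreate the example data. set.seed() is an addition so that rnorm()
# produces the same values on every run; the seed value is arbitrary.
set.seed(42)
data <- data.frame(
  ID = 1:10,
  Category = rep(c("A", "B"), each = 5),
  Value = rnorm(10)
)

subset_A <- subset(data, Category == "A")
subset_B <- subset(data, Category == "B")

# Summary statistics computed separately for each subset
mean_A <- mean(subset_A$Value)
mean_B <- mean(subset_B$Value)
sd_A <- sd(subset_A$Value)
sd_B <- sd(subset_B$Value)

print(c(mean_A = mean_A, mean_B = mean_B))
print(c(sd_A = sd_A, sd_B = sd_B))
```

Because the values are random, the printed numbers depend on the seed; the point is only that each subset can be summarized independently.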

The next example uses `subset()` on a smaller data frame, splitting it by the `Name` column.

```r
# creating data frame
data <- data.frame(
  ID = 1:6,
  Name = rep(c("X", "Y"), each = 3),
  Value = rnorm(6)
)
print(data)

# Subsetting using subset() function
subset_X <- subset(data, Name == "X")
subset_Y <- subset(data, Name == "Y")

print(" Analyzing the data in subsets")
print(subset_X)
print(subset_Y)
```

Output:

```
  ID Name       Value
1  1    X -0.02737704
2  2    X  0.31270382
3  3    X -0.92980339
4  4    Y  0.43035869
5  5    Y  0.30612408
6  6    Y  0.89034199

[1] " Analyzing the data in subsets"
  ID Name       Value
1  1    X -0.02737704
2  2    X  0.31270382
3  3    X -0.92980339

  ID Name     Value
4  4    Y 0.4303587
5  5    Y 0.3061241
6  6    Y 0.8903420
```
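The syntax of `subset()` also lists a `select` argument, which the examples so far do not use. As a minimal sketch, the code below combines two row conditions with `&` and uses `select` to keep or drop columns. The fixed `Value` numbers are made up for illustration so that the result is the same on every run.

```r
# Small data frame with fixed (illustrative) values instead of rnorm()
data <- data.frame(
  ID = 1:6,
  Name = rep(c("X", "Y"), each = 3),
  Value = c(0.5, -1.2, 2.0, 1.1, -0.3, 0.8)
)

# Rows where Name is "X" and Value is positive; keep only ID and Value
positive_X <- subset(data, Name == "X" & Value > 0, select = c(ID, Value))
print(positive_X)

# A negative selection drops a column: all "Y" rows without the ID column
without_id <- subset(data, Name == "Y", select = -ID)
print(without_id)
```

Here `positive_X` contains the rows with `ID` 1 and 3, and `without_id` contains the three `Y` rows with only the `Name` and `Value` columns.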

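### Subsetting the data frame

This method selects rows by placing a logical condition inside square brackets, as in `df[condition, ]`; leaving the position after the comma empty keeps all columns. In the example below, we create a data frame of test scores, extract the male students, and summarize their scores.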

```r
# Sample data frame
df <- data.frame(
  student_id = 1:10,
  test_score = c(80, 85, 90, 75, 95, 82, 78, 88, 92, 70),
  gender = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F")
)

# Subset of male students
male_students <- df[df$gender == "M", ]
print(male_students)

print("Analyzing the data ")
# Summary statistics for male students
summary(male_students$test_score)
```

Output:

```
  student_id test_score gender
1          1         80      M
3          3         90      M
5          5         95      M
7          7         78      M
9          9         92      M

[1] "Analyzing the data "
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     78      80      90      87      92      95 
```
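Bracket indexing is not limited to a single condition or to whole rows. The sketch below, which is an illustration rather than part of the original example, reuses the same `df`, combines two conditions with `&`, picks columns by name, and compares the mean score of the two gender subsets.

```r
df <- data.frame(
  student_id = 1:10,
  test_score = c(80, 85, 90, 75, 95, 82, 78, 88, 92, 70),
  gender = c("M", "F", "M", "F", "M", "F", "M", "F", "M", "F")
)

# Male students scoring 90 or more; keep only two columns
high_male <- df[df$gender == "M" & df$test_score >= 90,
                c("student_id", "test_score")]
print(high_male)

# Mean test score of each subset
male_mean <- mean(df$test_score[df$gender == "M"])
female_mean <- mean(df$test_score[df$gender == "F"])
print(c(male = male_mean, female = female_mean))
```

With this data, `high_male` holds students 3, 5, and 9, and the means are 87 for male students and 80 for female students.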

In the next example, we create a data frame of sales transactions and extract the rows for the Electronics category.

```r
# Sample sales data
sales_data <- data.frame(
  transaction_id = 1:24,
  product_category = rep(c("Electronics", "Clothing", "Books"), each = 8),
  sales_amount = c(150, 200, 100, 120, 180, 80, 70, 90,
                   110, 95, 250, 300, 280, 320, 270, 40,
                   60, 50, 55, 45, 65, 78, 89, 34)
)

# Subset of sales data for Electronics category
electronics_sales <- sales_data[sales_data$product_category == "Electronics", ]

# Displaying the subset
print(electronics_sales)
```

Output:

```
  transaction_id product_category sales_amount
1              1      Electronics          150
2              2      Electronics          200
3              3      Electronics          100
4              4      Electronics          120
5              5      Electronics          180
6              6      Electronics           80
7              7      Electronics           70
8              8      Electronics           90
```
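Extracting one category at a time works, but when every category needs the same statistic it can be simpler to let R do the grouping. The sketch below is an addition to the article's examples: it uses the base R functions `tapply()` and `aggregate()` on the same `sales_data` to compute the total and mean sales per category.

```r
sales_data <- data.frame(
  transaction_id = 1:24,
  product_category = rep(c("Electronics", "Clothing", "Books"), each = 8),
  sales_amount = c(150, 200, 100, 120, 180, 80, 70, 90,
                   110, 95, 250, 300, 280, 320, 270, 40,
                   60, 50, 55, 45, 65, 78, 89, 34)
)

# Total sales per category, returned as a named array
totals <- tapply(sales_data$sales_amount, sales_data$product_category, sum)
print(totals)

# Mean sales per category, returned as a data frame
category_means <- aggregate(sales_amount ~ product_category,
                            data = sales_data, FUN = mean)
print(category_means)
```

For this data the totals are 476 for Books, 1665 for Clothing, and 990 for Electronics, and the Electronics total matches the sum of the subset shown in the output.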

## Conclusion

In conclusion, we looked at two ways to analyze data in subsets in R: the `subset()` function and bracket indexing on a data frame. Both make it easy to isolate a group of rows and examine it on its own.