Association Rule Mining in R is an unsupervised learning technique for uncovering how items are associated with each other. Frequent itemset mining shows which items appear together in a transaction or relation. It is mainly used by retailers, grocery stores, and online marketplaces that have large transactional databases. Social media platforms, marketplaces, and e-commerce websites use it in the same way: their recommendation engines predict what you are likely to buy next. The recommendations you see for an item while checking out an order come from association rule mining performed on past customer data. There are three common ways to measure association:

- Support
- Confidence
- Lift

#### Theory

In association rule mining, Support, Confidence, and Lift measure association.

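**Support** says how popular an itemset is, measured as the proportion of transactions in which the itemset appears.

**Confidence** says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}.

It is measured as the proportion of transactions containing item X in which item Y also appears. Confidence can misrepresent the importance of an association because it accounts only for how popular X is, not Y: if Y is very popular overall, confidence is high even when X and Y are unrelated.

**Lift** says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. A lift greater than 1 means Y is more likely to be bought when X is bought, and a lift less than 1 means it is less likely.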

The Apriori algorithm is commonly used in association rule mining to discover frequent itemsets in a transaction database. It was proposed by Agrawal & Srikant in 1994.
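The level-wise idea behind Apriori can be sketched in base R as follows. This is a simplified illustration, not the optimized implementation in the arules package; the function name `apriori_sketch`, the helper `support`, and the toy baskets are hypothetical and chosen only for this example.

```r
# Toy baskets (illustrative data only)
transactions <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)

# Share of transactions that contain every item in `itemset`
support <- function(itemset, transactions) {
  mean(vapply(transactions, function(t) all(itemset %in% t), logical(1)))
}

# Hypothetical helper: level-wise search for frequent itemsets
apriori_sketch <- function(transactions, min_support) {
  items <- sort(unique(unlist(transactions)))
  # Level 1: keep the single items that meet the support threshold
  current <- Filter(function(s) support(s, transactions) >= min_support,
                    as.list(items))
  frequent <- list()
  while (length(current) > 0) {
    frequent <- c(frequent, current)
    k <- length(current[[1]])
    n <- length(current)
    # Join step: merge pairs of frequent k-itemsets into (k+1)-itemsets
    candidates <- list()
    if (n >= 2) {
      for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
          cand <- sort(union(current[[i]], current[[j]]))
          if (length(cand) == k + 1) candidates <- c(candidates, list(cand))
        }
      }
    }
    candidates <- unique(candidates)
    # Prune step: every k-item subset of a candidate must itself be frequent
    keys <- vapply(current, paste, character(1), collapse = ",")
    candidates <- Filter(function(cand) {
      subsets <- combn(cand, k, simplify = FALSE)
      all(vapply(subsets, function(s) paste(s, collapse = ",") %in% keys,
                 logical(1)))
    }, candidates)
    # Count step: keep only the candidates that meet the threshold
    current <- Filter(function(s) support(s, transactions) >= min_support,
                      candidates)
  }
  frequent
}

frequent_itemsets <- apriori_sketch(transactions, min_support = 0.5)
for (s in frequent_itemsets) cat("{", paste(s, collapse = ", "), "}\n")
```

The pruning step is what makes the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded before their support is ever counted.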

**Example:**

A customer makes 4 transactions with you. In the first transaction, she buys 1 apple, 1 beer, 1 rice, and 1 chicken. In the second transaction, she buys 1 apple, 1 beer, and 1 rice. In the third transaction, she buys only 1 apple and 1 beer. In the fourth transaction, she buys 1 apple and 1 orange.

- Support(Apple) = 4/4. So, the Support of {Apple} is 4 out of 4, or 100%.
- Confidence(Apple -> Beer) = Support(Apple, Beer) / Support(Apple) = (3/4) / (4/4) = 3/4. So, the Confidence of {Apple -> Beer} is 3 out of 4, or 75%.
- Lift(Beer -> Rice) = Support(Beer, Rice) / (Support(Beer) * Support(Rice)) = (2/4) / ((3/4) * (2/4)) = 1.33. A Lift value greater than 1 implies that Rice is likely to be bought if Beer is bought.
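The same numbers can be reproduced with a few lines of base R. This is a minimal sketch: the helper names `support`, `confidence`, and `lift` are defined here for illustration and are not functions from any package.

```r
# The four transactions from the example above
transactions <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)

# Support: share of transactions containing every item in `items`
support <- function(items) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

# Confidence of the rule {lhs -> rhs}
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

# Lift of the rule {lhs -> rhs}
lift <- function(lhs, rhs) {
  support(c(lhs, rhs)) / (support(lhs) * support(rhs))
}

print(support("apple"))             # 4/4
print(confidence("apple", "beer"))  # (3/4) / (4/4)
print(lift("beer", "rice"))         # (2/4) / ((3/4) * (2/4))
```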

#### The Dataset

The **Market Basket** dataset consists of 15010 observations with Date, Time, Transaction, and Item columns. The Date column ranges from 30/10/2016 to 09/04/2017. Time is a categorical variable that records the time of the transaction. Transaction is a quantitative variable that distinguishes one transaction from another. Item is a categorical variable that identifies the product.

```r
# Loading the arules package, which provides read.transactions()
library(arules)

# Loading data
dataset = read.transactions('Market_Basket_Optimisation.csv',
                            sep = ',', rm.duplicates = TRUE)

# Structure
str(dataset)
```


#### Performing Association Rule Mining on Dataset

The Apriori algorithm is now applied to the dataset, which includes 15010 observations.

```r
# Installing packages (only needed once)
install.packages("arules")
install.packages("arulesViz")

# Loading packages
library(arules)
library(arulesViz)

# Fitting model
# Training Apriori on the dataset
set.seed(220) # Setting seed
associa_rules = apriori(data = dataset,
                        parameter = list(support = 0.004,
                                         confidence = 0.2))

# Plot of the 10 most frequent items
itemFrequencyPlot(dataset, topN = 10)

# Visualising the results: top 10 rules by lift, then a graph of the rules
inspect(sort(associa_rules, by = 'lift')[1:10])
plot(associa_rules, method = "graph",
     measure = "confidence", shading = "lift")
```


**Output:**

**Model associa_rules:** The model's minimum rule length is 1, the maximum length is 10, and the target is rules, with an absolute minimum support count of 30.

**Item Frequency Plot:** Mineral water is the best-selling product, followed by eggs, spaghetti, french fries, etc.

**Visualizing the model:**

A graph plot of 100 rules is displayed.

Association rule mining is widely used in industry, particularly in the recommendation systems of e-commerce sites, online marketplaces, and social media websites.
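To experiment with `apriori()` without the CSV file, a small in-memory example can help. The sketch below assumes the arules package is installed and reuses the four toy baskets from the theory example; the thresholds (support 0.5, confidence 0.6) and the variable names are arbitrary choices for illustration, not values from the analysis above.

```r
library(arules)

# Toy baskets (illustrative data only)
baskets <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)

# Convert the list of item vectors into a transactions object
trans <- as(baskets, "transactions")

# Mine rules with at least two items; thresholds are arbitrary for the demo
rules <- apriori(trans,
                 parameter = list(support = 0.5, confidence = 0.6, minlen = 2),
                 control = list(verbose = FALSE))

# Keep only rules whose lift exceeds 1 and show them sorted by lift
strong_rules <- subset(rules, subset = lift > 1)
inspect(sort(strong_rules, by = "lift"))
```

Filtering on lift after mining is a common way to discard rules that merely reflect a popular consequent, such as any rule ending in {apple} here.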
