Association Rule Mining in R Programming

Association rule mining in R is an unsupervised learning technique for uncovering how items are associated with each other. Frequent itemset mining shows which items appear together in a transaction or relation. It is mainly used by retailers, grocery stores, and online marketplaces that have large transactional databases. In the same way, social media sites, marketplaces, and e-commerce websites use recommendation engines to predict what you will buy next. The recommendations you see while checking out an order come from association rule mining based on past customer data. There are three common ways to measure association:

  • Support
  • Confidence
  • Lift

Theory

In association rule mining, Support, Confidence, and Lift measure association.

Support says how popular an itemset is, measured as the proportion of transactions in which the itemset appears.

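Confidence says how likely item Y is to be purchased when item X is purchased, expressed as {X -> Y}. It is measured as the proportion of transactions containing item X in which item Y also appears. Confidence can misrepresent the importance of an association because it accounts only for how popular X is, not Y: if Y is popular in general, the confidence will be high even when X and Y are unrelated.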

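Lift says how likely item Y is to be purchased when item X is purchased, while controlling for how popular item Y is. A lift of 1 means no association, a lift greater than 1 means Y is more likely to be bought when X is bought, and a lift less than 1 means Y is less likely to be bought when X is bought.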

The Apriori algorithm is used in association rule mining to discover frequent itemsets in a transaction database. It was proposed by Agrawal and Srikant in 1994.
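The level-wise idea behind Apriori can be sketched in base R. This is a simplified illustration, not the optimized implementation in the arules package: the function names `support` and `apriori_sketch` are made up for this sketch, and the toy baskets are the four transactions used in the example that follows.

```r
# Toy transactions (the four baskets from the worked example).
transactions <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)

# Fraction of transactions that contain every item of `itemset`.
support <- function(itemset, transactions) {
  mean(vapply(transactions, function(tr) all(itemset %in% tr), logical(1)))
}

# Hypothetical helper: frequent k-itemsets are built only from
# frequent (k-1)-itemsets, which is the core Apriori idea.
apriori_sketch <- function(transactions, min_support) {
  items <- sort(unique(unlist(transactions)))
  is_frequent <- function(s) support(s, transactions) >= min_support
  current <- Filter(is_frequent, as.list(items))
  frequent <- list()
  while (length(current) > 0) {
    frequent <- c(frequent, current)
    k <- length(current[[1]]) + 1
    n <- length(current)
    # Join step: unions of two frequent itemsets that have exactly k items.
    candidates <- list()
    if (n >= 2) {
      for (i in 1:(n - 1)) {
        for (j in (i + 1):n) {
          u <- sort(union(current[[i]], current[[j]]))
          if (length(u) == k) candidates <- c(candidates, list(u))
        }
      }
    }
    candidates <- unique(candidates)
    # Prune step: every (k-1)-subset of a candidate must itself be frequent.
    keys <- vapply(current, paste, character(1), collapse = ",")
    candidates <- Filter(function(cand) {
      subsets <- combn(cand, k - 1, simplify = FALSE)
      all(vapply(subsets, paste, character(1), collapse = ",") %in% keys)
    }, candidates)
    # Count step: keep only candidates that reach the minimum support.
    current <- Filter(is_frequent, candidates)
  }
  frequent
}

frequent_sets <- apriori_sketch(transactions, min_support = 0.5)
for (s in frequent_sets) cat("{", paste(s, collapse = ", "), "}\n")
```

The pruning step is what keeps the search manageable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded before their support is ever counted.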

Example:
A customer makes 4 transactions with you. In the first transaction, she buys 1 apple, 1 beer, 1 rice, and 1 chicken. In the second transaction, she buys 1 apple, 1 beer, and 1 rice. In the third transaction, she buys 1 apple and 1 beer only. In the fourth transaction, she buys 1 apple and 1 orange.

Support(Apple) = 4/4 

So, Support of {Apple} is 4 out of 4 or 100%

Confidence(Apple -> Beer) =  Support(Apple, Beer)/Support(Apple)
                          = (3/4)/(4/4)
                          = 3/4

So, Confidence of {Apple -> Beer} is 3 out of 4 or 75%

Lift(Beer -> Rice) = Support(Beer, Rice)/(Support(Beer) * Support(Rice))
                   = (2/4)/((3/4) * (2/4))
                   = 1.33

So, the Lift value is greater than 1, which implies that Rice is likely to be bought if Beer is bought.
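The three calculations above can be checked with a few lines of base R. This is a minimal sketch: the helper names `support`, `confidence`, and `lift` are defined here for illustration and are not functions from any package.

```r
# The four transactions from the example above.
transactions <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)

# Proportion of transactions that contain all of `items`.
support <- function(items) {
  mean(vapply(transactions, function(tr) all(items %in% tr), logical(1)))
}

# Confidence of the rule {lhs -> rhs}.
confidence <- function(lhs, rhs) {
  support(c(lhs, rhs)) / support(lhs)
}

# Lift of the rule {lhs -> rhs}.
lift <- function(lhs, rhs) {
  support(c(lhs, rhs)) / (support(lhs) * support(rhs))
}

support("apple")             # 1
confidence("apple", "beer")  # 0.75
lift("beer", "rice")         # 1.333333
```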

The Dataset

The Market Basket dataset consists of 15010 observations with four columns: Date, Time, Transaction, and Item. The Date column ranges from 30/10/2016 to 09/04/2017. Time is a categorical variable that gives the time of the purchase. Transaction is a quantitative variable that distinguishes one transaction from another. Item is a categorical variable that identifies the product.

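# Loading the arules package, which provides read.transactions()
# (install it first with install.packages("arules") if needed)
library(arules)

# Loading data
dataset = read.transactions('Market_Basket_Optimisation.csv',
                            sep = ',', rm.duplicates = TRUE)
  
# Structure 
str(dataset)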
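To see what read.transactions() does with a basket-format file, the following sketch writes a small, made-up three-line file to a temporary location and reads it back. The file contents and the variable names `tmp` and `toy` are assumptions for illustration only and are not part of the Market Basket dataset.

```r
library(arules)

# Hypothetical basket file: one transaction per line, items separated by commas.
# The second line lists "beer" twice.
tmp <- tempfile(fileext = ".csv")
writeLines(c("apple,beer,rice",
             "beer,apple,beer",
             "orange"), tmp)

# rm.duplicates = TRUE drops repeated items within a transaction.
toy <- read.transactions(tmp, format = "basket", sep = ",",
                         rm.duplicates = TRUE)

summary(toy)   # number of transactions, items, and density
inspect(toy)   # one row per transaction
size(toy)      # number of distinct items in each transaction
```

Each line becomes one transaction, and the repeated "beer" on the second line is counted once.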


Performing Association Rule Mining on Dataset

The Apriori algorithm from the arules package is now applied to the dataset, which includes 15010 observations, using a minimum support of 0.004 and a minimum confidence of 0.2.

# Installing Packages
install.packages("arules")
install.packages("arulesViz")
  
# Loading package
library(arules)
library(arulesViz)
  
# Fitting model
# Training Apriori on the dataset
set.seed(220) # Setting seed
associa_rules = apriori(data = dataset, 
                        parameter = list(support = 0.004,
                                         confidence = 0.2))
  
# Plot
itemFrequencyPlot(dataset, topN = 10)
  
# Visualising the results
inspect(sort(associa_rules, by = 'lift')[1:10])
plot(associa_rules, method = "graph",
     measure = "confidence", shading = "lift")
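The same calls can be tried on data small enough to check by hand. The sketch below runs apriori() on the four transactions from the Theory example; the thresholds of 0.5 and the names `baskets`, `toy`, and `toy_rules` are choices made for this illustration, not values from the article's dataset.

```r
library(arules)

# The four transactions from the Theory example.
baskets <- list(
  c("apple", "beer", "rice", "chicken"),
  c("apple", "beer", "rice"),
  c("apple", "beer"),
  c("apple", "orange")
)
toy <- as(baskets, "transactions")

# minlen = 2 skips rules with an empty left-hand side.
toy_rules <- apriori(toy,
                     parameter = list(support = 0.5, confidence = 0.5,
                                      minlen = 2),
                     control = list(verbose = FALSE))

# Rules ordered by lift, highest first.
inspect(sort(toy_rules, by = "lift"))

# Quality measures of one rule, for comparison with the hand calculation.
beer_rice <- quality(toy_rules)[labels(toy_rules) == "{beer} => {rice}", ]
beer_rice$lift
```

The lift reported for {beer} => {rice} should agree with the value of 1.33 computed by hand in the Theory section.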



Output:

  • Model associa_rules:

    The minimum rule length is 1, the maximum rule length is 10, the target is rules, and the absolute minimum support count is 30.

  • Item Frequency Plot:

    Mineral water is the best-selling product, followed by eggs, spaghetti, french fries, etc.

  • Visualizing the model:

    A graph plot of 100 rules is displayed.

Association rule mining is widely used in industry, for example in recommendation systems for e-commerce sites, online marketplaces, and social media websites.



