Ripper Algorithm

RIPPER stands for Repeated Incremental Pruning to Produce Error Reduction. It is a widely used rule-based classification (rule induction) algorithm that derives a set of rules from the training set.

Uses of Ripper Algorithm:

It works well on datasets with imbalanced class distributions. A dataset has an imbalanced class distribution when most of its records belong to one class and only a few records belong to the remaining classes.
It works well with noisy datasets because it uses a validation set to prune rules, which helps prevent overfitting.

Working of RIPPER

Case I: Training records belong to only two classes

Among the given records, it identifies the majority class (the one that appears most often) and takes it as the default class. For example, if there are 100 records, of which 80 belong to Class A and 20 belong to Class B, then Class A is the default class.

For the other class, it learns rules that detect that class.

Case II: Training records have more than two classes ( Multiple Classes )

Arrange all the available classes in order of their frequency (say, in increasing order).

Suppose the classes are arranged as:

C1, C2, C3, ..., Cn
C1 - least frequent
Cn - most frequent

The class with the maximum frequency (Cn) is taken as the default class.
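
The ordering step can be sketched in a few lines of Python. This is only an illustration: the function name order_classes and the alphabetical tie-breaking are assumptions made for the example, not part of RIPPER itself.

```python
from collections import Counter

def order_classes(labels):
    """Return the class labels sorted from least to most frequent."""
    counts = Counter(labels)
    # Sort by count; ties are broken by label (an assumption) so the order is deterministic.
    return sorted(counts, key=lambda c: (counts[c], c))

labels = ["A"] * 80 + ["B"] * 20   # the 100-record example from Case I
ordered = order_classes(labels)    # least frequent first
default_class = ordered[-1]        # the most frequent class becomes the default
rule_classes = ordered[:-1]        # rules are learned only for these classes
```

Here default_class is "A" and rules would be learned only for "B", which matches the two-class case described earlier.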

How the rule is Derived:

First, it derives rules for the records that belong to class C1. Records belonging to C1 are treated as positive examples (+ve) and records of all other classes are treated as negative examples (-ve).

The Sequential Covering Algorithm is used to generate rules that discriminate between the +ve and -ve examples.

Next, RIPPER derives rules for C2, distinguishing it from the remaining classes.

This process is repeated until the stopping criterion is met, which is when only Cn (the default class) is left.

RIPPER thus extracts rules in order, from the minority classes to the majority class.
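
The class-by-class loop can be sketched as follows. This is a simplified illustration rather than a full RIPPER implementation: a rule is represented as a dict of attribute tests, the names learn_ruleset and first_positive_as_rule are hypothetical, the rule learner is passed in as a function, and the stopping conditions for the ruleset are left out.

```python
from collections import Counter

def covers(rule, record):
    """A rule is a dict {attribute: value}; it covers a record when every test matches."""
    return all(record.get(a) == v for a, v in rule.items())

def learn_ruleset(records, labels, learn_rule):
    counts = Counter(labels)
    ordered = sorted(counts, key=lambda c: (counts[c], c))  # least frequent first
    default_class = ordered[-1]
    remaining = list(zip(records, labels))
    ruleset = []
    for cls in ordered[:-1]:  # the default class gets no rules
        # Sequential covering: keep learning rules while records of cls remain.
        while any(y == cls for _, y in remaining):
            pos = [x for x, y in remaining if y == cls]  # +ve examples
            neg = [x for x, y in remaining if y != cls]  # -ve examples
            rule = learn_rule(pos, neg)
            # Guard (an assumption of this sketch): stop if the learner makes no progress.
            if rule is None or not any(covers(rule, x) for x in pos):
                break
            ruleset.append((rule, cls))
            # Remove every record the new rule covers, positive or negative.
            remaining = [(x, y) for x, y in remaining if not covers(rule, x)]
    return ruleset, default_class

# Hypothetical stand-in learner: turns the first positive record into a rule.
def first_positive_as_rule(pos, neg):
    return dict(pos[0]) if pos else None

records = [{"color": "red"}, {"color": "red"},
           {"color": "blue"}, {"color": "blue"}, {"color": "blue"}]
labels = ["C1", "C1", "C2", "C2", "C2"]
ruleset, default_class = learn_ruleset(records, labels, first_positive_as_rule)
```

In this toy run a single rule (color = red) is learned for C1, and C2 is left as the default class.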

Rule Growing in RIPPER Algorithm:

RIPPER uses a general-to-specific strategy for growing rules. It starts from an empty rule and keeps adding the best conjunct to the rule antecedent.
The metric used to evaluate conjuncts is FOIL's Information Gain. The conjunct with the highest gain is chosen.
Stopping criterion for adding conjuncts: the rule stops growing once it no longer covers any negative (-ve) examples.
The new rule is pruned based on its performance on the validation set.
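
A minimal sketch of rule growing is given below. It uses the usual form of FOIL's Information Gain, p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0))), where p0 and n0 are the positive and negative examples covered before adding a conjunct and p1 and n1 are those covered after. The sketch assumes categorical attributes with equality tests only and stops once no negative example is covered. The function names and the toy records are made up for illustration.

```python
import math

def foil_gain(p0, n0, p1, n1):
    """FOIL's Information Gain for extending a rule with one conjunct."""
    if p0 == 0 or p1 == 0:
        return 0.0  # convention used here: no positives covered means no gain
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

def grow_rule(pos, neg):
    """Grow a rule {attribute: value} from general (empty) to specific."""
    rule = {}
    while neg:  # keep adding conjuncts while some negative example is still covered
        p0, n0 = len(pos), len(neg)
        best, best_gain = None, 0.0
        # Candidate conjuncts: attribute = value tests seen in the covered positives.
        # Sorting assumes comparable values and only makes tie-breaking deterministic.
        candidates = sorted({(a, v) for r in pos for a, v in r.items() if a not in rule})
        for a, v in candidates:
            p1 = sum(1 for r in pos if r.get(a) == v)
            n1 = sum(1 for r in neg if r.get(a) == v)
            gain = foil_gain(p0, n0, p1, n1)
            if gain > best_gain:
                best, best_gain = (a, v), gain
        if best is None:
            break  # no conjunct improves the rule
        a, v = best
        rule[a] = v
        # Keep only the examples still covered by the extended rule.
        pos = [r for r in pos if r.get(a) == v]
        neg = [r for r in neg if r.get(a) == v]
    return rule

pos = [{"outlook": "sunny", "windy": "no"}, {"outlook": "sunny", "windy": "no"}]
neg = [{"outlook": "sunny", "windy": "yes"},
       {"outlook": "rainy", "windy": "no"},
       {"outlook": "rainy", "windy": "yes"}]
rule = grow_rule(pos, neg)
```

On this toy data the grown rule is outlook = sunny AND windy = no, which covers both positive records and none of the negative ones.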

Rule Pruning Using RIPPER:

To decide whether a particular rule should be pruned, the following metric is used:

(P-N)/(P+N)

P = number of positive examples in the validation set covered by the rule.
N = number of negative examples in the validation set covered by the rule.

Whenever a conjunct is considered for removal, we calculate the value of the above metric for the original rule (before removal) and for the new rule (after removal).
If the value for the new rule is better than that of the original rule, the conjunct is removed. Otherwise, the conjunct is kept.
Pruning is done starting from the rightmost end, that is, from the conjunct that was added last. For example, consider a rule:

ABCD ---> Y, where A, B, C, D are conjuncts and Y is the class.

First, RIPPER removes the conjunct D and measures the metric value. If the metric improves, the conjunct D is removed.

It then checks in the same way whether CD, BCD, and so on should be pruned.
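
The pruning step can be sketched as below. The sketch evaluates every rule obtained by deleting a final run of conjuncts (D, then CD, then BCD) and keeps the version with the best (P-N)/(P+N) value on the validation set. Keeping the longer rule on ties and scoring a rule that covers nothing as -1 are assumptions of this example, as are the function names and the toy validation records.

```python
def covers(conjuncts, record):
    """conjuncts is an ordered list of (attribute, value) tests."""
    return all(record.get(a) == v for a, v in conjuncts)

def prune_metric(conjuncts, val_pos, val_neg):
    """(P - N) / (P + N) on the validation set."""
    p = sum(covers(conjuncts, r) for r in val_pos)
    n = sum(covers(conjuncts, r) for r in val_neg)
    return (p - n) / (p + n) if p + n else -1.0  # assumption: empty coverage scores worst

def prune_rule(conjuncts, val_pos, val_neg):
    """Try deleting D, then CD, then BCD, and keep the best-scoring version."""
    best = list(conjuncts)
    best_score = prune_metric(best, val_pos, val_neg)
    for k in range(len(conjuncts) - 1, 0, -1):  # always keep at least one conjunct
        candidate = list(conjuncts[:k])
        score = prune_metric(candidate, val_pos, val_neg)
        if score > best_score:  # strict: ties keep the longer rule
            best, best_score = candidate, score
    return best

def rec(a, b, c, d):
    return {"A": a, "B": b, "C": c, "D": d}

rule = [("A", 1), ("B", 1), ("C", 1), ("D", 1)]  # ABCD ---> Y
val_pos = [rec(1, 1, 1, 1), rec(1, 1, 1, 0), rec(1, 1, 1, 0), rec(1, 1, 0, 0)]
val_neg = [rec(1, 1, 1, 1), rec(1, 1, 0, 1), rec(1, 1, 0, 0),
           rec(1, 0, 0, 0), rec(1, 0, 1, 1)]
pruned = prune_rule(rule, val_pos, val_neg)
```

For this validation data the metric is 0 for ABCD, 0.5 for ABC, about 0.14 for AB and about -0.11 for A, so only the conjunct D is pruned.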

Building the Ruleset in RIPPER Algorithm:

After a rule is derived, all the positive and negative examples covered by the rule are eliminated.
The rule is then added to the ruleset, as long as it does not violate the stopping condition. The stopping criteria that can be used are:

A) Minimum description length principle: The description length is the number of bits needed to encode the ruleset together with the examples it misclassifies, and we want the ruleset to be represented using as few bits as possible. If the new rule increases the total description length of the ruleset by at least d bits (by default d is 64 bits), then RIPPER stops adding rules to the ruleset.

B) Error rate: We calculate the error rate (misclassification rate) of the rule on the validation set. The error rate of a rule should not exceed 50%.
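
The two checks can be combined in a small helper, sketched below. How the description length is computed is not covered here, so the sketch simply takes the description lengths before and after adding the rule as inputs. The names should_stop, dl_before and dl_after are hypothetical.

```python
def rule_error_rate(p, n):
    """Fraction of validation records covered by the rule that are negative."""
    return n / (p + n) if p + n else 1.0  # assumption: a rule covering nothing counts as all error

def should_stop(p, n, dl_before, dl_after, d=64):
    """Return True if the new rule should not be added to the ruleset.

    p, n      -- positive / negative validation examples covered by the rule
    dl_before -- description length (in bits) of the ruleset without the rule
    dl_after  -- description length (in bits) of the ruleset with the rule
    d         -- allowed increase in bits (64 by default)
    """
    too_long = (dl_after - dl_before) >= d
    too_many_errors = rule_error_rate(p, n) > 0.5
    return too_long or too_many_errors

accept = should_stop(8, 2, 100.0, 120.0)             # 20% error, +20 bits
reject_error = should_stop(2, 8, 100.0, 120.0)       # 80% error
reject_length = should_stop(8, 2, 100.0, 170.0)      # +70 bits
```

Note that an error rate above 50% is the same condition as N > P, that is, a negative value of the pruning metric (P-N)/(P+N).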

This is how the RIPPER algorithm works.