Inductive Learning Algorithm
Inductive Learning Algorithm (ILA) is an iterative, inductive machine learning algorithm that generates a set of classification rules of the form “IF-THEN” from a set of examples. It produces rules at each iteration and appends them to the rule set.
Basic Idea: There are basically two ways to extract knowledge: from domain experts and through machine learning. For very large amounts of data, relying on domain experts is neither practical nor reliable, so we move towards the machine learning approach. One way to use machine learning is to replicate the experts' logic in the form of algorithms, but this work is tedious, time-consuming and expensive. So we move towards inductive algorithms, which generate the strategy for performing a task themselves and do not need to be instructed separately at each step.
Need of ILA in presence of other machine learning algorithms: ILA was introduced even though other inductive learning algorithms such as ID3 and AQ were already available.
- The need arose from pitfalls in the earlier algorithms; one of the major ones was a lack of generalisation in the rules they produced.
- ID3 produces decision trees (AQ produces rules directly), and the results were too specific, difficult to analyse and very slow to obtain even for basic, short classification problems.
- A decision tree-based algorithm was unable to handle a new problem if some attribute values were missing.
- ILA produces a general set of rules instead of a decision tree, which overcomes the above problems.
THE ILA ALGORITHM: General requirements at the start of the algorithm:-
- List the examples in the form of a table ‘T’, where each row corresponds to an example and each column contains an attribute value.
- Create a set of m training examples, each composed of k attributes and a class attribute with n possible decisions.
- Create a rule set, R, that is initially empty.
- Initially, all rows in the table are unmarked.
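The starting state described above can be sketched in Python. This is only an illustrative setup under assumptions: the list-of-dicts layout, the names `T`, `R`, `marked`, `ATTRIBUTES` and `CLASS_ATTR`, and the toy values are hypothetical (they are not the table used in the example later in this article), and the rule set is assumed to start out empty.

```python
# Hypothetical toy table T: one dict per example (row), one key per column.
ATTRIBUTES = ["size", "color", "shape"]   # the k ordinary attributes
CLASS_ATTR = "decision"                   # the class attribute

T = [
    {"size": "medium", "color": "blue",  "shape": "brick",  "decision": "yes"},
    {"size": "small",  "color": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "small",  "color": "red",   "shape": "sphere", "decision": "yes"},
    {"size": "large",  "color": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "large",  "color": "green", "shape": "pillar", "decision": "yes"},
    {"size": "large",  "color": "red",   "shape": "pillar", "decision": "no"},
    {"size": "large",  "color": "green", "shape": "sphere", "decision": "yes"},
]

m = len(T)                                # number of training examples
k = len(ATTRIBUTES)                       # number of attributes per example
n = len({row[CLASS_ATTR] for row in T})   # number of possible decisions

R = []                                    # rule set, assumed empty at the start
marked = [False] * m                      # every row starts out unmarked
```

With this layout, a rule can later be stored as a pair of a condition dict and a decision, for example `({"color": "green"}, "yes")`.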
Steps in the algorithm:-
Step 1: Divide the table ‘T’ containing m examples into n sub-tables (t1, t2, …, tn), one for each possible value of the class attribute. (Repeat steps 2-8 for each sub-table.)
Step 2: Initialize the attribute combination count ‘j’ = 1.
Step 3: For the sub-table under consideration, divide the attribute list into distinct combinations, each with ‘j’ distinct attributes.
Step 4: For each combination of attributes, count the number of occurrences of attribute values that appear under that combination in unmarked rows of the sub-table under consideration and, at the same time, do not appear under the same combination of attributes in the other sub-tables. Call the first combination with the maximum number of occurrences the max-combination ‘MAX’.
Step 5: If ‘MAX’ == null, increase ‘j’ by 1 and go to Step 3.
Step 6: Mark as classified all rows of the sub-table under consideration in which the values of ‘MAX’ appear.
Step 7: Add a rule (IF attribute = “XYZ” –> THEN decision is YES/NO) to R whose left-hand side contains the attribute names of ‘MAX’ with their values, separated by AND, and whose right-hand side contains the decision attribute value associated with the sub-table.
Step 8: If all rows are marked as classified, move on to the next sub-table and go to Step 2; otherwise, go to Step 4. If no sub-tables remain, exit with the set of rules obtained so far.
An example showing the use of ILA: Suppose an example set has the attributes place type, weather, location and decision, and contains seven examples. Our task is to generate a set of rules that state under what conditions each decision is made.
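Before turning to the example, the eight steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `ila`, the list-of-dicts table layout and the toy data (attributes `size`, `color`, `shape`) are hypothetical and are not the table of this article's example. The `j <= len(attributes)` check is an added safeguard so the loop stops if contradictory examples leave a row unclassifiable.

```python
from itertools import combinations

def ila(examples, attributes, class_attr):
    """Return a list of (conditions, decision) rules; a sketch of ILA."""
    rules = []
    classes = []
    for ex in examples:                       # keep class values in first-seen order
        if ex[class_attr] not in classes:
            classes.append(ex[class_attr])
    for decision in classes:                  # Step 1: one sub-table per class value
        sub = [ex for ex in examples if ex[class_attr] == decision]
        others = [ex for ex in examples if ex[class_attr] != decision]
        marked = [False] * len(sub)
        j = 1                                 # Step 2
        while not all(marked) and j <= len(attributes):
            best, best_count = None, 0
            for combo in combinations(attributes, j):          # Step 3
                # Value tuples that also occur in other sub-tables are excluded.
                forbidden = {tuple(ex[a] for a in combo) for ex in others}
                counts = {}
                for ex, done in zip(sub, marked):              # Step 4
                    values = tuple(ex[a] for a in combo)
                    if not done and values not in forbidden:
                        counts[values] = counts.get(values, 0) + 1
                for values, count in counts.items():
                    if count > best_count:    # strict ">" keeps the first maximum
                        best, best_count = (combo, values), count
            if best is None:                  # Step 5: nothing found, widen combos
                j += 1
                continue
            combo, values = best
            for i, ex in enumerate(sub):      # Step 6: mark the covered rows
                if tuple(ex[a] for a in combo) == values:
                    marked[i] = True
            rules.append((dict(zip(combo, values)), decision))  # Step 7
        # Step 8: the while loop ends when every row of this sub-table is marked.
    return rules

# Hypothetical toy data, made up for illustration only.
toy = [
    {"size": "medium", "color": "blue",  "shape": "brick",  "decision": "yes"},
    {"size": "small",  "color": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "small",  "color": "red",   "shape": "sphere", "decision": "yes"},
    {"size": "large",  "color": "red",   "shape": "wedge",  "decision": "no"},
    {"size": "large",  "color": "green", "shape": "pillar", "decision": "yes"},
    {"size": "large",  "color": "red",   "shape": "pillar", "decision": "no"},
    {"size": "large",  "color": "green", "shape": "sphere", "decision": "yes"},
]

toy_rules = ila(toy, ["size", "color", "shape"], "decision")
for conditions, decision in toy_rules:
    print(" AND ".join(f"{a} = {v}" for a, v in conditions.items()), "->", decision)
# color = green -> yes
# size = medium -> yes
# shape = sphere -> yes
# shape = wedge -> no
# size = large AND color = red -> no
```

On this toy data the last rule needs two attributes, because no single attribute value of the remaining row is unique to the "no" sub-table; that is the situation Step 5 handles by increasing j.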
| Example no. | Place type | Weather | Location | Decision |
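Step 1: The examples are divided into subsets by decision value; subset 1 contains the examples whose decision is yes (rows 1-4), and the remaining rows (5-7) have the decision no.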
Steps 2-8:
- At iteration 1, rows 3 & 4 are selected under the column weather, and rows 3 & 4 are marked. The rule “IF weather is warm THEN the decision is yes” is added to R.
- At iteration 2, row 1 is selected under the column place type, and row 1 is marked. The rule “IF place type is hilly THEN the decision is yes” is added to R.
- At iteration 3, row 2 is selected under the column location, and row 2 is marked. The rule “IF location is Shimla THEN the decision is yes” is added to R.
- At iteration 4, rows 5 & 6 are selected under the column location, and rows 5 & 6 are marked. The rule “IF location is Mumbai THEN the decision is no” is added to R.
- At iteration 5, row 7 is selected under the columns place type & weather, and row 7 is marked. The rule “IF place type is beach AND weather is windy THEN the decision is no” is added to R.
Finally, we get the rule set:-
- Rule 1: IF the weather is warm THEN the decision is yes.
- Rule 2: IF place type is hilly THEN the decision is yes.
- Rule 3: IF location is Shimla THEN the decision is yes.
- Rule 4: IF location is Mumbai THEN the decision is no.
- Rule 5: IF place type is beach AND the weather is windy THEN the decision is no.
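Once a rule set like this exists, it can be used to classify a new example. The sketch below encodes the five rules above and returns the decision of the first rule whose conditions all hold. The first-match policy, the function name `classify`, the `default` fallback and the sample inputs (including the values "Goa", "city", "cold" and "Delhi") are assumptions for illustration; the article does not specify how the rules are applied.

```python
# The five rules from the rule set above, as (conditions, decision) pairs.
RULES = [
    ({"weather": "warm"}, "yes"),
    ({"place type": "hilly"}, "yes"),
    ({"location": "Shimla"}, "yes"),
    ({"location": "Mumbai"}, "no"),
    ({"place type": "beach", "weather": "windy"}, "no"),
]

def classify(example, rules, default=None):
    """Return the decision of the first rule that matches, else `default`."""
    for conditions, decision in rules:
        # A rule matches when every attribute in its IF part has the stated value.
        if all(example.get(attr) == value for attr, value in conditions.items()):
            return decision
    return default

# Hypothetical new examples, not taken from the training table.
windy_beach = {"place type": "beach", "weather": "windy", "location": "Goa"}
unknown = {"place type": "city", "weather": "cold", "location": "Delhi"}

print(classify(windy_beach, RULES))   # no   (Rule 5 matches)
print(classify(unknown, RULES))       # None (no rule matches)
```

An example that matches no rule falls through to `default`, which makes it explicit that the rule set only covers the patterns seen in the training examples.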