Training Dataset

This follows an example from Quinlan's ID3. Each tuple describes a customer; the class label is buys_computer:

age      income   student  credit_rating  buys_computer
<=30     high     no       fair           no
<=30     high     no       excellent      no
31…40    high     no       fair           yes
>40      medium   no       fair           yes
>40      low      yes      fair           yes
>40      low      yes      excellent      no
31…40    low      yes      excellent      yes
<=30     medium   no       fair           no
<=30     low      yes      fair           yes
>40      medium   yes      fair           yes
<=30     medium   yes      excellent      yes
31…40    medium   no       excellent      yes
31…40    high     yes      fair           yes
>40      medium   no       excellent      no
Algorithm for Decision Tree Induction
Basic algorithm (a greedy algorithm)
- Tree is constructed in a top-down, recursive, divide-and-conquer manner
- At start, all the training examples are at the root
- Attributes are categorical (if continuous-valued, they are discretized in advance)
- Examples are partitioned recursively based on selected attributes
- Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain)
Conditions for stopping partitioning (see the sketch after this list)
- All samples for a given node belong to the same class
- There are no remaining attributes for further partitioning – majority voting is employed for classifying the leaf
- There are no samples left
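Taken together, the greedy recursion and the three stopping conditions map onto a short Python sketch. This is a minimal sketch, not Quinlan's original formulation: the dict-based tree encoding and all helper names are my own, and `choose_attribute` (which picks the highest-gain attribute) is sketched in the next section.

```python
from collections import Counter

def majority_class(examples):
    """Most frequent class label among the given examples."""
    return Counter(e['class'] for e in examples).most_common(1)[0][0]

def id3(examples, attributes, parent_examples=None):
    """Greedy, top-down, recursive divide-and-conquer tree induction.

    `examples` is a list of dicts mapping attribute names (plus the key
    'class') to categorical values; `attributes` lists the attributes
    still available for testing. Returns a class label (leaf) or a
    nested dict {attribute: {value: subtree}}.
    """
    # Stopping condition: no samples left -> majority class of the parent.
    if not examples:
        return majority_class(parent_examples)
    classes = [e['class'] for e in examples]
    # Stopping condition: all samples belong to the same class.
    if len(set(classes)) == 1:
        return classes[0]
    # Stopping condition: no remaining attributes -> majority voting.
    if not attributes:
        return majority_class(examples)
    # Greedy step: select the test attribute by information gain.
    best = choose_attribute(examples, attributes)  # sketched below
    remaining = [a for a in attributes if a != best]
    # Partition the examples recursively on the selected attribute.
    return {best: {value: id3([e for e in examples if e[best] == value],
                              remaining, examples)
                   for value in {e[best] for e in examples}}}
```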
Attribute Selection Measure: Information Gain (ID3/C4.5)
- Select the attribute with the highest information gain
- S contains s_i tuples of class C_i, for i = 1, …, m (s = s_1 + ⋯ + s_m tuples in total)
- Expected information needed to classify an arbitrary tuple:

$$I(s_1, s_2, \ldots, s_m) = -\sum_{i=1}^{m} \frac{s_i}{s} \log_2 \frac{s_i}{s}$$

- Entropy of attribute A with values {a_1, a_2, …, a_v}, where s_{ij} is the number of tuples of class C_i in the subset S_j (the tuples with A = a_j):

$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s}\, I(s_{1j}, \ldots, s_{mj})$$

- Information gained by branching on attribute A:

$$\mathrm{Gain}(A) = I(s_1, s_2, \ldots, s_m) - E(A)$$
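These three quantities translate directly into code. A minimal Python sketch, assuming the same dict-based example format as above (all function names here are my own):

```python
from collections import Counter
from math import log2

def info(counts):
    """I(s1, ..., sm): expected information needed to classify a tuple,
    given the per-class tuple counts."""
    s = sum(counts)
    return -sum((si / s) * log2(si / s) for si in counts if si > 0)

def entropy_of_attribute(examples, attribute):
    """E(A): information weighted over the partitions induced by A."""
    s = len(examples)
    e = 0.0
    for ex_value in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == ex_value]
        counts = Counter(ex['class'] for ex in subset).values()
        e += (len(subset) / s) * info(counts)
    return e

def gain(examples, attribute):
    """Gain(A) = I(s1, ..., sm) - E(A)."""
    counts = Counter(ex['class'] for ex in examples).values()
    return info(counts) - entropy_of_attribute(examples, attribute)

def choose_attribute(examples, attributes):
    """Select the attribute with the highest information gain."""
    return max(attributes, key=lambda a: gain(examples, a))
```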
Attribute Selection by Information Gain Computation
- Class P: buys_computer = “yes”
- Class N: buys_computer = “no”
- $I(p, n) = I(9, 5) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940$
- Compute the entropy for age:

  age      p_i  n_i  I(p_i, n_i)
  <=30      2    3   0.971
  31…40     4    0   0
  >40       3    2   0.971

$$E(\text{age}) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694$$

where $\frac{5}{14} I(2,3)$ means "age <=30" has 5 out of 14 samples, with 2 yes'es and 3 no's. Hence

$$\mathrm{Gain}(\text{age}) = I(p, n) - E(\text{age}) = 0.940 - 0.694 = 0.246$$

Similarly,

$$\mathrm{Gain}(\text{income}) = 0.029, \qquad \mathrm{Gain}(\text{student}) = 0.151, \qquad \mathrm{Gain}(\text{credit\_rating}) = 0.048$$

Since age has the highest information gain, it is selected as the root test attribute.
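These numbers can be checked with the `info()` helper from the sketch above:

```python
i_total = info([9, 5])  # I(9, 5)
e_age = (5/14) * info([2, 3]) + (4/14) * info([4, 0]) + (5/14) * info([3, 2])
print(f"I(9,5)    = {i_total:.3f}")          # 0.940
print(f"E(age)    = {e_age:.3f}")            # 0.694
print(f"Gain(age) = {i_total - e_age:.3f}")  # 0.247 before rounding; the
# slide's 0.246 comes from subtracting the rounded values: 0.940 - 0.694
```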
Output: A Decision Tree for "buys_computer"

age?
- <=30: student?
  - no: no
  - yes: yes
- 31…40: yes
- >40: credit_rating?
  - excellent: no
  - fair: yes
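Putting the pieces together, the sketches above reproduce this tree from the 14-tuple training table (the nested-dict encoding of the tree is my own):

```python
COLUMNS = ('age', 'income', 'student', 'credit_rating', 'class')
ROWS = [
    ('<=30',  'high',   'no',  'fair',      'no'),
    ('<=30',  'high',   'no',  'excellent', 'no'),
    ('31…40', 'high',   'no',  'fair',      'yes'),
    ('>40',   'medium', 'no',  'fair',      'yes'),
    ('>40',   'low',    'yes', 'fair',      'yes'),
    ('>40',   'low',    'yes', 'excellent', 'no'),
    ('31…40', 'low',    'yes', 'excellent', 'yes'),
    ('<=30',  'medium', 'no',  'fair',      'no'),
    ('<=30',  'low',    'yes', 'fair',      'yes'),
    ('>40',   'medium', 'yes', 'fair',      'yes'),
    ('<=30',  'medium', 'yes', 'excellent', 'yes'),
    ('31…40', 'medium', 'no',  'excellent', 'yes'),
    ('31…40', 'high',   'yes', 'fair',      'yes'),
    ('>40',   'medium', 'no',  'excellent', 'no'),
]
examples = [dict(zip(COLUMNS, row)) for row in ROWS]
tree = id3(examples, ['age', 'income', 'student', 'credit_rating'])
print(tree)
# {'age': {'<=30': {'student': {'no': 'no', 'yes': 'yes'}},
#          '31…40': 'yes',
#          '>40': {'credit_rating': {'excellent': 'no', 'fair': 'yes'}}}}
# (key order may vary, since attribute values are iterated from a set)
```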