Monday, November 3, 2008

Decision Lists and Decision Trees

Decision lists were popular methods in machine learning publications during the 1990s. They produce an ordered list of production rules such as "IF condition_1 THEN conclusion_1 ELSE IF condition_2 THEN conclusion_2 ELSE IF…", as sketched below.
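
To make the representation concrete, here is a minimal Python sketch of a decision list applied to one instance. The attribute names and rules are purely illustrative; they are not taken from Tanagra or from the dataset used in the tutorial.

```python
# A decision list is an ordered set of (condition, conclusion) rules plus a
# default class. Classification tries the rules in order and returns the
# conclusion of the first rule whose condition holds: exactly the
# IF ... ELSE IF ... ELSE semantics above. All names here are illustrative.

decision_list = [
    (lambda x: x["chest_pain"] == "asymptomatic", "disease"),
    (lambda x: x["max_heart_rate"] > 170, "healthy"),
]
default_class = "disease"  # used when no rule fires

def classify(x, rules, default):
    for condition, conclusion in rules:
        if condition(x):
            return conclusion
    return default

print(classify({"chest_pain": "typical", "max_heart_rate": 180},
               decision_list, default_class))  # -> "healthy"
```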

Decision lists and decision trees have a similar representation bias but not the same learning bias: decision lists rely on the "separate-and-conquer" principle rather than the "divide-and-conquer" principle. They can produce more specialized rules, but they can also lead to overfitting; setting the right learning parameters is therefore very important for decision list algorithms.
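
The separate-and-conquer principle itself is simple: learn one rule, remove ("separate") the examples it covers, and repeat ("conquer") on what remains. The sketch below shows this generic loop under assumed helper names (`Rule`, `find_best_rule`); it is not the Tanagra implementation.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    condition: Callable[[dict], bool]  # conjunction of attribute tests
    conclusion: str                    # predicted class

    def matches(self, x: dict) -> bool:
        return self.condition(x)

def separate_and_conquer(examples, find_best_rule, default_class):
    """Generic decision list induction loop; `find_best_rule` stands in
    for the rule-search step (CN2 uses a best-first/beam search there)."""
    rules = []
    remaining = list(examples)
    while remaining:
        rule = find_best_rule(remaining)
        if rule is None:  # no acceptable rule can be found: stop
            break
        covered = [x for x in remaining if rule.matches(x)]
        if not covered:
            break
        rules.append(rule)
        # "Separate": discard the covered examples, keep conquering the rest.
        remaining = [x for x in remaining if not rule.matches(x)]
    return rules, default_class
```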

The algorithm that we have implemented in TANAGRA is derived from CN2 (Clark & Niblett, ML-1989). We introduced two main modifications: (1) we use a hill-climbing search instead of a best-first search; (2) a new parameter, the minimal support of a rule, can be adjusted to avoid non-significant rules.
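
The following sketch illustrates how these two modifications fit together in the rule-growing step: the condition is specialized greedily (hill-climbing keeps only the single best additional test at each step), and any specialization whose coverage falls below the minimal support is discarded. The helpers `candidate_tests` and `score` are assumed placeholders (e.g. `score` could measure the purity of the covered examples), not Tanagra functions.

```python
def grow_rule(examples, candidate_tests, score, min_support):
    conditions = []          # conjunction of attribute tests built so far
    covered = list(examples)
    best_score = score(covered)
    while True:
        improved = False
        best_test = best_cov = None
        for test in candidate_tests:
            cov = [x for x in covered if test(x)]
            # Reject specializations below the minimal support: they would
            # yield non-significant rules.
            if len(cov) < min_support:
                continue
            s = score(cov)
            if s > best_score:
                best_test, best_cov, best_score = test, cov, s
                improved = True
        if not improved:     # hill-climbing: stop at a local optimum
            return conditions, covered
        conditions.append(best_test)
        covered = best_cov
```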

Keywords: CN2, decision list, decision tree, CART, discretization
Components: Supervised Learning, MDLPC, Decision List, C-RT, Bootstrap
Tutorial: en_Tanagra_DL.pdf
Dataset: dr_heart.bdm
Reference: P. Clark, "CN2 – Rule induction from examples".