Saturday, November 8, 2008

Clustering trees

The aim of clustering is to build groups of individuals such that examples in the same group are similar and examples in different groups are dissimilar.
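
One common way to quantify "similar within a group, dissimilar between groups" is to decompose the total inertia of the data into a within-group part and a between-group part. The sketch below is only an illustration under assumptions that the post does not state: numeric attributes, squared Euclidean distance, and a hypothetical helper named inertia_decomposition applied to a made-up toy sample.

```python
def centroid(points):
    """Mean vector of a non-empty list of points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inertia_decomposition(points, labels):
    """Return (total, within, between) inertia for a given grouping."""
    g = centroid(points)
    total = sum(sq_dist(p, g) for p in points)

    # Collect the members of each group.
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)

    within = 0.0
    between = 0.0
    for members in groups.values():
        c = centroid(members)
        within += sum(sq_dist(p, c) for p in members)
        between += len(members) * sq_dist(c, g)
    return total, within, between


# Made-up toy sample: two tight, well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
labels = ["a", "a", "b", "b"]
total, within, between = inertia_decomposition(points, labels)
print(total, within, between)
```

With this criterion, a good partition has a small within-group inertia relative to the total, which is the same as a large between-group inertia, since the two parts add up to the total.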

Top-down induction of clustering trees adapts the supervised decision/regression tree framework to clustering. The groups are built by recursive partitioning of the dataset, and the internal nodes of the tree are split, as in classical trees, on input attributes. The resulting model, the clustering tree, describes the groups, and the learning algorithm automatically selects the relevant attributes.
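
The recursive partitioning described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of the CTP component or of the cited papers: it assumes 0/1 attributes, uses the reduction of within-node inertia as the split criterion, and stops on a maximum depth and a minimum leaf size. The function names (inertia, best_split, grow) and the toy rows are hypothetical; the rows are not taken from zoo.xls.

```python
def inertia(rows):
    """Sum of squared Euclidean distances of the rows to their centroid."""
    if not rows:
        return 0.0
    n = len(rows)
    center = [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]
    return sum(sum((x - m) ** 2 for x, m in zip(r, center)) for r in rows)


def best_split(rows, min_size):
    """Return (gain, index) of the 0/1 attribute that most reduces inertia."""
    best_gain, best_attr = 0.0, None
    parent = inertia(rows)
    for j in range(len(rows[0])):
        left = [r for r in rows if r[j] == 0]
        right = [r for r in rows if r[j] == 1]
        if len(left) < min_size or len(right) < min_size:
            continue  # split rejected: a child would be too small
        gain = parent - inertia(left) - inertia(right)
        if gain > best_gain:
            best_gain, best_attr = gain, j
    return best_gain, best_attr


def grow(rows, names, depth=0, max_depth=2, min_size=2):
    """Recursively partition the rows; each leaf is one group."""
    if depth < max_depth:
        gain, j = best_split(rows, min_size)
    else:
        gain, j = 0.0, None
    if j is None:
        return {"leaf": True, "size": len(rows)}
    left = [r for r in rows if r[j] == 0]
    right = [r for r in rows if r[j] == 1]
    return {
        "leaf": False,
        "split": names[j],
        "gain": gain,
        "no": grow(left, names, depth + 1, max_depth, min_size),
        "yes": grow(right, names, depth + 1, max_depth, min_size),
    }


# Made-up toy data with 0/1 attributes (illustrative only).
names = ["hair", "feathers", "eggs", "milk"]
rows = (
    [[1, 0, 0, 1]] * 3    # mammal-like rows
    + [[0, 1, 1, 0]] * 3  # bird-like rows
    + [[0, 0, 1, 0]] * 2  # fish-like rows
)
tree = grow(rows, names)
print(tree)
```

Each path from the root to a leaf reads as a conjunction of attribute tests, which is what makes the groups directly interpretable. Attributes that never produce a useful split simply do not appear in the tree.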

The clustering tree approach is not widely known; in this tutorial we show the interesting properties of this method. Our main references are the papers of Chavent (1998) and Blockeel et al. (1998).

Keywords: clustering algorithm, clustering tree, groups characterization
Components: Multiple Correspondence Analysis, CTP, Contingency Chi-Square, K-Means
Tutorial: en_Tanagra_Clustering_Tree.pdf
Dataset: zoo.xls
References:
M. Chavent (1998), « A monothetic clustering method », Pattern Recognition Letters, 19, 989–996.
H. Blockeel, L. De Raedt, J. Ramon (1998), « Top-Down Induction of Clustering Trees », ICML, 55–63.