Wednesday, November 19, 2014

Discretization of continuous attributes (slides)

Discretization transforms a continuous attribute into a discrete (ordinal) attribute. The process determines a finite number of intervals from the available values and assigns a discrete numerical value to each interval. The two main issues are how to determine the number of intervals and how to determine the cut points.
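To make the two issues concrete, here is a minimal sketch in plain Python of two unsupervised schemes named in the keywords, equal-width and equal-frequency. The function names (`equal_width_cuts`, `equal_frequency_cuts`, `discretize`) and the toy data are illustrative assumptions. They are not the Tanagra components, and the number of intervals `k` is simply supplied by the user.

```python
from bisect import bisect_right


def equal_width_cuts(values, k):
    """Return k-1 cut points splitting [min, max] into k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so that each interval holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for i in range(1, k):
        pos = i * n // k
        # Place the cut midway between the two neighbouring sorted values.
        cuts.append((ordered[pos - 1] + ordered[pos]) / 2)
    return cuts


def discretize(values, cuts):
    """Map each value to the index (0..k-1) of the interval it falls into."""
    return [bisect_right(cuts, v) for v in values]


# Hypothetical toy data with one outlier.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]

ew_cuts = equal_width_cuts(values, 3)
ef_cuts = equal_frequency_cuts(values, 3)
ew_codes = discretize(values, ew_cuts)
ef_codes = discretize(values, ef_cuts)

print(ew_cuts, ew_codes)
print(ef_cuts, ef_codes)
```

On this toy data the outlier stretches the equal-width intervals, so almost all values land in the first one, whereas the equal-frequency cut points follow the ranks of the values. With many tied values, the equal-frequency sketch can produce identical cut points; it does not handle that case.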

In these slides, we present some discretization methods for the unsupervised and supervised contexts.
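In the supervised context, the cut points are chosen with the help of a class attribute. The sketch below, in plain Python, shows a single top-down step: it picks the one cut point that minimizes the weighted class entropy of the two resulting intervals. This is only an illustration under simplifying assumptions. It is not the MDLPC component, which would also need a recursive application and a stopping rule. The names `entropy` and `best_cut` and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut, score): the cut point with the lowest weighted class entropy."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_score, best_point = float("inf"), None
    for i in range(1, n):
        # A cut is only possible between two distinct values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point, best_score


# Hypothetical toy data: the class changes between 3.0 and 10.0.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["a", "a", "a", "b", "b", "b"]

cut, score = best_cut(values, labels)
print(cut, score)
```

Here the chosen cut point separates the two classes perfectly. A bottom-up method such as chi-merge works in the opposite direction, starting from many small intervals and merging adjacent ones.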

Keywords: discretization, data preprocessing, chi-merge, mdlpc, equal-frequency, equal-width, clustering, top-down, bottom-up, feature construction
Components (Tanagra): EQFREQ DISC, EQWIDTH DISC, MDLPC, BINARY BINNING, CONT TO DISC
Slides: Discretization
Tutorials:
Tanagra Tutorials, "Discretization of continuous features", May 2010.