Thursday, April 30, 2009

Multiple Correspondence Analysis (MCA)

The multiple correspondence analysis is a factor analysis approach. It deals with a tabular dataset where a set of examples are described by a set of categorical variables. The aim is to map the dataset in a reduced dimension space (usually two) which allows us to highlight the associations between the examples and the variables. It is useful to understand the underlying structure of a tabular dataset.

In this tutorial, we show how to implement this approach and how to interpret the results with Tanagra. The opportunity to copy/paste the results in a spreadsheet is certainly one of the most interesting functionalities of the software. Indeed, it gives us access to tools (tri, formatted, etc) in a well-known environment of the experts of the data processing. For example, the possibility of sorting the various tables according to the contributions and the COS2 proves really practical when one wishes to interpret the dimensions.

Keywords: factor analysis, multiple correspondence analysis
Components: Multiple correspondance analysis, View Dataset, Scatterplot with labels, View multiple scatterplot
Tutorial: en_Tanagra_Acm.pdf
Dataset: races_canines_acm.xls
References:
M. Tenenhaus, " Méthodes statistiques en gestion ", Dunod, 1996 ; pages 212 to 222 (in French).
Statsoft Inc., "Multiple Correspondence Analysis".
D. Garson, "Statnotes - Correspondence Analysis".