Friday, November 7, 2008

Correspondence analysis

Correspondence analysis is a visualization technique. It lets us see the associations between the rows and the columns of a large contingency table. It belongs to the "factorial analysis" family of methods. The method computes a set of axes (factors), which are latent variables that we interpret in order to understand the proximities between rows and/or columns.
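To make the idea of "computing axes" concrete, here is a minimal sketch of the standard textbook computation in Python with NumPy: a singular value decomposition of the standardized residuals of the table. It is not TANAGRA's own implementation. The function name `correspondence_analysis` and the small `toy_table` are made up for illustration and are not taken from the tutorial's dataset.

```python
import numpy as np

def correspondence_analysis(table):
    """Return (eigenvalues, row_coords, col_coords) for a contingency table.

    Illustrative sketch of the usual SVD-based computation; assumes every
    row total and every column total is strictly positive.
    """
    counts = np.asarray(table, dtype=float)
    P = counts / counts.sum()          # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # frequencies expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    eigenvalues = sv ** 2              # inertia carried by each axis
    # Principal coordinates of the rows and of the columns on each axis.
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return eigenvalues, row_coords, col_coords

# Hypothetical 3 x 3 table, invented for this example only.
toy_table = [[20, 10, 5],
             [10, 20, 10],
             [5, 10, 30]]

eigenvalues, row_coords, col_coords = correspondence_analysis(toy_table)
share = 100 * eigenvalues / eigenvalues.sum()
print("percentage of inertia per axis:", np.round(share, 1))
print("row coordinates on the first two axes:")
print(np.round(row_coords[:, :2], 3))
```

The sum of the eigenvalues (the total inertia) equals the chi-square statistic of the table divided by the grand total, which is why the chi-square test and correspondence analysis are so closely linked. Plotting the first two columns of `row_coords` and `col_coords` gives the usual factorial plane.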

TANAGRA is not really designed to handle contingency tables directly, so we use a workaround: the rows are identified by a discrete attribute, and the columns correspond to several continuous attributes of the dataset. With this approach, we cannot process a contingency table with more than 255 rows.
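The layout described above can be sketched in plain Python as follows. This only illustrates how the data are organized, not how TANAGRA reads them: the attribute names (`category`, `col1`, ...), the values, and the helper `to_contingency` are hypothetical, and the check that each label appears only once is an assumption of this sketch. The 255-row limit is the one stated in the text.

```python
MAX_ROWS = 255  # row limit mentioned in the text

# Hypothetical records: one discrete attribute labels each row of the table,
# and the continuous attributes hold the counts of each column.
records = [
    {"category": "A", "col1": 20, "col2": 10, "col3": 5},
    {"category": "B", "col1": 10, "col2": 20, "col3": 10},
    {"category": "C", "col1": 5, "col2": 10, "col3": 30},
]

def to_contingency(records, label_attr, count_attrs, max_rows=MAX_ROWS):
    """Split records into row labels and a table of non-negative counts."""
    labels = [rec[label_attr] for rec in records]
    if len(labels) > max_rows:
        raise ValueError(f"too many rows: {len(labels)} > {max_rows}")
    if len(set(labels)) != len(labels):
        # Assumption of this sketch: one record per row label.
        raise ValueError("each row label must appear exactly once")
    table = [[float(rec[attr]) for attr in count_attrs] for rec in records]
    if any(value < 0 for row in table for value in row):
        raise ValueError("counts must be non-negative")
    return labels, table

labels, table = to_contingency(records, "category", ["col1", "col2", "col3"])
print(labels)
print(table)
```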

This tutorial is inspired by the presentation given by Lebart, Morineau and Piron in their book « Statistique Exploratoire Multidimensionnelle » (Dunod, 2000). Unfortunately, I do not think there is an English translation of this very good textbook, which is very popular in France. However, I hope this tutorial can be followed without the book. If you read French, correspondence analysis is described in section 1.3 (pp. 67-107).

Keywords: correspondence analysis, contingency table, chi-square, factorial plane, contributions, squared cosines
Components: Correspondence analysis
Tutorial: en_Tanagra_Afc.pdf
Dataset: media_prof_afc.xls
References:
L. Lebart, A. Morineau, M. Piron, "Statistique exploratoire multidimensionnelle", Dunod, 2000.
Statsoft Inc., "Correspondence Analysis".
D. Garson, "Statnotes - Correspondence Analysis".