Monday, January 7, 2013

PCA using R - KMO index and Bartlett's test

Principal Component Analysis (PCA) is a dimension reduction technique. We obtain a set of factors which summarize, as well as possible, the information available in the data. The factors are linear combinations of the original variables. The approach can handle only quantitative variables.

We have presented the PCA in previous tutorials. In this paper, we describe in details two indicators used for the checking of the interest of the implementation of the PCA on a dataset: the Bartlett's sphericity test and the KMO index. They are directly available in some commercial tools (e.g. SAS or SPSS). Here, we describe the formulas and we show how to program them under R. We compare the obtained results with those of SAS on a dataset.

Keywords: principal component analysis, pca, spss, sas, proc factor, princomp, kmo index, msa, measure of sampling adequacy, bartlett's sphericity test, xlsx package, psych package, R software
Components: VARHCA, PRINCIPAL COMPONENT ANALYSIS
Tutorial: en_Tanagra_KMO_Bartlett.pdf
Dataset: socioeconomics.zip
Références :
Tutoriel Tanagra - "Principal Component Analysis (PCA)"
Tutoriel Tanagra - "VARIMAX rotation in Principal Component Analysis"
SPSS - "Factor algorithms"
SAS - "The Factor procedure"