Sunday, March 4, 2012

PSPP, an alternative to SPSS

I spend a lot of time to analyze the available free statistical and data mining tools. There is not bad software, but some tools are more appropriate for some tasks. Thus, we must identify the one which is the best suited to our configuration. For that, we must know a large number of tools.

In this tutorial, we describe PSPP. It is presented as an alternative to the well-known SPSS: “PSPP is a program for statistical analysis of sampled data. It is a free replacement for the proprietary program SPSS, and appears very similar to it with a few exceptions”. Instead of to describe in detail each feature, the documentation is available on the website, we present some statistical techniques. We compare the results with those of Tanagra, R 2.13.2 and OpenStat (build 24/02/2012). This is also a way to validate them. If they provide different results, it means that there is a problem.

Keywords: pspp, R software, openstat, spss, descriptive statistics, t-test , welch test, comparison of means, comparison of variances, levene's test, chi-squared test, contingency table, cross tabs, analysis of variance, anova, multiple regression, roc curve, auc, area under curve
Components:  MORE UNIVARIATE CONT STAT, GROUP CHARACTERIZATION, CONTINGENCY CHI-SQUARE, LEVENE'S TEST, T-TEST, T-TEST UNEQUAL VARIANCE, PAIRED T-TEST, ONE-WAY ANOVA, MULTIPLE LINEAR REGRESSION, ROC CURVE
Tutorial: en_Tanagra_PSPP.pdf
Dataset: autos_pspp.zip
References:
GNU PSPP, http://www.gnu.org/software/pspp/
R Project for Statistical Computing, http://www.r-project.org/
OpenStat, http://www.statprograms4u.com/