Monday, October 9, 2017

Regression analysis in Python

Statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.

In this tutorial, we will try to identify the potentialities of StatsModels by conducting a case study in multiple linear regression. We will discuss about: the estimation of model parameters using the ordinary least squares method, the implementation of some statistical tests, the checking of the model assumptions by analyzing the residuals, the detection of outliers and influential points, the analysis of multicollinearity, the calculation of the prediction interval for a new instance.

Keywords: regression, statsmodels, pandas, matplotlib
Tutorial: en_Tanagra_Python_StatsModels.pdf
Dataset and program: en_python_statsmodels.zip
References:
StatsModels: Statistics in Python