**during the classification process, i.e. when we apply the classifier to an unlabeled instance**, is less studied. However, the problem is important: the model is designed to work only when the instance to label is fully described. If some values are unavailable, we cannot apply the model directly, so we need a strategy to overcome this difficulty.

In this tutorial, we work in the supervised learning context. The classifier is a logistic regression model, and all the descriptors are continuous. On various datasets from the UCI repository, we evaluate the behavior of two imputation methods: the univariate approach and the multivariate approach. The constraint is that the imputation models must rely on information from the learning sample, which we assume contains no missing values.
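As a minimal sketch of the two strategies, the R code below fits a logistic regression with `glm` on a complete learning sample, then fills in a missing descriptor on a new instance in two ways. It rests on several assumptions: the data are synthetic rather than the UCI datasets, the univariate approach is taken to be replacement by the training mean, and the multivariate approach is taken to be a linear regression on the other descriptors fitted with `lm`. The variable and function names are illustrative, not those of the tutorial's programs.

```r
# Synthetic learning sample (assumption: complete, continuous descriptors)
set.seed(1)
n <- 200
train <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
train$x3 <- 0.8 * train$x1 - 0.5 * train$x2 + rnorm(n, sd = 0.3)
train$y <- factor(ifelse(train$x1 + train$x3 + rnorm(n) > 0, "pos", "neg"))

# Classifier: binary logistic regression fitted on the learning sample
clf <- glm(y ~ x1 + x2 + x3, data = train, family = binomial)

# Univariate imputation (assumed here): use the training mean of the descriptor
impute_univariate <- function(instance, var) {
  instance[[var]] <- mean(train[[var]])
  instance
}

# Multivariate imputation (assumed here): predict the missing descriptor
# from the other descriptors with a linear regression fitted on the learning sample
impute_multivariate <- function(instance, var) {
  others <- setdiff(c("x1", "x2", "x3"), var)
  reg <- lm(reformulate(others, response = var), data = train)
  instance[[var]] <- unname(predict(reg, newdata = instance))
  instance
}

# Instance to classify, with x3 missing
new_inst <- data.frame(x1 = 1.2, x2 = -0.4, x3 = NA_real_)

# Probability of the "pos" class under each imputation strategy
p_uni <- predict(clf, newdata = impute_univariate(new_inst, "x3"), type = "response")
p_multi <- predict(clf, newdata = impute_multivariate(new_inst, "x3"), type = "response")
print(c(univariate = unname(p_uni), multivariate = unname(p_multi)))
```

Both imputation models use only quantities estimated on the learning sample, so they can be prepared once and reused for every instance to classify.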

Note that, in our experiments, the missing value on the instance to classify is "missing completely at random", i.e. each descriptor has the same probability of being missing.
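This missingness mechanism can be sketched in R as follows. The sketch assumes that exactly one descriptor is removed per instance, drawn uniformly among the descriptors. The helper name `make_missing` and the toy test sample are hypothetical, and the tutorial's own protocol may differ in its details.

```r
# Hypothetical helper: blank out one descriptor per instance,
# each descriptor being chosen with the same probability
make_missing <- function(test, descriptors) {
  for (i in seq_len(nrow(test))) {
    j <- sample(descriptors, 1)  # uniform draw among the descriptor names
    test[i, j] <- NA
  }
  test
}

# Toy test sample (assumption: three continuous descriptors)
set.seed(42)
test <- data.frame(x1 = rnorm(6), x2 = rnorm(6), x3 = rnorm(6))
test_na <- make_missing(test, c("x1", "x2", "x3"))

print(test_na)
print(rowSums(is.na(test_na)))  # number of missing values per instance
```

Because the draw does not depend on the descriptor values or on the class, the resulting pattern is missing completely at random.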

**Keywords**: missing values, missing features, classification model, logistic regression, multiple linear regression, R software, glm, lm, NA

**Components**: Binary Logistic Regression

**Tutorial**: en_Tanagra_Missing_Values_Deployment.pdf

**Dataset and programs (R language)**: md_logistic_reg_deployment.zip

**References**:

D.C. Howell, "Treatment of Missing Data".

M. Saar-Tsechansky, F. Provost, “Handling Missing Values when Applying Classification Models”, JMLR, 8, pp. 1625-1657, 2007.