Here are the slides I use for my course about “Naive Bayes Classifier”. The main originality of this presentation is that I show it is possible to extract an explicit model from the calculations of the conditional distributions. This makes deploying the model in real case studies much easier, an aspect that is not widely known. The feature selection problem in the context of Naive Bayes Classifier learning is also highlighted.
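To illustrate the idea of an explicit model, here is a minimal sketch for categorical predictors. Under the independence assumption, the score of a class is the log prior plus a sum of log conditional probabilities, so it can be stored as one intercept per class and one coefficient per (feature, value) pair. The function names, the Laplace smoothing and the toy data are assumptions made for this illustration; they are not taken from the slides.

```python
import math
from collections import Counter, defaultdict


def fit_naive_bayes_scores(rows, labels, smoothing=1.0):
    """Return (intercept, coef) such that
    score(k, x) = intercept[k] + sum_j coef[k][(j, x_j)]
                = log P(k) + sum_j log P(x_j | k).
    """
    n = len(rows)
    n_features = len(rows[0])
    classes = sorted(set(labels))
    # Distinct values observed for each feature.
    values = [sorted({r[j] for r in rows}) for j in range(n_features)]
    class_count = Counter(labels)
    joint = defaultdict(int)  # (class, feature index, value) -> count
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            joint[(y, j, v)] += 1
    intercept = {k: math.log(class_count[k] / n) for k in classes}
    coef = {k: {} for k in classes}
    for k in classes:
        for j in range(n_features):
            denom = class_count[k] + smoothing * len(values[j])
            for v in values[j]:
                # Smoothed estimate of log P(x_j = v | class k).
                coef[k][(j, v)] = math.log((joint[(k, j, v)] + smoothing) / denom)
    return intercept, coef


def predict(intercept, coef, x):
    """Apply the explicit model: pick the class with the highest score.
    Values of x must have been seen during training."""
    scores = {
        k: intercept[k] + sum(coef[k][(j, v)] for j, v in enumerate(x))
        for k in intercept
    }
    return max(scores, key=scores.get), scores


# Made-up toy data: (outlook, humidity) -> play
rows = [("sunny", "high"), ("sunny", "high"), ("rain", "low"),
        ("rain", "high"), ("sunny", "low"), ("rain", "low")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

intercept, coef = fit_naive_bayes_scores(rows, labels)
label, scores = predict(intercept, coef, ("sunny", "high"))
```

Once fitted, only `intercept` and `coef` are needed at deployment time: scoring a new instance is a table lookup and a sum, without access to the training data.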
Keywords: machine learning, supervised methods, naive bayes, independence assumption, independent feature model, feature selection, cfs
Slides: Naive Bayes classifier
References:
T. Mitchell, "Generative and discriminative classifiers: naive bayes and logistic regression", in "Machine Learning", McGraw Hill, 2010; Draft of January 2010.
Wikipedia, "Naive Bayes classifier".
This web log maintains an alternative layout of the tutorials about Tanagra. Each entry briefly describes the subject, followed by a link to the tutorial (PDF) and the dataset. The technical references (books, papers, websites, ...) are also provided. In some tutorials, we compare the results of Tanagra with other free software such as Knime, Orange, R, Python, Sipina or Weka.
Friday, March 28, 2014
Friday, March 14, 2014
Linear discriminant analysis (slides)
Here are the slides I use for my course about “Linear Discriminant Analysis” (LDA). The two main assumptions that lead to a linear classifier are highlighted. LDA is very interesting because the classifier can be interpreted in different ways: it is a parametric method based on the MAP (maximum a posteriori) decision rule; it is a classifier based on the distance to the conditional centroids; it is a linear separator which defines various regions in the representation space.
Statistical tools for the overall model evaluation and the checking of the relevance of the predictive variables are presented.
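As an illustration of the MAP view, the sketch below computes one linear classification function per class, d_k(x) = ln p_k - 0.5 mu_k' W^-1 mu_k + (W^-1 mu_k)' x, where W is the pooled within-class covariance matrix. It assumes Gaussian classes sharing the same covariance matrix; the function names and the toy data are invented for this example, and the small linear solver only stands in for a numerical library.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a square list of lists)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_lda(rows, labels):
    """Return {class: (constant, coefficients)} for the linear
    classification functions."""
    n, d = len(rows), len(rows[0])
    classes = sorted(set(labels))
    groups = {k: [r for r, y in zip(rows, labels) if y == k] for k in classes}
    means = {k: [sum(r[j] for r in g) / len(g) for j in range(d)]
             for k, g in groups.items()}
    # Pooled within-class covariance matrix W.
    w = [[0.0] * d for _ in range(d)]
    for k, g in groups.items():
        for r in g:
            dev = [r[j] - means[k][j] for j in range(d)]
            for i in range(d):
                for j in range(d):
                    w[i][j] += dev[i] * dev[j]
    w = [[v / (n - len(classes)) for v in row] for row in w]
    model = {}
    for k in classes:
        coef = solve(w, means[k])  # W^-1 mu_k
        const = math.log(len(groups[k]) / n) - 0.5 * sum(
            c * mu for c, mu in zip(coef, means[k]))
        model[k] = (const, coef)
    return model


def predict_lda(model, x):
    """MAP rule: assign x to the class with the largest function value."""
    scores = {k: c0 + sum(c * v for c, v in zip(coef, x))
              for k, (c0, coef) in model.items()}
    return max(scores, key=scores.get)


# Made-up toy data: two classes centred on (2, 2) and (6, 6).
rows = [(1, 2), (2, 1), (2, 3), (3, 2), (5, 6), (6, 5), (6, 7), (7, 6)]
labels = ["a"] * 4 + ["b"] * 4
model = fit_lda(rows, labels)
```

On this toy data the two functions are equal along the line x1 + x2 = 8, which is the linear separator between the two regions.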
Keywords: machine learning, supervised methods, discriminant analysis, predictive discriminant analysis, linear discriminant analysis, linear classification functions, wilks lambda, stepdisc, feature selection
Slides: linear discriminant analysis
References:
G. James, D. Witten, T. Hastie, R. Tibshirani, "An introduction to statistical learning with applications in R", Springer, 2013.
R. Duda, P. Hart, D. Stork, "Pattern Classification", Wiley, 2000.
Labels:
Supervised Learning
Friday, March 7, 2014
Regression Trees
Here are the slides I use for my course about “Regression Trees”. Because this course comes after the one about “Decision Trees”, only the special features for the handling of a continuous target attribute are highlighted. The described algorithms correspond roughly to the AID and the CART approaches.
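With a continuous target, the split is typically chosen to make the target as homogeneous as possible inside each child node. The sketch below searches the best threshold on one continuous predictor by maximizing the reduction of the within-node sum of squares, in the spirit of the CART least-squares criterion. The function names and the toy data are assumptions made for this example, and stopping rules or pruning are left out.

```python
def sse(values):
    """Sum of squared deviations of the target around its mean."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, gain) for the binary split 'x <= threshold'
    that most reduces the within-node sum of squares of y.
    Returns None if x takes a single value."""
    pairs = sorted(zip(x, y))
    total = sse([t for _, t in pairs])
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between identical x values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        gain = total - sse(left) - sse(right)
        # Candidate threshold: midpoint between two consecutive x values.
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or gain > best[1]:
            best = (threshold, gain)
    return best


# Made-up toy data with an obvious break between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [5, 6, 7, 20, 21, 22]
threshold, gain = best_split(x, y)
```

Each leaf then predicts the mean of the target over the instances it contains, and the same search is applied recursively to the child nodes.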
Keywords: machine learning, supervised methods, regression tree, aid, cart, continuous class attribute
Slides: regression trees
References:
L. Breiman, J. Friedman, R. Olshen and C. Stone, “Classification and Regression Trees”, Wadsworth Int. Group, 1984.
J. Morgan, J.A. Sonquist, "Problems in the Analysis of Survey Data and a Proposal", JASA, 58:415-435, 1963.
Labels:
Regression analysis,
Supervised Learning
Saturday, March 1, 2014
Decision tree learning algorithms
Here are the slides I use for my course about the existing decision tree learning algorithms. Only the most popular ones are described: C4.5, CART and CHAID (a variant). The differences between these approaches are highlighted according to three criteria: the splitting measure; the merging strategy during the splitting process; the approach for determining the right-sized tree.
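To make the difference in splitting measures concrete, the sketch below evaluates one candidate split with three measures usually associated with these algorithms: the gain ratio (C4.5), the Gini impurity reduction (CART) and the chi-square statistic on which the CHAID test is based. The input is a contingency table crossing branches and classes. The function names and the toy table are assumptions made for this example; the CHAID variant described in the slides may use a different statistic, and the merging and tree-sizing steps are not covered.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a list of counts."""
    n = sum(counts)
    return -sum(c / n * math.log2(c / n) for c in counts if c > 0)


def gini(counts):
    """Gini impurity of a list of counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def split_measures(table):
    """table[i][k] = number of instances of class k in branch i.
    Assumes at least two non-empty branches and no empty class."""
    n = sum(map(sum, table))
    branch_sizes = [sum(row) for row in table]
    class_totals = [sum(col) for col in zip(*table)]

    # C4.5: information gain normalised by the split information.
    info_gain = entropy(class_totals) - sum(
        s / n * entropy(row) for s, row in zip(branch_sizes, table))
    gain_ratio = info_gain / entropy(branch_sizes)

    # CART: reduction of the Gini impurity.
    gini_gain = gini(class_totals) - sum(
        s / n * gini(row) for s, row in zip(branch_sizes, table))

    # CHAID: chi-square statistic of independence between branch and class.
    chi2 = 0.0
    for i, row in enumerate(table):
        for k, observed in enumerate(row):
            expected = branch_sizes[i] * class_totals[k] / n
            chi2 += (observed - expected) ** 2 / expected

    return {"gain_ratio": gain_ratio, "gini_gain": gini_gain, "chi2": chi2}


# Made-up table: two branches, two classes.
measures = split_measures([[8, 2], [2, 8]])
perfect = split_measures([[10, 0], [0, 10]])
```

All three measures increase when the branches become purer, but they rank candidate splits differently when the number of branches varies, which is one reason the algorithms can grow different trees on the same data.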
Keywords: machine learning, supervised methods, decision tree learning, classification tree, chaid, cart, c4.5
Slides: C4.5, CART and CHAID
References:
L. Breiman, J. Friedman, R. Olshen and C. Stone, “Classification and Regression Trees”, Wadsworth Int. Group, 1984.
G. Kass, “An exploratory technique for Investigating Large Quantities of Categorical Data”, Applied Statistics, 29(2), 1980, pp. 119-127.
J. R. Quinlan, “C4.5: Programs for machine learning”, Morgan Kaufmann, 1993.
Labels:
Decision tree,
Supervised Learning