The aim of variable clustering is to divide a set of numeric variables into disjoint clusters (subsets of variables). In these slides, we present an approach based on the concept of a latent component. A subset of variables is summarized by a latent component, which is the first factor of a principal component analysis performed on that subset. It acts as a kind of "centroid" variable that maximizes the sum of the squared correlations with the variables of the cluster. Various clustering algorithms based on this idea are described: a hierarchical agglomerative (bottom-up) algorithm; a top-down approach; and an approach inspired by the k-means method.
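To make the "centroid" idea concrete, here is a minimal sketch in plain Python. It is not the Tanagra implementation: the function names, the power-iteration solver and the toy values are assumptions chosen for illustration. It computes the first principal component of a small cluster of standardized variables and checks that the sum of its squared correlations with those variables matches the first eigenvalue of their correlation matrix.

```python
import math


def standardize(values):
    """Center and scale a variable to mean 0 and (population) variance 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def correlation(a, b):
    """Pearson correlation between two variables."""
    za, zb = standardize(a), standardize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(za)


def latent_component(columns, n_iter=200):
    """Return (scores, eigenvalue) of the first principal component.

    `columns` is a list of variables (one list of values per variable).
    The PCA is done on the correlation matrix, using power iteration.
    """
    z = [standardize(c) for c in columns]
    p, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(p)]
            for i in range(p)]
    # Asymmetric start vector, to avoid starting exactly on another eigenvector.
    w = [1.0 / (i + 1) for i in range(p)]
    for _ in range(n_iter):
        w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    eigenvalue = sum(w[i] * corr[i][j] * w[j]
                     for i in range(p) for j in range(p))
    # Latent component: weighted sum of the standardized variables.
    scores = [sum(w[j] * z[j][k] for j in range(p)) for k in range(n)]
    return scores, eigenvalue


# Toy cluster of three variables (made-up values, for illustration only).
cluster = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.0, 4.0, 5.0, 4.0, 5.0, 7.0],
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.5],
]
scores, eigenvalue = latent_component(cluster)
sum_sq_corr = sum(correlation(scores, var) ** 2 for var in cluster)
print(f"first eigenvalue           : {eigenvalue:.4f}")
print(f"sum of squared correlations: {sum_sq_corr:.4f}")
```

The two printed values should agree up to numerical precision. A clustering algorithm built on this idea can use that quantity as the "quality" of a cluster when deciding which variables to group together.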
Keywords: clustering, clustering variables, latent variable, latent component, clusters, groups, bottom-up, hierarchical agglomerative clustering, top down, varclus, k-means, pca, principal component analysis
Components (Tanagra): VARHCA, VARKMEANS, VARCLUS
Slides: Clustering variables
Tutorials:
Tanagra tutorials, "Variable clustering (VARCLUS)", 2008.
This web log maintains an alternative layout of the tutorials about Tanagra. Each entry briefly describes the subject, followed by links to the tutorial (pdf) and the dataset. The technical references (books, papers, websites, ...) are also provided. In some tutorials, we compare the results of Tanagra with other free software such as Knime, Orange, R, Python, Sipina or Weka.
Wednesday, September 24, 2014
Tuesday, September 16, 2014
Single layer and multilayer perceptrons (slides)
Artificial neural networks are computational models inspired by an animal’s central nervous system (in particular the brain) that are capable of machine learning as well as pattern recognition (Wikipedia).
In these slides, we present the single-layer and multilayer perceptrons, which are used for supervised learning. We describe the foundations of both approaches: the difference between linear (single-layer) and non-linear (multilayer) classifiers; the representation power of the models; and the learning algorithms (the Widrow-Hoff rule and the backpropagation algorithm).
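As an illustration of the linear case, the following sketch trains a single linear unit with the Widrow-Hoff (LMS, or delta) rule in plain Python. The function names, the learning rate, the number of epochs and the 0.5 decision threshold are assumptions made for this example; the slides and Tanagra may use a different activation function or settings. The same code is run on the AND function (linearly separable) and on XOR (not linearly separable).

```python
def train_widrow_hoff(samples, targets, learning_rate=0.05, epochs=500):
    """Train a single linear unit with the Widrow-Hoff (LMS / delta) rule.

    Returns the weights [bias, w1, ..., wp].
    """
    weights = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(samples, targets):
            inputs = [1.0] + list(x)  # constant 1 for the bias
            output = sum(w * v for w, v in zip(weights, inputs))
            error = target - output
            # Widrow-Hoff update: move each weight along error * input.
            weights = [w + learning_rate * error * v
                       for w, v in zip(weights, inputs)]
    return weights


def predict(weights, x):
    """Assign class 1 when the linear output reaches 0.5, else class 0."""
    output = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if output >= 0.5 else 0


def accuracy(weights, samples, targets):
    """Proportion of samples whose predicted class matches the target."""
    hits = sum(predict(weights, x) == t for x, t in zip(samples, targets))
    return hits / len(samples)


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]  # linearly separable
xor_targets = [0, 1, 1, 0]  # not linearly separable

w_and = train_widrow_hoff(points, and_targets)
w_xor = train_widrow_hoff(points, xor_targets)
print("AND accuracy:", accuracy(w_and, points, and_targets))
print("XOR accuracy:", accuracy(w_xor, points, xor_targets))
```

The single unit should separate the AND classes, whereas no setting of its weights can classify all four XOR points correctly. This is the limitation that a multilayer perceptron with a hidden layer, trained by backpropagation, is designed to overcome.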
Keywords: artificial neural network, perceptron, single layer, SLP, multilayer, MLP, widrow-hoff rule, backpropagation algorithm, linear classifier, non linear classifier
Components (Tanagra): MULTILAYER PERCEPTRON
Slides: Single layer and multilayer perceptrons
Tutorials:
Tanagra tutorials, "Configuration of a multilayer perceptron", December 2017.
Tanagra tutorials, "Multilayer perceptron - Software comparison", 2008.
Labels:
Supervised Learning
Saturday, September 13, 2014
Filter approaches for feature selection (slides)
In the supervised learning context, the filter approach to feature selection consists of selecting the most relevant variables before, and independently of, the machine learning algorithm subsequently used to build the model.
The methods are mostly based on the concept of correlation (in a broad sense). They are attractive because they can process high-dimensional data sets quickly. On the other hand, they are questionable because they do not take into account the characteristics of the model (e.g. linear, non-linear) that will be built from the selected variables.
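The simplest form of this idea can be sketched as a univariate ranking: score each candidate variable by the absolute value of its correlation with the target, then keep the best ones. The code below is only an illustration in plain Python. The function names and the toy data are assumptions, and it does not reproduce the CFS, FCBF or MIFS components, which also account for redundancy between the selected variables.

```python
import math


def pearson(a, b):
    """Pearson correlation between two numeric variables."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rank_features(features, target):
    """Rank features by decreasing |correlation| with the target."""
    scored = [(name, abs(pearson(values, target)))
              for name, values in features.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def select_top_k(features, target, k):
    """Keep the names of the k best-ranked features."""
    return [name for name, _ in rank_features(features, target)[:k]]


# Toy data (made-up values): a binary target coded 0/1 and three candidates.
target = [0, 0, 0, 1, 1, 1]
features = {
    "noise": [5, 1, 4, 2, 6, 3],
    "medium": [2, 1, 3, 2, 4, 3],
    "strong": [1, 2, 3, 7, 8, 9],
}

for name, score in rank_features(features, target):
    print(f"{name:8s} |r| = {score:.3f}")
print("selected:", select_top_k(features, target, 2))
```

The ranking is computed without fitting any model, which is why such filters scale well to many variables. It is also why they can miss variables that are only useful in combination, or useful only to a non-linear model.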
Keywords: feature selection, filter methods, embedded methods, wrapper methods
Components (Tanagra): CFS FILTERING, FCBF FILTERING, MIFS FILTERING, MODTREE FILTERING, FEATURE RANKING, FISHER FILTERING, RUNS FILTERING, STEPDISC
Slides: Filter methods
Tutorials:
Tanagra tutorials, "Filter methods for feature selection", 2010.
Tanagra tutorials, "Filter methods for feature selection (continuation)", 2010.
Labels:
Feature Selection