Wednesday, January 18, 2012

ARS in the SIPINA package

Association Rule Software (ARS) is a basic tool that extracts association rules from attribute-value datasets (categorical or binary attributes). It is distributed with the SIPINA package, which includes a tool for supervised learning, especially decision tree induction (SIPINA RESEARCH); a tool for linear regression (REGRESS); and ARS for association rule mining.
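To make concrete what an association rule and its usual quality measures look like, here is a minimal Python sketch. It is not the ARS implementation; the function name `rule_measures` and the toy transactions are assumptions made for illustration. It only applies the textbook definitions of support, confidence, lift and conviction to a rule "antecedent -> consequent".

```python
def rule_measures(transactions, antecedent, consequent):
    """Compute the standard measures of the rule antecedent -> consequent.

    transactions: list of sets of items (hypothetical toy format).
    """
    a, c = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # rows containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # rows containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # rows containing both
    if n_a == 0 or n_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")

    support = n_ac / n                # P(A and C)
    confidence = n_ac / n_a           # P(C | A)
    lift = confidence / (n_c / n)     # P(C | A) / P(C)
    # Conviction is infinite when the rule is never violated (confidence = 1).
    conviction = (1 - n_c / n) / (1 - confidence) if confidence < 1 else float("inf")
    return {"support": support, "confidence": confidence,
            "lift": lift, "conviction": conviction}


# Toy data, invented for the example.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
measures = rule_measures(transactions, {"bread"}, {"milk"})
print(measures)
```

On this toy data the rule "bread -> milk" has a lift below 1, so buying bread makes milk slightly less likely than it is overall.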

ARS automatically encodes the categorical attributes as dummy variables. If you want to use continuous attributes, you must discretize them beforehand.
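The two preparation steps can be sketched in Python as follows. This is only an illustration: ARS performs the dummy coding itself, and how it does so internally is not documented here. The helper names `equal_width_bins` and `dummy_code`, the toy data, and the choice of equal-width binning are all assumptions; any other discretization method could be used instead.

```python
def equal_width_bins(values, k):
    """Discretize continuous values into k equal-width intervals (assumed method)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    labels = []
    for x in values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        idx = min(int((x - lo) / width), k - 1) if width > 0 else 0
        labels.append(f"bin{idx + 1}")
    return labels


def dummy_code(rows, attributes):
    """Create one 0/1 column per (attribute, observed value) pair."""
    observed = {a: sorted({r[a] for r in rows}) for a in attributes}
    return [
        {f"{a}={v}": int(r[a] == v) for a in attributes for v in observed[a]}
        for r in rows
    ]


# Toy data, invented for the example: one categorical and one continuous attribute.
rows = [
    {"outlook": "sunny", "age": 22},
    {"outlook": "rain", "age": 35},
    {"outlook": "sunny", "age": 47},
    {"outlook": "rain", "age": 58},
    {"outlook": "sunny", "age": 63},
]

# Step 1: discretize the continuous attribute.
age_bins = equal_width_bins([r["age"] for r in rows], 3)
prepared = [{"outlook": r["outlook"], "age": b} for r, b in zip(rows, age_bins)]

# Step 2: dummy-code the now fully categorical table.
coded = dummy_code(prepared, ["outlook", "age"])
print(coded[0])
```

Each row of `coded` is a binary vector, which is the form of data an association rule algorithm works on.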

This tutorial briefly describes how to use the Association Rule Software (ARS). Compared with the previous version, the GUI of the one included in the SIPINA 3.8 package has been simplified.

Keywords: association rule mining, support, confidence, lift, conviction
Download: Sipina setup file
Tutorial: How to use ARS
References:
Wikipedia - Association rule learning

Sipina - Version 3.8

The tools (SIPINA RESEARCH, REGRESS and ASSOCIATION RULE SOFTWARE) included in the SIPINA distribution have been updated with some improvements.

SIPINA.XLA. The add-in for Excel now works with both the 32-bit and 64-bit versions of Excel.

Importation of text data files. Processing time has been improved. This also reduces the transfer time when using the SIPINA.XLA add-in for Excel (which passes the data through a temporary text file).

Association rule software. The GUI has been simplified and the rules are displayed in a more readable way.

Because they rely internally on the FastMM memory manager, these tools can address up to 3 GB under 32-bit Windows and 4 GB under 64-bit Windows. Their processing capabilities are improved accordingly.

Keywords: sipina, decision tree induction, association rule, multiple linear regression
Sipina website: Sipina
Download: Setup file
References:
Tanagra - SIPINA add-in for Excel
Tanagra - Tanagra add-in for Excel 2007 and 2010
Delphi Programming Resource - FastMM, a Fast Memory Manager

Monday, January 2, 2012

Tanagra website statistics for 2011

The year 2011 ends, 2012 begins. I wish you all a very happy year 2012.

A short report on the website statistics for the past year. All sites together (Tanagra, course materials, e-books, tutorials) were visited 281,352 times this year, about 770 visits per day. For comparison, we had 662 daily visits in 2010, 520 in 2009 and 349 in 2008.

Who are you? The majority of visits come from France and the Maghreb, followed by a large share from other French-speaking countries. Among non-francophone countries, we mainly see the United States, India, the UK, Italy, Brazil, Germany, ...

Which pages are visited? The most successful pages are those related to data mining documentation: course materials, tutorials, links to other documents available online, etc. This is not really surprising. I myself spend more time writing booklets and tutorials and studying the behavior of different software, including Tanagra.

Happy New Year 2012 to all.

Ricco.
Slideshow: Website statistics for 2011