Friday, July 6, 2012

CVM and BVM from the LIBCVM toolkit

The Support Vector Machines algorithms are well-known in the supervised learning domain. They are especially appropriate when we handle a dataset with a large number “p” of descriptors . But they are much less efficient when the number of instances “n” is very high. Indeed, a naive implementation is of complexity O(n^3) for the calculation time and O(n^2) for the storing of the values. In consequence, instead of the optimal solution, the learning algorithms often highlight the near-optimal solutions with a tractable computation time .

I recently discovered the CVM (Core Vector Machine) and BVM (Ball Vector Machine) approaches. The idea of the authors is really clever: since only approximate best solutions can be highlighted, their approaches try to resolve an equivalent problem which is easier to handle - the minimum enclosing ball problem in computational geometry - to detect the support vectors. So, we have a classifier which is as efficient as those obtained by the other SVM learning algorithms, but with an enhanced ability to process datasets with a large number of instances.

I found the papers really interesting. They are all the more interesting that all the tools enabling to reproduce the experiments are provided: the program and the datasets. So, all the results shown in the paper can be verified. It contrasts to too numerous papers where some authors flaunt tremendous results but we can never reproduce them.

The CVM and BVM methods are incorporated into the LIBCVM library. This is an extension of the LIBSVM (version 2.85), which is already included into Tanagra. The source code for LIBCVM being available, I compiled it as a DLL (Dynamic-link Library) and I included it also into Tanagra 1.4.44.

In this tutorial, we describe the behavior of the CVM and BVM supervised learning methods on the "Web" dataset available on the website of the authors. We compare the results and the computation time to those of the C-SVC algorithm based on the LIBSVM library.

Keywords: support vector machine, svm, libcvm, cvm, bvm, libsvm, c-svc
Components: SELECT FIRST EXAMPLES, CVM, BVM, C-SVC
Tutorial: en_Tanagra_LIBCVM_library.pdf
Dataset: w8a.txt.zip
References :
I.W. Tsang, A. Kocsor, J.T. Kwok : LIBCVM Toolkit, Version: 2.2 (beta)
C.C Chang, C.J. Lin : LIBSVM -- A Library for Support Vector Machines