LibSVM

=Description=
Wrapper class for the [|libsvm] library by Chih-Chung Chang and Chih-Jen Lin. The original wrapper, named WLSVM, was developed by [|Yasser EL-Manzalawy]. The current version is a complete rewrite of the wrapper, using [|Reflection] in order to avoid compilation errors in case the libsvm.jar is not in the CLASSPATH.
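The reflection idea mentioned above can be illustrated with a minimal sketch. It is not the wrapper's actual code: the class name `ReflectionProbe` and both helper methods are hypothetical, and the only name taken from this page is `libsvm.svm`. The point is that a class looked up by name at runtime creates no compile-time dependency on the jar.

```java
import java.lang.reflect.Method;

public class ReflectionProbe {

    /** Returns true if the named class can be loaded from the current CLASSPATH. */
    public static boolean isPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is missing: report instead of failing.
            return false;
        }
    }

    /** Invokes a public static no-argument method by name, without a compile-time reference. */
    public static Object invokeStatic(String className, String methodName) throws Exception {
        Method method = Class.forName(className).getMethod(methodName);
        return method.invoke(null);
    }

    public static void main(String[] args) {
        // Compiles and runs whether or not libsvm.jar is on the CLASSPATH.
        System.out.println("libsvm.svm present: " + isPresent("libsvm.svm"));
    }
}
```

Because the lookup happens at runtime, a missing jar shows up as a `false` result or a catchable exception rather than as a compilation error.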

**Important note:** From Weka >= 3.7.2, installation and use of libsvm in Weka has been simplified by the creation of a [|LibSVM] package that can be installed using either the graphical or the command-line package manager.

=**Reference (Weka <= 3.6.8)**=
 * [|libsvm]
 * [|WLSVM]

=Package=

=Download=
The wrapper class has been part of Weka since version 3.5.2, but //libsvm//, as a third-party tool, needs to be downloaded separately (see the libsvm entry under Reference). It is recommended to upgrade to a post-3.5.3 version (or Subversion) for bug fixes and extensions (contains now the method).

=CLASSPATH=
Add the libsvm.jar from the libsvm distribution to your CLASSPATH to make it available.

**Note:** Do NOT then start Weka with the **-jar** option. It //overrides// the CLASSPATH rather than augmenting it (a very common trap to fall into). Instead use something like this on Linux:
code format="bash"
java -classpath $CLASSPATH:weka.jar:libsvm.jar weka.gui.GUIChooser
code
or this on Win32 (if you're starting it from the command line):
code format="bash"
java -classpath "%CLASSPATH%;weka.jar;libsvm.jar" weka.gui.GUIChooser
code

If you're starting Weka from the Start Menu on Windows, you'll have to add the libsvm.jar to your CLASSPATH environment variable. The following steps are for //Windows XP// (unfortunately, the GUI differs between Windows versions):
 * right-click on //My Computer// and select //Properties// from the menu
 * choose the //Advanced// tab and click on //Environment variables// at the bottom
 * either add or modify a variable called CLASSPATH and add the libsvm.jar with its full path to it
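To confirm that the jar really ended up on the CLASSPATH of the running JVM, the standard `java.class.path` system property can be inspected. The following is a minimal sketch under that assumption; the class `ClasspathCheck` and its `listsJar` helper are hypothetical names introduced only for illustration.

```java
import java.io.File;
import java.util.regex.Pattern;

public class ClasspathCheck {

    /** Returns true if any entry of the given class path has the given jar file name. */
    public static boolean listsJar(String classPath, String jarName) {
        // Entries are separated by ':' on Linux and ';' on Windows.
        String[] entries = classPath.split(Pattern.quote(File.pathSeparator));
        for (String entry : entries) {
            if (new File(entry).getName().equals(jarName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        System.out.println("CLASSPATH seen by the JVM: " + classPath);
        System.out.println("libsvm.jar listed: " + listsJar(classPath, "libsvm.jar"));
    }
}
```

Running it with the same `-classpath` argument used to start Weka shows what the JVM actually received, which may differ from the environment variable.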

=Examples=
 * **One-class SVM**
 > [|This] Wekalist post explains how to use the one-class SVM to detect outliers.
 * **Class weights**
 > [|This] Wekalist post explains how to use weights for the classes ( parameter, property in GUI).

=Troubleshooting=
 * **libsvm classes not in CLASSPATH!**
 * Check whether the libsvm.jar is really in your CLASSPATH. Execute the following command in the **SimpleCLI**:
 >>
 >> The property must list the libsvm.jar. If it is listed, check whether the path is correct.
 >> If you're on Windows and you find %CLASSPATH% there, see the next bullet point to fix this.
 * On Windows, if you added the libsvm.jar to your CLASSPATH environment variable, Weka may still pop up the error message that the libsvm classes are not in your CLASSPATH. This can happen on [|Windows 2000 and XP] when the %CLASSPATH% does not get expanded to its actual value while starting up Weka. You can inspect the CLASSPATH that Weka was started with in the **SimpleCLI** (see previous bullet point). If %CLASSPATH% is listed there, your system has the same problem. [|This] Wekalist post explains how to explicitly add the libsvm.jar to  (works the same for ).
 >> **Note:** [|backslashes have to be escaped], not only once, but twice (they get interpreted by Java twice!). In other words, instead of //one// backslash you have to use //four//:  then turns into .
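The "four backslashes become one" effect can be reproduced with a small sketch. It assumes that each round of interpretation behaves like one pass of `java.util.Properties` parsing, which is an assumption made for illustration and not something this page confirms about Weka's startup code. The class `BackslashDemo`, the helper `unescapeOnce` and the example path are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class BackslashDemo {

    /** Runs the value through one pass of Properties parsing, which halves each run of "\\". */
    public static String unescapeOnce(String value) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader("v=" + value));
        return props.getProperty("v");
    }

    public static void main(String[] args) throws IOException {
        // In Java source every backslash is doubled again, so this literal holds FOUR per separator.
        String written = "C:\\\\\\\\some\\\\\\\\where";
        String afterFirstPass = unescapeOnce(written);
        String afterSecondPass = unescapeOnce(afterFirstPass);
        System.out.println("as written:   " + written);
        System.out.println("first pass:   " + afterFirstPass);
        System.out.println("second pass:  " + afterSecondPass);
    }
}
```

With only one backslash in the input, the first pass already drops it, which matches the symptom of a path whose separators silently disappear.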

=Issues with libsvm.jar=
This section is based on [|this] Wekalist post.

The following changes were not incorporated in Weka, since it also means modifying the libsvm Java code, which (I think) is autogenerated from the C code. The authors of libsvm might have to consider that update. It's left to the reader to incorporate these changes.

==libsvm.svm uses Math.random==
libsvm.svm calls Math.random(), so the model it returns usually differs between runs, even for the same training set and SVM parameters.

Obviously, if you call libsvm.svm from weka.classifiers.functions.libsvm, and you call it again from libsvm.svm_train, the results are also different.

You can use libsvm.svm_save_model to write the SVMs to files, and then compare the model file from Weka's libsvm wrapper with the model file from libsvm.svm_predict. You will then see that the ProbA values tend to differ.

The Weka Experimenter relies on always using the same random sequences, so that experiments can be repeated with the same results. So I'm afraid some important design changes are required in libsvm.jar and weka.classifiers.functions.libsvm.class to keep that behaviour. We made a quick fix by adding a static Random attribute to the libsvm.svm class:
code format="java"
static java.util.Random ranGen = new java.util.Random(0);
code
We changed all Math.random() invocations to ranGen.nextDouble(). After that, we obtained the same SVM from Weka's libsvm wrapper as from libsvm svm_train.
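Why a fixed seed restores repeatability can be shown in isolation. This is a minimal sketch, not part of libsvm or Weka; the class `SeededRandomDemo` and the method `draws` are hypothetical names. Only `java.util.Random` and its `nextDouble()` method are real API.

```java
import java.util.Arrays;
import java.util.Random;

public class SeededRandomDemo {

    /** Returns the first n values produced by a generator created with the given seed. */
    public static double[] draws(long seed, int n) {
        Random ranGen = new Random(seed);
        double[] values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = ranGen.nextDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        // Same seed: identical sequences, so any training that consumes them is repeatable.
        System.out.println("seed 0 twice equal: " + Arrays.equals(draws(0, 5), draws(0, 5)));
        // Math.random() uses an internally seeded generator, so separate runs of the JVM differ.
        System.out.println("Math.random(): " + Math.random());
    }
}
```

One design consequence of a single static generator is that results still depend on how many values were drawn before a given training call, so repeatability holds per JVM run and call order, not per call.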

However, Weka's accuracy results on the primary_tumor data were still worse, so something goes wrong when Weka uses the SVM model at the testing step.

==Classes without instances==
The ARFF format provides some meta-information (i.e. attribute names and types, and the set of possible values for nominal attributes), but the libsvm format doesn't. So if the dataset declares classes that have zero occurrences across all instances, libsvm thinks that these classes don't exist, whereas Weka knows they exist.

For example, there is a class in the primary tumor dataset that never appears. When the Weka Experimenter runs testing, it calls:
code format="java"
public static double svm_predict_probability(svm_model model, svm_node[] x, double[] prob_estimates)
code
passing the array prob_estimates filled with zeros (array cells are initialized to zero). The size of the array is equal to the number of classes (= 22). On the other hand, if this method is invoked from libsvm.svm_predict, the class that never appears is ignored, so the array dimension is equal to 21.

So accuracy results differ depending on where the svm_predict_probability call originates. I think that better results are obtained if classes without instances are ignored, but I don't know whether that is entirely fair. In fact, accuracies from weka.libsvm and from libsvm.svm_predict seem to be the same if the class that never appears is removed from the ARFF file.

Note that this problem only appears when testing, because the training code always uses the svm_group_classes method to compute the number of classes, so the Instances.numClasses() value is never used for training. Moreover, the mismatch between the number of classes at training time and at testing time may be the reason behind the worse accuracy results when svm_predict_probability is invoked from Weka, but I haven't verified that yet.

Note that this problem also occurs when a class has fewer examples than the number of folds: for some folds, the class will have no training examples.
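The size mismatch described above (22 declared classes versus 21 classes seen in training) can be made concrete with a small sketch. It is a hypothetical illustration, not code from Weka or libsvm: the class `DistributionSketch`, the method `expand`, and the assumption that a `labels` array maps each compact position to a declared class index are all introduced here for the example.

```java
public class DistributionSketch {

    /**
     * Spreads a compact probability array (one cell per class seen in training)
     * over the full set of declared classes; unseen classes keep probability 0.
     */
    public static double[] expand(double[] probEstimates, int[] labels, int numClasses) {
        double[] full = new double[numClasses];
        for (int k = 0; k < probEstimates.length; k++) {
            // labels[k] is assumed to be the declared class index of compact position k.
            full[labels[k]] = probEstimates[k];
        }
        return full;
    }

    public static void main(String[] args) {
        // Three declared classes, but class 1 never occurred in the training data.
        double[] compact = {0.7, 0.3};
        int[] labels = {0, 2};
        double[] full = expand(compact, labels, 3);
        System.out.println(java.util.Arrays.toString(full));
    }
}
```

Reading the compact array as if it were indexed by declared class would shift every probability after the missing class by one position, which is one way the two call paths could end up disagreeing.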

We also made a quick fix for this problem:
 * 1) Add this public method to the libsvm.svm_model class
 * 2) Make the following changes into  Method at
 > First line of the method:
code format="java"
int[] labels = new int[instance.numClasses()];
code
 > could be changed to
code format="java"
int[] labels = new int[((svm_model) m_Model).getNr_class()];
code
 > Last line in "if(m_ProbablityEstimates)" block:
code format="java"
prob_estimates = new double[instance.numClasses()];
code
 > could be changed to
code format="java"
prob_estimates = new double[((svm_model) m_Model).getNr_class()];
code
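The body of the getter from step 1 is not shown above. The sketch below assumes it is a plain accessor returning the model's class count, and uses a stand-in class instead of the real libsvm.svm_model so that it compiles on its own. `ModelSizeSketch`, `SvmModelStandIn` and `allocateEstimates` are hypothetical names; only `getNr_class` and the `nr_class` field name follow the text and libsvm.

```java
public class ModelSizeSketch {

    /** Stand-in for libsvm.svm_model, reduced to the one field that matters here. */
    static class SvmModelStandIn {
        int nr_class; // number of classes actually seen during training

        // Assumed shape of the getter referred to in step 1.
        public int getNr_class() {
            return nr_class;
        }
    }

    /** Sizes the probability array by the trained model, not by the declared classes. */
    public static double[] allocateEstimates(SvmModelStandIn model) {
        return new double[model.getNr_class()];
    }

    public static void main(String[] args) {
        SvmModelStandIn model = new SvmModelStandIn();
        model.nr_class = 21; // 22 declared classes, one of them never observed
        System.out.println("prob_estimates length: " + allocateEstimates(model).length);
    }
}
```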