Using Weka from Jython


Jython is an implementation of the high-level, dynamic, object-oriented language Python written in 100% Pure Java, and seamlessly integrated with the Java platform. It thus allows you to run Python on any Java platform.

//-- taken from the [|Jython homepage]//

This article explains how to use Weka classes from within Jython and how to write a classifier in Jython that can be used within the Weka framework.

= Accessing Weka classes from Jython =

== Requirements ==
In order for Jython to find the Weka classes, they must be on your CLASSPATH. Here is an example for adding the weka.jar located in the directory /some/where to the CLASSPATH in a bash shell under Linux:
code format="bash"
export CLASSPATH=$CLASSPATH:/some/where/weka.jar
code
**Note:** Windows users must use the backslash ("\") instead of the slash ("/") in paths at the command prompt.
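Whether the CLASSPATH is set up correctly can be checked from within Jython itself. The following is a minimal sketch, not part of the original examples; the function name weka_on_classpath is hypothetical. It simply attempts the import and reports the result, so under plain Python (where no Java classes exist) it reports that Weka was not found.

```python
def weka_on_classpath():
    """Return True if the Weka classes can be imported, False otherwise."""
    try:
        # Importing a Java class only succeeds under Jython
        # with weka.jar on the CLASSPATH.
        import weka.core.Instances
        return True
    except ImportError:
        return False


if weka_on_classpath():
    print("Weka classes found")
else:
    print("Weka classes not found; check that weka.jar is on the CLASSPATH")
```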

== Implementation ==
Once a class has been imported in a Jython module, it can be used just like in Java. E.g., if one wants to use the J48 classifier, one only needs to import it as follows:
code format="python"
import weka.classifiers.trees.J48 as J48
code
Here's a Jython module that loads a dataset, trains a J48 classifier with it and outputs the generated model (code based on [|this] blog entry):
code format="python"
import sys

import java.io.FileReader as FileReader
import weka.core.Instances as Instances
import weka.classifiers.trees.J48 as J48

# load data file
file = FileReader("/some/where/file.arff")
data = Instances(file)
data.setClassIndex(data.numAttributes() - 1)

# create the model
j48 = J48()
j48.buildClassifier(data)

# print out the built model
print j48
code
A slightly more elaborate example can be found in UsingJ48Ext.py (see Downloads), which uses more methods of the Weka API.

**NB:** The example needs Weka 3.6.x to run, due to some changes in the API.

= Implementing a Jython classifier =

== Requirements ==

 * Weka >3.5.6 (or snapshot from 02/08/2007 or later)
 * Jython 2.2rc2 (later versions should work as well)

== Implementation ==
This section covers the implementation of the Weka classifier ZeroR in Python (JeroR.py):
 * Subclass an abstract superclass of Weka classifiers (in this case Classifier) and also derive from JythonSerializableObject.
> //Note:// the JythonSerializableObject interface is necessary for serialization purposes (Weka creates copies of classifiers via serialization).
 * You have to implement the following methods:
   * listOptions: Returns an Enumeration of Option objects of all available options. Calling the superclass method is done with, e.g., Classifier.listOptions(self).
   * setOptions: Sets the command-line options, with the parameter being an array of strings.
   * getOptions: Returns an array of strings containing all the currently set options (to be used with setOptions).
   * getCapabilities: Returns a Capabilities object with information about what attributes and classes can be processed by this algorithm.
   * buildClassifier: This method builds the actual model based on the data provided. The first statements in this method should be the ones checking the capabilities of the algorithm against the data and removing all instances with a missing class value:
code format="python"
# check the capabilities
self.getCapabilities().testWithFail(instances)

# remove instances with missing class
instances = Instances(instances)
instances.deleteWithMissingClass()
code
   * at least **one** of the following two:
     * classifyInstance: Returns either the index of the predicted class label (for nominal classes) or the regression result (for numeric classes).
     * distributionForInstance: This method returns an array of doubles containing the probabilities for all class labels. In case of a numeric class attribute, the length of this array is 1. In Jython, you can use the jarray module to generate a double array; a call such as jarray.zeros(instance.numClasses(), 'd') creates an array of the correct length to be returned by this method (you still need to fill it with values). Of course, the elements of this array must sum up to 1.
   * toString: Returns a string describing the not-yet-built or built model.
 * The following code snippet simulates the "main" method; it creates an instance of the classifier and passes it on to the runClassifier method:
code format="python"
if __name__ == "__main__":
    Classifier.runClassifier(JeroR(), sys.argv[1:])
code
> This doesn't work right out-of-the-box, since Jython cannot access protected static methods in superclasses. One has to set the following value in the [|Jython registry] to make it work (taken from [|this] FAQ):
code format="ini"
python.security.respectJavaAccessibility=false
code
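The relationship between the two prediction methods described above (a class distribution whose elements sum up to 1, and the index of the predicted class label) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code of JeroR.py: the helper names to_distribution and predicted_class_index are hypothetical, the class counts are made-up inputs, and the uniform fallback for empty counts is a choice made for this sketch. Under Jython the result is converted with the jarray module; under plain Python a list is returned instead.

```python
try:
    import jarray  # Jython-only module for creating Java arrays
except ImportError:
    jarray = None  # plain Python: fall back to an ordinary list


def to_distribution(class_counts):
    """Turn per-class counts into probabilities that sum up to 1."""
    total = float(sum(class_counts))
    if total == 0:
        # Nothing counted yet: assume a uniform distribution
        # (an assumption of this sketch).
        probs = [1.0 / len(class_counts)] * len(class_counts)
    else:
        probs = [count / total for count in class_counts]
    if jarray is not None:
        return jarray.array(probs, 'd')  # Java double[] for Weka
    return probs


def predicted_class_index(distribution):
    """Index of the most probable class label, as a float (Weka uses doubles)."""
    best = 0
    for i in range(len(distribution)):
        if distribution[i] > distribution[best]:
            best = i
    return float(best)


dist = to_distribution([9, 5])
print(list(dist))
print(predicted_class_index(dist))
```

In a classifier along the lines of JeroR, a helper like to_distribution could be called from the method that returns the class probabilities, with the counts collected while building the model.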

== Documentation ==
Documentation in Python is done with the so-called //doc strings// within the class or method the documentation is for. Using [|HappyDoc], one can use //structured text// to output nice HTML, similar to [|Javadoc].
 * Class doc string:
code format="python"
class JeroR (Classifier, JythonSerializableObject):
    """
    JeroR is a Jython implementation of the Weka classifier ZeroR

    'author' -- FracPete (fracpete at waikato dot ac dot nz)

    'version' -- $Revision$
    """
code
> **Note:** the $Revision$ tag is filled in by a source control system like CVS or Subversion.
 * Method doc string:
code format="python"
def classifyInstance(self, instance):
    """
    returns the prediction for the given instance

    Parameter(s):

        'instance' -- the instance to predict the class value for

    Return:

        the prediction for the given instance
    """
code
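Doc strings are not only input for documentation generators; they are stored on the class or method and can be read at runtime through the __doc__ attribute. The short sketch below illustrates this with a hypothetical stand-in class (DocumentedExample is not part of Weka or of JeroR.py), so that it runs without Weka on the CLASSPATH.

```python
class DocumentedExample(object):
    """
    Hypothetical stand-in for a Jython classifier such as JeroR
    """

    def classifyInstance(self, instance):
        """
        returns the prediction for the given instance

        Parameter(s):

            'instance' -- the instance to predict the class value for
        """
        return 0.0


# Doc strings are attached to the class and the method themselves.
print(DocumentedExample.__doc__.strip())
print(DocumentedExample.classifyInstance.__doc__.strip().splitlines()[0])
```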

== Execution ==

**Note:** The commands listed here are for a Linux/Unix bash shell. For Windows, remove all the backslashes ("\") at the end of the lines and assemble the command on a single line. Under Windows, the path separator ":" used in the CLASSPATH needs to be replaced with ";" as well.

=== Jython ===
The Jython classifier, e.g., FunkyClassifier.py, can be run like this from the command line, with only the weka.jar and the jython.jar in the CLASSPATH:
code format="bash"
java -classpath weka.jar:jython.jar \
  org.python.util.jython \
  /some/place/FunkyClassifier.py \
  -t /some/where/file.arff
code

=== Weka ===
In order to execute the Jython classifier with Weka, one basically only needs to have the weka.jar and the jython.jar in the CLASSPATH and call the weka.classifiers.JythonClassifier classifier with the Jython classifier, i.e., FunkyClassifier.py, as parameter ("-J"):
code format="bash"
java -classpath weka.jar:jython.jar \
  weka.classifiers.JythonClassifier \
  -J /some/place/FunkyClassifier.py \
  -t /some/where/file.arff
code
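Because the CLASSPATH separator differs between Linux/Unix (":") and Windows (";"), it can be convenient to assemble the command from a script instead of typing it by hand. The following is a minimal sketch; the function name weka_command is hypothetical, and the jar and file locations are the placeholder paths used in this article. It only builds the argument list and does not start Java.

```python
import os


def weka_command(weka_jar, jython_jar, script, arff, pathsep=os.pathsep):
    """Assemble the java call that runs a Jython classifier through Weka."""
    # os.pathsep is ":" on Linux/Unix and ";" on Windows
    classpath = pathsep.join([weka_jar, jython_jar])
    return [
        "java", "-classpath", classpath,
        "weka.classifiers.JythonClassifier",
        "-J", script,  # the Jython classifier module
        "-t", arff,    # the training file
    ]


cmd = weka_command("weka.jar", "jython.jar",
                   "/some/place/FunkyClassifier.py",
                   "/some/where/file.arff")
print(" ".join(cmd))
```

The resulting list can be handed to a process launcher such as subprocess.call, which avoids the line-continuation differences between the shells altogether.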

= Downloads =
 * [[file:UsingJ48.py]]
 * [[file:UsingJ48Ext.py]]
 * [[file:JeroR.py]] - ZeroR implemented as a Jython script

= See also =
 * Use Weka in your Java code - for general information on how to use the Weka API
 * Using Weka via Jepp - using the Jepp approach to interface Java and Python

= Links =
 * Jython
   * [|Homepage]
   * [|Java arrays]
   * [|Registry]
   * [|Embedding Jython]
 * Python
   * [|Homepage]
   * [|HappyDoc] - generating documentation from Jython/Python modules
 * Java
   * [|Homepage]
   * [|Javadoc]
 * Eclipse
   * [|Homepage]
   * [|PyDev plugin]