Writing your own Classifier (up to 3.5.2)

If you have an idea for a new classifier and want to implement it for Weka, this HOWTO will help you develop it.

The Mindmap ([[file:Build_classifier.pdf]], produced with [|Freemind]) helps you decide which base classifier to start from, which methods to implement, and which general guidelines to follow.

The base classifiers are all located in the following package:
[[code format="text"]]
weka.classifiers
[[code]]

= Packages =
A few comments about the different classifier sub-packages:
 * bayes - contains Bayesian classifiers, e.g., NaiveBayes
 * evaluation - classes related to evaluation, e.g., cost matrix
 * functions - e.g., Support Vector Machines, regression algorithms, neural nets
 * lazy - no //offline// learning; learning is done at runtime, e.g., k-NN
 * meta - meta classifiers that use a //base// classifier as input, e.g., boosting or bagging
 * misc - various classifiers that don't fit in any other category
 * rules - rule-based classifiers, e.g., ZeroR
 * trees - tree classifiers, like decision trees

= Coding =

== Random number generators ==
To keep experiments repeatable, do not use //unseeded// random number generators such as Math.random(). Instead, instantiate a java.util.Random object with a specific seed value in the buildClassifier(Instances) method. The seed value can, of course, be supplied by the user, an option that all the abstract classifiers already implement.
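The seeding pattern described above can be sketched as follows. This is only a minimal, self-contained illustration and deliberately does not use the Weka API: the class name SeededSketch and its methods build, setSeed, getSeed and getOrder are hypothetical, and build merely stands in for a classifier's training method. The point is that the generator is created from the seed on every build, so two runs with the same seed behave identically.

```java
import java.util.Arrays;
import java.util.Random;

/**
 * Hypothetical stand-in for a classifier (assumption: not part of Weka).
 * build() plays the role of the training method.
 */
public class SeededSketch {
    /** Seed for the random number generator; can be set by the user. */
    private int seed = 1;

    /** Result of the last build: a shuffled order of instance indices. */
    private int[] order = new int[0];

    public void setSeed(int seed) { this.seed = seed; }

    public int getSeed() { return seed; }

    /** Creates a fresh, seeded generator on every call so that runs are repeatable. */
    public void build(int numInstances) {
        // Seeded generator; an unseeded one would make runs non-repeatable.
        Random random = new Random(seed);
        order = new int[numInstances];
        for (int i = 0; i < numInstances; i++) {
            order[i] = i;
        }
        // Fisher-Yates shuffle driven only by the seeded generator.
        for (int i = numInstances - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
    }

    public int[] getOrder() { return order.clone(); }

    public static void main(String[] args) {
        SeededSketch a = new SeededSketch();
        SeededSketch b = new SeededSketch();
        a.setSeed(42);
        b.setSeed(42);
        a.build(10);
        b.build(10);
        // Same seed, same shuffled order: prints "true"
        System.out.println(Arrays.equals(a.getOrder(), b.getOrder()));
    }
}
```

In a real classifier the same idea would apply inside the training method, with the seed exposed as a user option rather than hard-coded.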

= Integration =
After finishing the coding stage, it's time to integrate your classifier into the Weka framework, i.e., to make it available in the Explorer, Experimenter, etc. Starting with version **3.4.4**, Weka supports automatic discovery of derived classes on your classpath, managed by the //GenericPropertiesCreator//.

The GenericObjectEditor article shows you how to tell Weka where to find your classifier so that it is displayed in the //GenericObjectEditor//.

= Testing =
Weka already provides a test framework that checks the basic functionality of a classifier. Your classifier must pass these tests.

== Commandline test ==
Use the CheckClassifier class to test your classifier from the command line:
[[code format="bash"]]
java weka.classifiers.CheckClassifier -W classname [-- additional parameters]
[[code]]
Only the following tests may report "no" as their result; all others must report either "no (OK error message)" or "yes":
 * options
 * updateable classifier
 * weighted instances classifier

== Unit tests ==
To make sure that your classifier conforms to the Weka criteria, you should add it to the [|junit] unit test framework by creating a Test class for it. Starting with Weka versions 3.4.6 and 3.5.1, the unit tests use the CheckClassifier class to run a battery of tests.

Instructions for checking out the unit test framework are available separately.

= See also =
 * GenericObjectEditor/GenericPropertiesCreator
 * Writing your own Classifier (post 3.5.2)

= Links =
 * [[file:Build_classifier.pdf]] - MindMap for implementing a new classifier
 * [|Weka API]
 * [|Freemind]
 * [|junit]