Remote Experiment

Remote experiments enable you to distribute the computing load across multiple computers. The following sections discuss the setup and operation for HSQLDB and MySQL.

= Preparation =
To run a remote experiment you will need:
 * A database server.
 * A number of computers to run remote engines on.
 * To edit the remote engine policy file included in the Weka distribution to allow class and dataset loading from your home directory.
 * An invocation of the Experimenter on a machine somewhere (any will do).

For the following examples, we assume a user called //johndoe// with this setup:
 * Access to a set of computers running a flavour of Unix (pathnames need to be changed for Windows).
 * The home directory is located at /home/johndoe.
 * Weka is found in /home/johndoe/weka.
 * Additional jar archives, i.e., JDBC drivers, are stored in /home/johndoe/jars.
 * The directory for the datasets is.


> **Note:** The example policy file uses this setup (available in ).

= Database Server Setup =
 * HSQLDB
   * Download the JDBC driver for HSQLDB, extract the hsqldb.jar and place it in the /home/johndoe/jars directory.
   * To set up the database server, choose or create a directory to run the database server from, and start the server with:
code format="bash"
java -classpath /home/johndoe/jars/hsqldb.jar \
  org.hsqldb.Server -database.0 experiment -dbname.0 experiment
code
> **Note:** This starts a database with the alias //experiment// and creates a properties file and a log file, both prefixed with //experiment//, in the current directory.

 * MySQL
> We won't go into the details of setting up a MySQL server, but it is rather straightforward and includes the following steps:
   * Download a suitable version of MySQL for your server machine.
   * Start the MySQL server.
   * Create a database; for our example we will use //experiment// as the database name.
   * Download the appropriate JDBC driver, extract the JDBC jar and place it as mysql.jar in /home/johndoe/jars.
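The "create a database" step for MySQL can be sketched as a small shell helper. This is only a minimal sketch: the function name `create_db_sql` is hypothetical, and the usage shown afterwards assumes the standard `mysql` command-line client is installed and that you have an account allowed to create databases.

```shell
#!/bin/sh
# Hypothetical helper (not part of Weka or MySQL): prints the SQL statement
# that creates the results database if it does not exist yet.
create_db_sql() {
  db_name="${1:-experiment}"   # defaults to the database name used in this example
  printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db_name"
}
```

Under those assumptions, the statement could be sent to the server with something like `create_db_sql experiment | mysql -u root -p`. Printing the statement first makes it easy to review before anything is changed on the server.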

= Remote Engine Setup =
 * First, set up a directory for scripts and policy files:
code format="bash"
/home/johndoe/remote_engine
code
 * Unzip the (from the Weka distribution; or build it from the sources with ) into a temporary directory.
 * Next, copy the remoteEngine.jar to the /home/johndoe/remote_engine directory.
 * Create a script, called, with the following content (don't forget to make it executable with ):
   * HSQLDB
code format="bash"
java -Xmx256m \
  -classpath /home/johndoe/jars/hsqldb.jar:remoteEngine.jar \
  -Djava.security.policy=remote.policy \
  weka.experiment.RemoteEngine &
code
   * MySQL
code format="bash"
java -Xmx256m \
  -classpath /home/johndoe/jars/mysql.jar:remoteEngine.jar \
  -Djava.security.policy=remote.policy \
  weka.experiment.RemoteEngine &
code
   * From Weka 3.7.2 you will need to include the core weka.jar file in the classpath for the RemoteEngine. Assuming that the weka.jar file has been copied to the same directory as remoteEngine.jar:
code format="bash"
java -Xmx256m \
  -classpath /home/johndoe/jars/hsqldb.jar:remoteEngine.jar:weka.jar \
  -Djava.security.policy=remote.policy \
  weka.experiment.RemoteEngine &
code
 * Now we will start the remote engines (note that the same version of Java must be used for the Experimenter and the remote engines):
   * Copy the remote engine policy file from the Weka distribution to /home/johndoe/remote_engine as remote.policy.
   * For each machine you want to run an engine on:
     * Log in to the machine.
     * Change to the /home/johndoe/remote_engine directory.
     * Run the script (to enable the remote engines to use more memory, modify the -Xmx option in the script).
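The per-machine steps above can be sketched as a dry run that prints the command for each host instead of executing it. This is a sketch under stated assumptions: the helper name `engine_start_commands` and the script name `./startRemoteEngine` are hypothetical stand-ins (use whatever name you gave your start script), and it assumes the machines are reachable via `ssh` and share the /home/johndoe/remote_engine directory.

```shell
#!/bin/sh
# Sketch: print (not execute) the command that would start an engine on each host.
# ENGINE_DIR is the directory used in this example; START_SCRIPT is a
# hypothetical name for the start script created in the previous step.
ENGINE_DIR="/home/johndoe/remote_engine"
START_SCRIPT="./startRemoteEngine"

engine_start_commands() {
  for host in "$@"; do
    # The quoted string is what ssh would run on the remote host.
    printf 'ssh %s "cd %s && %s"\n' "$host" "$ENGINE_DIR" "$START_SCRIPT"
  done
}
```

For example, `engine_start_commands node1 node2` prints one `ssh` line per host (the host names are placeholders). Reviewing the printed commands before running them by hand keeps the sketch harmless; depending on your ssh setup, a session may stay open while the engine it started is running.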

= Configuring the Experimenter =
Now we will run the Experimenter:
 * HSQLDB
   * Copy the //DatabaseUtils.props.hsql// file to the /home/johndoe/remote_engine directory and rename it to //DatabaseUtils.props//; a copy comes with your Weka distribution in weka/experiment.
   * Edit this file and change the database URL entry to include the name of the machine that is running your database server.
   * Now start the Experimenter (inside this directory):
code format="bash"
java \
  -cp /home/johndoe/jars/hsqldb.jar:remoteEngine.jar:/home/johndoe/weka/weka.jar \
  -Djava.rmi.server.codebase=file:/home/johndoe/weka/weka.jar \
  weka.gui.experiment.Experimenter
code
 * MySQL
   * Copy the //DatabaseUtils.props.mysql// file to the /home/johndoe/remote_engine directory and rename it to //DatabaseUtils.props//; a copy comes with your Weka distribution in weka/experiment.
   * Edit this file and change the database URL entry to include the name of the machine that is running your database server and the name of the database the results will be stored in.
   * Now start the Experimenter (inside this directory):
code format="bash"
java \
  -cp /home/johndoe/jars/mysql.jar:remoteEngine.jar:/home/johndoe/weka/weka.jar \
  -Djava.rmi.server.codebase=file:/home/johndoe/weka/weka.jar \
  weka.gui.experiment.Experimenter
code
> **Note:** The database name //experiment// can still be changed in the Experimenter; this is just the default setup.
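Editing the database URL can also be scripted. The following is a minimal sketch, assuming the entry in //DatabaseUtils.props// is keyed `jdbcURL` (as it is, to my knowledge, in the props files shipped with Weka) and that the URLs follow the usual `jdbc:hsqldb:hsql://host/alias` and `jdbc:mysql://host:3306/database` forms. The helper name `set_jdbc_url` and the host name `dbserver.example.com` are hypothetical.

```shell
#!/bin/sh
# Sketch: rewrite the jdbcURL entry of a DatabaseUtils.props file.
# set_jdbc_url is a hypothetical helper, not part of Weka.
# Limitation: the URL must not contain '|' or '&', which are special to sed here.
set_jdbc_url() {
  props_file="$1"
  url="$2"
  # Write to a temporary file first so the original survives a failed sed run.
  sed "s|^jdbcURL=.*|jdbcURL=${url}|" "$props_file" > "${props_file}.tmp" &&
    mv "${props_file}.tmp" "$props_file"
}
```

A call such as `set_jdbc_url DatabaseUtils.props jdbc:hsqldb:hsql://dbserver.example.com/experiment` would point the Experimenter at an HSQLDB server on that (placeholder) host; for MySQL the second argument would be something like `jdbc:mysql://dbserver.example.com:3306/experiment`. All other entries in the file are left untouched.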

Now we will configure the experiment:
 * First of all, select the //Advanced// mode in the //Setup// tab.
 * Now choose the //DatabaseResultListener// in the //Destination// panel. Configure this result listener:
   * HSQLDB: supply the value **sa** for the username and leave the password empty.
   * MySQL: provide the username and password that you need for connecting to the database.
 * From the Result generator panel choose either the //CrossValidationResultProducer// or the //RandomSplitResultProducer// (these are the most commonly used ones) and then configure the remaining experiment details (e.g., datasets and classifiers).
 * Now enable the //Distribute Experiment// panel by checking the tick box.
 * Click on the //Hosts// button and enter the names of the machines that you started remote engines on ( adds the host to the list).
 * You can choose to distribute by run or dataset (try to get a balance).
 * Save your experiment configuration.
 * Now start your experiment as you would do normally.
 * Check your results in the //Analyse// tab by clicking either the //Database// or //Experiment// buttons.

= Multi-core support =
If you want to utilize all the cores on a multi-core machine, you can do so with Weka version 3.6.x and developer versions later than 3.5.7. All you have to do is define the port alongside the hostname in the Experimenter (format: hostname:port) and then start a RemoteEngine with the option that specifies the port to listen on. See also the related post on the Wekalist.
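As an illustration of the host entries for a multi-core machine, the sketch below prints one hostname:port pair per engine. The helper name `engine_host_entries`, the host name `node1` and the base port 5050 are hypothetical choices; any free ports should work as long as each engine listens on the port listed for it.

```shell
#!/bin/sh
# Sketch: list one host:port entry per engine on a multi-core machine.
# engine_host_entries is a hypothetical helper, not part of Weka.
engine_host_entries() {
  host="$1"
  base_port="$2"
  count="$3"
  i=0
  while [ "$i" -lt "$count" ]; do
    # Consecutive ports starting at base_port, one per engine.
    printf '%s:%d\n' "$host" $((base_port + i))
    i=$((i + 1))
  done
}
```

For instance, `engine_host_entries node1 5050 4` prints four entries, `node1:5050` through `node1:5053`, which can be entered in the //Hosts// dialog. One engine then has to be started per listed port on that machine.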

= Troubleshooting =
 * If you get an error at the start of an experiment that looks a bit like this:
> 
> then do not panic: this happens because multiple remote machines are trying to create the same table and are temporarily locked out. It will resolve itself, so just leave your experiment running; in fact, it is a sign that the experiment is working!
 * If you serialized an experiment and then modified your //DatabaseUtils.props// file due to an error (e.g., a missing type-mapping), the Experimenter will still use the //DatabaseUtils.props// you had at the time you serialized the experiment. Keep in mind that the serialization process also serializes the DatabaseUtils class and therefore stores your props file! This is another reason for storing your experiments as XML and not in the proprietary binary format that Java serialization produces.
 * Using a corrupt or incomplete //DatabaseUtils.props// file can cause peculiar interface errors, for example disabling the //User// button alongside the database URL. If in doubt, copy a clean //DatabaseUtils.props// from Subversion.
 * If you get in the Remote Engine do not be alarmed. This will have no effect on the results of your experiment.

= Links =
 * Databases
 * weka/experiment/DatabaseUtils.props
 * Subversion
 * HSQLDB
 * MySQL