CLASSPATH

See the CLASSPATH article for how to set up your CLASSPATH environment variable, in order to make the JDBC driver available for Weka.

Configuration files

Thanks to JDBC it is easy to connect to Databases that provide a JDBC driver. Responsible for the setup is the following properties file, located in the weka.experiment package:
 DatabaseUtils.props
You can get this properties file from the weka.jar or weka-src.jar jar-archive, both part of a normal Weka release. If you open up one of those files, you'll find the properties file in the sub-folder weka/experiment.

Weka comes with example files for a wide range of databases:
  • DatabaseUtils.props.hsql - HSQLDB (>= 3.4.1)
  • DatabaseUtils.props.msaccess - MS Access (> 3.4.14, > 3.5.8, > 3.6.0)
    see the Windows Databases article for more information.
  • DatabaseUtils.props.mssqlserver - MS SQL Server 2000 (>= 3.4.9, >= 3.5.4)
  • DatabaseUtils.props.mssqlserver2005 - MS SQL Server 2005 (>= 3.4.11, >= 3.5.6)
  • DatabaseUtils.props.mysql - MySQL (>= 3.4.9, >= 3.5.4)
  • DatabaseUtils.props.odbc - ODBC access via Sun's ODBC/JDBC bridge, e.g., for MS Sql Server (>= 3.4.9, >= 3.5.4)
    see the Windows Databases article for more information.
  • DatabaseUtils.props.oracle - Oracle 10g (>= 3.4.9, >= 3.5.4)
  • DatabaseUtils.props.postgresql - PostgreSQL 7.4 (>= 3.4.9, >= 3.5.4)
  • DatabaseUtils.props.sqlite3 - sqlite 3.x (> 3.4.12, > 3.5.7)

The easiest way is just to place the extracted properties file into your HOME directory. For more information on how property files are processed, check out this article.

Note: Weka only looks for the DatabaseUtils.props file. If you take one of the example files listed above, you need to rename it first.

Setup

Under normal circumstances you only have to edit the following two properties:
  • jdbcDriver
  • jdbcURL

Driver

jdbcDriver is the classname of the JDBC driver, necessary to connect to your database, e.g.:
  • HSQLDB
    org.hsqldb.jdbcDriver
  • MS SQL Server 2000 (Desktop Edition)
    com.microsoft.jdbc.sqlserver.SQLServerDriver
  • MS SQL Server 2005
    com.microsoft.sqlserver.jdbc.SQLServerDriver
  • MySQL
    org.gjt.mm.mysql.Driver (or com.mysql.jdbc.Driver)
  • ODBC - part of Sun's JDKs/JREs, no external driver necessary
    sun.jdbc.odbc.JdbcOdbcDriver
  • Oracle
    oracle.jdbc.driver.OracleDriver
  • PostgreSQL
    org.postgresql.Driver
  • sqlite 3.x
    org.sqlite.JDBC

URL

jdbcURL specifies the JDBC URL pointing to your database (can be still changed in the Experimenter/Explorer), e.g. for the database MyDatabase on the server server.my.domain:
  • HSQLDB
jdbc:hsqldb:hsql://server.my.domain/MyDatabase
  • MS SQL Server 2000 (Desktop Edition)
jdbc:microsoft:sqlserver://server.my.comain:1433
  • (Note: if you add ;databasename=db-name you can connect to a different database than the default one, e.g., MyDatabase)
  • MS SQL Server 2005
jdbc:sqlserver://server.my.domain:1433
  • MySQL
jdbc:mysql://server.my.domain:3306/MyDatabase
  • ODBC
jdbc:odbc:DSN_name (replace DSN_name with the DSN that you want to use)
  • Oracle (thin driver)
jdbc:oracle:thin:@server.my.domain:1526:orcl
  • (Note: @machineName:port:SID)
  • for the Express Edition you can use
jdbc:oracle:thin:@server.my.domain:1521:XE
  • PostgreSQL
jdbc:postgresql://server.my.domain:5432/MyDatabase
  • You can also specify user and password directly in the URL:
jdbc:postgresql://server.my.domain:5432/MyDatabase?user=<...>&password=<...>
  • where you have to replace the <...> with the correct values
  • sqlite 3.x
jdbc:sqlite:/path/to/database.db (you can access only local files)

Missing Datatypes

Sometimes (e.g. with MySQL) it can happen that a column type cannot be interpreted. In that case it is necessary to map the name of the column type to the Java type it should be interpreted as. E.g. the MySQL type TEXT is returned as BLOB from the JDBC driver and has to be mapped to String (0 represents String - the mappings can be found in the comments of the properties file):
 BLOB=0
The article weka/experiment/DatabaseUtils.props contains more details on this topic.

Stored Procedures

Let's say you're tired of typing the same query over and over again. A good way to shorten that, is to create a stored procedure.

PostgreSQL 7.4.x

The following example creates a procedure called emplyoee_name that returns the names of all the employees in table employee. Even though it doesn't make much sense to create a stored procedure for this query, nonetheless, it shows how to create and call stored procedures in PostgreSQL.
  • Create
 CREATE OR REPLACE FUNCTION public.employee_name()
  RETURNS SETOF text AS 'select name from employee'
  LANGUAGE 'sql' VOLATILE;
  • SQL statement to call procedure
 SELECT * FROM employee_name()
 java weka.experiment.InstanceQuery -Q "SELECT * FROM employee_name()" -U <user> -P <password>

Troubleshooting

  • In case you're experiencing problems connecting to your database, check out the WEKA Mailing List. It is possible that somebody else encountered the same problem as you and you'll find a post containing the solution to your problem.
  • Specific MS SQL Server 2000 Troubleshooting
  • MS SQL Server 2005: TCP/IP is not enabled for SQL Server, or the server or port number specified is incorrect.Verify that SQL Server is listening with TCP/IP on the specified server and port. This might be reported with an exception similar to: "The login has failed. The TCP/IP connection to the host has failed." This indicates one of the following:
    • SQL Server is installed but TCP/IP has not been installed as a network protocol for SQL Server by using the SQL Server Network Utility for SQL Server 2000, or the SQL Server Configuration Manager for SQL Server 2005
    • TCP/IP is installed as a SQL Server protocol, but it is not listening on the port specified in the JDBC connection URL. The default port is 1433.
    • The port that is used by the server has not been opened in the firewall
  • The Added driver: ... output on the commandline does not mean that the actual class was found, but only that Weka will attempt to load the class later on in order to establish a database connection.
  • The error message No suitable driver can be caused by the following:
    • The JDBC driver you are attempting to load is not in the CLASSPATH (Note: using -jar in the java commandline overwrites the CLASSPATH environment variable!). Open the SimpleCLI, run the command java weka.core.SystemInfo and check whether the property java.class.path lists your database jar. If not correct your CLASSPATH or the Java call you start Weka with.
    • The JDBC driver class is misspelled in the jdbcDriver property or you have multiple entries of jdbcDriver (properties files need unique keys!)
    • The jdbcURL property has a spelling error and tries to use a non-existing protocol or you listed it multiple times, which doesn't work either (remember, properties files need unique keys!)

See also


Links