Databases

toc = CLASSPATH = See the CLASSPATH article for how to set up your CLASSPATH environment variable, in order to make the JDBC driver available for Weka.

= Configuration files = Thanks to JDBC it is easy to connect to Databases that provide a JDBC driver. Responsible for the setup is the following properties file, located in the package: code format="text" DatabaseUtils.props code You can get this properties file from the or  jar-archive, both part of a normal Weka release. If you open up one of those files, you'll find the properties file in the sub-folder.

Weka comes with example files for a wide range of databases: > see the Windows Databases article for more information. > see the Windows Databases article for more information.
 * - HSQLDB (>= 3.4.1)
 * - MS Access (> 3.4.14, > 3.5.8, > 3.6.0)
 * - MS SQL Server 2000 (>= 3.4.9, >= 3.5.4)
 * - MS SQL Server 2005 (>= 3.4.11, >= 3.5.6)
 * - MySQL (>= 3.4.9, >= 3.5.4)
 * - ODBC access via Sun's ODBC/JDBC bridge, e.g., for MS Sql Server (>= 3.4.9, >= 3.5.4)
 * - Oracle 10g (>= 3.4.9, >= 3.5.4)
 * - PostgreSQL 7.4 (>= 3.4.9, >= 3.5.4)
 * - sqlite 3.x (> 3.4.12, > 3.5.7)

The easiest way is just to place the extracted properties file into your HOME directory. For more information on how property files are processed, check out this article.


 * Note:** Weka //only// looks for the file. If you take one of the example files listed above, you need to rename it first.

= Setup = Under normal circumstances you only have to edit the following two properties:

Driver
is the classname of the JDBC driver, necessary to connect to your database, e.g.: > > > > (or ) > > > >
 * HSQLDB
 * MS SQL Server 2000 (Desktop Edition)
 * MS SQL Server 2005
 * MySQL
 * ODBC - part of Sun's JDKs/JREs, no external driver necessary
 * Oracle
 * PostgreSQL
 * sqlite 3.x

URL
specifies the JDBC URL pointing to your database (can be still changed in the Experimenter/Explorer), e.g. for the database on the server : code format=text jdbc:hsqldb:hsql://server.my.domain/MyDatabase code code format=text jdbc:microsoft:sqlserver://server.my.comain:1433 code > (Note: if you add you can connect to a different database than the default one, e.g., //MyDatabase//) code format=text jdbc:sqlserver://server.my.domain:1433 code code format=text jdbc:mysql://server.my.domain:3306/MyDatabase code code format=text jdbc:odbc:DSN_name (replace DSN_name with the DSN that you want to use) code code format=text jdbc:oracle:thin:@server.my.domain:1526:orcl code > (Note: ) > for the //Express Edition// you can use code format=text jdbc:oracle:thin:@server.my.domain:1521:XE code code format=text jdbc:postgresql://server.my.domain:5432/MyDatabase code > You can also specify user and password directly in the URL: code format=text jdbc:postgresql://server.my.domain:5432/MyDatabase?user=<...>&password=<...> code > where you have to replace the with the correct values code format=text jdbc:sqlite:/path/to/database.db (you can access only local files) code
 * HSQLDB
 * MS SQL Server 2000 (Desktop Edition)
 * MS SQL Server 2005
 * MySQL
 * ODBC
 * Oracle (thin driver)
 * PostgreSQL
 * sqlite 3.x

= Missing Datatypes = Sometimes (e.g. with MySQL) it can happen that a column type cannot be interpreted. In that case it is necessary to map the name of the column type to the Java type it should be interpreted as. E.g. the MySQL type is returned as  from the JDBC driver and has to be mapped to  ( represents  - the mappings can be found in the comments of the properties file): code format="ini" BLOB=0 code The article weka/experiment/DatabaseUtils.props contains more details on this topic.

= Stored Procedures = Let's say you're tired of typing the same query over and over again. A good way to shorten that, is to create a stored procedure.

PostgreSQL 7.4.x
The following example creates a procedure called **emplyoee_name** that returns the names of all the employees in table **employee**. Even though it doesn't make much sense to create a stored procedure for this query, nonetheless, it shows how to create and call stored procedures in PostgreSQL. code format=text CREATE OR REPLACE FUNCTION public.employee_name RETURNS SETOF text AS 'select name from employee' LANGUAGE 'sql' VOLATILE; code code format=text SELECT * FROM employee_name code code format=text java weka.experiment.InstanceQuery -Q "SELECT * FROM employee_name" -U -P code
 * Create
 * SQL statement to call procedure
 * Retrieve data via

= Troubleshooting =
 * In case you're experiencing problems connecting to your database, check out the [|WEKA Mailing List]. It is possible that somebody else encountered the same problem as you and you'll find a post containing the solution to your problem.
 * Specific MS SQL Server 2000 Troubleshooting
 * MS SQL Server 2005: TCP/IP is not enabled for SQL Server, or the server or port number specified is incorrect.Verify that SQL Server is listening with TCP/IP on the specified server and port. This might be reported with an exception similar to: "The login has failed. The TCP/IP connection to the host has failed." This indicates one of the following:
 * SQL Server is installed but TCP/IP has not been installed as a network protocol for SQL Server by using the SQL Server Network Utility for SQL Server 2000, or the SQL Server Configuration Manager for SQL Server 2005
 * TCP/IP is installed as a SQL Server protocol, but it is not listening on the port specified in the JDBC connection URL. The default port is 1433.
 * The port that is used by the server has not been opened in the firewall
 * The **Added driver: ...** output on the commandline does not mean that the actual class was found, but only that Weka will //attempt// to load the class later on in order to establish a database connection.
 * The error message **No suitable driver** can be caused by the following:
 * The JDBC driver you are attempting to load is not in the CLASSPATH (Note: using **-jar** in the java commandline **overwrites** the CLASSPATH environment variable!). Open the SimpleCLI, run the command and check whether the property  lists your database jar. If not correct your CLASSPATH or the Java call you start Weka with.
 * The JDBC driver class is misspelled in the property or you have multiple entries of  (properties files need //unique// keys!)
 * The property has a spelling error and tries to use a non-existing protocol or you listed it multiple times, which doesn't work either (remember, properties files need //unique// keys!)

= See also =
 * weka/experiment/DatabaseUtils.props
 * Properties File
 * CLASSPATH

= Links =
 * HSQLDB
 * [|homepage]
 * IBM Cloudscape
 * [|homepage]
 * Microsoft SQL Server
 * [|SQL Server 2000 (Desktop Engine)]
 * [|SQL Server 2000 JDBC Driver SP 3]
 * [|SQL Server 2005 JDBC Driver]
 * MySQL
 * [|homepage]
 * [|JDBC driver]
 * Oracle
 * [|homepage]
 * [|JDBC driver]
 * [|JDBC FAQ]
 * PostgreSQL
 * [|homepage]
 * [|JDBC driver]
 * sqlite
 * [|homepage]
 * [|JDBC driver]
 * Weka Mailing list