
= Instance ID =

People often want to **tag** their **instances with identifiers**, so they can keep track of them and the predictions made on them.

== Adding the ID ==
Adding a new ID attribute is easy: run the AddID filter over the dataset. Here's an example (at a DOS/Unix command prompt, all on a single line):
[[code format="bash"]]
java weka.filters.unsupervised.attribute.AddID -i data_without_id.arff -o data_with_id.arff
[[code]]


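**Note:** the AddID filter adds a numeric attribute, not a String attribute, to the dataset. If you want to remove this ID attribute again for the classifier in a FilteredClassifier environment, use the Remove filter instead of the RemoveType filter (same package). RemoveType removes String attributes by default, so it would leave the numeric ID in place.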
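The numeric-ID case from the note above can be sketched as a command. This is only a minimal sketch: it assumes that weka.jar is on the CLASSPATH, that the ID sits at attribute index 1 (the default position used by AddID), and that the hypothetical files train_with_id.arff and test_with_id.arff exist. The Remove filter with -R 1 hides the ID from J48, while -p 1 prints it next to each prediction.

```shell
# Hypothetical file names; both are assumed to carry the ID as attribute 1.
TRAIN=train_with_id.arff
TEST=test_with_id.arff

# Remove takes an attribute range via -R; here only the ID column is dropped
# before the data reaches the base classifier.
FILTER="weka.filters.unsupervised.attribute.Remove -R 1"

if [ -f "$TRAIN" ] && [ -f "$TEST" ]; then
  # Requires weka.jar on the CLASSPATH.
  java weka.classifiers.meta.FilteredClassifier -F "$FILTER" -W weka.classifiers.trees.J48 -t "$TRAIN" -T "$TEST" -p 1
else
  echo "Skipping: $TRAIN or $TEST not found" >&2
fi
```

The filter specification is quoted so that its own option (-R 1) is passed to the filter rather than to FilteredClassifier.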

== Removing the ID ==
If you run from the command line, you can use the -p option to output the predictions plus any other attributes you are interested in. So it is possible to have a String attribute in your data that acts as an identifier. A problem is that most classifiers cannot handle String attributes, but you can get around this by wrapping the classifier in a FilteredClassifier that uses the RemoveType filter (this removes //String// attributes by default).

Here's an example. Let's say you have a training file named train.arff and a testing file named test.arff, and both have an identifier String attribute as their 5th attribute. You can get the predictions from J48 along with the identifier strings by issuing the following command (at a DOS/Unix command prompt, all on a single line):
[[code format="bash"]]
java weka.classifiers.meta.FilteredClassifier -F weka.filters.unsupervised.attribute.RemoveType -W weka.classifiers.trees.J48 -t train.arff -T test.arff -p 5
[[code]]
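Since -p takes an attribute index (counted from 1), it can be worth confirming where the identifier sits in the ARFF header before running the command. The following is a rough sketch that uses only standard Unix tools, not Weka. The sample file content and the attribute names (a1 to a4, id, class) are made up for illustration, and the awk step assumes attribute names contain no spaces.

```shell
# Write a small, made-up ARFF file to a temporary location (illustration only).
ARFF=$(mktemp)
cat > "$ARFF" <<'EOF'
@relation demo
@attribute a1 numeric
@attribute a2 numeric
@attribute a3 numeric
@attribute a4 numeric
@attribute id string
@attribute class {yes,no}
@data
1.0,2.0,3.0,4.0,inst_001,yes
2.0,1.0,0.5,3.5,inst_002,no
EOF

# List the attribute declarations with their 1-based position, name and type.
ATTRS=$(grep -i '^@attribute' "$ARFF" | awk '{ print NR ": " $2 " (" $3 ")" }')
echo "$ATTRS"

rm -f "$ARFF"
```

For this sample, the fifth line printed is `5: id (string)`, which matches the -p 5 used in the command above.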

If you want, you can redirect the output to a file by adding a redirection, for example "> output.txt", to the end of the line.

In the Explorer GUI you can try a similar trick with String attribute identifiers. Choose the FilteredClassifier, with RemoveType as the filter, and whatever classifier you prefer. When you visualize the results, you will need to click through each instance to see the identifier listed for it.