Input Format

The data instance file used by lktrain and lkpredict looks as follows:

Iris-setosa | sepal-length:4.7 sepal-width:3.2 petal-length:1.6 petal-width:0.2
Iris-versicolor | sepal-length:5.0 sepal-width:2.3 petal-length:3.3 petal-width:1.0
...
Iris-versicolor | sepal-length:6.0 sepal-width:2.7 petal-length:5.1 petal-width:1.6
...

That is, each line holds one data instance: the correct label, then a vertical bar surrounded by spaces, then the space-separated features. Each feature consists of a feature name and a value, separated by a colon. The colon and value can be omitted, in which case the value defaults to 1.0.
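To make the line layout concrete, here is a minimal Python sketch of a reader for one unweighted line. It is not part of lktrain or lkpredict; the function name `parse_instance` is hypothetical, and the sketch assumes that feature names contain neither whitespace nor colons.

```python
def parse_instance(line):
    """Parse 'label | name:value name ...' into (label, features).

    Hypothetical helper for illustration; assumes no importance weight
    and feature names without whitespace or colons.
    """
    # Everything left of the first vertical bar is the label.
    label_part, sep, feature_part = line.partition("|")
    if not sep:
        raise ValueError("missing '|' separator: %r" % line)
    label = label_part.strip()

    features = {}
    for token in feature_part.split():
        name, colon, value = token.partition(":")
        # A bare feature name (colon and value omitted) defaults to 1.0.
        features[name] = float(value) if colon else 1.0
    return label, features


label, features = parse_instance(
    "Iris-setosa | sepal-length:4.7 sepal-width:3.2 petal-length:1.6 petal-width:0.2"
)
```

With the first example line above, `label` is `"Iris-setosa"` and `features["petal-width"]` is `0.2`. A line such as `spam | free offer:2` would give `free` the default value 1.0.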

An optional importance weight can be specified after the label, e.g.

Iris-setosa 1.5 | sepal-length:4.7 sepal-width:3.2 petal-length:1.6 petal-width:0.2
Iris-versicolor 0.1 | sepal-length:5.0 sepal-width:2.3 petal-length:3.3 petal-width:1.0
...
Iris-versicolor 4 | sepal-length:6.0 sepal-width:2.7 petal-length:5.1 petal-width:1.6
...

The data instance is weighted accordingly during training; for integer weights, this is similar to duplicating the instance that many times.
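The part before the vertical bar can be read as sketched below. This is an illustration only, not code from the tool: `parse_header` is a hypothetical name, and the default weight of 1.0 for lines without an explicit weight is an assumption that the text above does not state.

```python
def parse_header(header):
    """Split 'label' or 'label weight' into (label, weight).

    Hypothetical helper; the default weight of 1.0 is an assumption.
    """
    parts = header.split()
    if len(parts) == 1:
        return parts[0], 1.0  # assumed default when no weight is given
    if len(parts) == 2:
        return parts[0], float(parts[1])
    raise ValueError("expected 'label' or 'label weight': %r" % header)


line = "Iris-versicolor 4 | sepal-length:6.0 sepal-width:2.7 petal-length:5.1 petal-width:1.6"
# Take the text before the first vertical bar and split off the weight.
label, weight = parse_header(line.partition("|")[0])
```

Here `label` is `"Iris-versicolor"` and `weight` is `4.0`, so this instance would count roughly like four copies of the same line.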
