This repository was archived by the owner on Jul 16, 2021. It is now read-only.
Continuing the discussion on model traits started in the comments for #120.
Suggestions for possible changes:
Separate the notions of modelling method and model
(Or model and fit, or trainer and model, or some other names to be decided.)
The new functions would have signatures along the lines of:
Trainer::train : training data -> Model
Trainer::predict : (data, Model) -> Prediction
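To make the two signatures above concrete, here is a rough sketch of how the split might look. The trait shape, the associated types, and the `MeanTrainer` / `MeanModel` names are all hypothetical, invented for illustration; none of this is existing library code.

```rust
/// Hypothetical trait: the training algorithm plus its settings.
/// All names and associated types here are assumptions for illustration.
pub trait Trainer {
    type Input;
    type Target;
    /// The fit data produced by training (what serialisation would store).
    type Model;

    /// training data -> Model
    fn train(&self, inputs: &[Self::Input], targets: &[Self::Target]) -> Self::Model;

    /// (data, Model) -> Prediction
    fn predict(&self, inputs: &[Self::Input], model: &Self::Model) -> Vec<Self::Target>;
}

/// Toy trainer that always predicts the mean of the training targets.
/// It holds no fitted state of its own.
pub struct MeanTrainer;

/// The fit data for `MeanTrainer`: just the learned mean.
#[derive(Debug, Clone, PartialEq)]
pub struct MeanModel {
    pub mean: f64,
}

impl Trainer for MeanTrainer {
    type Input = f64;
    type Target = f64;
    type Model = MeanModel;

    fn train(&self, _inputs: &[f64], targets: &[f64]) -> MeanModel {
        // Avoid dividing by zero when no targets are given.
        let n = targets.len().max(1) as f64;
        MeanModel { mean: targets.iter().sum::<f64>() / n }
    }

    fn predict(&self, inputs: &[f64], model: &MeanModel) -> Vec<f64> {
        // One prediction per input row; this toy model ignores the inputs.
        inputs.iter().map(|_| model.mean).collect()
    }
}
```

Since `predict` needs a `MeanModel` and the only way to get one in this sketch is to call `train`, predicting with an untrained model is a compile-time error rather than a runtime one.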
Advantages:
Removes the possibility of calling predict on an untrained model.
Clearly delineates the model fit data from the training algorithm. This will be useful when implementing serialisation.
Drawbacks:
API could be less intuitive for people familiar with popular current machine learning libraries.
Further questions:
How should online training be dealt with? What about updating models to take account of new data without completely retraining (e.g. updating class distributions in the leaves of a random forest)?
Associated types
Currently SupModel<T, U> takes its input and output types as type parameters. Should they be associated types instead?
Advantages:
Fits more closely with the typical meanings of type parameters vs associated types in Rust (I think!). In general, the users of a model don't get to choose the types it acts on; these are determined by the model itself. Some models do give the user some say over the types used, but in that case these should be parameters of the specific model rather than of the trait.
Shrinks the signatures of functions which are generic over models: f<M, T, U> where M: SupModel<T, U> becomes f<M: SupModel>.
Drawbacks:
I can't think of any off the top of my head, but I may well be missing something.
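A side-by-side sketch of the two styles may help here. The traits below are simplified stand-ins with only a `predict` method, and the names `SupModelParams`, `SupModelAssoc`, and `Threshold` are made up for this example; this is not the library's actual trait definition.

```rust
/// Simplified stand-in for the current style: input and output types
/// are type parameters of the trait.
pub trait SupModelParams<T, U> {
    fn predict(&self, input: &T) -> U;
}

/// Simplified stand-in for the proposed style: input and output types
/// are associated types fixed by each implementation.
pub trait SupModelAssoc {
    type Input;
    type Output;
    fn predict(&self, input: &Self::Input) -> Self::Output;
}

/// Generic over a model in the current style: three type parameters.
pub fn predict_all_params<M, T, U>(model: &M, inputs: &[T]) -> Vec<U>
where
    M: SupModelParams<T, U>,
{
    inputs.iter().map(|x| model.predict(x)).collect()
}

/// Generic over a model in the proposed style: one type parameter.
pub fn predict_all_assoc<M: SupModelAssoc>(model: &M, inputs: &[M::Input]) -> Vec<M::Output> {
    inputs.iter().map(|x| model.predict(x)).collect()
}

/// Toy model: classifies a number as above or below a cutoff.
pub struct Threshold {
    pub cutoff: f64,
}

impl SupModelParams<f64, bool> for Threshold {
    fn predict(&self, input: &f64) -> bool {
        *input > self.cutoff
    }
}

impl SupModelAssoc for Threshold {
    type Input = f64;
    type Output = bool;
    fn predict(&self, input: &f64) -> bool {
        *input > self.cutoff
    }
}
```

One difference worth noting: with type parameters a single type could implement the trait several times for different `T` and `U`, whereas with associated types each model gets exactly one input/output pairing. Whether losing that flexibility counts as a drawback presumably depends on the models in question.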
Other bits and pieces
Algorithms using randomness should always let the user provide a seed. Otherwise regression testing becomes impossible.
Should we consider the algebraic traits used in HLearn? This might give us efficiency wins in some cases, but might also scare away potential users.
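On the seeding point above, a minimal sketch of what a user-provided seed buys us. To keep it dependency-free, the example uses a small hand-written SplitMix64 generator as a stand-in for a real seedable RNG, and `bootstrap_indices` is a hypothetical helper, not an existing function.

```rust
/// Small SplitMix64 generator, used only so this sketch needs no external
/// crates. A real implementation would more likely use a seedable RNG
/// from an RNG crate.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        SplitMix64 { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: draws `n` row indices with replacement, as a
/// random forest might when bootstrapping its training set.
/// The seed is an explicit argument, so the result is reproducible.
pub fn bootstrap_indices(n: usize, seed: u64) -> Vec<usize> {
    let mut rng = SplitMix64::new(seed);
    // Modulo reduction is slightly biased; good enough for a sketch.
    // When n is 0 the range is empty, so the modulo is never evaluated.
    (0..n).map(|_| (rng.next_u64() % n as u64) as usize).collect()
}
```

With the seed passed in explicitly, two runs with the same seed draw the same sample, so a regression test can compare fitted models or predictions exactly.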
Caveat: I have very little Rust experience, so I don't really know what I'm talking about!