This repository was archived by the owner on Jul 16, 2021. It is now read-only.

Adding error handling to model traits #120

Merged: AtheMathmo merged 4 commits into master from error-handling on Sep 14, 2016

Conversation

@AtheMathmo
Owner

This PR adds some error handling to the SupModel and UnSupModel traits. It's flagged as WIP as I have a few tweaks planned, but the core content is ready for review and I'd appreciate some feedback.

Note that I haven't attempted to cover all possible panics within the train and predict functions. I think it is non-trivial to decide when such things should occur, so I'd prefer to take a lightweight approach for now and add them in as we go (as this will not cause breakage in the API). With that said, if you see something here that should be returning an Error, please comment so I can address it or create an issue.

@theotherphil
Contributor

Have you considered separating the notions of model and modelling method (or 'trainer')? That would get rid of the 'model not trained' errors at the type level.

```
Trainer::train   : training data -> Model
Trainer::predict : (data, Model) -> Prediction
```

This would also help make serialisation clearer when that's implemented: models become just lumps of data, and they're the things that get serialised. Modelling methods/trainers/other-better-name run algorithms to produce models from training data, and to produce predictions from data and a model.
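A minimal sketch of the proposed split might look like the following. All names and signatures here are hypothetical illustrations, not rusty-machine's actual API; the "model" is deliberately trivial (a mean predictor) to show that a model becomes just a lump of data once the training logic lives on the trainer.

```rust
// Hypothetical sketch of the trainer/model split; names are illustrative.

/// A modelling method: knows how to fit a model and to run predictions with one.
trait Trainer {
    type Model;
    fn train(&self, inputs: &[f64], targets: &[f64]) -> Self::Model;
    fn predict(&self, model: &Self::Model, inputs: &[f64]) -> Vec<f64>;
}

/// Trivial example: a mean predictor. The fitted model is pure data,
/// so serialising it would be straightforward.
struct MeanTrainer;

struct MeanModel {
    mean: f64,
}

impl Trainer for MeanTrainer {
    type Model = MeanModel;

    fn train(&self, _inputs: &[f64], targets: &[f64]) -> MeanModel {
        // Fit by averaging the targets.
        MeanModel { mean: targets.iter().sum::<f64>() / targets.len() as f64 }
    }

    fn predict(&self, model: &MeanModel, inputs: &[f64]) -> Vec<f64> {
        // Predict the fitted mean for every input.
        vec![model.mean; inputs.len()]
    }
}

fn main() {
    let trainer = MeanTrainer;
    let model = trainer.train(&[1.0, 2.0, 3.0], &[2.0, 4.0, 6.0]);
    let preds = trainer.predict(&model, &[10.0, 20.0]);
    assert_eq!(preds, vec![4.0, 4.0]);
    println!("{:?}", preds);
}
```

Because `predict` cannot be called without a `Model` value, the "model not trained" failure mode is ruled out by the type system rather than checked at runtime.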

@AtheMathmo
Owner Author

@theotherphil thanks for the comment! Someone else recommended this a while ago, and at the time I was really excited about it, but a couple of things prevented me from moving forward.

  • Sometimes we want to train a model multiple times, e.g. with online learning, or in some model validation situations. This isn't a killer though; we could just make a new trait that acts on Models (ReTrainable or something).
  • Though it would make a lot of sense to the Rust crowd, I worry it would be overwhelming to users coming from elsewhere (which right now is almost all of our potential users). We have the advantage right now of feeling similar to other frameworks.
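The `ReTrainable` idea in the first bullet could be sketched roughly as below. The trait name comes from the comment, but the signature and the running-mean example are hypothetical illustrations only.

```rust
// Illustrative sketch of a "ReTrainable" trait for models that can
// incorporate new data after the initial fit (e.g. online learning).
// Names and signatures are hypothetical.

trait ReTrainable {
    fn retrain(&mut self, inputs: &[f64], targets: &[f64]);
}

/// A running-mean "model": retraining folds new targets into the mean,
/// equivalent to having fit on all data seen so far.
struct RunningMean {
    mean: f64,
    count: usize,
}

impl ReTrainable for RunningMean {
    fn retrain(&mut self, _inputs: &[f64], targets: &[f64]) {
        for &t in targets {
            // Incremental (Welford-style) mean update.
            self.count += 1;
            self.mean += (t - self.mean) / self.count as f64;
        }
    }
}

fn main() {
    let mut m = RunningMean { mean: 0.0, count: 0 };
    m.retrain(&[], &[2.0, 4.0]);
    m.retrain(&[], &[6.0]);
    assert!((m.mean - 4.0).abs() < 1e-12);
    println!("{}", m.mean);
}
```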

Honestly, at this point in time the first point feels like something we could work around cleanly enough. As for the second, I think I'd like to try converting an existing model to see what it looks like and how complex the new API feels. Maybe I'll try to get some eyes on it before merging to see how people feel about it.

There were some other issues I ran into while trying this before but none were too severe from memory.

@tafia
Contributor

tafia commented Sep 2, 2016

Can't we have both alternatives: a wrapper over the Trainer which holds an Option<Model> for newcomers, and the plain Trainer for the others?
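A rough sketch of this wrapper idea, with entirely hypothetical names, might look like the following. It also illustrates the objection raised in the reply below it: the wrapper reintroduces an "untrained" state that must be checked at runtime.

```rust
// Hypothetical sketch of wrapping a trainer with an Option<Model>;
// names are illustrative, not rusty-machine's API.

/// Bare trainer API: training returns a model value directly.
struct Trainer;

struct Model {
    mean: f64,
}

impl Trainer {
    fn train(&self, targets: &[f64]) -> Model {
        Model { mean: targets.iter().sum::<f64>() / targets.len() as f64 }
    }

    fn predict(&self, model: &Model) -> f64 {
        model.mean
    }
}

/// Newcomer-friendly wrapper that stores the model internally,
/// mimicking the familiar train-then-predict workflow.
struct WrappedTrainer {
    trainer: Trainer,
    model: Option<Model>,
}

impl WrappedTrainer {
    fn train(&mut self, targets: &[f64]) {
        self.model = Some(self.trainer.train(targets));
    }

    /// Fails at runtime if `train` was never called, which is exactly
    /// the "model not trained" problem resurfacing.
    fn predict(&self) -> Result<f64, &'static str> {
        match &self.model {
            Some(m) => Ok(self.trainer.predict(m)),
            None => Err("model has not been trained"),
        }
    }
}

fn main() {
    let mut w = WrappedTrainer { trainer: Trainer, model: None };
    assert!(w.predict().is_err());
    w.train(&[1.0, 3.0]);
    assert_eq!(w.predict(), Ok(2.0));
}
```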

@AtheMathmo
Owner Author

@tafia - Yeah, that is certainly an option too, though I'm not sure I'd like both options to exist like that. If we make the change it should be to enforce safety with the type system, but with a wrapper we end up with the same problem whenever the API is invoked directly on it.

@theotherphil
Contributor

@AtheMathmo: your first point sounds more like an argument in favour of splitting the model out. As a user I wouldn't be sure what calling train twice is supposed to do. Before you mentioned retraining, I'd have guessed that calling train twice should either panic or retrain from scratch on the new data only. But in some cases it might mean:

  • 'train a new model on the data and combine it with the old model, equivalently to having trained from scratch using all the data seen thus far' (e.g. if fitting a normal to your data you can just take a weighted combination of the two fitted normals),
  • 'update the model to reflect new data without a full refit' (e.g. just updating the leaf distributions for a random forest, without retraining the node functions), or
  • 'train from scratch using all the new data and all the previous training data we've seen, because we've cached it all somewhere'.

It should be fairly easy to make these distinct in the type system, but at the cost of more surface complexity.

I think offering a wrapper option could lead to a lot of initial confusion for people learning the library.

@AtheMathmo
Owner Author

@theotherphil I agree with your points entirely! However, I wasn't very clear about what I meant. I avoided the change before because I couldn't see how to handle the code organisation if we require training functionality on models that have already been trained (like the cases you discussed above). We'd like to reuse much of the code in the training algorithm, but it seems difficult to organise this effectively.

I think it still makes sense to explore this change and we can figure out the online training complications later.

@AtheMathmo AtheMathmo changed the title from "[WIP] Adding error handling to model traits" to "Adding error handling to model traits" on Sep 3, 2016
@AtheMathmo
Owner Author

AtheMathmo commented Sep 3, 2016

I'm fairly happy with this PR now. I decided to go with roughly the following criteria:

  • Any unexpected internal algorithmic failures are caught in debug assertions, e.g. algorithm-determined dimensions failing to align.
  • Any user programming errors are caught in panics (normally via assertions).
  • Any failures due to poor user trait implementations, bad data, or algorithmic failures outside the user's control are returned as Errors.

I'd be happy to receive feedback on places where it seems I'm not following this!
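The three criteria above might be sketched as follows in a single toy `train` function. The function and error names are hypothetical illustrations, not rusty-machine's actual code.

```rust
// Illustrative sketch of the three failure categories described above;
// names are hypothetical, not rusty-machine's API.

#[derive(Debug, PartialEq)]
enum Error {
    InvalidData(&'static str),
}

fn train(inputs: &[f64], targets: &[f64]) -> Result<f64, Error> {
    // User programming error: mismatched argument lengths -> panic via assertion.
    assert_eq!(
        inputs.len(),
        targets.len(),
        "inputs and targets must have equal length"
    );

    // Failure outside the user's control (here: bad data) -> returned as an Error.
    if targets.is_empty() {
        return Err(Error::InvalidData("cannot train on an empty data set"));
    }

    let mean = targets.iter().sum::<f64>() / targets.len() as f64;

    // Unexpected internal algorithmic failure: checked only in debug builds.
    debug_assert!(mean.is_finite(), "internal error: non-finite mean");

    Ok(mean)
}

fn main() {
    assert_eq!(train(&[1.0, 2.0], &[2.0, 4.0]), Ok(3.0));
    assert!(train(&[], &[]).is_err());
    println!("ok");
}
```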

@AtheMathmo AtheMathmo mentioned this pull request Sep 3, 2016
@theotherphil theotherphil mentioned this pull request Sep 3, 2016
Comment thread on src/learning/gp.rs (outdated):

```diff
-        return (post_mean, post_var);
+        Ok((post_mean, post_var))
+    } else {
+        Err(Error::new(ErrorKind::UntrainedModel, "The model has not been trained."))
```
Contributor


This is really happening a lot. Can't we have a special constructor, or a more advanced ErrorKind that optionally carries a &'static str message?

Owner Author


I more or less copied the std::io implementation, but I agree that this case doesn't really fit it. I'll see how easy it is to modify - but it seems this particular error will disappear if we modify the model traits.

@AtheMathmo
Owner Author

@tafia - I added a new constructor for the untrained model errors. We may end up changing the API to remove these completely though.
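Such a constructor might look like the sketch below. The constructor name and error layout are hypothetical; only the `ErrorKind::UntrainedModel` variant and its message appear in the PR's diff above.

```rust
// Hypothetical sketch of a dedicated constructor for the repeated
// "untrained model" error; the layout and `new_untrained` name are
// illustrative, not rusty-machine's actual code.

#[derive(Debug, PartialEq)]
enum ErrorKind {
    UntrainedModel,
}

#[derive(Debug, PartialEq)]
struct Error {
    kind: ErrorKind,
    message: &'static str,
}

impl Error {
    fn new(kind: ErrorKind, message: &'static str) -> Error {
        Error { kind, message }
    }

    /// Shorthand so call sites need not repeat the same kind and message.
    fn new_untrained() -> Error {
        Error::new(ErrorKind::UntrainedModel, "The model has not been trained.")
    }
}

fn main() {
    let e = Error::new_untrained();
    assert_eq!(e.kind, ErrorKind::UntrainedModel);
    println!("{:?}", e);
}
```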

@tafia
Contributor

tafia commented Sep 9, 2016

To be honest, while it is better it still doesn't look good. But it doesn't really matter; better to have results merged.


@AtheMathmo
Owner Author

@tafia I feel the same way. But if you have any suggestions on how I can make it better without overhauling the error module I'd be happy to do so.

Otherwise I'll merge this in (at least it's no worse than our current approach of panicking with a "train" message!)

@AtheMathmo AtheMathmo merged commit 5fb0c57 into master Sep 14, 2016
@tafia
Contributor

tafia commented Sep 14, 2016

Sorry for not answering; I have very limited internet access. I don't have any good alternative either. You were right to merge it.


@AtheMathmo AtheMathmo deleted the error-handling branch September 17, 2016 17:19