Skip to content

prefit xgboost model in xrf_fit(), allow early stopping#60

Merged
simonpcouch merged 4 commits into
mainfrom
prefit
Jun 8, 2022
Merged

prefit xgboost model in xrf_fit(), allow early stopping#60
simonpcouch merged 4 commits into
mainfrom
prefit

Conversation

@simonpcouch
Copy link
Copy Markdown
Contributor

This PR:

  • Adds support for early stopping with xrf::xrf()
  • Corrects routing of mtry from colsample_bytree to colsample_bynode (and is now thus consistent with parsnip, see mtry maps to wrong parameter for XGBoost parsnip#495)
  • De-duplicates xgboost training code between parsnip and rules
  • Extends testing for xrf engine and removes mtry tests that are now redundant with parsnip

It does so by making use of the prefit_xgb argument to xrf::xrf()—instead of having xrf handle the xgboost training, we use parsnip::xgb_train and pass the pre-fit xgboost model to xrf.

Justification for pre-fitting The pre-fitting makes some parts of wrapping easier for us, but was initially a need for early stopping:

xrf::xrf.formula() wraps both xgboost::xgb.train() and glmnet::glmnet(). xrf::xrf.formula() takes in an xgb_control argument, where all arguments passed to xgb_control are passed to xgboost::xbg.train()'s param argument—this is currently the only way one can pass arguments to xgboost::xgb.train() through xrf::xrf.formula().

This is an issue for early stopping, as early_stopping_rounds is a main argument to xgboost::xgb.train() and cannot be passed through param. Thus, if we wanted to implement an interface for early stopping, we'd either:

  • need to contribute to the xrf package to allow for passing arguments to xrf::xrf.formula() as main arguments to xgboost::xbg.train()
  • "prefit" the xgboost model with our own machinery and pass it as prefit_xgb to xrf::xrf.formula().

Related to tidymodels/parsnip#749!

@simonpcouch simonpcouch requested a review from EmilHvitfeldt June 7, 2022 15:06
Copy link
Copy Markdown
Member

@EmilHvitfeldt EmilHvitfeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small changes to the tests and I think we are good here!

Comment thread tests/testthat/test-rule-fit-regression.R
Comment thread tests/testthat/test-rule-fit-regression.R Outdated
Comment thread tests/testthat/test-rule-fit-regression.R Outdated
@simonpcouch
Copy link
Copy Markdown
Contributor Author

Ah, figured it out🙈 Wrapping in suppressMessages pending holub008/xrf#21 merged + on CRAN.

@simonpcouch simonpcouch requested a review from EmilHvitfeldt June 8, 2022 18:39
@simonpcouch simonpcouch merged commit 1c12e26 into main Jun 8, 2022
@simonpcouch simonpcouch deleted the prefit branch June 8, 2022 20:16
@github-actions
Copy link
Copy Markdown

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions Bot locked and limited conversation to collaborators Jun 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants