[BEAM-13972] Add RunInference interface #16917
Conversation
Codecov Report
@@ Coverage Diff @@
## master #16917 +/- ##
===========================================
+ Coverage 46.89% 83.63% +36.74%
===========================================
Files 204 453 +249
Lines 20122 62528 +42406
===========================================
+ Hits 9436 52295 +42859
- Misses 9686 10233 +547
+ Partials 1000 0 -1000
Flags with carried forward coverage won't be shown. Continue to review the full report at Codecov.
    raise NotImplementedError("Please implement _validate_model")

    class PytorchModel(RunInferenceModel):
It feels like PytorchModel and SklearnModel should be in their own file, only imported if one or the other is needed, rather than having a single module that holds all model types.
Agreed. I'll refactor.
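One way the suggested split could look, as a minimal sketch: each framework wrapper lives in its own module, and the shared module imports a wrapper lazily so that heavyweight dependencies like torch are only imported when actually needed. The module and class names below (`pytorch_inference`, `load_model_wrapper`) are hypothetical illustrations, not the final Beam layout.

```python
# Sketch of a per-framework module layout with lazy imports.
# Names here are assumptions for illustration, not Beam's actual API.
import importlib


class RunInferenceModel:
  """Shared base interface for framework-specific model wrappers."""

  def _validate_model(self):
    raise NotImplementedError("Please implement _validate_model")


def load_model_wrapper(module_name: str, class_name: str):
  """Lazily import a framework wrapper, so e.g. torch only becomes a
  dependency when the PyTorch wrapper is actually requested."""
  module = importlib.import_module(module_name)
  return getattr(module, class_name)


# Hypothetical usage:
# PytorchModel = load_model_wrapper('pytorch_inference', 'PytorchModel')
# SklearnModel = load_model_wrapper('sklearn_inference', 'SklearnModel')
```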
Let's keep comments like this open until they're done for easier tracking.
    _K = TypeVar('_K')
    _INPUT_TYPE = TypeVar('_INPUT_TYPE')
    _OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')


    @dataclass
    class PredictionResult:
      key: _K
      example: _INPUT_TYPE
      inference: _OUTPUT_TYPE


    @beam.ptransform_fn
    @beam.typehints.with_input_types(Union[_INPUT_TYPE, Tuple[_K, _INPUT_TYPE]])
    @beam.typehints.with_output_types(PredictionResult)
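For readers following along, a standalone sketch of the `PredictionResult` shape under discussion, with no Beam dependency (the example values are illustrative):

```python
# Minimal standalone version of the PredictionResult dataclass:
# a result pairs an optional key and an input example with the
# model's inference for that example.
from dataclasses import dataclass
from typing import TypeVar

_K = TypeVar('_K')
_INPUT_TYPE = TypeVar('_INPUT_TYPE')
_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')


@dataclass
class PredictionResult:
  key: _K
  example: _INPUT_TYPE
  inference: _OUTPUT_TYPE


# A keyed result: the key travels with the example and its inference.
result = PredictionResult(key='id-1', example=[1.0, 2.0], inference=0.87)
```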
How should we deal with typing? TFX explicitly defines output as Union[_OUTPUT_TYPE, Tuple[_K, _OUTPUT_TYPE]]
Yeah, TFX is doing it right, I think: they have either the output type or a key with the output type.
I'm not the typing expert; IMO it's OK to have a typing-specific PR that just concentrates on that, if this gets messy.
We need typing to support TFX's proto use case as well as our new-frameworks use case.
I've modified the output type and pushed the change. I feel like we should try to get typing as accurate as possible early on, especially if we want to ensure that we can accommodate TFX's implementation.
Could we have, instead of a bare PredictionResult type, a parent class from which PredictionResult and prediction_log_pb2.PredictionLog inherit? Or use a Union? (but this may get messy).
This definitely requires input from someone with more experience with typing.
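A standalone sketch (no Beam dependency) of the keyed/unkeyed symmetry being discussed: if the input is `Tuple[_K, _INPUT_TYPE]`, the output is `Tuple[_K, _OUTPUT_TYPE]`; otherwise it is a bare `_OUTPUT_TYPE`, matching TFX's `Union[_OUTPUT_TYPE, Tuple[_K, _OUTPUT_TYPE]]` hint. The `run_inference` function and `model` callable are placeholders for illustration, not Beam API.

```python
# Illustrative only: mirrors the keyed/unkeyed output Union from TFX.
from typing import Tuple, TypeVar, Union

_K = TypeVar('_K')
_INPUT_TYPE = TypeVar('_INPUT_TYPE')
_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')

MaybeKeyedOutput = Union[_OUTPUT_TYPE, Tuple[_K, _OUTPUT_TYPE]]


def run_inference(element, model) -> MaybeKeyedOutput:
  """If the element carries a key, preserve it on the output;
  otherwise return the bare inference result."""
  if isinstance(element, tuple) and len(element) == 2:
    key, example = element
    return key, model(example)
  return model(element)


double = lambda x: 2 * x
```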
ryanthompson591
left a comment
Looks good. Add valentyn next as reviewer IMO.
CC: @TheNeuralBit and @tvalentyn

CC: @kevingg

R: @robertwb @tvalentyn
ryanthompson591
left a comment
Looks great!
| """ | ||
| @abc.abstractmethod | ||
| def get_model_loader(self): | ||
| "Returns ModelLoader object" |
I think this is good for now.
Just a note, but I don't think you need to do anything: going forward it seems to me that there will be an implementation that uses a service (instead of loading a model) and that maybe returns None from get_model_loader. I haven't decided whether this should return None by default or whether a NotImplementedError is fine.
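A sketch of the trade-off being raised here (class names like `InferenceSpec` and `RemoteServiceSpec` are hypothetical): a model-backed implementation returns a loader, while a service-backed one has no local model to load and returns None.

```python
# Illustrative only: contrasts a local-model spec with a service-backed
# spec that has nothing to load. Names are assumptions, not Beam API.
import abc
from typing import Optional


class ModelLoader:
  """Placeholder for the loader type; the real signature is up to the PR."""


class InferenceSpec(abc.ABC):
  @abc.abstractmethod
  def get_model_loader(self) -> Optional[ModelLoader]:
    """Returns a ModelLoader, or None for service-backed inference."""


class LocalModelSpec(InferenceSpec):
  def get_model_loader(self) -> Optional[ModelLoader]:
    # Loads a model in-process, so a loader is required.
    return ModelLoader()


class RemoteServiceSpec(InferenceSpec):
  def get_model_loader(self) -> Optional[ModelLoader]:
    # Calls a remote endpoint, so there is no local model to load.
    return None
```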
Updated the interface per our discussion, PTAL. @robertwb
@robertwb - could you please review this PR? Other folks on the PR, feel free to ping reviewers here or in other places if a review is open for more than a few business days. And where it makes sense, you could also ask other committers to review PRs and spread the review load across committers.
@tvalentyn could you please review/merge?
I'm actively involved in the review within the TFX codebase. I'm assuming, however, that the two will be reconciled. (And, generally, +1 to @aaltay's advice about pinging reviews that are lagging.)
Thanks. They will be reconciled once the change to TFX is finished.
No problem. This is a rapidly developing space.
On Fri, Mar 18, 2022 at 2:43 PM tvalentyn ***@***.***> wrote:
> Sorry, I misunderstood @yeandy in offline conversation, I thought you already reviewed this change, @robertwb.
Adding a common interface to the RunInference transform. Arguments will be wrapper classes around PyTorch and Scikit-learn models, along with other common parameters.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- Choose reviewer(s) and mention them in a comment (R: @username).
- Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- Update CHANGES.md with noteworthy changes.

See the Contributor Guide for more tips on how to make the review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI.