Conversation

Contributor
@yeandy commented Feb 22, 2022

Adding a common interface to the RunInference transform. Arguments will be wrapper classes around PyTorch and Scikit-learn models, along with other common parameters.
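A minimal, self-contained sketch of the intended shape (the class names mirror the code under review, but the constructor parameters here are illustrative assumptions, not the final API):

import abc


class RunInferenceModel(abc.ABC):
  """Common interface implemented by framework-specific wrappers."""

  @abc.abstractmethod
  def _validate_model(self):
    """Checks that the wrapped model is usable for inference."""
    raise NotImplementedError("Please implement _validate_model")


class PytorchModel(RunInferenceModel):
  """Wraps a PyTorch model plus common parameters (illustrative only)."""

  def __init__(self, state_dict_path=None, device='CPU'):
    # state_dict_path and device are assumed parameters for illustration.
    self._state_dict_path = state_dict_path
    self._device = device

  def _validate_model(self):
    if self._state_dict_path is None:
      raise ValueError('state_dict_path must be provided')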


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make the review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

codecov bot commented Feb 22, 2022

Codecov Report

Merging #16917 (71e2f8c) into master (b2f2128) will increase coverage by 36.74%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #16917       +/-   ##
===========================================
+ Coverage   46.89%   83.63%   +36.74%     
===========================================
  Files         204      453      +249     
  Lines       20122    62528    +42406     
===========================================
+ Hits         9436    52295    +42859     
- Misses       9686    10233      +547     
+ Partials     1000        0     -1000     
Flag Coverage Δ
python 83.63% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
sdks/go/pkg/beam/runners/direct/impulse.go
sdks/go/pkg/beam/io/databaseio/writer.go
...ks/go/pkg/beam/runners/dataflow/dataflowlib/job.go
sdks/go/pkg/beam/core/runtime/exec/fn_arity.go
sdks/go/pkg/beam/beam.shims.go
sdks/go/pkg/beam/core/runtime/symbols.go
sdks/go/pkg/beam/util/shimx/generate.go
sdks/go/pkg/beam/runners/direct/buffer.go
sdks/go/pkg/beam/core/graph/coder/registry.go
sdks/go/pkg/beam/core/runtime/exec/cogbk.go
... and 647 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b2f2128...71e2f8c. Read the comment docs.

raise NotImplementedError("Please implement _validate_model")


class PytorchModel(RunInferenceModel):
Contributor
It feels like PytorchModel and SklearnModel should be in their own file, only imported if one or the other is needed, rather than having a single module that holds all model types.
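For context, a hedged sketch of the split being suggested; the file and module names below are illustrative assumptions, not the final layout:

# Layout sketch (illustrative file names):
#   base.py              - RunInferenceModel base class, RunInference transform
#                          (no torch/sklearn imports at module level)
#   pytorch_inference.py - PytorchModel; `import torch` lives only here
#   sklearn_inference.py - SklearnModel; `import sklearn` lives only here
#
# A pipeline that only needs scikit-learn then never imports torch:
# from sklearn_inference import SklearnModel   # hypothetical module path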

Contributor Author
Agreed. I'll refactor.

Contributor
Let's keep comments like this open until they're done for easier tracking.

Contributor Author
These implementations will be done in separate PRs (pytorch, sklearn) after this PR and the base-class PR #16970 are merged.

Comment on lines 46 to 60
_K = TypeVar('_K')
_INPUT_TYPE = TypeVar('_INPUT_TYPE')
_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')


@dataclass
class PredictionResult:
  key: _K
  example: _INPUT_TYPE
  inference: _OUTPUT_TYPE


@beam.ptransform_fn
@beam.typehints.with_input_types(Union[_INPUT_TYPE, Tuple[_K, _INPUT_TYPE]])
@beam.typehints.with_output_types(PredictionResult)
Contributor Author
How should we deal with typing? TFX explicitly defines output as Union[_OUTPUT_TYPE, Tuple[_K, _OUTPUT_TYPE]]

Contributor
Yeah, I think TFX is doing it right: the output is either the output type alone or a key paired with the output type.

I'm not a typing expert; IMO it's OK to have a typing-specific PR that concentrates on just that, if this gets messy.

We need the typing to support TFX's proto use case as well as our new frameworks use case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've modified the output type and pushed the change. I feel like we should try to get typing as accurate as possible early on, especially if we want to ensure that we can accommodate TFX's implementation.

Could we have, instead of a bare PredictionResult type, a parent class from which PredictionResult and prediction_log_pb2.PredictionLog inherit? Or use a Union? (but this may get messy).

This definitely requires input from someone with more experience with typing.
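For concreteness, a hedged sketch of the keyed/unkeyed output typing under discussion, following the TFX-style Union; the decorator arrangement mirrors the snippet above, and the transform body is elided since only the type hints matter here:

from dataclasses import dataclass
from typing import Tuple, TypeVar, Union

import apache_beam as beam

_K = TypeVar('_K')
_INPUT_TYPE = TypeVar('_INPUT_TYPE')
_OUTPUT_TYPE = TypeVar('_OUTPUT_TYPE')


@dataclass
class PredictionResult:
  key: _K
  example: _INPUT_TYPE
  inference: _OUTPUT_TYPE


@beam.ptransform_fn
@beam.typehints.with_input_types(Union[_INPUT_TYPE, Tuple[_K, _INPUT_TYPE]])
@beam.typehints.with_output_types(
    Union[PredictionResult, Tuple[_K, PredictionResult]])
def RunInference(examples, model_loader):  # model_loader arg is an assumption
  # Body elided; only the keyed/unkeyed type-hint shape matters for this sketch.
  raise NotImplementedError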

Contributor
@ryanthompson591 left a comment
Looks good. Add valentyn next as reviewer IMO.

Contributor Author
@yeandy commented Mar 2, 2022

CC: @TheNeuralBit and @tvalentyn

Contributor Author
@yeandy commented Mar 2, 2022

CC: @kevingg

@yeandy yeandy requested a review from robertwb March 3, 2022 22:23
Contributor Author
@yeandy commented Mar 3, 2022

R: @robertwb @tvalentyn
CC: @kerrydc

Contributor
@ryanthompson591 left a comment

Looks great!

"""
@abc.abstractmethod
def get_model_loader(self):
"Returns ModelLoader object"
Contributor
I think this is good for now.

Just a note, but I don't think you need to do anything: going forward, it seems to me that there will be an implementation that uses a service (instead of loading a model) and maybe returns None for get_model_loader. I haven't decided if that should return None by default or if a NotImplementedError is fine.
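A hedged sketch of the two options (RemoteModel and ModelLoader below are placeholders for illustration, not code in this PR):

import abc
from typing import Optional


class ModelLoader:
  """Placeholder for whatever loader type local implementations return."""


class RunInferenceModel(abc.ABC):

  @abc.abstractmethod
  def get_model_loader(self) -> Optional[ModelLoader]:
    """Returns a ModelLoader, or None when inference is served remotely."""


class RemoteModel(RunInferenceModel):
  """Calls an external inference service, so there is nothing to load."""

  def get_model_loader(self) -> Optional[ModelLoader]:
    # Option 1: return None so callers can branch on "no local model".
    # Option 2 (alternative): raise NotImplementedError instead, forcing
    # callers to handle remote models explicitly.
    return None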

Contributor Author
@yeandy commented Mar 9, 2022

Updated the interface per our discussion, PTAL. @robertwb

Member
@aaltay commented Mar 18, 2022

@robertwb - could you please review this PR?

Other folks on the PR, feel free to ping reviewers here or in other places if a review is open for more than a few business days. And where it makes sense you could also ask other committers to review PRs and spread the review load across committers.

Contributor Author
@yeandy commented Mar 18, 2022

@tvalentyn could you please review/merge?

@tvalentyn tvalentyn merged commit 9a548b2 into apache:master Mar 18, 2022
Contributor
@robertwb
I'm actively involved in the review within the TFX codebase. I'm assuming, however, that the two will be reconciled. (And, generally, +1 to @aaltay's advice about pinging reviews that are lagging.)

Contributor Author
@yeandy commented Mar 18, 2022

Thanks. They will be reconciled once the change to TFX is finished.

Contributor
@tvalentyn
Sorry, I misunderstood @yeandy in an offline conversation; I thought you had already reviewed this change, @robertwb.

Contributor
@robertwb commented Mar 18, 2022 via email
