From a7b7d8fd58f561437271d2fe43077054f906f985 Mon Sep 17 00:00:00 2001
From: Ryan Thompson
Date: Wed, 19 Oct 2022 11:16:13 -0400
Subject: [PATCH 1/3] Moves tensorflow section next to sklearn and pytorch section of documentation

---
 .../sdks/python-machine-learning.md           | 84 +++++++++----------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/website/www/site/content/en/documentation/sdks/python-machine-learning.md b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
index bcd430d0072d..65ec3053f8e2 100644
--- a/website/www/site/content/en/documentation/sdks/python-machine-learning.md
+++ b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
@@ -83,6 +83,48 @@ You need to provide a path to a file that contains the pickled Scikit-learn mode
 `model_uri=` and `model_file_type: `, where you can specify `ModelFileType.PICKLE` or `ModelFileType.JOBLIB`, depending on how the model was serialized.
 
+#### TensorFlow
+
+To use TensorFlow with the RunInference API, you need to do the following:
+
+* Use `tfx_bsl` version 1.10.0 or later.
+* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
+* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform.
+
+A sample pipeline might look like the following example:
+
+```
+import apache_beam as beam
+from apache_beam.ml.inference.base import RunInference
+from tensorflow_serving.apis import prediction_log_pb2
+from tfx_bsl.public.proto import model_spec_pb2
+from tfx_bsl.public.tfxio import TFExampleRecord
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
+
+pipeline = beam.Pipeline()
+tfexample_beam_record = TFExampleRecord(file_pattern='/path/to/examples')
+saved_model_spec = model_spec_pb2.SavedModelSpec(model_path='/path/to/model')
+inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
+model_handler = CreateModelHandler(inference_spec_type)
+with pipeline as p:
+    _ = (p | tfexample_beam_record.RawRecordBeamSource()
+           | RunInference(model_handler)
+           | beam.Map(print)
+        )
+```
+
+The model handler that is created with `CreateModelHandler()` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in a `KeyedModelHandler`, which takes the `tfx-bsl` model handler as a parameter. For example:
+
+```
+from apache_beam.ml.inference.base import RunInference
+from apache_beam.ml.inference.base import KeyedModelHandler
+RunInference(KeyedModelHandler(tf_handler))
+```
+
+If you are unsure if your data is keyed, you can also use `MaybeKeyedModelHandler`.
+
+For more information, see [`KeyedModelHandler`](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.KeyedModelHandler).
+
 ### Use custom models
 
 If you would like to use a model that isn't specified by one of the supported frameworks, the RunInference API is designed to be flexible, allowing you to use any custom machine learning model.
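A minimal sketch of such a custom handler, assuming the `spaCy` package used in the example notebook referenced later in these docs (the class and model names here are hypothetical, not part of the Beam API), subclasses `ModelHandler` and implements `load_model` and `run_inference`:

```
import spacy
from apache_beam.ml.inference.base import ModelHandler

class SpacyModelHandler(ModelHandler):
    """Illustrative custom handler that tags batches of text with spaCy."""

    def __init__(self, model_name='en_core_web_sm'):
        self._model_name = model_name

    def load_model(self):
        # Called once per worker; the loaded model is reused across batches.
        return spacy.load(self._model_name)

    def run_inference(self, batch, model, inference_args=None):
        # Return one prediction per example in the batch.
        return [model(text) for text in batch]
```

The resulting handler drops into a pipeline the same way as the built-in ones, for example `RunInference(SpacyModelHandler())`.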
@@ -198,48 +240,6 @@ For detailed instructions explaining how to build and run a pipeline that uses M
 The RunInference API is available with the Beam Java SDK versions 2.41.0 and later through Apache Beam's [Multi-language Pipelines framework](https://beam.apache.org/documentation/programming-guide/#multi-language-pipelines).
 For information about the Java wrapper transform, see [RunInference.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/RunInference.java). For example pipelines, see [RunInferenceTransformTest.java](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/test/java/org/apache/beam/sdk/extensions/python/transforms/RunInferenceTransformTest.java).
 
-## TensorFlow support
-
-To use TensorFlow with the RunInference API, you need to do the following:
-
-* Use `tfx_bsl` version 1.10.0 or later.
-* Create a model handler using `tfx_bsl.public.beam.run_inference.CreateModelHandler()`.
-* Use the model handler with the [`apache_beam.ml.inference.base.RunInference`](/releases/pydoc/current/apache_beam.ml.inference.base.html) transform.
-
-A sample pipeline might look like the following example:
-
-```
-import apache_beam as beam
-from apache_beam.ml.inference.base import RunInference
-from tensorflow_serving.apis import prediction_log_pb2
-from tfx_bsl.public.proto import model_spec_pb2
-from tfx_bsl.public.tfxio import TFExampleRecord
-from tfx_bsl.public.beam.run_inference import CreateModelHandler
-
-pipeline = beam.Pipeline()
-tfexample_beam_record = TFExampleRecord(file_pattern='/path/to/examples')
-saved_model_spec = model_spec_pb2.SavedModelSpec(model_path='/path/to/model')
-inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)
-model_handler = CreateModelHandler(inference_spec_type)
-with pipeline as p:
-    _ = (p | tfexample_beam_record.RawRecordBeamSource()
-           | RunInference(model_handler)
-           | beam.Map(print)
-        )
-```
-
-The model handler that is created with `CreateModelHander()` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in the keyed model handler, which would then take the `tfx-bsl` model handler as a parameter. For example:
-
-```
-from apache_beam.ml.inference.base import RunInference
-from apache_beam.ml.inference.base import KeyedModelHandler
-RunInference(KeyedModelHandler(tf_handler))
-```
-
-If you are unsure if your data is keyed, you can also use `MaybeKeyedModelHandler`.
-
-For more information, see [`KeyedModelHander`](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.KeyedModelHandler).
-
 ## Troubleshooting
 
 If you run into problems with your pipeline or job, this section lists issues that you might encounter and provides suggestions for how to fix them.

From 870f4c0556308cdc343ea181b806ac15cec7108c Mon Sep 17 00:00:00 2001
From: Ryan Thompson
Date: Wed, 19 Oct 2022 11:32:36 -0400
Subject: [PATCH 2/3] moved keyed model handler data to be framework neutral

---
 .../sdks/python-machine-learning.md           | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/website/www/site/content/en/documentation/sdks/python-machine-learning.md b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
index 65ec3053f8e2..336026375537 100644
--- a/website/www/site/content/en/documentation/sdks/python-machine-learning.md
+++ b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
@@ -20,7 +20,9 @@ limitations under the License.
 
 {{< button-pydoc path="apache_beam.ml.inference" class="RunInference" >}}
 
-You can use Apache Beam with the RunInference API to use machine learning (ML) models to do local and remote inference with batch and streaming pipelines. Starting with Apache Beam 2.40.0, PyTorch and Scikit-learn frameworks are supported. You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation.
+You can use Apache Beam with the RunInference API to use machine learning (ML) models to do local and remote inference with batch and streaming pipelines. Starting with Apache Beam 2.40.0, PyTorch and Scikit-learn frameworks are supported. TensorFlow models are supported through `tfx-bsl`.
+
+You can create multiple types of transforms using the RunInference API: the API takes multiple types of setup parameters from model handlers, and the parameter type determines the model implementation.
 
 ## Why use the RunInference API?
 
@@ -62,6 +64,7 @@ from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
 from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerPandas
 from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor
 from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerKeyedTensor
+from tfx_bsl.public.beam.run_inference import CreateModelHandler
 ```
 
 ### Use pre-trained models
@@ -113,12 +116,21 @@ with pipeline as p:
     )
 ```
 
-The model handler that is created with `CreateModelHandler()` is always unkeyed. To make a keyed model handler, wrap the unkeyed model handler in a `KeyedModelHandler`, which takes the `tfx-bsl` model handler as a parameter. For example:
+Note: A model handler that is created with `CreateModelHandler()` is always unkeyed.
+
+### Keyed Model Handlers
+To make a keyed model handler, wrap the any model handler in the keyed model handler. For example:
 
 ```
 from apache_beam.ml.inference.base import RunInference
 from apache_beam.ml.inference.base import KeyedModelHandler
-RunInference(KeyedModelHandler(tf_handler))
+model_handler = <your unkeyed model handler>
+keyed_model_handler = KeyedModelHandler(model_handler)
+
+with pipeline as p:
+   p | (
+     RunInference(keyed_model_handler)
+   )
 ```
 
 If you are unsure if your data is keyed, you can also use `MaybeKeyedModelHandler`.
@@ -133,7 +145,6 @@ You only need to create your own `ModelHandler` or `KeyedModelHandler` with logi
 
 A simple example can be found in [this notebook](https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_custom_inference.ipynb).
 The `load_model` method shows how to load the model using a popular `spaCy` package while `run_inference` shows how to run the inference on a batch of examples.
-
 ### Use multiple models
 
 You can also use the RunInference transform to add multiple inference models to your pipeline.
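A minimal sketch of that pattern, with placeholder model handlers and inputs, fans a single input PCollection out to two RunInference transforms:

```
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

model_handler_a = ...  # placeholder: any unkeyed model handler
model_handler_b = ...  # placeholder: a second model handler

with beam.Pipeline() as p:
    data = p | beam.Create(['example 1', 'example 2'])
    # The same input PCollection can feed several models in parallel.
    predictions_a = data | 'inference_a' >> RunInference(model_handler_a)
    predictions_b = data | 'inference_b' >> RunInference(model_handler_b)
```

Each branch produces its own PCollection of predictions, so the outputs can be compared or joined downstream.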
From de6badcf02f91c5ecd4f9664cf83ab9f9fac6ae5 Mon Sep 17 00:00:00 2001
From: Ryan Thompson
Date: Wed, 19 Oct 2022 11:39:09 -0400
Subject: [PATCH 3/3] fixed grammar typo

---
 .../content/en/documentation/sdks/python-machine-learning.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/website/www/site/content/en/documentation/sdks/python-machine-learning.md b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
index 336026375537..b899e5149642 100644
--- a/website/www/site/content/en/documentation/sdks/python-machine-learning.md
+++ b/website/www/site/content/en/documentation/sdks/python-machine-learning.md
@@ -119,7 +119,7 @@ with pipeline as p:
 Note: A model handler that is created with `CreateModelHandler()` is always unkeyed.
 
 ### Keyed Model Handlers
-To make a keyed model handler, wrap the any model handler in the keyed model handler. For example:
+To make a keyed model handler, wrap any unkeyed model handler in the keyed model handler. For example:
 
 ```
 from apache_beam.ml.inference.base import RunInference