diff --git a/website/www/site/content/en/documentation/io/connectors.md b/website/www/site/content/en/documentation/io/connectors.md
index d95f06575327..dc03590cc011 100644
--- a/website/www/site/content/en/documentation/io/connectors.md
+++ b/website/www/site/content/en/documentation/io/connectors.md
@@ -61,7 +61,7 @@ This table provides a consolidated, at-a-glance overview of the available built-
 ✔ ✔
- native
+ native
 ✔
diff --git a/website/www/site/content/en/documentation/io/developing-io-java.md b/website/www/site/content/en/documentation/io/developing-io-java.md
index 7836a3cd06ac..0d792149bfff 100644
--- a/website/www/site/content/en/documentation/io/developing-io-java.md
+++ b/website/www/site/content/en/documentation/io/developing-io-java.md
@@ -75,7 +75,7 @@ multiple worker instances in parallel. As such, the code you provide for
 can use `SourceTestUtils` to increase your implementation's test coverage using a wide range of inputs with relatively few lines of code. For examples that use `SourceTestUtils`, see the
- [AvroSourceTest](https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/io/AvroSourceTest.java) and
+ [AvroSourceTest](https://github.com/apache/beam/blob/master/sdks/java/extensions/avro/src/test/java/org/apache/beam/sdk/extensions/avro/io/AvroSourceTest.java) and
 [TextIOReadTest](https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/io/TextIOReadTest.java) source code.
@@ -344,7 +344,7 @@ sinks that interact with files, including:
 implementations for examples:
 * [TextSink](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextSink.java) and
- * [AvroSink](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroSink.java).
+ * [AvroSink](https://github.com/apache/beam/blob/master/sdks/java/extensions/avro/src/main/java/org/apache/beam/sdk/extensions/avro/io/AvroSink.java).
 ## PTransform wrappers {#ptransform-wrappers}
diff --git a/website/www/site/content/en/documentation/io/io-standards.md b/website/www/site/content/en/documentation/io/io-standards.md
index 5773ab2b5e5d..2bae828d405d 100644
--- a/website/www/site/content/en/documentation/io/io-standards.md
+++ b/website/www/site/content/en/documentation/io/io-standards.md
@@ -1213,7 +1213,7 @@ When possible, unit tests are favored over integration tests due to faster execu

 Tests that the source/sink populates display data correctly.
- AvroIOTest.testReadDisplayData
+ AvroIOTest.testReadDisplayData
 DatastoreV1Test.testReadDisplayData

 bigquery_test.TestBigQuerySourcetest_table_reference_display_data
diff --git a/website/www/site/content/en/documentation/ml/orchestration.md b/website/www/site/content/en/documentation/ml/orchestration.md
index 6411b0f72442..b4ae4c79c8ca 100644
--- a/website/www/site/content/en/documentation/ml/orchestration.md
+++ b/website/www/site/content/en/documentation/ml/orchestration.md
@@ -121,7 +121,7 @@ Because KFP provides the input and output arguments as command-line arguments, a
 {{< code_sample "sdks/python/apache_beam/examples/ml-orchestration/kfp/components/preprocessing/src/preprocess.py" preprocess_component_argparse >}}
 {{< /highlight >}}
-The implementation of the `preprocess_dataset` function contains the Apache Beam pipeline code and the Beam pipeline options that select the runner. The executed preprocessing involves downloading the image bytes from their URL, converting them to a Torch Tensor, and resizing to the desired size. The caption undergoes a series of string manipulations to ensure that our model receives uniform image descriptions. Tokenization is not done here, but could be included here if the vocabulary is known. Finally, each element is serialized and written to [Avro](https://avro.apache.org/docs/1.2.0/) files. You can use alternative files formats, such as TFRecords.
+The implementation of the `preprocess_dataset` function contains the Apache Beam pipeline code and the Beam pipeline options that select the runner. The executed preprocessing involves downloading the image bytes from their URL, converting them to a Torch Tensor, and resizing to the desired size. The caption undergoes a series of string manipulations to ensure that our model receives uniform image descriptions. Tokenization is not done here, but could be included here if the vocabulary is known. Finally, each element is serialized and written to [Avro](https://avro.apache.org/docs/) files. You can use alternative files formats, such as TFRecords.
{{< highlight file="sdks/python/apache_beam/examples/ml-orchestration/kfp/components/preprocessing/src/preprocess.py" >}}