@ihji ihji commented Aug 10, 2021

Contributor Author

ihji commented Aug 10, 2021

R: @chamikaramj


codecov bot commented Aug 10, 2021

Codecov Report

Merging #15307 (8c6fafd) into master (2fd9875) will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #15307      +/-   ##
==========================================
- Coverage   83.81%   83.79%   -0.03%     
==========================================
  Files         441      441              
  Lines       59745    59801      +56     
==========================================
+ Hits        50075    50109      +34     
- Misses       9670     9692      +22     
Impacted Files Coverage Δ
sdks/python/apache_beam/utils/interactive_utils.py 87.80% <0.00%> (-7.32%) ⬇️
...ks/python/apache_beam/runners/worker/data_plane.py 87.70% <0.00%> (-2.90%) ⬇️
...hon/apache_beam/runners/direct/test_stream_impl.py 94.02% <0.00%> (-2.24%) ⬇️
...eam/runners/interactive/interactive_environment.py 90.33% <0.00%> (-0.38%) ⬇️
...ks/python/apache_beam/runners/worker/sdk_worker.py 88.85% <0.00%> (-0.16%) ⬇️
...hon/apache_beam/runners/worker/bundle_processor.py 93.51% <0.00%> (-0.13%) ⬇️
sdks/python/apache_beam/io/avroio.py 60.60% <0.00%> (ø)
sdks/python/apache_beam/io/textio.py 97.07% <0.00%> (ø)
sdks/python/apache_beam/io/tfrecordio.py 93.39% <0.00%> (ø)
sdks/python/apache_beam/transforms/ptransform.py 93.54% <0.00%> (ø)
... and 7 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@chamikaramj chamikaramj left a comment

Thanks.

echo "You don't have gnome-terminal installed."
if [[ "$INSTALL_GNOME_TERMINAL" != true ]]; then
sudo apt-get upgrade
if [[ "$INSTALL_GNOME_TERMINAL" = true ]]; then

Contributor

The condition here was inverted. Did we have a bug before?

Contributor Author

I believe so. The software needs to be installed when the variable is true.
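
As a minimal sketch of the corrected logic (not the script's actual code), the optional install should run only when the flag is exactly `true`. The helper name `should_install` is hypothetical, and the package-manager call is replaced by an `echo` so the example is safe to run.

```shell
# Hypothetical helper: prints "install" only when the flag is the string "true".
should_install() {
  if [ "$1" = true ]; then
    echo "install"
  else
    echo "skip"
  fi
}

# Assumed value for illustration; the real script sets this elsewhere.
INSTALL_GNOME_TERMINAL=true

if [ "$(should_install "$INSTALL_GNOME_TERMINAL")" = install ]; then
  # The real script would call the package manager here; this sketch only reports.
  echo "would install gnome-terminal"
fi
```

With the inverted test (`!= true`), the install branch would instead run for every value except `true`, which is the bug discussed above.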

if [[ "$INSTALL_KUBECTL" = true ]]; then
sudo apt-get install kubectl
else
echo "kubectl is not installed. Validation on Python cross-language Kafka taxi will be skipped."

Contributor

Should this be a validation failure instead of a skip?

Contributor Author

It looks like we already exit the program; only the printed message is misleading. Updated the messages.
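
A hedged sketch of the fail-fast behaviour described here: when a required tool is missing, print a message that says so and return a non-zero status rather than implying the validation is merely skipped. The helper name `require_command` and the message wording are assumptions, not the script's actual text.

```shell
# Hypothetical helper: report a missing command on stderr and fail.
require_command() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "$1 is not installed. Validation cannot continue without it." >&2
    return 1
  fi
}

# Demonstration with a command that exists on any POSIX system.
if require_command sh; then
  echo "sh is available"
fi

# In a validation script the call would typically abort on failure, e.g.:
#   require_command kubectl || exit 1
```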

CLUSTER_NAME=xlang-kafka-cluster-$RANDOM
if [[ "$python_xlang_kafka_taxi_dataflow" = true ]]; then
gcloud container clusters create --project=${USER_GCP_PROJECT} --region=${USER_GCP_REGION} --no-enable-ip-alias $CLUSTER_NAME
kubectl apply -R -f ${LOCAL_BEAM_DIR}/.test-infra/kubernetes/kafka-cluster

Contributor

Did you confirm that this works? I think the Beam Kafka IT is currently failing due to a port issue when starting up this cluster: https://issues.apache.org/jira/browse/BEAM-9482

Contributor Author

It worked with the clouddfe project. I think no other program in the clouddfe project was using the port assigned to the k8s Kafka cluster.

Contributor

We should get it to a state where the release manager can consistently run the test using the default (apache-beam-testing) project. (I'm not sure if you'll actually hit https://issues.apache.org/jira/browse/BEAM-9482 or not).

Contributor Author

Yes, but that's separate work. We need to update the k8s configs to assign the ports dynamically.

--runner DataflowRunner \
--num_workers 5 \
--temp_location=${USER_GCS_BUCKET}/temp/ \
--experiments=use_runner_v2 \

Contributor

You should not need to manually specify this experiment for Beam 2.32.0 and later.

Contributor Author

Removed.

--runner DataflowRunner \
--num_workers 5 \
--temp_location=${USER_GCS_BUCKET}/temp/ \
--experiments=use_runner_v2 \

Contributor

Ditto regarding experiment.

Contributor Author

Removed.

echo "* How to verify results:"
echo "* 1. Goto your Dataflow job console and check whether there is any error."
echo "* 2. Check whether your ${SQL_TAXI_SUBSCRIPTION} subscription has data below:"
# run twice since the first execution would return 0 messages

Contributor

Any idea why?

Contributor Author

No idea. I found that the gcloud pubsub pull command sometimes just returns an empty result (mostly on the first pull after the subscription is created). This on-screen output is only meant to give the release manager a hint that data exists in the sink. Visiting the web console may be needed if the hint doesn't help.
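
The "run twice" workaround can be generalised into a small retry loop. The following is only a sketch under assumptions: `retry_until_output` is a hypothetical helper, the one-second pause is arbitrary, and `echo` stands in for the real pull command so the example runs anywhere.

```shell
# Hypothetical helper: run a command up to N times until it prints something.
# Usage: retry_until_output ATTEMPTS COMMAND [ARGS...]
retry_until_output() {
  local attempts="$1"
  shift
  local i=1
  local output
  while [ "$i" -le "$attempts" ]; do
    # Treat a failing command the same as an empty result.
    output="$("$@")" || true
    if [ -n "$output" ]; then
      printf '%s\n' "$output"
      return 0
    fi
    i=$((i + 1))
    # Brief pause before the next attempt (the interval is an arbitrary choice).
    sleep 1
  done
  return 1
}

# Stand-in for the real pull command.
retry_until_output 2 echo "sample message"
```

In the validation script the wrapped command would be the subscription pull itself; the exact gcloud invocation and flags are left out here because they depend on the script.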

sleep 10m
echo "* How to verify results:"
echo "* 1. Goto your Dataflow job console and check whether there is any error."
echo "* 2. Check whether ${KAFKA_TAXI_DF_DATASET}.xlang_kafka_taxi has data, retrieving BigQuery data as below: "

Contributor

Would it be possible to run a 'grep' to confirm that the output is not empty?

Contributor Author

I don't think that would be reliable, since the data is constantly changing. Manual review is still important, not only for this test but also for the other existing validations.

Contributor

The data changes, but can we just verify that the output is not empty? Based on my observation, there's always some output data for these pipelines after a few minutes.

Contributor Author

Done.
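
One way such a non-empty check could look, as a sketch rather than a description of what the PR actually does: pipe the query or pull output through `grep` and succeed only if at least one non-blank line is present. The helper name `has_output` and the sample row are assumptions for illustration.

```shell
# Hypothetical helper: succeed only if stdin contains a non-whitespace character.
has_output() {
  grep -q '[^[:space:]]'
}

# Stand-in for the real query output; the column names are made up.
if printf 'ride_id,passenger_count\n' | has_output; then
  echo "output is not empty"
else
  echo "output is empty" >&2
fi
```

Because the check only looks for the presence of data, it stays valid even though the contents change between runs.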

@chamikaramj
Contributor

Thanks. LGTM.

@chamikaramj
Contributor

Retest this please

@ihji
Contributor Author

ihji commented Aug 18, 2021

Run Python_PVR_Flink PreCommit

@ihji
Contributor Author

ihji commented Aug 18, 2021

Run Java PreCommit

@ihji ihji merged commit 7587508 into apache:master Aug 18, 2021