-
Notifications
You must be signed in to change notification settings - Fork 4.5k
[2/2] sdks/python: enrich data with CloudSQL [PostgreSQL, MySQL, SQLServer] #35473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[2/2] sdks/python: enrich data with CloudSQL [PostgreSQL, MySQL, SQLServer] #35473
Conversation
40f595f to
76686aa
Compare
76686aa to
cdd1f6f
Compare
|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
damccorm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally LGTM, but we will probably want to wait until #34398 is merged/released to actually merge. Thanks!
website/www/site/content/en/documentation/transforms/python/elementwise/enrichment-cloudsql.md
Outdated
Show resolved
Hide resolved
…ementwise/enrichment-cloudsql.md Co-authored-by: Danny McCormick <dannymccormick@google.com>
|
@mohamedawnallah would you mind fixing the conflicts here? Otherwise I think this should be good to merge |
| DEVELOCITY_ACCESS_KEY: ${{ secrets.DEVELOCITY_ACCESS_KEY }} | ||
| GRADLE_ENTERPRISE_CACHE_USERNAME: ${{ secrets.GE_CACHE_USERNAME }} | ||
| GRADLE_ENTERPRISE_CACHE_PASSWORD: ${{ secrets.GE_CACHE_PASSWORD }} | ||
| ALLOYDB_PASSWORD: ${{ secrets.ALLOYDB_PASSWORD }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CloudSQL instance on Google Cloud expected to fail in the examples since that workflow change haven't yet switched on. As seeing in the CI logs:
____________ EnrichmentTest.test_enrichment_with_google_cloudsql_pg ____________
[gw0] linux -- Python 3.11.12 /runner/_work/beam/beam/sdks/python/test-suites/tox/py311/build/srcs/sdks/python/target/.tox-py311-cloud/py311-cloud/bin/python
self = <apache_beam.examples.snippets.transforms.elementwise.enrichment_test.EnrichmentTest testMethod=test_enrichment_with_google_cloudsql_pg>
mock_stdout = <_io.StringIO object at 0x7f388c1adbd0>
def test_enrichment_with_google_cloudsql_pg(self, mock_stdout):
db_adapter = DatabaseTypeAdapter.POSTGRESQL
> with EnrichmentTestHelpers.sql_test_context(True, db_adapter):
apache_beam/examples/snippets/transforms/elementwise/enrichment_test.py:147:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/contextlib.py:137: in __enter__
return next(self.gen)
apache_beam/examples/snippets/transforms/elementwise/enrichment_test.py:203: in sql_test_context
result = EnrichmentTestHelpers.pre_sql_enrichment_test(
apache_beam/examples/snippets/transforms/elementwise/enrichment_test.py:246: in pre_sql_enrichment_test
os.environ['GOOGLE_CLOUD_SQL_DB_PASSWORD'] = password
<frozen os>:684: in __setitem__
???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
value = None
> ???
E TypeError: str expected, not NoneType
<frozen os>:758: TypeError
- generated xml file: /runner/_work/beam/beam/sdks/python/test-
|
Looks like the coverage failure here is probably a real issue. We probably need the same change there that we're making to the examples workflow. Rather than bundling those changes with this PR, lets make them in a separate PR so that we can confirm all tests pass here. Added #36061 to do this |
…xamplesForCloudSQLHandler
fcbc37d to
af8e4fe
Compare
|
@mohamedawnallah would you mind taking a look at the failures here? |
The tests are currently failing because I've encountered this exact issue before and couldn't really figure it out (similar problem when tried to access on beam PostCommit Python workflow in #34398 (comment)). This secret specifically seems having issues when attempting to use in other workflows. The failure occurs at this specific line where the code attempts to read the password environment variable: password = os.getenv("ALLOYDB_PASSWORD")When the environment variable doesn't exist, This appears to be an infrastructure issue related to how the Probably the least resistant path would be to skip these tests when What do you think, @damccorm? |
Have you tried running the gradle task that is failing locally? One idea is that it may be that we're invoking these tests with tox and need to allow list this variable to be passed through to the tox environment - Line 34 in 32f5be6
|
sdks/python/apache_beam/examples/snippets/transforms/elementwise/enrichment_test.py
Show resolved
Hide resolved
|
The tests are now passing for this changeset 👍 |
|
We can enforce running CloudSQL enrichment handler on Precommit Python Transforms by removing beam/sdks/python/test-suites/direct/common.gradle Lines 374 to 393 in a2f9fb8
That require us to add beam/.github/workflows/beam_PreCommit_Python_Transforms.yml Lines 52 to 56 in a2f9fb8
|
|
Submitted this PR #36096 to add |
|
After enforcing CloudSQL tests to run only on precommit python transforms they are running and passing except the Google CloudSQL Enrichment tests due to lack of |
…xamplesForCloudSQLHandler
|
Merged now the changes in this PR after the merge of #36096 |
|
It looks like the coverage workflow was failing with the same issue as #35473 (comment) Since this test is running in some suites, could we just skip it when |
Added that conditional skip 👍. Also removed |
|
Hm, looks like the infra is right now, but the tests are failing - https://github.com/apache/beam/actions/runs/17588014826/attempts/1?pr=35473 is an example. I tried kicking it off again, but it looks like a real problem with duplicate data |
Yeah looked at it debugging it now 👍 EDIT: Submitted a commit patch to try use raw SQL first as table removal approach and fallback to this SQLAlchemy. Also made the table id for CloudSQL to be unique using uuid similar to others (with this even table removal failed the tests still passing since unique table id almost for every run) |
|
@damccorm - the relevant CI tests are green now |
damccorm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - this LGTM!
Description
Closes #30773.
Closes #36095.
Towards #35046.
Prev #34398.
Website Update Demo
website_update_demo.mov
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.UpdateCHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.