@shunping
Collaborator

@shunping shunping commented Aug 26, 2025

fixes #35392

The PR also adds some debugging messages and fixes a null pointer edge case.
The null pointer issue will be fixed in #35977.

@shunping shunping requested a review from lostluck August 26, 2025 21:39
@shunping shunping marked this pull request as ready for review August 26, 2025 21:54
@github-actions
Contributor

Assigning reviewers:

R: @lostluck for label go.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@shunping
Collaborator Author

cc'ed @liferoad @damccorm

@shunping
Collaborator Author

shunping commented Aug 27, 2025

I notice that LogValue in worker.go will be called during logging:

slog.String("endpoint", wk.Endpoint()),

So I think I have to store the value instead of calling ResolveEndpointForWorker every time we log.
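A minimal sketch of the "store the value" idea, assuming a hypothetical `worker` type and `newWorker` constructor (neither is Prism's actual code). The endpoint is resolved once when the worker is created, so `LogValue` can be called on every log line without repeating the resolution:

```go
package main

import (
	"log/slog"
	"os"
)

// worker is a hypothetical stand-in for the worker type discussed above.
type worker struct {
	id       string
	endpoint string // resolved once at construction, then only read
}

// newWorker runs the (hypothetical) resolve function a single time and stores the result.
func newWorker(id string, resolve func() string) *worker {
	return &worker{id: id, endpoint: resolve()}
}

// Endpoint returns the stored value; it performs no resolution work.
func (w *worker) Endpoint() string { return w.endpoint }

// LogValue implements slog.LogValuer, so it may run on every log call.
func (w *worker) LogValue() slog.Value {
	return slog.GroupValue(
		slog.String("id", w.id),
		slog.String("endpoint", w.Endpoint()),
	)
}

func main() {
	calls := 0
	wk := newWorker("wk-1", func() string {
		calls++ // counts how often resolution actually happens
		return "localhost:8099"
	})

	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	logger.Info("worker started", "worker", wk)
	logger.Info("worker still running", "worker", wk)
	logger.Info("resolver calls", "count", calls)
}
```

However many times the worker is logged, the resolver in this sketch runs only once, at construction.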

Contributor

@lostluck lostluck left a comment

Lgtm

@lostluck
Contributor

> I notice that LogValue in worker.go will be called during logging:
>
> slog.String("endpoint", wk.Endpoint()),
>
> So I think I have to store the value instead of calling ResolveEndpointForWorker every time we log.

Ultimately it should be the actual address used for connecting, so it can be traced better. Add a debug log if it needs to be different earlier in the process.

@shunping
Collaborator Author

shunping commented Aug 27, 2025

Thanks @lostluck! Could you take another look at the PR? I made a few changes to make the fix clearer.

@nguymin4
Contributor

I tested on macOS. This fix works for me.

// The presence of an external environment does not guarantee execution within
// Docker, as Python's LOOPBACK also runs in an external environment.
// A specific check for the "BEAM_WORKER_POOL_IN_DOCKER_VM" environment variable is required to confirm
// if the worker is running inside a Docker container.
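The check described in this comment could be sketched as follows. The helper name `workerPoolInDocker` and the rule that any non-empty value counts as "set" are assumptions for illustration; the comment only states that the variable must be checked, not which value is expected or where it is read from. The lookup function is passed in so that the same check works against the process environment or any other source of variables:

```go
package main

import (
	"fmt"
	"os"
)

// inDockerEnvVar is the variable named in the code comment above.
const inDockerEnvVar = "BEAM_WORKER_POOL_IN_DOCKER_VM"

// workerPoolInDocker is a hypothetical helper. It reports whether the
// variable is present with a non-empty value (an assumed rule).
func workerPoolInDocker(lookup func(string) (string, bool)) bool {
	v, ok := lookup(inDockerEnvVar)
	return ok && v != ""
}

func main() {
	// os.LookupEnv reads the current process environment.
	fmt.Println(workerPoolInDocker(os.LookupEnv))
}
```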
Contributor

Aha! The goal here is to avoid Docker In Docker issues, where the address may be weird because we're in a docker container at all. Makes sense.

Collaborator Author

Right. Basically, if a worker is running inside a Docker container and we give it a localhost provisioning address, it will look at the container's own port. To reach the port on the host, we have to use a different host name.

That only applies on macOS or Windows, though. Docker running on Linux does not have this issue.
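The override described here might look roughly like the sketch below. The function name `overrideLocalhost`, its parameters, and the use of `host.docker.internal` (the name Docker Desktop resolves to the host machine) are assumptions for illustration, not the code merged in this PR:

```go
package main

import (
	"fmt"
	"net"
)

// dockerHostName is the name Docker Desktop maps to the host machine.
// Using it here is an assumption; the PR discussion does not name it.
const dockerHostName = "host.docker.internal"

// overrideLocalhost is a hypothetical helper. It rewrites a loopback
// endpoint only when the worker runs in Docker on macOS or Windows.
func overrideLocalhost(endpoint, goos string, inDocker bool) string {
	if !inDocker || (goos != "darwin" && goos != "windows") {
		return endpoint
	}
	host, port, err := net.SplitHostPort(endpoint)
	if err != nil {
		return endpoint // not in host:port form; leave it untouched
	}
	switch host {
	case "localhost", "127.0.0.1", "::1":
		return net.JoinHostPort(dockerHostName, port)
	}
	return endpoint
}

func main() {
	fmt.Println(overrideLocalhost("localhost:8099", "darwin", true))
	fmt.Println(overrideLocalhost("localhost:8099", "linux", true))
}
```

With these inputs the first call prints `host.docker.internal:8099` and the second prints `localhost:8099`. Passing the OS name as a parameter, rather than reading `runtime.GOOS` inside the function, keeps the sketch testable on any platform.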

Contributor

@lostluck lostluck left a comment

New approach LGTM.

@shunping
Collaborator Author

Thanks @lostluck and @nguymin4. Merging it now.

@shunping shunping merged commit 62cbf83 into apache:master Aug 27, 2025
10 checks passed
damccorm added a commit that referenced this pull request Aug 27, 2025
* sdks/python: properly make milvus as extra dependency

* sdks/python: update image requirements

* .github: trigger postcommit python

* sdks/python: fix linting issues

* sdks/python: fix formatting issues

* .github: trigger beam postcommit python

* sdks/python: revert milvus version in itests

* sdks/python: update image requirements

* trigger_files: trigger postcommit python

* Bump github.com/docker/go-connections from 0.5.0 to 0.6.0 in /sdks (#35906)

Bumps [github.com/docker/go-connections](https://github.com/docker/go-connections) from 0.5.0 to 0.6.0.
- [Commits](docker/go-connections@v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: github.com/docker/go-connections
  dependency-version: 0.6.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add the readme link to new YAML examples (#35941)

* Bump google.golang.org/api from 0.247.0 to 0.248.0 in /sdks (#35969)

* Remove mysql-connector-python dependency (#35932)

* Fix typos and update test implementation from #35656 (#35958)

* implement lambda name pickling in cloudpickle

* add enable_lambda_name to __init__

* fix formatting and lint

* fix typo

* fix code paths in test

* fix tests

* fix lint

* fix formatting and failing test

* fix formatting again

* remove cloudpickle implementation to leave only typo fixes and fixing test structure.

* fix _make_function typo

* revert regex

* fix failing tests

* fix formatting

* update prefix to not hardcode

* feat(mongodb): upgrade MongoDB Java driver to version 5.5.0 (#35946)

* feat(mongodb): upgrade MongoDB Java driver to version 5.5.0

Update MongoDB Java driver from 3.12.11 to 5.5.0 and refactor code to use new API
Add mongo-bson dependency required by new driver version
Replace deprecated MongoClient with MongoClients and update GridFS implementation

* refactor(mongodb): update MongoDB client usage to modern API

Replace deprecated MongoClient with MongoClients.create() and update database drop method

* build(dependencies): add mongodb driver core dependency

Add mongodb-driver-core to support MongoDB Java driver functionality.
Also mark mongo_java_driver as permitUnusedDeclared and add testImplementation.

* fix(mongodb): update embedded mongo version and fix split key filtering

Update embedded MongoDB test dependency to version 3.5.4 and simplify split key filtering logic by using BsonObjectId for range queries. This ensures proper type handling when filtering MongoDB documents by _id field.

* build: add mongodb-driver-core dependency

Add mongodb-driver-core version 5.5.0 to support MongoDB Java driver functionality

* use version

* refactor: simplify mongo client creation logic

Remove redundant null check and consolidate uri handling in MongoDbGridFSIO

* Bump github.com/aws/aws-sdk-go-v2/credentials in /sdks (#35974)

Bumps [github.com/aws/aws-sdk-go-v2/credentials](https://github.com/aws/aws-sdk-go-v2) from 1.18.6 to 1.18.7.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/config/v1.18.7/CHANGELOG.md)
- [Commits](aws/aws-sdk-go-v2@config/v1.18.6...config/v1.18.7)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/credentials
  dependency-version: 1.18.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump google.golang.org/grpc from 1.74.2 to 1.75.0 in /sdks (#35971)

Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.74.2 to 1.75.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](grpc/grpc-go@v1.74.2...v1.75.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.75.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Override localhost endpoint when a worker is running in docker on mac (#35964)

* fix(parquetio): handle missing nullable fields in row conversion (#35948)

* fix(parquetio): handle missing nullable fields in row conversion

Add null value handling when converting rows to Arrow tables for nullable fields that are missing from input data. This fixes KeyError when writing to Parquet with missing nullable fields, addressing issue #35791.

* fix lint

* Bump cloud.google.com/go/storage from 1.56.0 to 1.56.1 in /sdks (#35980)

Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.56.0 to 1.56.1.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md)
- [Commits](googleapis/google-cloud-go@spanner/v1.56.0...storage/v1.56.1)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-version: 1.56.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [Prism] Fix segv when docker container self-terminated. (#35977)

* Fix segv when docker container is self-terminated

* Add some debug logging for docker and process env.

* add a jinja % include/import pipeline example to docs (#35931)

* add a jinja include pipeline example

* update yaml doc with import example

* address gemini and other comments

* fix table of contents for readme

* add link to jinja pipeline examples

* Bump github.com/aws/aws-sdk-go-v2/config from 1.31.2 to 1.31.3 in /sdks (#35983)

* Add a security GCP log analyzer (#35922)

* Add the base log_analyzer

* Add github action for security logging

* Enhance LogAnalyzer to filter logs by time range and include file names in event summary

* Add dry-run option for weekly email report generation in LogAnalyzer

* Better error handling for timezones and missing details

* Refactor LogAnalyzer to use SinkCls for type consistency and enhance bucket permission management for log sinks

* update py containers (#35982)

* [YAML]: add import jinja pipeline example (#35945)

* add import jinja pipeline example

* revert name change

* update overall examples readme

* fix lint issue

* fix gemini small issue

* Update sdks/python/apache_beam/yaml/examples/transforms/jinja/import/README.md

---------

Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>

* workflows: capture DinD tests in PreCommit Py Coverage workflow

* workflows: temporarily removing `ubuntu-latest` till resolving deps

* workflows: add `matrix.os` label to `beam_PreCommit_Python_Coverage`

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Mohamed Awnallah <mohamedmohey2352@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chamikara Jayalath <chamikaramj@gmail.com>
Co-authored-by: Yi Hu <yathu@google.com>
Co-authored-by: kristynsmith <kristynsmith@google.com>
Co-authored-by: liferoad <huxiangqian@gmail.com>
Co-authored-by: Shunping Huang <shunping@google.com>
Co-authored-by: Derrick Williams <derrickaw@google.com>
Co-authored-by: Enrique Calderon <71863693+ksobrenat32@users.noreply.github.com>
Co-authored-by: Ahmed Abualsaud <65791736+ahmedabu98@users.noreply.github.com>
Co-authored-by: tvalentyn <tvalentyn@users.noreply.github.com>
@shunping shunping changed the title from "Fix localhost endpoints when running docker environment on mac." to "[Prism] Override localhost endpoints when running docker env on mac." Aug 28, 2025

Development

Successfully merging this pull request may close these issues.

[Prism]: Support Docker in MacOS with transforms for Cross Language

3 participants