@tchivs tchivs commented Aug 6, 2025

Fix Windows compatibility for YAML JavaJarProvider

I submitted a PR for this before, but the file is gone, so I am trying again now:
#35618

Description

This PR fixes a cross-platform compatibility issue in the YAML provider where the JavaJarProvider.available() method fails on Windows systems due to the use of the Unix-specific which command.

Problem

The original implementation calls subprocess.run(['which', java_executable]). This works on Unix/Linux systems, but on Windows it fails with FileNotFoundError: [WinError 2] The system cannot find the file specified, because Windows has no which command.
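
The failure mode can be reproduced on any platform by asking subprocess.run to launch a program that does not exist, which is the situation which is in on Windows. A minimal sketch; lookup_fails and the command name are made-up placeholders, not code from the PR:

```python
import subprocess


def lookup_fails(command):
    """Return True if launching `command` fails because the program is missing."""
    try:
        # Capture output so the child process prints nothing to the console.
        subprocess.run(command, capture_output=True)
    except FileNotFoundError:
        # Raised by subprocess itself when the executable cannot be found,
        # e.g. ['which', 'java'] on a Windows machine with no which program.
        return True
    return False


# Hypothetical program name, chosen so that it should not exist anywhere.
missing = lookup_fails(['no-such-lookup-tool-for-demo', 'java'])
```

The exception comes from starting the lookup tool, not from Java being absent, which is why the error appears even on machines with Java installed.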

Solution

  • Platform detection: Use platform.system() to detect the operating system
  • Windows compatibility: Use where command on Windows instead of which
  • Unix/Linux compatibility: Keep existing which command behavior unchanged
  • Fallback mechanism: Add shutil.which() as a cross-platform fallback for edge cases
  • Error handling: Improve handling of FileNotFoundError and OSError
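
The bullets above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: find_executable is a hypothetical name, and the real method returns bool or NotAvailableWithReason rather than a plain bool.

```python
import platform
import shutil
import subprocess


def find_executable(name):
    """Return True if `name` can be located on PATH, else False."""
    # 'where' on Windows, 'which' everywhere else.
    lookup = 'where' if platform.system() == 'Windows' else 'which'
    try:
        trial = subprocess.run([lookup, name], capture_output=True)
        if trial.returncode == 0:
            return True
    except OSError:
        # FileNotFoundError is a subclass of OSError, so this also covers
        # the case where the lookup command itself is missing.
        pass
    # Cross-platform fallback from the standard library.
    return shutil.which(name) is not None
```

Since shutil.which is itself cross-platform, a simpler variant could skip the subprocess call entirely; whether that fits here is a judgment call for the maintainers.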

Changes Made

  • Modified JavaJarProvider.available() method in apache_beam/yaml/yaml_provider.py
  • Use platform-specific commands (where on Windows, which on Unix/Linux) for Java detection
  • Add shutil.which fallback for cross-platform compatibility
  • Maintain backward compatibility with existing Unix/Linux behavior

Testing

  • ✅ Tested on Windows 10/11
  • ✅ Tested on Linux Ubuntu

Backward Compatibility

  • Fully backward compatible - no breaking changes
  • Unix/Linux behavior unchanged - existing functionality preserved
  • Same return types - maintains bool or NotAvailableWithReason
  • Same error messages - consistent user experience

Code Quality

  • ✅ Follows existing code style and conventions
  • ✅ Includes proper error handling for all edge cases
  • ✅ Maintains the same method signature and return types
  • ✅ Added appropriate inline comments
  • ✅ Preserves existing pylint disable comments

Impact

This change enables Windows users to use Apache Beam's YAML functionality without requiring workarounds, manual patches, or Unix-like environments (WSL, Git Bash, etc.).

Before: Windows users encounter FileNotFoundError when using YAML providers
After: YAML providers work seamlessly across all supported platforms


  • fixes #35617
  • Update CHANGES.md with noteworthy changes.

Additional Notes

  • This is a small, focused fix that addresses a specific cross-platform compatibility issue
  • No ICLA required as this is a minor bugfix
  • Ready for review and testing on multiple platforms

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @tchivs, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've addressed a critical cross-platform compatibility issue within Apache Beam's YAML JavaJarProvider. Previously, the available() method failed on Windows systems because it incorrectly used the Unix-specific 'which' command to locate the Java executable. My changes introduce platform-aware logic, utilizing 'where' on Windows and retaining 'which' on Unix/Linux, along with a robust 'shutil.which' fallback. This ensures that Windows users can now seamlessly leverage Apache Beam's YAML functionality without encountering FileNotFoundError.

Highlights

  • Windows Compatibility: The JavaJarProvider.available() method in apache_beam/yaml/yaml_provider.py was updated to resolve a FileNotFoundError on Windows by replacing the Unix 'which' command with the Windows 'where' command for locating the Java executable.
  • Platform-Specific Logic: I've implemented platform.system() to dynamically choose between 'where' (for Windows) and 'which' (for Unix/Linux) when attempting to find the Java executable, ensuring correct behavior across different operating systems.
  • Robust Fallback Mechanism: A try-except block has been added to catch FileNotFoundError and OSError, incorporating shutil.which('java') as a cross-platform fallback to reliably locate Java, even if the initial platform-specific command fails.
  • Improved Error Messaging: The error handling for cases where Java cannot be found has been refined to provide more descriptive messages, aiding in debugging.

github-actions bot commented Aug 6, 2025

The Workflow run is cancelling this PR. It is an earlier duplicate of 1729654 run.

github-actions bot commented Aug 6, 2025

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

tchivs commented Aug 6, 2025

assign set of reviewers

github-actions bot commented Aug 6, 2025

Assigning reviewers:

R: @claudevdm for label python.

Note: If you would like to opt out of this review, comment assign to next reviewer.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

tchivs commented Aug 6, 2025

waiting on author

github-actions bot commented Aug 6, 2025

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers


def available(self):
  # pylint: disable=subprocess-run-check
  trial = subprocess.run(['which', subprocess_server.JavaHelper.get_java()],

@Abacn Abacn Aug 6, 2025

We should fix subprocess_server.JavaHelper.get_java() to support Windows instead. This approach effectively removes the JAVA_HOME environment handling that get_java() takes into account, which affects all platforms and leads to inconsistent results between available() and the jar_provider call.
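
One possible reading of this suggestion is to keep a single place that resolves the Java launcher and have it honor JAVA_HOME on every platform. The sketch below is an assumption for illustration only: resolve_java is a hypothetical helper, not the actual body of subprocess_server.JavaHelper.get_java().

```python
import os
import shutil


def resolve_java(environ=None):
    """Hypothetical helper: locate a Java launcher, preferring JAVA_HOME."""
    environ = os.environ if environ is None else environ
    java_home = environ.get('JAVA_HOME')
    if java_home:
        # shutil.which accepts a custom search path and, on Windows,
        # also tries the extensions listed in PATHEXT (such as .exe).
        found = shutil.which('java', path=os.path.join(java_home, 'bin'))
        if found:
            return found
    # No usable JAVA_HOME: fall back to an ordinary PATH lookup.
    return shutil.which('java', path=environ.get('PATH'))
```

If available() and the code that launches the jar both went through one helper like this, they would at least agree on which Java they mean.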

@ahmedabu98 ahmedabu98 merged commit e0cbfe1 into apache:master Aug 6, 2025
2036 of 2057 checks passed