@thiyaguk09
Contributor

@thiyaguk09 thiyaguk09 commented Oct 7, 2025

Description

This change refines error handling in util.makeRequest to intercept common low-level network failures (ECONNRESET, ETIMEDOUT, "timed out", "TLS handshake") and transform each into a standard Error carrying one of three specific messages.

The following raw errors are now intercepted:

  • ECONNRESET
  • ETIMEDOUT
  • Generic messages containing "timed out"
  • Generic messages containing "TLS handshake"

Instead of a single message, developers now receive tailored diagnostic information: a specific message for TLS/CPU starvation issues, one for network timeouts, and one for connection resets. This provides clearer guidance for debugging. The original stack trace is preserved on the new Error object. This is an additive change with no breaking impacts, validated by a consolidated, data-driven unit test suite.

Transforming these errors while preserving the original stack trace keeps cryptic, low-level network codes from propagating and gives developers a clear diagnostic message tailored to the type of connection failure experienced.
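The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `transformNetworkError`, the `NetworkMessages` table and its wording, and the choice to test for "tls handshake" before the timeout patterns are all assumptions made for the example.

```typescript
// Hypothetical message table. The PR keeps its real wording in the
// UtilExceptionMessages enum, which is not reproduced here.
const NetworkMessages = {
  TLS: 'TLS handshake failed; the process may be CPU starved.',
  TIMEOUT: 'The network request timed out.',
  RESET: 'The connection was reset by the remote host or an intermediary.',
} as const;

// Node.js network errors usually carry a string `code` such as ECONNRESET.
interface RawNetworkError extends Error {
  code?: string;
}

// Returns a new Error for a recognized failure, or the original error otherwise.
function transformNetworkError(err: RawNetworkError): Error {
  // Match case-insensitively against both the error code and the message.
  const haystack = `${err.code ?? ''} ${err.message}`.toLowerCase();

  let message: string | undefined;
  if (haystack.includes('tls handshake')) {
    // Assumed ordering: checked first so "TLS handshake timed out" maps to TLS.
    message = NetworkMessages.TLS;
  } else if (haystack.includes('etimedout') || haystack.includes('timed out')) {
    message = NetworkMessages.TIMEOUT;
  } else if (haystack.includes('econnreset')) {
    message = NetworkMessages.RESET;
  }

  if (message === undefined) {
    return err; // unrecognized errors pass through untouched
  }

  const transformed = new Error(message);
  transformed.stack = err.stack; // preserve the original stack trace
  return transformed;
}
```

Under these assumptions, an error with `code: 'ECONNRESET'` comes back as a plain Error with the reset message and the original `stack`, while any unrecognized error is returned unchanged.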

Impact

The impact of this change is primarily positive, improving the developer experience:

  • Improved Error Diagnostics (Granular): Developers receive three distinct, specific messages instead of a single ambiguous one. For instance, a TLS handshake error receives a special message related to CPU starvation to guide performance debugging.
  • Consistent Error Handling: Facilitates easier integration with custom error retry and logging mechanisms by providing a predictable, standard Error structure (augmented with the original stack trace) rather than a raw, non-standard network error object.
  • No Breaking Changes: This is a purely additive fix that catches and transforms errors that would have been thrown anyway. It does not alter the successful path for requests.

Testing

Yes, unit tests were added.

  • A dedicated unit test suite, Network Connectivity Errors, was created under makeAuthenticatedRequestFactory to validate the new transformation logic.

  • The test structure was consolidated into a single, data-driven forEach loop that verifies the mapping of the four common failure modes to the three distinct output messages defined in the UtilExceptionMessages enum.

  • A single test loop replaced four separate tests, covering all conditions:

    • should transform raw ECONNRESET into specific network error
    • should transform raw "TLS handshake" into specific network error
    • should transform raw generic "timed out" into specific network error
    • should transform raw ETIMEDOUT into specific network error
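
A data-driven loop of the kind described above might look like the sketch below. It is illustrative only: `classify` is a hypothetical stand-in for the transformation under test, the category names are invented for the example, and plain checks replace the test runner's `it(...)` blocks so that the snippet runs on its own.

```typescript
type Category = 'tls' | 'timeout' | 'reset';
type CodedError = Error & {code?: string};

// Hypothetical stand-in for the logic under test (not the PR's implementation).
function classify(err: CodedError): Category | undefined {
  const text = `${err.code ?? ''} ${err.message}`.toLowerCase();
  if (text.includes('tls handshake')) return 'tls';
  if (text.includes('etimedout') || text.includes('timed out')) return 'timeout';
  if (text.includes('econnreset')) return 'reset';
  return undefined;
}

interface Case {
  name: string;
  error: CodedError;
  expected: Category;
}

// Four raw failure modes mapped onto three expected outcomes.
const cases: Case[] = [
  {
    name: 'raw ECONNRESET',
    error: Object.assign(new Error('socket hang up'), {code: 'ECONNRESET'}),
    expected: 'reset',
  },
  {
    name: 'raw "TLS handshake"',
    error: new Error('TLS handshake failure'),
    expected: 'tls',
  },
  {
    name: 'raw generic "timed out"',
    error: new Error('request timed out'),
    expected: 'timeout',
  },
  {
    name: 'raw ETIMEDOUT',
    error: Object.assign(new Error('connect failed'), {code: 'ETIMEDOUT'}),
    expected: 'timeout',
  },
];

// One loop replaces four near-identical tests; in a real suite each
// iteration would register its own test case with the runner.
const failures: string[] = [];
cases.forEach(({name, error, expected}) => {
  const actual = classify(error);
  if (actual !== expected) {
    failures.push(`${name}: expected ${expected}, got ${String(actual)}`);
  }
});

if (failures.length > 0) {
  throw new Error(failures.join('\n'));
}
```

Adding a fifth failure mode then means adding one row to `cases` rather than writing another test body.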

Tests Changed? No existing tests were modified.

Breaking Changes? No breaking changes are necessary.

Additional Information

  • Error Object Change: The transformation logic was simplified to augment a standard JavaScript Error object.
  • Test Structure: The tests were consolidated into a single forEach loop for improved clarity and maintainability.
  • Stubbing: The authClient was stubbed to guarantee successful authorization. This forces execution into the network path, where the error injection and transformation occur, and prevents test timeouts.
  • Stack Preservation: When an error is transformed, the original stack trace (err.stack) is preserved on the new Error object, so developers can still trace the source of the failure.

Checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease
  • Appropriate docs were updated
  • Appropriate comments were added, particularly in complex areas or places that require background
  • No new warnings or issues will be generated from this change

Fixes #

@product-auto-label product-auto-label bot added the size: m (Pull request size is medium.) and api: storage (Issues related to the googleapis/nodejs-storage API.) labels Oct 7, 2025
@thiyaguk09 thiyaguk09 changed the title from "Feat/improve tls error handling" to "fix: Transform network failures into specific TLS timeout ApiError" Oct 7, 2025
@thiyaguk09 thiyaguk09 marked this pull request as ready for review October 9, 2025 04:40
@thiyaguk09 thiyaguk09 requested review from a team as code owners October 9, 2025 04:40
@ddelgrosso1 ddelgrosso1 added the owlbot:run label (Add this label to trigger the Owlbot post processor.) Oct 14, 2025
@gcf-owl-bot gcf-owl-bot bot removed the owlbot:run label Oct 14, 2025
@ddelgrosso1
Contributor

General comment but this logic will need to be ported to Gaxios in the future.

@thiyaguk09 thiyaguk09 changed the title from "fix: Transform network failures into specific TLS timeout ApiError" to "fix: Transform network failures into specific TLS timeout" Oct 28, 2025
@thiyaguk09
Contributor Author

This is a gentle reminder to please take a look when you have a moment.

@ddelgrosso1
Contributor

I'm not really sure why we are forcing things such as ECONNRESET, ETIMEDOUT into a TLS error. They may or may not be related to TLS. I think this gives a false impression to the end user. I think we need to rethink what it is we are trying to accomplish here.

@thiyaguk09
Contributor Author

> I'm not really sure why we are forcing things such as ECONNRESET, ETIMEDOUT into a TLS error. They may or may not be related to TLS. I think this gives a false impression to the end user. I think we need to rethink what it is we are trying to accomplish here.

I agree completely—the current grouping is misleading. Thanks for the feedback!

I've updated the error handling to distinguish these failures:

  • "tls handshake" for specific protocol issues.
  • "etimedout" / "timed out" for network timeouts.
  • "econnreset" for connection resets.

This provides a much more accurate diagnosis for the end-user.

Transforms raw network errors (ECONNRESET, ETIMEDOUT, timed out, and TLS
handshake) into a specific ApiError (code 408) with a descriptive
message regarding potential CPU starvation.

This prevents misleading error propagation from the underlying request
library.

Splits network error handling: uses 408 for timeouts (timed out,
ETIMEDOUT, TLS handshake) and 503 for connection resets (ECONNRESET) to
improve retry logic accuracy.

Converts raw ECONNRESET, ETIMEDOUT, and TLS handshake failures into a
standard Error object with an informative message. This helps diagnose
CPU starvation or misleading 401 errors.

Replaces repetitive test cases in `makeAuthenticatedRequest` and
`makeRequest` with a single, data-driven test loop. This verifies all
conditions (ECONNRESET, ETIMEDOUT, "timed out", "TLS handshake") with
reduced code duplication and improved maintenance.
…ilures

Separates specific network transport errors (`ETIMEDOUT`, `ECONNRESET`)
from genuine TLS handshake failures.

The previous approach incorrectly categorized lower-level connection
issues as "TLS errors," leading to misleading diagnostics for end-users.
This change ensures accurate reporting based on the error pattern:
- "tls handshake": Protocol/certificate issue.
- "etimedout" / "timed out": Network timeout/availability.
- "econnreset": Connection forcefully reset by host/intermediary.

@thiyaguk09 thiyaguk09 force-pushed the feat/improve-tls-error-handling branch from 4d68ab1 to 264beb5 December 10, 2025 09:25
@thiyaguk09
Contributor Author

@ddelgrosso1 For better readability, I’ve added some additional errors related to the network issue. Please review them.

@ddelgrosso1
Contributor

The new changes still suffer from the same issue: we are rewrapping errors and adding our own spin on what the cause may or may not be. We have gone to great lengths to not rewrap most errors thrown by the GCS server, and I don't think this class of errors should be any different. I don't think this PR is at all necessary, but someone is free to override me.
