aws: fix assertion failure in debug due to timer access outside of main thread #34138

Merged
mattklein123 merged 31 commits into envoyproxy:main from nbaws:metadata_async on May 29, 2024
Conversation

@nbaws
Contributor

@nbaws nbaws commented May 14, 2024

Commit Message: aws: fix assertion failure in debug due to timer access outside of main thread

Additional Description:

Patch to ensure async credential providers do not access main thread timers when credentials are requested. This was a larger fix than originally expected, due to substantial test case rewrites, changes to refresh logic and changes to the cluster initialisation sequence.

This patch performs the following:

  • Changes async providers to kick off the credential refresh process via onClusterAddOrUpdate, replacing the init handler
  • Changes async providers to be truly async. They now refresh credentials on a timer, set to the expiration time returned in the credential payload or, if none is returned, to the default cache duration of 1 hour. Async providers no longer trigger credential refresh via the data plane (which was the cause of the original bug). Initial refresh attempts start with a 2 second timer that doubles on each failure up to 32 seconds (or until a refresh succeeds), to avoid load on STS or IMDS.
  • Fixes the cache duration calculation to be actually 1 hour, rather than 1 hour * 60 * 60 :)
  • Fixes a reported bug that also caused an assertion failure in route specific configuration
  • Async providers now honour any expiration provided from their credential source
  • Async providers now have statistics that capture the number of successful and failed refreshes, used in test cases
  • Webidentity credential provider now handles the (unlikely) case where more than one region is present in the configuration, such as multiple route specific configs. Webidentity sts clusters will have the region name appended.
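
To illustrate the timer behaviour described in the list above, here is a minimal sketch of how the refresh delays could be computed. It is not the code from this patch: the names `retryDelay`, `refreshDelay`, `kInitialRetryDelay`, `kMaxRetryDelay` and `kDefaultCacheDuration` are hypothetical, and only the 2 second start, the doubling to 32 seconds and the 1 hour default come from the description.

```cpp
#include <algorithm>
#include <chrono>

// Hypothetical names for illustration only; not the identifiers used in the patch.
constexpr std::chrono::seconds kInitialRetryDelay{2};
constexpr std::chrono::seconds kMaxRetryDelay{32};

// Letting std::chrono convert units avoids the kind of slip mentioned above:
// hours(1) converts to exactly 3600 seconds, whereas hours(1) * 60 * 60
// would be 3600 hours.
constexpr std::chrono::seconds kDefaultCacheDuration = std::chrono::hours(1);
static_assert(kDefaultCacheDuration.count() == 3600, "default cache duration is one hour");

// Delay before the next attempt, given how many refreshes have failed in a row:
// 2s, 4s, 8s, 16s, 32s, then capped at 32s.
constexpr std::chrono::seconds retryDelay(unsigned consecutive_failures) {
  // Clamp the shift so large failure counts cannot overflow.
  const unsigned shift = std::min(consecutive_failures, 4u);
  return std::min(kInitialRetryDelay * (1 << shift), kMaxRetryDelay);
}

// Delay until the next refresh after a successful fetch. `expires_in` is the
// time left until the expiration reported by the credential source; it is
// ignored when the source reported no expiration.
constexpr std::chrono::seconds refreshDelay(bool has_expiration,
                                            std::chrono::seconds expires_in) {
  return has_expiration ? expires_in : kDefaultCacheDuration;
}
```

Under these assumptions, `retryDelay(0)` is 2 seconds, `retryDelay(4)` and anything above it is 32 seconds, and `refreshDelay(false, ...)` falls back to the 1 hour default.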

Risk Level: Low
Testing: Unit
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue] #33962
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Nigel Brittain <nbaws@amazon.com>
nbaws added 3 commits May 15, 2024 01:27
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
nbaws added 2 commits May 16, 2024 01:21
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws nbaws closed this May 16, 2024
nbaws added 4 commits May 17, 2024 07:51
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws nbaws reopened this May 18, 2024
@nbaws
Contributor Author

nbaws commented May 18, 2024

@suniltheta

@nbaws nbaws closed this May 18, 2024
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws nbaws reopened this May 19, 2024
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws
Contributor Author

nbaws commented May 19, 2024

A note on this PR - the majority of the change is to implement a handler for onClusterAddOrUpdate, and that handler literally calls refresh() once the cluster is ready to handle requests and is never used again. If there is a shorter and/or cleaner way to do this, please let me know. I've used code from dynamic forward proxy to do most of the onClusterAddOrUpdate handling.
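
For readers following along, the one-shot pattern described in this note can be sketched roughly as below. This is a standalone illustration and not the handler from this PR: `OneShotClusterReadyHandler` is a hypothetical class that does not use Envoy's real cluster manager interfaces, and the cluster name in the usage note is made up.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a cluster-update subscriber. It only models the
// "call refresh() once when the cluster appears, then do nothing" behaviour.
class OneShotClusterReadyHandler {
public:
  OneShotClusterReadyHandler(std::string cluster_name, std::function<void()> refresh)
      : cluster_name_(std::move(cluster_name)), refresh_(std::move(refresh)) {
    assert(refresh_ != nullptr); // An empty callback would throw when invoked.
  }

  // Invoked for every cluster add or update notification.
  void onClusterAddOrUpdate(const std::string& name) {
    // Ignore other clusters, and ignore repeat notifications after the first match.
    if (fired_ || name != cluster_name_) {
      return;
    }
    fired_ = true;
    refresh_();
  }

  bool fired() const { return fired_; }

private:
  const std::string cluster_name_;
  const std::function<void()> refresh_;
  bool fired_{false};
};
```

With a handler constructed for a cluster named, say, `"sts_cluster"`, the first matching `onClusterAddOrUpdate("sts_cluster")` call runs the refresh callback, and every later notification is a no-op.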

nbaws added 7 commits May 19, 2024 11:23
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@mattklein123
Member

@suniltheta for first pass, thanks.

/wait

Contributor

@suniltheta suniltheta left a comment

Thanks for making this change. I am still in the process of reviewing the tests, but I will leave feedback as I have it.

Comment thread source/extensions/common/aws/credentials_provider_impl.h Outdated
Comment thread source/extensions/common/aws/credentials_provider_impl.cc Outdated
Comment thread source/extensions/common/aws/credentials_provider_impl.cc Outdated
Comment thread source/extensions/common/aws/credentials_provider_impl.cc
Comment thread source/extensions/common/aws/credentials_provider_impl.cc
Comment thread source/extensions/common/aws/credentials_provider_impl.cc
Comment thread source/extensions/common/aws/credentials_provider_impl.cc Outdated
Comment thread source/extensions/common/aws/credentials_provider_impl.cc Outdated
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws
Contributor Author

nbaws commented May 21, 2024

/retest

nbaws and others added 9 commits May 25, 2024 11:21
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
…metadata_async

Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
Signed-off-by: Nigel Brittain <nbaws@amazon.com>
@nbaws
Contributor Author

nbaws commented May 27, 2024

@suniltheta some minor refactoring - there was an initialization issue I found with upstream filter that I've now addressed.

Signed-off-by: Nigel Brittain <nbaws@amazon.com>