Skip to content

Fix: Race between manifest object creation and application of tags (#5958)#6003

Merged
achave11-ucsc merged 5 commits intodevelopfrom
issues/achave11-ucsc/5958-manifest-tagging-race
May 14, 2024
Merged

Fix: Race between manifest object creation and application of tags (#5958)#6003
achave11-ucsc merged 5 commits intodevelopfrom
issues/achave11-ucsc/5958-manifest-tagging-race

Conversation

@achave11-ucsc
Copy link
Member

@achave11-ucsc achave11-ucsc commented Feb 29, 2024

Connected issues: #5958

Checklist

Author

  • PR is a draft
  • Target branch is develop
  • Name of PR branch matches issues/<GitHub handle of author>/<issue#>-<slug>
  • On ZenHub, PR is connected to all issues it (partially) resolves
  • PR description links to connected issues
  • PR title matches1 that of a connected issue or comment in PR explains why they're different
  • PR title references all connected issues
  • For each connected issue, there is at least one commit whose title references that issue

1 when the issue title describes a problem, the corresponding PR
title is Fix: followed by the issue title

Author (partiality)

  • Added p tag to titles of partial commits
  • This PR is labeled partial or completely resolves all connected issues
  • This PR partially resolves each of the connected issues or does not have the partial label

Author (chains)

  • This PR is blocked by previous PR in the chain or is not chained to another PR
  • The blocking PR is labeled base or this PR is not chained to another PR
  • This PR is labeled chained or is not chained to another PR

Author (reindex, API changes)

  • Added r tag to commit title or the changes introduced by this PR will not require reindexing of any deployment
  • This PR is labeled reindex:dev or the changes introduced by it will not require reindexing of dev
  • This PR is labeled reindex:anvildev or the changes introduced by it will not require reindexing of anvildev
  • This PR is labeled reindex:anvilprod or the changes introduced by it will not require reindexing of anvilprod
  • This PR is labeled reindex:prod or the changes introduced by it will not require reindexing of prod
  • This PR is labeled reindex:partial and its description documents the specific reindexing procedure for dev, anvildev, anvilprod and prod or requires a full reindex or carries none of the labels reindex:dev, reindex:anvildev, reindex:anvilprod and reindex:prod
  • This PR and its connected issues are labeled API or this PR does not modify a REST API
  • Added a (A) tag to commit title for backwards (in)compatible changes or this PR does not modify a REST API
  • Updated REST API version number in app.py or this PR does not modify a REST API

Author (upgrading deployments)

  • Ran make image_manifests.json and committed the resulting changes or this PR does not modify azul_docker_images, or any other variables referenced in the definition of that variable
  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading deployments
  • Added u tag to commit title or this PR does not require upgrading deployments
  • This PR is labeled upgrade or does not require upgrading deployments
  • This PR is labeled deploy:shared or does not modify image_manifests.json, and does not require deploying the shared component for any other reason
  • This PR is labeled deploy:gitlab or does not require deploying the gitlab component
  • This PR is labeled deploy:runner or does not require deploying the runner image

Author (hotfixes)

  • Added F tag to main commit title or this PR does not include permanent fix for a temporary hotfix
  • Reverted the temporary hotfixes for any connected issues or the none of the stable branches (anvilprod and prod) have temporary hotfixes for any of the issues connected to this PR

Author (before every review)

  • Rebased PR branch on develop, squashed old fixups
  • Ran make requirements_update or this PR does not modify requirements*.txt, common.mk, Makefile and Dockerfile
  • Added R tag to commit title or this PR does not modify requirements*.txt
  • This PR is labeled reqs or does not modify requirements*.txt
  • make integration_test passes in personal deployment or this PR does not modify functionality that could affect the IT outcome

Peer reviewer (after approval)

  • PR is not a draft
  • Ticket is in Review requested column
  • PR is awaiting requested review from system administrator
  • PR is assigned to only the system administrator

System administrator (after approval)

  • Actually approved the PR
  • Labeled connected issues as demo or no demo
  • Commented on connected issues about demo expectations or all connected issues are labeled no demo
  • Decided if PR can be labeled no sandbox
  • A comment to this PR details the completed security design review
  • PR title is appropriate as title of merge commit
  • N reviews label is accurate
  • Moved ticket to Approved column
  • PR is assigned to only the operator

Operator (before pushing merge the commit)

  • Checked reindex:… labels and r commit title tag
  • Checked that demo expectations are clear or all connected issues are labeled no demo
  • Squashed PR branch and rebased onto develop
  • Sanity-checked history
  • Pushed PR branch to GitHub
  • Ran _select dev.shared && CI_COMMIT_REF_NAME=develop make -C terraform/shared apply_keep_unused or this PR is not labeled deploy:shared
  • Ran _select dev.gitlab && CI_COMMIT_REF_NAME=develop make -C terraform/gitlab apply or this PR is not labeled deploy:gitlab
  • Ran _select anvildev.shared && CI_COMMIT_REF_NAME=develop make -C terraform/shared apply_keep_unused or this PR is not labeled deploy:shared
  • Ran _select anvildev.gitlab && CI_COMMIT_REF_NAME=develop make -C terraform/gitlab apply or this PR is not labeled deploy:gitlab
  • Checked the items in the next section or this PR is labeled deploy:gitlab
  • PR is assigned to only the system administrator or this PR is not labeled deploy:gitlab

System administrator

  • Background migrations for dev.gitlab are complete or this PR is not labeled deploy:gitlab
  • Background migrations for anvildev.gitlab are complete or this PR is not labeled deploy:gitlab
  • PR is assigned to only the operator

Operator (before pushing merge the commit)

  • Ran _select dev.gitlab && make -C terraform/gitlab/runner or this PR is not labeled deploy:runner
  • Ran _select anvildev.gitlab && make -C terraform/gitlab/runner or this PR is not labeled deploy:runner
  • Added sandbox label or PR is labeled no sandbox
  • Pushed PR branch to GitLab dev or PR is labeled no sandbox
  • Pushed PR branch to GitLab anvildev or PR is labeled no sandbox
  • Build passes in sandbox deployment or PR is labeled no sandbox
  • Build passes in anvilbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in sandbox deployment or PR is labeled no sandbox
  • Reviewed build logs for anomalies in anvilbox deployment or PR is labeled no sandbox
  • Deleted unreferenced indices in sandbox or this PR does not remove catalogs or otherwise causes unreferenced indices in dev
  • Deleted unreferenced indices in anvilbox or this PR does not remove catalogs or otherwise causes unreferenced indices in anvildev
  • Started reindex in sandbox or this PR is not labeled reindex:dev
  • Started reindex in anvilbox or this PR is not labeled reindex:anvildev
  • Checked for failures in sandbox or this PR is not labeled reindex:dev
  • Checked for failures in anvilbox or this PR is not labeled reindex:anvildev
  • The title of the merge commit starts with the title of this PR
  • Added PR # reference to merge commit title
  • Collected commit title tags in merge commit title but only included p if the PR is also labeled partial
  • Moved connected issues to Merged lower column in ZenHub
  • Pushed merge commit to GitHub

Operator (chain shortening)

  • Changed the target branch of the blocked PR to develop or this PR is not labeled base
  • Removed the chained label from the blocked PR or this PR is not labeled base
  • Removed the blocking relationship from the blocked PR or this PR is not labeled base
  • Removed the base label from this PR or this PR is not labeled base

Operator (after pushing the merge commit)

  • Pushed merge commit to GitLab dev
  • Pushed merge commit to GitLab anvildev
  • Build passes on GitLab dev
  • Reviewed build logs for anomalies on GitLab dev
  • Build passes on GitLab anvildev
  • Reviewed build logs for anomalies on GitLab anvildev
  • Ran _select dev.shared && make -C terraform/shared apply or this PR is not labeled deploy:shared
  • Ran _select anvildev.shared && make -C terraform/shared apply or this PR is not labeled deploy:shared
  • Deleted PR branch from GitHub
  • Deleted PR branch from GitLab dev
  • Deleted PR branch from GitLab anvildev

Operator (reindex)

  • Deindexed all unreferenced catalogs in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Deindexed all unreferenced catalogs in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Deindexed specific sources in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Deindexed specific sources in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Indexed specific sources in dev or this PR is neither labeled reindex:partial nor reindex:dev
  • Indexed specific sources in anvildev or this PR is neither labeled reindex:partial nor reindex:anvildev
  • Started reindex in dev or this PR does not require reindexing dev
  • Started reindex in anvildev or this PR does not require reindexing anvildev
  • Checked for, triaged and possibly requeued messages in both fail queues in dev or this PR does not require reindexing dev
  • Checked for, triaged and possibly requeued messages in both fail queues in anvildev or this PR does not require reindexing anvildev
  • Emptied fail queues in dev or this PR does not require reindexing dev
  • Emptied fail queues in anvildev or this PR does not require reindexing anvildev

Operator

  • Propagated the deploy:shared, deploy:gitlab, deploy:runner, reindex:partial, reindex:anvilprod and reindex:prod labels to the next promotion PRs or this PR carries none of these labels
  • Propagated any specific instructions related to the deploy:shared, deploy:gitlab, deploy:runner, reindex:partial, reindex:anvilprod and reindex:prod labels, from the description of this PR to that of the next promotion PRs or this PR carries none of these labels
  • PR is assigned to no one

Shorthand for review comments

  • L line is too long
  • W line wrapping is wrong
  • Q bad quotes
  • F other formatting problem

@coveralls
Copy link

coveralls commented Feb 29, 2024

Coverage Status

coverage: 85.155% (-0.1%) from 85.278%
when pulling 1f9552d on issues/achave11-ucsc/5958-manifest-tagging-race
into 5ffd004 on develop.

@codecov
Copy link

codecov bot commented Feb 29, 2024

Codecov Report

Attention: Patch coverage is 5.12821% with 37 lines in your changes are missing coverage. Please review.

Project coverage is 85.13%. Comparing base (5ffd004) to head (1f9552d).

Files Patch % Lines
test/integration_test.py 0.00% 36 Missing ⚠️
src/azul/service/manifest_service.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6003      +/-   ##
===========================================
- Coverage    85.26%   85.13%   -0.13%     
===========================================
  Files          155      155              
  Lines        20456    20488      +32     
===========================================
+ Hits         17441    17443       +2     
- Misses        3015     3045      +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch 2 times, most recently from 12f5e8f to 6aed74d Compare February 29, 2024 21:53
Copy link
Contributor

@nadove-ucsc nadove-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Subject: [PATCH] review
---
Index: test/integration_test.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/integration_test.py b/test/integration_test.py
--- a/test/integration_test.py	(revision 6aed74d7a4ce6f480c5aac8fdeb270327bb018a9)
+++ b/test/integration_test.py	(date 1709251288557)
@@ -650,23 +650,20 @@
             ManifestFormat.compact: self._check_compact_manifest,
             ManifestFormat.curl: self._check_curl_manifest,
         }
+        format, validator = self.random.choice([*validators.items()])
         filters = self.__init_manifest_filters(catalog)
-
-        format = self.random.choice([*validators])
-        manifeset_url = config.service_endpoint.set(path='/manifest/files')
+        manifest_url = config.service_endpoint.set(path='/manifest/files',
+                                                   args=dict(catalog=catalog,
+                                                             filters=json.dumps(filters),
+                                                             format=format.value))
 
         with self.subTest('manifest-tagging', catalog=catalog, format=format):
-            args = dict(catalog=catalog, filters=json.dumps(filters))
-            validator = validators[format]
-            args['format'] = format.value
-            manifeset_url.args = args
             method = PUT
-
-            responses = list()
+            responses = []
             while True:
-                response = self._get_url(method, manifeset_url)
-                responses.append(response)
+                response = self._get_url(method, manifest_url)
                 if response.status == 301:
+                    responses.append(response)
                     # Request the same manifest without following the redirect
                     # in order to expose the race condition that returns an
                     # untagged cached manifest. This happens when a step
@@ -674,12 +671,12 @@
                     # in the process of tagging the object.
                     time.sleep(2)
                 elif response.status == 302:
-                    method, manifeset_url = GET, furl(response.headers['Location'])
+                    responses.append(response)
+                    method, manifest_url = GET, furl(response.headers['Location'])
                 else:
                     break
 
             validator(catalog, response.data)
-            responses.remove(response)
             execution_ids = self._manifest_execution_ids(responses, fetch=False)
             self.assertEqual(1, len(execution_ids))

# Can't be absent under S3's strong consistency
assert False, (object_key, self.file_name_tag)
# Paged manifest generators apply the tag after creating
# the object. If we happened to be right inbetween the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# the object. If we happened to be right inbetween the
# the object. If we happen to be right in between the

expect_execution = fetch == first_fetch
self.assertEqual(1 if expect_execution else 0, len(execution_ids))

def __init_manifest_filters(self, catalog: CatalogName) -> JSON:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method name is strange. Why two underscores? I can't think anything satisfactorily descriptive that's also reasonably terse, so unless you can think of something better I'd just go with

Suggested change
def __init_manifest_filters(self, catalog: CatalogName) -> JSON:
def _manifest_filters(self, catalog: CatalogName) -> JSON:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, commit title in unnecessarily detailed. I think "Refactor IT manifest filters" is sufficient. The motivation is obvious from context.

}
filters = self.__init_manifest_filters(catalog)

format = self.random.choice([*validators])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choosing a format at random seems odd to me. If we gain a meaningful increase in coverage from testing multiple formats, we should always test both, otherwise, pick one so we can simplify the test.

file_name = self.file_name(manifest_key, base_name=partition.file_name)
tagging = self.tagging(file_name)
if tagging is not None:
time.sleep(16) # Exposes race between manifest completion and tagging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is needed to expose the race condition, why is it in a drop! commit?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PL:

We'll be using the drop! commit in sandbox IT for testing/validation purposes.

@nadove-ucsc nadove-ucsc removed their assignment Mar 1, 2024
@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch 2 times, most recently from 7483662 to 9f6f649 Compare March 5, 2024 21:49
Copy link
Contributor

@nadove-ucsc nadove-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The drop! commit message claims that it reverts commit 748366, but that commit isn't on this branch. I presume that it's meant to point at b269717 and the reference was broken due to a rebase.

I'm also confused as why the change to 16s is included in this commit, since that line wasn't changed in the commit that it appears to be reverting. Normally, if commit B claims to revert commit A, then I expect squashing A and B to result in an empty diff, but that doesn't seem to be the case here. Perhaps you could have two separate drop! commits, one to revert the fix and one to change the sleep time? But I might be overthinking this.

ManifestFormat.curl: self._check_curl_manifest,
}

for format, validator in validators:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you test this? I think it needs to be:

Suggested change
for format, validator in validators:
for format, validator in validators.items():

}

def _test_manifest_tagging_race(self, catalog: CatalogName):
validators: dict[ManifestFormat, Callable[[str, bytes], None]] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
validators: dict[ManifestFormat, Callable[[str, bytes], None]] = {
validators: dict[ManifestFormat, Callable[[CatalogName, bytes], None]] = {

@nadove-ucsc nadove-ucsc removed their assignment Mar 6, 2024
@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch from 1c4ade6 to 509d2ba Compare March 6, 2024 19:14
@achave11-ucsc
Copy link
Member Author

The drop! commit message claims that it reverts commit 748366, …

Yeah, I was trying to keep the changes necessary to expose the race in a single !drop commit, which included reverting the fix and including a sleep in the manifest tagging process to make the bug stick out during the IT process. Which was one of the suggestions that came out of PL; I was a bit careless about the contents of the revert commit. However, to you point, your suggestions make the history clearer and the intent easier to follow.

@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch 3 times, most recently from 0856bec to 75b4b72 Compare March 7, 2024 00:40
nadove-ucsc
nadove-ucsc previously approved these changes Mar 7, 2024
Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the IT failures looks like what I would expect. On anvilbox and hammerbox, the curl manifest is disallowed and on sandbox it's not an assertion that fails but rather we get a MaxRetryError from urllib3.

@hannes-ucsc hannes-ucsc removed their assignment Mar 22, 2024
@achave11-ucsc
Copy link
Member Author

For the next round, also add service logs with 500.

@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch 2 times, most recently from 6933dfb to e6df3a6 Compare March 28, 2024 21:52
@achave11-ucsc achave11-ucsc force-pushed the issues/achave11-ucsc/5958-manifest-tagging-race branch from e6df3a6 to 216047b Compare April 3, 2024 20:28
@achave11-ucsc
Copy link
Member Author

None of the IT failures looks like what I would expect.

As confirmed during PL, this is the expected failure since we end up using a request method _get_url_unchecked which retries 500 status responses.

retries=urllib3.Retry(total=5,
redirect=0,
status_forcelist={429, 500, 502, 503}),

And as describe in issue #5958, this issue exhibits itself by returning a 500 status response from the service, when present.

When the sandbox runs (with the !drop commits) and fails, include CloudWatch service logs of the failed requests.

hannes-ucsc

This comment was marked as duplicate.

Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Index: test/integration_test.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/integration_test.py b/test/integration_test.py
--- a/test/integration_test.py	(revision 216047b1f0dc8f2f844979b8bb9292058e8d3b70)
+++ b/test/integration_test.py	(date 1712263094671)
@@ -648,45 +648,44 @@
 
     def _test_manifest_tagging_race(self, catalog: CatalogName):
         validators: dict[ManifestFormat, Callable[[CatalogName, bytes], None]] = {
-            ManifestFormat.compact: self._check_compact_manifest
+            ManifestFormat.compact: self._check_compact_manifest,
+            ManifestFormat.curl: self._check_curl_manifest
         }
-        # Test curl manifest format only in plugins that use it
-        if ManifestFormat.curl in self.metadata_plugin(catalog).manifest_formats:
-            validators[ManifestFormat.curl] = self._check_curl_manifest
-
+        plugin = self.metadata_plugin(catalog)
         for format, validator in validators.items():
-            with self.subTest('manifest_tagging_race', catalog=catalog, format=format):
-                filters = self._manifest_filters(catalog)
-                manifest_url = config.service_endpoint.set(path='/manifest/files',
-                                                           args=dict(catalog=catalog,
-                                                                     filters=json.dumps(filters),
-                                                                     format=format.value))
-                method = PUT
-                responses = []
-                while True:
-                    response = self._get_url(method, manifest_url)
-                    if response.status == 301:
-                        responses.append(response)
-                        # Request the same manifest without following the
-                        # redirect in order to expose the race condition that
-                        # returns an untagged cached manifest. This happens
-                        # when a step function execution has generated a
-                        # manifest but is still in the process of tagging the
-                        # object. The more requests we make, the more likely it
-                        # is that we catch the execution in this racy state. We
-                        # still have to throttle the requests in order to
-                        # prevent tripping the WAF rate limit.
-                        rate_limit = config.waf_rate_rule_period / config.waf_rate_rule_limit
-                        time.sleep(rate_limit)
-                    elif response.status == 302:
-                        responses.append(response)
-                        method, manifest_url = GET, furl(response.headers['Location'])
-                    else:
-                        break
+            if format in plugin.manifest_formats:
+                with self.subTest('manifest_tagging_race', catalog=catalog, format=format):
+                    filters = self._manifest_filters(catalog)
+                    manifest_url = config.service_endpoint.set(path='/manifest/files',
+                                                               args=dict(catalog=catalog,
+                                                                         filters=json.dumps(filters),
+                                                                         format=format.value))
+                    method = PUT
+                    responses = []
+                    while True:
+                        response = self._get_url(method, manifest_url)
+                        if response.status == 301:
+                            responses.append(response)
+                            # Request the same manifest without following the
+                            # redirect in order to expose the race condition that
+                            # returns an untagged cached manifest. This happens
+                            # when a step function execution has generated a
+                            # manifest but is still in the process of tagging the
+                            # object. The more requests we make, the more likely it
+                            # is that we catch the execution in this racy state. We
+                            # still have to throttle the requests in order to
+                            # prevent tripping the WAF rate limit.
+                            rate_limit = config.waf_rate_rule_period / config.waf_rate_rule_limit
+                            time.sleep(rate_limit)
+                        elif response.status == 302:
+                            responses.append(response)
+                            method, manifest_url = GET, furl(response.headers['Location'])
+                        else:
+                            break
 
-                validator(catalog, response.data)
-                execution_ids = self._manifest_execution_ids(responses, fetch=False)
-                self.assertEqual(1, len(execution_ids))
+                    validator(catalog, response.data)
+                    execution_ids = self._manifest_execution_ids(responses, fetch=False)
+                    self.assertEqual(1, len(execution_ids))
 
     def _manifest_execution_ids(self,
                                 responses: list[urllib3.HTTPResponse],

@hannes-ucsc hannes-ucsc removed their assignment Apr 4, 2024
hannes-ucsc
hannes-ucsc previously approved these changes Apr 5, 2024
@hannes-ucsc
Copy link
Member

@dsotirho-ucsc, same drill as last time we tried to merge this: #6003 (comment)

with the additional step documented here: #6003 (comment)

@dsotirho-ucsc
Copy link
Contributor

dsotirho-ucsc commented Apr 5, 2024

As expected, PR failed IT on sandbox. Drop commits have been removed.

CloudWatch Logs on sandbox:
(First request inlined below, full logs for the six failed requests attached here: 6003_service_5xx.json)

[
    {
        "@timestamp": "2024-04-05 10:11:25.551",
        "@message": "START RequestId: 20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4 Version: $LATEST\n"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.552",
        "@message": "[INFO]\t2024-04-05T10:11:25.552Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.chalice\tReceived PUT request for '/manifest/files', with {\"query\": {\"catalog\": \"dcp3-it\", \"filters\": \"{\\\"fileSize\\\": {\\\"within\\\": [[0, 1577461978389]]}}\", \"format\": \"compact\"}, \"headers\": {\"accept-encoding\": \"identity\", \"cloudfront-forwarded-proto\": \"https\", \"cloudfront-is-desktop-viewer\": \"true\", \"cloudfront-is-mobile-viewer\": \"false\", \"cloudfront-is-smarttv-viewer\": \"false\", \"cloudfront-is-tablet-viewer\": \"false\", \"cloudfront-viewer-asn\": \"14618\", \"cloudfront-viewer-country\": \"US\", \"host\": \"service.sandbox.dev.singlecell.gi.ucsc.edu\", \"user-agent\": \"python-urllib3/1.26.18\", \"via\": \"1.1 ffa8ec5cfe61dcaaebc108ff8c867054.cloudfront.net (CloudFront)\", \"x-amz-cf-id\": \"QHygW-TANDgIwpGv57Jbbc_nQVa56KxXc21Tq-xqzcN0l3NdptJFFA==\", \"x-amzn-trace-id\": \"Root=1-660fce4d-689ed0d100a62a127b6a3305\", \"x-forwarded-for\": \"52.20.215.168, 70.132.60.85\", \"x-forwarded-port\": \"443\", \"x-forwarded-proto\": \"https\"}}."
    },
    {
        "@timestamp": "2024-04-05 10:11:25.552",
        "@message": "[INFO]\t2024-04-05T10:11:25.552Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.chalice\tDid not authenticate request."
    },
    {
        "@timestamp": "2024-04-05 10:11:25.553",
        "@message": "[INFO]\t2024-04-05T10:11:25.553Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\tdynamodb.GetItem:\tMaking POST request to https://dynamodb.us-east-1.amazonaws.com/"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.553",
        "@message": "[INFO]\t2024-04-05T10:11:25.553Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\tdynamodb.GetItem:\t… with request body b'{\"TableName\": \"azul-sources_cache_by_auth-sandbox\", \"Key\": {\"identity\": {\"S\": \"dcp3-it:\"}}, \"ProjectionExpression\": \"sources,...'"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.556",
        "@message": "[INFO]\t2024-04-05T10:11:25.556Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\tdynamodb.GetItem:\tGot 200 response"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.556",
        "@message": "[INFO]\t2024-04-05T10:11:25.556Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\tdynamodb.GetItem:\t… with response body b'{\"Item\":{\"expiration\":{\"N\":\"1712311919\"},\"sources\":{\"S\":\"[\\\\\"a45481be-52fe-4458-b926-37b82fb0b04a\\\\\", \\\\\"dc7a86a7-e75c-4012-934b...'"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.557",
        "@message": "[DEBUG]\t2024-04-05T10:11:25.557Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.service.manifest_service\tComputing content hash for manifest using filters Filters(explicit={'fileSize': {'within': [[0, 1577461978389]]}}, source_ids={'5d9ff6e0-3ec0-44f5-bb5c-32691a09517b', '8bf100c0-7ac3-4860-80ca-28093f4adb61', '139ecf31-b6bb-419a-97c8-1245029631bb', '879a755b-ebc0-4ebb-b4ad-6c89054c4851', 'd74bbb92-2f76-4637-b386-4a60eec93e28', '421f4b48-5e68-4e61-92bb-c0218500e73c', '215793b0-b991-4b73-9584-1103d4072f39', '64032479-8f07-4a87-8a54-09a7c329e17c', 'da0f61e8-5c00-4cac-bc52-97a3f485d350', '155a019c-ca47-432d-939b-6e51d5bec730', 'cc1c98a4-bfc4-45f2-b8dc-e920e5ca634d', '7d9c28b0-d55e-4157-b3a1-9d2e9535a66c', 'fcc6b9e5-3e37-464f-8d5b-2b7b3d40f128', '54052b19-6fa1-41cd-b301-6d130b9ad516', 'd15b9e6e-dd53-4b80-80f1-4c698300b15a', 'e2151834-13cd-4156-9ea2-168a1b7abf60', '057262af-09d8-4975-8647-2a68a5d33629', 'dc4c099e-1d33-4810-aca5-036a026f3aae', 'a3b6a664-1afb-4b4f-bf08-63bab26c848f', '2c5901f6-67bf-40cc-aea1-621a60c19baa', '768c4624-c07b-4496-8b7b-76275a149774', 'f5a52dbb-acad-47fd-8832-436134301ea2', '0ab78bd0-2e02-4536-b3c9-bdd7a57d7988', 'e470d283-fd86-45b3-bec2-0c1663c16a67', '24ddfa48-402b-43df-8dff-f75d5e283193', '9eab82b6-0b35-4c00-9ddc-fa4f1b8cab99', 'f142a2f7-4c50-458d-a047-27bdb8e0d367', 'a6ffed95-9d29-4484-a5db-c080a6278665', '01b1ac88-8144-428e-a84c-88e9ed8ec6a2', '7138a5ec-41de-458d-9ac3-2fa2b0575787', 'e612386b-287d-4424-beb8-4ab62f925642', '3d661f98-8742-434a-98de-168dde5b5120', 'f012a311-5d55-42ad-8b77-f8a1816f3aa4', 'a45481be-52fe-4458-b926-37b82fb0b04a', 'ca753d3d-8726-4b2c-a606-b702e3ea8be0', 'fb68966f-f91e-428f-a167-118be31d2882', 'a73ed2f2-12c7-4c71-a09d-3189b0fedc26', '182ee25a-e89b-4800-923e-3ad1210b5816', '9b53159b-d0b8-44e7-8871-ca086529709c', '1e63b2ce-88ec-4c56-b016-56e9bd7f35e5', 'a177b37c-ff59-4a45-97e0-0f29a3136728', 'daa3a6f0-c1e0-4e38-b6aa-235fa509daf5', '41d88011-1981-4f50-b6fe-78f7b7d27bab', 'ed2f43fc-b4b2-4be1-aed9-b0c4f46afd78', '8a527b32-5ae3-437e-8a96-12562894849c', '62f19274-c8a9-4f7f-b913-108e6e71f293', '79abca77-cb9b-4905-8853-190367e397dc', '07bd9e9a-f3d5-4ce7-a7b5-a82cf5ede5b9', '61cbb99b-7b93-4d06-bdeb-5d45f3b34407', 'e785046f-93af-4c79-94b1-d8d0b4c04689', '90d35849-38f1-4259-86dd-b91023a6b9d8', '70c0ed2f-0db6-4245-b1c5-d140cc6c712c', 'd7250e7a-20c6-42ff-9332-4bc5a5fd2470', '104f2254-2af9-4699-9573-00d2da718430', 'cabce9f2-b98b-49f3-9dd4-d7e14c19acd2', '5c3ab1e3-24a9-4ae7-9c56-638422b1ae8b', 'f5a609de-0760-4a39-b633-9d751aa7d472', 'f7eefeb5-4198-4d9a-b3db-f2db6a567a15', 'ee9a4b92-82c1-4324-85ec-252cd8cea024', 'a36cb7cb-0122-40af-8942-1d5ec9372e77', '19510524-bbb9-4883-9228-fcbee947d27e', '1568e68d-3032-4571-a066-dac6ccec9716', '001afc86-da4c-4739-85be-26ca98d2693f', '79f24905-a9e2-4328-91da-fc18036b77eb', 'e2db098b-a834-449f-bd97-0a76a6a9d581', '1c32d6d0-6249-4b0e-a3a4-3a262ed610ef', 'c935fa3e-06f7-4e85-9ff1-c58eaa789978', 'a21be4c9-9778-4713-bc31-1a1c905a6ad3', '170988f8-1dd1-4f39-b72d-645dabf607b7', 'd8ee3cfa-32c9-49c2-b1ae-af0a7bb1c98d', '90c2d68f-a000-4c8d-a493-8be93a71b44d', '74b361b1-b13c-49ae-83f0-12ff7da06e0e', '0c86170e-312d-4b39-a0a4-2a2bfaa24c7a', 'c9551b96-7d8d-47c7-90dd-b90eec36504a', '23f0064e-3615-4009-9dfc-60583f92064e', '2d6116af-ceb1-4324-ba47-65fbc4202028', 'f84e6b4a-719c-4c75-891c-be52c106c6b8', 'e2af563e-68e2-495a-a51a-041e1e32f037', '05ed25dc-408d-4045-998d-6166fc2c3293', 'b1083e8b-4de9-467a-97de-18179c4e6bd1', 'f90f5d7f-c507-4e56-abfc-b965a66023fb', '806a9a68-1804-4234-85e1-aac7cdbffd73', 'e9fb5a5a-6eca-40b4-a894-3a3c65b7908e', 'd7f9e55f-fdce-4996-a2e3-4ab697a12bca', '0559efce-d188-4a59-9f6f-2a6258ca7e79', 'c594fc9a-f1c5-4770-acaa-013c2852e660', '02b9f4e0-3bee-4c23-87d4-485ddb414a87', '9d173ed8-c0ff-4a43-baf5-eb30ea7fba6d', '251c52e6-3bda-48df-a4f2-50aa759d7d80', '9bf03ef5-4f42-4afb-b451-21923c6ff468', '403ef1b1-7b7f-4e13-bf4a-d898f047310a', '4c15703d-2502-4863-8d82-fc1b6ccf65f0', '32cfbe46-2cd9-4b13-95c0-e2670373dd9a', '385eb563-005e-43ed-adb7-d3c9e0a8ae93', 'cc6e85e6-6e30-4bf0-8c52-56fbd8a1bdaf', '58a0eacf-0caa-449b-9fc8-4fe629e99176', '1b3866aa-d2a3-42cf-888d-7fb57a77c5aa', '98f112bc-743e-4f38-8b7a-6efb08802ef0', 'e1ab6579-ea65-45a7-abea-869b055b5a6a', '40f2f1d0-f5f4-4847-9e65-5865f2cea4fe', '43983b5a-24f5-4cc6-b37d-27901430b9aa', '2efd70fe-5e5b-453c-a851-e31ed0229e4f', '9e33ae37-eb19-47b5-a5fd-e2e8a41d3574', '477240da-1ce8-422a-aef4-cc2cf65dd539', '7496e23c-ff51-45fc-9d0f-37659b290038', '68598166-b4e2-49a0-b025-51a0159f6649', '5f55707c-6cb2-4b8a-a1b0-9099617fae2e', 'a0925f5f-3d38-4f8a-a09f-f6997072dc23', '92166035-2bfc-412b-91a5-fdff874967b2', '1719e522-0e4d-4149-8952-b723dc5d304a', 'ae52c25c-9ee4-4a42-83f2-6e4b20116380', 'd1437ee2-cfb9-4635-822d-8593da2954fa', '6bc74ceb-b871-4e41-af59-4bedcc7f15c4', '06eca349-0e2c-4785-a93a-1eeb46a50220', 'da13c055-9b6e-4539-be07-230f662b0273', '0522bcb4-e8f9-4c0b-936b-b2d729cbcac7', 'a8d6882f-0950-4085-8357-6cdece410071', '006f69b6-7f42-4eba-81df-86805c22dd3d', 'd0670652-b45a-450f-a76f-2c729cd149d7', '60f11345-3c51-43c7-9c40-4d41eae11a17', '924ed952-4c7b-455f-871b-c740b72d405f', '13c6bf33-41a3-4330-a3f7-1e69f675b01e', '69d49001-0f5f-46a8-8396-b83fd1c41e6b', 'e0c597f8-4aa4-485c-b19c-c46b564550a5', '4a245d7b-c43c-4bcb-a738-90ae11378928', '743190c3-fb25-4ccc-9e66-f076ca624c03', '790795c4-49b1-4ac8-a060-207b92ea08c5', 'e0664f4e-cf09-488a-841d-4baf2cbf1507', 'dc7a86a7-e75c-4012-934b-86e7f7ff21eb', '69525f31-1a87-4a22-9ea3-e10ade1b7343', 'b580fa04-b2ef-472e-bd39-4a51368e813c', 'fcca83ce-9015-42ae-a846-504cefa7902a', '0e5501d1-39ca-44bd-b2a8-127841eb19ba', '03b58060-2316-42fc-811f-0e5895399e96', '8635ff71-8513-48ea-a8d9-f4432f6fcc32', '6b073ea9-caea-4a64-9cbf-745d17d5a0c8', 'e889c6f4-8783-46d6-89bc-8ab9685edf35', 'bb2aaa54-3c4d-4fc4-93c6-6c8c67d4f989', '49fbab54-9c7e-405e-9443-ec1e19a32fb1', 'd5222430-0f7a-4bd1-8706-c58b4397a9b8', '2e0b01ae-bf72-43a6-864a-4419af665b80', '31d19143-1faa-471a-8575-4a8052fdcf19', '3011a1cf-1388-4819-8091-fee55e878ea1', '3739a9ee-e7d7-4ced-860d-cf89e81d4437', '96c6482b-7949-4d6e-894b-371149e85134', '7b6a1bb0-7da1-4975-bd61-04865a1dc296', '830b26f1-f900-40ca-a69d-eb530d73703f', 'a54071f5-9e06-4184-b84e-0b49b2af5b34', '5d88275d-d7a5-492b-8e2a-4e93305a0b46', '356c3be6-78c2-47e0-b71a-5d0ffd6603de', 'c6e19201-4fae-4213-a16c-edfd0cc11335'}) ..."
    },
    {
        "@timestamp": "2024-04-05 10:11:25.558",
        "@message": "[INFO]\t2024-04-05T10:11:25.558Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\telasticsearch\tMaking POST request to https://vpc-azul-index-sandbox-mcwjphhhdivigzrsrdmxm2uude.us-east-1.es.amazonaws.com:443/azul_v2_sandbox_dcp3-it_files_aggregate/_search"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.558",
        "@message": "[INFO]\t2024-04-05T10:11:25.558Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\telasticsearch\t… with request body b'{\"query\":{\"bool\":{\"must\":[{\"constant_score\":{\"filter\":{\"range\":{\"contents.files.size\":{\"gte\":0,\"lte\":1577461978389,\"relation\"...'"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.567",
        "@message": "[INFO]\t2024-04-05T10:11:25.567Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\telasticsearch\tGot 200 response after 0.009s from POST to https://vpc-azul-index-sandbox-mcwjphhhdivigzrsrdmxm2uude.us-east-1.es.amazonaws.com:443/azul_v2_sandbox_dcp3-it_files_aggregate/_search"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.567",
        "@message": "[INFO]\t2024-04-05T10:11:25.567Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\telasticsearch\t… with response body '{\"took\":0,\"timed_out\":false,\"_shards\":{\"total\":1,\"successful\":1,\"skipped\":0,\"failed\":0},\"hits\":{\"total\":{\"value\":73,\"relation\":…'"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.568",
        "@message": "[INFO]\t2024-04-05T10:11:25.568Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.service.manifest_service\tManifest content hash -480359814 was computed in 0.011s using filters Filters(explicit={'fileSize': {'within': [[0, 1577461978389]]}}, source_ids={'5d9ff6e0-3ec0-44f5-bb5c-32691a09517b', '8bf100c0-7ac3-4860-80ca-28093f4adb61', '139ecf31-b6bb-419a-97c8-1245029631bb', '879a755b-ebc0-4ebb-b4ad-6c89054c4851', 'd74bbb92-2f76-4637-b386-4a60eec93e28', '421f4b48-5e68-4e61-92bb-c0218500e73c', '215793b0-b991-4b73-9584-1103d4072f39', '64032479-8f07-4a87-8a54-09a7c329e17c', 'da0f61e8-5c00-4cac-bc52-97a3f485d350', '155a019c-ca47-432d-939b-6e51d5bec730', 'cc1c98a4-bfc4-45f2-b8dc-e920e5ca634d', '7d9c28b0-d55e-4157-b3a1-9d2e9535a66c', 'fcc6b9e5-3e37-464f-8d5b-2b7b3d40f128', '54052b19-6fa1-41cd-b301-6d130b9ad516', 'd15b9e6e-dd53-4b80-80f1-4c698300b15a', 'e2151834-13cd-4156-9ea2-168a1b7abf60', '057262af-09d8-4975-8647-2a68a5d33629', 'dc4c099e-1d33-4810-aca5-036a026f3aae', 'a3b6a664-1afb-4b4f-bf08-63bab26c848f', '2c5901f6-67bf-40cc-aea1-621a60c19baa', '768c4624-c07b-4496-8b7b-76275a149774', 'f5a52dbb-acad-47fd-8832-436134301ea2', '0ab78bd0-2e02-4536-b3c9-bdd7a57d7988', 'e470d283-fd86-45b3-bec2-0c1663c16a67', '24ddfa48-402b-43df-8dff-f75d5e283193', '9eab82b6-0b35-4c00-9ddc-fa4f1b8cab99', 'f142a2f7-4c50-458d-a047-27bdb8e0d367', 'a6ffed95-9d29-4484-a5db-c080a6278665', '01b1ac88-8144-428e-a84c-88e9ed8ec6a2', '7138a5ec-41de-458d-9ac3-2fa2b0575787', 'e612386b-287d-4424-beb8-4ab62f925642', '3d661f98-8742-434a-98de-168dde5b5120', 'f012a311-5d55-42ad-8b77-f8a1816f3aa4', 'a45481be-52fe-4458-b926-37b82fb0b04a', 'ca753d3d-8726-4b2c-a606-b702e3ea8be0', 'fb68966f-f91e-428f-a167-118be31d2882', 'a73ed2f2-12c7-4c71-a09d-3189b0fedc26', '182ee25a-e89b-4800-923e-3ad1210b5816', '9b53159b-d0b8-44e7-8871-ca086529709c', '1e63b2ce-88ec-4c56-b016-56e9bd7f35e5', 'a177b37c-ff59-4a45-97e0-0f29a3136728', 'daa3a6f0-c1e0-4e38-b6aa-235fa509daf5', '41d88011-1981-4f50-b6fe-78f7b7d27bab', 'ed2f43fc-b4b2-4be1-aed9-b0c4f46afd78', '8a527b32-5ae3-437e-8a96-12562894849c', '62f19274-c8a9-4f7f-b913-108e6e71f293', '79abca77-cb9b-4905-8853-190367e397dc', '07bd9e9a-f3d5-4ce7-a7b5-a82cf5ede5b9', '61cbb99b-7b93-4d06-bdeb-5d45f3b34407', 'e785046f-93af-4c79-94b1-d8d0b4c04689', '90d35849-38f1-4259-86dd-b91023a6b9d8', '70c0ed2f-0db6-4245-b1c5-d140cc6c712c', 'd7250e7a-20c6-42ff-9332-4bc5a5fd2470', '104f2254-2af9-4699-9573-00d2da718430', 'cabce9f2-b98b-49f3-9dd4-d7e14c19acd2', '5c3ab1e3-24a9-4ae7-9c56-638422b1ae8b', 'f5a609de-0760-4a39-b633-9d751aa7d472', 'f7eefeb5-4198-4d9a-b3db-f2db6a567a15', 'ee9a4b92-82c1-4324-85ec-252cd8cea024', 'a36cb7cb-0122-40af-8942-1d5ec9372e77', '19510524-bbb9-4883-9228-fcbee947d27e', '1568e68d-3032-4571-a066-dac6ccec9716', '001afc86-da4c-4739-85be-26ca98d2693f', '79f24905-a9e2-4328-91da-fc18036b77eb', 'e2db098b-a834-449f-bd97-0a76a6a9d581', '1c32d6d0-6249-4b0e-a3a4-3a262ed610ef', 'c935fa3e-06f7-4e85-9ff1-c58eaa789978', 'a21be4c9-9778-4713-bc31-1a1c905a6ad3', '170988f8-1dd1-4f39-b72d-645dabf607b7', 'd8ee3cfa-32c9-49c2-b1ae-af0a7bb1c98d', '90c2d68f-a000-4c8d-a493-8be93a71b44d', '74b361b1-b13c-49ae-83f0-12ff7da06e0e', '0c86170e-312d-4b39-a0a4-2a2bfaa24c7a', 'c9551b96-7d8d-47c7-90dd-b90eec36504a', '23f0064e-3615-4009-9dfc-60583f92064e', '2d6116af-ceb1-4324-ba47-65fbc4202028', 'f84e6b4a-719c-4c75-891c-be52c106c6b8', 'e2af563e-68e2-495a-a51a-041e1e32f037', '05ed25dc-408d-4045-998d-6166fc2c3293', 'b1083e8b-4de9-467a-97de-18179c4e6bd1', 'f90f5d7f-c507-4e56-abfc-b965a66023fb', '806a9a68-1804-4234-85e1-aac7cdbffd73', 'e9fb5a5a-6eca-40b4-a894-3a3c65b7908e', 'd7f9e55f-fdce-4996-a2e3-4ab697a12bca', '0559efce-d188-4a59-9f6f-2a6258ca7e79', 'c594fc9a-f1c5-4770-acaa-013c2852e660', '02b9f4e0-3bee-4c23-87d4-485ddb414a87', '9d173ed8-c0ff-4a43-baf5-eb30ea7fba6d', '251c52e6-3bda-48df-a4f2-50aa759d7d80', '9bf03ef5-4f42-4afb-b451-21923c6ff468', '403ef1b1-7b7f-4e13-bf4a-d898f047310a', '4c15703d-2502-4863-8d82-fc1b6ccf65f0', '32cfbe46-2cd9-4b13-95c0-e2670373dd9a', '385eb563-005e-43ed-adb7-d3c9e0a8ae93', 'cc6e85e6-6e30-4bf0-8c52-56fbd8a1bdaf', '58a0eacf-0caa-449b-9fc8-4fe629e99176', '1b3866aa-d2a3-42cf-888d-7fb57a77c5aa', '98f112bc-743e-4f38-8b7a-6efb08802ef0', 'e1ab6579-ea65-45a7-abea-869b055b5a6a', '40f2f1d0-f5f4-4847-9e65-5865f2cea4fe', '43983b5a-24f5-4cc6-b37d-27901430b9aa', '2efd70fe-5e5b-453c-a851-e31ed0229e4f', '9e33ae37-eb19-47b5-a5fd-e2e8a41d3574', '477240da-1ce8-422a-aef4-cc2cf65dd539', '7496e23c-ff51-45fc-9d0f-37659b290038', '68598166-b4e2-49a0-b025-51a0159f6649', '5f55707c-6cb2-4b8a-a1b0-9099617fae2e', 'a0925f5f-3d38-4f8a-a09f-f6997072dc23', '92166035-2bfc-412b-91a5-fdff874967b2', '1719e522-0e4d-4149-8952-b723dc5d304a', 'ae52c25c-9ee4-4a42-83f2-6e4b20116380', 'd1437ee2-cfb9-4635-822d-8593da2954fa', '6bc74ceb-b871-4e41-af59-4bedcc7f15c4', '06eca349-0e2c-4785-a93a-1eeb46a50220', 'da13c055-9b6e-4539-be07-230f662b0273', '0522bcb4-e8f9-4c0b-936b-b2d729cbcac7', 'a8d6882f-0950-4085-8357-6cdece410071', '006f69b6-7f42-4eba-81df-86805c22dd3d', 'd0670652-b45a-450f-a76f-2c729cd149d7', '60f11345-3c51-43c7-9c40-4d41eae11a17', '924ed952-4c7b-455f-871b-c740b72d405f', '13c6bf33-41a3-4330-a3f7-1e69f675b01e', '69d49001-0f5f-46a8-8396-b83fd1c41e6b', 'e0c597f8-4aa4-485c-b19c-c46b564550a5', '4a245d7b-c43c-4bcb-a738-90ae11378928', '743190c3-fb25-4ccc-9e66-f076ca624c03', '790795c4-49b1-4ac8-a060-207b92ea08c5', 'e0664f4e-cf09-488a-841d-4baf2cbf1507', 'dc7a86a7-e75c-4012-934b-86e7f7ff21eb', '69525f31-1a87-4a22-9ea3-e10ade1b7343', 'b580fa04-b2ef-472e-bd39-4a51368e813c', 'fcca83ce-9015-42ae-a846-504cefa7902a', '0e5501d1-39ca-44bd-b2a8-127841eb19ba', '03b58060-2316-42fc-811f-0e5895399e96', '8635ff71-8513-48ea-a8d9-f4432f6fcc32', '6b073ea9-caea-4a64-9cbf-745d17d5a0c8', 'e889c6f4-8783-46d6-89bc-8ab9685edf35', 'bb2aaa54-3c4d-4fc4-93c6-6c8c67d4f989', '49fbab54-9c7e-405e-9443-ec1e19a32fb1', 'd5222430-0f7a-4bd1-8706-c58b4397a9b8', '2e0b01ae-bf72-43a6-864a-4419af665b80', '31d19143-1faa-471a-8575-4a8052fdcf19', '3011a1cf-1388-4819-8091-fee55e878ea1', '3739a9ee-e7d7-4ced-860d-cf89e81d4437', '96c6482b-7949-4d6e-894b-371149e85134', '7b6a1bb0-7da1-4975-bd61-04865a1dc296', '830b26f1-f900-40ca-a69d-eb530d73703f', 'a54071f5-9e06-4184-b84e-0b49b2af5b34', '5d88275d-d7a5-492b-8e2a-4e93305a0b46', '356c3be6-78c2-47e0-b71a-5d0ffd6603de', 'c6e19201-4fae-4213-a16c-edfd0cc11335'})."
    },
    {
        "@timestamp": "2024-04-05 10:11:25.569",
        "@message": "[INFO]\t2024-04-05T10:11:25.569Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.HeadObject:\tMaking HEAD request to https://s3.amazonaws.com/edu-ucsc-gi-platform-hca-dev-storage-sandbox.us-east-1/manifests/2f3c7abc-b363-5107-b6bb-769dbaca8f71.ad14b32b-dea0-555e-b745-8ebe5bb0958d.tsv"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.569",
        "@message": "[INFO]\t2024-04-05T10:11:25.569Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.HeadObject:\t… without request body"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.584",
        "@message": "[INFO]\t2024-04-05T10:11:25.584Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.HeadObject:\tGot 200 response"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.585",
        "@message": "[INFO]\t2024-04-05T10:11:25.585Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.HeadObject:\t… with response body b''"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.585",
        "@message": "[DEBUG]\t2024-04-05T10:11:25.585Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.service.manifest_service\tManifest object expires in 654514.414835 seconds, on 2024-04-13 00:00:00+00:00"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.586",
        "@message": "[INFO]\t2024-04-05T10:11:25.586Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.GetObjectTagging:\tMaking GET request to https://s3.amazonaws.com/edu-ucsc-gi-platform-hca-dev-storage-sandbox.us-east-1/manifests/2f3c7abc-b363-5107-b6bb-769dbaca8f71.ad14b32b-dea0-555e-b745-8ebe5bb0958d.tsv?tagging"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.586",
        "@message": "[INFO]\t2024-04-05T10:11:25.586Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.GetObjectTagging:\t… without request body"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.600",
        "@message": "[INFO]\t2024-04-05T10:11:25.600Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.GetObjectTagging:\tGot 200 response"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.600",
        "@message": "[INFO]\t2024-04-05T10:11:25.600Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.boto3\ts3.GetObjectTagging:\t… with response body b'<?xml version=\"1.0\" encoding=\"UTF-8\"?>\\n<Tagging xmlns=\"http://s3.amazonaws.com/doc/2006-03-01/\"><TagSet/></Tagging>'"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.603",
        "@message": "[ERROR]\t2024-04-05T10:11:25.600Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul-service-sandbox\tCaught exception for path /manifest/files\nTraceback (most recent call last):\n  File \"/var/task/azul/service/manifest_service.py\", line 751, in _get_cached_manifest_file_name\n    encoded_file_name = tagging[self.file_name_tag]\n                        ~~~~~~~^^^^^^^^^^^^^^^^^^^^\nKeyError: 'azul_file_name'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/var/task/chalice/app.py\", line 1917, in _get_view_function_response\n    response = view_function(**function_args)\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/app.py\", line 1536, in file_manifest\n    return _file_manifest(fetch=False)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/app.py\", line 1572, in _file_manifest\n    return app.manifest_controller.get_manifest_async(query_params=query_params,\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/azul/service/manifest_controller.py\", line 139, in get_manifest_async\n    manifest = self.service.get_cached_manifest(format=format,\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/azul/service/manifest_service.py\", line 658, in get_cached_manifest\n    return self._get_cached_manifest(generator_cls, manifest_key)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/azul/service/manifest_service.py\", line 697, in _get_cached_manifest\n    file_name = self._get_cached_manifest_file_name(generator_cls, manifest_key)\n                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/azul/service/manifest_service.py\", line 754, in _get_cached_manifest_file_name\n    assert False, (object_key, self.file_name_tag)\nAssertionError: ('manifests/2f3c7abc-b363-5107-b6bb-769dbaca8f71.ad14b32b-dea0-555e-b745-8ebe5bb0958d.tsv', 'azul_file_name')"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.603",
        "@message": "[DEBUG]\t2024-04-05T10:11:25.603Z\t20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tazul.chalice\tReturning 500 response with headers {\"Content-Type\": \"text/plain\", \"Access-Control-Allow-Origin\": \"*\", \"Access-Control-Allow-Headers\": \"Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key\", \"Strict-Transport-Security\": \"max-age=31536000; includeSubDomains\", \"X-Content-Type-Options\": \"nosniff\", \"X-Frame-Options\": \"DENY\"}. See next line for the first 1024 characters of the body.\nTraceback (most recent call last):\n  File \"/var/task/azul/service/manifest_service.py\", line 751, in _get_cached_manifest_file_name\n    encoded_file_name = tagging[self.file_name_tag]\n                        ~~~~~~~^^^^^^^^^^^^^^^^^^^^\nKeyError: 'azul_file_name'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/var/task/chalice/app.py\", line 1917, in _get_view_function_response\n    response = view_function(**function_args)\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/app.py\", line 1536, in file_manifest\n    return _file_manifest(fetch=False)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/app.py\", line 1572, in _file_manifest\n    return app.manifest_controller.get_manifest_async(query_params=query_params,\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/var/task/azul/service/manifest_controller.py\", line 139, in get_manifest_async\n    manifest = self.service.get_cached_manifest("
    },
    {
        "@timestamp": "2024-04-05 10:11:25.604",
        "@message": "END RequestId: 20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\n"
    },
    {
        "@timestamp": "2024-04-05 10:11:25.604",
        "@message": "REPORT RequestId: 20cb21e1-f604-4e6f-b4dd-56f2fc8c85c4\tDuration: 53.37 ms\tBilled Duration: 54 ms\tMemory Size: 2048 MB\tMax Memory Used: 174 MB\t\n"
    }
]

Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sandbox failures look good. We don't need to perform that experiment again.

But I do want to make a few refactorings:

Subject: [PATCH] REVIEW 2
---
Index: test/integration_test.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/test/integration_test.py b/test/integration_test.py
--- a/test/integration_test.py	(revision 9cfdbee5fc17671da184e471848373c42e313795)
+++ b/test/integration_test.py	(date 1712346000951)
@@ -578,15 +578,7 @@
                 self._check_endpoint(GET, path, args=args, endpoint=endpoint)
 
     def _test_manifest(self, catalog: CatalogName):
-        supported_formats = self.metadata_plugin(catalog).manifest_formats
-        assert supported_formats
-        validators: dict[ManifestFormat, Callable[[str, bytes], None]] = {
-            ManifestFormat.compact: self._check_compact_manifest,
-            ManifestFormat.terra_bdbag: self._check_terra_bdbag_manifest,
-            ManifestFormat.terra_pfb: self._check_terra_pfb_manifest,
-            ManifestFormat.curl: self._check_curl_manifest,
-            ManifestFormat.verbatim_jsonl: self._check_jsonl_manifest
-        }
+        supported_formats = self._manifest_formats(catalog)
         for format in [None, *supported_formats]:
             filters = self._manifest_filters(catalog)
             first_fetch = bool(self.random.getrandbits(1))
@@ -594,9 +586,8 @@
                 with self.subTest('manifest', catalog=catalog, format=format, fetch=fetch):
                     args = dict(catalog=catalog, filters=json.dumps(filters))
                     if format is None:
-                        validator = validators[first(supported_formats)]
+                        format = first(supported_formats)
                     else:
-                        validator = validators[format]
                         args['format'] = format.value
 
                     # Wrap self._get_url to collect all HTTP responses
@@ -616,7 +607,7 @@
 
                         def worker(_):
                             response = self._check_endpoint(PUT, '/manifest/files', args=args, fetch=fetch)
-                            validator(catalog, response)
+                            self._manifest_validators[format](catalog, response)
 
                         num_workers = 3
                         with ThreadPoolExecutor(max_workers=num_workers) as tpe:
@@ -646,14 +637,25 @@
             }
         }
 
-    def _test_manifest_tagging_race(self, catalog: CatalogName):
-        validators: dict[ManifestFormat, Callable[[CatalogName, bytes], None]] = {
+    def _manifest_formats(self, catalog: CatalogName) -> Sequence[ManifestFormat]:
+        supported_formats = self.metadata_plugin(catalog).manifest_formats
+        assert supported_formats
+        return supported_formats
+
+    @cached_property
+    def _manifest_validators(self) -> dict[ManifestFormat, Callable[[str, bytes], None]]:
+        return {
             ManifestFormat.compact: self._check_compact_manifest,
-            ManifestFormat.curl: self._check_curl_manifest
+            ManifestFormat.terra_bdbag: self._check_terra_bdbag_manifest,
+            ManifestFormat.terra_pfb: self._check_terra_pfb_manifest,
+            ManifestFormat.curl: self._check_curl_manifest,
+            ManifestFormat.verbatim_jsonl: self._check_jsonl_manifest
         }
-        plugin = self.metadata_plugin(catalog)
-        for format, validator in validators.items():
-            if format in plugin.manifest_formats:
+
+    def _test_manifest_tagging_race(self, catalog: CatalogName):
+        supported_formats = self._manifest_formats(catalog)
+        for format in [ManifestFormat.compact, ManifestFormat.curl]:
+            if format in supported_formats:
                 with self.subTest('manifest_tagging_race', catalog=catalog, format=format):
                     filters = self._manifest_filters(catalog)
                     manifest_url = config.service_endpoint.set(path='/manifest/files',
@@ -684,11 +686,11 @@
                             responses.append(response)
                             method, manifest_url = GET, furl(response.headers['Location'])
                         else:
+                            assert response.status == 200, response
+                            self._manifest_validators[format](catalog, response.data)
                             break
 
-                    validator(catalog, response.data)
-                    execution_ids = self._manifest_execution_ids(responses,
-                                                                 fetch=False)
+                    execution_ids = self._manifest_execution_ids(responses, fetch=False)
                     self.assertEqual(1, len(execution_ids))
 
     def _manifest_execution_ids(self,

We can leave the drop commits off.

Comment on lines +753 to +761
# Paged manifest generators apply the tag after creating
# the object. If we happen to be right in between the
# creation and application of tags, we pretend that the
# manifest object doesn't exist, assuming that the caller
# will tell the client to check back later. Note that this
# involves the attempted creation of step-function
# execution, but because step-functions generate manifests
# idempotentently, subsequent requests for the same
# manifest are associated to a single SF execution.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Paged manifest generators apply the tag after creating
# the object. If we happen to be right in between the
# creation and application of tags, we pretend that the
# manifest object doesn't exist, assuming that the caller
# will tell the client to check back later. Note that this
# involves the attempted creation of step-function
# execution, but because step-functions generate manifests
# idempotentently, subsequent requests for the same
# manifest are associated to a single SF execution.
# While unpaged manifest generators apply the tag *at* object creation, paged ones do so in a separate request. If we happen to get here right in between the
# creation and application of tags, we pretend that the
# manifest object doesn't exist, relying on client to check back later. Note that this
# involves the attempted creation of a step-function (SF)
# execution, but because SFs generate manifests
# idempotently, subsequent requests for the same
# manifest are associated with a single SF execution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the second sentence. Please raise in PL.

Copy link
Member

@hannes-ucsc hannes-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflicts, unfortunately.

hannes-ucsc
hannes-ucsc previously approved these changes May 10, 2024
@hannes-ucsc
Copy link
Member

Security design review

  • Security design review completed; this PR does not
    • … affect authentication; for example:
      • OAuth 2.0 with the application (API or Swagger UI)
      • Authentication of developers with Google Cloud APIs
      • Authentication of developers with AWS APIs
      • Authentication with a GitLab instance in the system
      • Password and 2FA authentication with GitHub
      • API access token authentication with GitHub
      • Authentication with Terra
    • … affect the permissions of internal users like access to
      • Cloud resources on AWS and GCP
      • GitLab repositories, projects and groups, administration
      • an EC2 instance via SSH
      • GitHub issues, pull requests, commits, commit statuses, wikis, repositories, organizations
    • … affect the permissions of external users like access to
      • TDR snapshots
    • … affect permissions of service or bot accounts
      • Cloud resources on AWS and GCP
    • … affect audit logging in the system, like
      • adding, removing or changing a log message that represents an auditable event
      • changing the routing of log messages through the system
    • … affect monitoring of the system
    • … introduce a new software dependency like
      • Python packages on PYPI
      • Command-line utilities
      • Docker images
      • Terraform providers
    • … add an interface that exposes sensitive or confidential data at the security boundary
    • … affect the encryption of data at rest
    • … require persistence of sensitive or confidential data that might require encryption at rest
    • … require unencrypted transmission of data within the security boundary
    • … affect the network security layer; for example by
      • modifying, adding or removing firewall rules
      • modifying, adding or removing security groups
      • changing or adding a port a service, proxy or load balancer listens on
  • Documentation on any unchecked boxes is provided in comments below

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2 reviews [process] Lead requested changes twice sandbox [process] Resolution is being verified in sandbox deployment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Race between manifest object creation and application of tags

5 participants