Feature/firebase lookup #445
Conversation
```python
    dedup_hash,
    "DONE",
    outputs_directory=upload_result.get("outputs_directory"),
)
```
When I tried running this locally, I got the error ERROR | DBRecipeHandler:647 | upload_packing_results_workflow() | 'AWSHandler' object has no attribute 'create_timestamp'. I think this traces back to the fact that we're now calling upload_job_status here instead of update_outputs_directory, so we need to add a check to upload_job_status similar to the one we had in update_outputs_directory, since self.db is currently an AWSHandler instance.
So I think we want to add something like
```python
if not self.db or self.db.s3_client:
    # switch to firebase handler to update job status
    handler = DATABASE_IDS.handlers().get("firebase")
    initialized_db = handler(default_db="staging")
```
to upload_job_status
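To make the suggestion concrete, here is a minimal sketch of how that guard could sit inside upload_job_status. The method signature, the getattr guard, and the trailing body are assumptions for illustration; only DATABASE_IDS.handlers(), default_db="staging", and create_timestamp come from the discussion above, and the necessary imports are assumed to already exist in DBRecipeHandler.py.

```python
def upload_job_status(self, dedup_hash, status, result_path=None, outputs_directory=None):
    # signature is assumed for illustration
    db = self.db
    # self.db may be an AWSHandler here (it exposes an s3_client but has no
    # create_timestamp), so fall back to a firebase handler for the status write;
    # getattr avoids an AttributeError on handlers without an s3_client attribute
    if not db or getattr(db, "s3_client", None):
        handler = DATABASE_IDS.handlers().get("firebase")
        db = handler(default_db="staging")
    timestamp = db.create_timestamp()  # safe now: the firebase handler provides this
    ...
```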
cellpack/autopack/DBRecipeHandler.py
```python
data = {
    "timestamp": timestamp,
    "status": str(status),
    "result_path": result_path,
```
I think we want to move result_path out to an if statement, like we do for outputs_directory, so we don't overwrite its value in firebase with None.
good catch!
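For reference, a minimal sketch of that suggestion, reusing the names from the diff above and assuming outputs_directory is already handled the same way:

```python
data = {
    "timestamp": timestamp,
    "status": str(status),
}
# only include result_path when it is set, mirroring the outputs_directory handling,
# so an unset value never overwrites what is already stored in firebase
if result_path:
    data["result_path"] = result_path
```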
```python
        self.packing_tasks = set()

    async def run_packing(self, job_id, recipe=None, config=None, body=None):
        os.environ["AWS_BATCH_JOB_ID"] = job_id
```
I think we need to add the line setting this job ID back in, since we try to read it in simularium_helper, or we need to update simularium_helper so that dedup_hash is passed to post_and_open_file.
Thanks for catching that! I refactored things to pass dedup_hash through an environment variable, which also seems to have helped address the TODO in simularium_helper. Let me know what you think about this approach; here is the commit
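For context, a rough sketch of that env-based handoff; the variable name DEDUP_HASH and the helper functions are hypothetical illustrations, not the actual commit:

```python
import os
from typing import Optional

# Producer side (e.g. run_packing): expose the hash to downstream code instead of
# relying on AWS_BATCH_JOB_ID. The variable name DEDUP_HASH is illustrative only.
def set_dedup_hash_env(dedup_hash: str) -> None:
    os.environ["DEDUP_HASH"] = dedup_hash

# Consumer side (e.g. simularium_helper / post_and_open_file): read the hash back,
# returning None when it is not set so local runs keep working.
def get_dedup_hash_env() -> Optional[str]:
    return os.environ.get("DEDUP_HASH")
```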


Problem
closes #441
Solution
- Replaced job_id with dedup_hash (SHA256) for both params: recipe (the string path) and json_recipe (dict recipe data)
- Return the cached result_path if found
- Calculated dedup_hash via RecipeLoader.get_dedup_hash() after recipe normalization, to make sure hashing is consistent regardless of source (local_path, firebase_path, json body)
- Note: the previous approach hashed the normalized recipe, so the same semantic recipes from different sources (local_path, firebase_path, json body) would get the same hash. After chatting with @ascibisz, we simplified this to hash only the json body, aligning with the server's planned use case (see the sketch below).
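As a rough illustration of the hashing step only (not the actual RecipeLoader.get_dedup_hash implementation), a deterministic SHA256 over the JSON body could look like this:

```python
import hashlib
import json

def get_dedup_hash(json_recipe: dict) -> str:
    """Illustrative sketch: SHA256 of the canonicalized JSON body."""
    # sort_keys plus compact separators make the serialization deterministic,
    # so the same recipe data always produces the same hash
    canonical = json.dumps(json_recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```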
Type of change
Steps to Verify:
- Check the job_status in firebase
- The result_path in the web response should reference the cached result