
Conversation

Collaborator

@rugeli rugeli commented Jan 21, 2026

Problem

closes #441

Solution

  • replaced the UUID-based job_id with a dedup_hash (SHA-256) for both params: recipe (the string path) and json_recipe (dict recipe data)
  • added a Firebase lookup that checks for existing results before packing and returns the cached result_path if found
  • calculated dedup_hash via RecipeLoader.get_dedup_hash() after recipe normalization, to ensure consistent hashing regardless of source (local_path, firebase_path, json body)

note: the previous approach hashed the normalized recipe, so semantically identical recipes from different sources (local_path, firebase_path, json body) would get the same hash. After chatting with @ascibisz, we simplified this to hash only the json body, aligning with the server's planned use case.
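The actual implementation lives in RecipeLoader.get_dedup_hash(); as a minimal sketch (the function name and exact serialization are assumptions, not the project's code), hashing the json body with a canonical serialization might look like:

```python
import hashlib
import json


def get_dedup_hash(json_recipe: dict) -> str:
    """Return a SHA-256 hex digest of the recipe's JSON body.

    Serializing with sorted keys and no extra whitespace makes the hash
    independent of key order and formatting, so semantically identical
    recipes map to the same job document.
    """
    canonical = json.dumps(json_recipe, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Key order then no longer matters: two dicts with the same contents produce the same 64-character digest.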

Type of change

  • New feature (non-breaking change which adds functionality)

Steps to Verify:

  1. run a packing job via the server and verify that a document is created in job_status in Firebase
  2. run the exact same job again; no new document should be created, and the result_path in the web response should reference the cached result
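The cached-result behavior these steps verify can be sketched as a lookup-before-pack flow. Everything here is illustrative: InMemoryJobStore, run_or_reuse, and the method names stand in for the real Firebase handler and packing entry point.

```python
class InMemoryJobStore:
    """Stand-in for the Firebase job_status collection (illustrative only)."""

    def __init__(self):
        self.docs = {}

    def get_doc_by_id(self, collection, doc_id):
        return self.docs.get((collection, doc_id))

    def set_doc(self, collection, doc_id, data):
        self.docs[(collection, doc_id)] = data


def run_or_reuse(db, dedup_hash, pack_fn):
    """Skip packing when a finished job with the same dedup_hash exists."""
    cached = db.get_doc_by_id("job_status", dedup_hash)
    if cached and cached.get("status") == "DONE":
        # same recipe already packed: return the cached result path
        return cached["result_path"]
    result_path = pack_fn()
    db.set_doc(
        "job_status", dedup_hash, {"status": "DONE", "result_path": result_path}
    )
    return result_path
```

Running the same dedup_hash twice creates one document and invokes the packing function only once, which is exactly what step 2 checks.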

@github-actions
Contributor

Packing analysis report

Analysis for packing results located at cellpack/tests/outputs/test_spheres/spheresSST

Ingredient name | Encapsulating radius | Average number packed
ext_A           | 25                   | 236.0

Packing image

Distance analysis

Expected minimum distance: 50.00
Actual minimum distance: 50.01

Ingredient key | Pairwise distance distribution
ext_A          | [distance distribution plot for ext_A]

@rugeli rugeli marked this pull request as ready for review January 23, 2026 23:19
@rugeli rugeli requested a review from ascibisz January 23, 2026 23:19
dedup_hash,
"DONE",
outputs_directory=upload_result.get("outputs_directory"),
)
Collaborator

@ascibisz ascibisz Jan 27, 2026

When I tried running this locally, I got the error ERROR | DBRecipeHandler:647 | upload_packing_results_workflow() | 'AWSHandler' object has no attribute 'create_timestamp'. I think this traces back to the fact that we're now calling upload_job_status here instead of update_outputs_directory, so we need to add a check to upload_job_status similar to the one we had in update_outputs_directory, since self.db is currently an AWSHandler instance.

So I think we want to add something like

if not self.db or getattr(self.db, "s3_client", None):
    # switch to firebase handler to update job status
    handler = DATABASE_IDS.handlers().get("firebase")
    initialized_db = handler(default_db="staging")

to upload_job_status

data = {
    "timestamp": timestamp,
    "status": str(status),
    "result_path": result_path,
Collaborator

I think we want to move result_path out to an if statement, like we do for outputs_directory, so we don't overwrite its value in firebase with None.
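A sketch of that change, with a hypothetical helper name (build_job_status_data is not from the PR; it just isolates the dict construction for illustration):

```python
def build_job_status_data(timestamp, status, result_path=None, outputs_directory=None):
    """Build the job_status document, omitting unset optional fields.

    result_path and outputs_directory are only written when provided,
    so a later status update cannot overwrite an existing value in
    firebase with None.
    """
    data = {"timestamp": timestamp, "status": str(status)}
    if result_path is not None:
        data["result_path"] = result_path
    if outputs_directory is not None:
        data["outputs_directory"] = outputs_directory
    return data
```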

Collaborator Author

good catch!

        self.packing_tasks = set()

    async def run_packing(self, job_id, recipe=None, config=None, body=None):
        os.environ["AWS_BATCH_JOB_ID"] = job_id
Collaborator

I think we need to add the line setting this job ID back in, since we try to read it in simularium_helper, or we need to update simularium_helper to have dedup_hash passed to post_and_open_file.

Collaborator Author

Thanks for catching that! I refactored things to pass dedup_hash through the environment, which also seems to address the TODO in simularium_helper. Let me know what you think about this approach; here is the commit.
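The env-based handoff could be sketched like this. AWS_BATCH_JOB_ID is the variable the server already sets in run_packing; the setter/getter wrappers are illustrative names, not the project's API.

```python
import os


def set_job_id(dedup_hash):
    # the server process sets the env var before packing starts,
    # reusing the existing AWS_BATCH_JOB_ID slot to carry dedup_hash
    os.environ["AWS_BATCH_JOB_ID"] = dedup_hash


def get_job_id():
    # simularium_helper reads it back when posting results,
    # so dedup_hash does not need to be threaded through every call
    return os.environ.get("AWS_BATCH_JOB_ID")
```

The upside of this design is that intermediate functions need no extra parameter; the downside is that the dependency is implicit, so it's worth documenting at both ends.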

@rugeli rugeli merged commit 17ba17c into feature/server-passed-recipe-json Jan 29, 2026
2 checks passed
@rugeli rugeli deleted the feature/firebase-lookup branch January 29, 2026 21:12