Skip to content

Comments

Bundle + Encrypt#62

Merged
jangevaare merged 5 commits intorefactor/single-client-pdfsfrom
feat/batch-n-encrypt
Nov 4, 2025
Merged

Bundle + Encrypt#62
jangevaare merged 5 commits intorefactor/single-client-pdfsfrom
feat/batch-n-encrypt

Conversation

@jangevaare
Copy link
Member

  • PDF batching is now PDF bundling
  • Bundling can be used at the same time as encryption - this is to support QA workflows, while still running the pipeline with digital notice delivery in mind
  • Clean up of non-encrypted PDFs now happens during the existing clean up stage, after optional batching has occurred
  • Renamed parameters surrounding clean up to improve clarity around what is happening before a run vs. after
  • before a pipeline run, we can clear the whole output directory, minus logs, which are retained
  • after a pipeline run we can clear non-encrypted pdfs (if batching AND/OR encryption is used), and remove output/artifacts directory
  • Tests updated

Copy link
Contributor

@kassyray kassyray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small things. Looks great otherwise!

I also think this wasn't too large of a PR. A lot of the changes were documentation based.

- Meningococcal
- Varicella
- Other
cleanup:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove cleanup?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced by after and before run?

Copy link
Member Author

@jangevaare jangevaare Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - replaced by pipeline.before_run and pipeline.after_run for clarity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interested to see how our times compare

- Compilation → PDF validation/counting (PDF integrity)
- PDF validation → Encryption (PDF metadata preservation)
- Encryption → Batching (batch manifest generation)
- Encryption → Bundleing (bundle manifest generation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error: Bundleing --> Bundling?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️thank you

- Step 7 of pipeline (optional): groups PDFs into bundlees by school/size
- Enables efficient shipping of notices to schools and districts
- Batching strategy affects how notices are organized for distribution
- Bundleing strategy affects how notices are organized for distribution
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error


Real-world significance:
- Chunking ensures batches don't exceed max_size limit
- Chunking ensures bundlees don't exceed max_size limit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error


Real-world significance:
- Consistent PDF ordering for reproducible batches
- Consistent PDF ordering for reproducible bundlees
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error


Real-world significance:
- Batching disabled (batch_size=0) skips grouping
- Bundleing disabled (bundle_size=0) skips grouping
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error - check all for bundleing

"qr:\n enabled: false\ncleanup:\n remove_directories:\n - artifacts\n - metadata\n"
)
# Modify config to enable artifact removal
import yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to top? I always thought imports at top were best practice but perhaps I am outdated in this knowledge?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be arguments for lazy imports inside functions.

But not an argument I'll make here


assert (tmp_output_structure["artifacts"] / "test.json").exists()
# Modify config to have encryption disabled and batching disabled, but removal requested
import yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we already import?

).write_text("pdf content")

# Ensure both encryption and batching are disabled
import yaml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this again?

No more lazy yaml in test_cleanup

bundleing too
@jangevaare jangevaare force-pushed the feat/batch-n-encrypt branch from c776f46 to ee7bba7 Compare November 4, 2025 21:48
@jangevaare jangevaare merged commit a8d2bd0 into refactor/single-client-pdfs Nov 4, 2025
@jangevaare jangevaare deleted the feat/batch-n-encrypt branch November 4, 2025 21:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants