Bundle + Encrypt #62
jangevaare
commented
Nov 3, 2025
- PDF batching is now PDF bundling
- Bundling can be used at the same time as encryption - this is to support QA workflows, while still running the pipeline with digital notice delivery in mind
- Cleanup of non-encrypted PDFs now happens during the existing cleanup stage, after optional bundling has occurred
- Renamed parameters surrounding clean up to improve clarity around what is happening before a run vs. after
  - before a pipeline run, we can clear the whole output directory except logs, which are retained
  - after a pipeline run, we can clear non-encrypted PDFs (if bundling and/or encryption is used) and remove the output/artifacts directory
- Tests updated
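
The before-run and after-run cleanup described in the list above could be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the function names, the `pdf_dir` argument, and the `_encrypted.pdf` suffix used to recognise encrypted files are all hypothetical.

```python
import shutil
from pathlib import Path


def clear_before_run(output_dir: Path, keep: str = "logs") -> None:
    """Clear the output directory before a run, retaining the logs directory."""
    for entry in list(output_dir.iterdir()):
        if entry.name == keep:
            continue  # logs are retained
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def clear_after_run(
    output_dir: Path,
    pdf_dir: Path,
    *,
    bundling: bool,
    encryption: bool,
    encrypted_suffix: str = "_encrypted.pdf",  # assumed naming convention
) -> None:
    """Remove non-encrypted PDFs (if bundling and/or encryption ran) and artifacts."""
    if bundling or encryption:
        for pdf in list(pdf_dir.glob("*.pdf")):
            if not pdf.name.endswith(encrypted_suffix):
                pdf.unlink()
    # Remove output/artifacts; ignore the error if it does not exist.
    shutil.rmtree(output_dir / "artifacts", ignore_errors=True)
```

Keeping the two stages as separate functions mirrors the before-run versus after-run split that the renamed parameters are meant to make explicit.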
… encryption in single pipeline to support QA workflows while also using electronic delivery
…ections in config
…OR encryption to also be enabled
kassyray
left a comment
Just a few small things. Looks great otherwise!
I also think this wasn't too large of a PR. A lot of the changes were documentation based.
| - Meningococcal | ||
| - Varicella | ||
| - Other | ||
| cleanup: |
Replaced by after and before run?
Correct - replaced by pipeline.before_run and pipeline.after_run for clarity.
Interested to see how our times compare
| - Compilation → PDF validation/counting (PDF integrity) | ||
| - PDF validation → Encryption (PDF metadata preservation) | ||
| - Encryption → Batching (batch manifest generation) | ||
| - Encryption → Bundleing (bundle manifest generation) |
Spelling error: Bundleing --> Bundling?
tests/unit/test_bundle_pdfs.py
Outdated
| - Step 7 of pipeline (optional): groups PDFs into bundlees by school/size | ||
| - Enables efficient shipping of notices to schools and districts | ||
| - Batching strategy affects how notices are organized for distribution | ||
| - Bundleing strategy affects how notices are organized for distribution |
tests/unit/test_bundle_pdfs.py
Outdated
| Real-world significance: | ||
| - Chunking ensures batches don't exceed max_size limit | ||
| - Chunking ensures bundlees don't exceed max_size limit |
tests/unit/test_bundle_pdfs.py
Outdated
| Real-world significance: | ||
| - Consistent PDF ordering for reproducible batches | ||
| - Consistent PDF ordering for reproducible bundlees |
tests/unit/test_bundle_pdfs.py
Outdated
| Real-world significance: | ||
| - Batching disabled (batch_size=0) skips grouping | ||
| - Bundleing disabled (bundle_size=0) skips grouping |
Spelling error - check all for bundleing
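
The behaviour these test docstrings describe (consistent ordering, chunks capped at a maximum size, and `bundle_size=0` disabling grouping) can be sketched as below. This is a minimal illustration only; `chunk_into_bundles` is a hypothetical name, and the real module may also group by school, which is not shown here.

```python
from pathlib import Path


def chunk_into_bundles(pdfs: list[Path], bundle_size: int) -> list[list[Path]]:
    """Split PDFs into ordered bundles of at most bundle_size files each."""
    if bundle_size <= 0:
        return []  # bundling disabled: skip grouping entirely
    ordered = sorted(pdfs)  # consistent ordering for reproducible bundles
    return [
        ordered[i : i + bundle_size]
        for i in range(0, len(ordered), bundle_size)
    ]


# Usage: five notices with a bundle size of two give bundles of 2, 2 and 1.
notices = [Path(f"notice_{n}.pdf") for n in (3, 1, 5, 2, 4)]
bundles = chunk_into_bundles(notices, bundle_size=2)
print([len(b) for b in bundles])  # [2, 2, 1]
```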
tests/unit/test_cleanup.py
Outdated
| "qr:\n enabled: false\ncleanup:\n remove_directories:\n - artifacts\n - metadata\n" | ||
| ) | ||
| # Modify config to enable artifact removal | ||
| import yaml |
Move to top? I always thought imports at top were best practice but perhaps I am outdated in this knowledge?
There can be arguments for lazy imports inside functions.
But not an argument I'll make here
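
For context on the trade-off discussed in this thread: a top-level import loads the module when the file is imported, while a lazy import defers loading until the function that needs it first runs. A minimal sketch, using the standard-library `json` module as a stand-in for `yaml`; both function names are hypothetical.

```python
import json  # top-level import: loaded once, when this file is imported


def parse_top_level(text: str) -> dict:
    """Uses the module imported at the top of the file."""
    return json.loads(text)


def parse_lazy(text: str) -> dict:
    """Defers the import until the function is actually called."""
    import json as lazy_json  # lazy import: resolved on first call

    return lazy_json.loads(text)
```

Both functions return the same result; the difference is only in when the import cost is paid and how visible the dependency is to a reader scanning the top of the file.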
tests/unit/test_cleanup.py
Outdated
| assert (tmp_output_structure["artifacts"] / "test.json").exists() | ||
| # Modify config to have encryption disabled and batching disabled, but removal requested | ||
| import yaml |
tests/unit/test_cleanup.py
Outdated
| ).write_text("pdf content") | ||
| # Ensure both encryption and batching are disabled | ||
| import yaml |
No more lazy yaml in test_cleanup bundleing too
c776f46 to ee7bba7