refactor: use the same path for dedicated and packed blob #5449
Signed-off-by: Xuanwo <github@xuanwo.io>
westonpace
left a comment
How do you envision compaction affecting blob files? Would small packed blob files (e.g. a few rows at 1MB per row, so each file is only a few MB) be compacted into a single packed file? Or would we never compact blob files?
I ask because compacting packed blob files seems like it would be an ok thing to do, but packing dedicated blob files would make them no longer dedicated (not sure whether that would be a problem).
Yes, we will combine small packed files into a larger one when appropriate.
Yeah, that's actually a beautiful thing about the current design: we CAN compact dedicated blobs into packed blobs too if we want! But we don't necessarily need to, since we can expect dedicated blobs to always be large enough that they don't need to be packed.
…at#5449)

I made this change to make it easier for us to perform compaction or GC, as all blob IDs will now refer to the same blob paths. This means that as long as we know the largest blob IDs, we can simply remove them all at once.

---

**Parts of this PR were drafted with assistance from Codex (with `gpt-5.1-codex-max`) and fully reviewed and edited by me. I take full responsibility for all changes.**

Signed-off-by: Xuanwo <github@xuanwo.io>
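To illustrate the GC argument, here is a minimal sketch of why a single path scheme helps. It is not the code from this PR: the directory layout, the zero-padded hex file name, the `.blob` suffix, and the helper names `blob_path` and `paths_to_remove` are all assumptions made up for the example. The only idea taken from the description is that every blob ID, dedicated or packed, maps to a path the same way, so knowing the largest ID is enough to enumerate every candidate file.

```python
from pathlib import PurePosixPath


def blob_path(data_dir: str, blob_id: int) -> PurePosixPath:
    """Map a blob ID to its path (hypothetical layout).

    The same function is used whether the blob is dedicated or packed,
    so callers never need to know which kind they are dealing with.
    """
    return PurePosixPath(data_dir) / f"{blob_id:08x}.blob"


def paths_to_remove(data_dir: str, max_blob_id: int) -> list[PurePosixPath]:
    """List every candidate blob path for IDs 1..max_blob_id inclusive.

    Assumes IDs start at 1; a real GC pass would still check which of
    these files exist and are unreferenced before deleting them.
    """
    return [blob_path(data_dir, i) for i in range(1, max_blob_id + 1)]
```

With two separate path schemes, the same pass would first have to look up the kind of each blob before it could build the path, which is the extra step this refactor avoids.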