export-tar --tar-format=BORG_C / import-tar: support chunked tar content #6643
ThomasWaldmann wants to merge 1 commit into borgbackup:master from
while the BORG format uses a full, raw content byte stream, the BORG_C format uses a sequence of chunk packs.

each pack is:
- 32bit size (signed)
- 256bit chunk id
- <size> bytes data (optional, only present if size != -1)

for simplicity, a pack is generated for each entry in item.chunks, but only still missing chunks have data. packs with no data (size == -1) must already exist in the target repository.

for simplicity / for now:
- export-tar decrypts and decompresses, but chunks and chunk ids are kept
- import-tar does not recompute chunk ids, accepts missing chunks "as is" (but recompresses / re-encrypts) and increfs already present chunks.
- no preload via archive.iter_items for chunked mode
- have_chunks is initialised to the empty set, thus only inner duplication in the exported archive is considered.
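
To make the pack layout in the description concrete, here is a minimal Python sketch of writing and reading such a stream. It is not the code from this PR: the helper names (`write_pack`, `read_packs`, `export_packs`) are hypothetical, and the little-endian byte order is an assumption, since the description only says "32bit size (signed)". The `have_chunks` handling mirrors the description: it starts as an empty set, so only duplicates within the exported stream are sent without data.

```python
import io
import struct

# Assumption: little-endian byte order; the description does not specify it.
SIZE_FMT = "<i"   # 32-bit signed size
ID_LEN = 32       # 256-bit chunk id
NO_DATA = -1      # size value marking a pack that carries no data


def write_pack(stream, chunk_id, data=None):
    """Write one pack: size, chunk id, then the data if present."""
    if len(chunk_id) != ID_LEN:
        raise ValueError("chunk id must be 32 bytes")
    size = NO_DATA if data is None else len(data)
    stream.write(struct.pack(SIZE_FMT, size))
    stream.write(chunk_id)
    if data is not None:
        stream.write(data)


def read_packs(stream):
    """Yield (chunk_id, data) tuples; data is None for packs with size == -1."""
    header_len = struct.calcsize(SIZE_FMT) + ID_LEN
    while True:
        header = stream.read(header_len)
        if not header:
            return  # clean end of stream
        if len(header) != header_len:
            raise ValueError("truncated pack header")
        (size,) = struct.unpack_from(SIZE_FMT, header)
        chunk_id = header[struct.calcsize(SIZE_FMT):]
        if size == NO_DATA:
            yield chunk_id, None
            continue
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated pack data")
        yield chunk_id, data


def export_packs(stream, chunks, have_chunks=None):
    """Write one pack per (chunk_id, data) entry, omitting data for known ids."""
    have_chunks = set() if have_chunks is None else have_chunks
    for chunk_id, data in chunks:
        if chunk_id in have_chunks:
            write_pack(stream, chunk_id)        # receiver must already have it
        else:
            write_pack(stream, chunk_id, data)  # first occurrence carries data
            have_chunks.add(chunk_id)


if __name__ == "__main__":
    buf = io.BytesIO()
    id_a, id_b = b"\x01" * ID_LEN, b"\x02" * ID_LEN
    export_packs(buf, [(id_a, b"hello"), (id_b, b"world"), (id_a, b"hello")])
    buf.seek(0)
    for chunk_id, data in read_packs(buf):
        print(chunk_id[:2].hex(), data)
```

Using -1 rather than 0 as the "no data" marker keeps a genuinely empty chunk (size 0) distinguishable from a chunk whose data was omitted.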
Codecov Report
@@ Coverage Diff @@
## master #6643 +/- ##
==========================================
- Coverage 82.94% 82.60% -0.34%
==========================================
Files 39 39
Lines 10669 10715 +46
Branches 2094 2102 +8
==========================================
+ Hits 8849 8851 +2
- Misses 1307 1344 +37
- Partials 513 520 +7
From @callegar: There are a couple of things that are not completely clear to me...
Am I missing something? Thanks in advance for any detail you can provide that I am missing. I would really appreciate it, out of curiosity. But if you can't, I will fully understand, since I do not want to take up too much of your time!
@callegar 1. and 2. -> exactly.
The tar-pipe as a clear boundary between old code / old repo and new code / new repo would be nice, but due to its nature it lacks capabilities (like two-way communication). When talking directly to an old repo, the new code would need to keep the capabilities to do that, and that requires quite some stuff I would like to get rid of (like old crypto).
Considering this is not finished and would need another channel to tell the code what we already have at the destination, I think I will abandon this PR in favour of
Closing this.