
Conversation

@parkan (Collaborator) commented Apr 2, 2025

This is an in-progress fix for the situation where very small data segments (e.g. DAG pieces, remainder data CARs from a larger preparation, or simply small preparations) get padded to the full piece size (generally 32GiB), potentially creating pieces that are 90%+ padding. During the original design of singularity a high padding percentage within a sector was considered acceptable; however, such pieces are now likely to be rejected by SPs, and sectors containing excess padding will not fulfill verified deal requirements. The fully padded pieces may also increase transfer times, depending on the download mechanism.

Currently, any piece will get padded up to the target piece size. We change this logic by:

  • allowing a minimum piece size to be specified (based on cursory discussions with SPs this defaults to 256B, subject to adjustment later)
  • passing both the minimum and target piece sizes to the GetCommp function; this gives flexible control: we retain the previous behavior (see below) but can optionally pass 0 for the target size, resulting in no padding beyond the next power of 2
  • keeping the target size check, which appears to be a "belt and suspenders" measure for most cases because the chunker should already emit correctly sized pieces; this is (1) subject to further validation and (2) probably a good idea anyway (a sketch of the resulting padding decision follows this list)
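To make the resulting behavior concrete, here is a minimal sketch of the padding decision described above; the names (choosePieceSize, nextPowerOfTwo, minPieceSize, targetPieceSize) are illustrative assumptions for this example, not the actual singularity/GetCommp API.

```go
package main

import "fmt"

// nextPowerOfTwo returns the smallest power of two >= n (for n > 0).
func nextPowerOfTwo(n uint64) uint64 {
	p := uint64(1)
	for p < n {
		p <<= 1
	}
	return p
}

// choosePieceSize pads the payload to the next power of two, clamped below by
// minPieceSize; if targetPieceSize is non-zero the result is padded all the
// way up to it, which reproduces the previous behavior.
func choosePieceSize(payload, minPieceSize, targetPieceSize uint64) uint64 {
	size := nextPowerOfTwo(payload)
	if size < minPieceSize {
		size = minPieceSize
	}
	if targetPieceSize != 0 && size < targetPieceSize {
		size = targetPieceSize
	}
	return size
}

func main() {
	// A 285012-byte DAG: previously padded to the full 32GiB target,
	// now only up to the minimum piece size (a 2MiB minimum is used here
	// purely for illustration).
	fmt.Println(choosePieceSize(285012, 2<<20, 32<<30)) // 34359738368 (old behavior)
	fmt.Println(choosePieceSize(285012, 2<<20, 0))      // 2097152 (target size 0)
}
```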

why this is WIP:

  • need to validate the claim in the third bullet above (that the chunker already emits correctly sized pieces)
  • need to validate logic for DAG case and likely add tests
  • need to bring test in line with new behavior if divergence is observed

what is outside the scope of this PR:

  • the burden of aggregating small pieces into a sector falls upon the SPs; this may be acceptable if data from multiple clients with varied piece sizes is being sealed (or a heterogeneous stream of pieces from a single client), but e.g. in the case of large (byte count) preparations with relatively few files (i.e. a small DAG and a small remainder piece, both singular), filling sectors may be difficult or require longer sealing timeouts

closes #473

@parkan commented Apr 2, 2025

for example, here's a piece using 32GiB for a 278KiB DAG:
baga6ea4seaqgmowdulb5xtlwmxsmedeuggznygn45evetj5ul3ut7sotpdaqeei 34359738368 bafybeihxmwrn7kll74iialjcetonrpw3z4vuorhgf5oe7xcoo3hwqusoxi 285012
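For scale, that is 285012 bytes of payload in a 34359738368-byte (32GiB) piece, i.e. the piece is roughly 99.999% padding.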

@parkan commented Apr 3, 2025

making a lot of changes to simplify this, belay review

@parkan force-pushed the feat/allow-small-pieces branch from dbd68c0 to e0f3723 on April 3, 2025 15:40
@parkan commented Apr 3, 2025

ok this now works without massively padding DAGs and tests pass; working on handling the "remainder" CARs

@parkan commented Apr 3, 2025

output for a DAG car:

        PieceType  PieceCID                                                          PieceSize    RootCID                                                      FileSize   StoragePath  
        data       baga6ea4seaqcjnkwlzzgf52hbkdk666xockuzw5wlqip2fjwwxtx2n5e3jotgpq  34359738368  bafkreic7zndobkznnmagtmz2k7s7bm652zhxujzmivrakbr56phn35sqmq  147603060               
        dag        baga6ea4seaqnp5wetkvowgcvtsy3waoe3575i42dgbamezhf3liy6pj2eyh42ma  262144       bafybeihmv6z3627jkfsodub4fpkpgfqnhegpph2bq7g7utx262efzkm5sa  206382                  

@Sankara-Jefferson (Contributor)

@parkan do you have a link to the FIP in question? In practice it seems SPs are taking smaller pieces.

The DAG piece is still less than a GiB! Are these the tiny pieces that can then be aggregated and shipped/sealed as a 32GiB piece/sector?

This approach will present a challenge to users and add a few complications and steps to their workflows.

I think we are aligned that if miners can receive smaller pieces without an issue, we should settle at no padding beyond the normal power of 2

The DAG piece should be processed as the rest of the pieces in the preparation.

@parkan commented Apr 4, 2025

> @parkan do you have a link to the FIP in question? In practice it seems SPs are taking smaller pieces.
>
> The DAG piece is still less than a GiB! Are these the tiny pieces that can then be aggregated and shipped/sealed as a 32GiB piece/sector?
>
> This approach will present a challenge to users and add a few complications and steps to their workflows.
>
> I think we are aligned that if miners can receive smaller pieces without an issue, we should settle at no padding beyond the normal power of 2
>
> The DAG piece should be processed as the rest of the pieces in the preparation.

The FIP in question is https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0045.md (particularly here); some related concepts are covered at https://spec.filecoin.io/#section-systems.filecoin_mining.sector.sector-quality

it's not clear to me exactly how the implementation of the deal weight calculations shook out as there was quite a bit of back and forth on that, so any insights would be appreciated

basically we have these overlapping concerns:
(1) full pieces (minimal piece padding)
(2) full sectors (minimal sector padding)
(2a) timed knapsack problem for sector filling (need to get the optimal subset of pieces to fill sector within the publish window, I believe this is set to ~1-2hr by default on both boost and curio)
(3) reasonable piece count
(4) input data-side aggregation (aggregating data that's below piece size)

to the best of my understanding:
concern (1) primarily affects clients (paying or spending datacap on padding data), though it's also arguably bad for the network (storing random data esp. as part of verified deals)

concern (2) primarily affects SPs (risk of lower deal weight, and potentially also https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0045.md#deal-packing)

(1) and (2) are potentially in tension because clients don't want to pay (FIL, DC, or out of band) to store useless data, while SPs want to maximize DealWeight and VerifiedDealWeight per sector -- my understanding of the dynamics here is not deep enough to judge the balance

(2a) comes into play whenever piece size < sector size, whether because client has chosen a (smaller) fixed value or we are only padding to powers of two

(3) used to be a bigger problem before lower gas costs and aggregation, however it's still present (database and message bloat, operational complexity, etc)

(4) is a concern primarily for clients with small amounts of data

the existing implementation in singularity has cut the Gordian knot by optimizing for (2)/(2a)/(3) (with default settings, i.e. piece size = 32GiB), making a significant compromise on (1) (one piece per prep is almost empty and one is underfilled; e.g. with a 33GiB prep you waste ~63GiB on padding: ~31GiB in the remainder data piece plus ~32GiB in the DAG piece), and ignoring (4)

the proposed changes aim to rebalance slightly in favor of the client by prioritizing (1), though as currently implemented (only DAGs can be < target piece size) this comes at a relatively large cost in (2a) and thus potentially (2)

there are several fixes, in order of increasing complexity:

  • use smaller target piece size, e.g. 4GiB, which will make (2a) significantly easier -- this requires 0 additional implementation, only config changes
  • allow the final (Nth) piece to be padded only to the next power of two; this will probabilistically leave room for smaller pieces -- a little tricky because detecting the Nth piece is awkward given how the jobs are scheduled
  • simply pad every piece produced to the next power of two -- this seems simpler than the above, but I have concerns, see next comment
  • dynamically solve for optimal piece size given input dataset after scanning -- dream solution, way out of scope 😉

@parkan commented Apr 4, 2025

why I am concerned about only using power of 2 padding:

  • currently there is a bit of a belt-and-suspenders approach where pieces already packed at the right raw size are considered for padding again in GetCommp; this may be just for simplicity of code paths and as an additional sanity check, but there might be considerations I'm overlooking
  • I don't understand the intent behind --max-size -- is it for packing onto drives? that's my best guess, but drives are usually much bigger than 64GiB these days... a few other puzzling things:
    • it's labeled as "maximum CAR file size", which is true, but it also means "maximum data segment size per piece" (pieces can't span CARs) and implies a piece size as per:
    • --piece-size defaults to ceil_pow2(max_size) but can also be set higher, which would only result in increased padding, why?

I would feel much better about changing things if I understood the above, as well as the balance of concerns from clients, SPs, and FIL+ regarding padding and underfilled sectors

@parkan commented Apr 7, 2025

additional note: some other approaches avoid this by addressing (4), i.e. pre-aggregating the data before packing; we were able to address this partially by enabling union storage support, but this approach does rely on the user managing the inputs (in contrast to a continuous pool of input files as done by e.g. Estuary)

for larger datasets this is a fine approach but does present issues with smaller inputs

@parkan commented Apr 11, 2025

reverted the experimental feature, which was meant to leverage behavior similar to prep add-piece; on closer examination, prep list-pieces has special logic to select and display pieces with no attachment, but an equivalent code path was never added to dealpusher, so any add-piece pieces would never be scheduled

the proper fix would be to add the equivalent query to dealpusher, but that is out of scope, so I'm parking it

an alternate workaround is in place for our tests

&cli.StringFlag{
Name: "min-piece-size",
Usage: "The minimum size of a piece. Pieces smaller than this will be padded up to this size.",
Value: "256B",
@parkan commented:

this needs to be 2MiB, fix incoming

@parkan commented:

ok bizarre this is correct in my local copy on the same commit?

@parkan commented Apr 22, 2025

this is confirmed as needed for Curio, as all pieces padded beyond ceil_pow2(payload) are being rejected

@parkan force-pushed the feat/allow-small-pieces branch from bd47432 to da7c1f1 on April 29, 2025 14:11
@parkan commented Apr 30, 2025

this is ready for review, will merge #467 into this to hopefully get a green build

@parkan commented May 1, 2025

ok this is fully ready for review; I propose cutting an RC with the current changes and folding an API change for --max-size / --piece-size into a major version bump

@parkan marked this pull request as ready for review on May 1, 2025 20:34
@parkan changed the title from "[WIP] Feat/allow small pieces" to "Feat/allow small pieces" on May 2, 2025
@ianconsolata (Collaborator) left a comment:

It looks like this is just a pure refactor -- you're taking the existing minPieceSize value, and making it a configurable setting, but not actually updating the logic to pack things differently. Is that correct?

If it is, LGTM with minor edits to documentation

},
&cli.StringFlag{
Name: "min-piece-size",
Usage: "The minimum size of a piece. Pieces smaller than this will be padded up to this size. It's recommended to leave this as the default",
@ianconsolata commented:

Why do we have to pad at all? And are we sure this is compatible with the limits on padding that the Fil+ program set? Whatever that limit is, I think it would be good to either document it here, or check elsewhere to ensure that min-piece-size isn't set high enough to cause an issue.


// Verify piece size is a power of two
pieceSize := uint64(len(downloaded))
require.True(t, util.IsPowerOfTwo(pieceSize), "piece size %d is not a power of two", pieceSize)
@ianconsolata commented:

Is this just testing the default value?


if minPieceSize != util.NextPowerOfTwo(minPieceSize) {
return nil, errors.Wrap(handlererror.ErrInvalidParameter, "minPieceSize must be a power of two")
}
@ianconsolata commented:

It would be nice to add a check here to ensure the min-piece-size never over-pads in a way that makes things non-compliant.
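For illustration, here is a self-contained sketch of the kind of guard being suggested; the function name validateMinPieceSize and the 64GiB ceiling are assumptions for the example, not actual singularity code or a confirmed Fil+ limit.

```go
package main

import "fmt"

// validateMinPieceSize sketches the extra guard suggested above: the minimum
// piece size must be a power of two, must not exceed the target piece size,
// and must not exceed an assumed upper bound on piece sizes.
func validateMinPieceSize(minPieceSize, targetPieceSize uint64) error {
	const maxPieceSize = uint64(64) << 30 // assumed ceiling, matching the largest common sector size
	if minPieceSize == 0 || minPieceSize&(minPieceSize-1) != 0 {
		return fmt.Errorf("minPieceSize %d must be a power of two", minPieceSize)
	}
	if targetPieceSize != 0 && minPieceSize > targetPieceSize {
		return fmt.Errorf("minPieceSize %d cannot exceed the target piece size %d", minPieceSize, targetPieceSize)
	}
	if minPieceSize > maxPieceSize {
		return fmt.Errorf("minPieceSize %d exceeds the assumed maximum piece size %d", minPieceSize, maxPieceSize)
	}
	return nil
}

func main() {
	fmt.Println(validateMinPieceSize(2<<20, 32<<30)) // <nil>
	fmt.Println(validateMinPieceSize(3<<20, 32<<30)) // error: not a power of two
}
```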


const (
DataPiece PieceType = "data"
DagPiece PieceType = "dag"
@ianconsolata commented:

Can you document the difference between these two? Data and DAG are both very overloaded terms in this space.
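An editorial suggestion for how those constants could be documented, based purely on how the two types are used in this PR (pack produces data pieces, daggen produces the dag piece); the comment wording and the PieceType declaration shown here are illustrative, not the merged code.

```go
// PieceType distinguishes the two kinds of pieces a preparation produces.
type PieceType string

const (
	// DataPiece is a piece whose CAR holds the packed source data itself.
	DataPiece PieceType = "data"
	// DagPiece is a piece whose CAR holds the directory DAG generated by
	// daggen, linking the prepared files together under the dataset root.
	DagPiece PieceType = "dag"
)
```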

@parkan commented May 21, 2025

> It looks like this is just a pure refactor -- you're taking the existing minPieceSize value, and making it a configurable setting, but not actually updating the logic to pack things differently. Is that correct?
>
> If it is, LGTM with minor edits to documentation

not quite, though I did try to reuse the existing code paths as much as possible and be surgical with my changes

there are actual logic changes in two places:

DAG: https://github.com/data-preservation-programs/singularity/pull/479/files#diff-1b1ef7ccebaac03849b7f75c1ffcc482f4fd97b4e985890a600f56dd2b649935

data: https://github.com/data-preservation-programs/singularity/pull/479/files#diff-614158a313ce1ade512309d1428e0399b620ffa83aa55cea8e0b3063a438f07c

these use the min piece size as the target instead of the requested piece size/max-size; in effect, all pieces > min_piece_size get ceil_pow2 padding

the packing logic per se remains unchanged; we are only changing the "virtual" padding that gets passed to GetCommP

the PieceSize parameter then effectively becomes "max piece size" only, which perhaps should be reflected better in the param naming/docs
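To make that concrete (illustrative numbers, not output from a real prep): with --min-piece-size=2MiB and --piece-size=32GiB, a 1.3GiB data segment would become a 2GiB piece and a 300KiB DAG a 2MiB piece, rather than both being padded to 32GiB.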

the other quality of life change I would consider adding is suppressing the warning here:

logger.Warn("piece size is larger than the target piece size")

as the "target" piece size will now almost always be less than the actual piece size

@parkan commented May 21, 2025

the big question remains whether Curio will allow 1MiB pieces as an exception (@LexLuthr stated that "we can probably add it" but I'm waiting on an update)

if not, then we'll need to get crafty -- either manipulate the DC allocation such that the <1MiB piece gets grouped at that level (unclear if that is even possible on the client side) or add an extra job type that aggregates pieces, though the latter is a real can of worms as it would need to work across preparations and is actually not guaranteed to succeed from the knapsack problem perspective (you could have some loose KiB that aren't groupable with anything)

@Sankara-Jefferson (Contributor) left a comment:

  • @parkan I changed the values of the 'sizes' and things ran well. Reasonable number of pieces and jobs. :)
  • I created two preps (same data, different paths), one under my main branch before your fix and one under your PR's branch, and ran the three commands in this order: scan, pack, daggen, exploring the prep after each command.
  • I got the same output. I pasted it below.
  • Is there a way for me to test and confirm that packing DAG and data pieces no longer leads to excessive empty sector space?
  • I think the next clear test will be that I can make deals through curio and that the data in these is retrievable.
  • I would like to test for backward compatibility; can you think of how I can use my current build (main branch) to create CAR files that do not have DAG pieces and then, using the branch with your PR, generate DAG pieces that I can onboard and test retrievals against?
jeffersonsankara@Jeffersons-MacBook-Pro singularity % singularity prep list-pieces 19 
AttachmentID  SourceStorageID  
21            25               
    SourceStorage
        ID  Name       Type   Path                                         
        25  dpla_demo  local  /Users/jeffersonsankara/Documents/demo_dpla  
    Pieces
        PieceCID                                                          PieceSize  RootCID                                                      FileSize  StoragePath  
        baga6ea4seaqiiyi76co7qvbcuhgl5oaygefi3no2ztfyedf2rtkm5kmczba4aby  4194304    bafkreie6tmbw3cwzs27s3us53r4zv3nyqetjd2gmt44ckcr3q422tnv5bi  2460037                
        baga6ea4seaqmg5xtmjmss6vmtvxfwjn5x6cssxsygue6wckourrczblclhxg6ni  4194304    bafybeihdcny2iwx4izst7ftfol7s3icxn27e2ybvkxoiosqeqhlwasgmyi  810                    
jeffersonsankara@Jeffersons-MacBook-Pro singularity % 

AND

jeffersonsankara@Jeffersons-MacBook-Pro singularity % singularity prep list-pieces 20
AttachmentID  SourceStorageID  
22            26               
    SourceStorage
        ID  Name           Type   Path                                             
        26  demo_dpla_479  local  /Users/jeffersonsankara/Documents/demo_dpla_479  
    Pieces
        PieceCID                                                          PieceSize  RootCID                                                      FileSize  StoragePath  
        baga6ea4seaqiiyi76co7qvbcuhgl5oaygefi3no2ztfyedf2rtkm5kmczba4aby  4194304    bafkreie6tmbw3cwzs27s3us53r4zv3nyqetjd2gmt44ckcr3q422tnv5bi  2460037                
        baga6ea4seaqmg5xtmjmss6vmtvxfwjn5x6cssxsygue6wckourrczblclhxg6ni  4194304    bafybeihdcny2iwx4izst7ftfol7s3icxn27e2ybvkxoiosqeqhlwasgmyi  810                    
jeffersonsankara@Jeffersons-MacBook-Pro singularity % 

@parkan commented May 22, 2025

could you please paste the exact command history, including when you switched branches and re-built the binary?

I am going to post my other comments separately for ease of reference

@parkan commented May 22, 2025

regarding the data pieces:

  • assuming you have specified --piece-size=4MiB or --max-size anywhere between 2MiB and 4MiB, the data piece behavior would be the same between old and new, i.e. these inputs do not test the changes in handling of data pieces since a 4MiB piece size is appropriate for a 2.5MiB input
  • to test this effectively, I suggest picking a piece size such that the input would be split into 2 or more data pieces, e.g. --piece-size=2MiB

in this case the old codebase will output 2 data pieces of 2MiB each, with the second being overpadded (it should be 1MiB) whereas the new code will output a 2MiB and a 1MiB piece (these are estimates but likely accurate)

@parkan commented May 22, 2025

regarding these questions:

> I got the same output. I pasted it below.
> Is there a way for me to test and confirm that packing DAG and data pieces no longer leads to excessive empty sector space?

this is the expected outcome for the data piece, but not for the dag piece, which should be padded to 1MiB (due to the min piece size floor)

this suggests that something is not working correctly, or that your binary is not using the current code version

I very strongly suspect the latter, because the output of list-pieces has changed in this version, with an additional column for PieceType, for example:

❯ ./singularity prep list-pieces 540
2025-05-22T17:15:39.205+0200	INFO	database	database/connstring_cgo.go:35	Opening postgres database
AttachmentID  SourceStorageID  
525           540              
    SourceStorage
        ID   Name                 Type   Path  
        540  eot-2024-crawls_107  union        
    Pieces
        PieceType  PieceCID                                                          PieceSize    RootCID                                                      FileSize     StoragePath  
        data       baga6ea4seaqml55hruc7na5cy4abas6dyaeolycvx73xi7bwvjiaaqysjopakfy  34359738368  bafkreibycn35yqfbez6dyluu472r55a55myzs5ribdutnxuyakzaobxrw4  32751676659  

note that PieceType precedes PieceCID, while in your output you have:

    Pieces
        PieceCID                                                          PieceSize  RootCID    
  ...

I highly suggest verifying that you have the correct binary version; it should look similar to this (potentially minus the ...dirty parts):

❯ ./singularity version
singularity v0.5.18-0.20250502115600-18d119848686+dirty-18d1198-dirty

once this is done I suggest creating a fresh preparation with the above settings and retrying your test

@parkan commented May 22, 2025

regarding backward compatibility, this is a very good question, and there are indeed certain complex limitations in that regard but let's establish the baseline functionality first (for what it's worth the test you proposed should work just fine)

@Sankara-Jefferson (Contributor)

@parkan Here is everything I tried; I ended up with an older version: singularity v0.5.17-RC1-58-gefdc703-dirty-unknown-unknown

  1. Checked out PR Feat/allow small pieces #479 cleanly
git fetch origin pull/479/head:feat/allow-small-pieces
git checkout feat/allow-small-pieces
  2. Generated version metadata
VERSION=$(git describe --tags --dirty --always)
COMMIT=$(git rev-parse HEAD)

cat <<EOF > version.json
{
  "version": "$VERSION",
  "commit": "$COMMIT"
}
EOF

  3. Built the binary
go build -o singularity singularity.go
chmod +x ./singularity

  4. Ran the version command
./singularity version

But the output remained:

singularity v0.5.17-RC1-58-gefdc703-dirty-unknown-unknown

Next, I patched cmd/version.go to read actual values

I replaced debug.ReadBuildInfo() with:

fmt.Printf("singularity %s (%s)\n", Version, Commit)

However, this failed to compile with:

undefined: Commit

Questions:

  • Is Version defined in cmd/app.go?
  • Should I even be trying all this troubleshooting?
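For context, a common Go pattern for making a version subcommand reflect the built commit is to stamp package-level variables at build time with -ldflags rather than patching the source; the variable names and flag values below are assumptions for the sketch, not necessarily how singularity wires this up.

```go
// Hedged sketch of the usual -ldflags version-stamping pattern in Go.
package main

import "fmt"

// These defaults apply to a plain `go build`; a release build overrides them,
// for example:
//   go build -ldflags "-X 'main.version=$VERSION' -X 'main.commit=$COMMIT'" -o singularity .
var (
	version = "dev"
	commit  = "unknown"
)

func main() {
	fmt.Printf("singularity %s (%s)\n", version, commit)
}
```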

@parkan commented May 28, 2025

we worked through the basic flow; a few small changes have been added to scope (suppress the piece size warnings which are no longer relevant, update release metadata, etc.)

@Sankara-Jefferson (Contributor)

@parkan I checked the metadata API (attached: metadata_api_output.json) and the fields that you added, even to the CLI, are not pulling values. I have attached sample output.

@parkan commented Jun 4, 2025

> @parkan I checked the metadata API (attached: metadata_api_output.json) and the fields that you added, even to the CLI, are not pulling values. I have attached sample output.

could you explain exactly what values you expect to see in this output?

are you referring to

"pieceType": "",

having an empty value? can you confirm that this preparation was created using the new codebase?

in general, when possible, please use the "actions taken/result expected/result seen" format for the quickest turnaround 👍

@Sankara-Jefferson (Contributor) commented Jun 4, 2025

@parkan Yes, the 'pieceType' and yes I used the new codebase.

Actions Taken:
Created a prep using the new codebase (this PR).

Result Expected:
Expected all values, including the "pieceType" and "min-piece-size" fields, to be pulled correctly by the metadata API.

Result Seen:
Observed an empty value for pieceType, and min-piece-size was not in the output. Confirming this occurred despite using the updated codebase.

@parkan commented Jun 5, 2025

> @parkan Yes, the 'pieceType' and yes I used the new codebase.
>
> Actions Taken: Created a prep using the new codebase (this PR).
>
> Result Expected: Expected all values, including the "pieceType", "min-piece-size" fields, to be pulled correctly by API (metadata).
>
> Result Seen: Observed an empty value for pieceType and min-piece-size was not in the output. Confirming this occurred despite using the updated codebase.

cannot reproduce; tests and manual verification show the correct piece_type being returned

min_piece_size is not a property of the CAR, much like min-size etc. are not

@parkan commented Jun 5, 2025

0.6.0-RC1 has been proposed and will be cut when this is merged

@parkan commented Jun 5, 2025

not sure where the cancellations are coming from? 🤔

@Sankara-Jefferson merged commit 0fa6a2c into main on Jun 10, 2025
8 of 18 checks passed
@Sankara-Jefferson deleted the feat/allow-small-pieces branch on June 10, 2025 21:39
Sankara-Jefferson pushed a commit that referenced this pull request Jun 13, 2025
feat: allow configurable min-piece-size to reduce excessive padding for small segments

This change introduces support for specifying a `--min-piece-size` when preparing data, improving behavior for small DAGs, remainder CARs, and small preparations that would otherwise be padded to the full target piece size (e.g. 32GiB). Such excessive padding leads to inefficiencies and causes pieces to be rejected by Storage Providers or sectors to fail verified deal requirements.

### Key changes:
- Add support for `--min-piece-size` (default: 256B, subject to adjustment)
- Pass both `min` and `target` piece sizes to `GetCommp`, enabling finer control over padding
- Retain power-of-2 padding via `target size`, but allow flexibility by setting it to `0`

This helps avoid generating 90%+ padding pieces and reduces transfer times in many cases.

### Notes:
- Default behavior remains unchanged if `--min-piece-size` is not set
- Full support for non-padded pieces now depends on both chunker accuracy and downstream deal acceptance
- `pieceType` is now tracked in metadata (e.g., data vs. DAG)

### Out of scope:
- No cross-preparation aggregation; that responsibility remains with SPs
- Edge cases like aggregating under-1MiB pieces are not yet solved

Closes #473

Co-authored-by: Arkadiy Kukarkin <arkadiy@archive.org>

Successfully merging this pull request may close these issues.

[Bug] DAG pieces get padded to full sector size
