Skip to content

Use CidHashMap in the plain CAR-file blockstore#3299

Merged
jdjaustin merged 13 commits into
mainfrom
jdjaustin/issue-3286-cidhashmap-in-plain-car-file-bs
Aug 23, 2023
Merged

Use CidHashMap in the plain CAR-file blockstore#3299
jdjaustin merged 13 commits into
mainfrom
jdjaustin/issue-3286-cidhashmap-in-plain-car-file-bs

Conversation

@jdjaustin
Copy link
Copy Markdown
Contributor

@jdjaustin jdjaustin commented Jul 31, 2023

Summary of changes

Changes introduced in this pull request:

Replace maps used for write cache and index in src/db/car/plain.rs with CidHashMap.

  • Add missing methods to CidHashMap
  • Replace HashMap<Cid, V> with CidHashMap<V> in car/plain.rs
  • Measure change in peak memory use when validating an uncompressed calibnet and mainnet snapshot. See /usr/bin/time -v forest-cli snapshot validate {snapshot}.car

Reference issue to close (if applicable)

Closes #3286

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Josh Jones and others added 4 commits July 31, 2023 16:29
Co-authored-by: Aatif Syed <aatifsyed@users.noreply.github.com>
…stin/issue-3286-cidhashmap-in-plain-car-file-bs
@jdjaustin
Copy link
Copy Markdown
Contributor Author

Running gtime -v forest-cli snapshot validate forest_snapshot_calibnet_2023-08-01_height_786458.forest.car with the latest changes from this PR (peak RSS 593760 kbytes) versus main branch (peak RSS 855152 kbytes):
Screen Shot 2023-08-01 at 4 12 34 PM
Screen Shot 2023-08-01 at 4 10 46 PM

@jdjaustin
Copy link
Copy Markdown
Contributor Author

Running gtime -v forest-cli snapshot validate 3095520_2023_08_04T18_00_00Z.car with the latest changes from this PR (peak RSS 6432096 kbytes) versus main branch (peak RSS 6473984 kbytes):
Screen Shot 2023-08-04 at 6 47 39 PM
Screen Shot 2023-08-04 at 5 46 48 PM

@lemmih
Copy link
Copy Markdown
Contributor

lemmih commented Aug 5, 2023

Try running it with --check-stateroots 0.

@lemmih
Copy link
Copy Markdown
Contributor

lemmih commented Aug 5, 2023

Better yet, use forest-tool benchmark graph-traversal

@lemmih
Copy link
Copy Markdown
Contributor

lemmih commented Aug 5, 2023

I'm quite surprised by your numbers. They should be higher, and there should be a larger difference between your branch and main. These are the numbers I get:
main/graph-traverse: 13.26 GiB
main/validate: 17.74 GiB
josh/graph-traverse: 9.26 GiB
josh/validate: 13.75 GiB

I'll try running it on MacOS.

@jdjaustin
Copy link
Copy Markdown
Contributor Author

My machine has done weird things ever since I got it. I recompiled this branch and ran it again against main; this time I'm getting much more expected results.
Screen Shot 2023-08-15 at 12 33 58 PM
Screen Shot 2023-08-15 at 12 03 46 PM

@jdjaustin jdjaustin changed the title [WIP] Use CidHashMap in the plain CAR-file blockstore Use CidHashMap in the plain CAR-file blockstore Aug 16, 2023
@jdjaustin jdjaustin marked this pull request as ready for review August 16, 2023 10:45
@jdjaustin jdjaustin requested a review from a team as a code owner August 16, 2023 10:45
@jdjaustin jdjaustin requested review from ruseinov and sudo-shashank and removed request for a team August 16, 2023 10:45
@jdjaustin
Copy link
Copy Markdown
Contributor Author

I sense merge conflicts incoming due to #3388 merging soon, so converting this to draft until after I resolve those.

@jdjaustin jdjaustin marked this pull request as draft August 18, 2023 09:32
@jdjaustin jdjaustin marked this pull request as ready for review August 18, 2023 13:03
@ruseinov
Copy link
Copy Markdown
Contributor

It would be nice to have some unit tests for the new CidHashMap functionality.

@jdjaustin jdjaustin added this pull request to the merge queue Aug 23, 2023
Merged via the queue into main with commit f1db1ff Aug 23, 2023
@jdjaustin jdjaustin deleted the jdjaustin/issue-3286-cidhashmap-in-plain-car-file-bs branch August 23, 2023 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

use CidHashMap in the plain CAR-file blockstore

4 participants