Add cuco::static_multimap #85

Merged
jrhemstad merged 271 commits into NVIDIA:dev from PointKernel:static-multi-map
Nov 1, 2021
@PointKernel PointKernel commented Jun 3, 2021

Closes #59

Added cuco::static_multimap:

  • Major performance optimizations:

    • Relaxed atomic and reinterpret_cast (to be replaced by atomic_ref once it's ready)
    • Shared memory buffer at CG/warp/block level
    • Double hashing instead of linear probing
    • Use of CG with parallel retrieval
    • Vector loads (each thread loads two consecutive slots) for 32-bit integers
      find_all
  • Comprehensive benchmark to evaluate multimap performance

    • nvbench instead of google benchmark
    • Jupyter notebooks showing benchmarking results (cuCollections/benchmarks/analysis/notebooks)
  • Flexible switch between vector/scalar loads and between different probing methods

    • class ProbeSequence as a template parameter for cuco::static_multimap
    • By default, 32-bit integers use a per-block shared memory buffer with vector loads, while 64-bit integers use a per-CG buffer with scalar loads
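
To make the "probe sequence as a template parameter" idea concrete, here is a minimal host-side C++ sketch of an open-addressing multimap whose probing method is chosen at compile time. It is not the cuco implementation: the real container runs on the GPU with cooperative groups, atomics, and shared memory buffers, none of which appear here. All names (`LinearProbing`, `DoubleHashing`, `HostMultimap`, `mix`, `kEmptyKey`) are hypothetical and chosen only for illustration, and the sketch assumes a prime capacity so that every double-hashing step size visits all slots.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel marking an unused slot (assumption: no real key equals this value).
constexpr std::uint32_t kEmptyKey = 0xFFFFFFFFu;

// Simple 32-bit mixing function (murmur3-style finalizer) with a seed.
inline std::uint32_t mix(std::uint32_t x, std::uint32_t seed) {
  x ^= seed;
  x ^= x >> 16; x *= 0x85ebca6bu;
  x ^= x >> 13; x *= 0xc2b2ae35u;
  x ^= x >> 16;
  return x;
}

// Hypothetical policy: always advance by one slot.
struct LinearProbing {
  std::size_t capacity;
  std::size_t start(std::uint32_t key) const { return mix(key, 0u) % capacity; }
  std::size_t step(std::uint32_t) const { return 1; }
};

// Hypothetical policy: a second hash picks a key-dependent step in [1, capacity - 1].
// With a prime capacity, every such step cycles through all slots.
struct DoubleHashing {
  std::size_t capacity;
  std::size_t start(std::uint32_t key) const { return mix(key, 0u) % capacity; }
  std::size_t step(std::uint32_t key) const {
    return 1 + mix(key, 0x9e3779b9u) % (capacity - 1);
  }
};

// Host-only multimap sketch; the probing method is a template parameter.
template <typename ProbeSequence>
class HostMultimap {
 public:
  explicit HostMultimap(std::size_t capacity)
      : probe_{capacity}, slots_(capacity, Slot{kEmptyKey, 0u}) {}

  // Stores the pair in the first empty slot along the key's probe sequence.
  // Duplicate keys are allowed. Returns false if no empty slot was found.
  bool insert(std::uint32_t key, std::uint32_t value) {
    std::size_t idx = probe_.start(key);
    const std::size_t step = probe_.step(key);
    for (std::size_t i = 0; i < slots_.size(); ++i) {
      if (slots_[idx].key == kEmptyKey) {
        slots_[idx] = Slot{key, value};
        return true;
      }
      idx = (idx + step) % slots_.size();
    }
    return false;
  }

  // Collects every value stored under `key`, stopping at the first empty slot.
  std::vector<std::uint32_t> find_all(std::uint32_t key) const {
    std::vector<std::uint32_t> out;
    std::size_t idx = probe_.start(key);
    const std::size_t step = probe_.step(key);
    for (std::size_t i = 0; i < slots_.size(); ++i) {
      if (slots_[idx].key == kEmptyKey) break;
      if (slots_[idx].key == key) out.push_back(slots_[idx].value);
      idx = (idx + step) % slots_.size();
    }
    return out;
  }

 private:
  struct Slot {
    std::uint32_t key;
    std::uint32_t value;
  };
  ProbeSequence probe_;
  std::vector<Slot> slots_;
};
```

Switching between `HostMultimap<LinearProbing>` and `HostMultimap<DoubleHashing>` changes only the order in which slots are visited, not the lookup logic. With linear probing, keys that hash near each other share long runs of occupied slots; a key-dependent step is one common way to break those runs up, which is the usual motivation for preferring double hashing.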

PointKernel and others added 30 commits May 3, 2021 13:46
@abellina

Just checking in, what is the status of this PR?

@jrhemstad

> Just checking in, what is the status of this PR?

Waiting on some final review changes.

Comment thread include/cuco/probe_sequences.cuh Outdated
Comment thread include/cuco/probe_sequences.cuh Outdated
@jrhemstad jrhemstad merged commit 62b90b7 into NVIDIA:dev Nov 1, 2021
@PointKernel PointKernel deleted the static-multi-map branch December 16, 2022 20:43
Development

Successfully merging this pull request may close these issues.

[FEA] static_multimap

7 participants