
[Discuss][C++] Replace MemoTable with a SwissTable implementation #38372

@js8544

Describe the enhancement requested

This is a follow-up to #36059 (comment). We use a MemoTable in many places, e.g. set lookup functions, vector hash functions, the count_distinct aggregate function, and dictionary unification. Their performance could likely be improved by switching to a SwissTable.

We can either:

  1. Use an existing Swiss table library. This requires some work on dependency management. I recommend absl::flat_hash_map, since Abseil's authors are the original authors of Swiss tables and we already have absl in our third-party toolchain.
  2. Write one ourselves. The current one in Acero is too customized for the join node, and it seems hard to extract a general hash table from it. If I were to write one, I would probably follow the structure of absl's and replace things like memory management and bit manipulation with Arrow's own utilities.
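
To make option 2 more concrete, here is a rough sketch of the core idea: one control byte per slot holding 7 bits of the hash, scanned a group at a time before touching the keys. Everything in it is hypothetical (the name `SwissMemoSketch`, the `GetOrInsert`/`Get` methods, the hash mixer) and it is neither Arrow's MemoTable API nor absl's code. It assumes no deletions and a default-constructible key, and it scans each group with a scalar loop where a real implementation would use SIMD.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical sketch only: not Arrow's MemoTable and not absl's implementation.
constexpr std::size_t kGroupSize = 16;  // slots scanned together as one group
constexpr std::uint8_t kEmpty = 0x80;   // control byte of an unused slot

// Maps each distinct key to a dense memo index (0, 1, 2, ...) in insertion order.
template <typename Key, typename Hash = std::hash<Key>>
class SwissMemoSketch {
 public:
  explicit SwissMemoSketch(std::size_t min_capacity = kGroupSize) {
    std::size_t cap = kGroupSize;
    while (cap < min_capacity) cap *= 2;  // group count stays a power of two
    ctrl_.assign(cap, kEmpty);
    slots_.resize(cap);
  }

  // Returns the memo index of `key`, inserting it if it is not present yet.
  std::int32_t GetOrInsert(const Key& key) {
    // Keep the load factor at or below 7/8 so a probe always finds an empty
    // slot. For simplicity this may grow even when `key` is already present.
    if ((size_ + 1) * 8 > ctrl_.size() * 7) Grow();
    const std::uint64_t h = Mix(key);
    const std::size_t pos = FindSlot(key, h);
    if (ctrl_[pos] == kEmpty) {
      ctrl_[pos] = H2(h);
      slots_[pos] = Slot{key, static_cast<std::int32_t>(size_++)};
    }
    return slots_[pos].memo_index;
  }

  // Returns the memo index of `key`, or -1 if it was never inserted.
  std::int32_t Get(const Key& key) const {
    const std::size_t pos = FindSlot(key, Mix(key));
    return ctrl_[pos] == kEmpty ? -1 : slots_[pos].memo_index;
  }

  std::size_t size() const { return size_; }

 private:
  struct Slot {
    Key key;
    std::int32_t memo_index;
  };

  // Assumed mixer so that weak hashes (e.g. identity on ints) still spread out.
  static std::uint64_t Mix(const Key& key) {
    std::uint64_t h =
        static_cast<std::uint64_t>(Hash{}(key)) * 0x9E3779B97F4A7C15ULL;
    return h ^ (h >> 32);
  }
  // Low 7 bits go into the control byte; the top bit stays clear, so a full
  // slot can never be mistaken for kEmpty.
  static std::uint8_t H2(std::uint64_t h) {
    return static_cast<std::uint8_t>(h & 0x7F);
  }

  // Returns the slot holding `key`, or the empty slot where it would go.
  std::size_t FindSlot(const Key& key, std::uint64_t h) const {
    const std::size_t mask = ctrl_.size() / kGroupSize - 1;
    const std::uint8_t tag = H2(h);
    std::size_t group = static_cast<std::size_t>(h >> 7) & mask;
    for (;;) {
      const std::size_t base = group * kGroupSize;
      for (std::size_t i = 0; i < kGroupSize; ++i) {
        const std::uint8_t c = ctrl_[base + i];
        // Compare the key only when the 7-bit tag matches.
        if (c == tag && slots_[base + i].key == key) return base + i;
        // Without deletions, groups fill front to back, so an empty slot
        // means the key is absent.
        if (c == kEmpty) return base + i;
      }
      group = (group + 1) & mask;  // linear probing over groups
    }
  }

  // Doubles the capacity and reinserts every entry, keeping memo indices.
  void Grow() {
    std::vector<std::uint8_t> old_ctrl(ctrl_.size() * 2, kEmpty);
    std::vector<Slot> old_slots(slots_.size() * 2);
    old_ctrl.swap(ctrl_);
    old_slots.swap(slots_);
    for (std::size_t i = 0; i < old_ctrl.size(); ++i) {
      if (old_ctrl[i] == kEmpty) continue;
      const std::uint64_t h = Mix(old_slots[i].key);
      const std::size_t pos = FindSlot(old_slots[i].key, h);
      ctrl_[pos] = H2(h);
      slots_[pos] = old_slots[i];
    }
  }

  std::vector<std::uint8_t> ctrl_;
  std::vector<Slot> slots_;
  std::size_t size_ = 0;
};
```

With a `SwissMemoSketch<int>`, calling `GetOrInsert` on 42, 7, 42 in that order would return 0, 1, 0, which is the dense-index behaviour the memo table use cases rely on. A real version would need Arrow's memory pool, null handling, and a SIMD group scan, which is where most of the work in option 2 would go.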

What do you think? @pitrou @westonpace

Component(s)

C++
