
feat: Add shared array and buffer to nanoarrow.h#864

Draft
paleolimbot wants to merge 9 commits into apache:main from paleolimbot:shared-array-and-buffer

@paleolimbot
Member

For #861 (actual decoding of dictionary arrays), we really need shared arrays for the approach to be reasonable (i.e., to avoid deep copying each dictionary for each batch!).

This PR (1) moves shared buffers to the C library instead of the IPC extension, (2) implements a second kind of shared buffer, which is borrowed from an owned, reference-counted array, and (3) implements a "clone" that mostly uses the second concept to explode an array into fully reference-counted buffers that we can clone.
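To make the second kind of shared buffer concrete, here is a minimal, self-contained sketch of the idea in plain C. It is not the code in this PR and does not use the nanoarrow API: `SharedOwner`, `BorrowedBuffer`, and every function below are hypothetical names invented for illustration, and the plain integer reference count is an assumption that ignores thread safety. The sketch only shows the mechanism: a buffer view keeps its owner alive through a reference count, so a "clone" bumps the count instead of copying bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustration only: number of owners that have not been freed yet. */
static int g_live_owners = 0;

/* Hypothetical stand-in for a reference-counted array that owns memory. */
struct SharedOwner {
  int64_t ref_count; /* not atomic: this sketch is single-threaded */
  uint8_t* data;
  int64_t size_bytes;
};

/* Hypothetical buffer view that borrows its bytes from a SharedOwner. */
struct BorrowedBuffer {
  const uint8_t* data;
  int64_t size_bytes;
  struct SharedOwner* owner;
};

/* Copy src once into a new owner; the caller holds the first reference. */
static struct SharedOwner* SharedOwnerCreate(const uint8_t* src, int64_t size_bytes) {
  struct SharedOwner* owner = (struct SharedOwner*)malloc(sizeof(struct SharedOwner));
  if (owner == NULL) return NULL;

  owner->data = (uint8_t*)malloc(size_bytes > 0 ? (size_t)size_bytes : 1);
  if (owner->data == NULL) {
    free(owner);
    return NULL;
  }

  if (size_bytes > 0) memcpy(owner->data, src, (size_t)size_bytes);
  owner->size_bytes = size_bytes;
  owner->ref_count = 1;
  g_live_owners++;
  return owner;
}

/* Drop one reference; the memory is freed when the last one goes away. */
static void SharedOwnerRelease(struct SharedOwner* owner) {
  assert(owner->ref_count > 0);
  if (--owner->ref_count == 0) {
    free(owner->data);
    free(owner);
    g_live_owners--;
  }
}

/* Borrow a slice of the owner's memory and take a reference on the owner. */
static struct BorrowedBuffer BorrowedBufferInit(struct SharedOwner* owner, int64_t offset,
                                                int64_t size_bytes) {
  struct BorrowedBuffer buf;
  assert(offset >= 0 && size_bytes >= 0 && offset + size_bytes <= owner->size_bytes);
  owner->ref_count++;
  buf.data = owner->data + offset;
  buf.size_bytes = size_bytes;
  buf.owner = owner;
  return buf;
}

/* Shallow clone: no bytes are copied, only the reference count changes. */
static struct BorrowedBuffer BorrowedBufferClone(const struct BorrowedBuffer* src) {
  struct BorrowedBuffer out = *src;
  out.owner->ref_count++;
  return out;
}

/* Give the reference back and reset the view so it cannot be reused. */
static void BorrowedBufferRelease(struct BorrowedBuffer* buf) {
  SharedOwnerRelease(buf->owner);
  buf->data = NULL;
  buf->size_bytes = 0;
  buf->owner = NULL;
}
```

In this sketch, a dictionary decoded once could be handed to every batch via `BorrowedBufferClone()`, and the underlying bytes would be freed only after the last batch releases its view.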

I had hoped we could replace the R and Python versions of these, but the logic is fuzzier there: once an array has been referenced, we can't modify it or any of its parents without causing a crash. My mind explodes (and I get a lot of failing R test cases) whenever I touch that piece, so I'll try to deal with that another day.

This still needs tests for the shared array move and clone.

paleolimbot added a commit that referenced this pull request Apr 18, 2026
This PR uses the previous steps to output arrays (including the
ArrowArrayStream reader). This lets us wire it into all the tests as
well.

The main follow-up is that this PR currently deep copies the dictionary
for every batch that arrives, negating much of the point of dictionary
encoding. This is a fairly self-contained change that I'll do
separately: #864

I also added dictionary index validation while I was here! It is fairly
compact (compared to the other code).

Closes #845.