feat: Actually decode dictionary arrays #861

Merged
paleolimbot merged 27 commits into apache:main from paleolimbot:decode-dictionary-arrays
Apr 18, 2026

paleolimbot (Member) commented Apr 5, 2026

This PR uses the previous steps to actually output dictionary arrays (including from the ArrowArrayStream reader), which also lets us wire dictionary decoding into all the tests.

The main follow-up is that this PR currently deep copies the dictionary for every batch that arrives, negating much of the point of dictionary encoding. Fixing that is a fairly self-contained change that I'll do separately: #864

I also added dictionary index validation while I was here! It is fairly compact (compared to the other code).
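
For readers unfamiliar with the idea, here is a minimal sketch of what dictionary index validation can look like. It is not the code from this PR: `validate_dict_indices_int32` is a hypothetical helper, and it assumes `int32` indices plus an optional validity bitmap in the Arrow columnar bit order (least-significant bit first).

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper for illustration only (not part of nanoarrow).
// Returns -1 if every non-null index lies in [0, dictionary_length),
// otherwise the position of the first out-of-range index.
// `validity` may be NULL (no nulls); otherwise bit i, counted from the
// least-significant bit of each byte, is 1 when element i is valid.
int64_t validate_dict_indices_int32(const int32_t* indices, int64_t length,
                                    const uint8_t* validity,
                                    int64_t dictionary_length) {
  for (int64_t i = 0; i < length; i++) {
    if (validity != NULL && ((validity[i / 8] >> (i % 8)) & 1) == 0) {
      continue;  // null slots may hold arbitrary index values, so skip them
    }
    if (indices[i] < 0 || indices[i] >= dictionary_length) {
      return i;  // index would read past the end (or before the start)
    }
  }
  return -1;
}
```

Skipping null slots is a design choice in this sketch: the value stored under a null is unspecified, so rejecting it would flag otherwise valid data.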

Closes #845.

paleolimbot force-pushed the decode-dictionary-arrays branch from 213584c to 0e69e58 on April 7, 2026 01:27
codecov-commenter commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 60.97561% with 48 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.44%. Comparing base (f11d517) to head (282727e).
⚠️ Report is 37 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/nanoarrow/ipc/decoder.c | 52.94% | 12 Missing and 20 partials ⚠️ |
| src/nanoarrow/ipc/reader.c | 70.90% | 8 Missing and 8 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #861      +/-   ##
==========================================
- Coverage   79.96%   79.44%   -0.53%     
==========================================
  Files         105      106       +1     
  Lines       15461    15881     +420     
  Branches     1738     1834      +96     
==========================================
+ Hits        12364    12616     +252     
- Misses       1998     2115     +117     
- Partials     1099     1150      +51     

paleolimbot marked this pull request as ready for review on April 16, 2026 22:38
paleolimbot (Member, Author) commented

I'm pretty happy with this! If there are no objections I'll merge later today and move on to the shared dictionary problem.

paleolimbot merged commit 5bb90b1 into apache:main on Apr 18, 2026
42 checks passed
paleolimbot deleted the decode-dictionary-arrays branch on April 18, 2026 02:51
Development

Successfully merging this pull request may close these issues.

Decode dictionary-encoded arrays in IPC decoder

2 participants