Skip to content

Export improvements#90

Open
gbrgr wants to merge 26 commits intomainfrom
gb/export-improvements
Open

Export improvements#90
gbrgr wants to merge 26 commits intomainfrom
gb/export-improvements

Conversation

@gbrgr
Copy link
Copy Markdown
Contributor

@gbrgr gbrgr commented Apr 28, 2026

Export improvements

This PR reworks the Parquet/Iceberg write path for throughput, cleanliness, and configurability.

New write APIs

  • WriterConfig struct with configurable Parquet properties: compression codec, dictionary encoding, plain encoding, row group size, page size, write batch size, and statistics
  • ColumnBatch / write_columns — zero-copy write path for flat column buffers (bypasses Arrow IPC serialization)
  • GatheredBatch / write_columns — gathered-column write path that assembles columns from scattered slices (selection vectors + validity bitmaps) directly in the calling thread
  • set_encode_workers! to configure the encode thread pool size before first use

Global encode worker pool
A single pool of N OS threads (default: Sys.CPU_THREADS) is shared across all writers. Each write_columns / write call gathers or serializes data in the calling thread, submits a RecordBatch to the pool, and returns immediately — encode and Parquet I/O run on pool threads. Per-writer ordering is preserved via a Mutex<ConcreteDataFileWriter> in the shared WriterState: only one pool thread encodes a given writer at a time. close_writer waits for all in-flight tasks to drain before finalising the file.

This design lets Julia pipeline the gather/serialize step on the main thread with encode work happening concurrently on pool threads, rather than blocking end-to-end on each write.

Tests
Added tests for ColumnBatch, GatheredBatch (including scattered slices, nullable columns, string columns), decimal types (Int32/Int64/Int128 backing), and all WriterConfig Parquet properties.

gbrgr and others added 16 commits April 21, 2026 19:45
Update iceberg and iceberg-catalog-rest to rev
418213731e91544f5eb31a3efa459e88f599030e, which includes the fix for
incremental scans with from=None silently dropping EXISTING manifest
entries from expired snapshots.

Also pulls in arrow/parquet v58.1.0 via the updated lock file.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
iceberg-rust rev 418213731e91 depends on arrow/parquet v58.1.0;
keeping the FFI crate on 57.x caused two incompatible versions of
RecordBatch to be linked, failing to compile. Align all arrow-*
and parquet pins to "58.1".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gbrgr gbrgr marked this pull request as ready for review April 28, 2026 17:07
gbrgr and others added 3 commits April 28, 2026 19:12
Accepts Vector{String} with optional validity BitVector, handling pointer
extraction and GC preservation internally. The low-level ptr/len overload
is retained for performance-sensitive callers with pre-allocated buffers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gbrgr
Copy link
Copy Markdown
Contributor Author

gbrgr commented Apr 29, 2026

I am seeing good throughput improvements locally:

# Run on 2026-04-29 17:13:08.052
# Apple M3 Pro, 8 thread(s), -O2
# ========================================================================================================
# Name                                                    Pager        Mem   Allocs Throughput        Time
# iceberg_export:Int,Int,Int,Int GNF                  172.80MiB   37.22MiB  371.48K 3412.2MB/s     178.9ms
# iceberg_export:Int,Int,Int,Int GNF_Null(30.0%)      144.00MiB   71.56MiB    1.65M 2038.4MB/s     299.4ms
# iceberg_export:Int,Int,Int,Int WIDE                   0.00MiB    6.53MiB   89.52K 5615.9MB/s     108.7ms
# iceberg_export:Float64,Float64,Float64,Float64 GNF  172.80MiB   37.38MiB  371.87K 3240.3MB/s     188.4ms
# iceberg_export:Float64,...,Float64 GNF_Null(30.0%)  144.00MiB   71.37MiB    1.65M 1925.0MB/s     317.1ms
# iceberg_export:Float64,...t64,Float64,Float64 WIDE    0.00MiB    7.25MiB   89.83K 5244.7MB/s     116.4ms
# iceberg_export:Int,Float64,FD{Int64,4},Int128 GNF   172.80MiB   37.91MiB  371.74K 2313.7MB/s     329.7ms
# iceberg_export:Int,Floa...},Int128 GNF_Null(30.0%)  144.00MiB   71.78MiB    1.64M 2058.4MB/s     370.6ms
# iceberg_export:Int,Float64,FD{Int64,4},Int128 WIDE    0.00MiB    6.55MiB   90.52K 3319.6MB/s     229.8ms
# iceberg_export:Int,Float64,vstr,Int128 GNF          172.80MiB    1.17GiB   20.43M 1018.1MB/s    1199.0ms
# iceberg_export:Int,Floa...r,Int128 GNF_Null(30.0%)  144.00MiB  901.95MiB   15.80M 1012.7MB/s    1205.4ms
# iceberg_export:Int,Float64,vstr,Int128 WIDE           0.00MiB    1.27GiB   20.15M 1354.6MB/s     901.2ms
# iceberg_export:vstr,vstr,vstr,vstr GNF              211.20MiB    4.49GiB   80.53M  764.2MB/s    3194.6ms
# iceberg_export:vstr,vstr,vstr,vstr GNF_Null(30.0%)  172.80MiB    3.33GiB   58.08M 1133.3MB/s    2154.2ms
# iceberg_export:vstr,vstr,vstr,vstr WIDE               0.00MiB    3.05GiB   20.23M 1043.5MB/s    2339.7ms

vs.

# Run on 2026-04-29 10:36:06.658
# Apple M3 Pro, 8 thread(s), -O2
# ========================================================================================================
# Name                                                    Pager        Mem   Allocs Throughput        Time
# iceberg_export:Int,Int,Int,Int GNF                  172.80MiB  660.68MiB  539.06K  841.5MB/s     725.3ms
# iceberg_export:Int,Int,Int,Int GNF_Null(30.0%)      144.00MiB   76.06MiB    1.89M  680.1MB/s     897.5ms
# iceberg_export:Int,Int,Int,Int WIDE                   0.00MiB    5.90MiB  118.44K 1128.7MB/s     540.7ms
# iceberg_export:Float64,Float64,Float64,Float64 GNF  172.80MiB  660.68MiB  539.14K  840.6MB/s     726.1ms
# iceberg_export:Float64,...,Float64 GNF_Null(30.0%)  144.00MiB   76.07MiB    1.89M  628.9MB/s     970.5ms
# iceberg_export:Float64,...t64,Float64,Float64 WIDE    0.00MiB    5.90MiB  118.52K 1054.6MB/s     578.8ms
# iceberg_export:Int,Float64,FD{Int64,4},Int128 GNF   172.80MiB  813.85MiB  539.35K  838.1MB/s     910.3ms
# iceberg_export:Int,Floa...},Int128 GNF_Null(30.0%)  144.00MiB   77.22MiB    1.93M  712.7MB/s    1070.6ms
# iceberg_export:Int,Float64,FD{Int64,4},Int128 WIDE    0.00MiB    5.97MiB  119.19K 1125.0MB/s     678.2ms
# iceberg_export:Int,Float64,vstr,Int128 GNF          172.80MiB    2.26GiB   20.57M  942.2MB/s    1295.6ms
# iceberg_export:Int,Floa...r,Int128 GNF_Null(30.0%)  144.00MiB    1.37GiB   15.96M  785.7MB/s    1553.6ms
# iceberg_export:Int,Float64,vstr,Int128 WIDE           0.00MiB    1.70GiB   20.17M  991.6MB/s    1231.0ms
# iceberg_export:vstr,vstr,vstr,vstr GNF              211.20MiB    6.62GiB   80.67M 1025.7MB/s    2380.2ms
# iceberg_export:vstr,vstr,vstr,vstr GNF_Null(30.0%)  172.80MiB    5.31GiB   57.99M  843.2MB/s    2895.2ms
# iceberg_export:vstr,vstr,vstr,vstr WIDE               0.00MiB    9.00GiB  100.31M  310.4MB/s        7.9s

(some results are noisy due to local execution, but all workloads consistently show improvements. More so for numeric types due to batched gathering on the Rust side).

…diate Vec

Build StringArray directly with OffsetBuffer + values Buffer in a single pass,
replacing Vec<Option<&str>> + StringArray::from. Use new_unchecked to skip
Arrow's UTF-8 re-validation — Julia strings are guaranteed valid UTF-8.

For 20M x 32-byte strings this eliminates ~320 MB of intermediate Vec<Option<&str>>
storage and ~640 MB of redundant UTF-8 validation reads per column.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@robertbuessow robertbuessow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understood everything but I think it's good enough. Some nits + it would be good to improve error handling between rust and julia.

Comment thread iceberg_rust_ffi/src/writer.rs Outdated
Comment thread iceberg_rust_ffi/src/writer.rs Outdated
Comment thread iceberg_rust_ffi/src/writer_columns.rs Outdated
Comment thread iceberg_rust_ffi/src/writer_columns.rs
Comment thread iceberg_rust_ffi/src/writer_columns.rs
Comment thread iceberg_rust_ffi/src/writer_columns.rs Outdated
Comment thread src/writer.jl Outdated
gbrgr and others added 2 commits April 30, 2026 13:51
- Extract encode worker loop body into encode_worker_loop()
- Retain panic message: downcast Box<dyn Any> to &str / String before formatting
- Add iceberg_take_gather_error() FFI + thread-local to surface gather errors
  immediately in Julia exceptions rather than deferring to writer close
- Clarify lengths_ptr doc: array of byte lengths per string
- Merge identical sequential/scattered validity branches in build_null_buffer_scattered
- Add explanatory comments: bitvector merging, re-alignment, all-valid bit-set

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 71.42857% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.81%. Comparing base (cdce472) to head (7a8947a).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/writer.jl 71.42% 30 Missing ⚠️
❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #90      +/-   ##
==========================================
- Coverage   84.42%   82.81%   -1.61%     
==========================================
  Files           9        9              
  Lines         873      966      +93     
==========================================
+ Hits          737      800      +63     
- Misses        136      166      +30     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants