Add Multidimensional array support #329

Open
DanielBunting wants to merge 18 commits into ClickHouse:main from DanielBunting:DB/add-multidimensional-array-support

Conversation

@DanielBunting DanielBunting commented May 13, 2026

Summary

#320 raised an issue about multidimensional-array support across the driver's write path.

Cannot convert 219 to Array(UInt8)

The error above is thrown when a user tries to send a parameter typed as Array(Array(UInt8)). Tracing it revealed three layered problems, not one:

  1. HttpParameterFormatter.cs:127: the formatter flattens multidim arrays. The Array(Array(T)) branch did foreach element in value, then recursed to format each element as another inner array. That works for jagged arrays (byte[][]) where each element is itself a byte[]. But byte[,] is different: foreach over a 2D array walks every scalar cell in row-major order, not row by row. So for byte[,] { {1,2,3}, {4,5,6} }, the formatter saw 1, 2, 3, 4, 5, 6 (six scalars) and tried to format each one as if it were a whole inner array. The first call (Format(Array(UInt8), 1, ...)) fell off the end of the type-dispatch switch and threw "Cannot convert 1 to Array(UInt8)".
  2. ArrayType.Write: same problem on the binary insert path. The binary writer cast the value to IList, wrote collection.Count as the outer length, then indexed into it row by row. That's fine for byte[][]: Count is the number of rows and value[0] returns the first row as byte[]. But byte[,] also implements IList via the Array base class, and on a 2D array those operations behave nothing like row access: IList.Count returns the total flattened length (6 for a 2×3, not 2), and value[i] throws ArgumentException ("Array was not a one-dimensional array.") on the very first access. So the writer first encoded the wrong outer length, then crashed on the first indexer call.
  3. TypeConverter.ToClickHouseType: the type-inference helpers only understood one layer of array. Both overloads (the one taking a Type, used when you ask "what ClickHouse type matches typeof(byte[,])?", and the one taking a value, used when you pass a parameter with no explicit type hint) had a single if (type.IsArray) return new ArrayType { ... } branch. For byte[,], the inner call asked typeof(byte[,]).GetElementType() and got back byte, so the rank-2 information was thrown away, producing Array(UInt8) instead of Array(Array(UInt8)). The value-based path was even worse: it tried to peek at the first element via Array.GetValue(0), which throws on any array of rank > 1 (you have to pass an index per dimension), so the inference crashed instead of just producing the wrong type.
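
The CLR behaviours behind these three problems can be reproduced without the driver. The following is a minimal sketch using only System types; the class and method names (MultiDimPitfalls, CountForeachItems, IndexerThrows) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections;

public static class MultiDimPitfalls
{
    // foreach over a rank-2 array visits every cell, not every row.
    public static int CountForeachItems(Array array)
    {
        var count = 0;
        foreach (var item in array)
            count++;
        return count;
    }

    // The IList indexer delegates to the single-index GetValue,
    // which rejects arrays whose rank is not 1.
    public static bool IndexerThrows(Array array)
    {
        try
        {
            _ = ((IList)array)[0];
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var grid = new byte[,] { { 1, 2, 3 }, { 4, 5, 6 } };

        Console.WriteLine(CountForeachItems(grid));          // 6 scalars, not 2 rows
        Console.WriteLine(((IList)grid).Count);              // 6: total cell count
        Console.WriteLine(grid.GetLength(0));                // 2: the actual row count
        Console.WriteLine(IndexerThrows(grid));              // True
        Console.WriteLine(typeof(byte[,]).GetElementType()); // System.Byte: rank is not encoded here
        Console.WriteLine(typeof(byte[,]).GetArrayRank());   // 2: rank has to be queried separately
    }
}
```

GetLength(0) and GetArrayRank() are the calls that recover the information the original code paths were losing.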

The error message was also lossy: by the time recursion bottomed out, the formatter only knew it had a scalar where an Array(UInt8) was expected. The original Array(Array(UInt8)) context was gone, as was the parameter name.

The fix

1. Types/MultiDimArrayHelper.cs (new)

The central abstraction. Two public methods:

  • EnumerateOutermostRank(Array array): given a rank-N array, yields rank-(N-1) slices along axis 0. For rank 1 it just yields elements. The slicing copies each row into a freshly-allocated lower-rank Array so that downstream code can recurse without worrying about rank.
  • ToMultidimensional<T>(object jagged): the inverse direction. Takes a jagged value (T[][] or deeper), validates rectangularity, and copies it into a rank-N T[,,...]. Throws InvalidOperationException on ragged data or null intermediate rows, with a message that names the depth and the mismatched length.

Both operations are conceptually trivial but the index bookkeeping is fiddly enough that they belong in a single, unit-testable home.
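
To give a feel for the index bookkeeping, here is a rough sketch of the slicing direction. It is not the PR's implementation: SliceSketch, SliceOutermost and CopyCells are hypothetical names, and the sketch assumes zero lower bounds on every dimension.

```csharp
using System;
using System.Collections.Generic;

public static class SliceSketch
{
    // Hypothetical stand-in for the slicing helper: yields one freshly
    // allocated rank-(N-1) copy per index along axis 0.
    public static IEnumerable<object> SliceOutermost(Array array)
    {
        if (array.Rank == 1)
        {
            foreach (var element in array)
                yield return element;
            yield break;
        }

        var elementType = array.GetType().GetElementType();
        var innerLengths = new int[array.Rank - 1];
        for (var d = 1; d < array.Rank; d++)
            innerLengths[d - 1] = array.GetLength(d);

        for (var i = 0; i < array.GetLength(0); i++)
        {
            var slice = Array.CreateInstance(elementType, innerLengths);
            var source = new int[array.Rank];
            source[0] = i; // fix the outermost index for this slice
            CopyCells(array, slice, source, new int[innerLengths.Length], 0);
            yield return slice;
        }
    }

    // Walks every index combination of the slice and copies the matching cell.
    private static void CopyCells(Array from, Array to, int[] source, int[] target, int depth)
    {
        if (depth == target.Length)
        {
            to.SetValue(from.GetValue(source), target);
            return;
        }

        for (var i = 0; i < to.GetLength(depth); i++)
        {
            source[depth + 1] = i;
            target[depth] = i;
            CopyCells(from, to, source, target, depth + 1);
        }
    }

    public static void Main()
    {
        var grid = new byte[,] { { 1, 2, 3 }, { 4, 5, 6 } };
        foreach (var row in SliceOutermost(grid))
            Console.WriteLine(string.Join(",", (byte[])row));
    }
}
```

A rank-3 input yields rank-2 slices, which a caller can pass back into the same method to reach rank 1.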

2. Types/TypeConverter.cs: rank-aware loops

if (type.IsArray)
{
    ClickHouseType result = ToClickHouseType(type.GetElementType());
    var rank = type.GetArrayRank();
    for (var i = 0; i < rank; i++)
        result = new ArrayType { UnderlyingType = result };
    return result;
}

The same loop appears in both overloads: ToClickHouseType(Type) and ToClickHouseType(object). The value-based overload also no longer crashes on multidimensional input, because it now uses Array.GetValue(int[] indices) instead of Array.GetValue(int).
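
As a self-contained illustration of the same rank-aware loop, the sketch below produces type names as strings instead of ClickHouseType objects. RankAwareNaming, ScalarName, Describe and FirstElementOrNull are hypothetical names, the scalar mapping covers only the types the example needs, and FirstElementOrNull assumes zero lower bounds.

```csharp
using System;

public static class RankAwareNaming
{
    // Hypothetical scalar mapping, limited to what this example uses.
    private static string ScalarName(Type type)
    {
        if (type == typeof(byte)) return "UInt8";
        if (type == typeof(int)) return "Int32";
        if (type == typeof(string)) return "String";
        throw new NotSupportedException($"No mapping for {type}");
    }

    // Wraps the element type once per array dimension, so byte[,] and
    // byte[][] both come out as Array(Array(UInt8)).
    public static string Describe(Type type)
    {
        if (!type.IsArray)
            return ScalarName(type);

        var result = Describe(type.GetElementType());
        for (var i = 0; i < type.GetArrayRank(); i++)
            result = $"Array({result})";
        return result;
    }

    // Rank-safe peek at the first element: one zero index per dimension.
    public static object FirstElementOrNull(Array array) =>
        array.Length == 0 ? null : array.GetValue(new int[array.Rank]);

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(byte[])));   // Array(UInt8)
        Console.WriteLine(Describe(typeof(byte[,])));  // Array(Array(UInt8))
        Console.WriteLine(Describe(typeof(byte[][]))); // Array(Array(UInt8))
        Console.WriteLine(FirstElementOrNull(new int[,] { { 7, 8 }, { 9, 10 } })); // 7
    }
}
```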

3. Formats/HttpParameterFormatter.cs

A new pattern-match arm placed before the existing IEnumerable arm:

case ArrayType arrayType when value is Array multidim && multidim.Rank > 1:
    return $"[{string.Join(",", MultiDimArrayHelper.EnumerateOutermostRank(multidim)
        .Select(obj => Format(arrayType.UnderlyingType, obj, true, customFormatter, parameterName)))}]";

The ordering matters. Multidimensional arrays would otherwise be intercepted by the IEnumerable arm and flattened.
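
The effect of arm ordering can be seen in a stripped-down switch. This is only a sketch of the dispatch idea, with a hypothetical ArmOrdering.Classify method standing in for the formatter; it switches on the value directly rather than on the ClickHouse type.

```csharp
using System;
using System.Collections;

public static class ArmOrdering
{
    // The first matching arm wins, so the rank check has to come before
    // the general IEnumerable arm that every array also satisfies.
    public static string Classify(object value)
    {
        switch (value)
        {
            case string _:
                return "scalar";
            case Array array when array.Rank > 1:
                return "multidimensional";
            case IEnumerable _:
                return "enumerable";
            default:
                return "scalar";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Classify(new byte[,] { { 1, 2 }, { 3, 4 } }));  // multidimensional
        Console.WriteLine(Classify(new[] { new byte[] { 1 }, new byte[] { 2 } })); // enumerable (jagged)
        Console.WriteLine(Classify(new byte[] { 1, 2 }));                 // enumerable
        Console.WriteLine(Classify(219));                                 // scalar
    }
}
```

Jagged arrays have rank 1 at every level, so they keep flowing through the enumerable arm unchanged.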

The default branch's error message was also rewritten to include the parameter name and the full type, and the public entry point wraps recursive throws with the outer type so a leaf-level mismatch surfaces the user-visible parameter type:

Parameter 'm_value' (type Array(Array(UInt8))): Parameter 'm_value': cannot convert value of type 'System.Byte' (219) to ClickHouse type Array(Array(UInt8))

4. Types/ArrayType.Write: same multidim arm on the binary path

if (value is Array multidim && multidim.Rank > 1)
{
    writer.Write7BitEncodedInt(multidim.GetLength(0));
    foreach (var slice in MultiDimArrayHelper.EnumerateOutermostRank(multidim))
        UnderlyingType.Write(writer, slice);
    return;
}

Each slice is a rank-(N-1) array; the recursive UnderlyingType.Write call sees a lower-rank value and either recurses again (rank-3 input → rank-2 slice → still multidim) or drops into the legacy (IList)value path (rank-1 slice → ordinary array).
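
To make the resulting byte layout concrete, here is a small sketch for the rank-2 byte case only. It assumes each array level is written as a 7-bit-encoded length followed by its elements, as the snippet above suggests; NestedByteWriter and Encode are hypothetical names, the sketch walks the grid in place rather than slicing, and BinaryWriter.Write7BitEncodedInt is public only on .NET 5 and later.

```csharp
using System;
using System.IO;

public static class NestedByteWriter
{
    // Encodes a byte[,] as: outer length, then per row: inner length + cells.
    public static byte[] Encode(byte[,] grid)
    {
        var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write7BitEncodedInt(grid.GetLength(0));       // outer length = row count
            for (var row = 0; row < grid.GetLength(0); row++)
            {
                writer.Write7BitEncodedInt(grid.GetLength(1));   // inner length for this row
                for (var col = 0; col < grid.GetLength(1); col++)
                    writer.Write(grid[row, col]);
            }
        }

        // ToArray remains valid after the writer has closed the stream.
        return stream.ToArray();
    }

    public static void Main()
    {
        var bytes = Encode(new byte[,] { { 1, 2, 3 }, { 4, 5, 6 } });
        Console.WriteLine(string.Join(" ", bytes)); // 2 3 1 2 3 3 4 5 6
    }
}
```

The leading 2 is the value the old code got wrong: it wrote the flattened count of 6 there instead.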

5. Reading multidimensional shapes... the part I went back on (see Choosing the reader API)

ClickHouseDataReader.GetFieldValue<T>, an override that previously did a one-line cast, now detects multidimensional T and routes through MultiDimArrayHelper.ToMultidimensional<T>:

public override T GetFieldValue<T>(int ordinal)
{
    var raw = GetValue(ordinal);
    if (FieldValueDispatcher<T>.RequiresMultidimConversion)
        return MultiDimArrayHelper.ToMultidimensional<T>(raw);
    return (T)raw;
}

private static class FieldValueDispatcher<T>
{
    public static readonly bool RequiresMultidimConversion =
        typeof(T).IsArray && typeof(T).GetArrayRank() >= 2;
}

The FieldValueDispatcher<T> cache reduces the hot-path overhead to a single static bool load and branch. Each closed generic instantiation computes the predicate exactly once on first use.
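
The static-generic caching pattern can be tried in isolation. In the sketch below, RankCache<T>, InitCounter and RankCacheDemo are hypothetical names; the counter exists only to make the once-per-closed-type initialisation observable.

```csharp
using System;
using System.Threading;

public static class InitCounter
{
    public static int Count;
}

public static class RankCache<T>
{
    // Evaluated once per closed generic type; later reads are a plain field load.
    public static readonly bool IsMultidimensional = Compute();

    private static bool Compute()
    {
        Interlocked.Increment(ref InitCounter.Count);
        return typeof(T).IsArray && typeof(T).GetArrayRank() >= 2;
    }
}

public static class RankCacheDemo
{
    public static void Main()
    {
        for (var i = 0; i < 1000; i++)
        {
            _ = RankCache<int[,]>.IsMultidimensional;
            _ = RankCache<int[][]>.IsMultidimensional;
        }

        Console.WriteLine(RankCache<int[,]>.IsMultidimensional);  // True
        Console.WriteLine(RankCache<int[][]>.IsMultidimensional); // False: jagged arrays are rank 1
        Console.WriteLine(RankCache<string>.IsMultidimensional);  // False
        Console.WriteLine(InitCounter.Count);                     // one increment per closed type used
    }
}
```

Note that a jagged T such as int[][] never takes the conversion branch, so existing jagged reads keep the plain cast.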


On the standalone MultiDimArrayHelper type

The slicing and rectangularity logic is small enough that it could plausibly live as private methods on either ArrayType (write side) or ClickHouseDataReader (read side). I deliberately put it in its own internal static class for three reasons:

  1. Both the formatter and the binary writer needed the slicing. If it lived on either of them, the other would either duplicate or take a coupling it didn't need.
  2. Pure-function unit-testability. The conversion is the kind of thing that benefits from exhaustive small tests: rank-2 rectangular, rank-3 ragged at middle depth, empty inner rows, null intermediate nodes, single-row arrays, all-empty arrays. Putting it behind a pure-static surface lets MultiDimArrayHelperTests cover 22 cases without needing to spin up a server or even a ClickHouseDataReader.
  3. It documents intent at the call site. MultiDimArrayHelper.EnumerateOutermostRank(value) reads clearly, whereas a private SliceOuter(value) on HttpParameterFormatter would only make sense if you remembered the formatter happened to have it.

It's internal rather than public because the helper isn't part of the user-facing contract. The only public surface that touches it is GetFieldValue<T>'s new special-case branch.


Choosing the reader API

Two reasonable designs presented themselves. I weighed them against the existing architecture and the cost on the hot read path, and picked the one that slotted into the codebase's existing conventions.

Option A: a new public method GetMultidimensional<T>(int ordinal) on ClickHouseDataReader that delegates to the helper. I feel this conversion is qualitatively different from the existing typed accessors (GetIPAddress, GetTuple, GetBigInteger), which are zero-cost casts of GetValue: this one allocates, copies, and can throw on ragged data. An explicit method name would make that visible at the call site, and keep the cost out of any unrelated read.

Option B: extending the existing GetFieldValue<T> override so it detects multidim T and routes through the helper. ClickHouseDataReader already overrides GetFieldValue<T>. A user wanting to read a multidim column would naturally try reader.GetFieldValue<int[,]>(0) first, and today that path already exists and fails with InvalidCastException. Adding Option A alongside it would leave two paths to one goal, with the obvious one broken.

I picked B. It fits the codebase's typed-read convention, fixes a latent UX issue in the same stroke, and doesn't require users to discover a new method through the changelog.

The cost is a hot-path concern: every GetFieldValue<T> call now has to decide whether T is a multidimensional array. I've mitigated this with the FieldValueDispatcher<T> static-cached predicate, so the check on the hot path becomes a single bool load and branch.

For value-type T the JIT can fold the static initialiser to a constant and eliminate the check entirely; for reference-type T it should be nanoseconds.

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG
  • A human-readable description of the changes was provided to include in RELEASENOTES

Note

Medium Risk
Touches core serialization (HTTP, binary, type inference) and GetFieldValue behavior for multidimensional targets; behavior changes are user-visible but heavily tested and scoped to nested-array scenarios.

Overview
Adds end-to-end support for nested ClickHouse arrays (Array(Array(T)) and deeper) across HTTP parameters, binary/bulk inserts, and type inference (#320).

Write path: Jagged shapes (T[][], List<List<T>>) keep working; rectangular CLR arrays (T[,], T[,,], …) are now serialized correctly instead of being flattened or mis-handled via IEnumerable/IList. A new internal MultiDimArrayHelper walks multidimensional arrays in place for HTTP and binary encoding, validates CLR rank against nested Array depth, and handles non-zero lower bounds. TypeConverter maps multidimensional CLR types to the right nested Array(...) types and no longer crashes on rank > 1 when peeking the first element.

Read path: GetValue still returns jagged arrays; GetFieldValue<T[,]> (and higher rank) materializes rectangular arrays when data is rectangular, with clearer InvalidCastException vs InvalidOperationException semantics and column ordinal in messages.
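
The jagged-to-rectangular step on the read path can be sketched for the rank-2 case as follows. This is an illustration only, not the PR's helper (which is described as handling arbitrary rank); RectangularSketch and ToRectangular are hypothetical names, and the exception messages are made up for the example.

```csharp
using System;

public static class RectangularSketch
{
    // Copies a jagged T[][] into a T[,], rejecting null rows and ragged data.
    public static T[,] ToRectangular<T>(T[][] jagged)
    {
        if (jagged == null)
            throw new ArgumentNullException(nameof(jagged));

        var rows = jagged.Length;
        var cols = 0;
        if (rows > 0)
            cols = (jagged[0] ?? throw new InvalidOperationException("Row 0 is null.")).Length;

        var result = new T[rows, cols];
        for (var r = 0; r < rows; r++)
        {
            var row = jagged[r] ?? throw new InvalidOperationException($"Row {r} is null.");
            if (row.Length != cols)
                throw new InvalidOperationException(
                    $"Ragged data at depth 1: row {r} has length {row.Length}, expected {cols}.");

            for (var c = 0; c < cols; c++)
                result[r, c] = row[c];
        }

        return result;
    }

    public static void Main()
    {
        var grid = ToRectangular(new[] { new[] { 1, 2, 3 }, new[] { 4, 5, 6 } });
        Console.WriteLine($"{grid.GetLength(0)}x{grid.GetLength(1)}, last = {grid[1, 2]}"); // 2x3, last = 6
    }
}
```

Validating every row before copying its cells means a ragged input fails with the row index and lengths in the message instead of an index-out-of-range error partway through the copy.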

Diagnostics: HTTP parameter errors include parameter name, full ClickHouse type, and CLR value type; outer type is preserved when formatting fails on nested parameters.

Release notes and broad integration/unit tests cover bulk copy, form-data parameters, ClickHouseClient binary insert, and GetFieldValue edge cases.

Reviewed by Cursor Bugbot for commit 73ab6f8.

@DanielBunting DanielBunting marked this pull request as ready for review May 13, 2026 11:29
Copilot AI review requested due to automatic review settings May 13, 2026 11:29
@DanielBunting DanielBunting requested a review from mzitnik as a code owner May 13, 2026 11:29
Copilot AI left a comment

Pull request overview

This PR adds end-to-end support for nested ClickHouse arrays (Array(Array(T)) and deeper) across HTTP-parameter formatting, binary/bulk insert writing, and type inference, including accepting rectangular CLR multidimensional arrays (T[,], T[,,], …) on the write path and enabling GetFieldValue<T> materialization back into multidimensional CLR arrays when the data is rectangular.

Changes:

  • Add MultiDimArrayHelper to slice rank>1 CLR arrays into jagged “rows” for formatting/writing, and to materialize jagged results back into rectangular multidimensional arrays on read.
  • Make TypeConverter.ToClickHouseType rank-aware for both type-based and value-based inference to preserve array nesting depth for multidimensional CLR arrays.
  • Update HTTP formatter and ArrayType.Write to avoid flattening multidimensional CLR arrays; expand diagnostics and add extensive unit/integration test coverage.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| RELEASENOTES.md | Documents nested/multidimensional array support and improved HTTP mismatch diagnostics. |
| CHANGELOG.md | Mirrors release notes for nested/multidimensional array support and type-inference fix. |
| ClickHouse.Driver/Types/TypeConverter.cs | Makes array type inference rank-aware (type- and value-based). |
| ClickHouse.Driver/Types/MultiDimArrayHelper.cs | New helper for slicing multidimensional arrays and materializing jagged → multidimensional. |
| ClickHouse.Driver/Types/ArrayType.cs | Adds rank>1 CLR array handling on the binary write path via slicing. |
| ClickHouse.Driver/Formats/HttpParameterFormatter.cs | Adds rank>1 CLR array formatting path and improves/rehydrates nested mismatch error messages. |
| ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs | Extends GetFieldValue<T> to materialize jagged nested arrays into multidimensional CLR arrays for rank>=2 T. |
| ClickHouse.Driver.Tests/Types/TypeMappingTests.cs | Adds unit tests for deeper nested arrays and multidimensional CLR array type mapping/inference + formatter coverage. |
| ClickHouse.Driver.Tests/Types/MultiDimArrayHelperTests.cs | Adds focused unit tests for slicing and jagged→multidimensional materialization (incl. ragged rejection). |
| ClickHouse.Driver.Tests/SQL/NestedArrayParameterTests.cs | Adds end-to-end integration tests covering nested arrays across select/insert/bulk/client APIs and multidimensional read materialization. |
| ClickHouse.Driver.Tests/SQL/NestedArrayParameterFormDataTests.cs | Adds multipart/form-data parameter-path coverage for nested arrays. |
| ClickHouse.Driver.Tests/BulkCopy/BulkCopyTests.cs | Adds bulk-copy test cases for nested arrays. |

Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs Outdated
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated

@alex-clickhouse alex-clickhouse left a comment

Thanks for this! Broadly looks good, but a few points before we can merge.

Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated
Comment thread ClickHouse.Driver/Formats/HttpParameterFormatter.cs
Comment thread ClickHouse.Driver.Tests/SQL/NestedArrayParameterTests.cs Outdated
codecov Bot commented May 18, 2026

Switches MultiDimArrayHelper to attempt a single-sweep allocation.
Tidies up tests.
Adds support for arrays with non-zero lower bounds.
Updates changelog and release notes.
@DanielBunting DanielBunting requested a review from Copilot May 21, 2026 21:39
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread ClickHouse.Driver/Types/TypeConverter.cs
Comment thread ClickHouse.Driver/Formats/HttpParameterFormatter.cs
@DanielBunting DanielBunting requested a review from Copilot May 22, 2026 17:20
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
Comment thread ClickHouse.Driver.Tests/SQL/NestedArrayParameterFormDataTests.cs Outdated
Comment thread ClickHouse.Driver.Tests/SQL/NestedArrayParameterFormDataTests.cs
@DanielBunting DanielBunting requested a review from Copilot May 24, 2026 00:16
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Comment thread ClickHouse.Driver/Formats/HttpParameterFormatter.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 877041e.

Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated
@DanielBunting DanielBunting requested a review from Copilot May 26, 2026 08:27
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
Comment thread ClickHouse.Driver/ADO/Readers/ClickHouseDataReader.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs Outdated
@DanielBunting DanielBunting requested a review from Copilot May 26, 2026 18:07
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
Comment thread ClickHouse.Driver/Types/MultiDimArrayHelper.cs
@DanielBunting DanielBunting requested a review from Copilot May 26, 2026 23:57
Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

Comment thread ClickHouse.Driver/Types/TypeConverter.cs
Comment thread ClickHouse.Driver/Formats/HttpParameterFormatter.cs
3 participants