[C++] Provide a way for an extension array to supply its own value pretty printer #36648

@wjones127

Describe the enhancement requested

We are looking at implementing a BFloat16 data type as an extension type backed by an FSB[2] (two-byte fixed-size binary) storage type. In Python we can write the repr method for the array, but ChunkedArray and Table print the raw storage values instead, which look like nonsense to users:

>>> tab = pa.table({ "x": bfloat16_array([1.1, None, 3.4]) })
>>> tab
pyarrow.Table
x: extension<lance.bfloat16<BFloat16Type>>
----
x: [[8D3F,null,5A40]]
>>> tab['x']
<pyarrow.lib.ChunkedArray object at 0x12a1f3a60>
[
  [
    8D3F,
    null,
    5A40
  ]
]
>>> tab['x'].chunk(0)
<lance.arrow.BFloat16Array object at 0x0000000129818a60>
[1.1015625, None, 3.40625]

lance-format/lance#1002
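
To make the mismatch concrete, here is a minimal sketch of the decoding a value pretty printer would have to do for this type. It assumes the two storage bytes hold the bfloat16 in little-endian order, which is consistent with the output above (8D3F printing as 1.1015625) but is not stated explicitly. The function names are hypothetical and are not part of Arrow or Lance.

```python
import struct


def bfloat16_bytes_to_float(raw: bytes) -> float:
    """Decode two bytes holding a bfloat16 into a Python float.

    Assumption: the bytes are little-endian, as the output in this issue suggests.
    """
    if len(raw) != 2:
        raise ValueError("expected exactly 2 bytes")
    # A bfloat16 is the upper 16 bits of an IEEE 754 float32, so pad the
    # low-order 16 bits with zeros and reinterpret the result as a float32.
    return struct.unpack("<f", b"\x00\x00" + raw)[0]


def format_bfloat16_storage(hex_values):
    """Map storage hex strings (or None for nulls) to readable floats."""
    return [
        None if h is None else bfloat16_bytes_to_float(bytes.fromhex(h))
        for h in hex_values
    ]


print(format_bfloat16_storage(["8D3F", None, "5A40"]))
# prints [1.1015625, None, 3.40625]
```

This is the per-value formatting that ChunkedArray and Table currently have no hook for, so they fall back to printing the hex storage.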

Component(s)

C++
