Skip to content

Conversation

@felipecrv
Copy link
Contributor

@felipecrv felipecrv commented Apr 23, 2024

Rationale for this change

Users and other classes within Arrow itself (e.g. array builders) expect extension types to behave like their underlying storage type.

As it is now, ExtensionType::bit_width() is the default DataType::bit_width() implementation which returns -1. It should return the storage type's bit-width.

What changes are included in this PR?

Definition of ExtensionType::bit_width/byte_width functions.

Are these changes tested?

Tests added and confirmed to fail prior to these changes.

Are there any user-facing changes?

ExtensionType now define bit_width and byte_width according to their storage type.

@felipecrv felipecrv requested review from bkietz, kou and mapleFU April 23, 2024 15:03
@rok
Copy link
Member

rok commented Apr 23, 2024

Does this work ok for non-fixed width storage types?

@felipecrv
Copy link
Contributor Author

felipecrv commented Apr 23, 2024

Does this work ok for non-fixed width storage types?

They will continue to return -1 just like non-fixed-width types do today.

And I didn't make them final, so if a ExtensionType sub-class wants to return something different (even -1 to hide their fixed-width-ness), they can. But I doubt that will ever make much sense.

@felipecrv felipecrv merged commit fb7e468 into apache:main Apr 23, 2024
@felipecrv felipecrv removed the awaiting committer review Awaiting committer review label Apr 23, 2024
@felipecrv felipecrv deleted the bit_width_extension branch April 23, 2024 20:01
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit fb7e468.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 26 possible false positives for unstable benchmarks that are known to sometimes produce them.

zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Apr 24, 2024
…e in terms of the storage type (apache#41354)

### Rationale for this change

Users and other classes within Arrow itself (e.g. array builders) expect extension types to behave like their underlying storage type.

As it is now, `ExtensionType::bit_width()` is the default `DataType::bit_width()` implementation which returns `-1`. It should return the storage type's bit-width.

### What changes are included in this PR?

Definition of `ExtensionType::bit_width/byte_width` functions.

### Are these changes tested?

Tests added and confirmed to fail prior to these changes.

### Are there any user-facing changes?

`ExtensionType` now define `bit_width` and `byte_width` according to their storage type.
* GitHub Issue: apache#41353

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
raulcd pushed a commit that referenced this pull request Apr 29, 2024
…erms of the storage type (#41354)

### Rationale for this change

Users and other classes within Arrow itself (e.g. array builders) expect extension types to behave like their underlying storage type.

As it is now, `ExtensionType::bit_width()` is the default `DataType::bit_width()` implementation which returns `-1`. It should return the storage type's bit-width.

### What changes are included in this PR?

Definition of `ExtensionType::bit_width/byte_width` functions.

### Are these changes tested?

Tests added and confirmed to fail prior to these changes.

### Are there any user-facing changes?

`ExtensionType` now define `bit_width` and `byte_width` according to their storage type.
* GitHub Issue: #41353

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…e in terms of the storage type (apache#41354)

### Rationale for this change

Users and other classes within Arrow itself (e.g. array builders) expect extension types to behave like their underlying storage type.

As it is now, `ExtensionType::bit_width()` is the default `DataType::bit_width()` implementation which returns `-1`. It should return the storage type's bit-width.

### What changes are included in this PR?

Definition of `ExtensionType::bit_width/byte_width` functions.

### Are these changes tested?

Tests added and confirmed to fail prior to these changes.

### Are there any user-facing changes?

`ExtensionType` now define `bit_width` and `byte_width` according to their storage type.
* GitHub Issue: apache#41353

Authored-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants