[C++] Change Scalar::CastTo to do safe cast by default and allow to specify cast options ? #35040

@jorisvandenbossche

Describe the enhancement requested

See #34901 for a longer discussion. In summary: the pyarrow.Scalar object has a cast() method but, in contrast with the other cast methods in pyarrow, it does an unsafe cast by default. We should probably change this to do a safe cast by default and, at the same time, allow specifying CastOptions (so a user can still opt into an unsafe cast).

Example:

# scalar behaviour
>>> pa.scalar(1.5)
<pyarrow.DoubleScalar: 1.5>
>>> pa.scalar(1.5).cast(pa.int64())
<pyarrow.Int64Scalar: 1>

# vs array behaviour
>>> pa.array([1.5]).cast(pa.int64())
...
ArrowInvalid: Float value 1.5 was truncated converting to int64

The Python cast() method calls the C++ Scalar::CastTo:

// TODO(bkietz) add compute::CastOptions
Result<std::shared_ptr<Scalar>> CastTo(std::shared_ptr<DataType> to) const;

which currently indeed doesn't have the option to pass CastOptions.
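
To make the proposed semantics concrete, here is a minimal pure-Python sketch of "safe by default, unsafe on request" for a single float-to-int cast. It is not pyarrow or Arrow C++ code: the `CastOptions` class and `cast_float_to_int` function below are hypothetical stand-ins, and only one flag (`allow_float_truncate`) is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CastOptions:
    """Hypothetical stand-in for cast options; only one flag is modelled."""

    allow_float_truncate: bool = False

    @classmethod
    def safe(cls) -> "CastOptions":
        # Safe: refuse casts that would lose the fractional part.
        return cls(allow_float_truncate=False)

    @classmethod
    def unsafe(cls) -> "CastOptions":
        # Unsafe: silently truncate, like the current scalar behaviour.
        return cls(allow_float_truncate=True)


def cast_float_to_int(value: float, options: CastOptions = CastOptions.safe()) -> int:
    """Cast a float to an int, safe by default (illustrative helper only)."""
    truncated = int(value)  # int() truncates toward zero
    if truncated != value and not options.allow_float_truncate:
        raise ValueError(
            f"Float value {value} was truncated converting to int64"
        )
    return truncated
```

With this shape, `cast_float_to_int(1.5)` raises, matching the array behaviour above, while `cast_float_to_int(1.5, CastOptions.unsafe())` returns 1, matching what the scalar cast does today.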

In addition, casting Scalars seems to use a somewhat custom implementation: instead of the generic Cast implementation from the compute kernels, there is a separate CastImpl in scalar.cc. I am not fully sure about the reason for this, but maybe historically we wanted scalar casting to work without relying on the optional compute module? (cf. #25025)

Component(s)

C++
