Skip to content

Conversation

@zanmato1984
Copy link
Contributor

@zanmato1984 zanmato1984 commented Feb 26, 2024

Rationale for this change

As #40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for every thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - #40260), @pitrou and @bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

What changes are included in this PR?

There are generally two parts in this PR:

  1. Mandate the initialization of the scratch space in constructor of the concrete subclass of Scalar.
  2. In order to keep the content of the scratch space consistent with the underlying value of the scalar, make the value constant. This effectively makes legacy code that directly assigning to the value invalid, which is refactored accordingly:
    2.1 BoxScalar in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
    2.2 Scalar::CastTo in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
    2.3 ScalarMinMax in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current Scalar::CastTo is not fully covered by the existing tests. So I also added some lacking ones.

Are these changes tested?

Yes.

Are there any user-facing changes?

This PR includes breaking changes to public APIs.
The value member of BaseBinaryScalar and subclasses/BaseListScalar and subclasses/SparseUnionScalar/DenseUnionScalar/RunEndEncodedScalar is made constant, thus code directly assigning to this member will no more compile.

Also the Scalar::mutable_data() member function is removed because it's against the immutable nature of Scalar.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

@github-actions
Copy link

⚠️ GitHub issue #40069 has been automatically assigned in GitHub to PR creator.

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this!

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Feb 26, 2024
@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting change review Awaiting change review awaiting changes Awaiting changes labels Feb 26, 2024
@zanmato1984
Copy link
Contributor Author

Hi @bkietz , I had some bad news about the current approach.

The current approach did prevent WW race by using atomic stores. The reading, however, is using plane memory load, thus leads to RW race (observable by TSAN, at this line https://github.com/apache/arrow/pull/40237/files#diff-5af65f049e302f76671ecbfc17b0f9c522ec94be291b2bcb442fa565d3607736R2006). Given that the reading is through ArraySpan and spread all over the compute component, it is not feasible to change every read to atomic load.

So I have to turn to an alternative approach drafted in #40260, using DCLP which could prevent both WW and RW race conditions. Would you step to #40260 to take a look? (It's of course in draft and lacks tests of many other scalar types, but it would be nice to make agreement on the approach first)

Once we have an agreement on the new approach, I'll close this PR and move on with the other.

cc @pitrou

@pitrou
Copy link
Member

pitrou commented Feb 27, 2024

Ok, before we introduce even more complexity and ad-hoc synchronization, let's step back a bit and ask ourselves the question: why is a ArraySpan or BufferSpan being initialized concurrently? We should probably fix that, instead of adding bandaids so that it doesn't trigger TSAN warnings.

@bkietz
Copy link
Member

bkietz commented Feb 27, 2024

why is a ArraySpan or BufferSpan being initialized concurrently?

The problem is we want FillFromScalar to act like copying a constant value into the ArraySpan. It would not be an error for multiple threads to copy a const int into a local int. However FillFromScalar mutates the constant scalar from which we're copying.

One way to resolve this without sync would be setting up the offset scratch space when the scalar is constructed. IIRC we didn't do that in the first place because there are places where we assign to BinaryScalar::value, which would invalidate the offsets. Cleaning those up would be annoying but is more in line with Scalar behaving as an immutable value

@pitrou
Copy link
Member

pitrou commented Feb 27, 2024

I think it would be good to move to an immutable-after-instantiation model. I don't think it's worth supporting the mutation of a Scalar after it was constructed (especially as Scalar is not meant to be high performance in the first place).

@zanmato1984
Copy link
Contributor Author

OK, l can check how much work is needed to move to the immutable-after-instantiation approach.

@zanmato1984
Copy link
Contributor Author

Just a concern. Will this approach break the API compatibility? Since BaseBinaryScalar::value is already a public API and there might already have been user code that is assigning to it.

@bkietz
Copy link
Member

bkietz commented Feb 28, 2024

Will this approach break the API compatibility?

Yes, this would be a breaking change. I would guess that not much user code would be invalidated by the change

@zanmato1984 zanmato1984 changed the title GH-40069: [C++] Use relaxed atomic store to write scalar scratch space GH-40069: [C++] Make scalars with scratch space immutable Apr 1, 2024
@zanmato1984
Copy link
Contributor Author

zanmato1984 commented Apr 1, 2024

Sorry for coming back to this late. I'm pushing a new approach into this PR as previously discussed (thus the title change).

It is still WIP, with some further refactoring to be done and a little more tests to be added. But I think it is enough to show the essential of the approach. So I'd like to start the discussion now before further coding, to prevent any last minute surprise. There are two things:

  1. The change in scalar.h may seem a little overkill, specifically the FillScalarScratchSpace and subclasses. This is because I want to mandate the initialization of the scratch space in the subclasses' constructors, meanwhile do it in a polymorphic fashion. Update: scattered ArraySpanFillFromScalarScratchSpace inheritance from middle of the inheritance chain (BaseXxxScalar/Union) to concrete subclasses and use CRTP to initialize the scratch space in constructor. Do you think this is the proper way to do it?
  2. I'm going to make the value immutable only for subclasses of ArraySpanFillFromScalarScratchSpace (only done for BaseBinaryScalar subclasses in this PR so far). This is sufficient to ensure the validity of the scratch space content. But this doesn't fully achieve the goal of "making scalar absolutely immutable by all means" - which is doable, but just requires a lot more changes. Should I move on towards the "full-immutability", e.g., the immutable value of other scalar types, and maybe even the immutable is_valid?

I appreciate your feedback @pitrou @bkietz . Thanks.

@zanmato1984
Copy link
Contributor Author

zanmato1984 commented Apr 7, 2024

The current approach, i.e., immutable value, has some corner cases uncovered with is_valid being mutable.

Say, a scalar instance is created with a valid value. Later setting its is_valid to false is fine because any attempt of using this scalar won't actually use the scratch space content. However it is not the case for the reverse: setting is_valid to true for an initially invalid scalar instance will leave the scratch space content inconsistent.

One way around it could be leveraging Scalar::Validate or Scalar::ValidateFull to detect such inconsistency, supposing user code should call these method aggressively to check the scalar. But seems Scalar::Validate/ValidateFull are only used in tests?

@zanmato1984 zanmato1984 changed the title GH-40069: [C++] Make scalars with scratch space immutable GH-40069: [C++] Make scalar scratch space immutable after initialization Apr 8, 2024
@zanmato1984 zanmato1984 marked this pull request as ready for review April 8, 2024 15:40
@zanmato1984 zanmato1984 requested a review from kou as a code owner April 8, 2024 15:40
@zanmato1984
Copy link
Contributor Author

Without early feedback, I just go ahead and now get the PR ready. The description has been updated accordingly. @pitrou @bkietz could you help to take a look? Thank you!

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for working on this!

I do think we should avoid a mix of immutable and mutable scalars. However I think moving is_valid and value for non scratch space scalars could be done in separate PRs.

void FillScalarScratchSpace(Args... args) {
FillScalarScratchSpaceHelper<0>(std::forward<Args>(args)...);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be simplified to:

template <typename T, size_t N>
void WriteArray(void *d, T const (&arr)[N]) {
  static_assert(sizeof(arr) <= internal::kScalarScratchSpaceSize);
  std::memcpy(d, arr, sizeof(arr));
}

void BinaryScalar::FillScratchSpace() {
  WriteArray(scratch_space_,
             {int32_t(0), value ? static_cast<int32_t>(value->size()) : int32_t(0)});
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I wanted to make FillScalarScratchSpace able to accept arbitrary number of arguments of arbitrary types, meanwhile do the copy type-safely.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FillScalarScratchSpaceHelper doesn't need to write arbitrary arguments, and actually there's a risk of undefined behavior as written since you might reinterpret_cast<int32_t*>(scratch_space + offset/*=1*/)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that it is unnecessary for this function to accept arbitrary arguments. I will change it for the sake of that.

However I am curious about the risk of UB you mentioned. Did you mean C++ UB wrt. strong aliasing rule? If yes, will memcpy instead eliminate this kind of UB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed. But my curiosity remains :)

Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UB I'm referring to is violation of fundamental alignment requirements (basic.align#1): int32_t can only be allocated on a four byte boundary and it is known at compile time that reinterpret_cast<int32_t*>(scratch_space + offset/*=1*/) is not so aligned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, got it. Thanks for the explanation, and the review of course. Appreciate it!

@jorisvandenbossche
Copy link
Member

One question though (I'm not familiar with emscripten): is emscripten's threading compilable but not runnable?

We have a compile flag to disable threading (ARROW_ENABLE_THREADING) which is disabled for the builds for emscripten (https://arrow.apache.org/docs/dev/developers/cpp/emscripten.html)

bkietz added a commit that referenced this pull request Apr 30, 2024
…vent ub (#41421)

### Rationale for this change

In #40237, I introduced scalar scratch space filling in concrete scalar sub-class constructor, in which there is a static down-casting of `this` to sub-class pointer. Though this is common in CRTP, it happens in base cast constructor. And this is reported in #41407 to be UB by UBSAN's "vptr" sanitizing.

I'm not a language lawyer to tell if this is a true/false-positive. So I proposed two approaches:
1. The easy way: add suppression in [1], like we already did for `shared_ptr`. But apparently this won't be feasible if this is a true-positive (need some language lawyer's help to confirm).
2. The hard way: totally avoid this so-to-speak UB but may introduce more boilerplate code. This PR is the hard way.

[1] https://github.com/apache/arrow/blob/main/r/tools/ubsan.supp

### What changes are included in this PR?

Make `FillScratchSpace` static.

### Are these changes tested?

The existing UT should cover it well.

### Are there any user-facing changes?

None.

* GitHub Issue: #41407

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
jorisvandenbossche pushed a commit that referenced this pull request May 1, 2024
…ut threading support (#41461)

### Rationale for this change

See #41463 and #40237 (comment)

### What changes are included in this PR?

Skip test for platforms that have no threading support.

### Are these changes tested?

Change is test.

### Are there any user-facing changes?

None.

* GitHub Issue: #41463

Authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
raulcd pushed a commit that referenced this pull request May 3, 2024
…vent ub (#41421)

### Rationale for this change

In #40237, I introduced scalar scratch space filling in concrete scalar sub-class constructor, in which there is a static down-casting of `this` to sub-class pointer. Though this is common in CRTP, it happens in base cast constructor. And this is reported in #41407 to be UB by UBSAN's "vptr" sanitizing.

I'm not a language lawyer to tell if this is a true/false-positive. So I proposed two approaches:
1. The easy way: add suppression in [1], like we already did for `shared_ptr`. But apparently this won't be feasible if this is a true-positive (need some language lawyer's help to confirm).
2. The hard way: totally avoid this so-to-speak UB but may introduce more boilerplate code. This PR is the hard way.

[1] https://github.com/apache/arrow/blob/main/r/tools/ubsan.supp

### What changes are included in this PR?

Make `FillScratchSpace` static.

### Are these changes tested?

The existing UT should cover it well.

### Are there any user-facing changes?

None.

* GitHub Issue: #41407

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
raulcd pushed a commit that referenced this pull request May 3, 2024
…ut threading support (#41461)

### Rationale for this change

See #41463 and #40237 (comment)

### What changes are included in this PR?

Skip test for platforms that have no threading support.

### Are these changes tested?

Change is test.

### Are there any user-facing changes?

None.

* GitHub Issue: #41463

Authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…alization (apache#40237)

### Rationale for this change

As apache#40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for *every* thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - apache#40260), @ pitrou and @ bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

### What changes are included in this PR?

There are generally two parts in this PR:
1. Mandate the initialization of the scratch space in constructor of the concrete subclass of `Scalar`.
2. In order to keep the content of the scratch space consistent with the underlying `value` of the scalar, make the `value` constant. This effectively makes legacy code that directly assigning to the `value` invalid, which is refactored accordingly:
  2.1 `BoxScalar` in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
  2.2 `Scalar::CastTo` in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
  2.3 `ScalarMinMax` in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current `Scalar::CastTo` is not fully covered by the existing tests. So I also added some lacking ones.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**
The `value` member of `BaseBinaryScalar` and subclasses/`BaseListScalar` and subclasses/`SparseUnionScalar`/`DenseUnionScalar`/`RunEndEncodedScalar` is made constant, thus code directly assigning to this member will no more compile.

Also the `Scalar::mutable_data()` member function is removed because it's against the immutable nature of `Scalar`.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

* GitHub Issue: apache#40069

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…to prevent ub (apache#41421)

### Rationale for this change

In apache#40237, I introduced scalar scratch space filling in concrete scalar sub-class constructor, in which there is a static down-casting of `this` to sub-class pointer. Though this is common in CRTP, it happens in base cast constructor. And this is reported in apache#41407 to be UB by UBSAN's "vptr" sanitizing.

I'm not a language lawyer to tell if this is a true/false-positive. So I proposed two approaches:
1. The easy way: add suppression in [1], like we already did for `shared_ptr`. But apparently this won't be feasible if this is a true-positive (need some language lawyer's help to confirm).
2. The hard way: totally avoid this so-to-speak UB but may introduce more boilerplate code. This PR is the hard way.

[1] https://github.com/apache/arrow/blob/main/r/tools/ubsan.supp

### What changes are included in this PR?

Make `FillScratchSpace` static.

### Are these changes tested?

The existing UT should cover it well.

### Are there any user-facing changes?

None.

* GitHub Issue: apache#41407

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Co-authored-by: Benjamin Kietzman <bengilgit@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
… without threading support (apache#41461)

### Rationale for this change

See apache#41463 and apache#40237 (comment)

### What changes are included in this PR?

Skip test for platforms that have no threading support.

### Are these changes tested?

Change is test.

### Are there any user-facing changes?

None.

* GitHub Issue: apache#41463

Authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
pitrou pushed a commit that referenced this pull request Feb 5, 2025
### Rationale for this change

The PR #40237 introduced code:

```
// time to time
template <typename To, typename From, typename T = typename To::TypeClass>
```

However, the `Time64Type::TypeClass` doesn't exist, so SFINAE always failed.

### Are these changes tested?

Yes

### Are there any user-facing changes?

No.

* GitHub Issue: #45362

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
QuietCraftsmanship pushed a commit to QuietCraftsmanship/arrow that referenced this pull request Jul 7, 2025
### Rationale for this change

The PR apache/arrow#40237 introduced code:

```
// time to time
template <typename To, typename From, typename T = typename To::TypeClass>
```

However, the `Time64Type::TypeClass` doesn't exist, so SFINAE always failed.

### Are these changes tested?

Yes

### Are there any user-facing changes?

No.

* GitHub Issue: #45362

Authored-by: mwish <maplewish117@gmail.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting committer review Awaiting committer review Breaking Change Includes a breaking change to the API Component: C++ Component: GLib

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants