Skip to content

Conversation

@zanmato1984
Copy link
Contributor

@zanmato1984 zanmato1984 commented Feb 27, 2024

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

bkietz pushed a commit that referenced this pull request Apr 23, 2024
…ion (#40237)

### Rationale for this change

As #40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for *every* thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - #40260), @ pitrou and @ bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

### What changes are included in this PR?

There are generally two parts in this PR:
1. Mandate the initialization of the scratch space in constructor of the concrete subclass of `Scalar`.
2. In order to keep the content of the scratch space consistent with the underlying `value` of the scalar, make the `value` constant. This effectively makes legacy code that directly assigning to the `value` invalid, which is refactored accordingly:
  2.1 `BoxScalar` in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
  2.2 `Scalar::CastTo` in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
  2.3 `ScalarMinMax` in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current `Scalar::CastTo` is not fully covered by the existing tests. So I also added some lacking ones.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**
The `value` member of `BaseBinaryScalar` and subclasses/`BaseListScalar` and subclasses/`SparseUnionScalar`/`DenseUnionScalar`/`RunEndEncodedScalar` is made constant, thus code directly assigning to this member will no more compile.

Also the `Scalar::mutable_data()` member function is removed because it's against the immutable nature of `Scalar`.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

* GitHub Issue: #40069

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
zanmato1984 added a commit to zanmato1984/arrow that referenced this pull request Apr 24, 2024
…alization (apache#40237)

### Rationale for this change

As apache#40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for *every* thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - apache#40260), @ pitrou and @ bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

### What changes are included in this PR?

There are generally two parts in this PR:
1. Mandate the initialization of the scratch space in constructor of the concrete subclass of `Scalar`.
2. In order to keep the content of the scratch space consistent with the underlying `value` of the scalar, make the `value` constant. This effectively makes legacy code that directly assigning to the `value` invalid, which is refactored accordingly:
  2.1 `BoxScalar` in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
  2.2 `Scalar::CastTo` in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
  2.3 `ScalarMinMax` in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current `Scalar::CastTo` is not fully covered by the existing tests. So I also added some lacking ones.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**
The `value` member of `BaseBinaryScalar` and subclasses/`BaseListScalar` and subclasses/`SparseUnionScalar`/`DenseUnionScalar`/`RunEndEncodedScalar` is made constant, thus code directly assigning to this member will no more compile.

Also the `Scalar::mutable_data()` member function is removed because it's against the immutable nature of `Scalar`.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

* GitHub Issue: apache#40069

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
raulcd pushed a commit that referenced this pull request Apr 29, 2024
…ion (#40237)

### Rationale for this change

As #40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for *every* thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - #40260), @ pitrou and @ bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

### What changes are included in this PR?

There are generally two parts in this PR:
1. Mandate the initialization of the scratch space in constructor of the concrete subclass of `Scalar`.
2. In order to keep the content of the scratch space consistent with the underlying `value` of the scalar, make the `value` constant. This effectively makes legacy code that directly assigning to the `value` invalid, which is refactored accordingly:
  2.1 `BoxScalar` in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
  2.2 `Scalar::CastTo` in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
  2.3 `ScalarMinMax` in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current `Scalar::CastTo` is not fully covered by the existing tests. So I also added some lacking ones.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**
The `value` member of `BaseBinaryScalar` and subclasses/`BaseListScalar` and subclasses/`SparseUnionScalar`/`DenseUnionScalar`/`RunEndEncodedScalar` is made constant, thus code directly assigning to this member will no more compile.

Also the `Scalar::mutable_data()` member function is removed because it's against the immutable nature of `Scalar`.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

* GitHub Issue: #40069

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…alization (apache#40237)

### Rationale for this change

As apache#40069 shows, TSAN reports data race that is caused by concurrent filling the scratch space of a scalar instance. The concurrent use of the same scalar could be parallel executing an acero plan containing a literal (a "constant" that is simply represented by an underlying scalar), and this is totally legit. The problem lies in the fact that the scratch space of the scalar is filled "lazily" by the time when it is being involved in the computation and transformed to an array span, for *every* thread.

After piloting several approaches (relaxed atomic - an earlier version of this PR, locking - apache#40260), @ pitrou and @ bkietz suggested an immutable-after-initialization approach, for which the latest version of this PR is.

### What changes are included in this PR?

There are generally two parts in this PR:
1. Mandate the initialization of the scratch space in constructor of the concrete subclass of `Scalar`.
2. In order to keep the content of the scratch space consistent with the underlying `value` of the scalar, make the `value` constant. This effectively makes legacy code that directly assigning to the `value` invalid, which is refactored accordingly:
  2.1 `BoxScalar` in https://github.com/apache/arrow/pull/40237/files#diff-08d11e02c001c82b1aa89565e16760a8bcca4a608c22619fb45da42fd0ebebac
  2.2 `Scalar::CastTo` in https://github.com/apache/arrow/pull/40237/files#diff-b4b83682450006616fa7e4f6f2ea3031cf1a22d734f4bee42a99af313e808f9e
  2.3 `ScalarMinMax` in https://github.com/apache/arrow/pull/40237/files#diff-368ab7e748bd4432c92d9fdc26b51e131742b968e3eb32a6fcea4b9f02fa36aa

Besides, when refactoring 2.2, I found the current `Scalar::CastTo` is not fully covered by the existing tests. So I also added some lacking ones.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

**This PR includes breaking changes to public APIs.**
The `value` member of `BaseBinaryScalar` and subclasses/`BaseListScalar` and subclasses/`SparseUnionScalar`/`DenseUnionScalar`/`RunEndEncodedScalar` is made constant, thus code directly assigning to this member will no more compile.

Also the `Scalar::mutable_data()` member function is removed because it's against the immutable nature of `Scalar`.

However the impact of these changes seems limited. I don't think much user code is depending on these two old pieces of code.

Also after an quick search, I didn't find any document that need to be updated according to this change. There could be none. But if there is, may someone please redirect me to it so I can update. Thanks.

* GitHub Issue: apache#40069

Lead-authored-by: Ruoxi Sun <zanmato1984@gmail.com>
Co-authored-by: Rossi Sun <zanmato1984@gmail.com>
Signed-off-by: Benjamin Kietzman <bengilgit@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant