Skip to content

Conversation

@pitrou
Copy link
Member

@pitrou pitrou commented Jun 4, 2025

Rationale for this change

PR #46408 included a typo that changed list-view IPC tests to use the same data as list tests. This was detected as a duplicate corpus file by the OSS-Fuzz CI build.

What changes are included in this PR?

Undo mistake that led to using the same test data for lists and list-views. Also fix a regression in the CUDA tests, due to reading non-CPU memory when fetching the first offset in a list/binary array.

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

PR apache#46408 changed by mistake list-view IPC tests to use the same data as list tests.
This was detected as a duplicate corpus file by the OSS-Fuzz CI build.

This PR also includes a fix for a regression in the CUDA tests, due to reading non-CPU memory.
@pitrou
Copy link
Member Author

pitrou commented Jun 4, 2025

@github-actions crossbow submit -g cpp

@github-actions github-actions bot added the awaiting review Awaiting review label Jun 4, 2025
@github-actions
Copy link

github-actions bot commented Jun 4, 2025

Revision: 7ee42d1

Submitted crossbow builds: ursacomputing/crossbow @ actions-4d70c06e41

Task Status
example-cpp-minimal-build-static GitHub Actions
example-cpp-minimal-build-static-system-dependency GitHub Actions
example-cpp-tutorial GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp-ubuntu-22.04-cuda-11.7.1 GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-bundled GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-bundled-offline GitHub Actions
test-ubuntu-24.04-cpp-gcc-13-bundled GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions
test-ubuntu-24.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-24.04-cpp-thread-sanitizer GitHub Actions

@pitrou pitrou requested review from felipecrv and zeroshade June 4, 2025 08:30
@pitrou
Copy link
Member Author

pitrou commented Jun 4, 2025

@brunal FYI. I should have caught this during code review :)

@brunal
Copy link
Contributor

brunal commented Jun 4, 2025

Thank you for cleaning up after me!

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Jun 4, 2025
@pitrou pitrou requested a review from zanmato1984 June 5, 2025 08:06
@pitrou
Copy link
Member Author

pitrou commented Jun 5, 2025

Any other comments @kou ?

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting changes Awaiting changes labels Jun 5, 2025
const int64_t required_bytes = sizeof(offset_type) * (array.length() + 1);

offset_type first_offset = 0;
RETURN_NOT_OK(MemoryManager::CopyBufferSliceToCPU(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll need some context here. Is this function supposed to be able to deal with non-CPU arrays? Does this rule apply to other functions in this source file too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's supposed to.

Comment on lines +342 to +344
if (!array.data()->buffers[1]->is_cpu()) {
return Status::NotImplemented("Rebasing non-CPU offsets");
}
Copy link
Contributor

@zanmato1984 zanmato1984 Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions:

  1. Is this error handling necessary to fix the "CUDA tests" mentioned in the PR description?
  2. It appears that if the CUDA tests were failing with accessing the non-CPU first_offset, and the above MemoryManager::CopyBufferSliceToCPU is supposed to fix it, wouldn't this check fail the tests again with Unimplemented? (I assume the said tests are exercising non-CPU buffers with first_offset > 0.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes
  2. This codepath is only taken with a non-zero first offset, but the CUDA tests don't exercise that (otherwise they would have been crashing on git main too)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #46747 for this.

Copy link
Contributor

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@pitrou pitrou merged commit 5f7d885 into apache:main Jun 9, 2025
45 of 47 checks passed
@pitrou pitrou removed the awaiting merge Awaiting merge label Jun 9, 2025
@pitrou pitrou deleted the gh46704-ipc-test-fix branch June 9, 2025 15:29
@conbench-apache-arrow
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 5f7d885.

There were 65 benchmark results with an error:

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them.

alinaliBQ pushed a commit to Bit-Quill/arrow that referenced this pull request Jun 17, 2025
### Rationale for this change

PR apache#46408 included a typo that changed list-view IPC tests to use the same data as list tests. This was detected as a duplicate corpus file by the OSS-Fuzz CI build.

### What changes are included in this PR?

Undo mistake that led to using the same test data for lists and list-views. Also fix a regression in the CUDA tests, due to reading non-CPU memory when fetching the first offset in a list/binary array.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#46704

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants