Skip to content

Conversation

@benibus
Copy link
Contributor

@benibus benibus commented Mar 17, 2023

This adds a crossbow job for ARROW_IPC, ARROW_PARQUET, and ARROW_CSV - based on a minimal Ubuntu image.

The job primarily aims to test the core Arrow library + basic format support without the full kernel registry provided by ARROW_COMPUTE. Note that ARROW_JSON is implicitly enabled as well, since it's a dependency of ARROW_TESTING.

@github-actions
Copy link

@github-actions
Copy link

⚠️ GitHub issue #34615 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added the awaiting review Awaiting review label Mar 17, 2023
@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch 2 times, most recently from 5510338 to eee9253 Compare March 19, 2023 00:25
@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch from eee9253 to e3d60b9 Compare March 19, 2023 19:13
@benibus benibus marked this pull request as ready for review March 19, 2023 21:06
@kou
Copy link
Member

kou commented Mar 19, 2023

There are still some enabled components:

https://github.com/apache/arrow/actions/runs/4462272725/jobs/7836694435?pr=34617#step:6:1866

--   ARROW_BUILD_UTILITIES=ON [default=OFF]
--       Build Arrow commandline utilities
--   ARROW_CSV=ON [default=OFF]
--       Build the Arrow CSV Parser Module
--   ARROW_FILESYSTEM=ON [default=OFF]
--       Build the Arrow Filesystem Layer
--   ARROW_IPC=ON [default=ON]
--       Build the Arrow IPC extensions
--   ARROW_JSON=ON [default=OFF]
--       Build Arrow with JSON support (requires RapidJSON)
--   ARROW_PARQUET=ON [default=OFF]
--       Build the Parquet libraries

Are they intentional?

@benibus
Copy link
Contributor Author

benibus commented Mar 19, 2023

Are they intentional?

ARROW_BUILD_UTILITIES and ARROW_FILESYSTEM are unintentional (I think the latter is being enabled by PARQUET_REQUIRE_ENCRYPTION=ON). Regarding the others, see my edit to the body message (I attempted to clarify my intent as well).

@benibus
Copy link
Contributor Author

benibus commented Mar 19, 2023

Also, fair warning: I've never actually messed with CI system before so there may be a better way to do this. I just went with one of the preexisting Ubuntu images since it seemed like a reasonable base (to me, at least).

@kou
Copy link
Member

kou commented Mar 20, 2023

OK. Then "minimal" may be misleading. For example, we use "minimal" as "all components are disabled" in https://github.com/apache/arrow/blob/main/cpp/CMakePresets.json#L49 . Can we use other label for this job?

If we enable Parquet, CI time isn't short. How about running this as a nightly CI job instead like test-*-cpp-* in https://github.com/apache/arrow/blob/main/dev/tasks/tasks.yml#L1171-L1256 ?

@benibus
Copy link
Contributor Author

benibus commented Mar 20, 2023

I agree that it's misleading.

We could also handle both cases pretty easily, I think. The "minimal" image I have here could disable everything as it implies. Then, the nightly task would use that image but enable CSV/Parquet.

@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch from e3d60b9 to aa7cb5b Compare March 20, 2023 16:58
@benibus
Copy link
Contributor Author

benibus commented Mar 20, 2023

@github-actions crossbow submit test-ubuntu-20.04-cpp-minimal-with-formats

@github-actions
Copy link

Revision: aa7cb5b4fae7bd3505c8f6dbc976ae5131028bfd

Submitted crossbow builds: ursacomputing/crossbow @ actions-7a8059874a

Task Status
test-ubuntu-20.04-cpp-minimal-with-formats Github Actions

@github-actions
Copy link

Revision: aa7cb5b4fae7bd3505c8f6dbc976ae5131028bfd

Submitted crossbow builds: ursacomputing/crossbow @ actions-89628f20c6

Task Status
test-ubuntu-20.04-cpp-minimal-with-formats Github Actions

@kou
Copy link
Member

kou commented Mar 21, 2023

Could you also update "minimal" in the title and the description of this PR and GH-34615?

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Mar 21, 2023
@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch from aa7cb5b to 9147c15 Compare March 21, 2023 16:08
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Mar 21, 2023
@benibus benibus changed the title GH-34615: [CI][C++] Add CI job for minimal C++ build GH-34615: [CI][C++] Add CI job for basic format support without ARROW_COMPUTE Mar 21, 2023
@benibus benibus requested a review from kou March 21, 2023 16:44
@kou
Copy link
Member

kou commented Mar 21, 2023

@github-actions crossbow submit test-ubuntu-20.04-cpp-minimal-with-formats

@github-actions
Copy link

Revision: 9147c15ce4120b9b6fc0d9b233d50d7bc1d1d2a7

Submitted crossbow builds: ursacomputing/crossbow @ actions-b9824dee1c

Task Status
test-ubuntu-20.04-cpp-minimal-with-formats Github Actions

@raulcd
Copy link
Member

raulcd commented Mar 22, 2023

I've updated the body to include the closes #34655 as suggested.

@raulcd
Copy link
Member

raulcd commented Mar 22, 2023

@github-actions crossbow submit test-cuda-cpp

@github-actions
Copy link

Revision: 9147c15ce4120b9b6fc0d9b233d50d7bc1d1d2a7

Submitted crossbow builds: ursacomputing/crossbow @ actions-2e55c6dbfe

Task Status
test-cuda-cpp Github Actions

@raulcd
Copy link
Member

raulcd commented Mar 22, 2023

The test-cuda-cpp seems to fail with a similar error to the one reported on #34655 :

 '_error_or_value25.status()' failed with Key error: No function registered with name: sort_indices
/arrow/cpp/src/arrow/compute/exec/order_by_node_test.cc:61: Failure
Failed

@benibus let me know if I should open a different issue for that or should be fixed here. Thanks!

@benibus
Copy link
Contributor Author

benibus commented Mar 22, 2023

@raulcd Yeah, I saw that last night. Probably best to handle it here as well.

I'll do a light CMake refactor to hopefully prevent more of these issues since it's too easy to make mistakes ATM.

benibus added 5 commits March 22, 2023 10:11
Some temporal casts will fail without timezone info
compute-expression-test was recently merged into compute-internals-test
as part of a larger refactor. However, the expression tests depend on
additional kernels (i.e. equal) so I re-separated them.
@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch from 9147c15 to 42e52be Compare March 22, 2023 14:16
@benibus benibus force-pushed the GH-34615-minimal-cpp-ci-job branch from 42e52be to 0a79eb3 Compare March 22, 2023 14:18
@benibus
Copy link
Contributor Author

benibus commented Mar 22, 2023

@kou @raulcd test-ubuntu-20.04-cpp-minimal-with-formats and test-cuda-cpp should pass on this branch now.

@benibus
Copy link
Contributor Author

benibus commented Mar 22, 2023

@github-actions crossbow submit test-cuda-cpp

@github-actions
Copy link

Revision: 0a79eb3

Submitted crossbow builds: ursacomputing/crossbow @ actions-53125ee05c

Task Status
test-cuda-cpp Github Actions

@benibus
Copy link
Contributor Author

benibus commented Mar 22, 2023

@github-actions crossbow submit test-ubuntu-20.04-cpp-minimal-with-formats

@github-actions
Copy link

Revision: 0a79eb3

Submitted crossbow builds: ursacomputing/crossbow @ actions-a36f0bd26a

Task Status
test-ubuntu-20.04-cpp-minimal-with-formats Github Actions

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@kou kou merged commit 4487be0 into apache:main Mar 23, 2023
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Mar 23, 2023
@ursabot
Copy link

ursabot commented Mar 23, 2023

Benchmark runs are scheduled for baseline = 3600bd8 and contender = 4487be0. 4487be0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.18% ⬆️0.0%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.09% ⬆️0.03%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 4487be02 ec2-t3-xlarge-us-east-2
[Finished] 4487be02 test-mac-arm
[Finished] 4487be02 ursa-i9-9960x
[Finished] 4487be02 ursa-thinkcentre-m75q
[Finished] 3600bd80 ec2-t3-xlarge-us-east-2
[Finished] 3600bd80 test-mac-arm
[Finished] 3600bd80 ursa-i9-9960x
[Finished] 3600bd80 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

ArgusLi pushed a commit to Bit-Quill/arrow that referenced this pull request May 15, 2023
… ARROW_COMPUTE (apache#34617)

This adds a crossbow job for `ARROW_IPC`, `ARROW_PARQUET`, and `ARROW_CSV` - based on a minimal Ubuntu image.

The job primarily aims to test the core Arrow library + basic format support without the full kernel registry provided by `ARROW_COMPUTE`. Note that `ARROW_JSON` is implicitly enabled as well, since it's a dependency of `ARROW_TESTING`.

* Closes: apache#34615
* Closes: apache#34655 

Authored-by: benibus <bpharks@gmx.com>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

4 participants