-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-14236: [C++] Add GCS testbench for testing #11375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cpp/src/arrow/filesystem/gcsfs.cc
Outdated
| private: | ||
| Result<FileInfo> GetFileInfoImpl(GcsPath const& path, google::cloud::Status status, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should status be passed with const ref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
kou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already use boost::process for S3 tests. So it's not a new dependency.
Could you clarify what part of the change are you commenting on? Is it the changes to |
Sure. This is an answer of
. I wanted to say that we don't need to think about |
Thanks. I removed the change to |
|
C++ LGTM, @kou any more concerns? |
kou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the change to
vcpkg.json.
Sorry for my poor English skill. It's OK that we add boost-process to vcpkg.json in this pull request. I just wanted to say that there is no problem that we use boost::process for GCS test...
I think that we can merge this with minor Dockerfile related improvements.
|
@github-actions crossbow submit -g nightly |
|
Revision: 6aa1b3e36fee6231fe303e0aeafffac416bae6b8 Submitted crossbow builds: ursacomputing/crossbow @ actions-916 |
|
Can we set |
How large is the slowdown? Compiling embedded libraries from scratch adds to CI delays. |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much @coryan . Here are a number of minor comments.
cpp/src/arrow/filesystem/gcsfs.cc
Outdated
| @@ -70,7 +131,9 @@ bool GcsFileSystem::Equals(const FileSystem& other) const { | |||
| } | |||
|
|
|||
| Result<FileInfo> GcsFileSystem::GetFileInfo(const std::string& path) { | |||
| return Status::NotImplemented("The GCS FileSystem is not fully implemented"); | |||
| auto p = GcsPath::FromString(path); | |||
| if (!p.ok()) return std::move(p).status(); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can simply write:
ARROW_ASSIGN_OR_RAISE(auto p, GcsPath::FromString(path));There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"simply" ... hmmm ... Done.
| return Status::Invalid(os.str()); | ||
| case google::cloud::StatusCode::kDeadlineExceeded: | ||
| case google::cloud::StatusCode::kNotFound: | ||
| // TODO: it is unclear if a better mapping would be possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest IOError for all external errors (such as kDeadlineExceeded, kPermissionDenied, kUnauthenticated...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| server_process_ = bp::child(boost::this_process::environment(), exe_path, "-m", | ||
| "testbench", "--port", port_); | ||
|
|
||
| // Create a bucket in the testbench so additional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this sentence incomplete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| @@ -59,6 +155,11 @@ TEST(GcsFileSystem, FileSystemCompare) { | |||
| EXPECT_FALSE(a->Equals(*b)); | |||
| } | |||
|
|
|||
| TEST_F(GcsIntegrationTest, MakeBucket) { | |||
| auto fs = internal::MakeGcsFileSystemForTest(TestGcsOptions()); | |||
| ASSERT_OK(fs->GetFileInfo(kPreexistingBucket)); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note we have utilities for testing in filesystem/test_util.h that would allow this test to be a bit more interesting, e.g.:
AssertFileInfo(fs.get(), kPreexistingBucket, FileType::Directory);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
The tests hang here in #0 0x00007fa8e4cc8aff in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fa8e542a769 in ?? () from /lib/x86_64-linux-gnu/libcurl.so.4
#2 0x00007fa8e5424342 in ?? () from /lib/x86_64-linux-gnu/libcurl.so.4
#3 0x00007fa8e5424536 in curl_multi_poll () from /lib/x86_64-linux-gnu/libcurl.so.4
#4 0x00007fa8e541ce24 in curl_easy_perform () from /lib/x86_64-linux-gnu/libcurl.so.4
#5 0x00007fa8e83a4b47 in google::cloud::storage::v1::internal::CurlHandle::EasyPerform (this=0x7ffe44de3958)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/internal/curl_handle.h:130
#6 0x00007fa8e83a460e in google::cloud::storage::v1::internal::CurlRequest::MakeRequestImpl (this=0x7ffe44de3880)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/internal/curl_request.cc:125
#7 0x00007fa8e83a4031 in google::cloud::storage::v1::internal::CurlRequest::MakeRequest (this=0x7ffe44de3880, payload="")
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/internal/curl_request.cc:77
#8 0x00007fa8e84598db in google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::DoMetadataServerGetRequest (this=0x55c6ae7c7700, path="/computeMetadata/v1/instance/service-accounts/default/", recursive=true)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/compute_engine_credentials.h:147
#9 0x00007fa8e845935b in google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::RetrieveServiceAccountInfo (this=0x55c6ae7c7700)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/compute_engine_credentials.h:158
#10 0x00007fa8e845835c in google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::Refresh (this=0x55c6ae7c7700) at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/compute_engine_credentials.h:179
#11 0x00007fa8e8456a21 in google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::AuthorizationHeader[abi:cxx11]()::{lambda()#1}::operator()() const (this=0x55c6ae7c7700)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/compute_engine_credentials.h:91
#12 0x00007fa8e845874e in google::cloud::storage::v1::oauth2::RefreshingCredentialsWrapper::AuthorizationHeader<google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::AuthorizationHeader()::{lambda()#1}>(std::chrono::time_point<std::chrono::_V2::system_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >, google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::AuthorizationHeader()::{lambda()#1}) const (this=0x55c6ae7c7738, now=..., refresh_fn=...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/refreshing_credentials_wrapper.h:49
#13 0x00007fa8e8456b2f in google::cloud::storage::v1::oauth2::ComputeEngineCredentials<google::cloud::storage::v1::internal::CurlRequestBuilder, std::chrono::_V2::system_clock>::AuthorizationHeader[abi:cxx11]() (this=0x55c6ae7c7700)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/compute_engine_credentials.h:91
#14 0x00007fa8e8453294 in google::cloud::storage::v1::oauth2::GoogleDefaultCredentials (options=...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/oauth2/google_credentials.cc:168
#15 0x00007fa8e844cf70 in google::cloud::storage::v1::internal::Visitor::visit (this=0x7ffe44de4020)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/internal/unified_rest_credentials.cc:45
#16 0x00007fa8e8741c8c in google::cloud::v1::internal::GoogleDefaultCredentialsConfig::dispatch (this=0x55c6ae7848d0, v=warning: RTTI symbol not found for class 'google::cloud::storage::v1::internal::MapCredentials(std::shared_ptr<google::cloud::v1::Credentials> const&)::Visitor'
...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/internal/credentials_impl.h:71
#17 0x00007fa8e874b020 in google::cloud::v1::internal::CredentialsVisitor::dispatch (credentials=..., visitor=warning: RTTI symbol not found for class 'google::cloud::storage::v1::internal::MapCredentials(std::shared_ptr<google::cloud::v1::Credentials> const&)::Visitor'
...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/internal/credentials_impl.cc:24
#18 0x00007fa8e844d362 in google::cloud::storage::v1::internal::MapCredentials (
credentials=std::shared_ptr<class google::cloud::v1::Credentials> (use count 1, weak count 0) = {...})
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/internal/unified_rest_credentials.cc:71
#19 0x00007fa8e832bd74 in google::cloud::storage::v1::internal::DefaultOptionsWithCredentials (opts=...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/client_options.cc:230
#20 0x00007fa8e8311112 in google::cloud::storage::v1::Client::Client (this=0x55c6ae7c76d0, opts=...)
at /build/cpp/google_cloud_cpp_ep-prefix/src/google_cloud_cpp_ep/google/cloud/storage/client.cc:41
#21 0x00007fa8e7e40a9d in arrow::fs::GcsFileSystem::Impl::Impl (this=0x55c6ae7c7690, o=...) at /arrow/cpp/src/arrow/filesystem/gcsfs.cc:85
#22 0x00007fa8e7e45a69 in __gnu_cxx::new_allocator<arrow::fs::GcsFileSystem::Impl>::construct<arrow::fs::GcsFileSystem::Impl, arrow::fs::GcsOptions const&> (
this=0x7ffe44de439f, __p=0x55c6ae7c7690) at /usr/include/c++/9/ext/new_allocator.h:147
#23 0x00007fa8e7e45746 in std::allocator_traits<std::allocator<arrow::fs::GcsFileSystem::Impl> >::construct<arrow::fs::GcsFileSystem::Impl, arrow::fs::GcsOptions const&> (__a=..., __p=0x55c6ae7c7690) at /usr/include/c++/9/bits/alloc_traits.h:484
#24 0x00007fa8e7e45314 in std::_Sp_counted_ptr_inplace<arrow::fs::GcsFileSystem::Impl, std::allocator<arrow::fs::GcsFileSystem::Impl>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<arrow::fs::GcsOptions const&> (this=0x55c6ae7c7680, __a=...) at /usr/include/c++/9/bits/shared_ptr_base.h:548
#25 0x00007fa8e7e44a70 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<arrow::fs::GcsFileSystem::Impl, std::allocator<arrow::fs::GcsFileSystem::Impl>, arrow::fs::GcsOptions const&> (this=0x55c6ae7c7670, __p=@0x55c6ae7c7668: 0x0, __a=...) at /usr/include/c++/9/bits/shared_ptr_base.h:679
#26 0x00007fa8e7e43f0a in std::__shared_ptr<arrow::fs::GcsFileSystem::Impl, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<arrow::fs::GcsFileSystem::Impl>, arrow::fs::GcsOptions const&> (this=0x55c6ae7c7668, __tag=...) at /usr/include/c++/9/bits/shared_ptr_base.h:1344
#27 0x00007fa8e7e433a1 in std::shared_ptr<arrow::fs::GcsFileSystem::Impl>::shared_ptr<std::allocator<arrow::fs::GcsFileSystem::Impl>, arrow::fs::GcsOptions const&> (
this=0x55c6ae7c7668, __tag=...) at /usr/include/c++/9/bits/shared_ptr.h:359
#28 0x00007fa8e7e42a33 in std::allocate_shared<arrow::fs::GcsFileSystem::Impl, std::allocator<arrow::fs::GcsFileSystem::Impl>, arrow::fs::GcsOptions const&> (__a=...)
at /usr/include/c++/9/bits/shared_ptr.h:702
#29 0x00007fa8e7e41ad3 in std::make_shared<arrow::fs::GcsFileSystem::Impl, arrow::fs::GcsOptions const&> () at /usr/include/c++/9/bits/shared_ptr.h:718
#30 0x00007fa8e7e3ed70 in arrow::fs::GcsFileSystem::GcsFileSystem (this=0x55c6ae7c7620, options=..., context=...) at /arrow/cpp/src/arrow/filesystem/gcsfs.cc:202
#31 0x00007fa8e7e3ede8 in arrow::fs::internal::MakeGcsFileSystemForTest (options=...) at /arrow/cpp/src/arrow/filesystem/gcsfs.cc:209
#32 0x000055c6ac5fca88 in arrow::fs::(anonymous namespace)::GcsFileSystem_FileSystemCompare_Test::TestBody (this=0x55c6ae784800)
at /arrow/cpp/src/arrow/filesystem/gcsfs_test.cc:145
#33 0x00007fa8e55d1634 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x55c6ae784800, method=&virtual testing::Test::TestBody(), location=0x7fa8e55e857b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#34 0x00007fa8e55c932b in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x55c6ae784800, method=&virtual testing::Test::TestBody(), location=0x7fa8e55e857b "the test body") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#35 0x00007fa8e559d5ec in testing::Test::Run (this=0x55c6ae784800) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2682
#36 0x00007fa8e559e02e in testing::TestInfo::Run (this=0x55c6ae7c5870) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2861
#37 0x00007fa8e559e948 in testing::TestSuite::Run (this=0x55c6ae7c5470) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:3015
#38 0x00007fa8e55ae3f6 in testing::internal::UnitTestImpl::RunAllTests (this=0x55c6ae7c51d0) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5855
#39 0x00007fa8e55d2b62 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x55c6ae7c51d0, method=(bool (testing::internal::UnitTestImpl::*)(class testing::internal::UnitTestImpl * const)) 0x7fa8e55adfc2 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7fa8e55e9068 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2607
#40 0x00007fa8e55ca569 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x55c6ae7c51d0, method=(bool (testing::internal::UnitTestImpl::*)(class testing::internal::UnitTestImpl * const)) 0x7fa8e55adfc2 <testing::internal::UnitTestImpl::RunAllTests()>, location=0x7fa8e55e9068 "auxiliary test code (environments or event listeners)") at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:2643
#41 0x00007fa8e55acb0d in testing::UnitTest::Run (this=0x7fa8e5616ba0 <testing::UnitTest::GetInstance()::instance>) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest.cc:5438
#42 0x00007fa8e561829a in RUN_ALL_TESTS () at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/include/gtest/gtest.h:2490
#43 0x00007fa8e561821c in main (argc=1, argv=0x7ffe44de4ba8) at /build/cpp/googletest_ep-prefix/src/googletest_ep/googletest/src/gtest_main.cc:52
#44 0x00007fa8e4bda0b3 in __libc_start_main () from /lib/x86_64-linux-gnu/libc.so.6
#45 0x000055c6ac5fb0de in _start () |
Which ones? Some of them already have these settings, e.g.: https://github.com/apache/arrow/pull/11375/checks?check_run_id=3863850315#step:8:2266
That obviously depends on how much caching is going on and what the CI machine specs are. For reference, I used |
I've now run |
Oh, sorry. I missed them. |
|
I think I addressed all the comments, please let me know if I missed something. |
|
Hmm, |
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 (but would be nice to eliminate the slowness in GcsFileSystem.FileSystemCompare)
Ugh, that looks painful. On my workstation that particular test completes in milliseconds: And the full test with with Which is to say: this would probably take some more effort to troubleshoot. I will open a separate bug. |
|
|
Thank you, I'll merge this one then. |
|
@github-actions crossbow submit -g cpp |
|
Revision: f3edc48 Submitted crossbow builds: ursacomputing/crossbow @ actions-935 |
|
Benchmark runs are scheduled for baseline = 42b0665 and contender = 74caee2. 74caee2 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
No description provided.