Description
Describe the bug, including details regarding any error messages, version, and platform.
We have been seeing Valgrind errors for a while now in R.
==774== HEAP SUMMARY:
==774== in use at exit: 351,781,345 bytes in 69,559 blocks
==774== total heap usage: 16,807,335 allocs, 16,737,776 frees, 9,804,514,696 bytes allocated
==774==
==774== 400 bytes in 1 blocks are possibly lost in loss record 252 of 3,243
==774== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==774== by 0x40147D9: calloc (rtld-malloc.h:44)
==774== by 0x40147D9: allocate_dtv (dl-tls.c:375)
==774== by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==774== by 0x4DA67B4: allocate_stack (allocatestack.c:430)
==774== by 0x4DA67B4: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==774== by 0x11D614D3: je_arrow_private_je_pthread_create_wrapper (background_thread.c:47)
==774== by 0x11D614D3: background_thread_create_signals_masked (background_thread.c:287)
==774== by 0x11D614D3: background_thread_create_locked (background_thread.c:495)
==774== by 0x11D6275C: je_arrow_private_je_background_thread_create (background_thread.c:520)
==774== by 0x400647D: call_init.part.0 (dl-init.c:70)
==774== by 0x4006567: call_init (dl-init.c:33)
==774== by 0x4006567: _dl_init (dl-init.c:117)
==774== by 0x4E85AF4: _dl_catch_exception (dl-error-skeleton.c:182)
==774== by 0x400DFF5: dl_open_worker (dl-open.c:808)
==774== by 0x400DFF5: dl_open_worker (dl-open.c:771)
==774== by 0x4E85A97: _dl_catch_exception (dl-error-skeleton.c:208)
==774== by 0x400E34D: _dl_open (dl-open.c:883)
==774== by 0x4DA163B: dlopen_doit (dlopen.c:56)
==774==
==774== 723 (144 direct, 579 indirect) bytes in 1 blocks are definitely lost in loss record 288 of 3,243
==774== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==774== by 0x12F2B64D: CRYPTO_zalloc (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F10997: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F008F9: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F205E8: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F2050B: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F3FB2A: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x13009227: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x1300987D: ??? (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x12F0D392: EVP_MAC_fetch (in /usr/lib/x86_64-linux-gnu/libcrypto.so.3)
==774== by 0x1183F6B2: Aws::Utils::Crypto::Sha256HMACOpenSSLImpl::Calculate(Aws::Utils::Array<unsigned char> const&, Aws::Utils::Array<unsigned char> const&) (in /usr/local/RDvalgrind/lib/R/site-library/arrow/libs/arrow.so)
==774== by 0x11B6CF4F: Aws::Utils::Crypto::Sha256HMAC::Calculate(Aws::Utils::Array<unsigned char> const&, Aws::Utils::Array<unsigned char> const&) (in /usr/local/RDvalgrind/lib/R/site-library/arrow/libs/arrow.so)
==774==
==774== 1,248 bytes in 3 blocks are possibly lost in loss record 330 of 3,243
==774== at 0x484DA83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==774== by 0x40147D9: calloc (rtld-malloc.h:44)
==774== by 0x40147D9: allocate_dtv (dl-tls.c:375)
==774== by 0x40147D9: _dl_allocate_tls (dl-tls.c:634)
==774== by 0x4DA67B4: allocate_stack (allocatestack.c:430)
==774== by 0x4DA67B4: pthread_create@@GLIBC_2.34 (pthread_create.c:647)
==774== by 0x11D62513: je_arrow_private_je_pthread_create_wrapper (background_thread.c:47)
==774== by 0x11D62513: background_thread_create_signals_masked (background_thread.c:287)
==774== by 0x11D62513: check_background_thread_creation (background_thread.c:332)
==774== by 0x11D62513: background_thread0_work (background_thread.c:370)
==774== by 0x11D62513: background_work (background_thread.c:412)
==774== by 0x11D62513: background_thread_entry (background_thread.c:444)
==774== by 0x4DA5AC2: start_thread (pthread_create.c:442)
==774== by 0x4E36A03: clone (clone.S:100)
==774==
==774== 12,048 bytes in 4 blocks are possibly lost in loss record 1,495 of 3,243
==774== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==774== by 0x4013E4D: malloc (rtld-malloc.h:56)
==774== by 0x4013E4D: allocate_dtv_entry (dl-tls.c:684)
==774== by 0x4013E4D: allocate_and_init (dl-tls.c:709)
==774== by 0x4013E4D: tls_get_addr_tail (dl-tls.c:907)
==774== by 0x401820B: __tls_get_addr (tls_get_addr.S:55)
==774== by 0x11D616D3: tsd_state_get (tsd.h:269)
==774== by 0x11D616D3: tsd_fetch_impl (tsd.h:421)
==774== by 0x11D616D3: tsd_fetch_min (tsd.h:433)
==774== by 0x11D616D3: tsd_internal_fetch (tsd.h:439)
==774== by 0x11D616D3: background_thread_entry (background_thread.c:444)
==774== by 0x4DA5AC2: start_thread (pthread_create.c:442)
==774== by 0x4E36A03: clone (clone.S:100)
==774==
==774== LEAK SUMMARY:
==774== definitely lost: 144 bytes in 1 blocks
==774== indirectly lost: 579 bytes in 11 blocks
==774== possibly lost: 13,696 bytes in 8 blocks
==774== still reachable: 351,098,631 bytes in 69,538 blocks
==774== of which reachable via heuristic:
==774== length64 : 456 bytes in 2 blocks
==774== newarray : 4,264 bytes in 1 blocks
==774== suppressed: 668,295 bytes in 1 blocks
==774== Reachable blocks (those to which a pointer was found) are not shown.
==774== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==774==
==774== For lists of detected and suppressed errors, rerun with: -s
==774== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 1 from 1)
Judging from when these started, I suspect one of these PRs is what introduced this:
* #41419
* #41295
* #41421
* #41366
* #41434
It turns out that something changed in how we look for binaries when setting up the Valgrind run, and that was causing these errors. The strange thing is that there were no code changes in this area, so nothing should have caused this build to start using libarrow binaries, yet it seemingly did:
*** No nightly binaries were found for version 16.0.0.9000: falling back to libarrow build from source
*** Latest available nightly for 16.0.0.9000: 16.0.0.100000045
I've hardcoded don't-download-binaries in #42249, which resolves the issue, but I'm curious whether you know what changed around then, @assignUser? We might also need to check the other builds that are meant to be source builds and confirm that they still are.
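For checking the other builds, something like the following rough sketch could scan a saved build log for the two messages quoted above. It assumes that the "falling back to libarrow build from source" line is a reliable sign of a source build, which I have not verified for every code path; `summarize_build_log` is a hypothetical helper, not part of the arrow tooling.

```python
import re

# Marker copied from the build log quoted above. Treating its presence as
# evidence of a source build is an assumption of this sketch.
SOURCE_FALLBACK = "falling back to libarrow build from source"

# Matches the "Latest available nightly for <requested>: <latest>" line.
NIGHTLY_RE = re.compile(r"Latest available nightly for (\S+): (\S+)")


def summarize_build_log(log_text):
    """Report whether the log mentions a source fallback and which versions it names."""
    match = NIGHTLY_RE.search(log_text)
    return {
        "source_fallback": SOURCE_FALLBACK in log_text,
        "requested_version": match.group(1) if match else None,
        "latest_nightly": match.group(2) if match else None,
    }


log = """\
*** No nightly binaries were found for version 16.0.0.9000: falling back to libarrow build from source
*** Latest available nightly for 16.0.0.9000: 16.0.0.100000045
"""
print(summarize_build_log(log))
```

A build whose log lacks the fallback line would come back with `source_fallback` set to `False`, which would be the cue to look at that job more closely.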
Component(s)
C++, R