Skip to content

[C++] Corner case of temp vector stack overflow check #41738

@zanmato1984

Description

@zanmato1984

Describe the bug, including details regarding any error messages, version, and platform.

For temp vector stack, we will do overflow check when allocating a vector:

// Stack overflow check (see GH-39582).
// XXX cannot return a regular Status because most consumers do not either.
ARROW_CHECK_LE(new_top, buffer_size_)
<< "TempVectorStack::alloc overflow: allocating " << estimated_alloc_size
<< " on top of " << top_ << " in stack of size " << buffer_size_;

Notably it is checking if the top of the stack after allocation is above the buffer_size_, which is carefully padded:
buffer_size_ = EstimatedAllocationSize(size);

which is good.

The question is that the underlying buffer, on the other hand, is only allocated with a non-padded size:

ARROW_ASSIGN_OR_RAISE(auto buffer, AllocateResizableBuffer(size, pool));

This results in that the buffer size that the stack thinks (and subsequently uses to do overflow check) is actually larger than that it actually has. And the overflow check can fail to check certain allocations that may actually exceed the buffer boundary.

I've reproduced such a escaped stack overflow using the following case:

TEST_F(TempVectorStackTest, OverflowCheck) {
  int64_t stack_size = 64;  // exact 64b of actual buffer size.
  auto stack_allocation_size = EstimatedAllocationSize(
      stack_size);  // padded 144b = 64b(buffer) + 64b(padding) + 16b(two guards).
  std::cout << "stack_allocation_size: " << stack_allocation_size << std::endl;
  TempVectorStack stack;
  ASSERT_OK(stack.Init(default_memory_pool(), stack_size));

  uint32_t v1_size = 64;
  auto v1_allocation_size = EstimatedAllocationSize(v1_size);
  std::cout << "v1_allocation_size: " << v1_allocation_size << std::endl;
  TempVectorHolder<uint8_t> v1(&stack, v1_size);  // vector allocation is OK.
  auto v1_data =
      v1.mutable_data();  // data addr will be on buffer addr + 8b(heading guard).
  for (uint32_t i = 0; i < v1_size; ++i) {
    // The last 8b access will exceed buffer's boundary and can be caught by ASAN.
    v1_data[i] = i;
  }
}

(Note that to really catch the issue using ASAN, #41695 has to be reverted in your local)

Component(s)

C++

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions