Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions src/runtime/hexagon/hexagon_buffer_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

#include "hexagon_buffer.h"

Expand Down Expand Up @@ -85,6 +86,18 @@ class HexagonBufferManager {
return hexagon_buffer_map_.empty();
}

//! \brief Returns a vector of currently allocated pointers, owned by the manager.
// Note - this should only be used by the device API to keep track of what
// was in the manager when HexagonDeviceAPI::ReleaseResources is called.
std::vector<void*> current_allocations() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I apologize for the post-merge feedback. But ... having this method seems like a break in the abstraction of the HexagonBufferManager class, to me. If we have to do this it feels like something else is amiss.

Copy link
Contributor Author

@janetsc janetsc Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

I agree, this isn't ideal. However, I wanted to make sure that I don't give a "free pass" to all frees that come after ReleaseResources. The scenario I was trying to avoid was:

User allocates buffer B
Session ends, ReleaseResources is called and B is freed
User frees buffer B as objects are being torn down, which should not throw as it is somewhat expected
User frees buffer C, which SHOULD throw, as it was never allocated in the first place

I added this so that I can check if there's no runtime_hexbuffs, then I make sure that what is being attempted to be freed is something that we owned before ReleaseResources was called. In which case, that Free is a no-op. I also want to detect spurious calls to Free that wouldn't have been legit before ReleaseResources was called.

I'm happy to discuss more, and brainstorm other ways to accomplish this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, what if the HexagonBufferManager allows three states for buffers instead of two.

  • Valid to use, valid to free (Initial state after allocating memory)
  • Invalid to use, valid to free (State after closing a session)
  • Invalid to use, invalid to free (State of unknown pointer)

That way, instead of discarding the HexagonBufferManager runtime_hexbuffs altogether when the session closes, it instead moves all keys from the internal std::unordered_map<void*, std::unique_ptr<HexagonBuffer>> hexagon_buffer_map_; to an internal std::unordered_set<void*>. This avoids exposing any of the buffers to the outside, but does allow the HexagonBufferManager to identify a previously freed pointer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Lunderberg, I think your suggestion would satisfy my concern. What do you think @janetsc? This would allow us to make valid frees a nop and to catch invalid frees as well. If I am understanding correctly runtime_hexbuffs would be instantiated by the Hexagon Device API in the same way that we instantiate the rpc_hexbuffs ... HexagonBufferManager runtime_hexbuffs;? If that is the case, we could potentially have just one HeagonBufferManager that implements the API and internal state that @Lunderberg suggests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also another state not captured - the buffer manager is in an invalid state to allocate more memory.

There needs to be two buffer managers, because on ReleaseResources how do we know which buffers need to be freed?

I'm thinking that a better model is to enforce that all frees should take place before ReleaseResources. That's cleaner. I will investigate which ones are outstanding. All objects should be cleanly torn down by the time the session ends. I will file a task to investigate.

std::vector<void*> allocated;
std::lock_guard<std::mutex> lock(map_mutex_);
for (const auto& [data_ptr, buffer] : hexagon_buffer_map_) {
allocated.push_back(data_ptr);
}
return allocated;
}

private:
//! \brief Contains the HexagonBuffer objects managed by this class.
std::unordered_map<void*, std::unique_ptr<HexagonBuffer>> hexagon_buffer_map_;
Expand Down
48 changes: 41 additions & 7 deletions src/runtime/hexagon/hexagon_device_api.cc
Original file line number Diff line number Diff line change
Expand Up @@ -90,18 +90,23 @@ void* HexagonDeviceAPI::AllocDataSpace(Device dev, int ndim, const int64_t* shap

const size_t typesize = (dtype.bits / 8) * dtype.lanes;

CHECK(runtime_hexbuffs) << "Attempted to allocate Hexagon data with "
<< "HexagonDeviceAPI::AllocDataSpace before initializing resources. "
<< "Please call HexagonDeviceAPI::AcquireResources";

if (ndim == 0) {
// Allocate storage for a single scalar value.
return mgr->AllocateHexagonBuffer(typesize, kHexagonAllocAlignment, mem_scope);
return runtime_hexbuffs->AllocateHexagonBuffer(typesize, kHexagonAllocAlignment, mem_scope);
} else if (ndim == 1) {
// Allocate a single, contiguous memory region.
size_t nbytes = shape[0] * typesize;
return mgr->AllocateHexagonBuffer(nbytes, kHexagonAllocAlignment, mem_scope);
return runtime_hexbuffs->AllocateHexagonBuffer(nbytes, kHexagonAllocAlignment, mem_scope);
} else if (ndim == 2) {
// Allocate the region(s) needed for Hexagon's indirect-tensor format.
size_t nallocs = shape[0];
size_t nbytes = shape[1] * typesize;
return mgr->AllocateHexagonBuffer(nallocs, nbytes, kHexagonAllocAlignment, mem_scope);
return runtime_hexbuffs->AllocateHexagonBuffer(nallocs, nbytes, kHexagonAllocAlignment,
mem_scope);
} else {
return nullptr; // unreachable
}
Expand All @@ -115,13 +120,34 @@ void* HexagonDeviceAPI::AllocDataSpace(Device dev, size_t nbytes, size_t alignme
if (alignment < kHexagonAllocAlignment) {
alignment = kHexagonAllocAlignment;
}
return mgr->AllocateHexagonBuffer(nbytes, alignment, String("global"));
CHECK(runtime_hexbuffs) << "Attempted to allocate Hexagon data with "
<< "HexagonDeviceAPI::AllocDataSpace before initializing resources. "
<< "Please call HexagonDeviceAPI::AcquireResources";
return runtime_hexbuffs->AllocateHexagonBuffer(nbytes, alignment, String("global"));
}

void HexagonDeviceAPI::FreeDataSpace(Device dev, void* ptr) {
CHECK(ptr) << "buffer pointer is null";
CHECK(IsValidDevice(dev)) << "dev.device_type: " << dev.device_type;
mgr->FreeHexagonBuffer(ptr);
if (runtime_hexbuffs) {
runtime_hexbuffs->FreeHexagonBuffer(ptr);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure on the value-add for this else case. The only thing that happens is that we might fail in the CHECK if we free a buffer after we release resources that was not in the released_runtime_buffers. Do we want this behavior? What happens on-system if we just get rid of this else case? We can then get rid of released_runtime_buffers and the need for HexagonBufferManager::current_allocations. We can simply ...

  if (runtime_hexbuffs) {
    runtime_hexbuffs->FreeHexagonBuffer(ptr);
  }

With some comments to explain why the "free" may happen after the "release".

Copy link
Contributor Author

@janetsc janetsc Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My previous reply is related.

If there's a manager, I can just use it to call Free. If there isn't (it has been released, or maybe acquire was never called), I can check what I had allocated before Release. If that buffer is in there, I will not throw. But if there's a bug and the user is attempting to free some pointer we never knew about, I do want that to throw.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making sure all "temporal" frees occur before Release is a good idea and worthwhile endeavor. ✅

I am not convinced that two HexagonBufferManager objects are required. Instead, you could have separate interfaces in a single HexagonBufferManager class to track "permanent" (for RPC e.g.) versus "temporal" allocations. And, in this way, alleviate the need for the current_allocations method which exposes the internals of the HexagonBufferManager class to its users. Something to consider for a future cleanup, perhaps.

auto it = std::find(released_runtime_buffers.begin(), released_runtime_buffers.end(), ptr);
CHECK(it != released_runtime_buffers.end()) << "Attempted to free Hexagon data with "
<< "HexagonDeviceAPI::FreeDataSpace that was not "
<< "allocated during the session.";
}
}

void* HexagonDeviceAPI::AllocRpcBuffer(size_t nbytes, size_t alignment) {
CHECK(nbytes) << "number of bytes is zero";
CHECK(alignment) << "alignment is zero";
return rpc_hexbuffs.AllocateHexagonBuffer(nbytes, alignment, String("global"));
}

void HexagonDeviceAPI::FreeRpcBuffer(void* ptr) {
CHECK(ptr) << "buffer pointer is null";
rpc_hexbuffs.FreeHexagonBuffer(ptr);
}

// WorkSpace: runtime allocations for Hexagon
Expand All @@ -137,7 +163,10 @@ void* HexagonDeviceAPI::AllocWorkspace(Device dev, size_t size, DLDataType type_

void HexagonDeviceAPI::FreeWorkspace(Device dev, void* data) {
CHECK(IsValidDevice(dev)) << "dev.device_type: " << dev.device_type;
CHECK(mgr->count(data) != 0)
CHECK(runtime_hexbuffs) << "Attempted to free Hexagon workspace with "
<< "HexagonDeviceAPI::FreeWorkspace outside of a session. "
<< "Please call HexagonDeviceAPI::AcquireResources";
CHECK(runtime_hexbuffs->count(data) != 0)
<< "Attempt made to free unknown or already freed workspace allocation";
dmlc::ThreadLocalStore<HexagonWorkspacePool>::Get()->FreeWorkspace(dev, data);
}
Expand All @@ -160,8 +189,13 @@ void HexagonDeviceAPI::CopyDataFromTo(DLTensor* from, DLTensor* to, TVMStreamHan
CHECK_EQ(from->byte_offset, 0);
CHECK_EQ(to->byte_offset, 0);
CHECK_EQ(GetDataSize(*from), GetDataSize(*to));
CHECK(runtime_hexbuffs) << "Attempted to copy Hexagon data with "
<< "HexagonDeviceAPI::CopyDataFromTo before initializing resources. "
<< "Please call HexagonDeviceAPI::AcquireResources";

auto lookup_hexagon_buffer = [this](void* ptr) -> HexagonBuffer* { return mgr->find(ptr); };
auto lookup_hexagon_buffer = [this](void* ptr) -> HexagonBuffer* {
return runtime_hexbuffs->find(ptr);
};

HexagonBuffer* hex_from_buf = lookup_hexagon_buffer(from->data);
HexagonBuffer* hex_to_buf = lookup_hexagon_buffer(to->data);
Expand Down
36 changes: 24 additions & 12 deletions src/runtime/hexagon/hexagon_device_api.h
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ class HexagonDeviceAPI final : public DeviceAPI {
static HexagonDeviceAPI* Global();

//! \brief Constructor
HexagonDeviceAPI() { mgr = &hexbuffs; }
HexagonDeviceAPI() {}

//! \brief Destructor
~HexagonDeviceAPI() {}
Expand All @@ -60,7 +60,7 @@ class HexagonDeviceAPI final : public DeviceAPI {

CHECK_EQ(runtime_hexbuffs, nullptr);
runtime_hexbuffs = std::make_unique<HexagonBufferManager>();
mgr = runtime_hexbuffs.get();
released_runtime_buffers.clear();

CHECK_EQ(runtime_threads, nullptr);
runtime_threads = std::make_unique<HexagonThreadManager>(threads, stack_size, pipe_size);
Expand All @@ -78,10 +78,10 @@ class HexagonDeviceAPI final : public DeviceAPI {
runtime_threads.reset();

CHECK(runtime_hexbuffs) << "runtime_hexbuffs was not created in AcquireResources";
if (runtime_hexbuffs && !runtime_hexbuffs->empty()) {
LOG(INFO) << "runtime_hexbuffs was not empty in ReleaseResources";
if (!runtime_hexbuffs->empty()) {
DLOG(INFO) << "runtime_hexbuffs was not empty in ReleaseResources";
released_runtime_buffers = runtime_hexbuffs->current_allocations();
}
mgr = &hexbuffs;
runtime_hexbuffs.reset();

CHECK(runtime_vtcm) << "runtime_vtcm was not created in AcquireResources";
Expand All @@ -106,6 +106,12 @@ class HexagonDeviceAPI final : public DeviceAPI {
//! \brief Free the allocated HexagonBuffer.
void FreeDataSpace(Device dev, void* ptr) final;

//! \brief Hexagon-only interface to allocate buffers used for the RPC server
void* AllocRpcBuffer(size_t nbytes, size_t alignment);

//! \brief Hexagon-only interface to free buffers used for the RPC server
void FreeRpcBuffer(void* ptr);

/*! \brief Request a dynamically allocated HexagonBuffer from a workspace pool.
* \returns The underlying allocation pointer.
*/
Expand Down Expand Up @@ -190,15 +196,21 @@ class HexagonDeviceAPI final : public DeviceAPI {
(DLDeviceType(dev.device_type) == kDLCPU);
}

//! \brief Manages underlying HexagonBuffer allocations
// runtime_hexbuffs is used for runtime allocations. It is created
// with a call to AcquireResources, and destroyed on ReleaseResources.
// hexbuffs is used for all allocations outside of the session lifetime.
HexagonBufferManager hexbuffs;
//! \brief Manages RPC HexagonBuffer allocations
// rpc_hexbuffs is used only in Alloc/FreeRpcBuffer. It is static because it lives for the
// lifetime of the static Device API.
HexagonBufferManager rpc_hexbuffs;

//! \brief Manages runtime HexagonBuffer allocations
// runtime_hexbuffs is used for runtime allocations, separate from rpc_hexbuffs. It is created
// with a call to AcquireResources, and destroyed on ReleaseResources. The buffers in this
// manager are scoped to the lifetime of a user application session.
std::unique_ptr<HexagonBufferManager> runtime_hexbuffs;

//! \brief Current buffer manager
HexagonBufferManager* mgr;
//! \brief Keeps a list of released runtime HexagonBuffer allocations
// ReleaseResources can be called when there are still buffers in runtime_hexbuffs. This list
// stores the buffers that were released.
std::vector<void*> released_runtime_buffers;

//! \brief Thread manager
std::unique_ptr<HexagonThreadManager> runtime_threads;
Expand Down
32 changes: 31 additions & 1 deletion src/runtime/hexagon/rpc/hexagon/rpc_server.cc
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ extern "C" {
#include "../../../library_module.h"
#include "../../../minrpc/minrpc_server.h"
#include "../../hexagon/hexagon_common.h"
#include "../../hexagon/hexagon_device_api.h"
#include "hexagon_rpc.h"

namespace tvm {
Expand Down Expand Up @@ -145,6 +146,35 @@ class HexagonIOHandler {
uint32_t write_buffer_available_length_;
};

// Internal allocator that redirects alloc to TVM's C API.
template <typename TIOHandler>
class HexagonPageAllocator {
public:
using ArenaPageHeader = tvm::support::ArenaPageHeader;

explicit HexagonPageAllocator(TIOHandler* io) : io_(io) {}

ArenaPageHeader* allocate(size_t min_size) {
size_t npages = ((min_size + kPageSize - 1) / kPageSize);
void* data;

data = HexagonDeviceAPI::Global()->AllocRpcBuffer(npages * kPageSize, kPageAlign);

ArenaPageHeader* header = static_cast<ArenaPageHeader*>(data);
header->size = npages * kPageSize;
header->offset = sizeof(ArenaPageHeader);
return header;
}

void deallocate(ArenaPageHeader* page) { HexagonDeviceAPI::Global()->FreeRpcBuffer(page); }

static const constexpr int kPageSize = 2 << 10;
static const constexpr int kPageAlign = 8;

private:
TIOHandler* io_;
};

class HexagonRPCServer {
public:
explicit HexagonRPCServer(uint8_t* receive_buffer, size_t receive_buffer_size_bytes)
Expand Down Expand Up @@ -185,7 +215,7 @@ class HexagonRPCServer {

private:
HexagonIOHandler io_;
MinRPCServer<HexagonIOHandler> rpc_server_;
MinRPCServer<HexagonIOHandler, HexagonPageAllocator> rpc_server_;
};

} // namespace hexagon
Expand Down
27 changes: 18 additions & 9 deletions tests/cpp-runtime/hexagon/hexagon_device_api_tests.cc
Original file line number Diff line number Diff line change
Expand Up @@ -147,22 +147,31 @@ TEST_F(HexagonDeviceAPITest, DISABLED_alloc_free_diff_dev) {
EXPECT_THROW(hexapi->FreeDataSpace(cpu_dev, buf), InternalError);
}

// Alloc a non-runtime buffer
// Alloc a runtime buffer
// "Release" resources for runtime
// Verify the runtime buffer cannot be freed, but the non-runtime buffer can
// This test should be run last
TEST_F(HexagonDeviceAPITest, leak_resources) {
// Ensure runtime buffer manager is properly configured and destroyed
// in Acquire/Release
TEST_F(HexagonDeviceAPITest, runtime_buffer_manager) {
hexapi->ReleaseResources();
void* pre_runtime_buf = hexapi->AllocDataSpace(hex_dev, nbytes, alignment, int8);
CHECK(pre_runtime_buf != nullptr);
EXPECT_THROW(hexapi->AllocDataSpace(hex_dev, nbytes, alignment, int8), InternalError);
hexapi->AcquireResources();
void* runtime_buf = hexapi->AllocDataSpace(hex_dev, nbytes, alignment, int8);
CHECK(runtime_buf != nullptr);
hexapi->ReleaseResources();
hexapi->FreeDataSpace(hex_dev, runtime_buf);
hexapi->AcquireResources();
EXPECT_THROW(hexapi->FreeDataSpace(hex_dev, runtime_buf), InternalError);
hexapi->FreeDataSpace(hex_dev, pre_runtime_buf);
}

// Ensure RPC buffer manager is always available
TEST_F(HexagonDeviceAPITest, rpc_buffer_manager) {
void* rpc_buf;
rpc_buf = hexapi->AllocRpcBuffer(nbytes, alignment);
CHECK(rpc_buf != nullptr);
hexapi->ReleaseResources();
hexapi->FreeRpcBuffer(rpc_buf);
rpc_buf = hexapi->AllocRpcBuffer(nbytes, alignment);
CHECK(rpc_buf != nullptr);
hexapi->AcquireResources();
hexapi->FreeRpcBuffer(rpc_buf);
}

// Ensure thread manager is properly configured and destroyed
Expand Down