…ssion_options object. Return error if compiled model doesn't contain EPContext nodes (compile API only)
ce389f1

/** \brief Sets the file path for the output ONNX model generated by CompileModel.
 *
 * If the output model path is not specified and the output model is not to be stored in a buffer,

 * \param[in] model_compile_options The OrtModelCompilationOptions instance.
 * \param[in] allocator The allocator used to allocate the buffer for the compiled model.
 * \param[out] output_model_buffer_ptr Pointer to the buffer that stores the compiled model.
 * \param[out] output_model_buffer_size_ptr Pointer set to the size of output buffer in bytes.

ORT_DEFINE_RELEASE(Node);
ORT_DEFINE_RELEASE(Graph);
ORT_DEFINE_RELEASE(Model);
#if !defined(ORT_MINIMAL_BUILD)
My understanding is that ORT_MINIMAL_BUILD is a build option for ORT itself, not for the customer's source code.
Their code will probably have no notion of a minimal build, so this macro will always be undefined there. This may cause a discrepancy between the binary and the header that the customer sees.
I suggest we do not leak our internal defines into public headers and adjust the source code assumptions accordingly.
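
One way to act on this suggestion, as a minimal sketch under stated assumptions: keep the declaration unconditional in the public header and report the missing feature at runtime from inside the library. Only `ORT_MINIMAL_BUILD` comes from the discussion above; `Status`, `CompileModelSketch`, and `kCompileSupported` are hypothetical names for illustration, not ORT API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a status type; not the real OrtStatus.
struct Status {
  bool ok;
  std::string message;
};

// --- What the public header would contain: no build-option macros. ---
// The declaration is always visible, so the header matches every binary.
Status CompileModelSketch(const std::string& input_model_path);

// --- What the library source would contain. ---
// The build option is consulted only inside the library, where it is
// actually defined by the build system.
#if defined(ORT_MINIMAL_BUILD)
constexpr bool kCompileSupported = false;
#else
constexpr bool kCompileSupported = true;
#endif

Status CompileModelSketch(const std::string& input_model_path) {
  if (!kCompileSupported) {
    // Callers get a runtime error instead of a missing symbol.
    return {false, "Compile API is not supported in this build"};
  }
  if (input_model_path.empty()) {
    return {false, "Input model path is empty"};
  }
  return {true, ""};
}
```

With this shape, customer code compiles against the same header regardless of how the library was built, and an unsupported build surfaces as an error status from the call.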
using Base::Base;

explicit ModelCompilationOptions(std::nullptr_t) {}  ///< Create an empty ModelCompilationOptions object, must be assigned a valid one to be used.
explicit ModelCompilationOptions(OrtModelCompilationOptions* p)  ///< Takes ownership of an OrtModelCompilationOptions

"Cannot serialize ONNX ModelProto larger than 2GB");

OrtAllocator* allocator = ep_context_gen_options.output_model_buffer_allocator;
void* buffer = allocator->Alloc(allocator, buffer_size);

ORT_RETURN_IF(buffer_size > static_cast<size_t>(std::numeric_limits<int>::max()),
              "Cannot serialize ONNX ModelProto larger than 2GB");

OrtAllocator* allocator = ep_context_gen_options.output_model_buffer_allocator;
Typically, we wrap incoming OrtAllocator pointers in a special wrapper class, so that they are represented as AllocatorPtr. The wrapper can be used safely in other places in our code, and it can be held in a smart pointer.
Perhaps we need to check whether storing a raw pointer here is a good idea.
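
The concern can be illustrated with a small self-contained sketch: if serialization throws after `Alloc`, a raw pointer leaks the buffer, while an owning smart pointer returns it to the allocator. This is not ORT's `AllocatorPtr`; `RawAllocator`, `AllocatorDeleter`, `BufferPtr`, and `SerializeToUserBuffer` are hypothetical stand-ins, and the `Free` callback is assumed by analogy with the `Alloc(allocator, size)` call shape shown in the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <limits>
#include <memory>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical C-style allocator with a self-pointer calling convention.
struct RawAllocator {
  void* (*Alloc)(RawAllocator* self, size_t size);
  void (*Free)(RawAllocator* self, void* p);
  int live_allocations;  // bookkeeping so leaks are observable
};

void* CountingAlloc(RawAllocator* self, size_t size) {
  ++self->live_allocations;
  return std::malloc(size);
}

void CountingFree(RawAllocator* self, void* p) {
  --self->live_allocations;
  std::free(p);
}

// Deleter that returns memory to the allocator it came from.
struct AllocatorDeleter {
  RawAllocator* allocator;
  void operator()(void* p) const {
    if (p != nullptr) allocator->Free(allocator, p);
  }
};
using BufferPtr = std::unique_ptr<void, AllocatorDeleter>;

// Copies `bytes` into a buffer from `allocator` and hands ownership to the caller.
// `fail_midway` simulates serialization throwing after the allocation.
void SerializeToUserBuffer(RawAllocator* allocator, const std::string& bytes, bool fail_midway,
                           void** out_buffer, size_t* out_size) {
  if (bytes.size() > static_cast<size_t>(std::numeric_limits<int>::max())) {
    throw std::length_error("Cannot serialize more than 2GB");
  }
  BufferPtr buffer(allocator->Alloc(allocator, bytes.size()), AllocatorDeleter{allocator});
  if (buffer == nullptr) throw std::bad_alloc();
  if (fail_midway) {
    throw std::runtime_error("serialization failed");  // `buffer` is freed during unwinding
  }
  std::memcpy(buffer.get(), bytes.data(), bytes.size());
  *out_size = bytes.size();
  *out_buffer = buffer.release();  // success: the caller now owns the memory
}
```

On the failure path the deleter runs during stack unwinding, so the allocator's live count returns to zero; on success, `release()` transfers ownership to the caller, who frees the buffer through the same allocator.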

struct ModelCompilationOptions {
  const OrtEnv* env = nullptr;
  std::unique_ptr<OrtSessionOptions> session_options = nullptr;

Status ModelCompilationOptions::ResetOutputModelSettings() {
  EpContextModelGenerationOptions& ep_context_gen_options = session_options->value.ep_context_gen_options;
  ep_context_gen_options.output_model_file_path = "";

namespace onnxruntime {
void ModelCompilationOptions::ResetInputModelSettings() {
  input_model_path = "";
### Description
Address additional review comments on #24207:
- Remove use of `#ifdef ORT_MINIMAL_BUILD` in public C/C++ API headers for Compile API
- Use `AllocatorPtr` internally to ensure memory is properly released if an exception is thrown while serializing the output model to the user's buffer.
- Improve C API function documentation.
- Clean up internal `ModelCompilationOptions` class
### Motivation and Context
Useful review comments were left on the original PR after merge. This addresses those comments.
### Description
- Adds C/C++ API functionality to compile a model (i.e., generate a
model with EPContext nodes) using explicit APIs.
- Adds support for compiling when input or output models are in memory
(not just files).
- Allows specifying the threshold for when initializers are stored in an
external file.
- Allows file paths of arbitrary lengths (session_option key/value
configs limited string length to 2048).
List of C API functions:
```C++
ORT_API(const OrtCompileApi*, GetCompileApi);

ORT_API(void, ReleaseModelCompilationOptions, _Frees_ptr_opt_ OrtModelCompilationOptions*);
ORT_API2_STATUS(CreateModelCompilationOptionsFromSessionOptions, _In_ const OrtEnv* env,
                _In_ const OrtSessionOptions* session_options, _Outptr_ OrtModelCompilationOptions** out);
ORT_API2_STATUS(ModelCompilationOptions_SetInputModelPath, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* input_model_path);
ORT_API2_STATUS(ModelCompilationOptions_SetInputModelFromBuffer, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const void* input_model_data, size_t input_model_data_size);
ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelPath, _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* output_model_path);
ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelExternalInitializersFile,
                _In_ OrtModelCompilationOptions* model_compile_options,
                _In_ const ORTCHAR_T* external_initializers_file_path,
                size_t external_initializer_size_threshold);
ORT_API2_STATUS(ModelCompilationOptions_SetOutputModelBuffer, _In_ OrtModelCompilationOptions* model_compile_options,
                _Inout_ OrtAllocator* allocator, void** output_model_buffer_ptr, size_t* output_model_buffer_size_ptr);
ORT_API2_STATUS(ModelCompilationOptions_SetEpContextEmbedMode, _In_ OrtModelCompilationOptions* model_compile_options,
                bool embed_ep_context_in_model);
ORT_API2_STATUS(CompileModel, _In_ const OrtEnv* env, _In_ const OrtModelCompilationOptions* model_options);
```
Example (see unit tests for others):
```C++
#include "onnxruntime_cxx_api.h"

// Test using the CompileModel() API with settings:
//   - input model from buffer
//   - output model file
//   - EPContext nodes in output model use embedded binary blobs.
TEST_F(QnnHTPBackendTests, CompileApi_FromSessionOptions_InputModelAsBuffer_Embedded) {
  // Note: `model_data` is not defined in this excerpt; it is assumed to hold
  // the serialized bytes of the input ONNX model.
  const ORTCHAR_T* output_model_file = ORT_TSTR("./qnn_context_binary_multi_partition_test.onnx");
  std::filesystem::remove(output_model_file);

  // Initialize session options with QNN EP
  Ort::SessionOptions session_options;
  ProviderOptions provider_options;
#if defined(_WIN32)
  provider_options["backend_path"] = "QnnHtp.dll";
#else
  provider_options["backend_path"] = "libQnnHtp.so";
#endif
  provider_options["offload_graph_io_quantization"] = "0";
  session_options.AppendExecutionProvider("QNN", provider_options);

  // Create model compilation options from the session options.
  Ort::ModelCompilationOptions compile_options(*ort_env, session_options);
  compile_options.SetInputModelFromBuffer(reinterpret_cast<const void*>(model_data.data()), model_data.size());
  compile_options.SetOutputModelPath(output_model_file);
  compile_options.SetEpContextEmbedMode(true);

  // Compile the model.
  Ort::Status status = Ort::CompileModel(*ort_env, compile_options);
  ASSERT_TRUE(status.IsOK());

  // Make sure the compiled model was generated and has the expected number of EPContext nodes.
  ASSERT_TRUE(std::filesystem::exists(output_model_file));
  CheckEpContextNodeCounts(output_model_file, 2, 2);
}
```
### Motivation and Context
Improve compilation workflow and add new capabilities.
---------
Co-authored-by: Scott McKay <skottmckay@gmail.com>
… EP (microsoft#24406)
### Description
A new overload of CreateProvider() was added to the OpenVINO EP to handle the extraction of EP options from the session option configurations.
### Motivation and Context
Allows use of new Compile API. Refer to microsoft#24207
Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>
…ession options (microsoft#24445)
### Description
A new overload of CreateProvider() was added to handle the extraction of EP options from the session option configurations.
### Motivation and Context
Allows use of new Compile API. Refer to microsoft#24207
Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>