From 9a25afd1c25596e1c382a05b495156f7049f933f Mon Sep 17 00:00:00 2001 From: Robin Kuzmin Date: Fri, 4 Jun 2021 16:53:01 -0700 Subject: [PATCH 1/6] Initial add of QIR-RT-API-Design-Guidelines.md --- src/Qir/QIR-RT-API-Design-Guidelines.md | 715 ++++++++++++++++++++++++ 1 file changed, 715 insertions(+) create mode 100644 src/Qir/QIR-RT-API-Design-Guidelines.md diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md new file mode 100644 index 00000000000..04ce108334b --- /dev/null +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -0,0 +1,715 @@ +# QIR Runtime API Design Guidelines + +## Contents +* [Summary of This Document](#summary-of-this-document) +* [Guidelines](#guidelines) + * [Terms](#terms) + * [Know the Issues When Crossing the ABI Boundary](#know-the-issues-when-crossing-the-abi-boundary) + * [Be Aware of Different Compilers for Different Binaries](#be-aware-of-different-compilers-for-different-binaries) + * [Expose as Little as Necessary](#expose-as-little-as-necessary) + * [Expose C Only (and IR Wrappers For It)](#expose-c-only-(and-ir-wrappers-for-it)) + * [Allocate and Deallocate In The Same Binary](#allocate-and-deallocate-in-the-same-binary) + * [Avoid Exposing Entities With Multiple Layouts](#avoid-exposing-entities-with-multiple-layouts) + * [Be Careful When Exposing Certain Data Types](#be-careful-when-exposing-certain-data-types) + * [Integer Types](#integer-types) + * [`char`](#`char`) + * [`enum`](#`enum`) + * [`bool`](#`bool`) + * [Floating-Point Types](#floating-point-types) + * [Don't Let the C++ Exceptions Cross the ABI Boundary](#don't-let-the-c++-exceptions-cross-the-abi-boundary) + * [The `noexcept` Specifier and Exception Specification](#the-`noexcept`-specifier-and-exception-specification) + * [Let the User Decide Whether to Crash or Not](#let-the-user-decide-whether-to-crash-or-not) +* [Considerations Taken Into Account](#considerations-taken-into-account) + * [The `bool` Type](#the-`bool`-type) + * [The `char` Type](#the-`char`-type) + * [The `enum` Type](#the-`enum`-type) + * [The Floating-Point Types](#the-floating-point-types) + * [The Integer Types](#the-integer-types) +* [Details About the Starting Point](#details-about-the-starting-point) +* [Information Sources](#information-sources) +* [Questions Still To Resolve](#questions-still-to-resolve) + * [Do we want to move from a _set_ of dynamic libraries to a _single_ dynamic library?](#do-we-want-to-move-from-a-set-of-dynamic-libraries-to-a-single-dynamic-library?) + * [Do we need the IR wrappers?](#do-we-need-the-ir-wrappers?) + +## Summary of This Document + +At the moment of writing the [QIR Runtime](https://github.com/microsoft/qsharp-runtime/tree/main/src/Qir/Runtime) +was a set of dynamic libraries exposing the API in the form of both C and C++ entities +(plus IR wrappers around certain C functions). +The C++ part and some other features, according to +[this post](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll/22797419#22797419), +were to cause problems for the users. + +Since the QIR Runtime was going public, we needed to mitigate those problems. These guidelines were among the first steps in that effort, +intended to provide the instructions about what to do and what to avoid when designing the QIR Runtime API or making changes to it. + +Those who need just the summary of recommendations (the _result_ of the decision-making process), +can run throuhg the [Guidelines](#guidelines) section. +Those who would like to see the grounds/reasoning behind the decisions made, can see the +[Considerations Taken Into Account](#considerations-taken-into-account) section. + + +## Guidelines + +### Terms + +**ABI** - [Application Binary Interface](https://en.wikipedia.org/wiki/Application_binary_interface) +**binary** - A dynamic library or an executable file. +The interacting binaries can be written in different languages and built with different compilers. +**cross-ABI interface** - The interface between the binaries. +This document mostly deals with the interface between the QIR Runtime dynamic library on the one hand, +and the executable and other dynamic libraries (like simulator) on the other hand. +**IR** - +[intermediate representation](https://blog.gopheracademy.com/advent-2018/llvm-ir-and-go/#:~:text=LLVM%20IR%20is%20a%20low,number%20of%20function%20local%20registers.&text=Front%2Dend%3A%20compiles%20source%20language,compiles%20IR%20to%20machine%20code.) +of [LLVM](https://llvm.org/). +**QIR** - quantum intermediate representation - a subset of the IR. Is generated by the Q# compiler from the Q# code. +**RT** - Runtime. + +### Know the Issues When Crossing the ABI Boundary +See +[the post](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll/22797419#22797419) +that resulted in these guidelines. + +### Be Aware of Different Compilers for Different Binaries +Take into account that different binaries can be written in different languages, by different people, +built with different compilers or different versions of the compiler, +with different command line arguments, by different scripts, on different machines. + +Bear in mind that QIR RT API needs to be callable from other languages and environments. +At the moment of writing the particular plan was to call the QIR RT API from the +[IQ#](https://ms-quantum.visualstudio.com/Quantum%20Program/_wiki/wikis/Quantum-OKRs.wiki/470/IQSharp). +Next, there was a chance of calling from Python, and possibly Rust. + +Avoid using the features that preclude calling the QIR RT API from other languages and environments. + + +### Expose as Little as Necessary +Every piece of code has a long-term cost associated with it. +The more code we write, the more bugs we can introduce, the more care is needed during migration to the newer compiler +and while porting to a different platform. The higher is the risk of identifier conflict with the other people's code +and new features of the language. The more identifiers we use, the less freedom remains for a different use of those identifiers. + +### Expose C Only (and IR Wrappers For It) +Do not expose the C++ identifiers and features (not present in C) - +the C++ data types (e.g. `class`), templates, function overloads, namespaces, etc. +To facilitate that for yourself and for the others, consider strictly separating +the C and C++ code into .h/.c files for C, and .hpp/.cpp. files for C++. Expose C headers only. + +If the exposed headers need to include the other headers, then include the C headers only, not the C++ headers. +In particular, the standard C headers, +e.g. [``](https://en.cppreference.com/w/c/io), have the corresponding wrapping headers in C++, +e.g. [``](https://en.cppreference.com/w/cpp/io/c). Such wrapping headers add the namespace prefix +`std::` to certain C declarations, but not all. The C++'s wrapping headers +may enclose the corresponding C header in the `extern "C"` block (and probably do something else). +This is the right thing for the C++ code internal to a binary, but not what we want in the API exposed by the QIR RT. +So, in the C headers that QIR RT exposes, include C's header `` (if needed), but not C++'s header ``. + +Everything that your C header exposes, needs to be within the +[`extern "C"`](https://en.cppreference.com/w/cpp/language/language_linkage) block +(visible to C++ compiler but not to C compiler, because `extern "C"` is a C++ feature). Example: +```c++ +#ifdef __cplusplus +extern "C" +{ +#endif + +.. // `#include`s +.. // Exposed API. + +#ifdef __cplusplus +} // extern "C" +#endif +``` +Such that the exposed API will be treated as C code (code having C language linkage), +regardless of whether this header is included by the C or C++ source file. + +### Allocate and Deallocate In The Same Binary +Since different binaries can be built with differnt compilers, +these bianaries can be linked with different C/C++ runtimes, i.e. they can be using different heaps. +The allocation made in a heap must be deallocated in the same heap. +I.e. the allocation and corresponding deallocation must be done by the same binary. +If the binary exposes a function that returns a pointer to the heap-allocated block, +then that binary must also expose a function that deallocates such a pointer. +Otherwise the heap used for allocation can leak the data, +and the heap used for deallocation can be corrupted. + +### Avoid Exposing Entities With Multiple Layouts +Avoid exposing the entities that may have different layouts in different binaries. +E.g. the alignment padding between the `struct` fields +can be different in the QIR RT dynamic library and the executable that uses the QIR RT. Thus + +_Avoid exposing_ `struct`. + +If you need to expose a mechanism that manipulates a `struct` instance, then you can expose +* a function that allocates the `struct` and returns the `void*` (`void` pointer) to it, +* a function that deallocates the `struct`, taking the `void*` parameter, +* functions that manipulate the `struct`, taking the `void*` to the `struct` as the first argument. + +E.g. for the +[`QirArray`](https://github.com/microsoft/qsharp-runtime/blob/9eb00a0f3f4beb1eee97dccb23e16ad995f5398e/src/Qir/Runtime/public/QirTypes.hpp#L15) + +| Instead Of | Possible Option | +|--------------------------------------------------------------------|---------------------| +| `QirArray::QirArray(int64_t cQubits)` (constructor) | `void* QirArray_NewQubits(int64_t qubitCount)` | +| `QirArray::QirArray(const QirArray* other)` (copy constructor) | `void* QirArray_NewCopy(void* other)` | +| `QirArray::~QirArray()` (destructor) | `void QirArray_Delete(void *)` | +| `char* QirArray::GetItemPointer(int64_t index)` (memeber function) | `void* QirArray_GetItemPointer(uint64_t index)` | + +Note: In IR the `void*` is not available as a type. You can expose `i8*` instead, and in the calling C/C++ code use `int8_t*`. + +If it is still better to expose `struct` then strictly document the `struct` layout and size your code expects. +You can enforce the layout checks with the +[compile-time C assertions](https://en.cppreference.com/w/c/error/static_assert): +```c +#include // offsetof() +#include // static_assert() + +typedef struct +{ + char c; // 1-byte field. + // 3 bytes of alignment gap. + uint32_t u32; // Is expected at offset 4. + // Total size is 8. +} +MyStruct; + +static_assert(offsetof(MyStruct, u32) == sizeof(uint32_t), "Unexpected layout of `MyStruct`"); +static_assert(sizeof(MyStruct) == (2 * sizeof(uint32_t)), "Unexpected size of `MyStruct`"); +``` + +Note: The +[`#pragma pack`](https://stackoverflow.com/questions/3318410/pragma-pack-effect/3318475#3318475) +is not guaranteed to help your exposed API. The pragma can be silently ignored by the user's compiler. + +Note: For providing a direct access to the `struct` fileds +you may want to expose functions that return the address of a particular field inside of the `struct`: +```c +uint32_t* MyStruct_GetU32Ptr(void* pStruct); // Returns the address of `u32` field of a struct + // pointed to by the parameter. +``` + +### Be Careful When Exposing Certain Data Types +The size (in bytes) of such data types as `bool`, C's `enum`, `signed` and `unsigned` {`short`, `int`, `long`, `long long`} +is implementation-specific, thus can be different on different sides of the ABI boundary. +In general prefer the fixed-size alternatives. See the type-specific details below. + +#### Integer Types +In the exposed API, for integers, prefer the fixed-size types defined in +[``](https://www.cplusplus.com/reference/cstdint/?kw=stdint.h), e.g. `uint8_t`, `int32_t`, etc. + +Note: Avoid using in the exposed API the _non-fixed-size_ types from that header, +such as `uint_least8_t`, `uint_fast16_t` (but feel free to use them internally in your binary). + +Note: If you use the [printf-family functions](https://man7.org/linux/man-pages/man3/printf.3.html) +for formatted output of the types defined in ``, +or [scanf-family functions](https://man7.org/linux/man-pages/man3/scanf.3.html) for formatted input, +then use the macros defined in [``](https://www.cplusplus.com/reference/cinttypes/). +Example: +```c +#include // uint32_t +#include // SCNu32, PRIu32 +#include // scanf(), printf() + +uint32_t u32; // Fixed-size variable. +scanf(SCNu32, &u32); // Formatted input. +printf("Value is " PRIu32 "\n", u32); // Formatted output. +``` + +#### `char` +The size of types `char`, `unsigned char`, `signed char` is fixed and is equal to 1 byte. + +In the exposed API it is safe to use `char` for characters and C strings (null-terminated sequences of `char`s). + +For integers of size 1 prefer `uint8_t` or `int8_t`. If you use `char` for integers, +then specify explicitly either `unsigned char` or `signed char`, +because the plain `char` is interpreted differently in different implementations. +In some implementations `char` defaults to `unsigned char`, in the others - to `signed char`. + +#### `enum` +The [C's `enum`](https://en.cppreference.com/w/c/language/enum) type is _non-fixed-size_ +(as opposed to [C++'s `enum`](https://en.cppreference.com/w/cpp/language/enum) that can have an _underlying type_). + +The code will be safe if you do not use `enum` type in the function signatures (function parameter types or return type). +Exposing `enum` as a stand-alone type (and exposing variables of that type) requires care, +control over the compatible fixed-size types that are safe to cast the `enum` to. +So, think twice before exposing `enum`. Document well the ranges of values, safe types to cast to, etc. +Use mechanisms that prevent the erroneous use. +See details in [The `enum` Type](#the-`enum`-type) section. + +If you still plan to expose `enum`: + +Note: In C language if you define the `enum` type like this: +```c +enum MyEnumType +{ + .. +}; +``` +then _in C_ you will have to write `enum MyEnumType`, i.e. 2 words, every time you use the `enum`. +If you want to use just 1 word - `MyEnumType` - you can define the `enum` type like this: +```c +typedef enum +{ + .. +} +MyEnumType; +``` + +Note: You can expose the enumerators only (rather that the whole `enum` type). +But remember, the enumerators in C have the type `int` - _non-fixed-size type_. +```c +enum +{ + ENUMERATOR_A, // Has type `int`. + ENUMERATOR_B // Has type `int`. +}; +``` +Trick: Chris Granade has shown how Rust's cbindgen exposes enums to C as having a fixed size: +```c +enum Pauli +{ + I = 0, + X = 1, + Y = 3, + Z = 2, +}; +typedef uint8_t Pauli; +``` +As long as in the exposed API you type `Pauli` (rather than `enum Pauli`), you use the fixed-size type `uint8_t`. +And the type `enum Pauli` defines the range of values +and the constants that can be used with `Pauli` type (internally in the binaries). + +#### `bool` +The `bool` type is non-fixed-size (both in [C](https://en.cppreference.com/w/c/types/boolean) and +[C++](https://en.cppreference.com/w/cpp/language/types#Boolean_type)). +Be careful if using `bool` in the exposed API, prefer a fixed-size integer type instead (e.g. `uint8_t`). + +If using fixed-size integer instead of `bool`, strictly document the range of values, +in particular what value corresponds to the concept of `true` +(value of `1`, or any non-zero value, or other). + +For your information: [In Rust](https://doc.rust-lang.org/reference/types/boolean.html) +> **An object with the boolean type has a size** and alignment **of 1** each. +The value false has the bit pattern `0x00` and the value true has the bit pattern `0x01`. + +Note: One may want for `bool` to follow the Chris Granade's trick for `enum` mentioned above. +Example: +```c +// enum Bool : uint8_t { False, True }; +typedef uint8_t Bool; +``` +As long as in the exposed API you write `Bool` (rather than `bool`), you use the fixed-size type `uint8_t`. +And the comment explains the type `Bool`. + +When choosing the range of values, +and for the safe internal use of Booleans and their substitutes, see also +* [Avoid Comparing Booleans to `true`](https://github.com/kuzminrobin/code_review_notes/blob/master/cpp_design_bookmarks.md#avoid-comparing-booleans-to-true) +* [Know the Limitations of `memset()` When Initializing](https://github.com/kuzminrobin/code_review_notes/blob/master/cpp_design_bookmarks.md#know-the-limitations-of-memset-when-initializing) + + +#### Floating-Point Types + +Strictly speaking, the floating-point types (`float`, `double`, `long double`) are non-fixed-size. +However these guidelines suggest to expect that the following floating-point types conform to the same standards +on both sides of the ABI boundary: +* `float` matches IEEE-754 _32 bit_ floating point type (IEC 60559 single format), +* `double` matches IEEE-754 _64 bit_ floating point type (IEC 60559 double format). + +To detect the size discrepancy, you can use the following +[compile-time C assertions](https://en.cppreference.com/w/c/error/static_assert) +in the exposed C headers (if your exposed API uses a floating-point type): +```c +static_assert(sizeof(float ) == sizeof(uint32_t), "Unexpected size of `float`" ); +static_assert(sizeof(double) == sizeof(uint64_t), "Unexpected size of `double`"); +``` +As for [`long double` type](https://en.cppreference.com/w/c/language/arithmetic_types#Real_floating_types), +the decision has not been made at the time of writing. If you need to expose the mechanisms that use this type, +then you can follow the recomendations in the +[Avoid Exposing Entities With Multiple Layouts](#avoid-exposing-entities-with-multiple-layouts) section. + +The examples of where the floating-point types were used at the moment of writing are +* [`%String* @__quantum__rt__double_to_string(double %val)`](https://github.com/microsoft/qsharp-runtime/blob/a15856ea9aac6244718865481dbbd89b70056166/src/Qir/Runtime/lib/QIR/bridge-rt.ll#L457), +* [`QIR_SHARED_API QirString* quantum__rt__double_to_string(double)`](https://github.com/microsoft/qsharp-runtime/blob/4ee33c0a1fb89cc859254175fea74b841a4fd01c/src/Qir/Runtime/public/QirRuntime.hpp#L203), + +a number of math functions, e.g. +* [`dllexport double @__quantum__qis__sin__body(double %theta)`](https://github.com/microsoft/qsharp-runtime/blob/4ee33c0a1fb89cc859254175fea74b841a4fd01c/src/Qir/Runtime/lib/QSharpFoundation/qsharp-foundation-qis.ll#L121). + + +### Don't Let the C++ Exceptions Cross the ABI Boundary +Our cross-ABI interface is a subset of C interface. The C interface does not include the C++ exceptions. The C++ exceptions +crossing the ABI Boundary are likely to stay uncaught and result in a crash. +If you are still thinking about designing an interface that tunnels the C++ exceptions through our subset of C +(an interface where the C++ exceptions are thrown on one side and caught on the other side of the ABI boundary), +then you should know the following. + +In some implementations the C++ exception instances are allocated on the heap during `throw`, and then deallocated during `catch`. +If `throw` and `catch` are on different sides of the ABI boundary then they are likely to use different heaps. +The consequence is described in the +[Allocate and Deallocate In The Same Binary](#allocate-and-deallocate-in-the-same-binary) section above. + +At the moment of writing the +[Fail-Statement](https://docs.microsoft.com/en-us/azure/quantum/user-guide/language/statements/returnsandtermination#fail-statement) +implementation +([`quantum__rt__fail_cstr()`](https://github.com/microsoft/qsharp-runtime/blob/b10447eff672b7ba4f2e8564302888a8eb6bce8a/src/Qir/Runtime/lib/QIR/utils.cpp#L60)) +was throwing an exception not caught anywhere. +This was causing a reasonably peaceful termination as long as the same compiler was used on both sides of the ABI boundary. +The plan was to replace the exception throw with the [`exit()`](https://www.cplusplus.com/reference/cstdlib/exit/) +or [`terminate()`](https://www.cplusplus.com/reference/exception/terminate) +or [`abort()`](https://www.cplusplus.com/reference/cstdlib/abort) or other alternatives. + +To minimize the risk of getting the C++ exceptions crossing the ABI boundary, minimize the use of the throwing calls. +Consider the non-throwing versions of the calls that you use. +E.g. +* Consider the _nothrow_ version of the [`operator new()`](https://www.cplusplus.com/reference/new/operator%20new/) +(for single elements) and +[`operator new[]()`](https://www.cplusplus.com/reference/new/operator%20new[]/) (for arrays), e.g. +```c++ +int * myDynamicIntArray = new int[10]; // In case of allocation failure THROWS! Avoid using this. +int * myDynamicIntArray = new (std::nothrow) int[10]; // In case of allocation failure returns the nullptr. Consider this first. +``` +* If using the [`dynamic_cast<>()`](https://en.cppreference.com/w/cpp/language/dynamic_cast), +consider casting the _pointers_ rather than _references_, e.g. +```c++ +MyClass myInstance = .. // Create an instance of a class that may or may not implement the `IMyInterface` interface. +IMyInterface & myItfRef = dynamic_cast( myInstance); // Casts the _reference_, in case of failure THROWS! Avoid using this. +IMyInterface * myItfPtr = dynamic_cast(&myInstance); // Casts the _pointer_, in case of failure returns the nullptr. Consider this first. +``` + +#### The `noexcept` Specifier and Exception Specification + +At the moment of writing there is _no_ recommendation about using or avoiding the +[`noexcept` specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec) or +[exception specification](https://en.cppreference.com/w/cpp/language/except_spec) +(for the C++ code internal to your binary), e.g. +> void func() **noexcept** +void func2() **throw()** + +This absence of recommendation is based on the materials listed below. Parts of them are definitely out-of-date. +If you have a strong opinion (and, for example, can demonstrate in more details +why `noexcept` and/or `throw()` are better than not using them) +then you are more than welcome to share your knowledge. + +Materials: +* (Current state) The [`noexcept` specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec), +the fragments of a special interest: +> Non-throwing functions are permitted to call potentially-throwing functions .. +> Note that a `noexcept` specification on a function is not a compile-time check .. +> `void baz() noexcept { throw 42; }` + +* (1996) [[MEC++]](https://github.com/kuzminrobin/code_review_notes/blob/master/book_list.md#mec++) +"More Effective C++" by Scott Meyers, Item 14: Use exception specifications judiciously +* (2002.07) [[pl@es]](https://github.com/kuzminrobin/code_review_notes/blob/master/article_list.md#pl@es) +Herb Sutter. A Pragmatic Look at Exception Specifications +* (<=2004) [[eses]](https://github.com/kuzminrobin/code_review_notes/blob/master/article_list.md#eses) +Herb Sutter. Exception Safety and Exception Specifications: Are They Worth It? +Item 4. When is it worth it to write exception specifications on functions? Why would you choose to write one, or why not? +* (Current state) +[C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Re-noexcept) +(search for "noexcept") + +### Let the User Decide Whether to Crash or Not + +In some scenarios it is reasonable for the application or algorithm to crash upon memory allocation failure. +But if the crash is silent then we are not sure what to do, to restart, or to chase and fix a bug, +or to run on a different machine with more memory, etc. + +We want at least a message in the console, or a pop-up window, telling the reason of a crash, +and even better providing other actionalble information, like a stack trace, values of the local variables and parameters +(and ideally an opportunity to break in the debugger and/or getting a core dump for the subsequent analysis). +Generating such an information may require from our application some extra memory which is already exhausted. +To work around this we may want our application to pre-allocate a resonable chunk of memory during the start (the _rescue chunk_), +and once our application hits the memory allocation failure, +the application will deallocate the rescue chunk and generate the premortal disgnostics. + +In a more strict enviroment we may want our application to run 24 hours a day, 365 days a year (as a part of a cloud service for example). +During such a lengthy run the memory can get extremely fragmented, but a significant part of it can still be availble in small pieces. +The attempt to allocate a resonably large piece can fail because of the fragmentation, but would succeed if the memory was not fragmented. +To work around, our application, upon memory allocation failure, may want to run the defragmentation algorithm, e.g. +* deallocate the rescue chunk, +* save the data to a file, +* delete the data from the memory (memory becomes nearly empty), +* pre-allocate the resue chunk again (for the next run of the defragmentation algorithm), +* read the data from the file back to the memory (memory becomes defragmented), + +Then the application will repeat the failed memory allocation operation, likely succeed, and continue +(or save the current state and clearly tell the user to run again (continue) on a machine with more memory). + +The same applies to the exhaustion of any other resourse, like a number of qubits, number of file handles, disk space, etc. + +To summarize, exhaustion of a resource is not necessarily a bug, and is not as severe as it might look. +It is something that can be worked around (or stopped and continued in a better environment). +And only the user (not runtime) knows the requirements and scenarios. +That is why it seems reasonable to let the _user_ decide whether to crash or not. + +**Take-away:** In case of resource axhaustion consider returning the failure to the user, rather than crashing. + +(From C/C++ runtime we don't expect +[`malloc()`](https://man7.org/linux/man-pages/man3/malloc.3.html), +[`fopen()`](https://en.cppreference.com/w/c/io/fopen), or +[`operator new()`](https://www.cplusplus.com/reference/new/operator%20new/) +to crash, right? +The users of QIR RT very likely don't expect the resource allocation functions to crash either) + +Some more details about handling the memory allocation failure can be found in +* [[EC++3]](https://github.com/kuzminrobin/code_review_notes/blob/master/book_list.md#ec++3) +Scott Meyers. _Effective C++: 55 Specific Ways to Improve Your Programs and Designs_ (3rd Edition). +Chapter 8: Customizing `new` and `delete`. + +## Considerations Taken Into Account + +### The `bool` Type + +In C language `bool` is a macro that expands to the keyword `_Bool` ([details](https://en.cppreference.com/w/c/types/boolean)). + +[[C17_N2176](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf)], +6.2.5 Types, paragraph 2 +> An object declared as type `_Bool` **is large enough to store the values 0 and 1**. + +I.e. the size of `bool` is _at least 1 bit_, but there is no upper limit. Type `bool` is _non-fixed-size_. + +Be careful if using `bool` in the exposed API, prefer a fixed-size integer type instead (e.g. `uint8_t`). +Illustration: +* Your dynamic library exposes the `bool` global variable - `extern bool MyVar;`. +* In your dynamic library the type `bool` is _1 byte_. +* The user's executable includes your header. +* In the user's executable the type `bool` is _4 bytes_. +* When the user's code updates your variable - `MyVar = true`, the code updates _4 bytes_ (rather than _1_) - +the byte of your variable and _3 more bytes outside of it_ (thus corrupting the data after `MyVar`). + +### The `char` Type + +**`sizeof(char)`, `sizeof(unsigned char)`, `sizeof(signed char)`, are 1** +[[C17_N2176](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf)]: +6.5.3.4 The `sizeof` and `_Alignof` operators, paragraph 4: +> When `sizeof` is applied to an operand that has type `char`, `unsigned char`, or `signed char`, +(or a qualified version thereof) the result is 1. + +**`char` can default to `signed char` or `unsigned char`** +6.2.5 Types, paragraph 15: +> The three types `char`, `signed char`, and `unsigned char` are collectively called the _character types_. +The **implementation shall define** `char` to have the same range, +representation, and behavior as **either `signed char` or `unsigned char`**.45) + +> 45\) `CHAR_MIN`, defined in ``, will have one of the values 0 or `SCHAR_MIN`, +and this can be used to distinguish the two options. Irrespective of the choice made, +**`char` is a separate type from the other two and is not compatible with either**. + +So, don't use `char` for integers, use `unsigned char` or `signed char` instead. + +6.3.1.1 Boolean, characters, and integers. Paragraph 3 +> ... whether a "plain" char can hold negative values is **implementation-defined**. + + +### The `enum` Type +[[C17_N2176](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf)] +6.7.2.2 Enumeration specifiers, Paragraph 4 +> Each enumerated type shall be compatible with `char`, a signed integer type, or an unsigned integer type. +**The choice of type is implementation-defined**... + +I.e. the [C's `enum`](https://en.cppreference.com/w/c/language/enum) type is _non-fixed-size_ +(as opposed to [C++'s `enum`](https://en.cppreference.com/w/cpp/language/enum) that can have an _underlying type_). + + +**The `enum` in function signatures** +The code will be safe if you _do not_ use `enum` type in the function signatures (function parameter types or return type). + +Illustration: +* Let's say your dynamic library exposes a function having a parameter `e` of the `enum` type, +and some parameters before and after it: +```c +typedef enum { .. } MyEnumType; +void func(uint16 uia, MyEnumType e, uint16_t uib); +``` +* Let's imagine that (according to the [calling convention](https://en.wikipedia.org/wiki/Calling_convention)) +the parameter `e` needs to be passed through the _stack_ (rather than a CPU register). +* Let's say in your dynamic library the `enum` type is 2 bytes in size. So, during the call, +the 2 bytes need to be allocated on the stack for the parameter `e`. +* Immediately after it, the 2 bytes need to be allocated on the stack for the parameter `uib` +(or the parameter `uia`, the order of the parameters in the stack depends on the +[language linkage](https://en.cppreference.com/w/cpp/language/language_linkage) and/or +calling convention). +* The user's executable includes your header and calls your function `func()`. +* But in the user's executable the `MyEnumType` is _4 bytes_ in size. +So the 4 bytes are allocated on the stack (instead of 2). +* And the value for parameter `uib` is pushed to the stack _after those 4 bytes_, +to a place where the function `func()` does not expect it. + +To summarize, the caller passes `uib` through that part of the stack where the callee `func()` does not expect it, +and where the callee expects it, the 0 is passed. + +**The stand-alone `enum`** +Exposing `enum` as a stand-alone type (and exposing variables of that type) requires care, +control over the compatible fixed-size types that are safe to cast the `enum` to. + +Illustration: +* You need your dynamic library to expose a function having a parameter of the `enum` type. +* You expose the `enum` type separately. +```c +typedef enum +{ + ENUMERATOR_A, + ENUMERATOR_B +} +MyEnumType; +``` +* You expose the function, but, to be safe, you expose the parameter as a fixed-size type `uint8_t`. +```c +void func(uint8_t param); +``` + +The user's executable uses your dynamic library. The executable's source file (.c) includes your header +(that exposes the `enum` type and the function). The user's code creates a local varaible of the `enum` type, +initializes the variable with one of the enumerators (of that `enum` type), +and passes the variable as an argument to your function. +The argument (of `enum` type) is implicitly cast to the type of the parameter - `uint8_t`. + +_Is it safe_ to implicitly cast such an `enum` to `unit8_t`? Or, maybe the `enum` type has enumerators with value above `0xFF`, +and that's why the `enum` type is at least 2 bytes in size +(and upon cast to `uint8_t` only the least significant byte will end up in the parameter)? + +Do you rely on the _C_ compiler to warn? +Even if it _does_ warn, the programmers sometimes just carelessly _explicitly_ cast `enum` to `unit8_t` +(without the anlysis of the value range). + +So, think twice before exposing `enum`. Document well the ranges of values, safe types to cast to, etc. +Use mechanisms that prevent the erroneous use, e.g. +[compile-time C assertions](https://en.cppreference.com/w/c/error/static_assert): +```c +static_assert(ENUMERATOR_B <= UINT8_MAX, + "`MyEnumType` is incompatible with the parameter of `func()`. " + "Increase the parameter type"); +``` + + +### The Floating-Point Types + +The explicit specification of the *size* for the floating-point types was not found in one of the late drafts +of the C17 standard (see [Information Sources](#information-sources)). However the section "F.2 Types" says: +> — The `float` type matches the IEC 60559 single format. +— The `double` type matches the IEC 60559 double format. + +Unfortunately in the ISO/IEC 60559, Edition 2.0 2020-05 (and IEEE Std 754™-2019, IEEE Std 754™-2008), the size specification +of "single format" and "double format" was not found. + +However the C's [Real floating types](https://en.cppreference.com/w/c/language/arithmetic_types#Real_floating_types) +section of cppreference states: +> +* `float` - single precision floating point type. Matches IEEE-754 32 bit floating point type if supported. +* `double` - double precision floating point type. Matches IEEE-754 64 bit floating point type if supported +* `long double` - extended precision floating point type. Matches IEEE-754 extended floating-point type if supported, +otherwise matches some non-standard extended floating-point type as long as its precision is better than `double` +and range is at least as good as `double`, otherwise matches the type `double`. +Some x86 and x86_64 implementations use the 80-bit x87 floating point type. + +Also the following was taken into account +* [Single-precision floating-point format](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) +(Wikipedia) +* [Double-precision floating-point format](https://en.wikipedia.org/wiki/Double-precision_floating-point_format) +(Wikipedia) +* [Fixed-size floating point types](https://stackoverflow.com/questions/2524737/fixed-size-floating-point-types) +(Stackoverflow) +* [What is the size of float and double in C and C++? [duplicate]](https://stackoverflow.com/questions/25524355/what-is-the-size-of-float-and-double-in-c-and-c) +(Stackoverflow) + + +### The Integer Types + +(Not strictly related to this document, but is still provided for those who are interested. +Also answers a number of other questions about the integer types, and supports certain statements in the other sections). + +It seemed commonly known that +`sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)` +But looks like C does not guarantee that, as opposed to C++, and the topic is rather arguable. +See details [here](https://stackoverflow.com/a/67678644/6362941). + + +## Details About the Starting Point + +(Provided for historical reasons) + +At the moment of writing the QIR Runtime was provided as a set of dynamic libraries, generated during the build in +the directory `qsharp-runtime\src\Qir\Runtime\bin\Debug\bin` (and `qsharp-runtime\src\Qir\drops\bin`). +The dynamic library names were +* `Microsoft.Quantum.Qir.QSharp.Core.dll` +* `Microsoft.Quantum.Qir.QSharp.Foundation.dll` +* `Microsoft.Quantum.Qir.Runtime.dll` + +The QIR Runtime had two interfaces. +On the one side, it *exposed* the functionality that could be called by the IR code (`*.ll` files) +generated by the Q# compiler from the Q# code. + +E.g. the user wrote a Q# program that outputed "Hello, World". +The Q# compiler translated that code to the IR (`*.ll` file). +The IR was making calls to the QIR RT, and the QIR RT outputed the string to the console. + +On the other side the QIR RT *used* the quantum-hardware-specific functionality (simulator to start with), +implemented in a separate dynamic library, external to the QIR RT. +That external library in some cases could be a part of the quantum simulator (virtual quantum computer), +in the other cases it could be a part of the quantum computer infrastructure (real quantum computer). + +E.g. the user wrote a Q# program that did some operations on a qubit. Q# compiler translated that code to the IR. +The IR made calls to the QIR RT, and the QIR RT forwarded those calls +to the quantum computer (virtual or real) dynamic library. + +This document mostly discusses the API *exposed* by the QIR RT to the IR generated from the Q#. +But further changes to the API that QIR RT _uses_ +(the interface to the quantum simulator or computer) are also subject to these guidelines. + +The exposed API at the moment of writing consisted of +* C API, e.g. the functions in +[QirRuntime.hpp](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirRuntime.hpp#L53), +* [IR wrappers](https://github.com/microsoft/qsharp-runtime/blob/a15856ea9aac6244718865481dbbd89b70056166/src/Qir/Runtime/lib/QIR/bridge-rt.ll#L135) +around some of the C API functions, +* C++ API, e.g. C++ data types +([std::vector](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L21)), +C++ +[member functions](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L34). + +The exposed API was also using the +[non-fixed-size data types](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L20) - +`int`. + +The C++ API, non-fixed-size data types, and some other features, were about to cause issues for the users. + + +## Information Sources + +* [C/C++ Standards and Late Drafts](https://stackoverflow.com/a/83763/6362941) +* [[C11_N1570]](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) - One of the late drafts of C11. +* [[C11_N1570_HTML]](https://port70.net/~nsz/c/c11/n1570.html) - HTML version of the late draft. +* [[C17_N2176]](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf) - +One of the late drafts of C17. +* IEEE Std 754™-2008. +* IEEE Std 754™-2019. +* [How do I safely pass objects, especially STL objects, to and from a DLL?](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll/22797419#22797419) (Stackoverflow answer). +* ISO/IEC 60559, Edition 2.0 2020-05. + + +## Questions Still To Resolve + +(Temporary section, would be nice to resolve during the code review. Please provide your opinion) + +### Do we want to move from a _set_ of dynamic libraries to a _single_ dynamic library? +The current set is + * Microsoft.Quantum.Qir.QSharp.Core.dll + * Microsoft.Quantum.Qir.QSharp.Foundation.dll + * Microsoft.Quantum.Qir.Runtime.dll + +If they are merged to a single dynamic library, then they can use a full power of C++ to interact with each other, +but still _not to expose_ C++. +If not, then they will have to +* either follow these guidelines and expose a subset of C, even for interacting with each other, +* or to expose C++ (for interacting with each other), with the risk that the other users will start using the exposed C++. + +### Do we need the IR wrappers? +The known reason for the IR wrappers (in `*.ll` files) to exist is that the IR wrappers names start with the double underscore `__`. + +The identifiers starting with double underscore are reserved in the standards of C and C++, see +[[C17_N2176]](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf), +7.1.3 Reserved identifiers: +> **All identifiers that begin with an underscore and** either an uppercase letter or +**another underscore are always reserved** for any use... + +Are there any other reasons? +Can Q# compiler generate the QIR such that instead of the IR wrappers the corresponding wrapped functions are called? +(without `__` in the beginning) \ No newline at end of file From b334113d50623c0f8bc105a85615e4c9db2866fc Mon Sep 17 00:00:00 2001 From: Robin Kuzmin Date: Fri, 4 Jun 2021 16:55:51 -0700 Subject: [PATCH 2/6] Removed couple of sections not for public audience --- src/Qir/QIR-RT-API-Design-Guidelines.md | 84 ------------------------- 1 file changed, 84 deletions(-) diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md index 04ce108334b..2f1d1a34a5a 100644 --- a/src/Qir/QIR-RT-API-Design-Guidelines.md +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -25,11 +25,7 @@ * [The `enum` Type](#the-`enum`-type) * [The Floating-Point Types](#the-floating-point-types) * [The Integer Types](#the-integer-types) -* [Details About the Starting Point](#details-about-the-starting-point) * [Information Sources](#information-sources) -* [Questions Still To Resolve](#questions-still-to-resolve) - * [Do we want to move from a _set_ of dynamic libraries to a _single_ dynamic library?](#do-we-want-to-move-from-a-set-of-dynamic-libraries-to-a-single-dynamic-library?) - * [Do we need the IR wrappers?](#do-we-need-the-ir-wrappers?) ## Summary of This Document @@ -622,56 +618,6 @@ It seemed commonly known that But looks like C does not guarantee that, as opposed to C++, and the topic is rather arguable. See details [here](https://stackoverflow.com/a/67678644/6362941). - -## Details About the Starting Point - -(Provided for historical reasons) - -At the moment of writing the QIR Runtime was provided as a set of dynamic libraries, generated during the build in -the directory `qsharp-runtime\src\Qir\Runtime\bin\Debug\bin` (and `qsharp-runtime\src\Qir\drops\bin`). -The dynamic library names were -* `Microsoft.Quantum.Qir.QSharp.Core.dll` -* `Microsoft.Quantum.Qir.QSharp.Foundation.dll` -* `Microsoft.Quantum.Qir.Runtime.dll` - -The QIR Runtime had two interfaces. -On the one side, it *exposed* the functionality that could be called by the IR code (`*.ll` files) -generated by the Q# compiler from the Q# code. - -E.g. the user wrote a Q# program that outputed "Hello, World". -The Q# compiler translated that code to the IR (`*.ll` file). -The IR was making calls to the QIR RT, and the QIR RT outputed the string to the console. - -On the other side the QIR RT *used* the quantum-hardware-specific functionality (simulator to start with), -implemented in a separate dynamic library, external to the QIR RT. -That external library in some cases could be a part of the quantum simulator (virtual quantum computer), -in the other cases it could be a part of the quantum computer infrastructure (real quantum computer). - -E.g. the user wrote a Q# program that did some operations on a qubit. Q# compiler translated that code to the IR. -The IR made calls to the QIR RT, and the QIR RT forwarded those calls -to the quantum computer (virtual or real) dynamic library. - -This document mostly discusses the API *exposed* by the QIR RT to the IR generated from the Q#. -But further changes to the API that QIR RT _uses_ -(the interface to the quantum simulator or computer) are also subject to these guidelines. - -The exposed API at the moment of writing consisted of -* C API, e.g. the functions in -[QirRuntime.hpp](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirRuntime.hpp#L53), -* [IR wrappers](https://github.com/microsoft/qsharp-runtime/blob/a15856ea9aac6244718865481dbbd89b70056166/src/Qir/Runtime/lib/QIR/bridge-rt.ll#L135) -around some of the C API functions, -* C++ API, e.g. C++ data types -([std::vector](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L21)), -C++ -[member functions](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L34). - -The exposed API was also using the -[non-fixed-size data types](https://github.com/microsoft/qsharp-runtime/blob/613ff5d2a5b20fa538fb2188148b07fa7f71d703/src/Qir/Runtime/public/QirTypes.hpp#L20) - -`int`. - -The C++ API, non-fixed-size data types, and some other features, were about to cause issues for the users. - - ## Information Sources * [C/C++ Standards and Late Drafts](https://stackoverflow.com/a/83763/6362941) @@ -683,33 +629,3 @@ One of the late drafts of C17. * IEEE Std 754™-2019. * [How do I safely pass objects, especially STL objects, to and from a DLL?](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll/22797419#22797419) (Stackoverflow answer). * ISO/IEC 60559, Edition 2.0 2020-05. - - -## Questions Still To Resolve - -(Temporary section, would be nice to resolve during the code review. Please provide your opinion) - -### Do we want to move from a _set_ of dynamic libraries to a _single_ dynamic library? -The current set is - * Microsoft.Quantum.Qir.QSharp.Core.dll - * Microsoft.Quantum.Qir.QSharp.Foundation.dll - * Microsoft.Quantum.Qir.Runtime.dll - -If they are merged to a single dynamic library, then they can use a full power of C++ to interact with each other, -but still _not to expose_ C++. -If not, then they will have to -* either follow these guidelines and expose a subset of C, even for interacting with each other, -* or to expose C++ (for interacting with each other), with the risk that the other users will start using the exposed C++. - -### Do we need the IR wrappers? -The known reason for the IR wrappers (in `*.ll` files) to exist is that the IR wrappers names start with the double underscore `__`. - -The identifiers starting with double underscore are reserved in the standards of C and C++, see -[[C17_N2176]](https://web.archive.org/web/20181230041359if_/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf), -7.1.3 Reserved identifiers: -> **All identifiers that begin with an underscore and** either an uppercase letter or -**another underscore are always reserved** for any use... - -Are there any other reasons? -Can Q# compiler generate the QIR such that instead of the IR wrappers the corresponding wrapped functions are called? -(without `__` in the beginning) \ No newline at end of file From 5bf25fedbce9f9ee7e625beaf24a902c08776d67 Mon Sep 17 00:00:00 2001 From: Robin Kuzmin Date: Mon, 7 Jun 2021 10:44:48 -0700 Subject: [PATCH 3/6] Links fix --- src/Qir/QIR-RT-API-Design-Guidelines.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md index 2f1d1a34a5a..c041f50ad27 100644 --- a/src/Qir/QIR-RT-API-Design-Guidelines.md +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -7,22 +7,22 @@ * [Know the Issues When Crossing the ABI Boundary](#know-the-issues-when-crossing-the-abi-boundary) * [Be Aware of Different Compilers for Different Binaries](#be-aware-of-different-compilers-for-different-binaries) * [Expose as Little as Necessary](#expose-as-little-as-necessary) - * [Expose C Only (and IR Wrappers For It)](#expose-c-only-(and-ir-wrappers-for-it)) + * [Expose C Only](#expose-c-only) * [Allocate and Deallocate In The Same Binary](#allocate-and-deallocate-in-the-same-binary) * [Avoid Exposing Entities With Multiple Layouts](#avoid-exposing-entities-with-multiple-layouts) * [Be Careful When Exposing Certain Data Types](#be-careful-when-exposing-certain-data-types) * [Integer Types](#integer-types) - * [`char`](#`char`) - * [`enum`](#`enum`) - * [`bool`](#`bool`) + * [`char`](#char) + * [`enum`](#enum) + * [`bool`](#bool) * [Floating-Point Types](#floating-point-types) - * [Don't Let the C++ Exceptions Cross the ABI Boundary](#don't-let-the-c++-exceptions-cross-the-abi-boundary) - * [The `noexcept` Specifier and Exception Specification](#the-`noexcept`-specifier-and-exception-specification) + * [Don't Let the C++ Exceptions Cross the ABI Boundary](#dont-let-the-c-exceptions-cross-the-abi-boundary) + * [The `noexcept` Specifier and Exception Specification](#the-noexcept-specifier-and-exception-specification) * [Let the User Decide Whether to Crash or Not](#let-the-user-decide-whether-to-crash-or-not) * [Considerations Taken Into Account](#considerations-taken-into-account) - * [The `bool` Type](#the-`bool`-type) - * [The `char` Type](#the-`char`-type) - * [The `enum` Type](#the-`enum`-type) + * [The `bool` Type](#the-bool-type) + * [The `char` Type](#the-char-type) + * [The `enum` Type](#the-enum-type) * [The Floating-Point Types](#the-floating-point-types) * [The Integer Types](#the-integer-types) * [Information Sources](#information-sources) From 5ad1d5c6a94fb5aaacbf778bac411f8e6e025b1b Mon Sep 17 00:00:00 2001 From: Robin Kuzmin Date: Mon, 7 Jun 2021 11:06:48 -0700 Subject: [PATCH 4/6] Link Fix --- src/Qir/QIR-RT-API-Design-Guidelines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md index c041f50ad27..b56987b0ac6 100644 --- a/src/Qir/QIR-RT-API-Design-Guidelines.md +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -85,7 +85,7 @@ The more code we write, the more bugs we can introduce, the more care is needed and while porting to a different platform. The higher is the risk of identifier conflict with the other people's code and new features of the language. The more identifiers we use, the less freedom remains for a different use of those identifiers. -### Expose C Only (and IR Wrappers For It) +### Expose C Only Do not expose the C++ identifiers and features (not present in C) - the C++ data types (e.g. `class`), templates, function overloads, namespaces, etc. To facilitate that for yourself and for the others, consider strictly separating From e0ff6d10277f8ff8b3ced47604fb6366a9bfb4ca Mon Sep 17 00:00:00 2001 From: DmitryVasilevsky <60718360+DmitryVasilevsky@users.noreply.github.com> Date: Fri, 11 Jun 2021 12:49:30 -0700 Subject: [PATCH 5/6] Use of exceptions recommendation (#719) * Use of exceptions recommendation * Update src/Qir/QIR-RT-API-Design-Guidelines.md * Co-authored-by: Mathias Soeken --- src/Qir/QIR-RT-API-Design-Guidelines.md | 148 ++++++++++-------------- 1 file changed, 60 insertions(+), 88 deletions(-) diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md index b56987b0ac6..f3f1492176f 100644 --- a/src/Qir/QIR-RT-API-Design-Guidelines.md +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -353,16 +353,67 @@ The plan was to replace the exception throw with the [`exit()`](https://www.cplu or [`terminate()`](https://www.cplusplus.com/reference/exception/terminate) or [`abort()`](https://www.cplusplus.com/reference/cstdlib/abort) or other alternatives. -To minimize the risk of getting the C++ exceptions crossing the ABI boundary, minimize the use of the throwing calls. -Consider the non-throwing versions of the calls that you use. -E.g. -* Consider the _nothrow_ version of the [`operator new()`](https://www.cplusplus.com/reference/new/operator%20new/) -(for single elements) and -[`operator new[]()`](https://www.cplusplus.com/reference/new/operator%20new[]/) (for arrays), e.g. +On the other hand, exceptions are a natural part of the C++ standard. It is virtually impossible to use +the standard library (STL, std namespace) and not to use exceptions. All _exceptional_ situations are +reported by using exceptions (such as [`out_of_range`](https://en.cppreference.com/w/cpp/error/out_of_range) or [`invalid_argument`](https://en.cppreference.com/w/cpp/error/invalid_argument) to name a few). So it is safe to assume +that the C++ code we are developing will throw exceptions even if they aren't thrown right in the code we write. + +Recommendation is to use std namespace and exceptions in C++ code **only where appropriate**. Many recommendations +exist on how to use exceptions properly and minimize performance penalties. +* As part of [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). +* MSVC Documentation: [Modern C++ best practices for exceptions and error handling](https://docs.microsoft.com/en-us/cpp/cpp/errors-and-exception-handling-modern-cpp?view=msvc-160) and [How to: Design for exception safety](https://docs.microsoft.com/en-us/cpp/cpp/how-to-design-for-exception-safety?view=msvc-160). +* [Exceptions and Error Handling](https://isocpp.org/wiki/faq/exceptions) As part of Modern C++ FAQ. +* Google guidelines recommend against exceptions, but [allow them in windows code, especially while using STL](https://google.github.io/styleguide/cppguide.html#Windows_Code). +* [Vishal Chovatiya Guidelines](http://www.vishalchovatiya.com/7-best-practices-for-exception-handling-in-cpp-with-example/), which may be easier to read. +This section will describe only exceptions with regards to the runtime API. + +* Do not handle hardware exceptions and out-of-memory exception + +It is not recommended to catch asynchronous harware exceptions. (MSVC compiler calls this +[Structured Exception Handling](https://docs.microsoft.com/en-us/cpp/cpp/structured-exception-handling-c-cpp?view=msvc-160).) +Examples of hardware exceptions are the exceptions thrown when accessing a forbidden memory location as a result of a null pointer access, stack overflow, corrupt pointer dereferencing, unaligned memory access, etc.. If such exceptional situation arises, application should fail fast and quit. Typically hardware exceptions are handled in a system level code, and our runtime is an application with +regards to the host OS. NOTE: The [MSVC's default exception handling behavior](https://docs.microsoft.com/en-us/cpp/build/reference/eh-exception-handling-model?view=msvc-160#default-exception-handling-behavior) does not conform to the C++ standard. Use the `/EHsc` flag to [conform to the standard](https://docs.microsoft.com/en-us/cpp/build/reference/eh-exception-handling-model?view=msvc-160#standard-c-exception-handling). + +It is also not recommended to handle out-of-memory exception as there's a very little chance of handling it right. +For such handling to work correctly there shouldn't be any memory allocation from the point of exception, in the stack +unwinding process, till the handling of exception and in the handling of exception until the out-of-memory situation +is resolved. Just let the application fail. For the same reason, using non-throwing version of operator new is not recommended. + +* Mark interface functions with `noexcept` + +If you implement an interface function mark it with the [noexcept specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec): + +Inside of your binary if you implement a function that should not throw exceptions, then mark it with the [noexcept specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec). + +> void interface_func() **noexcept** + +extern "C" functions aren't considered noexcept by default. Consider marking them noexcept. +Keep in mind that `noexcept` specifier is a C++ feature. If you need to add `noexcept` specifier +to a public header or any other header that should be compatible with a pure C compiler +make sure to use `__cplusplus` guards. For example: + ```c++ -int * myDynamicIntArray = new int[10]; // In case of allocation failure THROWS! Avoid using this. -int * myDynamicIntArray = new (std::nothrow) int[10]; // In case of allocation failure returns the nullptr. Consider this first. +#ifdef __cplusplus +#define NOEXCEPT noexcept // `noexcept` in C++ only. +extern "C" +{ +#else +#define NOEXCEPT // In C this macro is empty. +#endif + +.. // `#include`s +void f() NOEXCEPT; // Exposed function. It is `noexcept` if the header is included to the C++ source file. + +#ifdef __cplusplus +} // extern "C" +#endif ``` + +If exception is thrown (and not caught) within the noexcept function, the application will terminate. Termination will +ensure that exceptions will never cross the API boundaries. An overhead is incurred if a noexcept function +calls functions that can throw exceptions, so consider marking all functions that shouldn't throw exceptions as noexcept. +If you are calling functions that can throw exceptions, make sure to handle them properly. + * If using the [`dynamic_cast<>()`](https://en.cppreference.com/w/cpp/language/dynamic_cast), consider casting the _pointers_ rather than _references_, e.g. ```c++ @@ -370,86 +421,7 @@ MyClass myInstance = .. // Create an instance of a class that may or may not imp IMyInterface & myItfRef = dynamic_cast( myInstance); // Casts the _reference_, in case of failure THROWS! Avoid using this. IMyInterface * myItfPtr = dynamic_cast(&myInstance); // Casts the _pointer_, in case of failure returns the nullptr. Consider this first. ``` - -#### The `noexcept` Specifier and Exception Specification - -At the moment of writing there is _no_ recommendation about using or avoiding the -[`noexcept` specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec) or -[exception specification](https://en.cppreference.com/w/cpp/language/except_spec) -(for the C++ code internal to your binary), e.g. -> void func() **noexcept** -void func2() **throw()** - -This absence of recommendation is based on the materials listed below. Parts of them are definitely out-of-date. -If you have a strong opinion (and, for example, can demonstrate in more details -why `noexcept` and/or `throw()` are better than not using them) -then you are more than welcome to share your knowledge. - -Materials: -* (Current state) The [`noexcept` specifier](https://en.cppreference.com/w/cpp/language/noexcept_spec), -the fragments of a special interest: -> Non-throwing functions are permitted to call potentially-throwing functions .. -> Note that a `noexcept` specification on a function is not a compile-time check .. -> `void baz() noexcept { throw 42; }` - -* (1996) [[MEC++]](https://github.com/kuzminrobin/code_review_notes/blob/master/book_list.md#mec++) -"More Effective C++" by Scott Meyers, Item 14: Use exception specifications judiciously -* (2002.07) [[pl@es]](https://github.com/kuzminrobin/code_review_notes/blob/master/article_list.md#pl@es) -Herb Sutter. A Pragmatic Look at Exception Specifications -* (<=2004) [[eses]](https://github.com/kuzminrobin/code_review_notes/blob/master/article_list.md#eses) -Herb Sutter. Exception Safety and Exception Specifications: Are They Worth It? -Item 4. When is it worth it to write exception specifications on functions? Why would you choose to write one, or why not? -* (Current state) -[C++ Core Guidelines](https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Re-noexcept) -(search for "noexcept") - -### Let the User Decide Whether to Crash or Not - -In some scenarios it is reasonable for the application or algorithm to crash upon memory allocation failure. -But if the crash is silent then we are not sure what to do, to restart, or to chase and fix a bug, -or to run on a different machine with more memory, etc. - -We want at least a message in the console, or a pop-up window, telling the reason of a crash, -and even better providing other actionalble information, like a stack trace, values of the local variables and parameters -(and ideally an opportunity to break in the debugger and/or getting a core dump for the subsequent analysis). -Generating such an information may require from our application some extra memory which is already exhausted. -To work around this we may want our application to pre-allocate a resonable chunk of memory during the start (the _rescue chunk_), -and once our application hits the memory allocation failure, -the application will deallocate the rescue chunk and generate the premortal disgnostics. - -In a more strict enviroment we may want our application to run 24 hours a day, 365 days a year (as a part of a cloud service for example). -During such a lengthy run the memory can get extremely fragmented, but a significant part of it can still be availble in small pieces. -The attempt to allocate a resonably large piece can fail because of the fragmentation, but would succeed if the memory was not fragmented. -To work around, our application, upon memory allocation failure, may want to run the defragmentation algorithm, e.g. -* deallocate the rescue chunk, -* save the data to a file, -* delete the data from the memory (memory becomes nearly empty), -* pre-allocate the resue chunk again (for the next run of the defragmentation algorithm), -* read the data from the file back to the memory (memory becomes defragmented), - -Then the application will repeat the failed memory allocation operation, likely succeed, and continue -(or save the current state and clearly tell the user to run again (continue) on a machine with more memory). - -The same applies to the exhaustion of any other resourse, like a number of qubits, number of file handles, disk space, etc. - -To summarize, exhaustion of a resource is not necessarily a bug, and is not as severe as it might look. -It is something that can be worked around (or stopped and continued in a better environment). -And only the user (not runtime) knows the requirements and scenarios. -That is why it seems reasonable to let the _user_ decide whether to crash or not. - -**Take-away:** In case of resource axhaustion consider returning the failure to the user, rather than crashing. - -(From C/C++ runtime we don't expect -[`malloc()`](https://man7.org/linux/man-pages/man3/malloc.3.html), -[`fopen()`](https://en.cppreference.com/w/c/io/fopen), or -[`operator new()`](https://www.cplusplus.com/reference/new/operator%20new/) -to crash, right? -The users of QIR RT very likely don't expect the resource allocation functions to crash either) - -Some more details about handling the memory allocation failure can be found in -* [[EC++3]](https://github.com/kuzminrobin/code_review_notes/blob/master/book_list.md#ec++3) -Scott Meyers. _Effective C++: 55 Specific Ways to Improve Your Programs and Designs_ (3rd Edition). -Chapter 8: Customizing `new` and `delete`. +This way you can handle the erroneous situation without raising exceptions. ## Considerations Taken Into Account From 6e5a1b3a0d5c63c4253b1fa43182926fe28eb6c5 Mon Sep 17 00:00:00 2001 From: Robin Kuzmin Date: Fri, 11 Jun 2021 13:11:40 -0700 Subject: [PATCH 6/6] Minor corrections after merge --- src/Qir/QIR-RT-API-Design-Guidelines.md | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/src/Qir/QIR-RT-API-Design-Guidelines.md b/src/Qir/QIR-RT-API-Design-Guidelines.md index f3f1492176f..ccdbdbd7a5f 100644 --- a/src/Qir/QIR-RT-API-Design-Guidelines.md +++ b/src/Qir/QIR-RT-API-Design-Guidelines.md @@ -17,8 +17,6 @@ * [`bool`](#bool) * [Floating-Point Types](#floating-point-types) * [Don't Let the C++ Exceptions Cross the ABI Boundary](#dont-let-the-c-exceptions-cross-the-abi-boundary) - * [The `noexcept` Specifier and Exception Specification](#the-noexcept-specifier-and-exception-specification) - * [Let the User Decide Whether to Crash or Not](#let-the-user-decide-whether-to-crash-or-not) * [Considerations Taken Into Account](#considerations-taken-into-account) * [The `bool` Type](#the-bool-type) * [The `char` Type](#the-char-type) @@ -354,16 +352,16 @@ or [`terminate()`](https://www.cplusplus.com/reference/exception/terminate) or [`abort()`](https://www.cplusplus.com/reference/cstdlib/abort) or other alternatives. On the other hand, exceptions are a natural part of the C++ standard. It is virtually impossible to use -the standard library (STL, std namespace) and not to use exceptions. All _exceptional_ situations are +the standard library (STL, `std` namespace) and not to use exceptions. All _exceptional_ situations are reported by using exceptions (such as [`out_of_range`](https://en.cppreference.com/w/cpp/error/out_of_range) or [`invalid_argument`](https://en.cppreference.com/w/cpp/error/invalid_argument) to name a few). So it is safe to assume that the C++ code we are developing will throw exceptions even if they aren't thrown right in the code we write. -Recommendation is to use std namespace and exceptions in C++ code **only where appropriate**. Many recommendations +Recommendation is to use `std` namespace and exceptions in C++ code **only where appropriate**. Many recommendations exist on how to use exceptions properly and minimize performance penalties. * As part of [C++ core guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). * MSVC Documentation: [Modern C++ best practices for exceptions and error handling](https://docs.microsoft.com/en-us/cpp/cpp/errors-and-exception-handling-modern-cpp?view=msvc-160) and [How to: Design for exception safety](https://docs.microsoft.com/en-us/cpp/cpp/how-to-design-for-exception-safety?view=msvc-160). * [Exceptions and Error Handling](https://isocpp.org/wiki/faq/exceptions) As part of Modern C++ FAQ. -* Google guidelines recommend against exceptions, but [allow them in windows code, especially while using STL](https://google.github.io/styleguide/cppguide.html#Windows_Code). +* Google guidelines recommend against exceptions, but [allow them in Windows code, especially while using STL](https://google.github.io/styleguide/cppguide.html#Windows_Code). * [Vishal Chovatiya Guidelines](http://www.vishalchovatiya.com/7-best-practices-for-exception-handling-in-cpp-with-example/), which may be easier to read. This section will describe only exceptions with regards to the runtime API. @@ -387,9 +385,9 @@ Inside of your binary if you implement a function that should not throw exceptio > void interface_func() **noexcept** -extern "C" functions aren't considered noexcept by default. Consider marking them noexcept. +The `extern "C"` functions aren't considered `noexcept` by default. Consider marking them `noexcept`. Keep in mind that `noexcept` specifier is a C++ feature. If you need to add `noexcept` specifier -to a public header or any other header that should be compatible with a pure C compiler +to a public header or any other header that should be compatible with a C compiler make sure to use `__cplusplus` guards. For example: ```c++ @@ -409,9 +407,9 @@ void f() NOEXCEPT; // Exposed function. It is `noexcept` if the header is inc #endif ``` -If exception is thrown (and not caught) within the noexcept function, the application will terminate. Termination will +If exception is thrown (and not caught) within the `noexcept` function, the application will terminate. Termination will ensure that exceptions will never cross the API boundaries. An overhead is incurred if a noexcept function -calls functions that can throw exceptions, so consider marking all functions that shouldn't throw exceptions as noexcept. +calls functions that can throw exceptions, so consider marking all functions that shouldn't throw exceptions as `noexcept`. If you are calling functions that can throw exceptions, make sure to handle them properly. * If using the [`dynamic_cast<>()`](https://en.cppreference.com/w/cpp/language/dynamic_cast),