@kparzysz-quic
Contributor

For conversions between `_Float16` and `float`, LLVM uses the runtime functions `__extendhfsf2` and `__truncsfhf2`. On X86, up to and including version 14, LLVM represented `_Float16` as `uint16_t`. Starting with LLVM 15, half-precision values can be passed in XMM registers (i.e. as floating-point). This happens when the compilation target has SSE2 enabled (either directly, or by enabling a feature that implies SSE2).
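The rule described above can be written down as a small predicate. This is only a sketch of the condition as stated in this description; `X86PassesHalfAsFloat` and its parameters are hypothetical names, not part of TVM or LLVM.

```cpp
// Hypothetical helper (not a TVM or LLVM API): encodes the rule stated above
// for X86 targets only. Half-precision values are passed as floating-point
// (in XMM registers) when LLVM is version 15 or newer and SSE2 is enabled;
// otherwise they are passed as a 16-bit integer pattern.
bool X86PassesHalfAsFloat(int llvm_major_version, bool target_has_sse2) {
  return llvm_major_version >= 15 && target_has_sse2;
}
```

How the LLVM version and the SSE2 feature are actually queried is left out here, since that depends on the code generator's own target description.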
Because the names of the conversion functions remain unchanged, TVM cannot provide a single implementation in the runtime that works in both cases. To solve this, emit these functions directly into the target module after detecting whether the floating-point ABI is in use. So that the linker can merge potential duplicates (or drop the functions if they are unused), they are weak and reside in a separate section.
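For readers unfamiliar with what these two builtins compute, here is a minimal C++ sketch of the integer-ABI variant, where a half value travels as its 16-bit pattern. It is an illustration under assumptions, not the code this PR emits: the PR generates the functions inside the target module, the names `demo_extendhfsf2` and `demo_truncsfhf2` are hypothetical stand-ins chosen to avoid clashing with the compiler runtime, and `__attribute__((weak))` assumes GCC or Clang.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for __extendhfsf2 under the integer ABI:
// takes the 16-bit pattern of a half and returns the equivalent float.
extern "C" __attribute__((weak)) float demo_extendhfsf2(uint16_t h) {
  uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  uint32_t exp = (h >> 10) & 0x1Fu;
  uint32_t man = h & 0x3FFu;
  uint32_t bits;
  if (exp == 0) {
    if (man == 0) {
      bits = sign;  // signed zero
    } else {
      // Subnormal half: shift the mantissa up until the implicit bit appears.
      uint32_t shift = 0;
      while ((man & 0x400u) == 0) {
        man <<= 1;
        ++shift;
      }
      man &= 0x3FFu;
      bits = sign | ((113u - shift) << 23) | (man << 13);
    }
  } else if (exp == 0x1Fu) {
    bits = sign | 0x7F800000u | (man << 13);  // infinity or NaN
  } else {
    bits = sign | ((exp + 112u) << 23) | (man << 13);  // rebias 15 -> 127
  }
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Hypothetical stand-in for __truncsfhf2 under the integer ABI:
// rounds a float to the nearest half (ties to even) and returns its bits.
extern "C" __attribute__((weak)) uint16_t demo_truncsfhf2(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  uint32_t sign = (bits >> 16) & 0x8000u;
  uint32_t exp = (bits >> 23) & 0xFFu;
  uint32_t man = bits & 0x7FFFFFu;
  if (exp == 0xFFu) {  // infinity or NaN (NaN becomes a quiet NaN)
    return static_cast<uint16_t>(sign | 0x7C00u | (man ? 0x200u : 0u));
  }
  int32_t e = static_cast<int32_t>(exp) - 127 + 15;  // rebias 127 -> 15
  if (e >= 0x1F) return static_cast<uint16_t>(sign | 0x7C00u);  // overflow
  if (e <= 0) {
    if (e < -10) return static_cast<uint16_t>(sign);  // rounds to zero
    // Result is a subnormal half: shift the full 24-bit mantissa down.
    man |= 0x800000u;
    uint32_t shift = static_cast<uint32_t>(14 - e);  // between 14 and 24
    uint32_t half_man = man >> shift;
    uint32_t rem = man & ((1u << shift) - 1u);
    uint32_t halfway = 1u << (shift - 1u);
    if (rem > halfway || (rem == halfway && (half_man & 1u))) ++half_man;
    return static_cast<uint16_t>(sign | half_man);
  }
  uint32_t half = (static_cast<uint32_t>(e) << 10) | (man >> 13);
  uint32_t rem = man & 0x1FFFu;
  // A carry out of the mantissa bumps the exponent, which is the desired result.
  if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) ++half;
  return static_cast<uint16_t>(sign | half);
}
```

Under the floating-point ABI the same arithmetic applies, but the half-precision argument and return value would be typed as `_Float16` instead of `uint16_t`, which is why one fixed runtime definition cannot serve both cases.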

@tmoreau89
Contributor

CC @Lunderberg @areusch @masahi

@tmoreau89
Contributor

tmoreau89 commented Sep 23, 2022

@kparzysz-quic it appears one of the CI unit tests is failing: test_minimal_target_codegen_llvm.py::test_llvm_add_pipeline

Are you able to reproduce this yourself with the LLVM version used in CI?

@kparzysz-quic
Contributor Author

Are you able to reproduce this yourself with the LLVM version used in CI?

I'm pretty sure it was due to a section name that was invalid for MachO. I limited the section specification to apply to ELF only.
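A source-level analogue of that restriction, as a hedged sketch: Mach-O section specifiers take a `segment,section` form, so an ELF-style section name is only attached when the `__ELF__` macro is defined. The macro `DEMO_FP16_CONV_ATTRS`, the section name, and the function are hypothetical and assume GCC or Clang; the PR itself applies the equivalent condition inside the code generator rather than through a preprocessor guard.

```cpp
#include <cstdint>

// Hypothetical attribute macro, for illustration only.
#if defined(__ELF__)
// ELF: weak symbol placed in its own (made-up) section.
#define DEMO_FP16_CONV_ATTRS \
  __attribute__((weak, section(".text.demo.fp16.conv")))
#else
// Other object formats (e.g. Mach-O): keep the symbol weak, skip the section.
#define DEMO_FP16_CONV_ATTRS __attribute__((weak))
#endif

// Trivial placeholder body; a real conversion routine would go here.
extern "C" DEMO_FP16_CONV_ATTRS uint16_t demo_half_sign_bit(uint16_t h) {
  return static_cast<uint16_t>(h & 0x8000u);
}
```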

@kparzysz-quic
Contributor Author

😎

@masahi
Member

masahi commented Sep 26, 2022

Thanks @kparzysz-quic, I'll verify tomorrow.

Member

@masahi masahi left a comment

I've verified that this fixes the correctness issue I observed when using fp16 compute on LLVM 16.

@masahi masahi merged commit f64e933 into apache:main Sep 27, 2022
@kparzysz-quic kparzysz-quic deleted the llvm-conv-fp16 branch September 27, 2022 12:16
supersat added a commit to supersat/tvm that referenced this pull request Oct 20, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022