[LLVM] Emit fp16/fp32 builtins directly into target module #12877

kparzysz-quic · 2022-09-22T20:22:53Z

For conversions between `_Float16` and `float`, LLVM uses the runtime functions `__extendhfsf2` and `__truncsfhf2`. On X86, up until version 14, LLVM used `uint16_t` for representing `_Float16`. Starting with LLVM 15, half-precision values can be passed in XMM registers (i.e. as floating-point). This happens when the compilation target has SSE2 enabled (either directly, or by enabling a feature that implies SSE2).
Because the names of the conversion functions remain unchanged, it is impossible for TVM to provide them in the runtime and have them work in both cases. To solve this issue, emit these functions directly into the target module after detecting whether or not to use the floating-point ABI. To allow the linker to remove duplicates (or unused copies), the functions are weak and reside in a separate section.
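To make the description above concrete, here is a minimal C++ sketch of the half-to-float direction under the integer ABI (half carried as `uint16_t`), written as a weak function placed in its own section. It is not the code this PR emits (the PR generates LLVM IR inside the target module). The function name `sketch_extendhfsf2`, the macro `SKETCH_FP16_ATTRS`, and the section name `.text.sketch.fp16` are hypothetical and chosen only for illustration. It assumes a GCC- or Clang-compatible compiler and IEEE-754 binary32 `float`.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical attribute macro: weak linkage everywhere, plus a dedicated
// section on ELF targets only (arbitrary section names may be rejected by
// other object formats such as Mach-O).
#if defined(__ELF__)
#define SKETCH_FP16_ATTRS __attribute__((weak, section(".text.sketch.fp16")))
#else
#define SKETCH_FP16_ATTRS __attribute__((weak))
#endif

// Integer-ABI variant: the half-precision value arrives as raw bits in a
// uint16_t. Under the floating-point ABI the same symbol name would instead
// receive the value in a floating-point register, which is why a single
// precompiled runtime definition cannot serve both cases.
extern "C" SKETCH_FP16_ATTRS float sketch_extendhfsf2(uint16_t h) {
  const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  const uint32_t exp = (h >> 10) & 0x1Fu;  // 5-bit biased exponent
  uint32_t man = h & 0x3FFu;               // 10-bit mantissa
  uint32_t bits;

  if (exp == 0x1Fu) {
    // Infinity or NaN: all-ones exponent, mantissa payload shifted up.
    bits = sign | 0x7F800000u | (man << 13);
  } else if (exp != 0) {
    // Normal number: rebias exponent from 15 to 127 (add 112).
    bits = sign | ((exp + 112u) << 23) | (man << 13);
  } else if (man != 0) {
    // Subnormal half: normalize so the implicit leading bit is restored.
    int shift = 0;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      ++shift;
    }
    man &= 0x3FFu;  // drop the now-implicit leading bit
    bits = sign | (static_cast<uint32_t>(113 - shift) << 23) | (man << 13);
  } else {
    // Signed zero.
    bits = sign;
  }

  float result;
  std::memcpy(&result, &bits, sizeof(result));  // bit-cast without UB
  return result;
}
```

For example, `sketch_extendhfsf2(0x3C00)` yields `1.0f`. Because the definition is weak, several object files can each carry a copy and the linker keeps one; keeping it in a separate section is what lets an unused copy be discarded, for instance with section garbage collection.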

tmoreau89 · 2022-09-22T21:28:48Z

CC @Lunderberg @areusch @masahi

tmoreau89 · 2022-09-23T15:21:13Z

@kparzysz-quic it appears one of the CI unit tests is failing: test_minimal_target_codegen_llvm.py::test_llvm_add_pipeline

Are you able to reproduce this yourself with the LLVM version used in CI?

kparzysz-quic · 2022-09-23T16:25:27Z

> Are you able to reproduce this yourself with the LLVM version used in CI?

I'm pretty sure it was due to a section name that was invalid for MachO. I limited the section specification to apply to ELF only.

kparzysz-quic · 2022-09-23T22:00:11Z

😎

masahi · 2022-09-26T09:05:55Z

Thanks @kparzysz-quic, I'll verify tomorrow.

masahi

I've verified that this fixes the correctness issue I observed when using fp16 compute on LLVM 16.

masahi approved these changes Sep 27, 2022

masahi merged commit f64e933 into apache:main Sep 27, 2022

kparzysz-quic deleted the llvm-conv-fp16 branch September 27, 2022 12:16

supersat added a commit to supersat/tvm that referenced this pull request Oct 20, 2022

Revert "[LLVM] Emit fp16/fp32 builtins directly into target module (apache#12877)" (7f37cd7)

This reverts commit f64e933.

leandron mentioned this pull request Feb 1, 2023

TVM v0.11.0 Release Candidate Notes #13899

Closed

Lunderberg mentioned this pull request Aug 22, 2024

[CI][Windows] Workaround for error in Findzstd.cmake #17283

Closed
