LLVM miscompiles consecutive `half` operations by using too much precision on several backends #97975

beetrees · 2024-07-07T22:33:59Z

Consider the following IR:

define half @f(half %a, half %b, half %c) {
    %d = fadd half %a, %b
    %e = fadd half %d, %c
    ret half %e
}

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
C-SKY (csky-unknown-linux-gnuabiv2)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu): Fixed by [loongarch][DAG][FREEZE] Fix crash when FREEZE a half(f16) type on loongarch #107791
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in On wasm32, half operation results aren't correctly rounded between each operation #96437
Xtensa (xtensa-none-elf)

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

Related to #97981.

The text was updated successfully, but these errors were encountered:

llvmbot · 2024-07-07T22:36:33Z

@llvm/issue-subscribers-backend-powerpc

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:34Z

@llvm/issue-subscribers-backend-hexagon

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:36Z

@llvm/issue-subscribers-backend-mips

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:36Z

@llvm/issue-subscribers-lld-wasm

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:37Z

@llvm/issue-subscribers-backend-m68k

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:38Z

@llvm/issue-subscribers-backend-msp430

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

llvmbot · 2024-07-07T22:36:50Z

@llvm/issue-subscribers-backend-webassembly

Author: None (beetrees)

Consider the following IR: ```llvm define half @f(half %a, half %b, half %c) { %d = fadd half %a, %b %e = fadd half %d, %c ret half %e } ```

On backends without native half support, LLVM generally lowers half by converting to float, performing the desired operation, and then converting back to half. For this to be a valid lowering, a conversion back to f16 must occur after each operation, otherwise the excess precision of float will affect the result. For example 65504.0 + 65504.0 + -65504.0 equals infinity if each operation is done at half precision, but will result in 65504.0 if the intermediate value is not rounded to a half but instead kept as a float. This excess runtime precision can lead to miscompilations similar to #89885.

However, it appears that on several backends LLVM will fail to round the intermediate result of consecutive half operations, leading to these kind of bugs. I'm filing this as a single issue rather than a separate issue for each backend as I believe this is an issue with LLVM's default handling of half operations rather than a problem in any specific backend (for instance, AFAIK the only backend-specific code LoongArch has for handling half was added in #94456).

By inspecting the assembly LLVM emits, I've discovered that at least the following backends appear to be affected (in brackets are a specific target triple that I've checked):

AVR (avr-unknown-unknown)
Hexagon (hexagon-unknown-linux-musl)
LoongArch (loongarch64-unknown-linux-gnu)
M68k (m68k-unknown-linux-gnu)
MIPS (mips64el-unknown-linux-gnuabi64)
MSP430 (msp430-none-elf)
PowerPC (powerpc64le-unknown-linux-gnu)
SPARC (sparc64-unknown-linux-gnu)
WASM (wasm32-unknown-wasi): Already reported in #96437

I also noticed that 32-bit ARM has this problem in LLVM 18 but not on the main branch. Looking through the issue tracker, it seems it's likely that this was fixed by #80440.

beetrees · 2024-07-07T23:02:15Z

I'm not sure that the missed-optimization label is correct: not truncating to half after each operation (the current behaviour) is presumably faster than doing the extra work; the issue here is that the current lowering is a miscompilation that violates the semantics of LLVM IR.

vchuravy · 2024-07-08T18:26:45Z

In Julia we handle this by inserting manual conversion operations: https://github.com/JuliaLang/julia/blob/master/src/llvm-demote-float16.cpp

While I am not a fan of this I think the current state is that this is a "frontend" problem, and not a backend issue.

GCC and Clang only recently introduced -fexcess-precision=16
I believe it was previously argued that the C standard allows for the widening of floating point operations with extended precision (the whole x87 debacle).

85d049a

beetrees · 2024-07-08T21:26:56Z

Nice workaround! However I think it's fair to say that frontends shouldn't be required to insert custom passes late in the LLVM optimisation pipeline in order for LLVM to not miscompile basic floating point operations. As the comment at the top of the file mentions, it relies on some other optimisation passes (instcombine) not running after it that could undo all its hard work.

LLVM IR requires that the basic floating point operations (fadd etc.) give accurately rounded deteministic results. While the LangRef itself is a bit ambiguous in this regard (see #60942), this determinism is relied upon by LLVM IR optimisations, and backends not following this requirement can lead to miscompilations that segfault or worse (see #89885 for an example using the x87 floating point bugs). What the C standard allows is relevant to C frontends like clang (which will explicitly fpext to float if it wants to use excess precision for _Float16 operations), but LLVM IR has its own independently-specified semantics. In fact, -fexcess-precision=16 relies on these LLVM IR semantics to work.

Given that this appears to have been fixed for 32-bit ARM in #80440, it seems that most of the fix will be changing softPromoteHalfType() to return true for these targets. The documentation for the function even explicitly states that returning false will lead to the miscompilations.

beetrees · 2024-07-08T21:42:07Z

@vchuravy I also notice the Julia pass also handles bfloat. Out of interest, do you know if there any targets where this is known to be an issue? The LLVM source code (here) appears to indicate that soft promotion is always used for bfloat, and spot checking a bunch of targets in compiler explorer I haven't been able to find any targets where there are excess precision bugs.

vchuravy · 2024-07-09T09:11:11Z

Out of interest, do you know if there any targets where this is known to be an issue?

@maleadt might remember, I think our experience with Float16 made us not trust LLVM handling on platforms that don't support BFloat16 and so we just extended our support there.

jcranmer-intel · 2024-07-09T20:32:22Z

GCC and Clang only recently introduced -fexcess-precision=16 I believe it was previously argued that the C standard allows for the widening of floating point operations with extended precision (the whole x87 debacle).

LLVM IR semantics for the floating-point types are that, e.g., fadd half means "do IEEE-754-precise binary16 addition", effectively as if FLT_EVAL_METHOD were 0. Deviations from that model are bugs that should be fixed; for the types like half that can be emulated in larger types without double rounding issues, that's not a difficult aspect; the x87 issue (where x87_fp80 induces double-rounding if you try to emulate double with forced-conversion) is much more difficult to fix, although still acknowledged to be a bug (see #44218).

The emulation of half via float happens during CodeGen legalization, which isn't a topic I'm terribly familiar with. Off-hand, with a little spelunking, it looks like @topperc implemented the correct lowering of half-via-float emulation in https://reviews.llvm.org/D73749 with a per-target opt-in mechanism, and this has been opted into by x86, ARM, AArch64, and RISC-V. This means that the default handling of half emulation appears to be wrong, which is worrying to me (especially as it means it's likely to be wrong for any other necessary smaller-via-larger-float emulation targets). On the other hand, it does suggest that there's a relatively easy path to fixing this.

topperc · 2024-07-09T21:34:50Z

The emulation of half via float happens during CodeGen legalization, which isn't a topic I'm terribly familiar with. Off-hand, with a little spelunking, it looks like @topperc implemented the correct lowering of half-via-float emulation in https://reviews.llvm.org/D73749 with a per-target opt-in mechanism, and this has been opted into by x86, ARM, AArch64, and RISC-V.

That is indeed that patch. At the time, I couldn't afford to spend the time fixing every target so I didn't try. Thus the opt-in mechanism.

This means that the default handling of half emulation appears to be wrong, which is worrying to me (especially as it means it's likely to be wrong for any other necessary smaller-via-larger-float emulation targets)

I don't think we use the broken TypePromoteFloat handling for anything other than f16. bf16 defaults to TypeSoftPromoteHalf.

Maybe I'll try changing the default for f16 for all targets and see what happens.

maleadt · 2024-07-12T10:17:25Z

Out of interest, do you know if there any targets where this is known to be an issue?

@maleadt might remember, I think our experience with Float16 made us not trust LLVM handling on platforms that don't support BFloat16 and so we just extended our support there.

Yeah, because of the uncertainty how LLVM would treat these operations we inserted the conversions to make sure the results can always be trusted (see JuliaLang/julia#51470 (comment) and the comments below).

beetrees · 2024-08-03T13:19:46Z

I have confirmed that the experimental C-SKY and Xtensa backends appear to also have this issue.

github-actions bot added the new issue label Jul 7, 2024

EugeneZelenko added backend:AVR backend:Hexagon backend:MIPS backend:MSP430 backend:PowerPC backend:Sparc lld:wasm backend:m68k backend:loongarch and removed new issue labels Jul 7, 2024

EugeneZelenko added backend:WebAssembly missed-optimization and removed lld:wasm labels Jul 7, 2024

beetrees changed the title ~~LLVM lowers consecutive half operations with too much precision on several backends~~ LLVM miscompiles consecutive half operations by using too much precision on several backends Jul 7, 2024

EugeneZelenko removed the missed-optimization label Jul 8, 2024

This was referenced Jul 8, 2024

LLVM miscompiles passing/returning half on several backends by using lossy conversions #97981

Open

Tracking Issue for f16 and f128 float types rust-lang/rust#116909

Open

dtcxzyw added miscompilation floating-point Floating-point math labels Jul 8, 2024

This was referenced Aug 8, 2024

WIP: try integrate BFloat16 eschnett/SIMD.jl#124

Open

Fix bfloat16 detection on aarch64 JuliaLang/julia#55417

Closed

giordano mentioned this issue Aug 12, 2024

Consider dropping bfloat16 demotion pass JuliaLang/julia#55479

Closed

This was referenced Aug 22, 2024

Enable f16 tests on loongarch rust-lang/rust#129384

Closed

Enable f16 tests on platforms that were missing conversion symbols rust-lang/rust#129385

Open

nikic mentioned this issue Sep 18, 2024

[loongarch][DAG][FREEZE] Fix crash when FREEZE a half(f16) type on loongarch #107791

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

LLVM miscompiles consecutive `half` operations by using too much precision on several backends #97975

LLVM miscompiles consecutive `half` operations by using too much precision on several backends #97975

beetrees commented Jul 7, 2024 •

edited by heiher

Loading

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

beetrees commented Jul 7, 2024 •

edited

Loading

vchuravy commented Jul 8, 2024 •

edited

Loading

beetrees commented Jul 8, 2024

beetrees commented Jul 8, 2024

vchuravy commented Jul 9, 2024

jcranmer-intel commented Jul 9, 2024

topperc commented Jul 9, 2024

maleadt commented Jul 12, 2024

beetrees commented Aug 3, 2024

LLVM miscompiles consecutive half operations by using too much precision on several backends #97975

LLVM miscompiles consecutive half operations by using too much precision on several backends #97975

Comments

beetrees commented Jul 7, 2024 • edited by heiher Loading

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

llvmbot commented Jul 7, 2024

beetrees commented Jul 7, 2024 • edited Loading

vchuravy commented Jul 8, 2024 • edited Loading

beetrees commented Jul 8, 2024

beetrees commented Jul 8, 2024

vchuravy commented Jul 9, 2024

jcranmer-intel commented Jul 9, 2024

topperc commented Jul 9, 2024

maleadt commented Jul 12, 2024

beetrees commented Aug 3, 2024

LLVM miscompiles consecutive `half` operations by using too much precision on several backends #97975

LLVM miscompiles consecutive `half` operations by using too much precision on several backends #97975

beetrees commented Jul 7, 2024 •

edited by heiher

Loading

beetrees commented Jul 7, 2024 •

edited

Loading

vchuravy commented Jul 8, 2024 •

edited

Loading