-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[DAG] SelectionDAG::canCreateUndefOrPoison - Mark AVGFLOORS and AVGCEILS as safe #148191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ILS as safe This patch updates `SelectionDAG::canCreateUndefOrPoison` to indicate that `ISD::AVGFLOORS` and `ISD::AVGCEILS` do not introduce poison or undef values. This is formally verified using Alive2: - AVGFLOORS: https://alive2.llvm.org/ce/z/JWZcNr - AVGCEILS: https://alive2.llvm.org/ce/z/cW3jrR These patterns are safe due to the use of `sext i8` into `i32`, which ensures no signed overflow occurs. The arithmetic is done in the wider domain before truncating safely back to `i8`. Includes test coverage to ensure correctness. Differential Revision: <fill this after uploading to Phabricator or GH>
@llvm/pr-subscribers-llvm-selectiondag Author: None (aabhinavg1) Changes…ILS as safe This patch updates This is formally verified using Alive2:
These patterns are safe due to the use of Includes test coverage to ensure correctness. Differential Revision: <fill this after uploading to Phabricator or GH> Full diff: https://github.com/llvm/llvm-project/pull/148191.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index c1356239ad206..78f00809e3862 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5542,6 +5542,10 @@ bool SelectionDAG::canCreateUndefOrPoison(SDValue Op, const APInt &DemandedElts,
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT:
+ case ISD::AVGFLOORS:
+ case ISD::AVGFLOORU:
+ case ISD::AVGCEILS:
+ case ISD::AVGCEILU:
case ISD::MULHU:
case ISD::MULHS:
case ISD::SMIN:
diff --git a/llvm/test/CodeGen/AArch64/avgfloor-u8.ll b/llvm/test/CodeGen/AArch64/avgfloor-u8.ll
new file mode 100644
index 0000000000000..669f6bbad14a3
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/avgfloor-u8.ll
@@ -0,0 +1,16 @@
+; RUN: llc < %s -march=arm64 -mcpu=apple-m1 | FileCheck %s
+
+; CHECK-LABEL: avg:
+; CHECK: add
+; CHECK: lsr
+; CHECK: ret
+
+define zeroext i8 @avg(i8 noundef zeroext %a, i8 noundef zeroext %b) {
+entry:
+ %conv = zext i8 %a to i16
+ %conv1 = zext i8 %b to i16
+ %add = add nuw nsw i16 %conv1, %conv
+ %div3 = lshr i16 %add, 1
+ %conv2 = trunc nuw i16 %div3 to i8
+ ret i8 %conv2
+}
|
@llvm/pr-subscribers-backend-aarch64 Author: None (aabhinavg1) Changes…ILS as safe This patch updates This is formally verified using Alive2:
These patterns are safe due to the use of Includes test coverage to ensure correctness. Differential Revision: <fill this after uploading to Phabricator or GH> Full diff: https://github.com/llvm/llvm-project/pull/148191.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index c1356239ad206..78f00809e3862 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -5542,6 +5542,10 @@ bool SelectionDAG::canCreateUndefOrPoison(SDValue Op, const APInt &DemandedElts,
case ISD::UADDSAT:
case ISD::SSUBSAT:
case ISD::USUBSAT:
+ case ISD::AVGFLOORS:
+ case ISD::AVGFLOORU:
+ case ISD::AVGCEILS:
+ case ISD::AVGCEILU:
case ISD::MULHU:
case ISD::MULHS:
case ISD::SMIN:
diff --git a/llvm/test/CodeGen/AArch64/avgfloor-u8.ll b/llvm/test/CodeGen/AArch64/avgfloor-u8.ll
new file mode 100644
index 0000000000000..669f6bbad14a3
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/avgfloor-u8.ll
@@ -0,0 +1,16 @@
+; RUN: llc < %s -march=arm64 -mcpu=apple-m1 | FileCheck %s
+
+; CHECK-LABEL: avg:
+; CHECK: add
+; CHECK: lsr
+; CHECK: ret
+
+define zeroext i8 @avg(i8 noundef zeroext %a, i8 noundef zeroext %b) {
+entry:
+ %conv = zext i8 %a to i16
+ %conv1 = zext i8 %b to i16
+ %add = add nuw nsw i16 %conv1, %conv
+ %div3 = lshr i16 %add, 1
+ %conv2 = trunc nuw i16 %div3 to i8
+ ret i8 %conv2
+}
|
Code change looks fine but I don't understand what the test you added actually shows.
Don't add this! We haven't used this since we moved away from Phabricator for code reviews. |
Also, I don't understand how your alive2 links are relevant. They verify an expansion of avgfloors/avgceils, but what does this have to do with introducing poison? |
Hi @jayfoad I included the Alive2 links to show that the lowering of AVGFLOORS and AVGCEILS avoids poison, but I get that it doesn't directly prove that the SelectionDAG op itself is safe. If you don’t mind, could you suggest what kind of test cases would better show that these ops don’t introduce poison or undef? |
You need to remove the noundef and compare the effect of freezing the result vs the inputs:
(if you remove either of the dst freezes it should fail) |
case ISD::AVGFLOORS: | ||
case ISD::AVGFLOORU: | ||
case ISD::AVGCEILS: | ||
case ISD::AVGCEILU: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need test coverage for all 4 patterns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @RKSimon
Thanks for the feedback!
Alive2 Proofs for AVG Operations
Opcode | Operation Type | Alive2 Proof Link |
---|---|---|
AVGFLOORS |
Signed Floor Avg | Alive2 Link |
AVGCEILS |
Signed Ceil Avg | Alive2 Link |
AVGFLOORU |
Unsigned Floor Avg | Alive2 Link |
AVGCEILU |
Unsigned Ceil Avg | Alive2 Link |
Let me know if need to add anything else.
Could you add a tests to llvm/test/CodeGen/AArch64/hadd-combine.ll that use the aarch64.neon.uhadd/urhadd intrinsics? They should be converted to these avg nodes so can be tested directly. |
Hi @davemgreen, Thanks for the suggestion! I've drafted the following test cases in llvm/test/CodeGen/AArch64/hadd-combine.ll to exercise the aarch64.neon.uhadd and aarch64.neon.urhadd intrinsics: ; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s
declare <8 x i8> @llvm.aarch64.neon.uhadd.v8i8(<8 x i8>, <8 x i8>)
declare <8 x i8> @llvm.aarch64.neon.urhadd.v8i8(<8 x i8>, <8 x i8>)
declare <16 x i8> @llvm.aarch64.neon.uhadd.v16i8(<16 x i8>, <16 x i8>)
declare <16 x i8> @llvm.aarch64.neon.urhadd.v16i8(<16 x i8>, <16 x i8>)
; CHECK-LABEL: test_uhadd_v8i8:
; CHECK: uhadd v0.8b, v0.8b, v1.8b
define <8 x i8> @test_uhadd_v8i8(<8 x i8> %a, <8 x i8> %b) {
%res = call <8 x i8> @llvm.aarch64.neon.uhadd.v8i8(<8 x i8> %a, <8 x i8> %b)
ret <8 x i8> %res
}
; CHECK-LABEL: test_urhadd_v8i8:
; CHECK: urhadd v0.8b, v0.8b, v1.8b
define <8 x i8> @test_urhadd_v8i8(<8 x i8> %a, <8 x i8> %b) {
%res = call <8 x i8> @llvm.aarch64.neon.urhadd.v8i8(<8 x i8> %a, <8 x i8> %b)
ret <8 x i8> %res
}
; CHECK-LABEL: test_uhadd_v16i8:
; CHECK: uhadd v0.16b, v0.16b, v1.16b
define <16 x i8> @test_uhadd_v16i8(<16 x i8> %a, <16 x i8> %b) {
%res = call <16 x i8> @llvm.aarch64.neon.uhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
ret <16 x i8> %res
}
; CHECK-LABEL: test_urhadd_v16i8:
; CHECK: urhadd v0.16b, v0.16b, v1.16b
define <16 x i8> @test_urhadd_v16i8(<16 x i8> %a, <16 x i8> %b) {
%res = call <16 x i8> @llvm.aarch64.neon.urhadd.v16i8(<16 x i8> %a, <16 x i8> %b)
ret <16 x i8> %res
} Would this be acceptable for verifying that the uhadd/urhadd intrinsics get correctly lowered into the corresponding DAG avg patterns? |
What could be done is to fold based on the knownbits/signbits result of a avgu/avgs node - if the freeze is moved then the knownbits/signbits calculation is exposed and can allow the fold to occur. |
Hi @RKSimon here is the test could you please look in to it if i have not missed anything ; RUN: opt -passes=instcombine -S < %s | FileCheck %s
; === Signed Floor Average (AVGFLOORS) ===
define i8 @avg_signed_floor_freeze_before(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_signed_floor_freeze_before(
; CHECK: freeze i8
entry:
%fa = freeze i8 %a
%fb = freeze i8 %b
%a32 = sext i8 %fa to i32
%b32 = sext i8 %fb to i32
%and = and i32 %a32, %b32
%xor = xor i32 %a32, %b32
%shr = ashr i32 %xor, 1
%sum = add i32 %and, %shr
%res = trunc i32 %sum to i8
ret i8 %res
}
define i8 @avg_signed_floor_freeze_after(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_signed_floor_freeze_after(
; CHECK: freeze i8
entry:
%a32 = sext i8 %a to i32
%b32 = sext i8 %b to i32
%and = and i32 %a32, %b32
%xor = xor i32 %a32, %b32
%shr = ashr i32 %xor, 1
%sum = add i32 %and, %shr
%res = trunc i32 %sum to i8
%fr = freeze i8 %res
ret i8 %fr
}
; === Signed Ceil Average (AVGCEILS) ===
define i8 @avg_signed_ceil_freeze_before(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_signed_ceil_freeze_before(
; CHECK: freeze i8
entry:
%fa = freeze i8 %a
%fb = freeze i8 %b
%a32 = sext i8 %fa to i32
%b32 = sext i8 %fb to i32
%add = add i32 %a32, %b32
%add1 = add i32 %add, 1
%shr = ashr i32 %add1, 1
%res = trunc i32 %shr to i8
ret i8 %res
}
define i8 @avg_signed_ceil_freeze_after(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_signed_ceil_freeze_after(
; CHECK: freeze i8
entry:
%a32 = sext i8 %a to i32
%b32 = sext i8 %b to i32
%add = add i32 %a32, %b32
%add1 = add i32 %add, 1
%shr = ashr i32 %add1, 1
%res = trunc i32 %shr to i8
%fr = freeze i8 %res
ret i8 %fr
}
; === Unsigned Floor Average (AVGFLOORU) ===
define i8 @avg_unsigned_floor_freeze_before(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_unsigned_floor_freeze_before(
; CHECK: freeze i8
entry:
%fa = freeze i8 %a
%fb = freeze i8 %b
%a32 = zext i8 %fa to i32
%b32 = zext i8 %fb to i32
%and = and i32 %a32, %b32
%xor = xor i32 %a32, %b32
%shr = lshr i32 %xor, 1
%sum = add i32 %and, %shr
%res = trunc i32 %sum to i8
ret i8 %res
}
define i8 @avg_unsigned_floor_freeze_after(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_unsigned_floor_freeze_after(
; CHECK: freeze i8
entry:
%a32 = zext i8 %a to i32
%b32 = zext i8 %b to i32
%and = and i32 %a32, %b32
%xor = xor i32 %a32, %b32
%shr = lshr i32 %xor, 1
%sum = add i32 %and, %shr
%res = trunc i32 %sum to i8
%fr = freeze i8 %res
ret i8 %fr
}
; === Unsigned Ceil Average (AVGCEILU) ===
define i8 @avg_unsigned_ceil_freeze_before(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_unsigned_ceil_freeze_before(
; CHECK: freeze i8
entry:
%fa = freeze i8 %a
%fb = freeze i8 %b
%a32 = zext i8 %fa to i32
%b32 = zext i8 %fb to i32
%add = add i32 %a32, %b32
%add1 = add i32 %add, 1
%shr = lshr i32 %add1, 1
%res = trunc i32 %shr to i8
ret i8 %res
}
define i8 @avg_unsigned_ceil_freeze_after(i8 %a, i8 %b) {
; CHECK-LABEL: @avg_unsigned_ceil_freeze_after(
; CHECK: freeze i8
entry:
%a32 = zext i8 %a to i32
%b32 = zext i8 %b to i32
%add = add i32 %a32, %b32
%add1 = add i32 %add, 1
%shr = lshr i32 %add1, 1
%res = trunc i32 %shr to i8
%fr = freeze i8 %res
ret i8 %fr
}
I’ve added
Could you please review if I’ve missed anything? |
This patch updates
SelectionDAG::canCreateUndefOrPoison
to indicate thatISD::AVGFLOORS
andISD::AVGCEILS
do not introduce poison or undef values.This is formally verified using Alive2:
These patterns are safe due to the use of
sext i8
intoi32
, which ensures no signed overflow occurs. The arithmetic is done in the wider domain before truncating safely back toi8
.Includes test coverage to ensure correctness.
Differential Revision: