-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[AMDGPU] AMDGPULateCodeGenPrepare Legacy PM: replace setPreservesAll()
with setPreservesCFG()
#148167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-analysis Author: Jim M. R. Teichgräber (J-MR-T) Changes<details><summary>This PR depends on #148165; the first commit (90f1d0a) belongs to that PR.</summary> As AMDGPULateCodeGenPrepare actually performs changes that invalidate Uniformity Analysis; use Note that before #148165, this would still have preserved Uniformity Analysis, hence the dependency. In addition, <details><summary>Note on why this hasn't caused issues so far</summary> It just so happens that AMDGPULateCodeGenPrepare is always immediately followed by AMDGPUUnifyDivergentExitNodes, which does invalidate most analyses, including Uniformity. And because UnifyDivergentExitNodes only looks at terminators, and LateCGP seemingly does not replace uniform values with divergent values, or divergent values with uniform values, and it only inserts new values that are not looked at by UnifyDivergentExitNodes, this bug remained hidden. </p> I ran Full diff: https://github.com/llvm/llvm-project/pull/148167.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/UniformityAnalysis.cpp b/llvm/lib/Analysis/UniformityAnalysis.cpp
index 2101fdfacfc8f..0daf4041a72f3 100644
--- a/llvm/lib/Analysis/UniformityAnalysis.cpp
+++ b/llvm/lib/Analysis/UniformityAnalysis.cpp
@@ -150,8 +150,11 @@ INITIALIZE_PASS_BEGIN(UniformityInfoWrapperPass, "uniformity",
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(CycleInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
+// Though Uniformity Analysis depends on the CFG,
+// it also needs to be invalidated if values are changed, so isCFGOnly: false.
+// See NOTE on updatability at the start of GenericUniformityImpl.h
INITIALIZE_PASS_END(UniformityInfoWrapperPass, "uniformity",
- "Uniformity Analysis", true, true)
+ "Uniformity Analysis", false, true)
void UniformityInfoWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
index f0d63f523088b..b109fb73deaa5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
@@ -546,7 +546,9 @@ class AMDGPULateCodeGenPrepareLegacy : public FunctionPass {
AU.addRequired<TargetPassConfig>();
AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<UniformityInfoWrapperPass>();
- AU.setPreservesAll();
+ // makes changes that can invalidate Uniformity Analysis,
+ // so don't preserveAll here (see new PM version above)
+ AU.setPreservesCFG();
}
bool runOnFunction(Function &F) override;
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index dd2ff2e013cc8..b23662c63338c 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -255,6 +255,7 @@
; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: AMDGPU IR late optimizations
; GCN-O1-NEXT: Post-Dominator Tree Construction
+; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: Unify divergent function exit nodes
; GCN-O1-NEXT: Dominator Tree Construction
; GCN-O1-NEXT: Cycle Info Analysis
@@ -557,6 +558,7 @@
; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: AMDGPU IR late optimizations
; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
+; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: Unify divergent function exit nodes
; GCN-O1-OPTS-NEXT: Dominator Tree Construction
; GCN-O1-OPTS-NEXT: Cycle Info Analysis
@@ -871,6 +873,7 @@
; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: AMDGPU IR late optimizations
; GCN-O2-NEXT: Post-Dominator Tree Construction
+; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: Unify divergent function exit nodes
; GCN-O2-NEXT: Dominator Tree Construction
; GCN-O2-NEXT: Cycle Info Analysis
@@ -1200,6 +1203,7 @@
; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: AMDGPU IR late optimizations
; GCN-O3-NEXT: Post-Dominator Tree Construction
+; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: Unify divergent function exit nodes
; GCN-O3-NEXT: Dominator Tree Construction
; GCN-O3-NEXT: Cycle Info Analysis
|
@llvm/pr-subscribers-backend-amdgpu Author: Jim M. R. Teichgräber (J-MR-T) Changes<details><summary>This PR depends on #148165; the first commit (90f1d0a) belongs to that PR.</summary> As AMDGPULateCodeGenPrepare actually performs changes that invalidate Uniformity Analysis; use Note that before #148165, this would still have preserved Uniformity Analysis, hence the dependency. In addition, <details><summary>Note on why this hasn't caused issues so far</summary> It just so happens that AMDGPULateCodeGenPrepare is always immediately followed by AMDGPUUnifyDivergentExitNodes, which does invalidate most analyses, including Uniformity. And because UnifyDivergentExitNodes only looks at terminators, and LateCGP seemingly does not replace uniform values with divergent values, or divergent values with uniform values, and it only inserts new values that are not looked at by UnifyDivergentExitNodes, this bug remained hidden. </p> I ran Full diff: https://github.com/llvm/llvm-project/pull/148167.diff 3 Files Affected:
diff --git a/llvm/lib/Analysis/UniformityAnalysis.cpp b/llvm/lib/Analysis/UniformityAnalysis.cpp
index 2101fdfacfc8f..0daf4041a72f3 100644
--- a/llvm/lib/Analysis/UniformityAnalysis.cpp
+++ b/llvm/lib/Analysis/UniformityAnalysis.cpp
@@ -150,8 +150,11 @@ INITIALIZE_PASS_BEGIN(UniformityInfoWrapperPass, "uniformity",
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(CycleInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
+// Though Uniformity Analysis depends on the CFG,
+// it also needs to be invalidated if values are changed, so isCFGOnly: false.
+// See NOTE on updatability at the start of GenericUniformityImpl.h
INITIALIZE_PASS_END(UniformityInfoWrapperPass, "uniformity",
- "Uniformity Analysis", true, true)
+ "Uniformity Analysis", false, true)
void UniformityInfoWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
index f0d63f523088b..b109fb73deaa5 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
@@ -546,7 +546,9 @@ class AMDGPULateCodeGenPrepareLegacy : public FunctionPass {
AU.addRequired<TargetPassConfig>();
AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<UniformityInfoWrapperPass>();
- AU.setPreservesAll();
+ // makes changes that can invalidate Uniformity Analysis,
+ // so don't preserveAll here (see new PM version above)
+ AU.setPreservesCFG();
}
bool runOnFunction(Function &F) override;
diff --git a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
index dd2ff2e013cc8..b23662c63338c 100644
--- a/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+++ b/llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
@@ -255,6 +255,7 @@
; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: AMDGPU IR late optimizations
; GCN-O1-NEXT: Post-Dominator Tree Construction
+; GCN-O1-NEXT: Uniformity Analysis
; GCN-O1-NEXT: Unify divergent function exit nodes
; GCN-O1-NEXT: Dominator Tree Construction
; GCN-O1-NEXT: Cycle Info Analysis
@@ -557,6 +558,7 @@
; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: AMDGPU IR late optimizations
; GCN-O1-OPTS-NEXT: Post-Dominator Tree Construction
+; GCN-O1-OPTS-NEXT: Uniformity Analysis
; GCN-O1-OPTS-NEXT: Unify divergent function exit nodes
; GCN-O1-OPTS-NEXT: Dominator Tree Construction
; GCN-O1-OPTS-NEXT: Cycle Info Analysis
@@ -871,6 +873,7 @@
; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: AMDGPU IR late optimizations
; GCN-O2-NEXT: Post-Dominator Tree Construction
+; GCN-O2-NEXT: Uniformity Analysis
; GCN-O2-NEXT: Unify divergent function exit nodes
; GCN-O2-NEXT: Dominator Tree Construction
; GCN-O2-NEXT: Cycle Info Analysis
@@ -1200,6 +1203,7 @@
; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: AMDGPU IR late optimizations
; GCN-O3-NEXT: Post-Dominator Tree Construction
+; GCN-O3-NEXT: Uniformity Analysis
; GCN-O3-NEXT: Unify divergent function exit nodes
; GCN-O3-NEXT: Dominator Tree Construction
; GCN-O3-NEXT: Cycle Info Analysis
|
CC @nhaehnle |
Your desciption contains HTML style markup tags. I don't think we want that in the git commit message when you squash-and-merge, do we? |
Ah, good point, I didn't consider that the description is included as well. Given that the notes aren't too long, always showing them shouldn't matter too much. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code change LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically LGTM, but a nit in the comments. (Also, this depends on #148165 which should go in first)
4b7737c
to
9b81632
Compare
(Now the first 3 commits belong to the dependent PR #148165, because of review changes there) |
…lse (#148165) Currently, Uniformity Analysis is preserved in the Legacy PM when a pass sets `setPreservesCFG()`. This is incorrect: any values' uniformity not only depends on the CFG, but also on the uniformity of other values, so a CFG-preserving change in many cases doesn't preserve uniformity analysis. This corrected behavior also matches the behavior of the new PM. --- On its own, this change does not affect the pass pipeline because of the happenstance of pass ordering. I also created a PR to change AMDGPULateCodeGenPrepare (#148167), this will have an actual impact on pass executions. That PR also includes changes to the `amdgpu/llc-pipeline.ll` test in order to check that this change works, but if this is preferred, I would also be happy to try to extend this PR to add an isolated test case; though my personal opinion is that the test in #148167 should suffice, as it should also accurately pinpoint failures related to this change. --- I ran `git-clang-format` on my changes. I tested them using the `check-llvm` target; no unexpected failures occurred.
Should be ready now :). If you agree, could I ask you to merge this one as well? I don't have commit perms. |
This PR depends on #148165; the first commit (90f1d0a) belongs to that PR. The changes are distinct, so separate PRs seemed like the best option. I don't have commit access, so I couldn't use user-branches to mark the dependency.
As AMDGPULateCodeGenPrepare actually performs changes that invalidate Uniformity Analysis; use
setPreservesCFG()
to mark this, instead ofsetPreservesAll()
which wrongly includes preserving Uniformity Analysis.Note that before #148165, this would still have preserved Uniformity Analysis, hence the dependency. In addition,
amdgpu/llc-pipeline.cc
needs to be changed when both changes are in effect, but those changes would make the test fail if the PRs weren't based on one another.Note on why this hasn't caused issues so far:
It just so happens that AMDGPULateCodeGenPrepare is always immediately followed by AMDGPUUnifyDivergentExitNodes, which does invalidate most analyses, including Uniformity. And because UnifyDivergentExitNodes only looks at terminators, and LateCGP seemingly does not replace uniform values with divergent values, or divergent values with uniform values, and it only inserts new values that are not looked at by UnifyDivergentExitNodes, this bug remained hidden.
I ran
git-clang-format
on my changes. I tested them using thecheck-llvm
target; no unexpected failures occurred after I made the change toamdgpu/llc-pipeline.ll
.