[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes #148274
Currently a ptradd can only generate a scalar, or a series of per-lane scalars. An upcoming patch that expands VPWidenPointerInductionRecipe into smaller recipes needs to generate a vector ptradd, which is not possible today. This patch adds support for generating vectors by checking whether the offset operand is a vector; if it isn't, per-lane scalars are generated as usual.
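The decision described above can be sketched as a small standalone model. This is a toy illustration, not LLVM code: `PtrAddLowering`, `classifyPtrAdd` and `checkClassifyPtrAdd` are hypothetical names, and the two boolean parameters stand in for what the diff queries through `vputils::onlyFirstLaneUsed` and `State.hasVectorValue(getOperand(1))`.

```cpp
#include <cassert>

// Toy model only: these names are hypothetical and are not LLVM APIs.
enum class PtrAddLowering {
  FirstLaneScalar, // one scalar ptradd for lane 0
  PerLaneScalars,  // one scalar ptradd per lane
  Vector           // a single ptradd producing a vector of pointers
};

// Mirrors the decision described in the text:
//  - if only the first lane is used, a single scalar suffices;
//  - otherwise a vector offset yields a vector ptradd;
//  - otherwise scalars are generated for every lane, as before.
PtrAddLowering classifyPtrAdd(bool onlyFirstLaneUsed, bool offsetIsVector) {
  if (onlyFirstLaneUsed)
    return PtrAddLowering::FirstLaneScalar;
  return offsetIsVector ? PtrAddLowering::Vector
                        : PtrAddLowering::PerLaneScalars;
}

// Usage sketch: exercise all three outcomes.
inline void checkClassifyPtrAdd() {
  assert(classifyPtrAdd(true, false) == PtrAddLowering::FirstLaneScalar);
  assert(classifyPtrAdd(false, false) == PtrAddLowering::PerLaneScalars);
  assert(classifyPtrAdd(false, true) == PtrAddLowering::Vector);
}
```

In this model the vector case is the only new outcome; the other two correspond to the behaviour that existed before the change.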
Stacked on llvm#148273 to be able to use VPInstruction::PtrAdd.
This is the VPWidenPointerInductionRecipe equivalent of llvm#118638, with the motivation of allowing us to use the EVL as the induction step.
Most of the new VPlan transformation is a straightforward translation of the existing execute code.
VPUnrollPartAccessor unfortunately doesn't work outside of VPlanRecipes.cpp, so the transformation inspects the recipe's operands directly to determine whether it has been unrolled and which part it represents.
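As a rough check on the arithmetic that the expansion translates, the sketch below models addresses as plain integers and compares the expanded form against a scalar loop. It is a toy model rather than LLVM code: all function names are hypothetical, and it assumes that the recipe's third operand (the number of unrolled elements) equals VF times the unroll factor.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: byte addresses are plain integers and no LLVM API is used.
// Returns the lane addresses of every unrolled part, for each vector iteration.
std::vector<std::uint64_t> expandedAddresses(std::uint64_t start,
                                             std::int64_t step, unsigned vf,
                                             unsigned uf,
                                             unsigned vectorIterations) {
  std::vector<std::uint64_t> out;
  std::uint64_t pointerPhi = start; // %pointer.phi, shared by all parts
  for (unsigned it = 0; it < vectorIterations; ++it) {
    for (unsigned part = 0; part < uf; ++part) {
      for (unsigned lane = 0; lane < vf; ++lane) {
        // vector.gep = pointer.phi + (broadcast(VF * part) + stepvector) * step
        std::int64_t index = static_cast<std::int64_t>(vf) * part + lane;
        out.push_back(pointerPhi + static_cast<std::uint64_t>(index * step));
      }
    }
    // ptr.ind = pointer.phi + step * (VF * UF); assumed value of operand 2.
    pointerPhi +=
        static_cast<std::uint64_t>(step * static_cast<std::int64_t>(vf * uf));
  }
  return out;
}

// Reference: the original scalar loop visits start + i * step.
std::vector<std::uint64_t> scalarAddresses(std::uint64_t start,
                                           std::int64_t step, unsigned count) {
  std::vector<std::uint64_t> out;
  for (unsigned i = 0; i < count; ++i)
    out.push_back(start +
                  static_cast<std::uint64_t>(static_cast<std::int64_t>(i) * step));
  return out;
}

// Usage sketch: VF = 4, UF = 2, three vector iterations cover 24 elements.
inline void checkExpansionMatchesScalarLoop() {
  assert(expandedAddresses(1000, 4, 4, 2, 3) == scalarAddresses(1000, 4, 24));
}
```

With VF = 4, one part and a step of 3, the first iteration of this model yields the offsets 0, 3, 6, 9, which is the same shape as the constant offset vector in the ARM test update.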
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms
Author: Luke Lau (lukel97)
Patch is 39.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/148274.diff
10 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 40a55656bfa7e..2ca2e273392db 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -1043,21 +1043,6 @@ void VPlan::execute(VPTransformState *State) {
if (isa<VPWidenPHIRecipe>(&R))
continue;
- if (auto *WidenPhi = dyn_cast<VPWidenPointerInductionRecipe>(&R)) {
- assert(!WidenPhi->onlyScalarsGenerated(State->VF.isScalable()) &&
- "recipe generating only scalars should have been replaced");
- auto *GEP = cast<GetElementPtrInst>(State->get(WidenPhi));
- PHINode *Phi = cast<PHINode>(GEP->getPointerOperand());
-
- Phi->setIncomingBlock(1, VectorLatchBB);
-
- // Move the last step to the end of the latch block. This ensures
- // consistent placement of all induction updates.
- Instruction *Inc = cast<Instruction>(Phi->getIncomingValue(1));
- Inc->moveBefore(std::prev(VectorLatchBB->getTerminator()->getIterator()));
- continue;
- }
-
auto *PhiR = cast<VPSingleDefRecipe>(&R);
// VPInstructions currently model scalar Phis only.
bool NeedsScalar = isa<VPInstruction>(PhiR) ||
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 9a6e4b36397b3..6d658287fe738 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -958,8 +958,10 @@ class LLVM_ABI_FOR_TEST VPInstruction : public VPRecipeWithIRFlags,
ExtractPenultimateElement,
LogicalAnd, // Non-poison propagating logical And.
// Add an offset in bytes (second operand) to a base pointer (first
- // operand). Only generates scalar values (either for the first lane only or
- // for all lanes, depending on its uses).
+ // operand). The base pointer must be scalar, but the offset can be a
+ // scalar, multiple scalars, or a vector. If the offset is multiple scalars
+ // then it will generate multiple scalar values (either for the first lane
+ // only or for all lanes, depending on its uses).
PtrAdd,
// Returns a scalar boolean value, which is true if any lane of its
// (boolean) vector operands is true. It produces the reduced value across
@@ -998,7 +1000,7 @@ class LLVM_ABI_FOR_TEST VPInstruction : public VPRecipeWithIRFlags,
/// values per all lanes, stemming from an original ingredient. This method
/// identifies the (rare) cases of VPInstructions that do so as well, w/o an
/// underlying ingredient.
- bool doesGeneratePerAllLanes() const;
+ bool doesGeneratePerAllLanes(VPTransformState &State) const;
/// Returns true if we can generate a scalar for the first lane only if
/// needed.
@@ -2064,8 +2066,7 @@ class VPWidenIntOrFpInductionRecipe : public VPWidenInductionRecipe {
}
};
-class VPWidenPointerInductionRecipe : public VPWidenInductionRecipe,
- public VPUnrollPartAccessor<4> {
+class VPWidenPointerInductionRecipe : public VPWidenInductionRecipe {
bool IsScalarAfterVectorization;
public:
@@ -2093,19 +2094,14 @@ class VPWidenPointerInductionRecipe : public VPWidenInductionRecipe,
VP_CLASSOF_IMPL(VPDef::VPWidenPointerInductionSC)
- /// Generate vector values for the pointer induction.
- void execute(VPTransformState &State) override;
+ void execute(VPTransformState &State) override {
+ llvm_unreachable("cannot execute this recipe, should be expanded via "
+ "expandVPWidenPointerInductionRecipe");
+ }
/// Returns true if only scalar values will be generated.
bool onlyScalarsGenerated(bool IsScalable);
- /// Returns the VPValue representing the value of this induction at
- /// the first unrolled part, if it exists. Returns itself if unrolling did not
- /// take place.
- VPValue *getFirstUnrolledPartOperand() {
- return getUnrollPart(*this) == 0 ? this : getOperand(3);
- }
-
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
/// Print the recipe.
void print(raw_ostream &O, const Twine &Indent,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 75ade13b09d9c..1feb45abaa193 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -494,8 +494,9 @@ unsigned VPInstruction::getNumOperandsForOpcode(unsigned Opcode) {
}
#endif
-bool VPInstruction::doesGeneratePerAllLanes() const {
- return Opcode == VPInstruction::PtrAdd && !vputils::onlyFirstLaneUsed(this);
+bool VPInstruction::doesGeneratePerAllLanes(VPTransformState &State) const {
+ return Opcode == VPInstruction::PtrAdd && !vputils::onlyFirstLaneUsed(this) &&
+ !State.hasVectorValue(getOperand(1));
}
bool VPInstruction::canGenerateScalarForFirstLane() const {
@@ -848,10 +849,8 @@ Value *VPInstruction::generate(VPTransformState &State) {
return Builder.CreateLogicalAnd(A, B, Name);
}
case VPInstruction::PtrAdd: {
- assert(vputils::onlyFirstLaneUsed(this) &&
- "can only generate first lane for PtrAdd");
Value *Ptr = State.get(getOperand(0), VPLane(0));
- Value *Addend = State.get(getOperand(1), VPLane(0));
+ Value *Addend = State.get(getOperand(1), vputils::onlyFirstLaneUsed(this));
return Builder.CreatePtrAdd(Ptr, Addend, Name, getGEPNoWrapFlags());
}
case VPInstruction::AnyOf: {
@@ -911,9 +910,6 @@ InstructionCost VPInstruction::computeCost(ElementCount VF,
}
}
- assert(!doesGeneratePerAllLanes() &&
- "Should only generate a vector value or single scalar, not scalars "
- "for all lanes.");
return Ctx.TTI.getArithmeticInstrCost(getOpcode(), ResTy, Ctx.CostKind);
}
@@ -1001,7 +997,7 @@ void VPInstruction::execute(VPTransformState &State) {
bool GeneratesPerFirstLaneOnly = canGenerateScalarForFirstLane() &&
(vputils::onlyFirstLaneUsed(this) ||
isVectorToScalar() || isSingleScalar());
- bool GeneratesPerAllLanes = doesGeneratePerAllLanes();
+ bool GeneratesPerAllLanes = doesGeneratePerAllLanes(State);
if (GeneratesPerAllLanes) {
for (unsigned Lane = 0, NumLanes = State.VF.getFixedValue();
Lane != NumLanes; ++Lane) {
@@ -3690,87 +3686,6 @@ bool VPWidenPointerInductionRecipe::onlyScalarsGenerated(bool IsScalable) {
(!IsScalable || vputils::onlyFirstLaneUsed(this));
}
-void VPWidenPointerInductionRecipe::execute(VPTransformState &State) {
- assert(getInductionDescriptor().getKind() ==
- InductionDescriptor::IK_PtrInduction &&
- "Not a pointer induction according to InductionDescriptor!");
- assert(State.TypeAnalysis.inferScalarType(this)->isPointerTy() &&
- "Unexpected type.");
- assert(!onlyScalarsGenerated(State.VF.isScalable()) &&
- "Recipe should have been replaced");
-
- unsigned CurrentPart = getUnrollPart(*this);
-
- // Build a pointer phi
- Value *ScalarStartValue = getStartValue()->getLiveInIRValue();
- Type *ScStValueType = ScalarStartValue->getType();
-
- BasicBlock *VectorPH =
- State.CFG.VPBB2IRBB.at(getParent()->getCFGPredecessor(0));
- PHINode *NewPointerPhi = nullptr;
- if (CurrentPart == 0) {
- IRBuilder<>::InsertPointGuard Guard(State.Builder);
- if (State.Builder.GetInsertPoint() !=
- State.Builder.GetInsertBlock()->getFirstNonPHIIt())
- State.Builder.SetInsertPoint(
- State.Builder.GetInsertBlock()->getFirstNonPHIIt());
- NewPointerPhi = State.Builder.CreatePHI(ScStValueType, 2, "pointer.phi");
- NewPointerPhi->addIncoming(ScalarStartValue, VectorPH);
- NewPointerPhi->setDebugLoc(getDebugLoc());
- } else {
- // The recipe has been unrolled. In that case, fetch the single pointer phi
- // shared among all unrolled parts of the recipe.
- auto *GEP =
- cast<GetElementPtrInst>(State.get(getFirstUnrolledPartOperand()));
- NewPointerPhi = cast<PHINode>(GEP->getPointerOperand());
- }
-
- // A pointer induction, performed by using a gep
- BasicBlock::iterator InductionLoc = State.Builder.GetInsertPoint();
- Value *ScalarStepValue = State.get(getStepValue(), VPLane(0));
- Type *PhiType = State.TypeAnalysis.inferScalarType(getStepValue());
- Value *RuntimeVF = getRuntimeVF(State.Builder, PhiType, State.VF);
- // Add induction update using an incorrect block temporarily. The phi node
- // will be fixed after VPlan execution. Note that at this point the latch
- // block cannot be used, as it does not exist yet.
- // TODO: Model increment value in VPlan, by turning the recipe into a
- // multi-def and a subclass of VPHeaderPHIRecipe.
- if (CurrentPart == 0) {
- // The recipe represents the first part of the pointer induction. Create the
- // GEP to increment the phi across all unrolled parts.
- Value *NumUnrolledElems = State.get(getOperand(2), true);
-
- Value *InductionGEP = GetElementPtrInst::Create(
- State.Builder.getInt8Ty(), NewPointerPhi,
- State.Builder.CreateMul(
- ScalarStepValue,
- State.Builder.CreateTrunc(NumUnrolledElems, PhiType)),
- "ptr.ind", InductionLoc);
-
- NewPointerPhi->addIncoming(InductionGEP, VectorPH);
- }
-
- // Create actual address geps that use the pointer phi as base and a
- // vectorized version of the step value (<step*0, ..., step*N>) as offset.
- Type *VecPhiType = VectorType::get(PhiType, State.VF);
- Value *StartOffsetScalar = State.Builder.CreateMul(
- RuntimeVF, ConstantInt::get(PhiType, CurrentPart));
- Value *StartOffset =
- State.Builder.CreateVectorSplat(State.VF, StartOffsetScalar);
- // Create a vector of consecutive numbers from zero to VF.
- StartOffset = State.Builder.CreateAdd(
- StartOffset, State.Builder.CreateStepVector(VecPhiType));
-
- assert(ScalarStepValue == State.get(getOperand(1), VPLane(0)) &&
- "scalar step must be the same across all parts");
- Value *GEP = State.Builder.CreateGEP(
- State.Builder.getInt8Ty(), NewPointerPhi,
- State.Builder.CreateMul(StartOffset, State.Builder.CreateVectorSplat(
- State.VF, ScalarStepValue)),
- "vector.gep");
- State.set(this, GEP);
-}
-
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPWidenPointerInductionRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {
@@ -3929,11 +3844,6 @@ void VPWidenPHIRecipe::execute(VPTransformState &State) {
Value *Op0 = State.get(getOperand(0));
Type *VecTy = Op0->getType();
Instruction *VecPhi = State.Builder.CreatePHI(VecTy, 2, Name);
- // Manually move it with the other PHIs in case PHI recipes above this one
- // also inserted non-phi instructions.
- // TODO: Remove once VPWidenPointerInductionRecipe is also expanded in
- // convertToConcreteRecipes.
- VecPhi->moveBefore(State.Builder.GetInsertBlock()->getFirstNonPHIIt());
State.set(this, VecPhi);
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 581af67c88bf9..b96ac9f36bcd3 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2675,6 +2675,107 @@ expandVPWidenIntOrFpInduction(VPWidenIntOrFpInductionRecipe *WidenIVR,
WidenIVR->replaceAllUsesWith(WidePHI);
}
+/// Expand a VPWidenPointerInductionRecipe into executable recipes, for the
+/// initial value, phi and backedge value. In the following example:
+///
+/// <x1> vector loop: {
+/// vector.body:
+/// EMIT ir<%ptr.iv> = WIDEN-POINTER-INDUCTION %start, %step, %vf
+/// ...
+/// EMIT branch-on-count ...
+/// }
+///
+/// WIDEN-POINTER-INDUCTION will get expanded to:
+///
+/// <x1> vector loop: {
+/// vector.body:
+/// EMIT-SCALAR %pointer.phi = phi %start, %ptr.ind
+/// EMIT %mul = mul %stepvector, %step
+/// EMIT %vector.gep = ptradd %pointer.phi, %mul
+/// ...
+/// EMIT %ptr.ind = ptradd %pointer.phi, %vf
+/// EMIT branch-on-count ...
+/// }
+static void
+expandVPWidenPointerInductionRecipe(VPWidenPointerInductionRecipe *R,
+ VPTypeAnalysis &TypeInfo) {
+ VPlan *Plan = R->getParent()->getPlan();
+
+ assert(R->getInductionDescriptor().getKind() ==
+ InductionDescriptor::IK_PtrInduction &&
+ "Not a pointer induction according to InductionDescriptor!");
+ assert(TypeInfo.inferScalarType(R)->isPointerTy() && "Unexpected type.");
+ assert(!R->onlyScalarsGenerated(Plan->hasScalableVF()) &&
+ "Recipe should have been replaced");
+
+ unsigned CurrentPart = 0;
+ if (R->getNumOperands() > 3)
+ CurrentPart =
+ cast<ConstantInt>(R->getOperand(4)->getLiveInIRValue())->getZExtValue();
+
+ VPBuilder Builder(R);
+ DebugLoc DL = R->getDebugLoc();
+
+ // Build a pointer phi
+ VPPhi *Phi;
+ if (CurrentPart == 0) {
+ Phi = Builder.createScalarPhi({R->getStartValue()}, R->getDebugLoc(),
+ "pointer.phi");
+ } else {
+ // The recipe has been unrolled. In that case, fetch the single pointer phi
+ // shared among all unrolled parts of the recipe.
+ auto *PtrAdd = cast<VPInstruction>(R->getOperand(3));
+ Phi = cast<VPPhi>(PtrAdd->getOperand(0)->getDefiningRecipe());
+ }
+
+ Builder.setInsertPoint(R->getParent(), R->getParent()->getFirstNonPhi());
+
+ // A pointer induction, performed by using a gep
+ Type *PhiType = TypeInfo.inferScalarType(R->getStepValue());
+ VPValue *RuntimeVF = Builder.createScalarZExtOrTrunc(
+ &Plan->getVF(), PhiType, TypeInfo.inferScalarType(&Plan->getVF()), DL);
+ if (CurrentPart == 0) {
+ // The recipe represents the first part of the pointer induction. Create the
+ // GEP to increment the phi across all unrolled parts.
+ VPValue *NumUnrolledElems = Builder.createScalarZExtOrTrunc(
+ R->getOperand(2), PhiType, TypeInfo.inferScalarType(R->getOperand(2)),
+ DL);
+ VPValue *Offset = Builder.createNaryOp(
+ Instruction::Mul, {R->getStepValue(), NumUnrolledElems});
+
+ VPBuilder::InsertPointGuard Guard(Builder);
+ VPBasicBlock *ExitingBB =
+ Plan->getVectorLoopRegion()->getExitingBasicBlock();
+ Builder.setInsertPoint(ExitingBB,
+ ExitingBB->getTerminator()->getIterator());
+
+ VPValue *InductionGEP = Builder.createPtrAdd(Phi, Offset, DL, "ptr.ind");
+ Phi->addOperand(InductionGEP);
+ }
+
+ VPValue *CurrentPartV =
+ Plan->getOrAddLiveIn(ConstantInt::get(PhiType, CurrentPart));
+
+ // Create actual address geps that use the pointer phi as base and a
+ // vectorized version of the step value (<step*0, ..., step*N>) as offset.
+ VPValue *StartOffsetScalar =
+ Builder.createNaryOp(Instruction::Mul, {RuntimeVF, CurrentPartV});
+ VPValue *StartOffset =
+ Builder.createNaryOp(VPInstruction::Broadcast, StartOffsetScalar);
+ // Create a vector of consecutive numbers from zero to VF.
+ StartOffset = Builder.createNaryOp(
+ Instruction::Add,
+ {StartOffset,
+ Builder.createNaryOp(VPInstruction::StepVector, {}, PhiType)});
+
+ VPValue *PtrAdd = Builder.createPtrAdd(
+ Phi,
+ Builder.createNaryOp(Instruction::Mul, {StartOffset, R->getStepValue()}),
+ DL, "vector.gep");
+
+ R->replaceAllUsesWith(PtrAdd);
+}
+
void VPlanTransforms::dissolveLoopRegions(VPlan &Plan) {
// Replace loop regions with explicity CFG.
SmallVector<VPRegionBlock *> LoopRegions;
@@ -2711,6 +2812,12 @@ void VPlanTransforms::convertToConcreteRecipes(VPlan &Plan,
continue;
}
+ if (auto *WidenIVR = dyn_cast<VPWidenPointerInductionRecipe>(&R)) {
+ expandVPWidenPointerInductionRecipe(WidenIVR, TypeInfo);
+ ToRemove.push_back(WidenIVR);
+ continue;
+ }
+
if (auto *Expr = dyn_cast<VPExpressionRecipe>(&R)) {
Expr->decompose();
ToRemove.push_back(Expr);
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
index e58ea655d6098..5aee65fd1c59d 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-gep.ll
@@ -67,10 +67,8 @@ define void @pointer_induction_used_as_vector(ptr noalias %start.1, ptr noalias
; CHECK: vector.body:
; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi ptr [ [[START_2]], [[VECTOR_PH]] ], [ [[PTR_IND:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[TMP9:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP10:%.*]] = mul nuw i64 [[TMP9]], 2
; CHECK-NEXT: [[TMP11:%.*]] = mul i64 1, [[TMP6]]
-; CHECK-NEXT: [[TMP12:%.*]] = mul i64 [[TMP10]], 0
+; CHECK-NEXT: [[TMP12:%.*]] = mul i64 [[TMP6]], 0
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[TMP12]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP13:%.*]] = call <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
@@ -159,17 +157,16 @@ define void @pointer_induction(ptr noalias %start, i64 %N) {
; CHECK: vector.body:
; CHECK-NEXT: [[INDEX2:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi ptr [ [[START]], [[VECTOR_PH]] ], [ [[PTR_IND:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT: [[TMP7:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-NEXT: [[TMP8:%.*]] = mul nuw i64 [[TMP7]], 2
; CHECK-NEXT: [[TMP10:%.*]] = mul i64 1, [[TMP6]]
-; CHECK-NEXT: [[TMP11:%.*]] = mul i64 [[TMP8]], 0
+; CHECK-NEXT: [[TMP11:%.*]] = mul i64 [[TMP6]], 0
; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[TMP11]], i64 0
; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
; CHECK-NEXT: [[TMP12:%.*]] = call <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
-; CHECK-NEXT: [[TMP13:%.*]] = add <vscale x 2 x i64> [[DOTSPLAT]], [[TMP12]]
-; CHECK-NEXT: [[TMP14:%.*]] = mul <vscale x 2 x i64> [[TMP13]], splat (i64 1)
-; CHECK-NEXT: [[VECTOR_GEP:%.*]] = getelementptr i8, ptr [[POINTER_PHI]], <vscale x 2 x i64> [[TMP14]]
-; CHECK-NEXT: [[TMP15:%.*]] = extractelement <vscale x 2 x ptr> [[VECTOR_GEP]], i32 0
+; CHECK-NEXT: [[TMP20:%.*]] = extractelement <vscale x 2 x i64> [[DOTSPLAT]], i32 0
+; CHECK-NEXT: [[TMP21:%.*]] = extractelement <vscale x 2 x i64> [[TMP12]], i32 0
+; CHECK-NEXT: [[TMP13:%.*]] = add i64 [[TMP20]], [[TMP21]]
+; CHECK-NEXT: [[TMP14:%.*]] = mul i64 [[TMP13]], 1
+; CHECK-NEXT: [[TMP15:%.*]] = getelementptr i8, ptr [[POINTER_PHI]], i64 [[TMP14]]
; CHECK-NEXT: [[TMP16:%.*]] = getelementptr i8, ptr [[TMP15]], i32 0
; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <vscale x 2 x i8>, ptr [[TMP16]], align 1
; CHECK-NEXT: [[TMP17:%.*]] = add <vscale x 2 x i8> [[WIDE_LOAD]], splat (i8 1)
diff --git a/llvm/test/Transforms/LoopVectorize/ARM/mve-reg-pressure-vmla.ll b/llvm/test/Transforms/LoopVectorize/ARM/mve-reg-pressure-vmla.ll
index 4c29a3a0d1d01..6e16003f11757 100644
--- a/llvm/test/Transforms/LoopVectorize/ARM/mve-reg-pressure-vmla.ll
+++ b/llvm/test/Transforms/LoopVectorize/ARM/mve-reg-pressure-vmla.ll
@@ -29,14 +29,14 @@ define void @fn(i32 noundef %n, ptr %in, ptr %out) #0 {
; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
; CHECK: [[VECTOR_BODY]]:
; CHECK-NEXT: [[INDEX:%.*]] = phi i32 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi ptr [ [[IN]], %[[VECTOR_PH]] ], [ [[PTR_IND:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT: [[POINTER_PHI2:%.*]] = phi ptr [ [[OUT]], %[[VECTOR_PH]] ], [ [[PTR_IND3:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[POINTER_PHI2:%.*]] = phi ptr [ [[IN]], %[[VECTOR_PH]] ], [ [[PTR_IND3:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[POINTER_PHI:%.*]] = phi ptr [ [[OUT]], %[[VECTOR_PH]] ], [ [[PTR_IND6:%.*]], %[[VECTOR_BODY]] ]
; CHECK-NEXT: [[VECTOR_GEP:%.*]] = getelementptr i8, ptr [[POINTER_PHI]], <4 x i32> <i32 0, i32 3, i32 6, i32 9>
; CHECK-NEXT: [[VECTOR...
[truncated]