Commit a1c61ac

[mlir][OpenMP] Allow composite SIMD REDUCTION and IF (#147568)
Reduction support: #146671

IF support is fixed in this PR.

The problem with the IF clause in composite constructs was that wsloop and simd both operate on the same CanonicalLoopInfo structure, with the SIMD processed first, followed by the wsloop. Previously the IF clause generated code like

```
if (cond) {
  while (...) {
    simd_loop_body;
  }
} else {
  while (...) {
    nonsimd_loop_body;
  }
}
```

The problem with this is that it invalidates the CanonicalLoopInfo structure that the wsloop processes later. To avoid this, this patch preserves the original loop and moves the IF clause inside it:

```
while (...) {
  if (cond) {
    simd_loop_body;
  } else {
    non_simd_loop_body;
  }
}
```

On the simple examples I tried, LLVM was able to hoist the if condition out of the loop at -O3.

The disadvantage is that we cannot add the llvm.loop.vectorize.enable attribute to either the SIMD or the non-SIMD loop, because they both share a loop back edge. There's no way of solving this without keeping the old design of two different loops, which cannot be represented using only one CanonicalLoopInfo structure.

I don't think the presence or absence of this attribute makes much difference. In my testing it is the llvm.loop.parallel_accesses metadata that makes the difference to vectorization. LLVM will vectorize if legal whether or not this attribute is there in the TRUE branch. In the FALSE branch this means the loop might be vectorized even when the condition is false, but I think this is still standards compliant: OpenMP 6.0 says that when the if clause is false, the construct should be treated as if the SIMDLEN clause were one, and the SIMDLEN clause is defined as a "hint". For the same reason, SIMDLEN and SAFELEN clauses are silently ignored when SIMD IF is used.

I think it is better to implement SIMD IF and ignore SIMDLEN, SAFELEN, and some vectorization-encouragement metadata when combined with IF than to ignore IF, because IF could have correctness consequences whereas the rest are optimization hints. For example, the user might use the IF clause to disable SIMD programmatically when it is known not to be safe to vectorize the loop. In that case it is not at all safe to add the parallel access or SAFELEN metadata.
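The two control-flow shapes described in the message can be compared with a small C++ sketch. This is an illustration only, not the IR the builder emits: `versioned_loops`, `branch_in_body`, `simd_body`, and `scalar_body` are hypothetical names, and both bodies deliberately compute the same value because in the real lowering only the attached metadata differs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the two versions of the loop body.
static int simd_body(int x) { return 2 * x + 1; }
static int scalar_body(int x) { return 2 * x + 1; }

// Old shape: the branch selects one of two separate loops.
std::vector<int> versioned_loops(const std::vector<int> &in, bool cond) {
  std::vector<int> out(in.size());
  if (cond) {
    for (std::size_t i = 0; i < in.size(); ++i)
      out[i] = simd_body(in[i]);
  } else {
    for (std::size_t i = 0; i < in.size(); ++i)
      out[i] = scalar_body(in[i]);
  }
  return out;
}

// New shape: a single loop (one back edge) with the branch inside the body.
std::vector<int> branch_in_body(const std::vector<int> &in, bool cond) {
  std::vector<int> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (cond)
      out[i] = simd_body(in[i]);
    else
      out[i] = scalar_body(in[i]);
  }
  return out;
}
```

Both functions return the same result for either value of `cond`; the second keeps one loop that a later wsloop lowering can still treat as a single canonical loop.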
1 parent 58d79aa commit a1c61ac

8 files changed: +195 -130 lines changed

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h

Lines changed: 4 additions & 0 deletions
@@ -34,6 +34,9 @@ class CanonicalLoopInfo;
 struct TargetRegionEntryInfo;
 class OffloadEntriesInfoManager;
 class OpenMPIRBuilder;
+class Loop;
+class LoopAnalysis;
+class LoopInfo;
 
 /// Move the instruction after an InsertPoint to the beginning of another
 /// BasicBlock.
@@ -1114,6 +1117,7 @@ class OpenMPIRBuilder {
   /// \param NamePrefix Optional name prefix for if.then if.else blocks.
   void createIfVersion(CanonicalLoopInfo *Loop, Value *IfCond,
                        ValueMap<const Value *, WeakTrackingVH> &VMap,
+                       LoopAnalysis &LIA, LoopInfo &LI, llvm::Loop *L,
                        const Twine &NamePrefix = "");
 
 public:

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp

Lines changed: 66 additions & 39 deletions
@@ -5376,58 +5376,90 @@ void OpenMPIRBuilder::unrollLoopHeuristic(DebugLoc, CanonicalLoopInfo *Loop) {
 
 void OpenMPIRBuilder::createIfVersion(CanonicalLoopInfo *CanonicalLoop,
                                       Value *IfCond, ValueToValueMapTy &VMap,
+                                      LoopAnalysis &LIA, LoopInfo &LI, Loop *L,
                                       const Twine &NamePrefix) {
   Function *F = CanonicalLoop->getFunction();
 
-  // Define where if branch should be inserted
-  Instruction *SplitBefore = CanonicalLoop->getPreheader()->getTerminator();
-
-  // TODO: We should not rely on pass manager. Currently we use pass manager
-  // only for getting llvm::Loop which corresponds to given CanonicalLoopInfo
-  // object. We should have a method which returns all blocks between
-  // CanonicalLoopInfo::getHeader() and CanonicalLoopInfo::getAfter()
-  FunctionAnalysisManager FAM;
-  FAM.registerPass([]() { return DominatorTreeAnalysis(); });
-  FAM.registerPass([]() { return LoopAnalysis(); });
-  FAM.registerPass([]() { return PassInstrumentationAnalysis(); });
+  // We can't do
+  // if (cond) {
+  //   simd_loop;
+  // } else {
+  //   non_simd_loop;
+  // }
+  // because then the CanonicalLoopInfo would only point to one of the loops:
+  // leading to other constructs operating on the same loop to malfunction.
+  // Instead generate
+  // while (...) {
+  //   if (cond) {
+  //     simd_body;
+  //   } else {
+  //     not_simd_body;
+  //   }
+  // }
+  // At least for simple loops, LLVM seems able to hoist the if out of the loop
+  // body at -O3
 
-  // Get the loop which needs to be cloned
-  LoopAnalysis LIA;
-  LoopInfo &&LI = LIA.run(*F, FAM);
-  Loop *L = LI.getLoopFor(CanonicalLoop->getHeader());
+  // Define where if branch should be inserted
+  auto SplitBeforeIt = CanonicalLoop->getBody()->getFirstNonPHIIt();
 
   // Create additional blocks for the if statement
-  BasicBlock *Head = SplitBefore->getParent();
-  Instruction *HeadOldTerm = Head->getTerminator();
-  llvm::LLVMContext &C = Head->getContext();
+  BasicBlock *Cond = SplitBeforeIt->getParent();
+  llvm::LLVMContext &C = Cond->getContext();
   llvm::BasicBlock *ThenBlock = llvm::BasicBlock::Create(
-      C, NamePrefix + ".if.then", Head->getParent(), Head->getNextNode());
+      C, NamePrefix + ".if.then", Cond->getParent(), Cond->getNextNode());
   llvm::BasicBlock *ElseBlock = llvm::BasicBlock::Create(
-      C, NamePrefix + ".if.else", Head->getParent(), CanonicalLoop->getExit());
+      C, NamePrefix + ".if.else", Cond->getParent(), CanonicalLoop->getExit());
 
   // Create if condition branch.
-  Builder.SetInsertPoint(HeadOldTerm);
+  Builder.SetInsertPoint(SplitBeforeIt);
   Instruction *BrInstr =
       Builder.CreateCondBr(IfCond, ThenBlock, /*ifFalse*/ ElseBlock);
   InsertPointTy IP{BrInstr->getParent(), ++BrInstr->getIterator()};
-  // Then block contains branch to omp loop which needs to be vectorized
+  // Then block contains branch to omp loop body which needs to be vectorized
   spliceBB(IP, ThenBlock, false, Builder.getCurrentDebugLocation());
-  ThenBlock->replaceSuccessorsPhiUsesWith(Head, ThenBlock);
+  ThenBlock->replaceSuccessorsPhiUsesWith(Cond, ThenBlock);
 
   Builder.SetInsertPoint(ElseBlock);
 
   // Clone loop for the else branch
   SmallVector<BasicBlock *, 8> NewBlocks;
 
-  VMap[CanonicalLoop->getPreheader()] = ElseBlock;
-  for (BasicBlock *Block : L->getBlocks()) {
+  SmallVector<BasicBlock *, 8> ExistingBlocks;
+  ExistingBlocks.reserve(L->getNumBlocks() + 1);
+  ExistingBlocks.push_back(ThenBlock);
+  ExistingBlocks.append(L->block_begin(), L->block_end());
+  // Cond is the block that has the if clause condition
+  // LoopCond is omp_loop.cond
+  // LoopHeader is omp_loop.header
+  BasicBlock *LoopCond = Cond->getUniquePredecessor();
+  BasicBlock *LoopHeader = LoopCond->getUniquePredecessor();
+  assert(LoopCond && LoopHeader && "Invalid loop structure");
+  for (BasicBlock *Block : ExistingBlocks) {
+    if (Block == L->getLoopPreheader() || Block == L->getLoopLatch() ||
+        Block == LoopHeader || Block == LoopCond || Block == Cond) {
+      continue;
+    }
     BasicBlock *NewBB = CloneBasicBlock(Block, VMap, "", F);
+
+    // fix name not to be omp.if.then
+    if (Block == ThenBlock)
+      NewBB->setName(NamePrefix + ".if.else");
+
     NewBB->moveBefore(CanonicalLoop->getExit());
     VMap[Block] = NewBB;
     NewBlocks.push_back(NewBB);
   }
   remapInstructionsInBlocks(NewBlocks, VMap);
   Builder.CreateBr(NewBlocks.front());
+
+  // The loop latch must have only one predecessor. Currently it is branched to
+  // from both the 'then' and 'else' branches.
+  L->getLoopLatch()->splitBasicBlock(
+      L->getLoopLatch()->begin(), NamePrefix + ".pre_latch", /*Before=*/true);
+
+  // Ensure that the then block is added to the loop so we add the attributes in
+  // the next step
+  L->addBasicBlockToLoop(ThenBlock, LI);
 }
 
 unsigned
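The pre-latch step at the end of createIfVersion can be pictured on a toy control-flow graph. The sketch below is a hypothetical model, not LLVM code: `Cfg`, `num_predecessor_edges`, `insert_pre_latch`, and `make_if_versioned_loop` are illustrative names, the block names are borrowed from the FileCheck test in this commit, and the cloned region blocks on the else path are collapsed into a single node for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy CFG: block name -> successor names. Not an LLVM data structure.
using Cfg = std::map<std::string, std::vector<std::string>>;

// Count the edges that target `block`.
std::size_t num_predecessor_edges(const Cfg &cfg, const std::string &block) {
  std::size_t n = 0;
  for (const auto &entry : cfg)
    for (const auto &succ : entry.second)
      if (succ == block)
        ++n;
  return n;
}

// Redirect every edge into `latch` through a fresh block `pre`, which then
// branches unconditionally to `latch`.
void insert_pre_latch(Cfg &cfg, const std::string &latch,
                      const std::string &pre) {
  for (auto &entry : cfg)
    for (auto &succ : entry.second)
      if (succ == latch)
        succ = pre;
  cfg[pre] = std::vector<std::string>{latch};
}

// Loop shape right after the if/else split: both branches reach the latch.
Cfg make_if_versioned_loop() {
  return {
      {"omp_loop.header", {"omp_loop.cond"}},
      {"omp_loop.cond", {"omp_loop.body", "omp_loop.exit"}},
      {"omp_loop.body", {"simd.if.then", "simd.if.else"}},
      {"simd.if.then", {"omp_loop.inc"}},
      {"simd.if.else", {"omp_loop.inc"}},
      {"omp_loop.inc", {"omp_loop.header"}},
      {"omp_loop.exit", {}},
  };
}
```

Calling `insert_pre_latch(cfg, "omp_loop.inc", "simd.pre_latch")` on the graph from `make_if_versioned_loop()` leaves `omp_loop.inc` with a single incoming edge, which is the property the comment in the patch says the latch must have.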
@@ -5483,20 +5515,7 @@ void OpenMPIRBuilder::applySimd(CanonicalLoopInfo *CanonicalLoop,
 
   if (IfCond) {
     ValueToValueMapTy VMap;
-    createIfVersion(CanonicalLoop, IfCond, VMap, "simd");
-    // Add metadata to the cloned loop which disables vectorization
-    Value *MappedLatch = VMap.lookup(CanonicalLoop->getLatch());
-    assert(MappedLatch &&
-           "Cannot find value which corresponds to original loop latch");
-    assert(isa<BasicBlock>(MappedLatch) &&
-           "Cannot cast mapped latch block value to BasicBlock");
-    BasicBlock *NewLatchBlock = dyn_cast<BasicBlock>(MappedLatch);
-    ConstantAsMetadata *BoolConst =
-        ConstantAsMetadata::get(ConstantInt::getFalse(Type::getInt1Ty(Ctx)));
-    addBasicBlockMetadata(
-        NewLatchBlock,
-        {MDNode::get(Ctx, {MDString::get(Ctx, "llvm.loop.vectorize.enable"),
-                           BoolConst})});
+    createIfVersion(CanonicalLoop, IfCond, VMap, LIA, LI, L, "simd");
   }
 
   SmallSet<BasicBlock *, 8> Reachable;
@@ -5530,6 +5549,14 @@ void OpenMPIRBuilder::applySimd(CanonicalLoopInfo *CanonicalLoop,
         Ctx, {MDString::get(Ctx, "llvm.loop.parallel_accesses"), AccessGroup}));
   }
 
+  // FIXME: the IF clause shares a loop backedge for the SIMD and non-SIMD
+  // versions so we can't add the loop attributes in that case.
+  if (IfCond) {
+    // we can still add llvm.loop.parallel_access
+    addLoopMetadata(CanonicalLoop, LoopMDList);
+    return;
+  }
+
   // Use the above access group metadata to create loop level
   // metadata, which should be distinct for each loop.
   ConstantAsMetadata *BoolConst =
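The effect of the early return in applySimd on loop-level metadata can be summarised with a small model. This is a simplified sketch under stated assumptions: `loop_metadata_names` is a hypothetical helper, the non-IF path (vectorize.enable plus an optional vectorize.width) is inferred from the unit-test expectations rather than shown in this hunk, and the `order` clause and the interaction between SAFELEN and parallel_accesses are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of which loop-level metadata ends up on the loop.
// `simdlen` of 0 means no SIMDLEN clause was given.
std::vector<std::string> loop_metadata_names(bool has_if_clause, int simdlen) {
  std::vector<std::string> names = {"llvm.loop.parallel_accesses"};
  if (has_if_clause) {
    // SIMD and non-SIMD bodies share one back edge, so stop here, mirroring
    // the early return added by this hunk.
    return names;
  }
  names.push_back("llvm.loop.vectorize.enable");
  if (simdlen > 0)
    names.push_back("llvm.loop.vectorize.width");
  return names;
}
```

With an IF clause the result contains only the parallel-accesses entry regardless of SIMDLEN, which matches the commit message's statement that SIMDLEN is silently ignored in that case.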

llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp

Lines changed: 22 additions & 11 deletions
@@ -2242,23 +2242,34 @@ TEST_F(OpenMPIRBuilderTest, ApplySimdIf) {
   PB.registerFunctionAnalyses(FAM);
   LoopInfo &LI = FAM.getResult<LoopAnalysis>(*F);
 
-  // Check if there are two loops (one with enabled vectorization)
+  // Check if there is one loop containing branches with and without
+  // vectorization
   const std::vector<Loop *> &TopLvl = LI.getTopLevelLoops();
-  EXPECT_EQ(TopLvl.size(), 2u);
+  EXPECT_EQ(TopLvl.size(), 1u);
 
   Loop *L = TopLvl[0];
   EXPECT_TRUE(findStringMetadataForLoop(L, "llvm.loop.parallel_accesses"));
-  EXPECT_TRUE(getBooleanLoopAttribute(L, "llvm.loop.vectorize.enable"));
-  EXPECT_EQ(getIntLoopAttribute(L, "llvm.loop.vectorize.width"), 3);
-
-  // The second loop should have disabled vectorization
-  L = TopLvl[1];
-  EXPECT_FALSE(findStringMetadataForLoop(L, "llvm.loop.parallel_accesses"));
+  // These attributes cannot not be set because the loop is shared between simd
+  // and non-simd versions
   EXPECT_FALSE(getBooleanLoopAttribute(L, "llvm.loop.vectorize.enable"));
-  // Check for llvm.access.group metadata attached to the printf
-  // function in the loop body.
+  EXPECT_EQ(getIntLoopAttribute(L, "llvm.loop.vectorize.width"), 0);
+
+  // Check for if condition
   BasicBlock *LoopBody = CLI->getBody();
-  EXPECT_TRUE(any_of(*LoopBody, [](Instruction &I) {
+  BranchInst *IfCond = cast<BranchInst>(LoopBody->getTerminator());
+  EXPECT_EQ(IfCond->getCondition(), IfCmp);
+  BasicBlock *TrueBranch = IfCond->getSuccessor(0);
+  BasicBlock *FalseBranch = IfCond->getSuccessor(1)->getUniqueSuccessor();
+
+  // Check for llvm.access.group metadata attached to the printf
+  // function in the true body.
+  EXPECT_TRUE(any_of(*TrueBranch, [](Instruction &I) {
+    return I.getMetadata("llvm.access.group") != nullptr;
+  }));
+
+  // Check for llvm.access.group metadata attached to the printf
+  // function in the false body.
+  EXPECT_FALSE(any_of(*FalseBranch, [](Instruction &I) {
     return I.getMetadata("llvm.access.group") != nullptr;
   }));
 }

mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

Lines changed: 0 additions & 35 deletions
@@ -702,30 +702,6 @@ static void forwardArgs(LLVM::ModuleTranslation &moduleTranslation,
     moduleTranslation.mapValue(arg, moduleTranslation.lookupValue(var));
 }
 
-/// Helper function to map block arguments defined by ignored loop wrappers to
-/// LLVM values and prevent any uses of those from triggering null pointer
-/// dereferences.
-///
-/// This must be called after block arguments of parent wrappers have already
-/// been mapped to LLVM IR values.
-static LogicalResult
-convertIgnoredWrapper(omp::LoopWrapperInterface opInst,
-                      LLVM::ModuleTranslation &moduleTranslation) {
-  // Map block arguments directly to the LLVM value associated to the
-  // corresponding operand. This is semantically equivalent to this wrapper not
-  // being present.
-  return llvm::TypeSwitch<Operation *, LogicalResult>(opInst)
-      .Case([&](omp::SimdOp op) {
-        forwardArgs(moduleTranslation,
-                    cast<omp::BlockArgOpenMPOpInterface>(*op));
-        op.emitWarning() << "simd information on composite construct discarded";
-        return success();
-      })
-      .Default([&](Operation *op) {
-        return op->emitError() << "cannot ignore wrapper";
-      });
-}
-
 /// Converts an OpenMP 'masked' operation into LLVM IR using OpenMPIRBuilder.
 static LogicalResult
 convertOmpMasked(Operation &opInst, llvm::IRBuilderBase &builder,
@@ -2852,17 +2828,6 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
   llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
   auto simdOp = cast<omp::SimdOp>(opInst);
 
-  // Ignore simd in composite constructs with unsupported clauses
-  // TODO: Replace this once simd + clause combinations are properly supported
-  if (simdOp.isComposite() &&
-      (simdOp.getReductionByref().has_value() || simdOp.getIfExpr())) {
-    if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
-      return failure();
-
-    return inlineConvertOmpRegions(simdOp.getRegion(), "omp.simd.region",
-                                   builder, moduleTranslation);
-  }
-
   if (failed(checkImplementationStatus(opInst)))
     return failure();

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+llvm.func @_QPfoo(%arg0: !llvm.ptr {fir.bindc_name = "array", llvm.nocapture}, %arg1: !llvm.ptr {fir.bindc_name = "t", llvm.nocapture}) {
+  %0 = llvm.mlir.constant(0 : i64) : i32
+  %1 = llvm.mlir.constant(1 : i32) : i32
+  %2 = llvm.mlir.constant(10 : i64) : i64
+  %3 = llvm.mlir.constant(1 : i64) : i64
+  %4 = llvm.alloca %3 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+  %5 = llvm.load %arg1 : !llvm.ptr -> i32
+  %6 = llvm.icmp "ne" %5, %0 : i32
+  %7 = llvm.trunc %2 : i64 to i32
+  omp.wsloop {
+    omp.simd if(%6) {
+      omp.loop_nest (%arg2) : i32 = (%1) to (%7) inclusive step (%1) {
+        llvm.store %arg2, %4 : i32, !llvm.ptr
+        %8 = llvm.load %4 : !llvm.ptr -> i32
+        %9 = llvm.sext %8 : i32 to i64
+        %10 = llvm.getelementptr %arg0[%9] : (!llvm.ptr, i64) -> !llvm.ptr, i32
+        llvm.store %8, %10 : i32, !llvm.ptr
+        omp.yield
+      }
+    } {omp.composite}
+  } {omp.composite}
+  llvm.return
+}
+
+// CHECK-LABEL: @_QPfoo
+// ...
+// CHECK: omp_loop.preheader: ; preds =
+// CHECK: store i32 0, ptr %[[LB_ADDR:.*]], align 4
+// CHECK: store i32 9, ptr %[[UB_ADDR:.*]], align 4
+// CHECK: store i32 1, ptr %[[STEP_ADDR:.*]], align 4
+// CHECK: %[[VAL_15:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK: call void @__kmpc_for_static_init_4u(ptr @1, i32 %[[VAL_15]], i32 34, ptr %{{.*}}, ptr %[[LB_ADDR]], ptr %[[UB_ADDR]], ptr %[[STEP_ADDR]], i32 1, i32 0)
+// CHECK: %[[LB:.*]] = load i32, ptr %[[LB_ADDR]], align 4
+// CHECK: %[[UB:.*]] = load i32, ptr %[[UB_ADDR]], align 4
+// CHECK: %[[VAL_18:.*]] = sub i32 %[[UB]], %[[LB]]
+// CHECK: %[[COUNT:.*]] = add i32 %[[VAL_18]], 1
+// CHECK: br label %[[OMP_LOOP_HEADER:.*]]
+// CHECK: omp_loop.header: ; preds = %[[OMP_LOOP_INC:.*]], %[[OMP_LOOP_PREHEADER:.*]]
+// CHECK: %[[IV:.*]] = phi i32 [ 0, %[[OMP_LOOP_PREHEADER]] ], [ %[[NEW_IV:.*]], %[[OMP_LOOP_INC]] ]
+// CHECK: br label %[[OMP_LOOP_COND:.*]]
+// CHECK: omp_loop.cond: ; preds = %[[OMP_LOOP_HEADER]]
+// CHECK: %[[VAL_25:.*]] = icmp ult i32 %[[IV]], %[[COUNT]]
+// CHECK: br i1 %[[VAL_25]], label %[[OMP_LOOP_BODY:.*]], label %[[OMP_LOOP_EXIT:.*]]
+// CHECK: omp_loop.body: ; preds = %[[OMP_LOOP_COND]]
+// CHECK: %[[VAL_28:.*]] = add i32 %[[IV]], %[[LB]]
+// This is the IF clause:
+// CHECK: br i1 %{{.*}}, label %[[SIMD_IF_THEN:.*]], label %[[SIMD_IF_ELSE:.*]]
+
+// CHECK: simd.if.then: ; preds = %[[OMP_LOOP_BODY]]
+// CHECK: %[[VAL_29:.*]] = mul i32 %[[VAL_28]], 1
+// CHECK: %[[VAL_30:.*]] = add i32 %[[VAL_29]], 1
+// CHECK: br label %[[VAL_33:.*]]
+// CHECK: omp.loop_nest.region: ; preds = %[[SIMD_IF_THEN]]
+// This version contains !llvm.access.group metadata for SIMD
+// CHECK: store i32 %[[VAL_30]], ptr %{{.*}}, align 4, !llvm.access.group !1
+// CHECK: %[[VAL_34:.*]] = load i32, ptr %{{.*}}, align 4, !llvm.access.group !1
+// CHECK: %[[VAL_35:.*]] = sext i32 %[[VAL_34]] to i64
+// CHECK: %[[VAL_36:.*]] = getelementptr i32, ptr %[[VAL_37:.*]], i64 %[[VAL_35]]
+// CHECK: store i32 %[[VAL_34]], ptr %[[VAL_36]], align 4, !llvm.access.group !1
+// CHECK: br label %[[OMP_REGION_CONT3:.*]]
+// CHECK: omp.region.cont3: ; preds = %[[VAL_33]]
+// CHECK: br label %[[SIMD_PRE_LATCH:.*]]
+
+// CHECK: simd.pre_latch: ; preds = %[[OMP_REGION_CONT3]], %[[OMP_REGION_CONT35:.*]]
+// CHECK: br label %[[OMP_LOOP_INC]]
+// CHECK: omp_loop.inc: ; preds = %[[SIMD_PRE_LATCH]]
+// CHECK: %[[NEW_IV]] = add nuw i32 %[[IV]], 1
+// CHECK: br label %[[OMP_LOOP_HEADER]], !llvm.loop !2
+
+// CHECK: simd.if.else: ; preds = %[[OMP_LOOP_BODY]]
+// CHECK: br label %[[SIMD_IF_ELSE2:.*]]
+// CHECK: simd.if.else5:
+// CHECK: %[[MUL:.*]] = mul i32 %[[VAL_28]], 1
+// CHECK: %[[ADD:.*]] = add i32 %[[MUL]], 1
+// CHECK: br label %[[LOOP_NEST_REGION:.*]]
+// CHECK: omp.loop_nest.region6: ; preds = %[[SIMD_IF_ELSE2]]
+// No llvm.access.group metadata for else clause
+// CHECK: store i32 %[[ADD]], ptr %{{.*}}, align 4
+// CHECK: %[[VAL_42:.*]] = load i32, ptr %{{.*}}, align 4
+// CHECK: %[[VAL_43:.*]] = sext i32 %[[VAL_42]] to i64
+// CHECK: %[[VAL_44:.*]] = getelementptr i32, ptr %[[VAL_37]], i64 %[[VAL_43]]
+// CHECK: store i32 %[[VAL_42]], ptr %[[VAL_44]], align 4
+// CHECK: br label %[[OMP_REGION_CONT35]]
+// CHECK: omp.region.cont37: ; preds = %[[LOOP_NEST_REGION]]
+// CHECK: br label %[[SIMD_PRE_LATCH]]
+
+// CHECK: omp_loop.exit: ; preds = %[[OMP_LOOP_COND]]
+// CHECK: call void @__kmpc_for_static_fini(ptr @1, i32 %[[VAL_15]])
+// CHECK: %[[VAL_45:.*]] = call i32 @__kmpc_global_thread_num(ptr @1)
+// CHECK: call void @__kmpc_barrier(ptr @2, i32 %[[VAL_45]])
+
+// CHECK: !1 = distinct !{}
+// CHECK: !2 = distinct !{!2, !3}
+// CHECK: !3 = !{!"llvm.loop.parallel_accesses", !1}
+// CHECK-NOT: llvm.loop.vectorize
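For readers more used to source-level OpenMP, the test input above corresponds roughly to a worksharing loop combined with a conditional simd. The C++ below is a hypothetical analogue (the MLIR was derived from Fortran, and `foo`, `array`, and `t` are only borrowed from the `bindc_name` attributes); it assumes a compiler that accepts the OpenMP 5.0 `if(simd: ...)` modifier, and it still runs correctly as serial code when OpenMP is disabled and the pragma is ignored.

```cpp
#include <cassert>
#include <vector>

// Fill array[1..10] with the loop index. Vectorization is requested only
// when *t is nonzero, mirroring `omp.simd if(%6)` nested in `omp.wsloop`.
void foo(int *array, const int *t) {
  const bool use_simd = (*t != 0);
#pragma omp for simd if (simd : use_simd)
  for (int i = 1; i <= 10; ++i)
    array[i] = i;
}
```

The stored values are the same whichever way the condition goes; the IF clause only controls whether the iteration body is marked as safe to vectorize.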

mlir/test/Target/LLVMIR/openmp-llvm.mlir

Lines changed: 0 additions & 2 deletions
@@ -820,8 +820,6 @@ llvm.func @simd_if(%arg0: !llvm.ptr {fir.bindc_name = "n"}, %arg1: !llvm.ptr {fi
 }
 // Be sure that llvm.loop.vectorize.enable metadata appears twice
 // CHECK: llvm.loop.parallel_accesses
-// CHECK-NEXT: llvm.loop.vectorize.enable
-// CHECK: llvm.loop.vectorize.enable
 
 // -----

mlir/test/Target/LLVMIR/openmp-reduction.mlir

Lines changed: 6 additions & 3 deletions
@@ -637,9 +637,12 @@ llvm.func @wsloop_simd_reduction(%lb : i64, %ub : i64, %step : i64) {
 // Outlined function.
 // CHECK: define internal void @[[OUTLINED]]
 
-// Private reduction variable and its initialization.
+// reduction variable in wsloop
 // CHECK: %[[PRIVATE:.+]] = alloca float
+// reduction variable in simd
+// CHECK: %[[PRIVATE2:.+]] = alloca float
 // CHECK: store float 0.000000e+00, ptr %[[PRIVATE]]
+// CHECK: store float 0.000000e+00, ptr %[[PRIVATE2]]
 
 // Call to the reduction function.
 // CHECK: call i32 @__kmpc_reduce
@@ -659,9 +662,9 @@ llvm.func @wsloop_simd_reduction(%lb : i64, %ub : i64, %step : i64) {
 
 // Update of the private variable using the reduction region
 // (the body block currently comes after all the other blocks).
-// CHECK: %[[PARTIAL:.+]] = load float, ptr %[[PRIVATE]]
+// CHECK: %[[PARTIAL:.+]] = load float, ptr %[[PRIVATE2]]
 // CHECK: %[[UPDATED:.+]] = fadd float 2.000000e+00, %[[PARTIAL]]
-// CHECK: store float %[[UPDATED]], ptr %[[PRIVATE]]
+// CHECK: store float %[[UPDATED]], ptr %[[PRIVATE2]]
 
 // Reduction function.
 // CHECK: define internal void @[[REDFUNC]]
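The updated CHECK lines expect two private copies of the reduction variable, one for the wsloop and one for the simd, with the per-iteration `fadd float 2.000000e+00` applied to the simd copy. A source-level loop with the same flavour might look like the C++ sketch below; `sum_twos` is a hypothetical name and the exact source behind the MLIR test is not shown in this commit, so treat it as an analogue only.

```cpp
#include <cassert>

// Each iteration adds 2.0f to a reduction variable shared by the
// worksharing loop and the simd construct.
float sum_twos(int n) {
  float sum = 0.0f;
#pragma omp parallel for simd reduction(+ : sum)
  for (int i = 0; i < n; ++i)
    sum += 2.0f;
  return sum;
}
```

For small `n` the result is exactly `2 * n` whatever order the partial sums are combined in, so the function behaves the same with or without OpenMP enabled.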
