-
Notifications
You must be signed in to change notification settings - Fork 477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PLT-506: Case-of-case #5554
PLT-506: Case-of-case #5554
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quite a diff! I've only had a very cursory look so far, I'd need a ticket and some allocated time to review this one properly.
defaultUniMatcherLike :: Map.Map DefaultFun (BuiltinMatcherLike DefaultUni DefaultFun) | ||
defaultUniMatcherLike = Map.fromList | ||
[ (IfThenElse, BuiltinMatcherLike splitIfThenElse) | ||
, (ChooseUnit, BuiltinMatcherLike splitChooseUnit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lists and pairs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lists we can do, for pairs we don't actually have a matcher function!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lists we can do, for pairs we don't actually have a matcher function!
How come? How's MkCons a [a]
so much different from MkPair a b
? Isn't uncurry
the matcher?
... oh wait, I think I get it. We have ChooseList
, but no such builtin for pairs. This all is just so weird, we should have proper pattern matching builtins and forget about all this nonsense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. And it's slightly weird to treat chooseList
as a "matcher": I'd be treating it as a matcher with 0 arity branches, which is in fact fine, but is a little strange...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about chooseData?
That is, this guarantees that case-of-known constructor will fire and get rid of one of the | ||
matches entirely, which is great. | ||
|
||
If we *don't* have this property, then case-of-case can duplicate code, making the program bigger. | ||
So the conservative option is to only do case-of-case when this is true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, here they are. Is case-of-case interleaved with case-of-known-constructor? If not, can there still be an exponential blowup in code size before case-of-known-constructor cleans everything up? Worth discussing in this Note I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also note that I'm currently turning the non-conservative version on by default since it seems to help! I feel pretty nervous about that...
Good point about the interleaving: indeed, they run one after the other and only once in each simplifier iteration, so we should be safe, but worth pointing out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only once in each simplifier iteration, so we should be safe
Can you still get an exponential blowup due to transformOf
being a bottom-up traversal? First you duplicate some leaves, then you duplicate nodes storing the duplicated leaves, then you duplicate nodes storing the duplicated nodes etc.
No idea if it's how it actually works, but it's worth having some kind of a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that seems possible. Let me try and come up with a case where that could happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I can make it happen. I think if we're in conservative mode then COKC is still guaranteed to clear it up, so the worst that can happen is a big intermediate tree.
16155a3
to
c04d00e
Compare
So I realised that
That means I don't think we can have a strictly conservative case-of-case at all (or we'd have to assert that the branch bodies were all distinct conapps, which seems annoying). I'm not totally sure what to do then: turn it off entirely when in conservative mode, and turn it on universally otherwise (just forgetting the conapps-only mode?). |
|
Hmmm maybe we could fuse the passes. I think the issue will be that case-of-case creates opportunities for COKC in a subterm of the term currently being processed, so naive fusion won't work. We could directly call the COKC step from the case-of-case step if we think we've made an opportunity for it. That would avoid the problem partly, although if you look in the comment I added you'll see there's still a case where we can get duplication :( |
@@ -133,7 +137,7 @@ data CompilationCtx uni fun a = CompilationCtx { | |||
makeLenses ''CompilationCtx | |||
|
|||
toDefaultCompilationCtx | |||
:: (Default (PLC.BuiltinSemanticsVariant fun), Default (PLC.CostingPart uni fun)) | |||
:: (Ord fun, Default (PLC.BuiltinSemanticsVariant fun), Default (PLC.CostingPart uni fun)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated: This Default CostingPart
always makes me uneasy. It sucks that we have to pass some fake (zero) "default" costing just so we can have a runtime available to hand over to the EvaluateBuiltins pass (and perhaps other passes as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there another way to do it? Costing is a part of the evaluation semantics of a builtin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in principle we could make it optional? But then we'd need to pass Maybe cost
around in various places which would have overhead in the machine, so this seems better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the compiler does not make use of the costing part at all. It's just for compile-time evaluation, disregarding the costs (more precisely treating all as zero-cost)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the compiler does not make use of the costing part at all. It's just for compile-time evaluation, disregarding the costs (more precisely treating all as zero-cost)
Good point, though. I'll think about it, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you only need compile-time evaluation, you can use the first two fields of BuiltinMeaning
like we do in the well-typed Hedgehog program generators for example. Although it's probably not a good idea, since you probably don't want to roll out your own lifting and unlifting logic etc. Maybe we should indeed have some middle-ground.
Although I still feel like insisting everywhere that costing is a part of operational semantics of builtins will save us so much trouble in the long run that it may not be optimal to break this rule even when it makes sense. I may be wrong about that in this particular case, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this still needs a bit of code refactoring:
a) moving code to their "principal" modules,
b) redoing a bit processTerm because it was hard to follow
c) adding some more comments/haddocks
Overall, despite bing a tricky transformation, the golden results look correct. Kudos!
@@ -144,8 +123,38 @@ bindingVarInfo = \case | |||
in VarsInfo (PLC.insertByName matcher info mempty) mempty | |||
constrArity constrTy = | |||
-- One parameter for all function type arguments | |||
fmap (const TypeParam) (funTyArgs constrTy) | |||
fmap (const TermParam) (funTyArgs constrTy) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is happening here? Why TypeParam became TermParam?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was wrong before! now it's still slightly wrong actually, but I'll fix it
e.g. consider the term @let id = \x -> x in id@: the variable @id@ has syntactic | ||
arity @[]@, but does in fact need an argument before it does any work. | ||
-} | ||
type Arity = [Param] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Arity is a misnomer in this case. Arity supposed to be a number and this is clearly not a number. Suggestion SyntacticArgs or SyntacticParams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I think about it differently. For me arity is a "summary of how the function is going to be used in the program". As such it makes sense to me to include information about the parameters, because that's what you need!
TermAppContext <$> underBindersMaybe arity f branch <*> pure ann <*> go ctx arities | ||
go _ _ = Nothing | ||
|
||
-- I tried to come up with a way to do this in terms of the underBinders traversal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get it. underbinders should just work since its a traversal with Maybe as the Applicative?
go (TermAppContext branch ann ctx) (arity:arities) =
TermAppContext <$> underBinders arity f branch <*> pure ann <*> go ctx arities
-} | ||
|
||
{- Note [Case-of-case and conapps] | ||
Case-of-case will be much better if the bodies of the branches are all constructor applications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not jus t"will be much better", but to me it seems caseofcase is only applied if all the branches are conapps, correct?
33 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's nice, I wish i could see the simplified uplc here to enjoy the benefit of caseofcase+caseknown+inline!
Thanks for the explanation. Can you then just create a |
Yeah, I was also wondering about this. This is essentially what GHC does with join points. In our case it's not "free" like join points are, but it might still be worth it. And it does avoid the exponential code problem. So maybe it is indeed the right thing to do. In fact, I already have most of the code there to do this...
Right, and that's a generally useful inlining heuristic that will help us anyway. For most cases we don't even need a new heuristic, though: in the non-duplicative cases the binding will be single-occurrence, so the inliner will already inline it. This seems promising, I think I should try this. |
No, this is totally wrong: we get one binding which is used in every case branch, so it will almost never be single-use. So we are unlikely to inline it currently, which is a bit of a problem. Boo. |
I guess we could make a binding in conservative mode? That avoids the program growth problem, and currently won't optimize well, but maybe that's okay. |
d67adc0
to
0ef014c
Compare
0ef014c
to
e7b68f4
Compare
This is good to go now, I think.
|
Based on #5551, look at the last commit if you're interested.
This is a bit tricky. The key is to extract out enough of the code for picking apart datatype match terms that the pass itself is comprehensible. We also need this because then we can abstract over builtin "matchers" also, which lets us handle e.g. the combination of
Bool_match
andifThenElse1
, which is very common.There's a question of whether to only do it when we know it's a win. I've linked that to the conservative optimisation flag, which feels right. To my surprise, it seems to be generally better to just let it run regardless!
Improvements are, again, modest.
Pre-submit checklist: