Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could untyped plutus core have primitives which don't need initial force #4183

Closed
phadej opened this issue Nov 2, 2021 · 20 comments
Closed

Comments

@phadej
Copy link

phadej commented Nov 2, 2021

by looking at some contracts' plutus core, there are some of (force mkCons), (force trace), (force ifThenElse) for example. I guess in these forcings are cheap to store and execute, but they aren't entirely free.

EDIT: I think I haven't seen these combinators used unforced.

@michaelpj
Copy link
Contributor

These are somewhat artificial. Type abstractions in typed PLC block evaluation, so we erase abstractions to delay and instantiations to force. Builtins can have type arguments, so the corresponding type instantiations get turned into forces. We thought the simplest and most consistent approach was just to maintain these, although strictly speaking they're redundant.

@phadej
Copy link
Author

phadej commented Nov 2, 2021

Checking some more, E.g. sndPair has always two forces (for both of pair types), chooseList has also two, If each force is a byte, there's plenty of bytes we could save.

@michaelpj
Copy link
Contributor

The primitives don't appear that much in the grand scheme of things. If we were concerned about this then the number of force/delays introduced from erasing type abstractions/instantiations throughout the whole program is going to dwarf that. But I don't know of a good way to get rid of them en masse (we do the obvious force/delay cancellation already).

(Also term tags are actually only 4 bits, not a whole byte.)

@phadej
Copy link
Author

phadej commented Nov 2, 2021

I run a quick test (rewrite Force b@Builtin{} -> b) on use-cases Uniswap contract:

        expected: (11462,11001)
         but got: (11011,10777)

The counts are (node size, serialized size of term using DeBruijn names), that's 4% reduction in node count and 2% serialized size. I think that is significant in the grand scheme of things.

@michaelpj
Copy link
Contributor

Thanks, that's a useful datapoint!

Let me elaborate a bit more on why this is tricky.

Consider the following program (in pseudo-PIR):

let usesTrace :: (forall a. BuiltinString -> a -> a) -> Integer -> Integer
     usesTrace tracer i = tracer "I saw it" i
     evilTrace :: forall a . BuiltinString -> a -> a
     evilTrace s v = error "I'm evil"

     i1 = usesTrace trace 0
     i2 = usesTrace evilTrace 1
in i1 + i2

The usesTrace function receives as its argument a polymorphic function which it instantiates. Builtin functions are just values, so we can pass a polymorphic builtin function as its argument. Therefore whatever usesTrace does has to work for both "normal" functions and builtins. The first thing that it will do is to instantiate the function, which will correspond to a force when erased.

So at the very least it must be tolerable to force builtin functions. At the moment we require you to force exactly where you would otherwise pass a type. Perhaps we could not require you to do any forcing. The only problematic case would be a nullary builtin which had only type arguments and immediately failed (e.g. error :: forall a . a as a builtin) - it's not clear what that would mean. We could just ban them.

I have a feeling we've been through this discussion before, @kwxm do you remember anything else?

@phadej
Copy link
Author

phadej commented Nov 2, 2021

I don't agree, it is not tricky:

You can do rewrite in two steps:

  1. Replace each builtin with a version which is delayed as many times as there are erased type arguments
  2. Apply forceDelay simplification

Let me write (builtin b) for polymorphic builtin, and (builtin! b) for one which is absorbed force arguments (i.e. cannot be forced).

Your example would be rewritten as:

usesTrace (builtin trace) 0 ->          -- replace with delayed versions
usesTrace (delay (builtin! trace)) 0    -- note: builtin!

And indeed, in the compiled PLC there is

[ [ usesTrace (builtin trace) ] (con integer 0) ]

That grows in size, but that doesn't actually happen. GHC rarely passes
polymorphic arguments (there should be an explicit rank-N signature, no eta-expansion).

If we rewrite the example as (which is closer to what inferred types would be):

errorExample2 :: Integer
errorExample2 = 
    let usesTrace :: (BuiltinString -> Integer -> Integer) -> Integer -> Integer
        usesTrace tracer i = tracer "I saw it" i

        evilTrace :: forall a . BuiltinString -> a -> a
        evilTrace s v = traceError "I'm evil"

        i1 = usesTrace trace 0
        i2 = usesTrace evilTrace 1
    in i1 + i2

The line above becomes

[ [ usesTrace (force (builtin trace)) ] (con integer 0) ]

which would simplify:

usesTrace (force (builtin trace)) 0 ->           -- replace with delayed versions
usesTrace (force (delay (builtin! trace))) 0 ->  -- forceDelay
usesTrace (builtin! trace) 0                     -- profit!

The polymorphic program has size 54, simple one 53, and latter could become smaller.


EDIT: What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

@michaelpj
Copy link
Contributor

That grows in size, but that doesn't actually happen. GHC rarely passes polymorphic arguments (have to an explicit rank-N signature).

Yes, you're certainly right about that, but even if it's a corner case we thought it was important for builtins to behave exactly the same as polymorphic functions. That way you don't have to worry about the corner cases.

You can do rewrite in two steps...

Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

If we did that and banned nullary builtins, I think this could work.

What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

It's stuck.

@phadej
Copy link
Author

phadej commented Nov 2, 2021

This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

I think that removing the head type-arguments is already an improvement. If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins? (The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

@phadej
Copy link
Author

phadej commented Nov 2, 2021

It looks like that types of builtins are generated by toBuiltinMeaning?

I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.

EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.

@phadej
Copy link
Author

phadej commented Nov 2, 2021

What does (force (builtin trace)) evaluate to? Or is it stuck until there is enough arguments to execute the builtin?

It's stuck.

This is unfortunate. I liked the "builtins have to fully saturated" design more (when there isn't such problem). It's easier to show that semantics are preserved then. (My proposal would been completely trivial then).

@michaelpj
Copy link
Contributor

We had a long argument about saturated vs unsaturated builtins. In the end we opted for unsaturated - perhaps a mistake, but here we are.

@kk-hainq
Copy link
Contributor

kk-hainq commented Nov 3, 2021

You can do rewrite in two steps...

Nice idea. This assumes that all the type arguments come first, but at the moment our builtins can have interleaved term and type arguments. This is pretty pointless however, and we might as well require the type arguments to come first.

If we did that and banned nullary builtins, I think this could work.

We've been able to gather some interest on this specific topic and Oleg's proposal. @michaelpj @phadej Should we add it to #4174 and attempt it in the upcoming weeks?

@phadej
Copy link
Author

phadej commented Nov 3, 2021

This is an intrusive proposal as it changes the UPLC. Everything listed in #4174 doesn't seem to require changes, even in typed PLC, as they can happen on PIR-level, which makes them a lot more realistic to get done :)

I opened this issue "early" because I expect it take long to fix, if it will ever.

@kk-hainq
Copy link
Contributor

kk-hainq commented Nov 3, 2021

I know, but it is well-scoped and elegant. We also suspect the improvements would be non-trivial, and our transpilers would take fewer forces every day of the week!

@michaelpj
Copy link
Contributor

Yes, this would be a change to the semantics of the language, and might well require a new language version and changes to the spec (which we're in the process of redoing already). It's a much more complicated change, and I don't think it will be that significant in the grand scheme of things. It's something we can hopefully roll in easily enough when we do do another version.

@kk-hainq
Copy link
Contributor

kk-hainq commented Nov 4, 2021

Okay, I'll just add this to #4174 for tracking for now.

@effectfully
Copy link
Contributor

effectfully commented Jan 1, 2022

@phadej

If it's possible to shuffle the type arguments to the front, that would be even better, but do you have such builtins?

We don't.

(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

See the golden files. Generating a haddock comment using doctest or whatever would be quite helpful, I agree.

It looks like that types of builtins are generated by toBuiltinMeaning?

Within toBuiltinMeaning, yes.

I think GHC pushes all foralls to the front in inferred types. That trick is cool but feels dangerous, I'd prefer having types of builtins checked (against explicitly written ones) rather than inferred from their Haskell-meaning.

We don't rely on GHC pushing the foralls to the front (nor would it be possible at all, I think). Using an inference machinery and not checking what it infers would be very dangerous, I agree, but we have golden tests for that purpose. A golden file is created automatically whenever a new builtin is added, so we do control what PLC types get inferred for builtins.

EDIT: if (U)PLC is a compilation target the types or "call convention" (in case of UPLC) is an important information to have in the spec.

@kwxm
Copy link
Contributor

kwxm commented Jan 6, 2022

(The specification doesn't list their types, nor haddocks, list their types, that would be helpful).

Sorry, I just noticed this after it was quoted in a previous comment.

The plc and uplc commands now have a print-builtin-signatures option that will print out the types of all the builtins it knows about in a reasonably readable form:

$ cabal exec uplc print-builtin-signatures
addInteger               : [ integer, integer ] -> integer
subtractInteger          : [ integer, integer ] -> integer
...
encodeUtf8               : [ string ] -> bytestring
decodeUtf8               : [ bytestring ] -> string
ifThenElse               : [ forall a, bool, a, a ] -> a
chooseUnit               : [ forall a, unit, a ] -> a
trace                    : [ forall a, string, a ] -> a
fstPair                  : [ forall a, forall b, pair(a, b) ] -> a
sndPair                  : [ forall a, forall b, pair(a, b) ] -> b
chooseList               : [ forall a, forall b, list(a), b, b ] -> b
mkCons                   : [ forall a, a, list(a) ] -> list(a)
...

That still doesn't quite tell the full story because sometimes forall means "for all built-in types" (as in mkCons) and sometimes it means "for any term at all" (as in ifThenElse). We're working on a new version of the specification which will explain all this in detail: that should be available in a couple of weeks.

@phadej
Copy link
Author

phadej commented Jan 6, 2022

@kwxm that's great.

@effectfully
Copy link
Contributor

effectfully commented Feb 22, 2023

There's now a CIP for this discussion, hence I'm closing the ticket. Feel free to reopen if you disagree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants