Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(vow): Support reliable retry-send #9608

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

feat(vow): Support reliable retry-send #9608

wants to merge 2 commits into from

Conversation

erights
Copy link
Member

@erights erights commented Jun 28, 2024

closes: #9541

Description

Retry idempotent eventual sends with reliable durable result vows that mask upgrade trauma.

The goal here is to support something like E(bob).foo(carol) but with several differences:

  • Retry: While the promise for the result of sending .foo(carol) to bob is rejected with an UpgradeDisconnection, the message is resent, until finally the result of such a send settles in some other way, i.e., it is either fulfilled, or rejected by anything else.
  • Zone abstraction: Since the point is to deal with upgrade trauma, these abstractions are designed to be scoped, somehow, to the durable zone. But can still be scoped to any zone.
  • Vow return: Rather than returning a promise, it returns a vow scoped to the zone in question --- typically the durable zone.
  • Components must be durable: The target, e.g., bob, and the arguments, .e.g., carol, must themselves be storable in the zone in question --- typically meaning they must be durable as well.
  • Masks upgrade on either end: The vow for the result itself only settles to the non-UpgradeDisconnection settlement of some try or retry of the message send. These retries will continue over upgrades of either or both the sending side and the receiving side, until it works.
  • Assumes idempotence: Because the operation is retried an unpredicable number of times, you should only do such a retry-send of operations that are idempotent enough for you not to care how many "extra" times the operation may have occurred.
    • Must be explicit: Because we cannot make such idempotence assumptions in general, we cannot automagically turn unmarked eventual-sends into retry-send. There must be an explicit optin expressed in the code.
    • All query-only ops are already idempotent: An important observation that makes retry-send more immediately applicable than expected.
  • Design for idempotence: APIs intended to be reliably usable across upgrade of either side should often be designed to be idempotent, so that callers can use a retry-send, to mask these upgrade traumas. This may be especially relevant to the design of orchestration APIs.

In this first PR, you can only express a retry-send of const p = E(bob).foo(carol); by saying

const v = retry(bob, 'foo', [carol]);

but the intention is that the next PR will provide a more natural E-like syntax.

Note we already have variations on E, like heapVowE that is scoped to the heap zone and have extra static methods. We can make similar variants of E that are scoped to the durable zone and have an extra retry method. Suppose this E variant is called durableVowE.

We already have static methods on E that express variants of E, such as E.sendOnly(bob).foo(carol) saying to just send the message to bob as a one-way send without any result. Similarly, I imagine that the next PR may provide something like

const v = durableVowE.retry(bob).foo(carol);

But hopefully we can invent something shorter but just as clear.

The internal API also provides a means to cancel the retry loop, so it doesn't go on forever after it is no longer relevant. Using the precedent of E taking an options bag as second argument, I imagine something like

const v = durableVowE.retry(bob, { cancelToken }).foo(carol);

once we have adapted a cancellation token proposal such as https://github.com/tc39/proposal-cancellation . So this would be farther into the future.

Security Considerations

unpredicable resends of non-idempotent operations may be accidents threatening integrity. We currently have no in-band way to mark an operation as idempotent, and so no current means to check the consistency of these assumptions.

Scaling Considerations

Perpetual retries threaten availability and scalability. Though not much if they are only triggered by upgrade. Nevertheless, it would be nice to better support cancellation.

Documentation Considerations

Will need to be explained, including the caution to avoid retry-send on non-idempotent operations, which requires explaining idempotence.

Testing Considerations

As with the current state of async-flow, this only uses the lightweight low-fidelity upgrade testing framework. See #9303

Upgrade Considerations

The point. Retry-sent messages will be retried across upgrade at either end until they settle to something else, masking upgrade trauma. Rather than a promise, a durable retry-send returns a reliable durable vow that itself survives upgrades.

@erights erights self-assigned this Jun 28, 2024
Copy link

cloudflare-workers-and-pages bot commented Jun 28, 2024

Deploying agoric-sdk with  Cloudflare Pages  Cloudflare Pages

Latest commit: 915a7d7
Status: ✅  Deploy successful!
Preview URL: https://a56d798f.agoric-sdk.pages.dev
Branch Preview URL: https://markm-retrier.agoric-sdk.pages.dev

View logs

@erights erights force-pushed the markm-retrier branch 3 times, most recently from b4bb440 to 95fe951 Compare June 30, 2024 03:34
@@ -23,7 +23,6 @@
"@agoric/base-zone": "^0.1.0",
"@agoric/store": "^0.9.2",
"@agoric/swingset-liveslots": "^0.10.2",
"@agoric/vow": "^0.1.0",
Copy link
Member Author

@erights erights Jun 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a drive-by, since it wasn't needed by vat-data anyway. But until removed, the new devDependence of @agoric/vow on @agoric/zone, for durable zone testing, created a dependency cycle.

@erights erights changed the title feat(vow): retrier feat(vow): Support reliable retry-send Jun 30, 2024
@erights erights marked this pull request as ready for review June 30, 2024 04:17
Comment on lines +100 to +104
// TODO do I need to wait for a new incarnation
// using isRetryableReason instead?
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelfig , I'm especially interested in your take on this question. Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very much so. Which means you need to store any previous rejection seen. See how watch implements this.

t.log('ping at', count);
},
});
const v = zone.makeOnce('v', () => retry(bob, 'foo', [carol]));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the call to retry only happens in the first incarnation, but it continues retrying into the second without further preparation.

@erights
Copy link
Member Author

erights commented Jul 1, 2024

See #9621 for an experiment of a different approach.

@mhofman
Copy link
Member

mhofman commented Jul 3, 2024

We already have static methods on E that express variants of E, such as E.sendOnly(bob).foo(carol) saying to just send the message to bob as a one-way send without any result. Similarly, I imagine that the next PR may provide something like

const v = durableVowE.retry(bob).foo(carol);

I doubt it will be that simple given the requirement that the retry helper sees the host objects when used inside an asyncFlow guest function. While some usages of these helpers might be independent, we need to make sure we plumb things in a way that composes well.

(target, optVerb, args) => {
const { vow, resolver } = makeVowKit();
return harden({
target,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

target should support a promise, which is not durably storable. Instead I believe we need a watchHandler for the target, and use watch(target) on init to get at the final target. It would also allow us to transparently handle vows as targets too :)

Comment on lines +100 to +104
// TODO do I need to wait for a new incarnation
// using isRetryableReason instead?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very much so. Which means you need to store any previous rejection seen. See how watch implements this.

if (optResolver === undefined) {
return;
}
// TODO `heapVowE` is likely too fragile under upgrade.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not, if the target send is idempotent, it's probably ok to just retrigger if we get upgraded ourselves.

packages/vow/src/retrier.js Outdated Show resolved Hide resolved
@erights erights force-pushed the markm-retrier branch 2 times, most recently from d8213c8 to fd21ac6 Compare September 5, 2024 23:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

retry helper to durably watch idempotent promises
3 participants