fix: explain await safety cases

endojs · Nov 19, 2022 · cc12305 · cc12305
1 parent c331b2f
commit cc12305
Show file tree

Hide file tree

Showing 10 changed files with 216 additions and 0 deletions.
diff --git a/docs/await-safety/all-branches-balanced.md b/docs/await-safety/all-branches-balanced.md
@@ -0,0 +1,7 @@
+```js
+// This nested synchronous-throw-impossible await at the top level of
+// the then branch is safe because it is balanced by a
+// synchronous-throw-impossible await at the top level of the else
+// branch. In fact the await in the else branch was introduced for
+// that balance, in order to make this one safe.
+```
diff --git a/docs/await-safety/await-safety.md b/docs/await-safety/await-safety.md
@@ -0,0 +1,53 @@
+_context: [[Coding Style]]_
+
+# Avoid Nested/Conditional Await
+
+Our general rule (which will be enforced as part of the Jessie linter) is that `await` must only appear in the top level of any given function. It should not appear inside of a conditional or a loop.
+
+For example, the following function runs `thunk()` (which might or might not be `async`), and wants to ensure that the `meteringDisabled` counter is decremented afterwards. The non-recommended approach is:
+
+```js
+async function runWithoutMeteringAsync(thunk) {
+  meteringDisabled += 1;
+  try {
+    return await thunk(); // NOT RECOMMENDED
+  } finally {
+    meteringDisabled -= 1;
+  }
+}
+```
+
+This version has a subtle timing concern. If `thunk()` throws synchronously, the `await` is bypassed entirely, and control jumps immediately to the `finally { }` block. This is made more obvious by holding the supposed return Promise in a separate variable:
+
+```js
+async function runWithoutMeteringAsync(thunk) {
+  meteringDisabled += 1;
+  try {
+    const p = thunk(); // if this throws..
+    return await p; // .. this never even runs
+  } finally {
+    meteringDisabled -= 1; // .. and control jumps here immediately
+  }
+}
+```
+
+The recommended approach rewrites this to avoid the `await`, and instead uses the Promise's `.finally` method to achieve the same thing:
+
+```js
+async function runWithoutMeteringAsync(thunk) {
+  meteringDisabled += 1;
+  return Promise.resolve()
+    .then(() => thunk())
+    .finally(() => {
+      meteringDisabled -= 1;
+    });
+}
+```
+
+(Note that `thunk()` must be called inside a `.then` to protect against any synchronous behavior it might have. It would not be safe to use `return thunk().finally(...)`, and `thunk()` might even return a non-Promise with some bogus `.finally` method.)
+
+`await` effectively splits the function into two pieces: the part that runs before the `await`, and the part that runs after, and we must review it with that in mind (including reentrancy concerns enabled by the loss of control between the two). Using `await` inside a conditional means that _sometimes_ the function is split into two pieces, and sometimes it is not, which makes this review process much harder.
+
+We sometimes refer to this rule as "only use top-level `await`". Keep in mind that we mean "top of each function", rather than "top level of the file" (i.e. outside of any function body, which is a relatively recent addition to JS, and only works in a module context).
+
+See ["Atomic" vs "Transactional" terminology note](https://github.com/Agoric/agoric-sdk/wiki/%22Atomic%22-vs-%22Transactional%22-terminology-note)
diff --git a/docs/await-safety/initial-control-flow.md b/docs/await-safety/initial-control-flow.md
@@ -0,0 +1,13 @@
+```js
+// "initial-control-flow" is the time reversal of
+// "terminal-control-flow". There is always a turn boundary before each
+// iteration of a for-await-of loop. This await occurs in one branch
+// of an unbalanced if within the loop body. But it happens before
+// anything stateful executes within this iteration, so all subsequent
+// code in the loop body only executes after a turn boundary.
+// It is also safe because "terminal-control-flow" because non of the
+// code in the remainder of the loop body is potentially stateful.
+// The for-await-of loop protects all code after the loop anyway,
+// because even in the zero iteration case it will always take a
+// turn boundary.
+```
diff --git a/docs/await-safety/not-my-problem.md b/docs/await-safety/not-my-problem.md
@@ -0,0 +1,15 @@
+It does occur nested with a control flow branch. But it is at top
+level within that branch, followed by another await also at
+top level within that branch. Therefore, this await introduces
+no unsafety beyond that potentially caused by the next await.
+
+Go pay attention to some other await instead of this one.
+
+e.g. in an if-then-else
+
+```js
+// It does occur nested with a control flow branch. But it is at top
+// level within that branch, followed by another await also at
+// top level within that branch. Therefore, this await introduces
+// no unsafety beyond that potentially caused by the next await.
+```
diff --git a/docs/await-safety/notes.md b/docs/await-safety/notes.md
@@ -0,0 +1,24 @@
+The "await hazard" is when there is stateful coupling between blocks of code such that on some paths, the stateful coupling happens within the same turn and in other paths there's a turn boundary between them. (By "turn" we mean a processing of the Node event loop.) Using `await` is a loss of execution control similar to calling a function.
+
+When you `await` you don't know what might have changed before the next line executes. But fortunately, you know that the call stack is now empty. (There's nobody else that's waiting to get control back from you.)
+
+So the equivalence classes are:
+- zero awaits
+- one or more awaits
+
+With our await rules we're trying to only worry about one of those hazards at a time.
+
+Top-level awaits are always safe.
+
+Some control structures are always safe:
+- `for await ... of` because it awaits even with zero iterations
+
+When there is no turn boundary, you know that no interleaving happened between A and B. (You have an implicit mutual exclusion lock to all state to which you have synchronous access.)
+
+When there is a turn boundary, you benefit from the stateful isolation wrt invariant maintenance. The code before the turn boundary is responsible for restoring all invariants before the turn is over (because arbitrary other turns happen before the await).
+
+The risk to inconsistent turn boundaries to legibility (reader understanding of the code) and to testing. E.g. you happen to test when the stateful coupling is across a turn boundary but in production there isn't (or vice-versa). Consider function A calls function B. If there is a turn boundary in the midst of B executing, at the turn boundary control returns to A which proceeds to potentially manipulate state and all that happens before B resumes executing.
+
+Example: Splicing into a doubly-linked list. There are two things that must change but can't happen at the same time. You have to temporarily suspend the invariant. If any code executes while that invariant is suspended that depends on the invariant being true, then you're in trouble.
+
+So when execution resumes after an `await` the awaiting block must check state assumptions that may have changed. 
diff --git a/docs/await-safety/synchronous-throw-impossible.md b/docs/await-safety/synchronous-throw-impossible.md
@@ -0,0 +1,28 @@
+```js
+// We assume `E` cannot synchronously throw, and all the argument
+// expressions are simple variables. If the await expression throws
+// we exit the function immediately. Thus, there is a reliable
+// turn boundary before each iteration of the loop. There remains
+// the issue of zero iterations of the loop, which in this case is
+// impossible because of the absence of a condition in the head.
+//
+// Finally, there remains the issue of the unbalances `if`. The
+// then case will always take at least one turn boundary before
+// proceeding. But the else case will proceed synchronously to
+// the subsequent code. Fortunately, the next interesting statement
+// is a top level await, so this await does not introduce any
+// unsafety beyond that caused by the other await.
+// Hence "not-my-problem". That await is manifestly safe because it
+// is at top level of the function and always forces a turn boundary.
+// The awaited expression is another `E` message send expression in
+// which all the argument expressions are only variables, so it will
+// not cause anything stateful prior to that turn boundary.
+```
+
+```js
+// normalizeConfigDescriptor is an async function. The only argument
+// expression that might in theory throw is `config.bundles`, which we
+// are confident could not actually throw. Even if it could, we'd still
+// be safe because "terminal-control-flow". The catch body is not stateful.
+// eslint-disable-next-line @jessie.js/no-nested-await
+```
diff --git a/docs/await-safety/terminal-combined-control-flow.md b/docs/await-safety/terminal-combined-control-flow.md
@@ -0,0 +1,35 @@
+In an an async function, the turn boundary always happens at the end of each
+iteration. However, the control-flow is still unbalanced. If the loop iterates
+zero times, we proceed to the next code synchronously. Else we proceed
+asynchronously. However, the next loop is terminal, and each iteration of that
+loop resemble each iteration of this one. Considered together, in all cases the
+first iteration of one of these loops will happen first, if there are any. And
+if there are more iterations from either loop, a turn boundary separated each
+iteration from the next.
+
+If you've got a non-terminal control block that has an unbalanced await, but the
+next control block always has an await before it does something statefully
+coupled then it's still the case that something statefully coupled is on the
+other side of an await
+
+```js
+// `ensureVatOnline` is an async function, so this turn boundary always
+// happens at the end of each iteration. However, the control-flow is
+// still unbalanced. If the loop iterates zero times, we proceed to
+// the next code synchronously. Else we proceed asynchronously.
+// However, the next loop is terminal, and each iteration of that loop
+// resemble each iteration of this one. Considered together, in all
+// cases the first iteration of one of these loops will happen first,
+// if there are any. And if there are more iterations from either loop,
+// a turn boundary separated each iteration from the next.
+```
+
+```js
+// It occurs at the top level of the loop body of a non-terminal top
+// level loop, so we need to consider the zero-vs-non-zero iteration
+// cases wrt the potentially stateful `getBootstrap()`. However, there is
+// a turn boundary immediately prior to the loop, with no
+// potentially stateful execution between that turn boundary and the loop.
+// So, considering the loop and that previous await together,
+// `getBootstrap()` in always called in a new turn.
+```
diff --git a/docs/await-safety/terminal-control-flow.md b/docs/await-safety/terminal-control-flow.md
@@ -0,0 +1,7 @@
+This nested await is safe because "terminal-control-flow",
+given that we do not consider the timing of a `console.log`
+to be a practical hazard.
+
+A joined control flow (e.g. if/then/else) with an await is a hazard if there is additional code after the block.
+
+This case names when it's safe because there is no code after or it's a return.
diff --git a/docs/await-safety/terminal-throw-control-flow.md b/docs/await-safety/terminal-throw-control-flow.md
@@ -0,0 +1,29 @@
+```js
+// This statement appears at the top level of a top level try block,
+// and so is executed
+// unconditionally. If it throws, we do nothing significantly stateful
+// before exiting. (We do not consider `console.log` to be stateful
+// for these purposes.) Otherwise, it will always cause a turn boundary
+// before control flow continues.
+```
+
+```js
+// We assume `E` cannot synchronously throw, and all the argument
+// expressions are simple variables. If the await expression throws
+// we exit the function immediately. Thus, there is a reliable
+// turn boundary before each iteration of the loop. There remains
+// the issue of zero iterations of the loop, which in this case is
+// impossible because of the absence of a condition in the head.
+//
+// Finally, there remains the issue of the unbalances `if`. The
+// then case will always take at least one turn boundary before
+// proceeding. But the else case will proceed synchronously to
+// the subsequent code. Fortunately, the next interesting statement
+// is a top level await, so this await does not introduce any
+// unsafety beyond that caused by the other await.
+// Hence "not-my-problem". That await is manifestly safe because it
+// is at top level of the function and always forces a turn boundary.
+// The awaited expression is another `E` message send expression in
+// which all the argument expressions are only variables, so it will
+// not cause anything stateful prior to that turn boundary.
+```
diff --git a/docs/await-safety/top-level-of-for-await.md b/docs/await-safety/top-level-of-for-await.md
@@ -0,0 +1,5 @@
+```js
+// It occurs at the top level of a for-await body, and so everything
+// afterwards happens after a turn boundary, whether the awaited expression
+// throws or not.
+```