
maybe remove per-vat (xsnap) memory limit? #5953

Open
warner opened this issue Aug 13, 2022 · 0 comments
Labels
enhancement, SwingSet, vaults_triage, xsnap

Comments


warner commented Aug 13, 2022

What is the Problem Being Solved?

In today's discussion about DoS attacks/defenses, one concern was an attacker provoking a vat into allocating too much memory (perhaps slowly), and triggering our 2GiB memory limit:

https://github.com/agoric-labs/xsnap-pub/blob/3e671651ed6d8a58491b500c9e7359494ff7da7d/xsnap/sources/xsnapPlatform.c#L290-L291

When xsnap sees the engine try to malloc past this limit, it exits with XS_NOT_ENOUGH_MEMORY_EXIT, aka ExitCode.E_NOT_ENOUGH_MEMORY. The vat manager reports this as a delivery error, which tells the kernel to terminate the vat (in-consensus).
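For orientation, the enforcement is roughly the following shape. This is a self-contained model, not xsnap's actual code: the function and variable names and the numeric exit value are placeholders, and the real check lives in xsnapPlatform.c, where it calls fxAbort(the, XS_NOT_ENOUGH_MEMORY_EXIT).

```c
#include <stdio.h>
#include <stdlib.h>

/* placeholder numeric value; the real worker exits with
 * XS_NOT_ENOUGH_MEMORY_EXIT, aka ExitCode.E_NOT_ENOUGH_MEMORY */
#define NOT_ENOUGH_MEMORY_EXIT 7

static size_t allocationLimit = 2ull * 1024 * 1024 * 1024; /* ~2 GiB default */
static size_t allocatedSpace = 0;

/* every chunk allocation is charged against the per-machine ceiling */
static void* allocateChunk(size_t size) {
    void* p = malloc(size);
    if (p != NULL)
        allocatedSpace += size;
    if (p == NULL || (allocationLimit && allocatedSpace > allocationLimit)) {
        /* in xsnap, this is where the platform code aborts the machine */
        fprintf(stderr, "not enough memory\n");
        exit(NOT_ENOUGH_MEMORY_EXIT);
    }
    return p;
}

int main(void) {
    free(allocateChunk(1024)); /* well under the ceiling: succeeds */
    return 0;
}
```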

In general, any hard resource limit that kills the process (under the theory that the vat did it deliberately) will convert a provoked-growth bug into a kill-the-vat authority. An attacker who manages to send a few extra arguments with each message (perhaps unused positional arguments, which JS functions ignore unless special schema enforcement is added) might cause the vat to consume extra memory to hold them. Over time, that might grow the memory footprint to the point that it trips the ceiling. If, at that point, the worker is killed, we'll have given the attacker more authority than we intended.

Some pathways we're examining for that consumption:

  • promise vrefs (vpids) sent into a vat ("imported promises"), for which liveslots must create a Promise for userspace, and must retain the resolve/reject functions (and thus effectively the Promise itself) until the kernel does a dispatch.notify to resolve it
    • (it wouldn't help to track the promise with a WeakRef because we cannot sense whether userspace has done a .then or await on it, so liveslots must assume userspace is watching until resolved)
    • but note that a null upgrade of the vat will free this memory, because Promises do not survive an upgrade, nor would the userspace code that might have been following them
  • object vrefs (technically "voids" but for obvious reasons we never use that term) sent into a vat, which create a Presence
  • flat data, which gets unserialized into Arrays and records and numbers and strings

We have a size limit at the perimeter of the kernel (specifically the maximum cosmos signed-transaction size). That imposes a limit on the largest message that can make it into the bridge device, or the mailbox device. Framing, comms-protocol serialization, and other transformations are applied or removed, leading to some sort of limit (maybe larger, maybe smaller) on the size and slot count of the capdata that arrives in dispatch.deliver or dispatch.notify. Then liveslots and marshal turn this into JS Objects and Promises and Arrays, etc., whose RAM usage is some function of the capdata sizes. If we could compute this overall transformation function, we could pick a signed-cosmos-txn limit X, and then state that the worst-case RAM usage that can be provoked by any single message is Y.

If Y is larger than the per-vat xsnap allocation limit, a single message could kill the vat during deserialization. If userspace and/or liveslots holds on to data over time, a series of sub-X-sized cosmos messages could kill the vat.

We don't (yet) know what that Y = f(X) function is. We expect it's roughly linear in each of several inputs, probably more accurately expressed as a * body.length + b * slots.filter(isObjectID).length + c * slots.filter(isPromiseID).length. And the slope is probably at least 10, since [],[],[], takes only 3 bytes per empty Array (each of which probably occupies 4-8 32-bit words in RAM), and likewise for {},{},{},. The cosmos limit is currently at least several MB, maybe 10MB (to accommodate contract bundles). When we finish #4564 and #4811 and friends, txns of this size will still be allowed, but the component messages inside will be limited according to type, and the messages which go to swingset will be limited to much smaller values (maybe 1MB, maybe more like 10kB).
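As a strawman, that estimate could be computed like the sketch below. Everything here is a placeholder: the coefficients a, b, c are unknowns to be replaced by measurement, and the names are invented for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* hypothetical shape of Y = f(X) for a single delivery */
typedef struct {
    size_t bodyBytes;    /* length of the capdata body string */
    size_t objectSlots;  /* slots holding object vrefs (each becomes a Presence) */
    size_t promiseSlots; /* slots holding vpids (each becomes a retained Promise) */
} CapdataShape;

static size_t estimateRamBytes(CapdataShape x) {
    const size_t a = 10;  /* assumed: >=10x expansion of body bytes into heap */
    const size_t b = 64;  /* assumed: bytes retained per Presence */
    const size_t c = 128; /* assumed: bytes per Promise plus resolve/reject pair */
    return a * x.bodyBytes + b * x.objectSlots + c * x.promiseSlots;
}

int main(void) {
    CapdataShape x = { 1000, 5, 2 }; /* 1kB body, 5 objects, 2 promises */
    printf("worst-case estimate: %zu bytes\n", estimateRamBytes(x));
    return 0;
}
```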

So we aren't confident that the perimeter limit is tight enough. And we might not be able to prevent liveslots from retaining some data. And userspace might not check for surprising extra properties on inbound objects that it stores. So the attack can be spread over multiple messages. The net result is that we might not be able to prevent attackers from inflating (perhaps slowly) the XS memory footprint of an innocent vat to the point that it passes a fixed limit.

But, if the vat is not intentionally using that extra data, then the Unix xsnap process which hosts the vat won't be touching it. The details are tricky, because the GC mark() phase must walk some portion of all objects, but for some forms of data (especially large strings), the application won't touch those pages. And the standard unix virtual memory system is pretty good at not spending RAM on unused pages.

So our thought is "thrash before crash". If we allow the xsnap process to grow large, but most of the pages are unused, the host OS will push those pages out to swap, preserving the RAM for only the active set. If the host has enough swap space, we can accommodate several inflated vats.

Description of the Design

I'm thinking:

  • change xsnapPlatform.c fxCreateMachinePlatform() to increase the->allocationLimit to 20 GiB or 100 GiB.

    • alternatively, I think we might be able to set the limit after initialization, in xsnap-worker.c after the xsCreateMachine() call, in response to an argv parameter (sketched after this list)
    • I believe the value is stored in the heap snapshot
    • it might be possible to change this dynamically
    • changes should be part of consensus, since the XS_NOT_ENOUGH_MEMORY_EXIT error is treated as part of consensus
  • tell validators they should provision a meaningful amount of swap

    • maybe this can wait, if it looks like our other defenses are working
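A minimal sketch of the argv approach, assuming a hypothetical -l <bytes> option: the option name, and whether the engine tolerates applying the limit after machine creation, are assumptions, not confirmed xsnap behavior.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char* argv[]) {
    /* default matches the current ~2 GiB ceiling; the proposal raises it */
    unsigned long long allocationLimit = 2ull * 1024 * 1024 * 1024;

    /* hypothetical "-l <bytes>" override, parsed like xsnap-worker's argv loop */
    for (int argi = 1; argi < argc; argi++) {
        if (!strcmp(argv[argi], "-l") && (argi + 1 < argc)) {
            allocationLimit = strtoull(argv[++argi], NULL, 10);
        }
    }

    /* in the real worker, this value would be applied to the machine after
     * xsCreateMachine(), e.g. by setting the->allocationLimit */
    printf("allocationLimit = %llu bytes\n", allocationLimit);
    return 0;
}
```

Since XS_NOT_ENOUGH_MEMORY_EXIT is in-consensus, whatever value this option carries must itself come from consensus state, not from per-validator configuration.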

Security Considerations

A hard memory limit chooses one side of a tradeoff:

  • enforce a limit:
    • good: host OS memory usage is protected (although the Linux OOM killer protects this too)
    • good: actually buggy/malicious vats/contracts only hurt themselves
    • bad: vats are more vulnerable to being killed by an attacker (provoked into using too much memory)
  • do not enforce a limit:
    • good: harder for attacker to provoke the vat
    • bad: attacker might trigger OOM killer and take out the whole validator
    • bad: buggy/malicious vats can kill the whole validator

Since we don't have third-party contracts in MN-1, any buggy/malicious contracts are our own fault, and halting the whole chain is better than an in-consensus termination of some important vat.

Test Plan

A unit test which creates an xsnap worker configured with various allocationLimits, evaluates code to build a string just below that limit (successfully), then allocates a little bit more (which should cause the worker to exit with ExitCode.E_NOT_ENOUGH_MEMORY).

We already have a test to exceed the limit, but it works by allocating more and more until the limit is reached. We need to test that the configured limit is respected.

To avoid undue stress on the CI host (or developers' laptops), the test shouldn't exercise limits much larger than 2GB.
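A rough model of that test's shape, for illustration only: the real test would drive an actual xsnap worker configured with a given allocationLimit and assert on its exit code, and the 64 MiB stand-in limit here is just to keep the model cheap.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    const size_t limit = 64 * 1024 * 1024; /* small stand-in for 2 GiB */
    size_t used = 0;

    /* step 1: build a buffer just below the limit; should succeed */
    size_t below = limit - 4096;
    char* s = malloc(below);
    if (!s) return 1;
    memset(s, 'x', below);
    used += below;
    printf("allocated %zu bytes, under the limit\n", used);

    /* step 2: a little more pushes past the limit; the real worker
     * should exit with ExitCode.E_NOT_ENOUGH_MEMORY at this point */
    used += 8192;
    if (used > limit) {
        fprintf(stderr, "worker would exit with E_NOT_ENOUGH_MEMORY\n");
        free(s);
        return 2;
    }
    free(s);
    return 0;
}
```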
