Skip to content

Latest commit

 

History

History
416 lines (225 loc) · 21.2 KB

CG-04-27.md

File metadata and controls

416 lines (225 loc) · 21.2 KB

WebAssembly logo

Agenda for the April 27th video call of WebAssembly's Community Group

  • Where: zoom.us
  • When: April 27th, 3pm-5pm UTC (April 27th, 8am-10am Pacific Time)
  • Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

This is a special edition longer form meeting to discuss Scoping and Layering of Module Linking and Interface Types.

Agenda items

  1. Opening, welcome and roll call
    1. Opening of the meeting
    2. Introduction of attendees
  2. Find volunteers for note taking (acting chair to volunteer)
  3. Adoption of the agenda
  4. Proposals and discussions
    1. Discussion: Scoping and Layering of Module Linking and Interface Types (full session) (slides)
  5. Closure

Agenda items for future meetings

None

Schedule constraints

None

Meeting Notes

Opening, welcome and roll call

Opening of the meeting

Introduction of attendees

Luke Wagner

Deepti Gandluri

Thomas Trankler

Zalim Bashorov

Francis McCabe

Asumu Takikawa

Saul Cabrera

Matias Funder

Arun Purushan

Ryan Hunt

Till Schneiderdeit

Sergey Rubanov

Dan Gohman

Wouter van Oortmerssen

Alon Zakai

Piotr Sikora

Paolo Severini

Mingqiu Sun

Alex Crichton

Rick Battagline

Lars Hansen

Jakob Kummerow

Chris Fallin

Adam Reese

Andrew Brown

Sam Clegg

Garrett Gu

Zhi An Ng

Daniel Wirtz

Slava Kuzmich

Ross Tate

Nicholas

Richard Winterton

Andreas Rossberg

Emanuel Ziegler

Piotr Sikora

Michal Kawalec

Manos Koukoutos

Nicholas Matsakis

Jhonnie Birch

Nabeel Al-Shamma

Heejin Ahn

Flaki

Thomas Lively

Pat

Adam Klein

Lin Clark

Sean Westfall

Fitzgen

Petr Penzin

Find volunteers for note taking

Zhi An Ng, Dan Gohman, Derek Schuff volunteer

Proposals and discussions

Luke: [presenting]

About 1 hour of content with slides, discussion to follow. Scoping and Layering the Module Linking and Interface Types proposals

Not trying to carve out a 3-year plan, there’s some urgency to do something in the short term, an “MVP”

Summary of current Module Linking proposal. Can import modules, with no state, and instances Define nested modules Instantiate modules to make instances This allows composing wasm modules, including instantiating module multiple times

Module linking now implemented in Wasmtime, leading to practical feedback. Including the question of duplicate imports, reported as design/#1402. This points to Module Linking being a new layer, rather than part of the core spec.

Core Wasm unchanged, new things go into “adapter module”. Syntax is identical, keyword is different.

Adapter modules are purely (typed) wiring. Linker specification. Adapter modules/instances form a tree, core modules/instances only at the leaves.

PP: Right now tools do link wasm objects together. How is this different from that, other than being part of the spec?

LW: type of linking we currently support with object file is static linking, many object files linked to make 1 Wasm file. Here, we have multiple Wasm files (from different toolchains), want to link at runtime, maybe want to share machine code, or other reasons.

SC: We can also use this for same-language dynamic linking in C++: the object format we have isn’t really a dynamic linking format, we could use this for that.

WV: Also module linking can involve multiple memories, unlike how C++ does it now.

FM: With dynamic linking people will want all kinds of different use cases. What’s the escape hatch for when the use case doesn’t really fit this. How many of these pieces can be re-used if the use case doesn’t line up exactly?

LW: have a slide laying out spectrum of linking options, it can be complementary to address different use cases, can come back to that

[presentation continues]

Module linking is a new spec layer between Core and the JS API and other language APIs. Module linking isn’t a required layer; JS and other languages could continue to use Core features and APIs.

PP: self-modifying code, person was asking about generating wasm and loading it up in the same instance, we cannot do that now, have to go back to JS, will this support something like that?

LW: will speak to that in a few slides, then return to this question, what I’m proposing wouldn’t do that but also wouldn’t block it.

[presentation continues] Implementation layering can reflect spec layering; engines can focus on the core spec, with a module linking implementation layer on top.

Interface Types: A feature proposal which extend the Module Linking layer Adapter modules can contain adapter functions which use Interface Types types like strings, and lifting/lowering instructions.

FM: Can the adapter concept be completely factored out? i.e. a separate functionality, reusable in different contexts?

LW: yes, will talk a bit about it when we get to WASI, spend a lot of time designing lifting/lowering, how to go between abstract types.

[presentation] Module Linking is a layer, with Interface Types as a feature, what other features are there? What should the scope be?

Spectrum of dynamic linking: Module linking at one end, usable even in very static use cases In the middle, the ability to have a first-class reference to a module. Or a step further, adding an instruction to dynamically compile bytecode to modules Or even further, JIT-style linkage There are interesting points on the spectrum with less dynamism too.

Greater dynamism provides greater flexibility, but fewer invariants. Wasm can potentially pursue multiple points on the spectrum; the spec layering means that Module Linking can make some choices, and core wasm can still explore other points in the spectrum.

RT: for 3rd one, no JIT + static linkage, how is that static? If you do branch you won’t know which reference you’re getting.

LW: the pointer to the instance is dynamic, but you know its type (what module it’s an instance of). It’s like a class with a non-virtual method.

RT: statically typed essentially.

LW: get_export is virtual dispatch, and the other one is C++ static function calls

LH: some of these seem to be dipping into the core spec, i.e. you havenew instructions and types, etc

LW: will return to what i think goes where, some of these make sense in core

FM: not sure if you kept the existing concepts invariant, if you have an instantiate instruction that refers to an index, that index is across all of your modules, in the first 2 items of the spectrum.

LW: this would be locally scoped the way indices are today. So this module would either be nested or imported into the containing module. So it would still be scoped.

[presentation]

Proposed scope: a Lightweight component model. Components: composable units of software. Many prior examples of this, on many points on the dynamism spectrum. Wasm modules are close to components already, but lack black-box and cross-language reuse and external composition. Interface types and module linking provide those pieces. Model: Cross-language and black-box reuse raise many questions, about how linking works, async, etc., so a component model needs to be somewhat opinionated. This also points to a layered approach, so that the Module Linking layer doesn’t need to impose its opinions on the core spec. Lightweight: Define a scope, and exclude features like distributed, services, etc., and don’t tie it to GUI or other domain-specific functionality.

Positioning: a middle-ground between language-specific units on one hand, and host-oriented units on the other.

SC: for “fast import calls” did you mean IT instead of module linking to implement those?

LW: both, module linking to wire them up, and IT to say how to pass values and handles back and forth

PP: if gc were implemented in one of those components… are we running ELF binaries?

LW: I meant some sort of “wasm-ified” ELF-like system, like what we have now in our tool conventions. All of these are wasm but the way they interact inside is different.

SC: they have to agree on their own conventions, not specced, whereas the green arrows are specced.

PP: what’s an example of what those ELF things could actually be?

LW: for the .a, that’s already a thing that exists, multiple wasm files, can make them into a .a. Emscripten supports dynamic linking of Wasm files, uses JS.

PP: so in that case, JS wouldn’t be needed. So ok yeah in this case it needn’t be literally ELF, but ELF-like use of wasm

FM: question about handles, for people not familiar with handles, they are a handle on a resource you don’t have access to. I’m thinking about the other part, the lifecycle of the resource identified by the handle. The lifecycle is something known by the engine, if sharing a resource across modules or components, and now we’re looking at something like a service making sure the handles are collected properly.

LW: in the same sense you could say the wasm engine is a service which implements the runtime semantics of the spec. So you you could specify when destructors are called, handle lifetimes, etc.

FM: a component is running in 1 Wasm engine.

LW: at least one component. It should be possible/common that all the components are running inside one engine

FM: whereas the heavyweight model, you’re talking about multiple Wasm engines.

SC: the model for the components would be an isolate in V8?

LW: yeah, like an isolate. Maybe even a little smaller. I drew 4 here but it would be normal to have hundreds.

PP: we can think of multiple engines running Wasm on different machines, function as a service, what use cases do you see, how would it work. How does supporting GC languages play into this? What kind of problems will this be a solution for?

LW: when you have separate units of code from separate projects/languages tha you want to reuse: today what people often do is put in a whole separate container and talk over a socket. This would be much lighter weight (putting in a component and calling synchronously). This fits (back to “spectrum of granularity” slide). Today people take these and bundle them up and expose in JS. For the GC question: inside a component you can linear memory but also GC memory as a component implementation detail. I will propose that there’s no intra-component GC (will get to that later)

PP: those things compiled to Wasm, accessing them somewhere in some opaque form?

RT: can you clarify, what is the deployed wasm file or files here?

LW: a component is a .wasm, if it wants it can bundle all the code it contains, if we have a host that wants to share libc, then don’t bundle libc, then host can say you import the shared libraries, then component can use a module import of libc. Shared library can be outside the component, you’re not importing state, you’re importing raw machine code essentially.

RT: so the green arrows, how are those imports set up?

LW: can say that if a component is a root, it could bundle all 3. You can nest dependents inside the root. If the host is like node.js, maybe ESM is what is linking them together. Different host can have different ways to load separate components and link them according to the host. Simplest case is all 4 in one wasm file.

RT: at a high level, somewhere between green and blue, depending on the specifics of the host. They could be sections of the same file.

[use cases]

PP: imagine i’m a developer using C++/Rust/AssemblyScript etc. Don’t know how someone would use Typed Main, probably hard to use from C/C++, what are the user visible things?

LW: e.g. typed main: that requires a bit of language-specifc tooling. Dan has a prototype in Rust that we want to extend to C++/WASI. You just write main with typed parameters like file handles, strings, etc. when you compile that to components, the export uses IT. then the host runtime, when you invoke the CLI, it converts the CLI arguments into the main types. So the tooling allows you to emit those files that do all that. It could involve a lot of different tools. So there are lots of questions about how do we enable people to produce those.

PP: Typed Main is a new way of writing code, not yet mainstream, maybe not the best to talk about here. I don’t understand how if I am a C++ developer, how would this benefit me? We can define novel use cases, but if they are quite far out, we’re defining our own users and own solutions.

LW: the hope is that we should be able to tweak but not have to rewrite existing tools, an likewise not have to rewrite existing codebases, to adapt legacy code by doing adaptation at the boundaries. E.g. take legacy code with a normal main() but write a witx file. But it is on us to write the tools the developers can use.

PP: witx is for one very specific runtime, not everyone support it.

LW: I just meant that as a stand-in for being able to specify a signature for what my component exports. It doesn't have to be exactly witx

PP: would that also be part of this new layer? Part of the tooling story for components.

FM: these use cases are great, might be helpful if we could say what components are bringing to these use cases. Take the first 1 for example, developer import component from their native language. They can import a module, what does component bring to the table?

LW: in this one, they pass native values and don’t have to write manual glue code that e.g. creates a typed array view and passes the values through that

FM: I would justify it a different way: it’s the unit of commerce, developer brings in some functionality and they don’t really want to know how that functionality is constructed. These are great use cases, what is component bringing to the table to make these use cases better.

LW: the same toolchain/standards can support all of these and I can just make a component that can be reused in all or many of these different use cases. It reduces the combination of the number of ways a piece needs to be produced and consumed.

[composition use cases]

RT: the trapping one, sounds like you have to be able to catch other people’s trap.

LW: there’s a difference between saying my state is not corrupted v.s. When someone traps, what’s the blast radius. Not suggesting that when one traps, making sure others are not corrupted, as opposed to letting corruption happen and later get some bizarre results. Not handling traps.

PP: follow-up to host side, in previous side, potentially having a registry of components, if you’re pulling them, can you depend on them crashing and not everything dies

LW: if you import a dependency and it traps (in the current proposal) you trap too, similarly to if you call it today. There are ways dependencies can break you. That’s subtly different from silent corruption.

PP: we cannot promise this, for sufficiently large applications, likely to pull in one small component that spoils everything. The current way where you load modules, maybe there are ways to detect and stop.

LW: for now, when you talk about host-specific units. Many of those today talk about partial failure, e.g. container systems expect containers to crash. In the future we can ask, how might we be able to handle this within the component model too.

PP: streaming data, a lot of things have to happen under the hood for it to work, okay for this level of proposal because it is high level.

LW: there are subtleties with streams. There’s synchronous streaming, and async, which is harder. Ultimately I think we want both.

[Composition use cases (future features)]

RT: one slide we talked about portability, and nother talked about changes you could make to the module system. ES modules (one e.g. you gave) don’t support some of those changes you described. How do you think about that

LW: talked to ESM experts about, we can embed it within the ESM loader, loader loads the root component, then they determine how the DAGs contained in it are linked. The ESM loader is a big singleton register, module linking doesn’t describe it, what ESM calls a single instance is a DAG of component instances. Technically feasible. WHen you drop a whole DAG in ESM it looks just like 1 thing.

RT: I was more thinking about default imports, would components work differently in that case?

LW: when you instantiate components, the root’s imports are determined by host, take what you’re given. Up to ESM to say what gets supplied to the root component.

[static analyzability use cases]

AZ: everywhere you say “AOT compiler” could that be a toolchain that does this with compilation?

LW: either build aot or client side

AZ: you mention x-module inline, wasm vm does not do that, probably left to toolchain, are you imagining it better for engine to do that?

LW: in these multi-tenant hosts you don’t want the toolchain to have done this because you want to share code between the tenants. So in that case you want to be more careful about the boundaries where you inline,so it’s more nuanced. For the web it could well be the case that you want everything together.

KM: in the web, if you import wasm malloc, you want to some inline some part of that. Hypothetically some vm can decide it will be much faster if we inline it

PP: cost of inlining in multiple places, binary size

KM: for the malloc example, it could just be a bump allocator. Hypothetically the VM could figure it out.

[Requirements]

PP: do you expect this to change, how do you see this evolving?

LW: if we have these requirements it will limit it some. Sometimes use cases are in conflict. E.g. needing ubiquitous GC means it can’t be implemented in environments without GC. so if there are radically different requirements then maybe those uses need something other than this particular component model.

PP: what’s an example of, if you work on it, and find that it is not efficient, we can remove. What is fixed?

LW: just to be clear, GC is fine within component bodies. From the outside I don’t care whether that happens in e.g. a linear memory or with platform GC. but otherwise, i could imagine loosening static component linkage. Certainly there are lots of use cases for modules dynamically chosen by the user. That could also happen outside the component model, but through the host (e.g. lots of different possibilities, e.g. does it come via fetch or filesystem, etc). So there’s lots of design space.

FM: is this set of requirements spanning, if we have all these requirements we can do it?

LW: I’m not sure. It summarizes a lot of debates and challenges and tradeoffs we’ve discussed over the last couple of years. Lots of these are not the obvious thing, we had to discover them in some sense.

RT: Good to have a description of the sort of changeability you want to have. What can people rely on, if they make changes what will not break compatibility.

LW: what’s the subtyping relationship between components...

RT: subtyping can be part, there's also other research on changeability

LW: mentioned a few things earlier, that’s something we need to iterate on

RT: good for scoping and for others to know what guarantees the component system has

LW: in some sense it means defining a semver model for components, what qualifies as a breaking change.

PP: versioning this layer will be harder than versioning core spec. If you don’t support certain instructions yet, you can’t do this way of communicating between modules.

[spectrum slide]

The scope of the lightweight component model suggests the leftmost 2 belong in the spec now

RT: suggest not calling them canonical? Idea of separating adapter functions from IT is a good idea. Many ways to represent strings, the term canonical suggests “only” rather than “easy”.

LW: that’s open for discussion. I think it will end up being useful as a canonical ABI for wasi. It can serve that role, but there doesn’t have to be exactly one.

PP: question about previous slide. We can have a component model first, then rebase, if you think about linking, it is a sub case of component model. Can we do linking first? Don’t feel great about a broader proposal first, then a more narrow proposal.

LW: it makes sense to do module-linking first and written up/implemented. We could say IT and the rest are optional, to achieve the outcomes you want. But if we don’t scope it to the component module, I worry we will have scoping problems in the future.

PP: wouldn’t linking be useful even without the rest? Standardizing how we dynamically link things...

LW: the point is that there are lots of ways to dynamically link things. In the component model you have [one way to do it?]. But there’s not just one kind of linking, so we want module-linking to be scoped to component-model-style linking.

DG: next meeting is full, we might have to wait for some time.

LW: not a full thing, but just a couple of materials.

DG: yeah, the agenda is full

LW: enough time to do a basic poll whether people think this is a good idea? Or is it premature

DS: seems like there is a lot to chew on here

DG: one option is to have an issue in document to do a tentative poll and can have more discussion later

[Poll on general interest, or general agreement with the high-level direction]

SF F N A SA
17 6 4 2 0

Closure