From a9671af137a74aaef42ed487f54f06b377580341 Mon Sep 17 00:00:00 2001 From: Lin Clark Date: Mon, 15 Mar 2021 18:45:24 -0400 Subject: [PATCH] Add 03-11 meeting notes (#412) --- meetings/2021/WASI-03-11.md | 83 +++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/meetings/2021/WASI-03-11.md b/meetings/2021/WASI-03-11.md index 37476e27..a87a75c0 100644 --- a/meetings/2021/WASI-03-11.md +++ b/meetings/2021/WASI-03-11.md @@ -32,3 +32,86 @@ The meeting will be on a zoom.us video conference. 1. _Sumbit a PR to add your agenda item here_ [slides]: presentations/2021-03-11-gohman-typed-main/index.html + +### Attendees + +- Dan Gohman +- Lin Clark +- Sam Clegg +- Mingqiu Sun +- Ralph Squillace +- Radu Matei +- Piotr Sikora +- Till Schneidereit +- Johnnie Birch +- Andrew Brown +- Pat Hickey +- Peter Huene +- Sasha Syrotenko +- Yong He + +**Dan Gohman:** What is main in this context? The main function. Takes argc argv. You pass in ints, bools, strings, files, network connections. However, on all systems today they are passed in as strings. That’s how they are communicated. + +You pass in filename. Program gets string and passes it to the OS. The OS then gives the file. + +This has worked well for past 50 years, but does cause problems. Everyone does command line parsing in their own way. Single letter options sometimes works. Order sometimes matters. Get all kinds of differences. + +More important for WASI, child process needs same filesystem view and permissions of the parent. Parent will pass strings to things they can access. It also means that they hardcode how you ask for a stream. This is hardcoding in a particular kind of access. This means we have a dependency on filesystem permissions and filesystem views. We have a coarse grained way to handle this. + +What can Wasm change here? We sometimes call Wasm an ISA, kind of true. CPUs have also started converging which is part of why Wasm is possible. But there are some properties that Wasm ____. Memory is a big virtual address space of bytes. Another counterprperty across all systems is that calls are just jumps based on conventions. Syscalls are also based on that property. Arguments go in particular registers. We don’t get this with Wasm. Call arguments are part of the signature. Has implications in terms of emulating Unix. + +Wasm has a static type system in contrast to other ISAs. CPUs will allow you to run any code you want. If they don’t recognize an opcode, CPU won’t determine until that opcode is being run. Wasm in contrast has up-front validation. With a type system, you have relationships that span instructions, so its a very different way of organizing instructions. So we can try to either minimize the effect of the type system to maximize compat, or we could take advantage of the type system and make the most use of it to do new things other platforms can’t do. I think ultimately we want both, but this framework helps us understand how we can combine both. + +Quick look at Wasm type system. Like CPU types. Interface Types gives us signed and unsigned ints, bool, lists, variants, records, strings and handles. IT is very abstract so no concrete representation. Handles represent external objects. It roughly corresponds to file descriptors. Over time, we expect that as the Wasm platform expands, we’re going to have more options here, e.g. representing unforgeable handles. At the witx level, we can use the handle type and assume the underlying platform will give us a representation of a handle, and we don’t have to care about the underlying mechanics. + +When we have types, we also get signatures. Functions have signatures, and programs are functions. If you look at native programs, what is the signature? Native binaries don’t tell you the signature. Unix deals with this by saying all programs have the same signature: pass in an array of strings and return an int. What if we allowed Wasm programs to have real signatures? + +What would this look like in terms of CLI usage? Command line parsing happens in the engine. The program says I want an f32 and the program doesn’t have to parse it. Wasm engines could be local aware and use commas instead of dots. To the extent that we can move this into the engine, we can make it consistent across programs. + +What happens if the program has a handle argument? A program wants to have an input file. But in most cases, they don’t need an actual file. They just need something they can read from. So if the program declares it’s going to take a handle and the Wasm engine says that the user on the CLI is going to pass in a string, the engine can convert it. This solves the problem of the child/parent process permissioning. We can launch applications without permissions to the file system. There might not even be a filesystem on this host. + +If we can build up an ecosystem of programs that can do this, we can start thinking about composing programs together in a richer way. + +This is a cool functionality, but currently there are zero programs in the world that take advantage of this. So how are we going to make this work? There are three levels. + +Option A: out of the box. If the developer doesn’t make any changes, we can give the user that experience and make some improvements. We could provide a main function in the toolchain that takes strings from the program. + +Option B: provide a witx description. For developers who are willing to do a little work, they could create a witx file that describes the interface, and then a tool could wrap their program with a typed main. That would prepopulate the preopen table. + +Option C: Typed Main in the source language. What if programming languages would allow you to write a mian function where you specify they types you want. I’m putting together a crate for Rust that prototypes this. Nameless does this. This works in native programs. Once WASI is ready to do this, we can port this to WASI. If you’re interested in Nameless, it’s up on crates.io. Get in touch if you want to learn more about what it does and how it works. Optimistic we can do this in other languages as well. + +Current status: typed main is still fairely experimental. Witx result type is currently in witx tree. Haven’t built tool on top of that yet. New style commands is something that I added to LLVM earlier. Wasi-libc and rust support that now. Interface types are the foundation for many of the types, currently in phase 1 and there’s prototyping underway. + +How are we going to use it in WASI. This is something we’ve been waiting for for a while. Currently wasi-clock etc depends on ambient access. But we want this to be a handle that some things don’t have access to, but complication is how do you get that handle into the program. This is a way we can get capabilities into programs. Wasi-random is kind of the same story. Typed main and similar techniques are closely related to having program imports. More generally, this applies more generally to other suggested apis such as serial ports, audio devices, database connectison. With typed main, we can get those capabilities into programs. + +**Sam Clegg:** Awesome vision! In the limit, this would remove need for preopen. + +**Dan Gohman:** Yes, preopen will still be needed for Option A and B, but Option C will not use preopen. + +**Sam Clegg:** Why is it called nameless? + +**Dan Gohman:** because with this, you don’t need to know the name of the file. For example, sometimes you need a filename to get to media type. We could do things with that not using names. In general, we want to avoid filesystem dependencies. They bring in strings, unicode, a lot of details of the filesystem. Goal is to abstract that out so programs don’t need to worry about that. Not thinking about how you turn a string into a file. All of that is factored out. + +**Sam Clegg:** So file names is what you’re thinking of, but also handles other types like floats, and probably will get an enum or struct of options. + +**Dan Gohman:** Yes. May be more work on ergonomics, but basically all flags would be booleans. One of the fun things that namelses adds on top of that. If a string looks like a URL, nameless has support for automatically handling that URL and giving you a handle for it. You can launch a child process and pipe in the input. Your program doesn’t know were the data is coming from. A bunch of different schemes we can wire up. + +**Radu Matei:** How does this look for reactor modules that don’t use a main function. + +**Dan Gohman:** Related to imports in Wasm. Reactors would probably just use imports for everything. Imports would be provided, initalizer would run, and then reactor would be ready. + +**Andrew Brown:** How do you see the relative priority? What is stopped by not having typed main? + +**Dan Gohman:** Shared nothing linking. The idea of breaking up a program into components that don’t share underlying state and implied dependencies including implied dependency on shared filesystem state. Shared nothing linking isolates without the giant firewall. + +**Andrew Brown:** Blurs the line between library and executables. In terms of other proposals in WASI, like sockets. Are there any dependencies here? + +**Dan Gohman:** Sockets proposal in flight has an address pool that is related to the thinking here. It would be provided the pool and then becomes a UI question. High-level for me is factoring it out. Typed main isn’t tied to command line options, so we can package this up in a lot of different ways. + +**Lin Clark:** Switch to Ralph + +**Ralph Squillace:** Great presentation! BSD sockets PR stalled, we said we would reach out, had a nice chat. He’s enthusiastic that we reached out and he would like to complete it. My take was that he had a question about what modifications might need to be made yet. Also some awareness that changes would need to be made in the future. + +**Radu Matei:** The conversation about the missing APIs were fairly clear. Probably good to go back to original PR and talk through what will change, eg with IO Streams. He’s happy to get back to working on that and so are we. + +**Ralph Squillace:** Over-documentation at this point might be good at this point so that’s the principle that we’d like to stab out in this case.