POPL 2025
Sun 19 - Sat 25 January 2025 Denver, Colorado, United States

WebAssembly (Wasm) is an evolving language with several proposals for feature extensions. Some features, like SIMD support, target common hardware capabilities, while others propose high-level language features like garbage collection and stack switching. However, existing Wasm compilers do not support these high-level features in a scalable, extensible, and modular way. These compilers either rely on custom, hand-tuned toolchains or require complicated reconstruction of high-level Wasm features from a general but low-level intermediate representation (IR) like LLVM IR.

Custom compilation toolchains for dedicated source languages (e.g., Go, AssemblyScript) use IRs tailored for producing Wasm. While these compilers can be extended to implement newer Wasm features, compiler developers must duplicate their efforts for each new source language. On the other hand, compilers built on general frameworks like LLVM first lower high-level Wasm features into LLVM IR and then rely on the backend to reconstruct the higher-level feature. While this approach can be scalable and modular, it suffers from a fundamental issue: the current LLVM IR, designed for traditional assembly targets, is too low-level to natively support the new higher-level Wasm features. Generating higher-level Wasm code from LLVM IR would require either re-abstraction algorithms or extensions to the LLVM IR, neither of which is ideal. Implementing a re-abstraction algorithm, such as Relooper for reconstructing high-level Wasm control flow, demands long development times from expert developers, even though this high-level information is already present in the source language. Extending LLVM IR is also impractical, as LLVM's extensibility is limited. For instance, adding a new type to LLVM IR requires substantial effort because it changes the bitcode format, breaking compatibility with other components.

We address this limitation of existing Wasm compilers by implementing a new compiler pipeline based on MLIR (Multi-Level Intermediate Representation), which can be easily extended to support different source-language front-ends as well as new high-level Wasm features.

MLIR is a compiler infrastructure that hosts multiple IRs, called dialects, at different levels of abstraction. We propose a new WebAssembly dialect in MLIR, lifting Wasm from being an LLVM target to being a part of the MLIR ecosystem. To implement a new Wasm feature, a developer would simply identify a dialect that expresses the proposed feature at the required level of abstraction and then provide a simple lowering pass to our Wasm dialect, without the need for complicated reconstruction. For example, the standard MLIR structured control flow (scf) dialect can richly express high-level Wasm control flow, removing the need for a Relooper-like algorithm. Similarly, dialects with non-local control flow, like continuations, could be lowered easily using stack-switching constructs, which are nontrivial to generate from LLVM IR. While MLIR already supports compiling to Wasm through the LLVM backend, supporting a native Wasm dialect is more extensible and maintainable, making it well-suited to the evolving nature of Wasm. We anticipate that this approach will benefit a wide range of languages targeting MLIR.