An MLIR Dialect for WebAssembly (WAW 2025 - WebAssembly Workshop)

Who

Byeongjee Kang, Harsh Desai, Limin Jia, Brandon Lucia

Track

WAW 2025

Time Zone

The program is currently displayed in (GMT-07:00) Mountain Time (US & Canada).

When

Mon 20 Jan 2025 09:40 - 10:05 at Dodgeball - Session 1

Abstract

WebAssembly (Wasm) is an evolving language with several proposals for feature extensions. Some proposals, like SIMD support, target common hardware capabilities, while others add high-level language features like garbage collection and stack switching. However, existing Wasm compilers do not support these high-level features in a scalable, extensible, and modular way. These compilers either rely on custom, hand-tuned toolchains or require complicated reconstruction of high-level Wasm features from a general but low-level intermediate representation (IR) like LLVM IR.

Custom compilation toolchains for dedicated source languages (e.g., Go, AssemblyScript) use IRs tailored for producing Wasm. While these compilers can be extended to implement newer Wasm features, compiler developers must duplicate that effort for each new source language. On the other hand, compilers built on general frameworks like LLVM first convert high-level source constructs into lower-level LLVM IR and then rely on the backend to reconstruct the higher-level Wasm feature. While this approach can be scalable and modular, it suffers from a fundamental issue: the current LLVM IR, designed for traditional assembly targets, is too low-level to natively support the new higher-level Wasm features. Generating higher-level Wasm code from LLVM IR would require either re-abstraction algorithms or extensions to the LLVM IR, neither of which is ideal. Implementing a re-abstraction algorithm, such as Relooper for reconstructing high-level Wasm control flow, demands long development times from expert developers, even though this high-level information is already present in the source language. Extending LLVM IR is also impractical because LLVM's extensibility is limited. For instance, adding a new type to LLVM IR requires substantial effort because it changes the bitcode format, breaking compatibility with other components.
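The information loss described above can be sketched in a few lines. The example below is only an illustration, not Relooper and not LLVM's actual data structures: `flatten_while` and `back_edges` are hypothetical helper names, and the block labels are made up. It flattens a structured `while` loop into labeled basic blocks, as a CFG-based IR would, and then shows that recovering the loop requires a graph search for a back edge.

```python
def flatten_while(cond, body):
    """Lower `while cond: body` to labeled basic blocks (a toy CFG).

    After this step the loop exists only implicitly, as an edge that
    jumps back to an earlier block.
    """
    return {
        "entry":  {"ops": [],   "succ": ["header"]},
        "header": {"ops": cond, "succ": ["body", "exit"]},  # conditional branch
        "body":   {"ops": body, "succ": ["header"]},        # jumps back to header
        "exit":   {"ops": [],   "succ": []},
    }


def back_edges(cfg, entry="entry"):
    """Rediscover loops with a depth-first search.

    An edge whose target is still on the DFS stack is a back edge,
    which marks a loop in the flattened graph.
    """
    on_stack, done, found = set(), set(), []

    def visit(block):
        on_stack.add(block)
        for succ in cfg[block]["succ"]:
            if succ in on_stack:
                found.append((block, succ))
            elif succ not in done:
                visit(succ)
        on_stack.discard(block)
        done.add(block)

    visit(entry)
    return found


cfg = flatten_while(cond=["i < 10"], body=["i = i + 1"])
loops = back_edges(cfg)
print(loops)
```

Finding back edges is only the first step of a real re-abstraction algorithm, which must also handle nested loops, multiple exits, and irreducible control flow.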

We address this limitation of existing Wasm compilers by implementing a new, custom compiler pipeline based on MLIR (Multi-Level Intermediate Representation), which can be easily extended to support different source language front-ends as well as new high-level Wasm features.

MLIR is a compiler infrastructure that hosts multiple IRs, called dialects, at different levels of abstraction. We propose a new WebAssembly dialect in MLIR, lifting Wasm from being an LLVM target to being a part of the MLIR ecosystem. To implement a new Wasm feature, a developer would simply identify a dialect that expresses the proposed feature at the required level of abstraction and then provide a simple lowering pass to our Wasm dialect, without the need for complicated reconstruction. For example, the standard MLIR structured control flow (scf) dialect can richly express high-level Wasm control flow, removing the need for a Relooper-like algorithm. Similarly, dialects with non-local control flow, like continuations, could be lowered easily using stack-switching constructs, which are nontrivial to generate from LLVM IR. While MLIR already supports compiling to Wasm through the LLVM backend, supporting a native Wasm dialect is more extensible and maintainable, making it well-suited for the evolving nature of Wasm. We anticipate that this approach will benefit a wide range of languages targeting MLIR.
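To make the scf example concrete, here is a minimal sketch of why structured control flow lowers to Wasm without reconstruction. It does not use MLIR or the authors' dialect: the `Op`, `If`, and `While` classes and the `lower` function are hypothetical stand-ins for structured IR operations and a lowering pass. Because the input already nests its regions, each construct maps directly onto Wasm's `block`, `loop`, and `if` instructions in a single recursive walk.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Op:
    """A straight-line instruction, already in Wasm text form."""
    text: str


@dataclass
class If:
    """Structured conditional: `cond` leaves an i32 on the stack."""
    cond: List["Node"]
    then: List["Node"]
    orelse: List["Node"] = field(default_factory=list)


@dataclass
class While:
    """Structured loop: `body` repeats while `cond` yields nonzero."""
    cond: List["Node"]
    body: List["Node"]


Node = Union[Op, If, While]


def lower(nodes, depth=0):
    """Emit flat Wasm text lines for a list of structured nodes."""
    pad = "  " * depth
    out = []
    for n in nodes:
        if isinstance(n, Op):
            out.append(pad + n.text)
        elif isinstance(n, If):
            out += lower(n.cond, depth)
            out.append(pad + "if")
            out += lower(n.then, depth + 1)
            if n.orelse:
                out.append(pad + "else")
                out += lower(n.orelse, depth + 1)
            out.append(pad + "end")
        elif isinstance(n, While):
            out.append(pad + "block")
            out.append(pad + "  loop")
            out += lower(n.cond, depth + 2)
            out.append(pad + "    i32.eqz")
            out.append(pad + "    br_if 1")  # condition false: exit enclosing block
            out += lower(n.body, depth + 2)
            out.append(pad + "    br 0")     # jump back to the loop header
            out.append(pad + "  end")
            out.append(pad + "end")
    return out


# while (i < 10) { i = i + 1 }
program = [
    While(
        cond=[Op("local.get $i"), Op("i32.const 10"), Op("i32.lt_s")],
        body=[Op("local.get $i"), Op("i32.const 1"), Op("i32.add"),
              Op("local.set $i")],
    )
]
wat = lower(program)
print("\n".join(wat))
```

The walk never inspects a control-flow graph; the nesting of the input dictates the nesting of the output, which is the property the abstract attributes to lowering from a structured dialect.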

Byeongjee Kang

Carnegie Mellon University

United States

Harsh Desai

Carnegie Mellon University

Limin Jia

Carnegie Mellon University

Brandon Lucia

Carnegie Mellon University, USA

Session Program

Mon 20 Jan
Displayed time zone: Mountain Time (US & Canada)

09:00 - 10:30	Session 1 (WAW) at Dodgeball

09:00 40m Keynote	Full-Stack Correctness in Wasm: Eliminating Bugs Inside and Outside the Sandbox (WAW). Chris Fallin (F5)
09:40 25m Talk	An MLIR Dialect for WebAssembly (WAW). Byeongjee Kang (Carnegie Mellon University), Harsh Desai (Carnegie Mellon University), Limin Jia (Carnegie Mellon University), Brandon Lucia (Carnegie Mellon University, USA)
10:05 25m Talk	Meta-tracing Interpreters in WebAssembly (WAW). Andrew Brown (Portland State University), Andrew Tolmach (Portland State University)