HN.zip

WASM is not quite a stack machine

79 points by signa11 - 24 comments
stevefan1999 [3 hidden]5 mins ago
I'm trying to implement a WASM to C compiler, and because of that not-quite-so-stack behavior, I can actually guarantee that it will always build an expression, so I don't have to discard or reset stack values! Everything stays within that function, which is very neat, and I think it is one of the reasons WAT, the textual format, is so neat: you can represent it as an S-expression.
jedisct1 [3 hidden]5 mins ago
Compiling WASM to C is a really good option: https://00f.net/2023/12/11/webassembly-compilation-to-c/
ncruces [3 hidden]5 mins ago
Shameless plug… compiling it to Go is a great option too: https://github.com/ncruces/wasm2go

I've used it to translate SQLite (with a few extensions) and, that I know of, it's been used (to varying degrees of success) to translate the MARISA trie library (C++), libghostty (Zig), zlib, Perl, and QuickJS.

More on-topic, I use a mix of an unevaluated expression stack and a stack-to-locals approach to translate Wasm.
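
For readers unfamiliar with the "unevaluated expression stack" idea, here is a minimal sketch of it. It is not the wasm2go implementation; the instruction subset, the `to_expression` name, and the C-like output syntax are assumptions made for illustration. The translator keeps a stack of expression strings instead of runtime values, so a well-formed sequence collapses into one expression.

```python
# Hypothetical mini-translator: flat Wasm-like instructions -> one C-like expression.
# The supported subset and the output syntax are assumptions, not any real tool's behavior.

BINOPS = {"i32.add": "+", "i32.sub": "-", "i32.mul": "*"}


def to_expression(instrs):
    """Translate a flat instruction list into a single C-like expression string."""
    stack = []  # holds unevaluated expression strings, not runtime values
    for instr in instrs:
        op, *args = instr.split()
        if op == "local.get":
            stack.append(f"l{args[0]}")
        elif op == "i32.const":
            stack.append(args[0])
        elif op in BINOPS:
            rhs = stack.pop()  # the top of the stack is the right operand
            lhs = stack.pop()
            stack.append(f"({lhs} {BINOPS[op]} {rhs})")
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    if len(stack) != 1:
        raise ValueError("sequence does not leave exactly one value")
    return stack[0]


result = to_expression(
    ["local.get 0", "local.get 1", "i32.mul", "i32.const 2", "i32.add"]
)
print(result)  # prints ((l0 * l1) + 2)
```

A stack-to-locals approach would instead assign each pushed value to a fresh temporary, which is simpler but produces more verbose output.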

bsder [3 hidden]5 mins ago
But how do you handle arguments or loop index variables? Your liveness is the entire function? You have to compile all the WASM chunks together in order to do any optimization? That seems ... problematic.

Edit: Yep. In article referenced from the original: http://troubles.md/posts/wasm-is-not-a-stack-machine/

Double edit: Some of this has already been fixed in WASM: https://github.com/WebAssembly/multi-value

Hendrikto [3 hidden]5 mins ago
The series of articles linked at the end (troubles.md/posts/wasm-is-not-a-stack-machine/) is even more interesting, imo.

Very well articulated and concise critique by somebody who seems to have a great amount of knowledge and experience with the topics.

ufo [3 hidden]5 mins ago
The author seems to complain about a lack of stack manip expressions like dup and rot, but at least for me that's what I would expect from an average programming language stack machine. Even Java, which does have those instructions, doesn't use them --- reuse happens via local variables.

The way I see it, the difference between register and stack vms is all about the instruction encoding. Register VMs have fatter instructions in exchange for needing fewer LOAD and STORE operations. Despite the name, register VMs also have a stack.
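
A toy sketch of that encoding trade-off, with both instruction sets invented purely for illustration: the same computation, `frame[3] = frame[0] * frame[1] + frame[2]`, takes six thin instructions on the stack VM and two fat ones on the register VM.

```python
# Toy comparison; both instruction sets are invented for illustration only.

def run_stack(code, frame):
    """Stack VM: operators take their operands implicitly from the operand stack."""
    stack = []
    for op, *args in code:
        if op == "LOAD":
            stack.append(frame[args[0]])
        elif op == "STORE":
            frame[args[0]] = stack.pop()
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return frame


def run_register(code, frame):
    """Register VM: every instruction names its destination and source slots."""
    for op, dst, x, y in code:
        if op == "ADD":
            frame[dst] = frame[x] + frame[y]
        elif op == "MUL":
            frame[dst] = frame[x] * frame[y]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return frame


# Both programs compute frame[3] = frame[0] * frame[1] + frame[2].
stack_code = [("LOAD", 0), ("LOAD", 1), ("MUL",), ("LOAD", 2), ("ADD",), ("STORE", 3)]
register_code = [("MUL", 3, 0, 1), ("ADD", 3, 3, 2)]

stack_frame = run_stack(stack_code, [2, 3, 4, 0])
register_frame = run_register(register_code, [2, 3, 4, 0])
print(len(stack_code), len(register_code))  # prints 6 2
print(stack_frame[3], register_frame[3])    # prints 10 10
```

Both interpreters work on the same kind of frame; only the way operands are named differs.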

U1F984 [3 hidden]5 mins ago
Java does use dup in some cases, e.g.

   public static void test() { 
      new Object();
   }

         0: new           #2                  // class java/lang/Object
         3: dup
         4: invokespecial #1                  // Method java/lang/Object."<init>":()V
         7: pop
         8: return
asibahi [3 hidden]5 mins ago
This `dup` seems entirely useless; it actually supports the case for omitting it from the instruction set.
pjjpo [3 hidden]5 mins ago
> Despite the name, register VMs also have a stack.

Out of curiosity what do you think about this - in spite of the name, stack machines also have yet another stack. Ok I don't like that wording, but locals are basically the stack frames people know of from their computer arch class I think.

It doesn't change the fact that Wasm operations have to have the execution stack as one or more of the operands. Seems like a stack machine to me too, though I don't know more details on why the specific design of Wasm would make optimizing compilers harder to write than JVM as the article suggests (I think?).

ufo [3 hidden]5 mins ago
As you said, it's more like CPU stack frames. In a register VM all instructions can read and write to any position in the stack frame. In a stack VM, most instructions only read and write to the top but they are often combined with LOAD and STORE instructions that can access any position in the stack frame.
asibahi [3 hidden]5 mins ago
I don't really disagree with the main premise of the article, which is that WASM is not really a stack language, but this part just gave me pause:

> In textual Wasm, for example, they are instead represented in a LISP-like notation – not any less or more efficient

The text format, at least when it comes to instructions, is 1 to 1 with the binary format. The LISP-like syntax is mainly just syntax sugar[1].

    ‘(’ plaininstr  instrs ‘)’ ≡  instrs plaininstr
So (in theory, as far as I understand it) you can just do `(local.get 2 local.get 0 local.get 1)` to mean `local.get 0 local.get 1 local.get 2`, and it works for (almost) any instruction.
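
A minimal sketch of that desugaring, restricted to the fully folded form where every operand is its own parenthesized instruction (the form the tools are reported to accept). The function names are hypothetical, and block, loop, and if forms are deliberately not handled.

```python
def tokenize(text):
    """Split a folded instruction into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens, pos=0):
    """Parse one parenthesized form starting at tokens[pos]; return (tree, next_pos)."""
    if tokens[pos] != "(":
        raise ValueError("expected '('")
    pos += 1
    items = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
            items.append(child)
        else:
            items.append(tokens[pos])
            pos += 1
    return items, pos + 1


def unfold(tree):
    """Emit the nested operand instructions first, then the instruction itself."""
    head = [item for item in tree if isinstance(item, str)]  # opcode and immediates
    out = []
    for child in tree:
        if isinstance(child, list):
            out.extend(unfold(child))
    out.append(" ".join(head))
    return out


def unfold_text(text):
    """Convert one folded plain instruction into its flat instruction sequence."""
    tree, _ = parse(tokenize(text))
    return unfold(tree)


flat = unfold_text("(i32.add (local.get 0) (i32.mul (local.get 1) (i32.const 2)))")
print(flat)
# prints ['local.get 0', 'local.get 1', 'i32.const 2', 'i32.mul', 'i32.add']
```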

Unfortunately, in my limited testing, tools like `wat2wasm` and Binaryen's `wasm-as` don't seem to adhere to (my perhaps faulty understanding of) the spec, and demand all instructions in a folded block be folded and have the "correct" amount of arguments, which makes Binaryen do weird things like

    (return
      (tuple.make     ;; Binaryen only pseudoinstruction
        (local.get 0) ;; or w/e expression
        (local.get 1) ;; or w/e expression
      )
    )

when this is perfectly valid

    local.get 0
    local.get 1
    return

tl;dr: the LISP syntax is just syntax sugar. The textual format is as "stack-like" as the binary format.

Edit: An example that is easily done with the stack syntax and not with lisp syntax is the following:

    call function_that_returns_multivalue
    local.set 2 ;; last return
    local.set 1 ;;
    local.set 0 ;; first return
In LISP syntax this would be

    (local.set 0
      (local.set 1
        (local.set 2
          (call function_that_returns_multivalue
            ( ;; whatever input parameters
            )))))
I have not yet tried this with Binaryen but I doubt it flies.

[1]: https://webassembly.github.io/spec/core/text/instructions.ht...

soegaard [3 hidden]5 mins ago
FWIW if you are looking for examples of WebAssembly written in the textual format, take a look at:

https://raw.githubusercontent.com/soegaard/webracket/refs/he...

As a small example, here is a definition of `$car` which extracts the first value from a pair.

    (func $car (type $Prim1) 
               (param $v (ref eq)) 
               (result (ref eq))
      (if (result (ref eq)) 
          (ref.test (ref $Pair) (local.get $v))
          (then (struct.get $Pair $a (ref.cast (ref $Pair) (local.get $v))))
          (else (call $raise-pair-expected (local.get $v))
                (unreachable))))
purplesyringa [3 hidden]5 mins ago
> tl;dr: the LISP syntax is just syntax sugar. The textual format is as "stack-like" as the binary format.

Not that you're technically wrong, but I think you're begging the question.

Stack-based languages/encodings, in a colloquial sense, are equated to postfix notation, e.g. `a b +` instead of the infix `a + b`. Both LISP and textual Wasm use prefix notation, e.g. `(+ a b)`. None of the three is any more foundational than the others -- all notations can encode all expression trees, and postfix and prefix notations in particular have the same coding efficiency.
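
To make the equal-efficiency point concrete, here is a small sketch (the tuple-based tree representation is an assumption chosen for brevity): both traversals emit exactly one token per tree node, so the two encodings always have the same length.

```python
# Expression tree as nested tuples: (operator, left, right), or a leaf string.

def prefix(tree):
    """Operator first, then operands: the LISP-like order."""
    if isinstance(tree, str):
        return [tree]
    op, left, right = tree
    return [op] + prefix(left) + prefix(right)


def postfix(tree):
    """Operands first, then operator: the stack-machine order."""
    if isinstance(tree, str):
        return [tree]
    op, left, right = tree
    return postfix(left) + postfix(right) + [op]


tree = ("+", ("*", "a", "b"), "c")  # a * b + c
print(prefix(tree))   # prints ['+', '*', 'a', 'b', 'c']
print(postfix(tree))  # prints ['a', 'b', '*', 'c', '+']
```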

So sure, the LISP syntax is sugar, but for what? It's not sugar for a stack program, because prefix notation in general can't represent an arbitrary stack program; it's sugar for a mathematical expression. Which is encoded in postfix notation in binary, sure, but that's just an implementation detail, and prefix notation could've been selected when Wasm was born with few adverse consequences.

pjjpo [3 hidden]5 mins ago
I have reread this several times but might be missing something, so I am begging the question: what exactly makes the LISP syntax sugar for something that isn't a stack machine? Or did I misread that?

If not, I think the OP is making the same point we all are, any program can be translated for execution on any machine - so bringing it up in the blog seems weak, which I agree with.

asibahi [3 hidden]5 mins ago
I am saying that textual wasm uses `a b +` (just like binary wasm) and `(+ a b)` is just a nicety.

It is explicitly sugar for the stack operations, per my reading of the spec.

kg [3 hidden]5 mins ago
The lack of a dup opcode in Wasm as mentioned in the post is quite annoying when trying to generate compact code. I wish something like it had made it into the spec.
thomasmg [3 hidden]5 mins ago
You could use "local.tee". It is kind of "store" + "duplicate".
asibahi [3 hidden]5 mins ago
`local.tee` doesn't duplicate. It just doesn't remove the value from the stack. (So it is "just" `local.set` followed by `local.get`.)
thomasmg [3 hidden]5 mins ago
Sure. But it does save you one instruction: "tee", "get" instead of "set", "get", "get".
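
Both points can be checked with a tiny interpreter for a Wasm-like subset. This is a sketch: the `run` helper and its four-instruction subset are assumptions for illustration, not a real Wasm runtime.

```python
def run(instrs, initial_locals):
    """Run a tiny Wasm-like subset; return (operand stack, locals)."""
    stack = []
    local_vars = list(initial_locals)
    for instr in instrs:
        op, arg = instr.split()
        n = int(arg)
        if op == "i32.const":
            stack.append(n)
        elif op == "local.get":
            stack.append(local_vars[n])
        elif op == "local.set":
            local_vars[n] = stack.pop()
        elif op == "local.tee":
            local_vars[n] = stack[-1]  # store, but leave the value on the stack
        else:
            raise ValueError(f"unsupported instruction: {instr}")
    return stack, local_vars


# local.tee behaves like local.set followed by local.get on the same local.
with_tee = run(["i32.const 7", "local.tee 0"], [0])
with_set_get = run(["i32.const 7", "local.set 0", "local.get 0"], [0])
print(with_tee == with_set_get)  # prints True

# Emulating a duplicate: two instructions with tee, three without.
dup_with_tee = run(["i32.const 7", "local.tee 0", "local.get 0"], [0])
dup_with_set = run(["i32.const 7", "local.set 0", "local.get 0", "local.get 0"], [0])
print(dup_with_tee == dup_with_set)  # prints True
```

Either way the duplicate costs a scratch local, which is the overhead a dedicated dup opcode would avoid.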
shevy-java [3 hidden]5 mins ago
I am sad about WASM. It was a promise for epic greatness.

It has failed to deliver that - so much is clear now. You rarely see any awesome success story shown with regard to WASM nowadays. What happened to the old promises? "Electron will be SUPER fast thanks to WASM" or "use any language, WASM unifies it all for the larger browser ecosystem".

It feels as if WASM is on a step towards extinction. Sure, it is mentioned, it is used, but let's be honest - only few people really use it. And that won't change either.

bbkane [3 hidden]5 mins ago
Looks like you're getting downvoted, but the folks at Mozilla seem like they agree and are working towards making WASM more first class in the browser: https://hacks.mozilla.org/2026/02/making-webassembly-a-first...
sippeangelo [3 hidden]5 mins ago
Just recently I've compiled my side-project Rust game engine to WASM and it runs beautifully in the browser, as well as SSH2 to have a fully featured SSH implementation in the browser over a websocket transport.

It can obviously do amazing things, but the expectation for it to replace webdev frontend code was always a huge misconception. Though recent developments have made DOM access without a JavaScript translation layer possible, so that might change!

I'd say the hype is still very much alive.

purplesyringa [3 hidden]5 mins ago
There used to be hype about Wasm, now it's a technology as any other. It's still used, and used a lot; it just doesn't get focused on as much.
thomasmg [3 hidden]5 mins ago
Well there is Google Sheets, Microsoft Office, Figma, and some other heavier web apps.