firefox-wasm-tail-call-benchmark
Title: Testing Firefox Wasm Tail Call
Brief: Or why assumptions are not always correct.
Date: 1708076705
Tags: Wasm, Interpreters
CSS: /style.css

### Lore ###

Interpretation comes at a cost: the more you nest, the more complex things become.
This is especially true on the web, where any user program already sits on layer upon layer
of interfaces. It gets pretty funny: I can't even run a ZX Spectrum emulator written in JavaScript at more than a few frames per second.

A lot of software targeting the web ships its own languages and interpreters (such as Godot and GDScript), and in realtime, simulation-intensive cases the overheads do matter.

One of the things often suggested for improving interpreter performance is `tail calling`,
and empirically it works on native platforms. [Check this post](https://mort.coffee/home/fast-interpreters/).

And so I wondered: could it work for the Wasm platform? After all, Firefox recently [pushed support](https://bugzilla.mozilla.org/show_bug.cgi?id=1846789) for an [experimental spec](https://github.com/WebAssembly/tail-call/blob/main/proposals/tail-call/Overview.md) of it.

### Results ###

I based the test interpreter on the `fast-interpreters` post linked above.
Sources are available on [GitHub](https://github.com/quantumedbox/wasm-tail-call-interpreter-benchmark). It does nothing but increment a counter until 100000000,
which is a relevant case for nothing but instruction decoding, which is exactly what we are testing here.

First, native:

```
time ./jump-table

real    0m3,094s
user    0m3,082s
sys     0m0,012s

time ./tail-call

real    0m2,491s
user    0m2,485s
sys     0m0,005s
```

A run-time decrease of `19.3%`! Formidable.

But on the web things get more interesting:

```
tail-call.wasm (cold): 10874ms - timer ended

jump-table.wasm (cold): 6610ms - timer ended
```

Tail calls are actually slower in this case (the jump-table build finishes in `39.2%` less time), and I'm not sure why yet.
Intuition proven wrong, but testing it first proved useful :)

Note: I'm running this on an amd64 CPU, stable Firefox 122.0, compiled with Zig's bundled Clang version 16.

It seems JIT compilation is the way to go on the web: fold everything down to Wasm bytecode instead of interpreting on top of it.

But overall, with a plain jump table the overhead over native is a *mere* `113.6%`, which I would say isn't critical for a lot of cases, especially if the interpreter is intended mostly as an interface adapter, as is the case with GDScript.