These are tests that use some more LLVM tools (llvm-objdump, llvm-dwarfdump, not). Could you try after building these tools in addition to FileCheck? Do the TPDE-LLVM tests, which use the same tools, pass with this setup?
They did use Cranelift's IR itself (among others, not just LLVM) to show the performance improvements, which makes it complementary rather than a replacement, but you raise a good point. Quite possibly it is not as full-featured yet, so perhaps in the future, if at all.
The TPDE-based back-end compiles 4.27x faster than Cranelift and 2.68x faster than Cranelift with its fast register allocator, but is 1.74x slower than Winch.
They're hitting another design point on the compile time vs. code-quality tradeoff curve, which is interesting. They compile 4.27x faster than Cranelift with default (higher quality) regalloc, but Cranelift produces code that runs 1.64x faster (section 6.2.2).
This isn't too surprising to me, as the person who wrote Cranelift's current regalloc (hi!) -- regalloc is super important to run-time perf, so for Wasmtime's use-case at least, we've judged that it's worth the compile time.
TPDE is pretty cool and it's great to see more exploration in compiler architectures!
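To make the tradeoff concrete, here is a rough back-of-the-envelope sketch of where the break-even point sits. It assumes the two ratios quoted above (4.27x faster compile, 1.64x slower generated code) apply uniformly to one workload and that total cost is simply compile time plus run time; both are simplifications, and the function names are made up for illustration.

```python
# Hypothetical helpers; the default ratios are the ones quoted in the comment above.
# Assumption: total cost = compile time + run time, and both ratios hold uniformly.

def total_time(compile_s, run_s):
    """Total wall-clock cost of compiling once and then running the code."""
    return compile_s + run_s


def break_even_run_time(cranelift_compile_s, compile_speedup=4.27, run_slowdown=1.64):
    """Cranelift run time at which both back-ends cost the same in total.

    Below this run time the faster compiler wins overall; above it,
    the better code from the slower compiler pays for itself.
    """
    compile_saved = cranelift_compile_s * (1 - 1 / compile_speedup)
    extra_run_per_second = run_slowdown - 1
    return compile_saved / extra_run_per_second


if __name__ == "__main__":
    c = 1.0  # seconds Cranelift spends compiling (made-up input)
    r = break_even_run_time(c)
    print(f"break-even Cranelift run time: {r:.3f}s")
    print(f"Cranelift total: {total_time(c, r):.3f}s")
    print(f"TPDE total:      {total_time(c / 4.27, r * 1.64):.3f}s")
```

Under these assumptions the faster compiler only comes out ahead when the code runs for less than roughly 1.2x the time Cranelift would have spent compiling it, which fits the point about long-running Wasmtime workloads favoring better register allocation.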
TPDE is a framework for writing a back-end for various SSA IRs.
TPDE-LLVM is an LLVM back-end written using TPDE, but TPDE itself is independent of LLVM.
The paper also mentions back-ends written for Cranelift's IR and Umbra IR using TPDE.
One thing I never understood in this context here (fast JIT/debug builds/hot reloads/-O0) is why you would need much static linking. Generally your modules are going to have a DAG relationship. Even code inside a large compilation unit could potentially be factored out (automatically) into smaller modules. Could you not just generate a bunch of small dynamically linked libraries? Would the system dynamic loader become the speed bottleneck? Even if so, wouldn't reloading just a portion of the DAG in a hot-reload context be much faster than linking everything beforehand?
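As a rough illustration of the "reload just a portion of the DAG" idea, here is a minimal sketch that computes which modules would need reloading after one of them changes. The module names and the `modules_to_reload` helper are hypothetical, and it conservatively assumes every transitive dependent of a changed module must be reloaded, which a real loader might avoid when the interface is unchanged.

```python
from collections import defaultdict, deque


def modules_to_reload(deps, changed):
    """Return the changed module plus everything that transitively depends on it.

    deps maps each module name to the list of modules it depends on.
    """
    # Invert the edges: for each module, record who depends on it.
    dependents = defaultdict(set)
    for module, needs in deps.items():
        for needed in needs:
            dependents[needed].add(module)

    # Breadth-first walk upward from the changed module.
    to_reload = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in to_reload:
                to_reload.add(dependent)
                queue.append(dependent)
    return to_reload


if __name__ == "__main__":
    # Made-up module graph for illustration only.
    deps = {
        "app": ["ui", "net"],
        "ui": ["core"],
        "net": ["core"],
        "core": [],
        "log": [],
    }
    print(sorted(modules_to_reload(deps, "ui")))
    print(sorted(modules_to_reload(deps, "core")))
```

In this toy graph a change to a leaf-facing module like `ui` touches only two modules, while a change to `core` touches almost everything, so how much a hot reload saves depends heavily on where in the DAG the edit lands.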
https://github.com/tpde2/tpde
In the llvm/llvm-project repository
In the tpde repository /Stable/bin/clang

There are some failures:
```
% /tmp/out/custom/bin/llvm-lit out/debug/tpde/test/filetest
...
Failed Tests (5):
  TPDE FileTests :: codegen/eh-frame-arm64.tir
  TPDE FileTests :: codegen/eh-frame-x64.tir
  TPDE FileTests :: codegen/simple_ret.tir
  TPDE FileTests :: codegen/tbz.tir
  TPDE FileTests :: tir/duplicate_funcs.tir
```
Seems like a pretty neat fast compiler backend for LLVM. Why the extra buzzwords?
Edit: LLM to LLVM
Wait - it’s 8-24x faster than O0 while producing code on par with O3???