Discussed here: https://news.ycombinator.com/item?id=42186507
> We implement LuaJIT Remake (LJR), a standard-compliant Lua 5.1 VM, using Deegen. Across 44 benchmarks, LJR's interpreter is on average 179% faster than the official PUC Lua interpreter, and 31% faster than LuaJIT's interpreter. LJR's baseline JIT has negligible startup delay, and its execution performance is on average 360% faster than PUC Lua and only 33% slower (but faster on 13/44 benchmarks) than LuaJIT's optimizing JIT.
Presentation by the author:
Deegen: A LLVM-based Compiler-Compiler for Dynamic Languages https://www.youtube.com/watch?v=5cAUX9QPj4Y
Slides https://aha.stanford.edu/sites/g/files/sbiybj20066/files/med...
Ongoing work documented here https://sillycross.github.io/ and some comments here https://lobste.rs/s/ftsowh/building_baseline_jit_for_lua
https://github.com/luajit-remake/luajit-remake
My heart sank at the description of it being LLVM-based. (I couldn't think of a worse choice for creating a JIT compiler.) Thankfully, they don't use LLVM at runtime! LLVM is only used for the static compilation of the JIT.
Why is LLVM a bad choice for JIT? Are you concerned about the optimization versus speed of compilation trade-off they chose?
- LLVM is huge. This rules it out for many use cases.
- LLVM's API is notoriously unstable. I've probably seen 10 different projects struggling with this problem. They're often tied to a specific version or version range, can't use the system install of LLVM, and bitrot fast. As one person put it, "I mean that there has been an insane amount of hours trying to make llvm as a dependency and that it doesn't break on other devices, one of zig's main complains about llvm." [in the thread at [1]]. In comparison, gccjit's API, while not as mature, is supposedly simpler, higher-level/easier to use, and more stable [1].
- LLVM was never designed to be used as a JIT compiler. It's slow, and the API isn't a good fit for that use case (the sketch below shows roughly what a minimal embedding looks like).
For example, I've been using Julia for the last couple of months, and it seems to me that the biggest problem with the language is LLVM! It suffers significantly from all three points above. And that's despite the fact that Julia is probably the best place to use LLVM for JIT compiling, since it cares so much about performance. After all the pain I've seen, I would never use LLVM as a dependency.
[1] https://www.reddit.com/r/ProgrammingLanguages/comments/19etc...
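For anyone who hasn't embedded LLVM before, here is roughly what the minimal ORC LLJIT "hello world" looks like. This is a sketch written against the LLVM 15-era C++ API; the return type of lookup() and the toPtr() call are exactly the sort of detail that has shifted between releases (older versions returned a JITEvaluatedSymbol and used getAddress() instead), which is the API-churn problem in a nutshell.

    // Sketch, not production code: JIT-compile "int add1(int x) { return x + 1; }"
    // through LLVM's ORC LLJIT and call it. Written against the LLVM 15-era API.
    #include "llvm/ExecutionEngine/Orc/LLJIT.h"
    #include "llvm/ExecutionEngine/Orc/ThreadSafeModule.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/TargetSelect.h"
    #include <cstdio>

    using namespace llvm;
    using namespace llvm::orc;

    int main() {
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      auto Ctx = std::make_unique<LLVMContext>();
      auto M = std::make_unique<Module>("jit_module", *Ctx);

      // Build the IR for add1 by hand.
      IRBuilder<> B(*Ctx);
      Type *I32 = B.getInt32Ty();
      auto *FT = FunctionType::get(I32, {I32}, /*isVarArg=*/false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "add1", M.get());
      B.SetInsertPoint(BasicBlock::Create(*Ctx, "entry", F));
      B.CreateRet(B.CreateAdd(F->getArg(0), B.getInt32(1)));

      // Hand the module to ORC and look the compiled symbol back up.
      auto JIT = cantFail(LLJITBuilder().create());
      cantFail(JIT->addIRModule(ThreadSafeModule(std::move(M), std::move(Ctx))));
      auto Addr = cantFail(JIT->lookup("add1"));  // Expected<ExecutorAddr> in recent LLVM;
                                                  // older releases returned JITEvaluatedSymbol
      auto *Add1 = Addr.toPtr<int (*)(int)>();
      std::printf("%d\n", Add1(41));              // prints 42
    }

On top of the code itself you link against a sizable chunk of LLVM, typically via something like llvm-config --cxxflags --ldflags --libs core orcjit native, which is where the "LLVM is huge" complaint bites.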
Very cool! Taking copy and patch to its natural conclusion
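For readers who haven't seen the earlier paper: as I understand it, the core trick is that pre-built machine-code "stencils" with holes in them are produced ahead of time, and "compilation" is just memcpy'ing a stencil and patching the holes with runtime constants and jump targets. Below is a deliberately tiny x86-64 Linux toy of that idea (all names are mine; real copy-and-patch extracts the stencils from clang/LLVM object files at build time rather than hand-writing bytes, and chains stencils in continuation-passing style):

    // Toy illustration only: a hand-written x86-64 stencil for
    //   int f() { return <HOLE>; }
    // copied into executable memory and patched at "JIT" time.
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <sys/mman.h>

    int main() {
        // b8 xx xx xx xx    mov eax, imm32   (bytes 1..4 are the hole)
        // c3                ret
        unsigned char stencil[] = { 0xb8, 0, 0, 0, 0, 0xc3 };
        const size_t holeOffset = 1;

        void *mem = mmap(nullptr, 4096, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) return 1;

        // "Copy" the stencil, then "patch" the hole with a runtime constant.
        std::memcpy(mem, stencil, sizeof(stencil));
        int32_t value = 42;
        std::memcpy(static_cast<unsigned char *>(mem) + holeOffset, &value, sizeof(value));

        // Flip the page to executable and call the freshly generated code.
        mprotect(mem, 4096, PROT_READ | PROT_EXEC);
        auto fn = reinterpret_cast<int (*)()>(mem);
        std::printf("%d\n", fn());   // prints 42

        munmap(mem, 4096);
        return 0;
    }

Since the stencils come out of LLVM at build time, the generated JIT has no LLVM dependency at runtime, which matches the point above about LLVM only being used for static compilation.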
Wish they had tried it with Ruby, which is known to be extremely hard to optimise.
Truffle exists.
It's Ruby for the JVM (granted, not LLVM).
https://github.com/oracle/truffleruby
If this can generate a V8/SpiderMonkey-class engine for new scripting languages, that would be incredible.
It is very exciting to get a multi-tier VM from just a bytecode-encoded version of the VM spec.
Yes! I've been waiting for a practical tool like this, and would love to write a JIT for Squirrel/Quirrel using it.
But I'm looking through the luajit-remake codebase, and there is still a lot of code. Assuming that the drt and deegen directories are Deegen (however, at least drt/tvalue.h is clearly part of the VM, not of Deegen):
In comparison, Lua 5.2.4 is 20.3k lines of C and LuaJIT 1.1.5, which is a (comparable?) method JIT compiler, is 22.8k lines of C and 4.8k lines of Lua (for dynasm and JIT support). LuaJIT 2.1 is 74.9k lines of C, 13.7k Lua.

I think a large part of that might be the language they chose. Every C++ code example in the paper feels extremely verbose to me, and I wonder to what degree that is inherently required for encoding language semantics, and to what degree it's C++ syntax being noisy.
This is not a critique of the authors, btw. Considering the breadth and depth of the various types of domain-specific knowledge that have to be "synthesized" on a project like this, developing a mastery of C++ is almost a given. So implementing things in C++ was likely the most natural approach for them. It might technically also be the most portable choice, since anyone who has LLVM installed will also have a C++ compiler.
I do wonder what it would be like if this were built upon a language with more appropriate "ergonomics", though. Maybe they can invent and implement a DSL for Deegen in Deegen, haha.
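To make the verbosity question a bit more concrete, here is a self-contained toy (not Deegen's actual API; every name below is made up) showing the general shape of spelling out the semantics of a dynamic "add" in plain C++. A real Lua VM would also need string coercion and the __add metamethod path, which is where the line count balloons:

    // Toy sketch, not Deegen: a boxed dynamic value and the semantic
    // function for a hypothetical ADD bytecode, written out in plain C++.
    #include <cstdio>
    #include <stdexcept>

    struct TValue {
        enum class Tag { Nil, Boolean, Number } tag = Tag::Nil;
        double num = 0.0;
        bool boolean = false;   // unused here; stands in for the other representations

        static TValue number(double v) { TValue t; t.tag = Tag::Number; t.num = v; return t; }
        bool isNumber() const { return tag == Tag::Number; }
    };

    // Every dynamic case has to be dispatched and written out by hand.
    TValue addImpl(const TValue &lhs, const TValue &rhs) {
        if (lhs.isNumber() && rhs.isNumber()) {
            return TValue::number(lhs.num + rhs.num);
        }
        throw std::runtime_error("attempt to perform arithmetic on a non-number value");
    }

    int main() {
        TValue r = addImpl(TValue::number(1.5), TValue::number(2.5));
        std::printf("%g\n", r.num);   // prints 4
    }

Even this trivial case needs the tag, the accessors, and the explicit type dispatch; how much of that a dedicated DSL could shave off compared to C++ is exactly the open question.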
See also https://stefan-marr.de/papers/oopsla-larose-et-al-ast-vs-byt... which demonstrates that we can do that with GraalVM/Truffle, and that the VM generated from the AST-based interpreter is even faster than the bytecode interpreter.
There is significant warmup required, which is not good for most programs. Deegen's approach is very promising for interactive use or other situations that require low latency.
There's warmup to get to the best possible performance, which, given that Deegen is a copy-and-patch baseline compiler, will be far above what Deegen can do. If you only care about Deegen-level performance, then GraalVM will warm up to that point quite quickly. And Deegen's approach cannot easily go beyond that level because it's not a full compiler.
I think the GraalVM/Truffle guys are also working on a copy-and-patch mode and warmup optimizations. So the real question is who gets to full generation of both a baseline and a full top-tier JIT from one codebase quicker.
I wonder if this would work for Python.
CPython merged[0] an experimental JIT compiler into mainline based on the author's previous paper, Copy-and-Patch.
0: https://peps.python.org/pep-0744/
The Faster CPython team at least is aware of the paper and probably will look into it
https://github.com/faster-cpython/ideas/issues/707
Do you mean more specifically than the generally similar approach that worked for https://pypy.org ?
~~It didn't work; PyPy moved away from partial evaluation years ago~~
Sorry, I think I was responding to completely the wrong comment. I would also like a more general-purpose tool for writing fast programming language implementations
I’m gonna need you to take about 15% off the top there, Squirelly Dan.