This article has a lot of interesting stuff, though it tends to be largely cheerleading LLVM's design decisions, rather than delving into the really interesting tradeoffs.
For example, some compilers, such as Open64/PathScale, use multiple levels of optimizer IR, which allows them to perform more high-level, language-specific optimizations. LLVM forces front-ends to lower their code directly down to its fairly low-level IR, thus throwing away some of this high-level information. It's possible to use LLVM's metadata system to preserve some of it, but this is subject to a variety of restrictions, and decorating a low-level IR with high-level annotations can be substantially less convenient than just using a higher-level IR. Clearly there are interesting tradeoffs here.
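To make the metadata option concrete, here is a minimal sketch in C++ of a front-end tagging a low-level instruction with a language-level fact. The metadata kind ("mylang.field_info") and the recorded string are made up for illustration, and the exact API spelling drifts a little between LLVM releases:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Metadata.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("demo", Ctx);
      IRBuilder<> B(Ctx);

      // void @f(i32* %obj) -- a function taking one pointer argument.
      auto *I32Ptr = PointerType::get(B.getInt32Ty(), 0);
      auto *FT = FunctionType::get(B.getVoidTy(), {I32Ptr}, false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "f", &M);
      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));

      // A load that, at the source level, came from a non-nullable field.
      LoadInst *L = B.CreateLoad(B.getInt32Ty(), F->getArg(0), "field");

      // Attach a custom metadata node recording the high-level fact.
      MDNode *Fact = MDNode::get(Ctx, {MDString::get(Ctx, "non-nullable field")});
      L->setMetadata("mylang.field_info", Fact);

      B.CreateRetVoid();
      M.print(outs(), nullptr);  // the load now prints with a !mylang.field_info attachment
      return 0;
    }

Optimization passes are free to drop metadata they don't understand, which is exactly the kind of restriction alluded to above.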
For another example, the author ribs Perl, Python, Ruby, and Java for not sharing backend code, and yet, years later, LLVM has still not been proven broadly viable for such languages, despite numerous efforts. Maybe someday it will be, but it's pretty clear that there are, yes, tradeoffs in play. Having everyone implement their own x86 backend has its downsides, but trying to get everyone to use one x86 backend has its downsides too.
I spent a couple of years working on a .NET runtime that used LLVM as its back-end and found it to be more than adequate as a base for that ecosystem, specifically languages like C# and C++/CLI. Some changes were needed to LLVM in order to support precise garbage collection, but they were fairly minimal. Debugging worked very well; it wasn't as full-featured as Visual Studio's .NET debugging environment, but it was about as functional as debugging C++ code.
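For anyone curious, the stock hooks LLVM provides for precise GC look roughly like the sketch below: mark a function with a GC strategy and register stack roots with the llvm.gcroot intrinsic. This is LLVM's generic mechanism, not necessarily exactly what we did:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Intrinsics.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("gcdemo", Ctx);
      IRBuilder<> B(Ctx);

      auto *I8Ptr = PointerType::get(B.getInt8Ty(), 0);
      auto *FT = FunctionType::get(B.getVoidTy(), false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "managed_fn", &M);
      F->setGC("shadow-stack");  // one of LLVM's built-in GC strategies

      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));

      // Stack slot holding a GC-managed reference, registered as a root so the
      // collector can find (and update) it.
      AllocaInst *Slot = B.CreateAlloca(I8Ptr, nullptr, "obj.slot");
      Function *GCRoot = Intrinsic::getDeclaration(&M, Intrinsic::gcroot);
      B.CreateCall(GCRoot, {Slot, ConstantPointerNull::get(I8Ptr)});

      B.CreateRetVoid();
      M.print(outs(), nullptr);
      return 0;
    }

Newer code tends to use the statepoint infrastructure instead, but the idea is the same: tell the code generator where the managed references live.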
I think LLVM is a fantastic code generator for statically typed languages, managed (Java/C#) or unmanaged (C/C++). Although my experience with the LLVM JIT isn't recent, when I did use it I found it made a great science experiment but wasn't very practical. For this project, the JIT ended up having to compile essentially everything at once in order to determine object vtables and the like, which made it more of a runtime linker than a just-in-time compiler.
I don't think that it would be very suitable as a back end for a dynamically typed language, as Unladen Swallow's unfortunate project trajectory seems to bear out. Not that it's impossible to use LLVM as part of a dynamic language infrastructure, but I do think there are other environments that are more suited to the task.
> This article has a lot of interesting stuff, though it tends to be largely cheerleading LLVM's design decisions
This is not very surprising: the author of the article is Chris Lattner, one of the architects of LLVM and Clang. Maybe he should have made this clear up front.
RFC 1738 - Uniform Resource Locators (URL) - recommends the use of angle brackets when URLs are embedded in text; see its "Recommendations for URLs in Context" appendix.
"A second success story is perhaps the most unfortunate, but also most popular way to reuse compiler technology: translate the input source to C code (or some other language) and send it through existing C compilers."
The author forgets to mention the more recent "success" story of compiling languages to JavaScript. It's a shame for the industry that such a thing exists.
Can anyone share experience or pointers on writing an LLVM backend? I tinkered with making an AVR backend, but LLVM was rather new at the time and had few examples to go by. I would like to give it another shot!
Interesting. Before this article I thought that LLVM was hard to understand or use (without a top-level compiler like Clang), but now that I see what a snippet of LLVM Intermediate Language looks like, it makes me want to experiment with it a little. The IR looks a lot like IL, which is not that hard to understand either.
Next step: learn how to compile these examples into an object file and link it.
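In case it saves someone a search, here is a rough sketch of that step through the C++ API (roughly the LLVM 14-17 surface; header paths and enum spellings drift between releases, and the target triple is hard-coded purely as an example):

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/MC/TargetRegistry.h"
    #include "llvm/Support/CodeGen.h"
    #include "llvm/Support/FileSystem.h"
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"
    #include "llvm/Target/TargetOptions.h"

    using namespace llvm;

    int main() {
      InitializeAllTargetInfos();
      InitializeAllTargets();
      InitializeAllTargetMCs();
      InitializeAllAsmPrinters();

      LLVMContext Ctx;
      Module M("demo", Ctx);
      IRBuilder<> B(Ctx);

      // int main() { return 42; }
      auto *FT = FunctionType::get(B.getInt32Ty(), false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "main", &M);
      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));
      B.CreateRet(B.getInt32(42));

      // Target selection: an x86-64 Linux triple, purely as an example.
      std::string Triple = "x86_64-unknown-linux-gnu";
      std::string Err;
      const Target *T = TargetRegistry::lookupTarget(Triple, Err);
      if (!T) { errs() << Err << "\n"; return 1; }

      TargetOptions Opts;
      TargetMachine *TM =
          T->createTargetMachine(Triple, "generic", "", Opts, Reloc::PIC_);
      M.setTargetTriple(Triple);
      M.setDataLayout(TM->createDataLayout());

      // Emit demo.o (spelled CodeGenFileType::ObjectFile on very recent LLVM).
      std::error_code EC;
      raw_fd_ostream Out("demo.o", EC, sys::fs::OF_None);
      legacy::PassManager PM;
      if (TM->addPassesToEmitFile(PM, Out, nullptr, CGFT_ObjectFile)) {
        errs() << "target can't emit object files\n";
        return 1;
      }
      PM.run(M);
      Out.flush();
      return 0;
    }

The resulting demo.o links like any other object file (e.g. with clang or the system linker). For quick experiments, `llc -filetype=obj example.ll` does the same job from the command line.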