This article has a lot of interesting stuff, though it tends to be largely cheerleading LLVM's design decisions, rather than delving into the really interesting tradeoffs.
For example, some compilers, such as Open64/PathScale, use multiple levels of optimizer IR, which allows them to perform more high-level, language-specific optimizations. LLVM forces front-ends to lower their code directly down to its fairly low-level IR, thus throwing away some of this high-level information. It's possible to use LLVM's metadata system to preserve some of it, but this is subject to a variety of restrictions, and decorating a low-level IR with high-level annotations can be substantially less convenient than just using a higher-level IR. Clearly there are interesting tradeoffs here.
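To make the metadata option concrete, here is a minimal sketch in C++ of a front-end tagging a low-level instruction with a language-level fact. The metadata kind ("mylang.field_info") and the recorded string are made up for illustration, and the exact API spelling drifts a little between LLVM releases:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Metadata.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("demo", Ctx);
      IRBuilder<> B(Ctx);

      // void @f(i32* %obj) -- a function taking one pointer argument.
      auto *I32Ptr = PointerType::get(B.getInt32Ty(), 0);
      auto *FT = FunctionType::get(B.getVoidTy(), {I32Ptr}, false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "f", &M);
      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));

      // A load that, at the source level, came from a non-nullable field.
      LoadInst *L = B.CreateLoad(B.getInt32Ty(), F->getArg(0), "field");

      // Attach a custom metadata node recording the high-level fact.
      MDNode *Fact = MDNode::get(Ctx, {MDString::get(Ctx, "non-nullable field")});
      L->setMetadata("mylang.field_info", Fact);

      B.CreateRetVoid();
      M.print(outs(), nullptr);  // the load now prints with a !mylang.field_info attachment
      return 0;
    }

Optimization passes are free to drop metadata they don't understand, which is exactly the kind of restriction alluded to above.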
For another example, the author ribs Perl, Python, Ruby, and Java for not sharing backend code, and yet, years later, LLVM has still not been proven broadly viable for such languages, despite numerous efforts. Maybe someday it will be, but it's pretty clear that there are, yes, tradeoffs in play. Having everyone implement their own x86 backend has its downsides, but trying to get everyone to use one x86 backend has its downsides too.
I spent a couple of years working on a .NET runtime that used LLVM as its back-end and found it to be more than adequate as a base for that ecosystem, specifically languages like C# and C++/CLI. Some changes were needed to LLVM in order to support precise garbage collection, but they were fairly minimal. Debugging worked very well; it wasn't as full-featured as Visual Studio's .NET debugging environment, but it was about as functional as debugging C++ code.
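For anyone curious, the stock hooks LLVM provides for precise GC look roughly like the sketch below: mark a function with a GC strategy and register stack roots with the llvm.gcroot intrinsic. This is LLVM's generic mechanism, not necessarily exactly what we did:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/Intrinsics.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("gcdemo", Ctx);
      IRBuilder<> B(Ctx);

      auto *I8Ptr = PointerType::get(B.getInt8Ty(), 0);
      auto *FT = FunctionType::get(B.getVoidTy(), false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "managed_fn", &M);
      F->setGC("shadow-stack");  // one of LLVM's built-in GC strategies

      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));

      // Stack slot holding a GC-managed reference, registered as a root so the
      // collector can find (and update) it.
      AllocaInst *Slot = B.CreateAlloca(I8Ptr, nullptr, "obj.slot");
      Function *GCRoot = Intrinsic::getDeclaration(&M, Intrinsic::gcroot);
      B.CreateCall(GCRoot, {Slot, ConstantPointerNull::get(I8Ptr)});

      B.CreateRetVoid();
      M.print(outs(), nullptr);
      return 0;
    }

Newer code tends to use the statepoint infrastructure instead, but the idea is the same: tell the code generator where the managed references live.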
I think LLVM is a fantastic code generator for statically typed languages, managed (Java/C#) or unmanaged (C/C++). Although my experience with the LLVM JIT isn't recent, when I did use it I found it made a great science experiment but wasn't very practical. For this project, the JIT ended up having to compile essentially everything at once in order to determine object vtables and the like, which made it more of a runtime linker than a just-in-time compiler.
I don't think that it would be very suitable as a back end for a dynamically typed language, as Unladen Swallow's unfortunate project trajectory seems to bear out. Not that it's impossible to use LLVM as part of a dynamic language infrastructure, but I do think there are other environments that are more suited to the task.
> This article has a lot of interesting stuff, though it tends to be largely cheerleading LLVM's design decisions
This is not very surprising: the author of the article is Chris Lattner, one of the architects of LLVM and Clang. Maybe he should have made this clear up front.
RFC 1738 - Uniform Resource Locators (URL) - recommends the use of angle brackets when URLs are embedded in text; see its "Recommendations for URLs in Context" appendix.
"A second success story is perhaps the most unfortunate, but also most popular way to reuse compiler technology: translate the input source to C code (or some other language) and send it through existing C compilers."
The author forgets to mention the more recent "success" story of compiling languages to JavaScript. It's a shame for the industry that such a thing exists.
Can anyone share experience or pointers on writing an LLVM backend? I tinkered with making an AVR backend, but LLVM was rather new at the time and had few examples to go by. I would like to give it another shot!
Interesting. Before this article I thought that LLVM was hard to understand or use (without a top-level compiler like Clang), but now that I see what a snippet of LLVM Intermediate Language looks like, it makes me want to experiment with it a little. The IR looks a lot like IL, which is not that hard to understand either.
Next step: learn how to compile these examples into an object file and link it.
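In case it saves someone a search, here is a rough sketch of that step through the C++ API (roughly the LLVM 14-17 surface; header paths and enum spellings drift between releases, and the target triple is hard-coded purely as an example):

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/MC/TargetRegistry.h"
    #include "llvm/Support/CodeGen.h"
    #include "llvm/Support/FileSystem.h"
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"
    #include "llvm/Target/TargetOptions.h"

    using namespace llvm;

    int main() {
      InitializeAllTargetInfos();
      InitializeAllTargets();
      InitializeAllTargetMCs();
      InitializeAllAsmPrinters();

      LLVMContext Ctx;
      Module M("demo", Ctx);
      IRBuilder<> B(Ctx);

      // int main() { return 42; }
      auto *FT = FunctionType::get(B.getInt32Ty(), false);
      auto *F = Function::Create(FT, Function::ExternalLinkage, "main", &M);
      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));
      B.CreateRet(B.getInt32(42));

      // Target selection: an x86-64 Linux triple, purely as an example.
      std::string Triple = "x86_64-unknown-linux-gnu";
      std::string Err;
      const Target *T = TargetRegistry::lookupTarget(Triple, Err);
      if (!T) { errs() << Err << "\n"; return 1; }

      TargetOptions Opts;
      TargetMachine *TM =
          T->createTargetMachine(Triple, "generic", "", Opts, Reloc::PIC_);
      M.setTargetTriple(Triple);
      M.setDataLayout(TM->createDataLayout());

      // Emit demo.o (spelled CodeGenFileType::ObjectFile on very recent LLVM).
      std::error_code EC;
      raw_fd_ostream Out("demo.o", EC, sys::fs::OF_None);
      legacy::PassManager PM;
      if (TM->addPassesToEmitFile(PM, Out, nullptr, CGFT_ObjectFile)) {
        errs() << "target can't emit object files\n";
        return 1;
      }
      PM.run(M);
      Out.flush();
      return 0;
    }

The resulting demo.o links like any other object file (e.g. with clang or the system linker). For quick experiments, `llc -filetype=obj example.ll` does the same job from the command line.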