The LLVM Compiler Infrastructure
Maintained by:
Chris Lattner
Projects built with LLVM

This page is an incomplete list of the projects built with LLVM, sorted in reverse chronological order. The idea of this list is to show some of the things that have been done with LLVM for various course projects or for other purposes, which can be used as a source of ideas for future projects. Another good place to look is the list of published papers and theses that use LLVM.

Note that this page is not intended to reflect the current state of LLVM or to endorse any particular project over another. It is just a showcase of the hard work various people have done. It also shows a bit about how the capabilities of LLVM have evolved over time.

We are always looking for new contributions to this page. If you work on a project that uses LLVM for a course or a publication, we would definitely like to hear about it, and would like to include your work here as well. Please just send email to Chris Lattner with an entry like those below. We're not particularly looking for source code (though we welcome source-code contributions through the normal channels), but instead would like to put up the "polished results" of your work, including reports, papers, presentations, posters, or anything else you have.

This project describes the development of a compiler front end producing LLVM Assembly Code for a Java-like programming language. It is used in a course on Compilers to show how to incrementally design and implement the successive phases of the translation process by means of common tools such as JFlex and Cup. The source code developed at each step is made available.

By Fernando Pereira and Jens Palsberg, UCLA.

In this project, we have shown that register allocation can be viewed as solving a collection of puzzles. We model the register file as a puzzle board and the program variables as puzzle pieces; pre-coloring and register aliasing fit in naturally. For architectures such as x86, SPARC V8, and StrongARM, we can solve the puzzles in polynomial time, and we have augmented the puzzle solver with a simple heuristic for spilling. For SPEC CPU2000, our implementation is as fast as the extended version of linear scan used by LLVM. Our implementation produces Pentium code that is of similar quality to the code produced by the slower, state-of-the-art iterated register coalescing algorithm of George and Appel augmented with extensions by Smith, Ramsey, and Holloway.

Project page with a link to a tool that verifies the output of LLVM's register allocator.

Faust Real-Time Signal Processing System

FAUST is a compiled language for real-time audio signal processing. The name FAUST stands for Functional AUdio STream. Its programming model combines two approaches: functional programming and block diagram composition. You can think of FAUST as a structured block diagram language with a textual syntax. The project aims at developing a new backend for Faust that will directly produce LLVM IR instead of the C++ class Faust currently produces. With a (yet to come) library version of the Faust compiler, it will allow developers to embed Faust + LLVM JIT to dynamically define, compile on the fly, and execute Faust plug-ins. LLVM IR and its tools also allow some nice bytecode manipulations, such as partial evaluation and specialization, which will also be investigated.

By Adobe Systems Incorporated

Efficient use of the computational resources available for image processing is a goal of the Adobe Image Foundation project. Our language, "Hydra", is used to describe single- and multi-stage image processing kernels, which are then compiled and run on a target machine within a larger application. Similarly to how its namesake had many heads, our Hydra can be run on the GPU or alternately on the host CPU(s). AIF uses LLVM for our CPU path.

The first Adobe application to use our system is the soon-to-ship After Effects CS3. We welcome you to try out our public beta found at labs.adobe.com.

By Domagoj Babic, UBC.

Calysto is a scalable context- and path-sensitive SSA-based static assertion checker. Unlike other static checkers, Calysto analyzes SSA directly, which means that it checks not only the original code, but also the front-end (including SSA optimizations) of the compiler that was used to compile the code. The advantages of doing static checking on SSA are language independence and the fact that the checked code is much closer to the generated assembly than the source code is.

Several main factors contribute to Calysto's scalability:

  • A novel SSA symbolic execution algorithm that exploits the structure of the control flow graph to minimize the number of paths that need to be considered.
  • Lazy interprocedural analysis.
  • Tight integration with the Spear automated theorem prover, designed for software static checking.
  • And, of course, fast implementations of the basic algorithms in LLVM (dominator trees, postdominance, etc.).
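As an illustration of the kind of basic algorithm the last item refers to, here is a minimal sketch of the textbook iterative dataflow computation of dominator sets. It is not LLVM's implementation and not Calysto's code; the function name `dominators`, the dictionary-based CFG encoding, and the diamond-shaped example graph are all assumptions made for this example.

```python
def dominators(cfg, entry):
    """Map each node to the set of nodes that dominate it.

    cfg is a hypothetical encoding: a dict from node name to a list of
    successor names. Every node must appear as a key.
    """
    nodes = set(cfg)

    # Build predecessor sets from the successor lists.
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}

    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = set.intersection(*incoming) if incoming else set()
            new = new | {n}  # a node always dominates itself
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# A diamond: entry branches to a and b, which both flow into join.
cfg = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
result = dominators(cfg, "entry")
```

In the diamond, neither `a` nor `b` dominates `join`, because each can be bypassed through the other branch; only `entry` and `join` itself remain in its dominator set.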

Currently, Calysto is still in the development phase, and the first results are encouraging. Most likely, the first public release will happen some time in fall 2007. Spear- and Calysto-generated benchmarks are available.

By Fernando Pereira, UCLA.

The register allocation problem has an exact polynomial solution when restricted to programs in Static Single Assignment (SSA) form. Although striking, this major theoretical accomplishment has yet to be endorsed empirically. This project consists of the implementation of a complete SSA-based register allocator using the LLVM compiler framework. We have implemented a static transformation of the target program that simplifies the implementation of the register allocator and improves the quality of the code that it generates. We also describe novel techniques to perform register coalescing and SSA elimination. In order to validate our allocation technique, we extensively compare different flavors of our method against a modern and heavily tuned extension of the linear-scan register allocator described here. The proposed algorithm consistently produces faster code when the target architecture provides a small number of registers. For instance, we have achieved an average speed-up of 9.2% when limiting the number of registers to four general-purpose and three reserved registers. By augmenting the algorithm with an aggressive coalescing technique, we have been able to raise the speed improvement to 13.0%.
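The polynomial-time result rests on a known property: the interference graph of a program in SSA form is chordal, so a greedy coloring that visits variables in an order consistent with dominance uses the minimum number of registers. The sketch below shows only that greedy step. It is not the allocator described above (which also handles spilling, coalescing, and SSA elimination); the function name `greedy_color`, the graph encoding, and the example graph are assumptions for illustration.

```python
def greedy_color(order, interference):
    """Assign each variable the lowest register not used by a colored neighbor.

    order: variables listed so that each definition comes after the
           definitions that dominate it (assumed, not checked here).
    interference: dict from variable to the set of variables live at the
                  same time.
    """
    color = {}
    for v in order:
        used = {color[u] for u in interference[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


# Hypothetical chordal interference graph: a, b, c are simultaneously live
# (a triangle), and d overlaps only with c.
interference = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
colors = greedy_color(["a", "b", "c", "d"], interference)
registers_needed = max(colors.values()) + 1
```

Here the largest clique has three variables, and the greedy pass uses exactly three registers, reusing register 0 for `d`.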

This project was supported by Google's Summer of Code initiative. Fernando Pereira is funded by CAPES under process number 218603-9.

Project page.

By Michael O. McCracken, UCSD.

The LENS Project is intended to improve the task of measuring programs and investigating their behavior. LENS defines an external representation of a program in XML to store useful information that is accessible based on program structure, including loop structure information.

LENS defines a flexible naming scheme for program components based on XPath and the LENS XML document structure. This gives users and tools a uniform interface for selectively querying program behavior: they can ask a variety of questions about program components, and any tool that understands the query can answer. Queries, metrics, and program structure are all stored in the LENS file, and are annotated with version names that support historical comparison and scientific record-keeping.
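To make the idea of XPath-based naming concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath. The document layout below (the `program`, `function`, `loop`, and `metric` elements and their attributes) is entirely hypothetical; the actual LENS schema is not described on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical LENS-like file: program structure with metrics attached to loops.
LENS_DOC = """
<program name="demo">
  <function name="main">
    <loop id="L1" depth="1">
      <metric name="trip_count" version="v1">100</metric>
      <loop id="L2" depth="2">
        <metric name="trip_count" version="v1">8</metric>
      </loop>
    </loop>
  </function>
  <function name="helper"/>
</program>
"""


def query_metric(root, function, metric):
    """Return {loop id: metric value} for every loop inside one function."""
    results = {}
    # Name the components by structure: all loops nested in the given function.
    for loop in root.findall(f".//function[@name='{function}']//loop"):
        # Look only at metrics attached directly to this loop.
        node = loop.find(f"metric[@name='{metric}']")
        if node is not None:
            results[loop.get("id")] = int(node.text)
    return results


root = ET.fromstring(LENS_DOC)
trip_counts = query_metric(root, "main", "trip_count")
```

The query names loops by where they sit in the program structure, so a tool that knows nothing about how the metrics were produced can still ask for them.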

Compiler writers can use LENS to expose results of transformations and analyses for a program easily, without worrying about display or handling information overload. This functionality has been demonstrated using LLVM. LENS uses LLVM for two purposes: first, to generate the initial program structure file in XML using an LLVM pass, and second, as a demonstration of the advantages of selective querying for compiler information, with an interface built into LLVM that allows LLVM passes to easily respond to queries in a LENS file.

Trident is a compiler for floating point algorithms written in C, producing Register Transfer Level VHDL descriptions of circuits targeted for reconfigurable logic devices. Trident automatically extracts parallelism and pipelines loop bodies using conventional compiler optimizations and scheduling techniques. Trident also provides an open framework for experimentation, analysis, and optimization of floating point algorithms on FPGAs and the flexibility to easily integrate custom floating point libraries.

Trident uses the LLVM C/C++ front-end to parse input languages and produce low-level platform independent code.

Ascenium is a fine-grained continuously reconfigurable processor that handles most instructions at hard-wired speeds while retaining the ability to be targeted by conventional high level languages, giving users "all the performance of hardware, all the ease of software."

The Ascenium team prefers LLVM bytecodes as input to its code generator for several reasons:

  • LLVM's all-inclusive format makes global optimizations and consolidations such as global data dependency analysis easy.
  • LLVM's rich and strictly typed format generally makes subtle and sophisticated optimizations easy.
  • LLVM's great ancillary tools and documentation make it easy to work with -- even hardware geeks can understand it!

Ascenium's HOT CHIPS 17 presentation describes the architecture and compiler in more detail.

By Carl Friedrich Bolz, Richard Emslie, Eric van Riet Paap and the rest of the PyPy Team

The PyPy Project is a reimplementation of Python written in Python itself, that is flexible and easy to experiment with. Our long-term goals are to target a large variety of platforms, small and large, by providing a compiler toolsuite that can produce custom Python versions. Platform, Memory and Threading models are to become aspects of the translation process - as opposed to encoding low level details into a language implementation itself. Eventually, dynamic optimization techniques - implemented as another translation aspect - should become robust against language changes.

At the time of this writing, PyPy targets LLVM and C.

By Tobias Nurmiranta

This is a small Scheme compiler for LLVM, written in Scheme. It is good enough to compile itself and work.

The code is quite similar to the code in chapter five of the book SICP (Structure and Interpretation of Computer Programs), with the difference that it implements the extra functionality that SICP assumes the explicit control evaluator (virtual machine) already has. Much of the compiler's functionality is implemented in a subset of Scheme, llvm-defines, which are compiled to LLVM functions.

The LLVM Visualization Tool (LLVM-TV) can be used to visualize the effects of transformations written in the LLVM framework. Our visualizations reflect the state of a compilation unit at a single instant in time, between transformations; we call these saved states "snapshots". A user can visualize a sequence of snapshots of the same module---for example, as a program is being optimized---or snapshots of different modules, for comparison purposes.

Our target audience consists of developers working within the LLVM framework, who are trying to understand the LLVM representation and its analyses and transformations. In addition, LLVM-TV has been designed to make it easy to add new kinds of program visualization modules. LLVM-TV is based on the wxWidgets cross-platform GUI framework, and uses AT&T Research's GraphViz to draw graphs.

Wiki page with overview, design doc, and user manual. You can download llvm-tv from LLVM SVN (http://llvm.org/svn/llvm-project/television/trunk).

Linear scan register allocation is a fast global register allocation algorithm, first presented in Linear Scan Register Allocation as an alternative to the more widely used graph coloring approach. In this paper, I apply the linear scan register allocation algorithm in a system with SSA form and show how to improve the algorithm by taking advantage of lifetime holes and memory operands, and how to eliminate the need for reserving registers for spill code.
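For readers unfamiliar with the baseline, the following is a minimal sketch of basic linear scan over live intervals. It deliberately omits the improvements this project describes (lifetime holes, memory operands, SSA-specific handling), and it is not the code from the report. The function name `linear_scan`, the `(start, end)` interval encoding, and the example intervals are assumptions for illustration.

```python
def linear_scan(intervals, num_regs):
    """Allocate registers to live intervals in a single pass.

    intervals: dict from variable name to (start, end) positions.
    Returns a dict from variable name to a register index or "spill".
    """
    free = list(range(num_regs))
    active = []  # (end, name) pairs for intervals holding a register, sorted by end
    assignment = {}

    # Visit intervals in order of increasing start point.
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, expired = active.pop(0)
            free.append(assignment[expired])

        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
            active.sort()
        elif active and active[-1][0] > end:
            # Spill the active interval that ends last and take its register.
            _, victim = active[-1]
            assignment[name] = assignment[victim]
            assignment[victim] = "spill"
            active[-1] = (end, name)
            active.sort()
        else:
            # The new interval ends last (or there are no registers): spill it.
            assignment[name] = "spill"
    return assignment


# Hypothetical live intervals, allocated with two registers.
intervals = {"a": (0, 10), "b": (1, 3), "c": (2, 8), "d": (4, 6)}
allocation = linear_scan(intervals, 2)
```

With two registers, `a` is spilled when `c` arrives because `a` has the furthest end point, and `d` later reuses the register freed by `b`.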

Project report: PS, PDF

By eXtensible Systems, Inc.

The XPS project's purpose is to make programming computers easier by raising the level of abstraction in programming languages beyond the current practice. By using XML as a means for extensibility, XPS will support both meta-programming and domain engineering. In particular, it will make the creation of new Domain-Specific Languages very easy. By moving the programming abstraction into the problem domain, the "impedance mismatch" between the problem domain and the solution domain is all but eliminated.

XPS combines an XML-based programming language, XPL, with a robust virtual machine making it easier to develop applications by hiding all the "computer science" and increasing the level of abstraction without losing performance. True to its name, XPL is highly extensible. It permits extension of both the programming language and the virtual machine with relative ease. Somewhat counter-intuitively, XPL is not a particularly programmer friendly language. It is designed to be fast, efficient, and easily compilable. It is expected that higher level (e.g. domain specific) languages will be designed that translate into XPL. These facilities support meta-programming and domain engineering so that software can be written using domain-specific vocabularies. The goal is to make it possible for the lay person to program computers without having to learn complicated programming languages or understand the tenets of computer science.

Currently, XPS is under development. It has just completed its 0.2.0 release, which includes a basic XPL compiler that can reproduce its XPL input. The next release, 0.3.0 (Summer 2005), will compile XPL to executable code via LLVM's facilities. The decision to use LLVM was made in November 2003, as it provides a much simpler and more modern compiler infrastructure than the other open source alternatives. Using LLVM for the "back end" of XPS will accelerate the development of XPS because many of the compilation and execution details are taken care of by LLVM.

Further information about XPS can be obtained at http://x-p-s.org/.

"Traditional architectures use the hardware instruction set for dual purposes: first, as a language in which to express the semantics of software programs, and second, as a means for controlling the hardware. The thesis of the Low-Level Virtual Architecture project is to decouple these two uses from one another, allowing software to be expressed in a semantically richer, more easily-manipulated format, and allowing for more powerful optimizations and whole-program analyses directly on compiled code.

The semantically rich format we use in LLVA, which is based on the LLVM compiler infrastructure's intermediate representation, can best be understood as a "virtual instruction set". This means that while its instructions are closely matched to those available in the underlying hardware, they may not correspond exactly to the instructions understood by the underlying hardware. These underlying instructions we call the "implementation instruction set." Between the two layers lives the translation layer, typically implemented in software.

In this project, we have taken our next logical steps in this effort by (1) porting the entire Linux kernel to LLVA, and (2) engineering an environment in which a kernel can be run directly from its LLVM bytecode representation -- essentially, a minimal, but complete, emulated computer system with LLVA as its native instruction set. The emulator we have invented, llva-emu, executes kernel code by translating programs "just-in-time" from the LLVM bytecode format to the processor's native instruction set."

Project report: PS, PDF

By Brian Fahs

"As every modern computer user has experienced, software updates and upgrades frequently require programs and sometimes the entire operating system to be restarted. This can be a painful and annoying experience. What if this common annoyance could be avoided completely or at least significantly reduced? Imagine only rebooting your system when you wanted to shut your computer down or only closing an application when you wanted rather than when an update occurs. The purpose of this project is to investigate the potential of performing dynamic patching of executables and create a patching tool capable of automatically generating patches and applying them to applications that are already running. This project should answer questions like: How can dynamic updating be performed? What type of analysis is required? Can this analysis be effectively automated? What can be updated in the running executable (e.g., algorithms, organization, data, etc.)?"

Project report: PS, PDF

By Tanya Brethour, Joel Stanley, and Bill Wendling

"In this report we present implementation details, empirical performance data, and notable modifications to an algorithm for PRE based on [the 1999 TOPLAS SSAPRE paper]. In [the 1999 TOPLAS SSAPRE paper], a particular realization of PRE, known as SSAPRE, is described, which is more efficient than traditional PRE implementations because it relies on useful properties of Static Single-Assignment (SSA) form to perform dataflow analysis in a much more sparse manner than the traditional bit-vector-based approach. Our implementation is specific to an SSA-based compiler infrastructure known as LLVM (Low-Level Virtual Machine)."

Project report: PS, PDF

"We present the design and implementation of Jello, a retargetable Just-In-Time (JIT) compiler for the Intel IA32 architecture. The input to Jello is a C program statically compiled to Low-Level Virtual Machine (LLVM) bytecode. Jello takes advantage of the features of the LLVM bytecode representation to permit efficient run-time code generation, while emphasizing retargetability. Our approach uses an abstract machine code representation in Static Single Assignment form that is machine-independent, but can handle machine-specific features such as implicit and explicit register references. Because this representation is target-independent, many phases of code generation can be target-independent, making the JIT easily retargetable to new platforms without changing the code generator. Jello's ultimate goal is to provide a flexible host for future research in runtime optimization for programs written in languages which are traditionally compiled statically."

Note that Jello eventually evolved into the current LLVM JIT, which is part of the tool lli.

Project report: PS, PDF