Architecture of Ceylon
The Ceylon project comprises the following major subsystems:
- compiler,
- launcher and module runtime,
- documentation compiler,
- IDE, and
- language module.
Contrary to common belief, compilers aren't magical, nor even as difficult as you probably imagine. You don't need a PhD to understand the Ceylon compiler.
What we call the typechecker, which lives in its own
directory of the repository, is actually responsible for much more than just
typechecking. It includes:
- an ANTLR-based lexer/parser,
- a typesafe syntax tree for the Ceylon language,
- a model of the types encountered in the code, and
- a type analysis engine.
The lexer and parser are generated from the ANTLR grammar
defined in the file
Ceylon.g. The parser builds a syntax
tree representing the input source code.
The syntax tree is currently generated from a specification
defined in the file
Ceylon.nodes (but this might change in
the future). The syntax tree has a Java class that represents each
syntactic construct in the language. An instance of the tree
represents the code in a certain compilation unit.
The model is an abstract representation of the types that are available to the compiler, not just in the compilation unit being compiled. Indeed, the compiler is even able to build a model for classes it encounters in precompiled module archives. However, note that the model contains much less information than the tree. For example, it does not contain any information about the procedural code contained in a class, method, or attribute.
The type analysis engine consists of several visitor classes that implement the rules defined in the language specification. They walk the syntax tree, validating the rules that correct Ceylon code must satisfy and attaching errors to tree nodes that fail to satisfy them. In addition, the type analysis visitors build up a model of the types they encounter in the tree, and create links from the tree to the associated model objects. Thus, typing information is available to the compiler when it comes to transforming the syntax tree to Java.
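To make the tree/model/visitor split concrete, here is a minimal sketch in Java. Every name in it (`VisitorSketch`, `TypeModel`, `AssignabilityVisitor`, and so on) is made up for illustration and is not taken from the typechecker's source. It only shows the general shape: a visitor walks the tree, links each node to a model object, and attaches an error to the node that breaks a rule.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature of the tree / model / visitor split.
// None of these class names come from the real typechecker.
final class VisitorSketch {

    /** Model object: an abstract description of a type. */
    static final class TypeModel {
        final String name;
        TypeModel(String name) { this.name = name; }
    }

    /** Syntax tree node: carries its own errors and a link to the model. */
    abstract static class Node {
        final List<String> errors = new ArrayList<>();
        TypeModel typeModel; // link from tree to model, set by the visitor
        abstract void accept(Visitor visitor);
    }

    static final class StringLiteral extends Node {
        final String text;
        StringLiteral(String text) { this.text = text; }
        @Override void accept(Visitor visitor) { visitor.visit(this); }
    }

    static final class IntegerLiteral extends Node {
        final long value;
        IntegerLiteral(long value) { this.value = value; }
        @Override void accept(Visitor visitor) { visitor.visit(this); }
    }

    /** Stands for a declaration such as: Integer x = 1; */
    static final class ValueDeclaration extends Node {
        final String declaredType;
        final String name;
        final Node initializer;
        ValueDeclaration(String declaredType, String name, Node initializer) {
            this.declaredType = declaredType;
            this.name = name;
            this.initializer = initializer;
        }
        @Override void accept(Visitor visitor) { visitor.visit(this); }
    }

    interface Visitor {
        void visit(StringLiteral node);
        void visit(IntegerLiteral node);
        void visit(ValueDeclaration node);
    }

    /** Checks one toy rule: the initializer must have exactly the declared type. */
    static final class AssignabilityVisitor implements Visitor {
        private final Map<String, TypeModel> types = new HashMap<>();

        private TypeModel type(String name) {
            return types.computeIfAbsent(name, TypeModel::new);
        }

        @Override public void visit(StringLiteral node) { node.typeModel = type("String"); }

        @Override public void visit(IntegerLiteral node) { node.typeModel = type("Integer"); }

        @Override public void visit(ValueDeclaration node) {
            node.initializer.accept(this);            // type the child first
            node.typeModel = type(node.declaredType); // then link this node to the model
            if (node.initializer.typeModel != node.typeModel) {
                // The error is attached to the offending node, not thrown.
                node.errors.add("initializer of type " + node.initializer.typeModel.name
                        + " is not assignable to " + node.declaredType);
            }
        }
    }

    static List<String> check(ValueDeclaration declaration) {
        declaration.accept(new AssignabilityVisitor());
        return declaration.errors;
    }
}
```

Because errors live on nodes rather than being thrown, a later consumer of the tree (a backend or an editor) can still walk the whole tree and find both the typing information and the problems. The toy rule compares types by identity only; real assignability checks are of course far richer.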
The typechecker has no dependencies on anything JVM-specific, so it can be reused with other backends.
Type analysis takes place in three phases. The type system was designed to never require more than three passes over the syntax tree.
- DeclarationVisitor creates model objects for each named declaration and keeps track of the scope in which it occurs.
- The second phase resolves import statements and explicit type declarations, and assigns types to the model objects for explicitly typed declarations.
- ExpressionVisitor analyses the types of expressions, resolves member references, reports typing errors, and infers types of declarations without explicit type declarations.
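The ordering of the phases matters, and a small sketch may help show why. The Java below is a toy with hypothetical names (`ThreePhaseSketch`, `Decl`, `Model`); it is not the typechecker's code and handles only integer literals and references to other declarations. It illustrates that, because the first two phases run over the whole compilation unit before any expression is analysed, an expression can refer to an explicitly typed declaration that appears later in the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical three-pass analysis over a flat list of declarations.
final class ThreePhaseSketch {

    /** Model object for a named declaration. */
    static final class Model {
        final String name;
        String type; // null until phase 2 or phase 3 assigns it
        Model(String name) { this.name = name; }
    }

    /** Tree node for "Type name = expr;". declaredType is null when the type is to be inferred. */
    static final class Decl {
        final String declaredType;
        final String name;
        final String expr; // an integer literal or the name of another declaration
        Model model;       // link from tree to model, set in phase 1
        Decl(String declaredType, String name, String expr) {
            this.declaredType = declaredType;
            this.name = name;
            this.expr = expr;
        }
    }

    final Map<String, Model> scope = new LinkedHashMap<>();
    final List<String> errors = new ArrayList<>();

    /** Phase 1: create a model object for every named declaration. */
    void declarationPhase(List<Decl> unit) {
        for (Decl d : unit) {
            d.model = new Model(d.name);
            scope.put(d.name, d.model);
        }
    }

    /** Phase 2: assign types to explicitly typed declarations. */
    void typePhase(List<Decl> unit) {
        for (Decl d : unit) {
            if (d.declaredType != null) {
                d.model.type = d.declaredType;
            }
        }
    }

    /** Phase 3: type expressions, report errors, infer missing declaration types. */
    void expressionPhase(List<Decl> unit) {
        for (Decl d : unit) {
            String exprType = typeOf(d.expr);
            if (exprType == null) {
                continue; // an error was already recorded by typeOf
            }
            if (d.model.type == null) {
                d.model.type = exprType; // inference
            } else if (!d.model.type.equals(exprType)) {
                errors.add(d.name + ": " + exprType + " is not assignable to " + d.model.type);
            }
        }
    }

    private String typeOf(String expr) {
        if (expr.matches("\\d+")) {
            return "Integer";
        }
        Model target = scope.get(expr);
        if (target == null) {
            errors.add("undefined reference: " + expr);
            return null;
        }
        if (target.type == null) {
            errors.add("type of " + expr + " is not known yet");
            return null;
        }
        return target.type;
    }

    static ThreePhaseSketch analyse(List<Decl> unit) {
        ThreePhaseSketch analysis = new ThreePhaseSketch();
        analysis.declarationPhase(unit);
        analysis.typePhase(unit);
        analysis.expressionPhase(unit);
        return analysis;
    }
}
```

Given a unit that declares an untyped `a` initialized from `b`, followed by an explicitly typed `Integer b`, the sketch infers `Integer` for `a` even though `b` comes second. What the sketch does with a reference to a declaration whose type is itself still to be inferred (it reports an error) is a simplification of this toy, not a statement about the language rules.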
Thus, the thing we call the compiler, which is found in the
compiler-java directory of the repository, is actually just half of the
compiler. This "compiler" actually calls the typechecker when
it needs the syntax tree for a compilation unit.
The compiler has two main responsibilities:
- to build a model from pre-compiled binary classes that are found in module archives, and
- to transform the Ceylon syntax tree that is produced by the
typechecker to a Java syntax tree that is understood by javac.
Finally, the compiler hands the Java syntax tree off to
javac to produce bytecode. We're essentially using
javac as the
world's most sophisticated bytecode library.
javac already supports incremental compilation, and so does
the Ceylon compiler.
Launcher and module runtime
The Ceylon module runtime (in the
runtime directory of the repository)
is based on JBoss Modules.
The Ceylon launcher simply starts
java and invokes the module
runtime. JBoss Modules bootstraps via a local repository, which
must contain the following dependencies:
- the Ceylon language module,
- the Ceylon module resolver,
- the Ceylon runtime, and
- JBoss Modules itself.
Finally, JBoss Modules is responsible for loading module archives as required according to the metadata contained in the module descriptors.
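The idea of loading modules "as required according to the metadata" can be sketched in a few lines of Java. This is not the JBoss Modules API. The `Descriptor` class, the `load` method and the `example.*` module names are all assumptions made for illustration. The sketch simply follows the imports listed in each descriptor, loads every module at most once, and loads dependencies before the modules that need them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical descriptor-driven loader; not the JBoss Modules API.
final class ModuleLoadingSketch {

    /** Stand-in for the metadata in a module descriptor. */
    static final class Descriptor {
        final String name;
        final List<String> imports;
        Descriptor(String name, String... imports) {
            this.name = name;
            this.imports = Arrays.asList(imports);
        }
    }

    private final Map<String, Descriptor> repository = new HashMap<>();
    private final Set<String> loaded = new LinkedHashSet<>(); // keeps load order
    private final Set<String> inProgress = new HashSet<>();   // guards against cycles

    void publish(Descriptor descriptor) {
        repository.put(descriptor.name, descriptor);
    }

    /** Loads a module and, first, everything its descriptor imports. */
    void load(String name) {
        if (loaded.contains(name)) {
            return; // each module is loaded at most once
        }
        Descriptor descriptor = repository.get(name);
        if (descriptor == null) {
            throw new IllegalArgumentException("module not found in repository: " + name);
        }
        if (!inProgress.add(name)) {
            throw new IllegalStateException("cyclic module dependency involving: " + name);
        }
        for (String dependency : descriptor.imports) {
            load(dependency);
        }
        inProgress.remove(name);
        loaded.add(name);
    }

    List<String> loadOrder() {
        return new ArrayList<>(loaded);
    }
}
```

For example, if a hypothetical `example.app` imports `example.util` and `ceylon.language`, and `example.util` imports `ceylon.language`, then loading `example.app` yields the order `ceylon.language`, `example.util`, `example.app`. Rejecting cycles outright is a choice of this sketch only.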
The documentation compiler (in the
compiler-java directory of the
repository, like the compiler) takes as its input the model produced
by the typechecker. Its job is to produce HTML documentation.
There is currently no support for alternate output formats.
The Ceylon IDE is a plugin for Eclipse, and may be found in the
ceylon-ide-eclipse repository. It is based on
IMP, which provides us with a lot of
the infrastructure that is common to programming language
editors on Eclipse.
The Eclipse plugin is also built on top of the typechecker. It works directly with the syntax tree and model, which means that anything the typechecker knows about the source code, the IDE also knows. This includes types, members of types, errors, etc. And, of course, the IDE does not need to contain its own parser.
The IDE maintains a central model which it updates as part of the incremental compilation process. It also has a "forked" version of this model for each open Ceylon source editor. Each time a change is made in a source editor, a new "fork" of the model is produced. When the editor is saved, the central model is updated.
Searching for declarations is extremely fast in the Ceylon IDE since it works against the central model, not against the text of the source files.
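The relationship between the central model, the per-editor forks, and model-based search can be pictured with the following Java sketch. It is only an illustration under simplifying assumptions: the model is reduced to a sorted map from declaration name to source file, and the names `IdeModelSketch`, `EditorFork`, `withDeclaration`, `save` and `search` are invented here rather than taken from the IDE's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical picture of a central model with per-editor forks.
final class IdeModelSketch {

    /** An immutable snapshot of the model as seen by one open editor. */
    static final class EditorFork {
        final Map<String, String> declarations; // declaration name -> source file

        EditorFork(Map<String, String> declarations) {
            this.declarations = Collections.unmodifiableMap(new TreeMap<>(declarations));
        }

        /** Every change in the editor produces a new fork; the old one is untouched. */
        EditorFork withDeclaration(String name, String sourceFile) {
            Map<String, String> copy = new TreeMap<>(declarations);
            copy.put(name, sourceFile);
            return new EditorFork(copy);
        }
    }

    private Map<String, String> central = new TreeMap<>();

    /** Opening an editor forks the central model. */
    EditorFork fork() {
        return new EditorFork(central);
    }

    /** Saving the editor updates the central model from the fork. */
    void save(EditorFork fork) {
        central = new TreeMap<>(fork.declarations);
    }

    /** Search runs against the central model, never against source text. */
    List<String> search(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String name : central.keySet()) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```

In this sketch, a declaration typed into an editor is visible in that editor's fork straight away, but `search` only finds it once the editor has been saved. Replacing the whole central map on save is a shortcut; merging forks from several editors is left out.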
The IDE does not directly use the compiler to perform its own work, but it does invoke the compiler as the last step of incremental compilation.
The language module is found in the
language directory of the repository.
The language module is special, because it contains types that
are used by the compiler to compile other code. Therefore, the
language module itself can't be compiled - there is a
chicken/egg problem where you would need to compile the language
module first, before you could compile the language module.
Furthermore, in order to achieve acceptable performance, the language module needs to take advantage of hand-written Java code. As a result, the language module exists in more than one version, including:
- an incomplete implementation in Ceylon,
- a complete implementation in Java.
Keeping the three versions in sync is a rather painful process!
The language module for the JVM also contains several annotations which are used by the Ceylon compiler at compile time to reverse engineer the model from precompiled Ceylon code in a module archive.
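To show why such annotations are useful, here is a hedged Java sketch. The annotation `CeylonTypeSketch` and the class `CompiledExample` are invented for this example and are not the annotations shipped in the language module. The sketch also reads the annotations through reflection purely for brevity, which is an assumption and not a claim about how the compiler inspects module archives. The point is only that a Java signature alone cannot express a Ceylon type such as a union, so the missing information has to travel alongside the compiled class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotation-based recovery of type information.
final class AnnotationModelSketch {

    /** Invented annotation recording the source-level type of a member. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface CeylonTypeSketch {
        String value();
    }

    /** Stands in for a precompiled class whose Java signatures lost type detail. */
    static final class CompiledExample {
        @CeylonTypeSketch("String|Integer")
        public Object idOrName() { return 42; }

        @CeylonTypeSketch("String")
        public String title() { return "untitled"; }
    }

    /** Rebuilds a tiny model (member name -> source-level type) from the annotations. */
    static Map<String, String> reverseEngineer(Class<?> compiled) {
        Map<String, String> model = new TreeMap<>();
        for (Method method : compiled.getDeclaredMethods()) {
            CeylonTypeSketch type = method.getAnnotation(CeylonTypeSketch.class);
            if (type != null) {
                model.put(method.getName(), type.value());
            }
        }
        return model;
    }
}
```

Calling `reverseEngineer(CompiledExample.class)` recovers the union type of `idOrName`, even though its Java return type is just `Object`.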
There is another "half-compiler" (in the
compiler-js directory of the repository)
that uses the typechecker's syntax tree to generate JavaScript code
that can run on node.js or inside a browser. The compiler itself is written in Java, and node.js
is used for testing. There are two kinds of tests: one checks the
correctness of the generated JavaScript code, and the other checks that the
language module implementation in JavaScript works as expected (it actually
runs all the tests from the ceylon.language project).
This project is what makes the Ceylon Web Runner possible.