The compiler is a key element in software development: it translates instructions written in a programming language (somewhat close to natural language) into something a computer can understand, namely machine language. Machine language is made up of ones and zeros, which are in turn an abstraction that encodes the different voltage levels of an electrical signal. Despite the compiler’s importance and omnipresence, doubts may arise regarding its nature. What language is used to write a compiler? How do you compile a compiler?

Can a C compiler be written in C? These three questions are so closely interrelated that they cannot be answered individually and, despite appearances, they are not a modern version of the chicken and the egg! Let’s see why.

What language is used to write a compiler?

To answer this, we need to look back to the beginnings of computer science. In 1952 Grace Hopper, one of the most influential contributors to computing, wrote the first compiler: the A-0 system. Hopper (who helped popularize the word “bug” through the famous Mark II moth anecdote) collected subroutines written over the years, stored them in machine language on a tape, and associated each one with a numeric code. The A-0 system could then translate symbolic mathematical code into machine language by searching the tape for the subroutines that corresponded to those numeric codes.

Although this corresponds more closely to the idea of a linker or loader, the A-0 system is considered to be the first ever compiler. Before such tools existed, translation was done manually: someone had to convert instructions written in a notation totally alien to a computer (such as mathematical symbols) into binary by hand.

As computing evolved and became more complex, instructions came to be written in assembly language, whose mnemonics an assembler maps directly to the machine language instructions that the processor executes.
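The direct mapping from assembly mnemonics to machine words can be sketched as follows. This is only a minimal illustration: the 8-bit instruction format, the four mnemonics and their opcodes are all invented for this example and do not correspond to any real processor.

```python
# Hypothetical 8-bit instruction set, invented for illustration only:
# the high 4 bits hold the opcode, the low 4 bits hold one operand (0-15).
OPCODES = {
    "LOAD": 0b0001,
    "ADD": 0b0010,
    "STORE": 0b0011,
    "HALT": 0b1111,
}


def assemble_line(line: str) -> int:
    """Translate one assembly line, e.g. 'ADD 3', into one machine word."""
    parts = line.split()
    mnemonic = parts[0].upper()
    operand = int(parts[1]) if len(parts) > 1 else 0
    if mnemonic not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    if not 0 <= operand <= 15:
        raise ValueError(f"operand out of range: {operand}")
    # Each mnemonic maps to exactly one opcode; no further analysis is needed.
    return (OPCODES[mnemonic] << 4) | operand


def assemble(source: str) -> list[int]:
    """Assemble a whole program, skipping blank lines."""
    return [assemble_line(line) for line in source.splitlines() if line.strip()]


if __name__ == "__main__":
    program = """
    LOAD 2
    ADD 3
    STORE 4
    HALT
    """
    for word in assemble(program):
        print(f"{word:08b}")  # show each machine word as ones and zeros
```

Because every line translates to exactly one word through a table lookup, an assembler is far simpler than a compiler, which is why it makes a practical first rung on the ladder.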

How do you compile a compiler? Can a C compiler be written in C?

A very simple compiler can be written in assembly language or machine code. Once you have a program that is able to translate something into binary instructions, you can use that original compiler to write a more sophisticated one (then use the second, more refined one to write a third, and so on). This iterative process of building a tool from a simpler version of itself is known as bootstrapping. The “something” could be instructions written in the same programming language the compiler itself compiles, which makes the compiler self-hosting. GCC, one of the most popular C compilers, is built using this technique.

That said, a great many languages with their own compilers are available today, which allows us to skip the first step of using assembly language and machine code.

The following figure shows a very simple example of bootstrapping. Suppose we invent a new language called T. To compile it, we first need to write a compiler in another language, C for example (Tcompiler_c.c). By using an existing C compiler to compile Tcompiler_c.c, we generate an executable, Tcompiler_c. Subsequently, we can write a new T compiler, this time in T itself (Tcompiler_t.t). As we already have a program capable of compiling it, Tcompiler_c, we can use it to obtain a new compiler built from source code written in its own language. This final step can be repeated as often as required to produce an ever more powerful version.

[Figure: bootstrapping a compiler for the language T]
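The same steps can be modelled in a few lines of Python. This is a sketch of the bookkeeping only and does not compile anything: the `Program` record, the `compile_with` helper, and the names `cc` and `Tcompiler_t` (for the existing C compiler and the final executable) are assumptions made for illustration. Only Tcompiler_c.c, Tcompiler_c and Tcompiler_t.t come from the example above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Program:
    name: str
    written_in: str                  # source language, or "machine" once compiled
    compiles: Optional[str] = None   # language accepted, if the program is a compiler


def compile_with(compiler: Program, source: Program, output_name: str) -> Program:
    """Model running an executable compiler on a source file."""
    if compiler.written_in != "machine":
        raise ValueError(f"{compiler.name} is source code, not an executable")
    if compiler.compiles != source.written_in:
        raise ValueError(f"{compiler.name} cannot compile {source.written_in} code")
    # The result is an executable that does whatever the source described.
    return Program(output_name, "machine", source.compiles)


def bootstrap() -> Program:
    """Follow the T example step by step and return the self-hosted compiler."""
    cc = Program("cc", "machine", compiles="C")  # hypothetical existing C compiler
    tcompiler_c_src = Program("Tcompiler_c.c", "C", compiles="T")
    tcompiler_t_src = Program("Tcompiler_t.t", "T", compiles="T")

    # Step 1: the existing C compiler builds the first T compiler.
    tcompiler_c = compile_with(cc, tcompiler_c_src, "Tcompiler_c")
    # Step 2: that compiler builds the T compiler written in T.
    tcompiler_t = compile_with(tcompiler_c, tcompiler_t_src, "Tcompiler_t")
    # Step 3: the new compiler rebuilds itself; this step can be repeated
    # every time Tcompiler_t.t is improved.
    return compile_with(tcompiler_t, tcompiler_t_src, "Tcompiler_t")


if __name__ == "__main__":
    print(bootstrap())
```

The check inside `compile_with` shows why the first step cannot be skipped: until some executable that accepts T exists, Tcompiler_t.t is just text that nothing can run.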

Compiler tools continue to evolve to this very day. The code executed on processors depends partly on them to achieve optimum performance and efficiency. Thus, the integrated use of compilers and a theoretical understanding of them, which lead to consistently higher-quality embedded software, are a fundamental part of the work at Teldat.

 


About the author

Javier Marchan
