Over the weekend, I came across a couple of videos on how the Go compiler was migrated from C to Go. The Go compiler was originally written in C, and once it reached a certain level of maturity, its creators looked at how to bootstrap the compiler in Go so that the compiler itself could benefit from the language and do things that would not have been possible had it stayed in C.
Here’s the first video, from GopherCon, where Russ Cox talks about his approach:
And here’s a session by Rob Pike the following year on how it went, with further details on the migration:
The approach is brilliant in theory and goes somewhat like this:
1. Parse the C code using a simple yacc parser
2. Generate parse tree. Tweak the tree to fix and re-write C-isms
3. Traverse the parse tree and output corresponding Go code
4. Compile Go code and validate output by comparing with C-based Go compiler
5. Repeat till both compilers generate the same output
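To make the tree-rewriting and code-emitting steps a bit more concrete, here is a minimal sketch in Go. It is not the converter the Go team actually used: the `Node` type and the `Rewrite` and `Emit` functions are hypothetical names made up for illustration, and the tree is built by hand instead of coming from a yacc parser. It only shows the general shape of the idea: fix up C-isms on the tree (here, `->` and `NULL`), then walk the tree and print Go.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical, heavily simplified parse-tree node for a C expression.
type Node struct {
	Op   string  // "name", "arrow", "dot" or "call"
	Name string  // identifier, or field name for "arrow"/"dot"
	Left *Node   // operand of "arrow"/"dot", callee of "call"
	Args []*Node // call arguments
}

// Rewrite fixes two C-isms in place: p->f becomes p.f, and NULL becomes nil.
func Rewrite(n *Node) {
	if n == nil {
		return
	}
	Rewrite(n.Left)
	for _, a := range n.Args {
		Rewrite(a)
	}
	switch {
	case n.Op == "arrow":
		n.Op = "dot"
	case n.Op == "name" && n.Name == "NULL":
		n.Name = "nil"
	}
}

// Emit walks the tree and returns source text for it.
func Emit(n *Node) string {
	switch n.Op {
	case "name":
		return n.Name
	case "arrow":
		return Emit(n.Left) + "->" + n.Name
	case "dot":
		return Emit(n.Left) + "." + n.Name
	case "call":
		args := make([]string, len(n.Args))
		for i, a := range n.Args {
			args[i] = Emit(a)
		}
		return Emit(n.Left) + "(" + strings.Join(args, ", ") + ")"
	}
	return "/* unhandled: " + n.Op + " */"
}

func main() {
	// Hand-built tree for the C expression: setline(n->left, NULL)
	tree := &Node{
		Op:   "call",
		Left: &Node{Op: "name", Name: "setline"},
		Args: []*Node{
			{Op: "arrow", Name: "left", Left: &Node{Op: "name", Name: "n"}},
			{Op: "name", Name: "NULL"},
		},
	}
	Rewrite(tree)
	fmt.Println(Emit(tree)) // prints: setline(n.left, nil)
}
```

A real converter has to deal with types, declarations, statements, comments and a long tail of special cases, which is where the hand-tweaking comes in. The validation loop then amounts to building the compiler from the generated Go and checking that it produces the same output as the C-based one.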
A caveat is that the C parser from step 1 is a highly specialized one, built around the specific C dialect the original authors used; it is not intended to be a general-purpose C-to-Go converter, which is a much bigger problem. Also, the process was not 100% automatic and some code still had to be hand-rolled, but the tool made what would otherwise have been a tedious conversion much easier.
Once the code is converted to Go, it can be refactored, profiled and restructured using the Go toolchain as the code base evolves.
And that’s how it’s done, folks.