Glitchy: A Minimal LLVM Compiler
Why I Built a Programming Language
As with most of my projects, the reason I started was simple: I realized I didn’t actually know what a “programming language” was.
I could write Python, Java, and C++ code just fine, but the whole pipeline behind it felt like black magic. People would say “compiler” and “language” like they were interchangeable, and I nodded along like a functioning adult. Under the hood, I had no intuition for what turns text into something a machine can run.
So I gave myself a goal that was intentionally small: if I could build something that demystified compilers for me, I’d call it a win. Glitchy is the result.
Designing Glitchy
I wanted Glitchy to feel like the smallest “real” language I could build without it turning into a toy. That meant:
- Variables, branching, loops, and functions.
- A syntax simple enough that I could focus on the compiler.
- A compiler pipeline that is readable, with logs that let you see your program move through stages.
The educational part mattered a lot. I didn’t want a compiler that worked only if you squinted at it. I wanted something a curious student (including future me) could actually trace. A million-line compiler is not something you can understand in an afternoon.
I also originally wanted Glitchy to be dynamically typed.
That desire did not survive for long.
What Glitchy Looks Like
Glitchy is intentionally familiar. Braces for blocks. C-style operators. No clever syntax tricks. I wanted the compiler to be interesting, not the punctuation. A small example:
set n = 5
set fact:int = 1
while (n > 1) {
    fact = fact * n
    n = n - 1
}
print("factorial = " + fact)
For the full syntax and more examples, see the README.
Why Python
Before getting technical, I should address the obvious question: why write a compiler in Python?
Not because it’s fast. It isn’t. And not because compilers are “supposed” to be written in Python. They usually aren’t.
I chose Python because speed was not the bottleneck I was trying to learn about. I wanted iteration speed. I wanted to explore ideas quickly, refactor aggressively, and spend my time understanding the pipeline instead of wrestling with memory management while I was still learning what an AST even looked like.
At the time, I also thought: “If my language is dynamically typed, implementing it in a dynamically typed language might make the design feel more natural.”
This was optimistic.
Still, I don’t regret it. Python is readable, and that mattered more to me than having a fast compiler. Glitchy was built to be understood, not to win benchmark competitions that nobody is running.
The Compiler Pipeline
Glitchy is a compiled language with a multi-stage pipeline. The basic flow looks like this:
- Lexing: turn raw source text into tokens.
- Parsing: a handwritten recursive descent parser that turns tokens into an AST.
- Semantic analysis: walk the AST, resolve symbols, enforce rules, and figure out types.
- Code generation: emit LLVM IR.
- Compilation and execution: turn the generated IR into something runnable.
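To make the first two stages concrete, here is a minimal sketch of a regex-based lexer and a recursive descent parser for arithmetic expressions. It is not Glitchy's actual code: the token names, the `lex` function, the `Parser` class, and the tuple-shaped AST are all assumptions made for illustration, and it covers only a small slice of the syntax (no strings, statements, or type annotations).

```python
import re

# Hypothetical token patterns; Glitchy's real lexer may differ.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()=<>{}]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Turn raw source text into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace is matched but not kept
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expression(self):  # expression := term (("+" | "-") term)*
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = ("binop", op, node, self.term())
        return node

    def term(self):  # term := atom (("*" | "/") atom)*
        node = self.atom()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = ("binop", op, node, self.atom())
        return node

    def atom(self):  # atom := NUMBER | IDENT | "(" expression ")"
        kind, text = self.advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if (kind, text) == ("OP", "("):
            node = self.expression()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")
```

Calling `Parser(lex("1 + 2 * n")).expression()` gives a tree in which the multiplication sits below the addition. Operator precedence comes from which rule calls which, with no precedence table involved.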
A major problem with my initial design
Here’s where my initial plan ran into a wall.
I wanted a dynamically typed language, but I also wanted to target LLVM IR.
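If you don’t see the problem yet: LLVM IR is statically typed. Every value has a type that is fixed at compile time, so the code generator has to know what each variable is before it can emit a single instruction.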
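A small sketch shows where this bites. Even a plain addition needs a type before it can be written down, because LLVM IR uses `add` for integers and `fadd` for floating-point values. The `emit_add` helper and the mapping from Glitchy type names to `i64` and `double` are assumptions for illustration, not the project's real code generator.

```python
# Hypothetical mapping from Glitchy type names to LLVM IR types and opcodes.
LLVM_TYPES = {"int": "i64", "float": "double"}
ADD_OPCODES = {"int": "add", "float": "fadd"}


def emit_add(result, lhs, rhs, glitchy_type):
    """Return one line of textual LLVM IR that adds two named values."""
    if glitchy_type not in LLVM_TYPES:
        # Without a known static type there is no instruction to emit.
        raise TypeError(f"cannot emit an add for type {glitchy_type!r}")
    opcode = ADD_OPCODES[glitchy_type]
    llvm_type = LLVM_TYPES[glitchy_type]
    return f"%{result} = {opcode} {llvm_type} %{lhs}, %{rhs}"


print(emit_add("sum", "a", "b", "int"))    # %sum = add i64 %a, %b
print(emit_add("sum", "a", "b", "float"))  # %sum = fadd double %a, %b
```

If the type of `a` is only known at runtime, there is no single line the compiler can print here.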
So I tried to make dynamic typing work anyway.
The standard answer is boxing: represent values with a tagged runtime structure (type tag + payload), and generate helper logic to check tags, unbox, operate, and re-box. Many languages do this. It’s normal.
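Here is roughly what that looks like, modeled in Python rather than in generated IR. The `Tag` enum, the `Boxed` class, and `box_add` are hypothetical names, and the promotion rule (int plus float gives float) is an assumption. The point is the shape of the work: check tags, unbox, operate, re-box.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    INT = 1
    FLOAT = 2
    STRING = 3


@dataclass(frozen=True)
class Boxed:
    """A runtime value: a type tag plus the raw payload."""
    tag: Tag
    payload: object


NUMERIC = {Tag.INT, Tag.FLOAT}


def box_add(left, right):
    """Add two boxed values by checking tags, unboxing, and re-boxing."""
    if left.tag is Tag.INT and right.tag is Tag.INT:
        return Boxed(Tag.INT, left.payload + right.payload)
    if left.tag in NUMERIC and right.tag in NUMERIC:
        # Mixed int/float: promote both sides to float (an assumed rule).
        return Boxed(Tag.FLOAT, float(left.payload) + float(right.payload))
    if left.tag is Tag.STRING and right.tag is Tag.STRING:
        return Boxed(Tag.STRING, left.payload + right.payload)
    raise TypeError(f"cannot add {left.tag.name} and {right.tag.name}")
```

Every operator in the language needs a helper like this, and in a compiler each branch has to be emitted as IR or provided by a runtime library instead of being borrowed from Python.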
It’s also very easy to underestimate how much engineering effort it is when you’re building your first compiler.
I tried. I really did. I spent weeks trying to make boxing clean and ergonomic, and I got deep enough into it that I started dreaming about it! At some point I had to admit what was happening: I wasn’t learning “compiler fundamentals” anymore, I was building a runtime system.
So I made a tradeoff.
Glitchy shifted to type inference with optional type annotations, which let me know types at compile time while still keeping the language lightweight to write. It preserved the feeling I wanted (you don’t always have to spell types everywhere), while keeping the backend sane.
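A minimal sketch of the idea, assuming a hypothetical AST made of nested tuples such as `("num", 5)` and `("binop", "*", left, right)`. The `infer` and `declare` functions and the typing rules are illustrative assumptions. In particular, requiring both operands to share a type is a simplification, and Glitchy's real rules (for instance how `+` treats a string and an int in the sample program) may well differ.

```python
COMPARISONS = {"<", ">", "<=", ">=", "==", "!="}


def infer(expr, env):
    """Return the static type name of an expression, given variable types in env."""
    kind = expr[0]
    if kind == "num":
        return "int" if isinstance(expr[1], int) else "float"
    if kind == "str":
        return "string"
    if kind == "var":
        if expr[1] not in env:
            raise NameError(f"undefined variable {expr[1]!r}")
        return env[expr[1]]
    if kind == "binop":
        _, op, left, right = expr
        left_type = infer(left, env)
        right_type = infer(right, env)
        if left_type != right_type:
            raise TypeError(f"{op!r} applied to {left_type} and {right_type}")
        return "bool" if op in COMPARISONS else left_type
    raise ValueError(f"unknown node kind {kind!r}")


def declare(name, expr, env, annotation=None):
    """Handle `set name = expr` or `set name:annotation = expr`."""
    inferred = infer(expr, env)
    if annotation is not None and annotation != inferred:
        raise TypeError(f"{name} is annotated {annotation} but the value is {inferred}")
    env[name] = inferred
    return inferred


env = {}
declare("n", ("num", 5), env)                        # set n = 5
declare("fact", ("num", 1), env, annotation="int")   # set fact:int = 1
print(env)  # {'n': 'int', 'fact': 'int'}
```

The annotation is optional because the initializer already determines the type. When an annotation is present, it acts as a check on what was inferred.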
That decision taught me something important: “I want this feature” is not the same as “I can afford this feature in version one.”
The Glitch Idea
Glitchy has one feature that is intentionally not normal: it can “glitch” your program.
The idea started as a joke and turned into the most educational part of the project.
When glitch mode is enabled, the compiler intentionally mutates parts of the program in controlled ways, then runs the result and turns debugging into a small game: something went wrong, your job is to figure out what changed and why the output is different.
It’s not random corruption. It’s structured fault injection designed to make you practice reading behavior and reasoning about code paths. In the process, it forces you to do the thing beginners avoid: actually inspect and understand what the program is doing.
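One way structured fault injection could work is to swap a single operator somewhere in the AST and remember where. The sketch below is a guess at the technique, not Glitchy's real mutation engine: the tuple-shaped AST, the `SWAPS` table, and the `glitch` function are all assumptions. A seeded random generator keeps each glitch reproducible.

```python
import random

# Hypothetical operator swaps; each one changes behavior but keeps the program valid.
SWAPS = {"+": "-", "-": "+", "*": "+", "<": "<=", ">": ">="}


def find_binops(node, path=()):
    """Yield the path to every binop node whose operator can be swapped."""
    if isinstance(node, tuple) and node and node[0] == "binop":
        if node[1] in SWAPS:
            yield path
        yield from find_binops(node[2], path + (2,))
        yield from find_binops(node[3], path + (3,))


def replace_at(node, path, fn):
    """Return a copy of the tree with fn applied to the node at path."""
    if not path:
        return fn(node)
    index = path[0]
    return node[:index] + (replace_at(node[index], path[1:], fn),) + node[index + 1:]


def glitch(ast, seed):
    """Swap one operator, chosen by a seeded RNG. Returns (new_ast, path or None)."""
    rng = random.Random(seed)
    candidates = list(find_binops(ast))
    if not candidates:
        return ast, None
    target = rng.choice(candidates)

    def swap(node):
        return ("binop", SWAPS[node[1]]) + node[2:]

    return replace_at(ast, target, swap), target


original = ("binop", "*", ("var", "fact"), ("var", "n"))  # fact * n
mutated, where = glitch(original, seed=0)
print(mutated)  # ('binop', '+', ('var', 'fact'), ('var', 'n'))
```

Because the mutation is recorded as a path into the tree, the tool can later reveal exactly what changed once the player gives up or guesses correctly.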
What I Learned
Glitchy taught me more than I expected, mostly because it forced me to get my hands dirty.
- A programming language is not syntax. It’s semantics plus tooling plus a pipeline that enforces rules.
- Parsing is only the beginning. The real “language” lives in analysis: scopes, symbols, types, and meaning.
- Backends like LLVM are powerful, but they demand clarity. If your frontend is vague, LLVM will punish you for it.
- Complexity compounds fast. Features that seem small (“just make it dynamically typed”) can secretly be entire subsystems.
Most importantly: I came out of it with intuition I didn’t have before. I know what a compiler is!
Special thanks to Robert Nystrom for his wonderful book on the topic, Crafting Interpreters.