This document describes the architecture and implementation of CCC
(Claude's C Compiler). For building, usage, and status information, see
README.md. Each src/ subdirectory also has its own
README.md with detailed per-module documentation.
- High-Level Pipeline
- Source Tree
- Compilation Pipeline (Data Flow)
- Key Design Decisions
- Design Philosophy
- Assembler and Linker Architecture
- Sub-Module Documentation
The compiler is a multi-phase pipeline. Each phase is a separate Rust module with a well-defined input/output interface. The entire flow -- from C source to ELF executable -- is handled internally with no external tools.
+---------------------------------------------------------------------+
| C Source Files (.c, .h) |
+----------------------------------+----------------------------------+
|
+----------------------------------v----------------------------------+
| FRONTEND (src/frontend/) |
| |
| +--------------+ +-------+ +--------+ +--------------+ |
| | Preprocessor |---»| Lexer |---»| Parser |---»| Sema | |
| | | | | | | | | |
| | macro expand,| |tokens | |spanned | | type check, | |
| | #include, | | with | | AST | | const eval, | |
| | #ifdef | | spans | | | | symbol table | |
| +--------------+ +-------+ +--------+ +------+-------+ |
+------------------------------------------------------------+-------+
|
AST + SemaResult (TypeContext,
expr types, const values)
|
+------------------------------------------------------------v-------+
| IR SUBSYSTEM (src/ir/) |
| |
| +------------------+ +------------------------------+ |
| | IR Lowering |--------»| mem2reg | |
| | | | | |
| | AST -> alloca- | | SSA promotion via dominator | |
| | based IR (every | | frontiers; insert phi nodes, | |
| | local is a stack | | rename values | |
| | slot) | | | |
| +------------------+ +--------------+---------------+ |
+----------------------------------------------+---------------------+
| SSA IR
+----------------------------------------------v---------------------+
| OPTIMIZATION PASSES (src/passes/) |
| |
| Phase 0: Inlining + post-inline cleanup |
| (inline -> mem2reg -> constant_fold -> copy_prop -> simplify |
| -> constant_fold -> copy_prop -> resolve_asm) |
| | |
| Main Loop (up to 3 iterations, dirty-tracked): |
| cfg_simplify -> copy_prop -> div_by_const -> narrow -> simplify |
| -> constant_fold -> gvn -> licm -> iv_strength_reduce |
| -> if_convert -> copy_prop -> dce -> cfg_simplify -> ipcp |
| | |
| Dead static elimination |
| | |
| Phi Elimination (SSA -> register copies) |
+----------------------------------+---------------------------------+
| non-SSA IR
+----------------------------------v---------------------------------+
| BACKEND (src/backend/) |
| |
| +---------------------------------------------------------+ |
| | Code Generation (ArchCodegen trait) | |
| | | |
| | +----------+ +----------+ +----------+ +----------+ | |
| | | x86-64 | | i686 | | AArch64 | | RISC-V64 | | |
| | | SysV ABI | | cdecl | | AAPCS64 | | LP64D | | |
| | +----+-----+ +----+-----+ +----+-----+ +----+-----+ | |
| +-------+-------------+------------+---------------+-------+ |
| | | | | |
| +-------v-------------v------------v---------------v-------+ |
| | Peephole Optimizer (per-arch) | |
| | store/load forwarding, dead stores, copy prop, branches | |
| +-------+-------------+------------+---------------+-------+ |
| | | | | |
| +-------v-------------v------------v---------------v-------+ |
| | Builtin Assembler (per-arch) | |
| | parse asm text -> encode instructions -> write ELF .o | |
| +-------+-------------+------------+---------------+-------+ |
| | | | | |
| +-------v-------------v------------v---------------v-------+ |
| | Builtin Linker (per-arch) | |
| | read .o + CRT + libs -> resolve symbols -> write ELF | |
| +-------+-------------+------------+---------------+-------+ |
+----------+-------------+------------+---------------+---------------+
| | | |
v v v v
ELF ELF ELF ELF
src/
frontend/ C source -> typed AST
preprocessor/ Macro expansion, #include, #ifdef, #pragma once
lexer/ Tokenization with source locations
parser/ Recursive descent, produces spanned AST
sema/ Type checking, symbol table, const evaluation
ir/ Target-independent SSA IR
lowering/ AST -> alloca-based IR
mem2reg/ SSA promotion (dominator tree, phi insertion)
passes/ SSA optimization passes
constant_fold Constant folding and propagation
copy_prop Copy propagation
dce Dead code elimination
gvn Global value numbering
licm Loop-invariant code motion
simplify Algebraic simplification
cfg_simplify CFG cleanup, branch threading
inline Function inlining (always_inline + small static)
if_convert Diamond if-conversion to select (cmov/csel)
narrow Integer narrowing (eliminate promotion overhead)
div_by_const Division strength reduction (mul+shift)
ipcp Interprocedural constant propagation
iv_strength_reduce Induction variable strength reduction
loop_analysis Shared natural loop detection (used by LICM, IVSR)
dead_statics Dead static function/global elimination
resolve_asm Post-inline asm symbol resolution
backend/ IR -> assembly -> machine code -> ELF
traits.rs ArchCodegen trait with shared default implementations
generation.rs IR instruction dispatch to trait methods
liveness.rs Live interval computation for register allocation
regalloc.rs Linear scan register allocator
state.rs Shared codegen state (stack slots, register cache)
stack_layout/ Stack frame layout with liveness-based slot packing
call_abi.rs Unified ABI classification (caller + callee)
cast.rs Shared cast and float operation classification
f128_softfloat.rs IEEE binary128 soft-float (ARM + RISC-V)
inline_asm.rs Shared inline assembly framework
common.rs Data sections, external tool fallback invocation
x86_common.rs Shared x86/i686 register names, condition codes
elf/ ELF constants, archive reading, shared types
elf_writer_common.rs Common ELF object file writing utilities
linker_common/ Shared linker types (symbols, dynamic linking, EH frame)
asm_preprocess.rs Assembly text preprocessing (macro expansion, conditionals)
asm_expr.rs Assembly expression evaluation
peephole_common.rs Shared peephole optimizer utilities (word matching, line store)
x86/
codegen/ x86-64 code generation (SysV AMD64 ABI) + peephole
assembler/ Builtin x86-64 assembler (parser, encoder, ELF writer)
linker/ Builtin x86-64 linker (dynamic linking, PLT/GOT, TLS)
i686/
codegen/ i686 code generation (cdecl, ILP32) + peephole
assembler/ Builtin i686 assembler (reuses x86 parser, 32-bit encoder)
linker/ Builtin i686 linker (32-bit ELF, R_386 relocations)
arm/
codegen/ AArch64 code generation (AAPCS64) + peephole
assembler/ Builtin AArch64 assembler (parser, encoder, ELF writer)
linker/ Builtin AArch64 linker (static + dynamic linking, IFUNC/TLS)
riscv/
codegen/ RISC-V 64 code generation (LP64D) + peephole
assembler/ Builtin RV64 assembler (parser, encoder, RV64C compress)
linker/ Builtin RV64 linker (dynamic linking)
common/ Shared types, symbol table, diagnostics
driver/ CLI parsing, pipeline orchestration
Each phase transforms the program into a progressively lower-level representation. The concrete Rust types flowing between phases are:
&str (C source text)
|
| Preprocessor::preprocess()
v
String (expanded text with line markers)
|
| Lexer::tokenize()
v
Vec<Token> (each Token = { kind: TokenKind, span: Span })
|
| Parser::parse()
v
TranslationUnit (AST: Vec<ExternalDecl> with source spans)
|
| SemanticAnalyzer::analyze()
v
TranslationUnit + SemaResult
| SemaResult bundles:
| - functions: FxHashMap<String, FunctionInfo>
| - type_context: TypeContext (struct layouts, typedefs, enums)
| - expr_types: FxHashMap<ExprId, CType>
| - const_values: FxHashMap<ExprId, IrConst>
|
| Lowerer::lower()
v
IrModule (alloca-based IR: every local is a stack slot)
|
| promote_allocas() (mem2reg)
v
IrModule (SSA form: phi nodes, virtual registers)
|
| run_passes() (up to 3 iterations with dirty tracking)
v
IrModule (optimized SSA)
|
| eliminate_phis()
v
IrModule (non-SSA: phi nodes lowered to register copies)
|
| Target::generate_assembly_with_opts_and_debug() (ArchCodegen dispatch)
v
String (target-specific assembly text)
|
| Builtin assembler (parse -> encode -> ELF .o)
v
ELF object file (.o)
|
| Builtin linker (resolve symbols -> apply relocs -> write ELF)
v
ELF executable
-
SSA IR: The IR uses SSA form with phi nodes, constructed via mem2reg over alloca-based lowering. This is the same approach as LLVM.
-
Trait-based backends: All four backends implement the
ArchCodegentrait (~185 methods). Shared logic (call ABI classification, inline asm framework, f128 soft-float) lives in default trait methods and shared modules. -
Linear scan register allocation: Loop-aware liveness analysis feeds a linear scan allocator (callee-saved + caller-saved) on all four backends. Register-allocated values bypass stack slots entirely.
-
Text-to-text preprocessor: The preprocessor operates on raw text, emitting GCC-style
# line "file"markers for source location tracking. Include guard detection avoids re-processing headers. -
Peephole optimization: Each backend has a post-codegen peephole optimizer that eliminates redundant patterns (store/load forwarding, dead stores, copy propagation) from the stack-based code generator. The x86 peephole is the most mature with 15 distinct pass functions.
-
Builtin assembler and linker: Each architecture has a native assembler (AT&T/ARM/RV syntax parser, instruction encoder, ELF object writer) and a native linker (symbol resolution, relocation application, ELF executable writer). No external toolchain is required.
-
Dual type system: CType represents C-level types (preserving
intvslongdistinctions for type checking), while IrType is a flat machine-level enumeration (I8..I128,U8..U128,F32,F64,F128,Ptr,Void). The lowering phase bridges between them.
-
Separation of concerns through representations. Each major phase works on its own representation: the frontend on text/tokens/AST, the IR subsystem on alloca-based IR, the optimizer on SSA IR, and the backend on non-SSA IR. Phase boundaries are explicit ownership transfers, not shared mutable state.
-
Alloca-then-promote for SSA construction. Rather than constructing SSA directly during AST lowering (which interleaves C semantics with SSA bookkeeping), the lowerer emits simple alloca/load/store sequences. The mem2reg pass then promotes these to SSA independently. This is the same strategy LLVM uses and cleanly separates the two concerns.
-
Trait-based backend abstraction. The
ArchCodegentrait (~185 methods) captures the interface between the shared code generation framework and architecture-specific instruction emission. Default implementations express algorithms once (e.g., the 7-phase call sequence inemit_call), while backends supply only the architecture-specific primitives. -
Zero external dependencies for compilation. The entire compilation pipeline -- from C source to ELF executable -- is self-contained. No lexer generators, parser generators, register allocator libraries, external assemblers, or external linkers are required. Every component is implemented from scratch using only general-purpose Rust crates.
The builtin assembler and linker are the default when compiling without
the gcc_assembler or gcc_linker Cargo features. The selection is done
at compile time via #[cfg(feature = "...")] -- there are no runtime
environment variables.
Each assembler follows a three-stage pipeline:
Assembly text (String)
|
| Parser
v
Vec<AsmStatement> (parsed instructions, directives, labels)
|
| Encoder
v
Vec<u8> (encoded machine code bytes + relocation entries)
|
| ELF Writer
v
ELF object file (.o)
| Architecture | Parser | Encoder | Extra Features |
|---|---|---|---|
| x86-64 | AT&T syntax, shared | REX prefixes, ModR/M, SIB | SSE/AES-NI encoding |
| i686 | Reuses x86 parser | No REX, 32-bit operands | ELFCLASS32, Elf32_Rel |
| AArch64 | ARM assembly syntax | Fixed 32-bit encoding | imm12 auto-shift |
| RISC-V | RV assembly syntax | Fixed 32-bit encoding | RV64C compression |
Each linker reads ELF object files and static archives, resolves symbols, applies relocations, and writes a complete ELF executable:
| Architecture | Link Mode | Key Relocations | Special Features |
|---|---|---|---|
| x86-64 | Dynamic | R_X86_64_64, PC32, PLT32, GOTPCREL | PLT/GOT, TLS |
| i686 | Dynamic | R_386_32, PC32, PLT32, GOTPC, GOTOFF | 32-bit ELF, .rel |
| AArch64 | Static + Dynamic | ADR_PREL_PG_HI21, ADD_ABS_LO12_NC, CALL26 | PLT/GOT, IFUNC/IPLT, TLS |
| RISC-V | Dynamic | HI20, LO12_I, LO12_S, CALL, PCREL_HI20 | TLS GD→LE relaxation |
When compiled with the gcc_assembler and/or gcc_linker Cargo features,
the compiler delegates to the GCC cross-compiler toolchain for the
corresponding stages. A warning is printed when the fallback is used. This
mode is useful for debugging: compile a binary with GCC features and compare
byte-level output against the standalone binary.
Each compiler subsystem has its own detailed design document:
| Module | README |
|---|---|
| Frontend (preprocessor, lexer, parser, sema) | src/frontend/README.md |
| IR subsystem (lowering, mem2reg) | src/ir/README.md |
| Optimization passes | src/passes/README.md |
| Backend (codegen, assembler, linker) | src/backend/README.md |
| Common (types, diagnostics, source) | src/common/README.md |
| Driver (CLI, pipeline) | src/driver/README.md |
Per-architecture backend documentation:
| Architecture | Overview | Code Generation | Assembler | Linker |
|---|---|---|---|---|
| x86-64 | x86/README.md |
x86/codegen/README.md |
x86/assembler/README.md |
x86/linker/README.md |
| i686 | i686/README.md |
i686/codegen/README.md |
i686/assembler/README.md |
i686/linker/README.md |
| AArch64 | arm/README.md |
arm/codegen/README.md |
arm/assembler/README.md |
arm/linker/README.md |
| RISC-V 64 | riscv/README.md |
riscv/codegen/README.md |
riscv/assembler/README.md |
riscv/linker/README.md |
The x86-64 peephole optimizer has its own detailed documentation: src/backend/x86/codegen/peephole/README.md.