Axion — A Coding Language for Agents

Write 2x fewer tokens. Run faster than C. Humanize on demand.

2x fewer tokens · 14% faster than gcc · 33KB binary · 35μs parse · MIT License


The Problem

Programming languages were designed for humans. We optimized for readability, expressiveness, and developer experience — because humans were the ones writing and reading code.

That era is ending.

Today, AI agents write most of the code. Humans review it, set direction, and define contracts — but the line-by-line implementation is increasingly machine-generated. Yet we still force agents to write in languages designed for human eyes: verbose syntax, redundant type annotations, descriptive variable names, docstrings that the agent itself doesn't need.

We're paying twice. Once at generation time — more tokens means more compute, more latency, more cost. Once at runtime — human-friendly languages like Python and TypeScript carry performance overhead by design. The agent is slow to write AND the output is slow to run.

A local SLM generating Python at 4 tokens/second needs ~3,200 tokens for a typical module. That's 13 minutes of waiting. For a 10-module project, that's over 2 hours just for the agent to type.

This isn't slow — it's unusable. Local agent coding with traditional languages is dead on arrival.

Axion cuts those 3,200 tokens to ~1,600. Same module, same functionality. 6 minutes instead of 13. 1 hour instead of 2. On cloud APIs: half the cost, every generation, forever. And the compiled binary runs faster than C — not slower like Python or TypeScript.
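The arithmetic behind those numbers is easy to check. A quick sanity check in plain Python, using only the figures quoted above:

```python
TOKENS_PER_SECOND = 4               # local SLM generation speed quoted above
PYTHON_TOKENS = 3_200               # typical module written in Python
AXION_TOKENS = PYTHON_TOKENS // 2   # Axion's ~2x token compression

def generation_minutes(tokens: int, rate: float = TOKENS_PER_SECOND) -> float:
    """Wall-clock minutes for an agent to emit `tokens` at `rate` tokens/s."""
    return tokens / rate / 60

print(f"Python module: {generation_minutes(PYTHON_TOKENS):.1f} min")  # ~13.3
print(f"Axion module:  {generation_minutes(AXION_TOKENS):.1f} min")   # ~6.7
```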

What if we split the artifact? The agent writes in a compact format optimized for generation. The compiler produces optimized native code. A deterministic "humanizer" reconstructs readable code when humans need it. No information lost.

Agent writes:    compact .ax     → compiler → native binary (faster than C)
                      ↓
              deterministic humanizer         (no AI needed)
                      ↓
Human reads:     readable code   + Mermaid diagrams + spec contracts

That's Axion.


At a Glance

What the agent writes (checkout.ax)

#S 0="checkout_sessions"
#D A=O.math.round D=O.http.err P=O.list.len
@start
F(s)->J{c=@get(a0);C(P(c["items"])==0)
{R D("Cart is empty")};tot=@total(a0);
sess={"user_id":a0,"items":c["items"],
"subtotal":tot["subtotal"],"status":
"pending"};@insert($0,sess)}

@calc_tax
F(f,s)->f{rates={"CA":0.0725,"NY":0.08,
"TX":0.0625};rate=E(rates[a1],0.05);
A(a0*rate,2)}

What the human reads (axion humanize)

def begin_checkout(user_id: str) -> dict:
    """Initialize checkout session"""
    c = get(user_id)
    if len(c['items']) == 0:
        return error('Cart is empty')
    tot = total(user_id)
    sess = {'user_id': user_id,
            'items': c['items'],
            'subtotal': tot['subtotal'],
            'status': 'pending'}
    return insert('checkout_sessions', sess)

def calculate_tax(subtotal: float,
                  state: str) -> float:
    """Calculate tax by state/region"""
    rates = {'CA': 0.0725, 'NY': 0.08,
             'TX': 0.0625}
    try:
        rate = rates[state]
    except KeyError:
        rate = 0.05
    return round(subtotal * rate, 2)

Three views of the same code. The agent writes the left. The human reads the right. A Mermaid diagram shows the architecture. All transformations are deterministic — no AI in the loop.

Can an LLM actually write this? Axion has zero training data — no model has ever seen .ax code. We gave Claude Opus 4.6 just the language skill file (112 lines of syntax rules). It wrote a full e-commerce system — 131 functions across 10 modules — on the first shot. Everything parsed, type-checked, and compiled to native binaries correctly.


The Results

| Metric | Axion | Python | TypeScript | Rust | C (gcc -O3) |
|---|---|---|---|---|---|
| Avg tokens (4 projects) | 6,479 | 12,748 | 10,630 | 14,547 | — |
| Token ratio vs Axion | 1.0x | 2.0x | 1.6x | 2.2x | — |
| Lines of code (avg) | 321 | 1,105 | 1,029 | 1,400 | — |
| fib(42) runtime | 388ms | ~37,000ms | 1,534ms | 481ms | 450ms |
| vs Axion baseline | — | 95x slower | 4x slower | 1.2x slower | 1.2x slower |
| Binary size | 33 KB | — | — | 433 KB | 33 KB |
| Memory (peak RSS) | 1,216 KB | 8,496 KB | ~50 MB | 1,552 KB | 1,248 KB |


The Theory: Adaptive Compression for Code

Axion applies information-theoretic principles to language design. A language is an encoding scheme — optimize it for the encoder (the LLM), not the decoder (the human).

Huffman-style per-file dictionaries

#D — Ontology dictionaries. The agent assigns single-character aliases to frequent API calls. O.http.ok (9 chars, ~3 tokens) becomes A (1 char, 1 token). Written once at the top, used throughout.

#S — String tables. Repeated literals become numbered references. "Product not found" (19 chars) becomes $0 (2 chars).

Both written in a single pass — the agent plans ahead, writes headers inline, compiler expands deterministically.
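Because both headers are pure text substitution, the expansion step can be sketched in a few lines of plain Python. This is a hypothetical toy, not the real compiler — it ignores edge cases such as alias characters occurring inside string literals:

```python
import re

def expand(ax_source: str) -> str:
    """Toy sketch of deterministic #D/#S expansion (not the real compiler)."""
    aliases: dict[str, str] = {}
    strings: dict[str, str] = {}
    body = []
    for line in ax_source.splitlines():
        if line.startswith("#D"):       # alias=target pairs
            aliases.update(re.findall(r"(\w)=(\S+)", line[2:]))
        elif line.startswith("#S"):     # slot="literal" pairs
            strings.update(re.findall(r'(\d+)="([^"]*)"', line[2:]))
        else:
            body.append(line)
    text = "\n".join(body)
    # String-table refs first: $0 -> "literal"
    text = re.sub(r"\$(\d+)", lambda m: f'"{strings[m.group(1)]}"', text)
    # Then dictionary aliases, only in call position: A( -> O.http.ok(
    for short, full in aliases.items():
        text = re.sub(rf"\b{short}\(", full + "(", text)
    return text

src = '''#D A=O.http.ok D=O.db.find G=O.http.err
#S 0="not found"
@get_user
F(i)->J{v=D("users",a0);v?A(v):G($0)}'''
print(expand(src))
```

Running it on the header example above prints the fully expanded body, which is exactly what the humanizer starts from — no AI in the loop.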

Why it works with LLMs

| Design choice | Token savings | How |
|---|---|---|
| Positional params (a0, a1) | ~15% | Spec file carries the real names |
| Single-char keywords (F, C, L) | ~10% | Instead of function, if, for |
| Ontology references (O.http.ok) | ~10% | Shared vocabulary, not inline code |
| #D dictionaries | +10.5% | Per-file Huffman for API calls |
| #S string tables | +6.4% | Per-file Huffman for literals |
| Combined | 2x fewer tokens | |

The separation of concerns

Agent writes:    compact .ax + .spec + .log    (optimized for generation)
                          ↓
Compiler:        deterministic expansion        (no AI needed)
                          ↓
Human reads:     readable Python-like code      (optimized for comprehension)
                 + Mermaid architecture diagrams
                 + spec contracts with pre/postconditions

How It Works

┌─────────────────────────────────────────────┐
│  SPEC (.spec) — Human-readable              │  Names, types, contracts, docs
├─────────────────────────────────────────────┤
│  IMPL (.ax) — Agent-optimized               │  Ultra-compact with #D and #S
├─────────────────────────────────────────────┤
│  ONTOLOGY (.ont) — Shared brain             │  144 standard operations
├─────────────────────────────────────────────┤
│  LOG (.log) — Decision record               │  Why, not what
└─────────────────────────────────────────────┘
| Layer | Who writes | Who reads | What it contains |
|---|---|---|---|
| Spec | Agent (once), human reviews | Human | Function names, param names, pre/postconditions, descriptions |
| Impl | Agent | Compiler, humanizer | Compact code with #D/#S dictionaries |
| Ontology | Shared | Agent + compiler | 144 standard ops: io, str, math, list, dict, http, db... |
| Log | Agent | Humanizer | Variable name mappings, design rationale |

Language Reference

Types
| Code | Type | Code | Type |
|---|---|---|---|
| s | string | J | dict / JSON |
| i | int / i64 | L | list |
| f | float / f64 | v | void |
| b | bool | eN | enum(N) |
All Constructs
| Construct | Syntax | Example |
|---|---|---|
| Function | F(types)->ret{body} | F(i,i)->i{a0+a1} |
| If/Else | C(cond){then}:{else} | C(a0>0){a0}:{0} |
| For loop | L(var,iter){body} | L(x,a0){O.io.print(x)} |
| While | W(cond){body} | W(v<10){v=v+1;} |
| Match | M(expr){pat:res,...} | M(a0){0:"zero",_:"other"} |
| Try/Catch | E(expr,fallback) | E(O.conv.atoi(a0),0) |
| Return | R expr | R O.http.err("fail") |
| Lambda | \N{body} | \1{a0*2} |
| Pipe | expr\|>fn | a0\|>O.list.sort |
| Import | I module | I store |
| Extern (FFI) | @extern name / X(types)->ret | @extern puts / X(s)->i |
| Inline C | __C("code") | __C("printf(\"hi\")") |
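As a concrete reading of the E(expr,fallback) row: its humanized form corresponds to an ordinary try/except with a default. A plain-Python sketch, where the built-in int() stands in for O.conv.atoi:

```python
def atoi_or_default(s: str, default: int = 0) -> int:
    """Python reading of E(O.conv.atoi(a0), 0): evaluate the
    expression; if it raises, return the fallback value."""
    try:
        return int(s)
    except ValueError:
        return default

print(atoi_or_default("42"), atoi_or_default("oops"))  # 42 0
```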
Per-File Compression (#D + #S)

#D — Ontology dictionaries:

#D A=O.http.ok D=O.db.find G=O.http.err
@get_user
F(i)->J{v=D("users",a0);v?A(v):G("not found")}

#S — String tables:

#S 0="Product not found" 1="checkout_sessions"
@get
F(s)->J{v=O.db.find($1,a0);v?O.http.ok(v):O.http.err($0)}

Combined savings: 16.9% on top of base syntax compression.


Performance: Why Axion Beats gcc -O3

Axion doesn't have a smarter optimizer than LLVM. It has more information.

The compiler sees the entire program as a single unit and encodes knowledge into C attributes:

| Annotation | Effect | Why gcc can't infer it |
|---|---|---|
| __attribute__((const)) | Pure function → CSE, hoisting | Needs cross-module analysis |
| __attribute__((pure)) | Read-only → redundant call elimination | Needs whole-program visibility |
| __builtin_expect | Branch prediction hints | gcc treats all branches equally |
| __attribute__((always_inline)) | Force-inline leaf functions | gcc uses conservative heuristics |
| __restrict__ | No pointer aliasing | Axion has no aliasing by design |
| __attribute__((nonnull)) | Null-check elimination | gcc can't prove safety in general |

All functions in one compilation unit → whole-program optimization for free. Real C/Rust projects compile files separately, losing cross-module optimization.
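To illustrate what such an annotation pass might emit, here is a hypothetical sketch in plain Python (not the actual axion codegen) that decorates a C prototype based on analysis flags:

```python
def annotate_c_prototype(name: str, ret: str, params: list[str],
                         *, pure: bool = False, leaf: bool = False) -> str:
    """Attach attributes from the table above to a C prototype.
    The pure/leaf flags would come from whole-program analysis."""
    attrs = []
    if pure:
        # Result depends only on arguments, no side effects -> CSE, hoisting
        attrs.append("__attribute__((const))")
    if leaf:
        # Small leaf function -> force inlining instead of heuristics
        attrs.append("__attribute__((always_inline)) inline")
    sig = f"{ret} {name}({', '.join(params) or 'void'});"
    return " ".join(attrs + [sig])

print(annotate_c_prototype("fib", "long", ["long n"], pure=True))
# __attribute__((const)) long fib(long n);
```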


CLI

# Compile & Run
axion run file.ax                       # interpret via Python
axion compile file.ax -o bin            # native binary (C → clang -O3)
axionc build file.ax -o bin             # Rust compiler (faster)

# Human-Readable
axion humanize file.ax --spec spec      # reverse-compile to readable code
axion visualize file.ax --html          # Mermaid architecture diagram

# Quality
axion check file.ax --spec spec         # type checker
axion validate file.ax --spec spec      # contract validation

# Search (#D-aware)
axion grep "O.http.ok" src/             # finds through aliases
axion refs validate src/                # find all callers
axion index src/                        # show alias map

It Works: Snake Game in One Shot

We asked the agent to build a terminal Snake game in Axion — with FFI for terminal control, non-blocking input, ANSI rendering, and a game loop with speed progression. One shot. No iteration.

Snake game written in Axion

34KB native binary. WASD to move, eat food, grow, speed increases. Game over on wall/self collision. The agent read the skill file, understood the FFI system, wrote the game, and it compiled and ran correctly on the first generation.


Compilation Backends

.ax source
    ├──→ Python        (interpreted, for development)
    ├──→ C → clang -O3 (fastest runtime, recommended)
    ├──→ LLVM IR       (no C compiler needed)
    └──→ x86-64 direct (zero dependencies, PoC)

Project Structure

axion/
├── compiler/          # Rust compiler (35µs parse, 6µs codegen)
├── axion/             # Python toolchain (humanize, visualize, search, type check)
├── stdlib/            # Standard ontology (144 ops) + C runtime header
├── editor/            # Tree-sitter grammar + syntax highlighting
├── examples/          # 7 projects: hello, todo, url-shortener, blog, ecommerce, chat, snake
├── comparisons/       # Same apps in Python / TypeScript / Rust
├── benchmarks/        # Reproducible benchmark suite + charts
└── skills/            # 5 focused agent skill files

For AI Agent Developers

Axion ships with 5 focused skills — each loads only when relevant:

| Skill | Trigger | Lines |
|---|---|---|
| axion-write | "build me X", "create" | 112 |
| axion-edit | "fix", "modify", "refactor" | 98 |
| axion-build | "compile", "run", FFI | 82 |
| axion-review | "check", "validate" | 85 |
| axion-read | "explain", "what does this do" | 108 |

No 1,400-line monolith in context. Each skill teaches exactly what the agent needs for that task.


Building from Source

# Python toolchain
pip install -e .

# Rust compiler
cd compiler && cargo build --release

Requires Python 3.11+ and (optionally) Rust for axionc.


Reproducing the Benchmarks

python3 benchmarks/token_compare.py     # detailed token comparison
python3 benchmarks/full_benchmark.py    # generate charts + summary
Environment
| Component | Version |
|---|---|
| Platform | macOS 15, Apple Silicon (ARM64) |
| C compiler | Apple clang 17.0.0 |
| Rust | rustc (stable) |
| Python | 3.9+ |
| LLVM (llvmlite) | 14.x |

The agent writes less. The program runs faster. The human sees more.

MIT License