-
Notifications
You must be signed in to change notification settings - Fork 2
Description
Formal Evaluation Semantics for Aiur
Goal
Define an Aiur.eval function in Lean — a pure, definitional interpreter for Aiur
Terms — so that we can state and prove theorems about the behavior of Aiur programs.
This is not meant to be efficient; it is a specification against which the bytecode
compiler and the Rust runtime can be verified.
Overview
The core idea: given a Toplevel (the set of all function/datatype declarations) and
a Function, evaluating that function on concrete Value arguments produces a
Result containing:
- An output
Value(the return value). - A
Prop-valued side condition accumulating allassertEqobligations.
This lets us state theorems like:
-- If `foo_fn` does `assert_eq!(x, 0); 2 * x`:
theorem foo_spec (top : Toplevel) (x : G) :
Aiur.eval top `foo_fn [.field x] = .ok (.field (2 * x)) (x = 0)
Core Definitions
Values
Value is the semantic domain — what Aiur terms evaluate to. It mirrors Data but
is fully reduced:
inductive Value where
| unit : Value
| field : G → Value
| tuple : Array Value → Value
| array : Array Value → Value
deriving Repr, BEq, InhabitedMemory
Store/load operations require a heap model. A Memory is a simple map from addresses
to values:
structure Memory where
cells : Array Value
deriving Inhabited
def Memory.store (mem : Memory) (v : Value) : G × Memory :=
let addr := G.ofNat mem.cells.size
(addr, { cells := mem.cells.push v })
def Memory.load (mem : Memory) (addr : G) : Option Value :=
mem.cells[addr.val.toNat]?Result and EvalState
Evaluation is partial (recursion may diverge, pattern matches may be non-exhaustive)
and stateful (memory, IO buffers). We track:
structure EvalState where
memory : Memory
ioBuffer : IOBuffer
fuel : Nat -- recursion bound
inductive Result where
| ok : Value → Prop → Result -- value + accumulated assertions
| fail : String → Result -- stuck / out of fuel / match failureThe Prop component in ok is the conjunction of all assertEq conditions
encountered during evaluation. When there are no assertions, this is True.
Environment
An Env maps local variables to values. Function lookup goes through the Toplevel:
abbrev Env := Std.HashMap Local Value
def lookupFunction (top : Toplevel) (name : Global) : Option Function :=
top.functions.find? fun f => f.name == nameThe Evaluator
partial def eval (top : Toplevel) (env : Env) (state : EvalState) :
Term → EvalState × ResultThe key cases:
| Term | Semantics |
|---|---|
.var x |
Look up x in env |
.data (.field g) |
.ok (.field g) True |
.data (.tuple ts) |
Evaluate each element, combine |
.data (.array ts) |
Evaluate each element, combine |
.ret t |
Evaluate t (marks escaping to caller) |
.let pat val body |
Evaluate val, match against pat to extend env, evaluate body |
.match t branches |
Evaluate t, try each (pat, body) in order |
.app name args |
Evaluate args, look up function, substitute into body, recurse (decrement fuel) |
.add a b |
Evaluate both to fields, return field addition (mod gSize) |
.sub a b |
Field subtraction |
.mul a b |
Field multiplication |
.eqZero a |
Evaluate a to .field g, return .field (if g == 0 then 1 else 0) |
.assertEq a b ret |
Evaluate a, b, evaluate ret, return ret's value with (a_val = b_val) ∧ ret_prop |
.store t |
Evaluate t, push to memory, return pointer field |
.load t |
Evaluate t to a field (address), look up in memory |
.proj t i |
Evaluate t to a tuple, project component i |
.get t i |
Evaluate t to an array, index at i |
.slice t i j |
Evaluate t to an array, extract [i, j) |
.set t i v |
Evaluate t to an array, evaluate v, replace at i |
For function calls, fuel is decremented. If fuel reaches 0, the result is
.fail "out of fuel". This makes eval total.
Pattern Matching
def matchPattern (pat : Pattern) (v : Value) : Option (List (Local × Value))This attempts to destructure v according to pat, returning bindings on success.
For Pattern.field g, it checks that v = .field g. For Pattern.var x, it always
succeeds with [(x, v)]. For constructors (Pattern.ref), it checks the tag and
recursively matches sub-patterns.
assertEq Accumulation
The crucial design point: assertEq does not fail evaluation. Instead, the equality
is accumulated as a Prop in the result:
| .assertEq a b ret => do
let (state, a_val, a_prop) ← evalValue top env state a
let (state, b_val, b_prop) ← evalValue top env state b
let (state, ret_val, ret_prop) ← eval top env state ret
let assertion := (a_val = b_val)
(state, .ok ret_val (a_prop ∧ b_prop ∧ assertion ∧ ret_prop))This mirrors how the circuit works: assertEq is a constraint, not a runtime check.
The evaluator produces the output regardless, and the proposition captures what must
hold for the execution to be valid.
Flattening to Array G
To connect eval to the existing test infrastructure (which works with Array G),
define:
def Value.flatten : Value → Array G
| .unit => #[]
| .field g => #[g]
| .tuple vs => vs.foldl (init := #[]) fun acc v => acc ++ v.flatten
| .array vs => vs.foldl (init := #[]) fun acc v => acc ++ v.flattenAnd a top-level entry point:
def Aiur.run (top : Toplevel) (funcName : Global) (args : Array G)
(fuel : Nat := 1000) : Option (Array G × Prop) :=
let func ← lookupFunction top funcName
let env := Env.ofList (func.inputs.zip (unflattenArgs func.inputs args))
let state := { memory := default, ioBuffer := default, fuel }
match eval top env state func.body with
| (_, .ok val prop) => some (val.flatten, prop)
| _ => noneStating Theorems
With this setup, we can write specifications like:
-- A function that asserts its input is zero and doubles it
-- fn foo(x: G) -> G { assert_eq!(x, 0); x + x }
theorem foo_correct (top : Toplevel) (x : G)
(h_func : lookupFunction top `foo = some foo_fn) :
Aiur.run top `foo #[x] = some (#[x + x], x = (0 : G)) := by
...
-- A pure function with no assertions
-- fn sum(x: G, y: G) -> G { x + y }
theorem sum_correct (top : Toplevel) (x y : G)
(h_func : lookupFunction top `sum = some sum_fn) :
Aiur.run top `sum #[x, y] = some (#[x + y], True) := by
...
-- Conditional: fn abs_or_zero(x: G) -> G { match eq_zero(x) { 1 => 0, _ => x } }
theorem abs_or_zero_spec (top : Toplevel) (x : G) ... :
Aiur.run top `abs_or_zero #[x] =
if x = 0 then some (#[(0 : G)], True) else some (#[x], True) := by
...Key Design Decisions
1. Prop-valued assertions, not runtime failure
assertEq accumulates a Prop rather than causing eval to fail. This is essential
because Aiur is a circuit language: the prover always produces an output, and the
constraints determine whether the output is valid. Separating the output from the
validity condition is what makes formal verification useful — you can reason about what
the circuit computes independently of whether the constraints are satisfied.
2. Fuel-based termination
Using a fuel : Nat parameter makes eval structurally recursive (total). Theorems
are stated with "sufficient fuel" hypotheses, or we prove that certain inputs require
bounded recursion.
3. Operating on Term (pre-simplification)
eval works on Term, the surface-level IR, not on TypedTerm or Bytecode. This
means it can be used to specify the semantics of Aiur programs as written. A separate
correctness theorem would relate eval on Term to execution of compiled Bytecode.
4. Memory as an array
The memory model is a simple append-only array. store appends and returns the index
as a G. load indexes into the array. This is adequate because Aiur memory is
append-only (no mutation of existing cells).
Verification Roadmap
Phase 1: Definitional interpreter (Aiur.eval)
- Define
Value,Memory,EvalState,Result,Env. - Implement
eval,matchPattern,Value.flatten. - Implement
Aiur.runas the top-level entry point.
Phase 2: Basic specifications
- State and prove specs for simple functions:
id,sum,prod. - State and prove specs involving
assertEq. - State and prove specs involving
matchon field values.
Phase 3: Data structure operations
- Specs for
proj,get,slice,set. - Specs for tuple/array construction and destructuring.
- Specs involving
store/load.
Phase 4: Recursion and datatypes
- Specs for recursive functions (
factorial,fibonacci) with fuel analysis. - Specs for enum construction and pattern matching (
Shape,Nat). - Mutual recursion (
even/odd).
Phase 5: Compiler correctness
- Define a relation between
evalonTermandBytecode.Toplevel.execute. - Prove that compilation preserves semantics: if
evalproduces(val, prop),
thenexecuteon the compiled bytecode producesval.flatten(assumingprop
holds).
Phase 6: Byte and u32 operations
- Specs for
u8_add,u8_xor,u8_shift_left,u8_bit_decomposition, etc. - Specs for
u32_less_than. - These require modeling the Goldilocks field arithmetic precisely.
Relation to CSLib
The approach is analogous to how CSLib provides
foundational definitions for formalizing computer science in Lean. Where CSLib defines
evaluation semantics for lambda calculus and other foundational languages, we define
evaluation semantics for Aiur — a domain-specific circuit language. The key parallel
is: define a clean denotational/operational semantics first, then prove properties
against it, then (optionally) prove that concrete implementations refine the semantics.