Observer is a deterministic, artifact-oriented verification platform.
It is built for teams whose verification has outgrown a single language runner, a pile of shell glue, and a purely local pass/fail loop.
Observer gives you one explicit model for:
- discovering verification targets through explicit providers
- lowering them into canonical inventory
- expressing expectations in suites
- running them deterministically
- emitting structured reports and derived analytics artifacts
If your current setup feels too magical, too fragile, too hard to compare across builds, or too weak to serve as an operational contract, Observer is aimed directly at that problem.
```mermaid
flowchart LR
  A[Language-native tests or workflow cases] --> B[Explicit provider or filesystem discovery]
  B --> C[Canonical inventory or case set]
  C --> D[Observer suite\nsimple or full]
  D --> E[Deterministic run records\nJSONL report]
  E --> F[Cube / Compare / Compare Index]
  E --> G[Console UX]
  F --> H[Self-contained HTML explorers]
```
Most test tooling is optimized for one runtime, one local feedback loop, and one pass/fail moment.
Observer is for the cases where that is no longer enough.
It is designed for projects that need:
- deterministic execution and canonical artifacts
- explicit provider boundaries instead of implicit discovery conventions
- workflow verification and product certification, not just unit-style test execution
- machine-readable reports that can be analyzed and compared later
- one platform that can span multiple ecosystems cleanly
- one maintained verification topology that can answer product questions directly
Observer behaves more like a build artifact pipeline and product certification layer than a bag of conventions.
Observer is a strong fit if your project has one or more of these problems:
- verification spans more than one language or runtime
- your CI outputs need to be reproducible and mechanically comparable
- shell glue and ad hoc harness code are becoming the real testing framework
- you need to verify workflows, artifacts, or staged pipelines, not just function calls
- you want machine-readable run artifacts that can feed later analysis
Observer is probably not the right first tool if all you need is:
- a lightweight unit test runner for one language
- a purely local red-green loop with no artifact discipline
- no need for canonical inventory, derived reports, or cross-build comparison
This is the shape of a real local flow using the runnable Rust starter already in the repository:
```shell
cd lib/rust/starter
make list
make inventory
cat tests.inv
make run
make verify
```
What that gives you, in order:
- raw provider host discovery
- derived canonical inventory
- the exact public execution contract Observer will run against
- a real suite execution with the human console
- hash and JSONL verification against checked-in expected artifacts
That is the product in miniature.
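Conceptually, the final `make verify` step is a golden check: hash the freshly produced report and compare it against a checked-in expectation. A minimal Python sketch of that idea, where the record text and the use of SHA-256 over raw bytes are illustrative assumptions rather than Observer's exact verification contract:

```python
import hashlib


def report_digest(report_text: str) -> str:
    """Hash a report byte-for-byte; deterministic runs make this stable."""
    return hashlib.sha256(report_text.encode("utf-8")).hexdigest()


def verify(report_text: str, expected_digest: str) -> bool:
    """Golden check: the run verifies iff its digest matches the expectation."""
    return report_digest(report_text) == expected_digest


# Illustrative report line; in practice the golden digest is a checked-in file.
report = '{"k":"case","id":"ledger/applies-ordered-postings","status":"pass"}\n'
golden = report_digest(report)

assert verify(report, golden)
assert not verify(report + "tampered", golden)
```

The point of the pattern is that determinism makes byte-level comparison meaningful: if the run cannot drift, neither can its digest.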
This is the kind of terminal loop Observer is designed to make normal:
```
$ observer run --inventory tests.inv --suite tests.obs --ui compact --report jsonl > report.jsonl
PASS ledger/applies-ordered-postings
PASS ledger/rejects-overdraft
FAIL format/renders-balance-line
Summary 2 pass 1 fail exit 1
Failed:
  format/renders-balance-line

$ observer cube --report report.jsonl --out build-1234.cube.json
{"k":"observer_cube_result","v":"0","out":"build-1234.cube.json","status":"ok"}

$ observer view --cube build-1234.cube.json --out build-1234.html
{"k":"observer_view_result","v":"0","out":"build-1234.html","view_kind":"cube","status":"ok"}
```
The point is not just that a run passed or failed.
The point is that the run became a stable artifact you can inspect, compare, publish, revisit later, and use as part of a larger product verdict.
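One payoff of a JSONL report is that downstream tooling can fold it mechanically. A small Python sketch, assuming a hypothetical record shape with `id` and `status` fields; Observer's actual report schema may differ:

```python
import json


def summarize(jsonl_text: str) -> dict:
    """Fold per-case records into a pass/fail summary plus a failed-case list."""
    passed, failed = 0, []
    for line in jsonl_text.splitlines():
        rec = json.loads(line)
        if rec["status"] == "pass":
            passed += 1
        else:
            failed.append(rec["id"])
    return {"pass": passed, "fail": len(failed), "failed": failed}


report = "\n".join([
    '{"id": "ledger/applies-ordered-postings", "status": "pass"}',
    '{"id": "ledger/rejects-overdraft", "status": "pass"}',
    '{"id": "format/renders-balance-line", "status": "fail"}',
])
print(summarize(report))
# {'pass': 2, 'fail': 1, 'failed': ['format/renders-balance-line']}
```

Because the report is a stable artifact rather than console scrollback, this kind of fold can run in CI, in a dashboard, or months later against an archived build.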
Observer is built around deterministic ordering, canonical normalization, and stable derivation.
That means you can:
- trust outputs in CI
- regenerate goldens mechanically
- diff one build against another
- explain what changed without hand-waving
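The general pattern behind canonical normalization and stable derivation can be sketched in a few lines of Python. This shows the technique only, not Observer's exact normalization rules:

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so the same logical
    document always yields the same bytes, regardless of insertion order."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def stable_hash(obj) -> str:
    """Hash of the canonical form: equal content implies equal digest."""
    return hashlib.sha256(canonical_bytes(obj)).hexdigest()


# Same logical content, different key order: the digests still agree.
a = {"suite": "tests.obs", "cases": ["x", "y"]}
b = {"cases": ["x", "y"], "suite": "tests.obs"}
assert stable_hash(a) == stable_hash(b)
```

Once every derived artifact passes through a canonical form like this, regenerating goldens and diffing builds become mechanical operations instead of judgment calls.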
Observer does not blur discovery and execution together.
Discovered targets are first lowered into canonical inventory.
That inventory becomes the explicit contract that suites run against.
Observer supports both:
- a simple suite surface for routine expectations
- a full suite surface for richer verification flows involving workflows, artifacts, extraction, branching, and publication
Both lower to one semantic core.
Observer emits machine-readable reports and can derive:
- telemetry summaries
- build cubes
- pairwise compares
- compare-index artifacts across build sets
- self-contained HTML explorer views
That makes post-run analysis a first-class part of the product instead of an afterthought.
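A pairwise compare boils down to a mechanical diff over per-case outcomes. A Python sketch, under the assumption that a build cube can be reduced to a case-to-status map; the shape is hypothetical, not Observer's cube schema:

```python
def compare(prev: dict, curr: dict) -> dict:
    """Mechanical diff of two per-case status maps: what regressed, what was
    fixed, and what appeared or disappeared between two builds."""
    out = {"regressed": [], "fixed": [], "added": [], "removed": []}
    for case in sorted(set(prev) | set(curr)):
        if case not in prev:
            out["added"].append(case)
        elif case not in curr:
            out["removed"].append(case)
        elif prev[case] == "pass" and curr[case] == "fail":
            out["regressed"].append(case)
        elif prev[case] == "fail" and curr[case] == "pass":
            out["fixed"].append(case)
    return out


build_1234 = {"ledger/a": "pass", "format/b": "fail"}
build_1235 = {"ledger/a": "pass", "format/b": "pass", "ledger/c": "fail"}
print(compare(build_1234, build_1235))
# {'regressed': [], 'fixed': ['format/b'], 'added': ['ledger/c'], 'removed': []}
```

Sorting the case union is what keeps the diff itself deterministic, so a compare artifact can be hashed and golden-checked like any other output.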
Observer now lives above individual suite runs too.
It can declare:
- what certifies a product
- which ordered stages define release health
- which artifacts and reports come out of those stages
- what exact contract was satisfied when the product passed
That is a different category of value from simply "we ran some tests".
Observer is not tied to one language runtime or one authoring surface.
The repo currently includes real onboarding paths for:
- shell-oriented workflow verification
- C providers
- Go providers
- .NET providers
- Rust providers
- TypeScript provider authoring
- Python provider authoring
The language-specific APIs are allowed to feel native. The platform contract stays explicit and deterministic.
Observer can run tests, but that is the least interesting thing about it.
JUnit-class tools answer a narrower question:
- did these tests pass in this ecosystem
Observer answers broader product questions:
- what is the explicit execution contract
- what certifies this product
- which staged proofs define release health
- what artifacts were emitted
- how do two runs compare mechanically
If you position Observer as a generic test runner, it sounds interchangeable.
If you position it as a verification platform with canonical contracts, product certification, and derived analytics, its actual teeth become visible.
The observer CLI already ships real operator tooling for:
- `derive-inventory`
- `hash-inventory`
- `hash-suite`
- `report-header`
- `hash-product`
- `certify`
- `run`
- `summarize-telemetry`
- `cube`
- `cube-product`
- `compare`
- `view`
- `doctor`
- `completion`
- `manpage`
It also includes:
- serious built-in help and runnable examples
- human-oriented console modes plus machine-oriented report output
- version plus build stamping
- licensing output
- shell completions
- manpage generation
```shell
observer derive-inventory --config observer.toml --provider rust > tests.inv
observer run --inventory tests.inv --suite tests.obs --surface simple --analytics --report jsonl > build-1234.report.jsonl
observer cube --report build-1234.report.jsonl --out build-1234.cube.json
observer compare --cube build-1234.cube.json --cube build-1235.cube.json --out compare.json
observer view --compare compare.json --out compare.html
```
That flow is a good summary of the product thesis:
- explicit provider boundary
- canonical execution contract
- deterministic run records
- derived analysis artifacts
- local, inspectable outputs
Observer now has a first-class product layer above individual suites.
This is the new part of the system.
It exists for products that are only considered ready when several heterogeneous verification areas pass together, such as:
- unit suites plus workflow corpus suites
- producer and consumer compatibility suites
- server, client, and contract suites
- compiler unit, golden, and pipeline suites
Instead of encoding that rule in shell glue or CI YAML, you can now declare one product definition that names the stages, their working directories, and the certification rule.
A product definition is a canonical JSON file, typically `product.json`, that declares:
- one stable `product_id`
- one certification rule such as `all_pass`
- an ordered list of certification stages
- one runner contract per stage

In v0, each stage is an `observer_suite` runner. That means the product layer reuses normal Observer suites as the stage-level verification mechanism.
Typical shape:
```json
{
  "k": "observer_product",
  "v": "0",
  "product_id": "demo",
  "certification_rule": "all_pass",
  "stages": [
    {
      "stage_id": "unit",
      "runner": {
        "k": "observer_suite",
        "cwd": "unit",
        "suite": "tests.obs",
        "inventory": "tests.inv",
        "surface": "simple",
        "mode": "default"
      }
    },
    {
      "stage_id": "workflow",
      "runner": {
        "k": "observer_suite",
        "cwd": "workflow",
        "suite": "tests.obs",
        "surface": "full",
        "mode": "default"
      }
    }
  ]
}
```

The product layer adds three important CLI commands.
`hash-product`
- parses a product definition
- normalizes it canonically
- emits one stable `product_sha256`

`certify`
- executes the declared stages in source order
- changes into each stage's declared `cwd`
- runs the stage suite using the stage's own suite, inventory, config, surface, and mode
- writes each child suite report under that stage's local `.observer/product/` directory
- emits one canonical product report on stdout
- returns one final exit code for the product verdict

`cube-product`
- reads a product report
- resolves the child suite reports recorded by `certify`
- derives one build cube per stage
- derives one compare-index across those stage cubes
- lets existing `view` flows render the product analytics outputs directly
`certify` produces two layers of evidence:
- a product report on stdout describing the product header, stage outcomes, and final summary
- one child suite report per stage, written locally under that stage's `.observer/product/` directory
That split is deliberate.
The product report explains the product-level verdict.
The child reports preserve the normal suite-level evidence for each certification stage.
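The product-level verdict itself is a small function of the stage outcomes. A Python sketch of the v0 `all_pass` rule, with illustrative stage names and verdict strings rather than Observer's exact report records:

```python
def certify_verdict(rule: str, stage_outcomes: dict) -> str:
    """Apply a certification rule over per-stage verdicts.
    Under 'all_pass', the product passes only if every stage passed."""
    if rule == "all_pass":
        ok = all(v == "pass" for v in stage_outcomes.values())
        return "pass" if ok else "fail"
    raise ValueError(f"unknown certification rule: {rule}")


print(certify_verdict("all_pass", {"unit": "pass", "workflow": "pass"}))  # pass
print(certify_verdict("all_pass", {"unit": "pass", "workflow": "fail"}))  # fail
```

Keeping the rule this explicit is the point: the product verdict is computed from named stage outcomes, not implied by whichever CI job happened to run last.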
```shell
observer hash-product --product product.json
observer certify --product product.json > product.default.jsonl
observer cube-product --report product.default.jsonl --root . --out analytics/product
observer view --compare-index analytics/product/product.compare-index.json --out product.html
```
This is the new top-level workflow when one product is certified by multiple Observer suites together.
Observer now uses that same product layer on itself, through the repo-owned `product.json` contract and the stage tree under `tests`.
The implementation-level contract for the product layer lives in its own spec, which defines the canonical product JSON shape, normalization and hashing semantics, product report records, and the initial CLI surface.
If you want to get hands-on quickly, start here:
- product.json plus tests for the repo-owned Observer self-certification flow
- examples/README.md for the structural and operational manual that explains how examples hand off into a real product-owned verification tree
- lib/shell/starter-pipeline for staged artifact workflows without writing a provider library
- examples/product-certify for the new top-level product-certification flow over a unit stage plus a workflow stage
- lib/c/starter for a standalone C provider host
- lib/go/starter for a standalone Go provider host
- lib/java/starter for a standalone Java provider host
- lib/java/starter-embedded if your Java application already owns its CLI and you want `myapp observe ...`
- examples/java-consumer-maven for a normal Maven-shaped Java consumer project
- examples/java-consumer-gradle for a normal Gradle-shaped Java consumer project using the optional JUnit 5 bridge
- lib/go/starter-embedded if your Go application already owns its CLI and you want `myapp observe ...`
- lib/go/starter-embedded-failure if you want the Go embedded path plus an intentional failing example
- lib/dotnet/starter for a standalone .NET provider host
- lib/dotnet/starter-embedded if your .NET application already owns its CLI and you want `myapp observe ...`
- lib/python/starter for a standalone Python provider host
- lib/python/starter-embedded if your Python application already owns its CLI and you want `myapp observe ...`
- lib/rust/starter for a standalone Rust provider host
- lib/rust/starter-embedded if your Rust application already owns its CLI and you want `myapp observe ...`
- lib/rust/starter-embedded-failure if you want the embedded path plus an intentional failing example
Practical manuals live in:
- lib/shell/HOWTO.md and lib/shell/README.md
- lib/c/HOWTO.md and lib/c/README.md
- lib/go/HOWTO.md and lib/go/README.md
- lib/java/HOWTO.md and lib/java/README.md
- lib/java-junit5/README.md for the optional Java JUnit 5 bridge
- lib/dotnet/HOWTO.md and lib/dotnet/README.md
- lib/python/HOWTO.md and lib/python/README.md
- lib/rust/HOWTO.md and lib/rust/README.md
Use Observer if you want:
- a verification platform with explicit contracts instead of magical discovery assumptions
- one model that covers tests, workflows, artifacts, and analysis together
- deterministic and canonical artifacts you can hash, diff, and trust
- a provider model that lets language-native authoring surfaces plug into one core platform
- local-first analytics and comparison flows without backend infrastructure
- a CLI that serves both automation and humans cleanly
Observer is a particularly good fit for:
- language tooling and compiler projects
- staged build and artifact pipelines
- teams that care about goldens, reproducibility, and determinism
- polyglot environments where one language-specific runner is not enough
- projects that need both execution and post-run analysis
A normal test runner is often enough if all you want is:
- one language
- one runtime
- one local pass/fail loop
- no canonical inventory layer
- no workflow artifact verification
- no structured post-run analytics
Observer is for the cases where that simplicity stops being enough.
It is for teams that need a stronger execution contract, stronger determinism, and better artifact discipline.
Further reading:
- FEATURES.md for the product-level feature pitch
- OBSERVER.md for the platform definition and spec map
- specs/00-architecture.md for the architectural core
- specs/13-provider-authoring.md for the provider authoring model
- specs/30-suite.md for the suite model
- specs/40-reporting.md for reporting semantics
- specs/50-workflow-verification.md for workflow verification
Observer uses an explicit split licensing model:
- core platform and repository materials: `GPL-3.0-or-later`
- files under `lib/*`: `MIT`

See LICENSING.md for the exact boundary and lib/LICENSE for the MIT text used by the library subtree.
This repository already contains a working reference implementation, a serious CLI, runnable starters, provider libraries, published sample artifacts, and conformance coverage.
It is not a vague concept repo.
It is the beginning of a real verification platform.