poly-bench

A high-performance multi-language benchmarking framework with a custom DSL and full LSP support.

Overview

poly-bench lets you define benchmarks once and run them across multiple languages (Go, TypeScript, Rust) with unified output and comparison. It's designed for:

Cross-language library comparisons — Compare your Go library against its TypeScript equivalent
Performance regression testing — Track performance across languages over time
Fair benchmarking — Same data, same iterations, unified measurement

Architecture

poly-bench is a modular Rust workspace. A .bench file is parsed, validated, lowered to IR, compiled/generated for each language target, executed in isolated runtime environments, and then reported in multiple formats.

End-to-End Flow

flowchart LR
  A[".bench source"] --> B["poly-bench-dsl\nparse + validate"]
  B --> C["poly-bench-ir\nlowering + normalization"]
  C --> D["poly-bench-runtime\ncodegen / compile prep"]
  D --> E["poly-bench-executor\nrun benchmarks"]
  E --> F["poly-bench-reporter\nconsole / markdown / json"]

Workspace Components

The repository is split by responsibility so each stage can evolve independently:

Crate	Responsibility
`poly-bench-dsl`	DSL lexer/parser, AST, validation, and formatting for `.bench` files
`poly-bench-ir`	Intermediate representation used by downstream compile/run stages
`poly-bench-runtime`	Runtime-specific code generation and execution adapters
`poly-bench-executor`	Benchmark orchestration, iteration flow, measurement collection
`poly-bench-reporter`	Rendering benchmark results (console, markdown, json)
`poly-bench-project`	Project initialization, manifest/dependency workflows
`poly-bench-stdlib`	Built-in DSL modules (`std::anvil`, `std::charting`, `std::constants`)
`poly-bench-syntax` + `tree-sitter-polybench`	Error-tolerant syntax layer and grammar tooling
`poly-bench-lsp-v2`	LSP implementation powering editor diagnostics, formatting, and completions

Execution Model

Compile-like pipeline: The DSL is normalized before execution so runtimes consume a consistent, validated IR.
Isolated language runtimes: Go/TypeScript/Rust execution is separated to reduce cross-language environment leakage.
Comparable runs: Shared fixtures and suite-level configuration keep measurements aligned across languages.
Output decoupling: Report formatting is separated from execution, so CI automation can consume machine-readable output while humans get rich console summaries.

CLI and Runtime Architecture

The poly-bench CLI is an orchestrator over workspace crates rather than a monolith. High-level command groups map cleanly to architectural layers:

Authoring/validation: check, fmt, compile, codegen
Execution: run
Project lifecycle: init, new, add, remove, install, build
Tooling/runtime state: cache stats|clear|clean, lsp

This keeps the CLI surface area broad while retaining narrow internal module boundaries.

LSP Architecture

The LSP uses the syntax + DSL layers for robust editor behavior, including partial/error-tolerant parsing, semantic diagnostics, and formatting of .bench files. It is exposed through poly-bench lsp and consumed by the VS Code/Cursor extension in extensions/vscode.

For deeper implementation detail, see the docs architecture page: docs/src/content/docs/core/architecture.mdx.

Quick Start

# Install
curl -sSL https://polybench.evm-tooling.tools | bash

# Initialize a project
poly-bench init my-benchmarks

# Run benchmarks
poly-bench run

Language Features

Suite Structure

use std::charting

suite hash {
    description: "Hash function benchmarks"
    iterations: 5000
    warmup: 1000
    baseline: "go"
    
    setup go {
        import "crypto/sha256"
        
        helpers {
            func hash(data []byte) []byte {
                h := sha256.Sum256(data)
                return h[:]
            }
        }
    }
    
    setup ts {
        import { keccak256 } from 'viem'
    }
    
    fixture data {
        hex: "68656c6c6f20776f726c64"  # "hello world"
    }
    
    bench sha256 {
        go:  hash(data)
        ts:  keccak256(data_hex)
    }
}

DSL Keywords

Keyword	Description
`suite`	Top-level container for benchmarks
`setup <lang>`	Language-specific initialization (imports, helpers, init code)
`fixture`	Shared test data with portable hex encoding
`bench`	Individual benchmark definition
`globalSetup`	File-level initialization (e.g., spawning Anvil)
`use std::*`	Import standard library modules

Suite Configuration

declare suite example performance timeBased sameDataset: true {
    description: "Suite description"
    // [!code focus:4]
    iterations: 5000           # Used when runMode is iterationBased
    warmup: 1000               # Warmup iterations before timing
    baseline: "go"             # Baseline language for comparisons
    targetTime: 3000ms         # Used when runMode is timeBased
    count: 3                   # Runs per benchmark for statistics
    memory: true               # Enable memory allocation profiling
    outlierDetection: true     # IQR-based outlier removal
    cvThreshold: 5.0           # Coefficient of variation target (%)
    timeout: 30000             # Suite timeout (ms)
    sink: true                 # Prevent dead code elimination
}

mode is deprecated and rejected. Configure execution semantics in the declare suite header.

Validation & Lint Rules

The DSL validator and LSP enforce semantic rules for fair comparisons:

Rule	Severity	Description
`same-dataset-inconsistent-fixtures`	Warning	When `sameDataset: true`, benchmarks should use the same fixture set; differing fixture references may indicate unfair comparison
`chart-requires-multiple-benchmarks`	Error	`drawLineChart` / `drawBarChart` require at least 2 benchmarks for meaningful trend/comparison
`baseline-missing-in-benchmark`	Error	When `baseline` is configured, every benchmark must implement the baseline language

See the DSL Reference for valid and invalid examples.

Setup Blocks

Setup blocks contain language-specific initialization with four sections:

setup go {
    import "crypto/sha256"
    import "encoding/hex"
    
    declare {
        var globalCounter int
        type Config struct { Size int }
    }
    
    init {
        globalCounter = 0
    }
    
    helpers {
        func hash(data []byte) []byte {
            h := sha256.Sum256(data)
            return h[:]
        }
    }
}

Fixtures

Portable test data shared across languages:

# Hex-encoded data
fixture short_data {
    hex: "deadbeef"
}

# Load from file
fixture long_data {
    hex: @file("fixtures/1kb.hex")
}

# Generic external data source (raw bytes)
fixture packet {
    data: @file("fixtures/packet.bin")
    encoding: raw
}

# Structured input extraction
fixture request_id {
    data: @file("fixtures/request.json")
    format: json
    selector: "$.items[0].id"
}

# Language-specific implementations
fixture complex {
    go:   buildComplexStruct()
    ts:   buildComplexObject()
    rust: build_complex_struct()
}

hex: remains supported for compatibility. The new fixture model supports:

data: inline string or @file(...)
encoding: hex | raw | utf8 | base64
format: json | csv (optional structured normalization)
selector: optional extraction path ($.a[0].b for JSON, row,col for CSV)

Benchmarks

bench keccak256 {
    description: "Keccak-256 hash"
    iterations: 10000          # Override suite default
    warmup: 500
    timeout: 5000
    tags: ["crypto", "hash"]
    
    before {
        go:   prepareData()
        ts:   prepareData()
    }
    
    go:   keccak256(data)
    ts:   keccak256(data_hex)
    rust: keccak256(&data)
    
    after {
        go: logResult()
        ts: logResult()
    }
    
    validate {
        go:   len(result) == 32
        ts:   result.length === 66
    }
}

Lifecycle Hooks

before — Runs once before benchmark iterations
after — Runs once after iterations (per-language teardown/inspection)
each — Runs before each iteration (outside timing)

Standard Library

use std::anvil      # Ethereum node spawning
use std::charting   # Chart generation
use std::constants  # Mathematical constants

globalSetup {
    anvil.spawnAnvil(fork: "https://eth-mainnet.g.alchemy.com/v2/...")
}

bench example {
    go: doSomething(os.Getenv("ANVIL_RPC_URL"))
    
    after {
        // [!code focus]
        charting.drawTable(title: "Results", output: "results-table.svg", sortBy: "speedup")
        charting.drawSpeedupChart(title: "Speedup")
    }
}

LSP Support

poly-bench includes a full-featured language server for editor integration.

Features

Feature	Description
Diagnostics	Parse errors, validation warnings, embedded language type checking
Formatting	Document formatting with proper indentation
Completions	Context-aware completions for keywords, stdlib, and user symbols
Hover	Documentation for DSL keywords and stdlib; delegates to gopls/tsserver/rust-analyzer for embedded code
Semantic Highlighting	Full syntax highlighting via semantic tokens
Embedded Language Support	Go, TypeScript, Rust, and Python code blocks are checked by their respective language servers

Editor Extensions

VS Code / Cursor: Install from extensions/vscode or the marketplace
More editors coming soon

Starting the LSP

poly-bench lsp  # Starts language server on stdio

CLI Reference

Core Commands

poly-bench check <file>           # Parse and validate
poly-bench compile [<file>]       # Compile-check without running
poly-bench cache stats            # Show compile cache/workspace stats
poly-bench cache clear            # Clear compile cache
poly-bench cache clean            # Clean .polybench workspace
poly-bench run [<file>]           # Execute benchmarks
poly-bench codegen <file>         # Generate code without running
poly-bench fmt [<files>...]       # Format .bench files

Project Management

poly-bench init [<name>]          # Initialize new project
poly-bench new <name>             # Create benchmark template
poly-bench add --go <pkg>         # Add Go dependency
poly-bench add --ts <pkg>         # Add npm dependency  
poly-bench add --rs <crate>       # Add Rust crate
poly-bench remove --go <pkg>      # Remove Go dependency
poly-bench remove --ts <pkg>      # Remove npm dependency
poly-bench remove --rs <crate>    # Remove Rust crate
poly-bench install                # Install dependencies
poly-bench build                  # Build runtime environment

Run Options

poly-bench run hash.bench \
    --lang go \                   # Run only Go
    --iterations 100000 \         # Override iterations
    --report markdown \           # Output format: console, markdown, json
    --output results/             # Output directory

Output Formats

Console (default)

═══════════════════════════════════════════════════════════════════
  BENCHMARK RESULTS
═══════════════════════════════════════════════════════════════════

OVERALL SUMMARY

  Go is 2.5x faster overall

  Total Benchmarks:    10
  Go Wins:             8 (80%)
  TypeScript Wins:     2 (20%)

Markdown

Generates benchmark-report.md with tables and statistics.

JSON

Structured output to benchmark-results.json for CI/automation.

SVG Charts

Visual comparisons via charting.* directives in after blocks.

Supported Languages

Language	Runtime	Memory Profiling	Concurrency
Go	Subprocess + plugin	`runtime.ReadMemStats`	Goroutines
TypeScript	Node.js	`process.memoryUsage()`	—
Rust	Cargo subprocess	Supported	—

Requirements

Go 1.21+ (for Go benchmarks)
Node.js 18+ (for TypeScript benchmarks)
Rust 1.70+ (for Rust benchmarks, or building from source)

Installation

One-liner:

curl -sSL https://polybench.evm-tooling.tools | bash

From source:

cargo install --path .

Upgrade:

poly-bench upgrade

Release

Releases are created via the justfile:

just prod release patch

You can also use:

just prod release minor
just prod release major
just prod release explicit v0.1.0
just prod release docs

This bumps versions in Cargo.toml and extensions/vscode/package.json (for patch/minor/major), creates a git tag, and triggers the GitHub release workflow.

License

MIT License — see LICENSE for details.

Name		Name	Last commit message	Last commit date
Latest commit History 352 Commits
.github		.github
.vscode		.vscode
build		build
cli		cli
docs		docs
examples		examples
extensions		extensions
install		install
integration-tests		integration-tests
poly-bench-dsl		poly-bench-dsl
poly-bench-executor		poly-bench-executor
poly-bench-grammar		poly-bench-grammar
poly-bench-ir-traits		poly-bench-ir-traits
poly-bench-ir		poly-bench-ir
poly-bench-lsp-v2		poly-bench-lsp-v2
poly-bench-project		poly-bench-project
poly-bench-reporter		poly-bench-reporter
poly-bench-runtime		poly-bench-runtime
poly-bench-stdlib		poly-bench-stdlib
poly-bench-syntax		poly-bench-syntax
poly-bench-traits		poly-bench-traits
scripts		scripts
.gitignore		.gitignore
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml
justfile		justfile
rustfmt.toml		rustfmt.toml

Folders and files

Latest commit

History

Repository files navigation

poly-bench

Overview

Architecture

End-to-End Flow

Workspace Components

Execution Model

CLI and Runtime Architecture

LSP Architecture

Quick Start

Language Features

Suite Structure

DSL Keywords

Suite Configuration

Validation & Lint Rules

Setup Blocks

Fixtures

Benchmarks

Lifecycle Hooks

Standard Library

LSP Support

Features

Editor Extensions

Starting the LSP

CLI Reference

Core Commands

Project Management

Run Options

Output Formats

Console (default)

Markdown

JSON

SVG Charts

Supported Languages

Requirements

Installation

Release

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 41

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages