Helix Context

Genome-based context compression for local LLMs.

Makes 9k tokens of context window feel like 600k by treating context like a genome instead of a flat text buffer.

  Client (Continue, Cursor, any OpenAI client)
         |
         v
  +--------------------------+
  |  Helix Proxy (FastAPI)   |  Port 11437
  |  /v1/chat/completions    |  OpenAI-compatible
  |                          |
  |  1. Extract query        |
  |  2. Express pipeline     |  <-- Genome (SQLite)
  |  3. Inject context       |  <-- Ribosome (CPU model)
  |  4. Forward to Ollama    |  --> localhost:11434
  |  5. Stream tee response  |
  |  6. Background replicate |
  +--------------------------+

Instead of stuffing your entire codebase into the prompt, Helix compresses it into a persistent SQLite genome and expresses only the relevant genes per turn. The model sees compressed context, not raw text. Conversations replicate back into the genome automatically, building institutional memory over time.

Quick Start

# Install from PyPI (beta)
pip install helix-context --pre

# Pull a small model for the ribosome (context codec)
ollama pull gemma4:e2b

# Start the proxy
helix
# or: python -m uvicorn helix_context.server:app --host 127.0.0.1 --port 11437

# Seed the genome with your own project files
python examples/seed_genome.py path/to/your/project/

# Check genome health
curl http://127.0.0.1:11437/stats

Point any OpenAI-compatible client at http://127.0.0.1:11437/v1 and start chatting. Context compression happens transparently.

How It Works

6-step expression pipeline per turn:

Step	What	Cost	Blocking?
1. Extract	Heuristic keyword extraction from query	0 tokens	No
2. Express	SQLite promoter lookup + synonym expansion + co-activation	0 tokens	No
3. Re-rank	Small CPU model scores candidates by relevance	~300 tokens	Yes
4. Splice	Small CPU model trims introns, keeps exons (batched)	~600 tokens	Yes
5. Assemble	Join spliced parts, enforce token budget, wrap in tags	0 tokens	No
6. Replicate	Pack query+response exchange back into genome	~300 tokens	No (background)

Token budget:

3k tokens: ribosome decoder prompt (fixed, tells the big model how to read codons)
6k tokens: expressed context (compressed, spliced)
600k+: genome cold storage (SQLite, never fully loaded)

Key Features

Context Health Monitor (Delta-Epsilon)

Every query computes a health signal measuring how well the genome served it:

{
  "context_health": {
    "ellipticity": 0.82,
    "coverage": 0.75,
    "density": 0.68,
    "freshness": 1.0,
    "genes_expressed": 3,
    "genes_available": 42,
    "status": "aligned"
  }
}

Status	Ellipticity	Meaning
`aligned`	>= 0.7	Genome is well-grounded, model is informed
`sparse`	>= 0.3	Gaps exist, model may guess on some topics
`stale`	any	Expressed genes are outdated (low freshness)
`denatured`	< 0.3	Context is unreliable, high hallucination risk

Horizontal Gene Transfer (HGT)

Export a genome and import it into another Helix instance:

# Export
python examples/hgt_transfer.py export -d "Project knowledge snapshot"

# Preview what an import would change
python examples/hgt_transfer.py diff genome_export.helix

# Import into another instance
python examples/hgt_transfer.py import genome_export.helix

Three merge strategies: skip_existing (safe default), overwrite, newest. Content-addressed gene IDs ensure deduplication across instances.

Associative Memory

Genes that are frequently expressed together build co-activation links. When you query for topic A, the genome also pulls in topic B if they've been co-expressed before. This creates an organic associative memory that grows smarter over time.

Synonym Expansion

Configure lightweight query expansion in helix.toml:

[synonyms]
cache = ["redis", "ttl", "invalidation", "cdn"]
auth = ["jwt", "login", "security", "token"]

When a user asks about "cache", the genome also searches for "redis", "ttl", etc.

HTTP Endpoints

Endpoint	Method	Description
`/v1/chat/completions`	POST	OpenAI-compatible proxy (primary integration)
`/ingest`	POST	Ingest content into genome: `{content, content_type, metadata?}`
`/context`	POST	Query genome for context: `{query}` (Continue format)
`/stats`	GET	Genome metrics, compression ratio, health
`/health`	GET	Server status, ribosome model, gene count

Continue IDE Integration

Add to ~/.continue/config.yaml:

models:
  - name: Helix (Local)
    provider: openai
    model: gemma4:e4b
    apiBase: http://127.0.0.1:11437/v1
    apiKey: EMPTY
    roles: [chat]
    defaultCompletionOptions:
      contextLength: 128000
      maxTokens: 4096

Use Chat mode (not Agent mode). Set contextLength high so Continue sends the full message; Helix handles compression downstream.

Python API

from helix_context import HelixContextManager, load_config

config = load_config()
helix = HelixContextManager(config)

# Ingest content
helix.ingest("Your document text here", content_type="text")
helix.ingest(open("src/main.py").read(), content_type="code")

# Build context for a query
window = helix.build_context("How does auth work?")
print(window.expressed_context)
print(window.context_health.status)  # "aligned" / "sparse" / "denatured"

# Learn from an exchange
helix.learn("How does auth work?", "JWT middleware validates tokens...")

# Export genome
from helix_context.hgt import export_genome
export_genome(helix.genome, "project.helix", description="Auth system knowledge")

ScoreRift Integration

Helix includes a bridge to ScoreRift for divergence-based context health monitoring:

from helix_context.integrations.scorerift import GenomeHealthProbe, cd_signal

# Probe genome health
probe = GenomeHealthProbe("http://127.0.0.1:11437")
report = probe.full_scan()

# Register as ScoreRift dimensions
from helix_context.integrations.scorerift import make_genome_dimensions
engine.register_many(make_genome_dimensions())

# Feed divergence resolutions back into the genome
from helix_context.integrations.scorerift import resolution_to_gene
resolution_to_gene("security", auto_score=0.85, manual_score=1.0,
                   resolution="False positives in auth module scanner rules")

Configuration

All config in helix.toml:

[ribosome]
model = "auto"              # auto-detect from Ollama
timeout = 10                # seconds before fallback
keep_alive = "30m"          # keep model loaded (eliminates swap latency)

[budget]
ribosome_tokens = 3000
expression_tokens = 6000
max_genes_per_turn = 8
splice_aggressiveness = 0.5  # 0=keep all, 1=ruthless trim

[genome]
path = "genome.db"
cold_start_threshold = 10   # genes needed before history stripping

[server]
port = 11437
upstream = "http://localhost:11434"

[synonyms]
cache = ["redis", "ttl", "invalidation", "cdn"]
auth = ["jwt", "login", "security", "token"]

Testing

# Mock tests only (no Ollama needed, ~8s)
pytest tests/ -m "not live"

# Live tests (requires Ollama)
pytest tests/ -m live -v -s

# Full suite
pytest tests/ -v

183 tests across 7 test files, 18 diverse fixtures (code, essays, poems, science).

Architecture

Module	Role
`schemas.py`	Gene, ContextWindow, ContextHealth, ChromatinState
`codons.py`	CodonChunker (text/code splitting) + CodonEncoder (serialization)
`genome.py`	SQLite genome with promoter-tag retrieval + co-activation
`ribosome.py`	Small-model codec: pack, re_rank, splice, replicate
`context_manager.py`	6-step pipeline orchestrator + pending replication buffer
`server.py`	FastAPI proxy + standalone endpoints
`config.py`	TOML config loader with synonym map
`hgt.py`	Genome export/import (Horizontal Gene Transfer)
`integrations/scorerift.py`	CD spectroscope bridge to ScoreRift

Origin

Built as a standalone package extracted from BigEd CC. Implements the "Ribosome Hypothesis" for local LLM context management.

License

Apache 2.0

Name		Name	Last commit message	Last commit date
Latest commit History 24 Commits
.github/workflows		.github/workflows
benchmarks		benchmarks
examples		examples
helix_context		helix_context
scripts		scripts
tests		tests
training		training
.gitignore		.gitignore
BENCHMARK_NOTES.md		BENCHMARK_NOTES.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
SESSION_HANDOFF.md		SESSION_HANDOFF.md
helix.toml		helix.toml
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Helix Context

Quick Start

How It Works

Key Features

Context Health Monitor (Delta-Epsilon)

Horizontal Gene Transfer (HGT)

Associative Memory

Synonym Expansion

HTTP Endpoints

Continue IDE Integration

Python API

ScoreRift Integration

Configuration

Testing

Architecture

Origin

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Helix Context

Quick Start

How It Works

Key Features

Context Health Monitor (Delta-Epsilon)

Horizontal Gene Transfer (HGT)

Associative Memory

Synonym Expansion

HTTP Endpoints

Continue IDE Integration

Python API

ScoreRift Integration

Configuration

Testing

Architecture

Origin

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages