Ugnos sees a healthy 100+ re-clones in the 14 days after each feature release. THANK YOU ALL FOR USING UGNOS!
ugnos is a concurrent, embeddable time-series storage + query engine designed for durability and high-throughput ingest in Rust services. The ugnosd daemon exposes Prometheus-compatible HTTP APIs (query, Remote Write) and optional gRPC with auth.
For project goals and long-term architecture, see the whitepaper.
We plan to add client SDKs for multiple languages (Rust, Go, Python, TypeScript); the order and the structure are not yet fixed. We lean toward a monorepo, as shown below:
Organization: ugnos
Single repo: ugnos/ugnos (or ugnos/monorepo)
```text
ugnos/
├── core/                 # Daemon + library crate (current contents of ugnos repo)
│   ├── src/
│   ├── proto/            # ugnos + prompb
│   ├── openapi.yaml      # (future)
│   ├── Cargo.toml
│   └── ...
├── sdk/
│   ├── rust/             # Client crate (crates.io)
│   ├── go/               # Go module
│   ├── python/           # PyPI package
│   └── typescript/       # npm package
├── .github/workflows/    # Path-filtered CI: core, sdk/rust, sdk/go, etc.
└── README.md             # Points to core/ and sdk/* subdirs
```
For the latest changelog and version history, see the CHANGELOG.
This crate is a library-grade database core intended to be embedded into a Rust process (service/agent/daemon). The ugnosd daemon exposes it as a networked server (HTTP + gRPC) with auth and Prometheus-compatible APIs.
- This is:
  - An embeddable time-series ingest + query core with WAL/snapshots/segments (SST-like) and a structured event hook.
  - A production daemon (`ugnosd`) with HTTP ops (liveness/readiness), Prometheus Remote Write, Prometheus HTTP API (instant/range query, labels, series), optional gRPC, and deny-by-default AuthN/AuthZ.
  - A PromQL-like query surface (vector/range selectors, label matchers, window functions, aggregations) with vectorized execution and a Grafana-compatible Prometheus datasource.
  - A PromQL library API (`ugnos::promql`): run instant query, range query, labels, label values, and series directly against a `DbCore` from Rust code, with typed results (`InstantSample`, `RangeSeries`, `MetricLabels`), a unified `PromqlError`, and optional time/step parsing; no HTTP required.
  - Suitable for single-process or single-node deployment where you own deployment, IO, and operational integration.
- This is not (yet):
  - A distributed system (no replication, consensus, or sharding across nodes).
  - A full SQL layer (queries are PromQL-like and programmatic APIs).
  - A turnkey operational product (no built-in backup orchestration, migration tooling, or admin UI).
- Concurrent ingest: sharded write buffering + background flush thread.
- Durable persistence:
  - WAL with explicit format versioning and per-record CRC32 checksums.
  - Snapshots with explicit format versioning, payload CRC32, and atomic install (temp + rename + fsync).
- On-disk segment engine (SST-like):
  - Immutable segment files with per-series columnar blocks; block-level checksums and versioning (v2 header with payload CRC32 and version field).
  - Timestamp delta encoding (varint) for series blocks; configurable float encoding strategies (Raw64, GorillaXor).
  - Tag dictionary encoding; optional per-block compression (LZ4, Zstd with configurable level).
  - Time index per segment/block (time-range via binary search); tag index (inverted index with Roaring bitmaps) for tag filters without full scans.
  - Atomic manifest (`MANIFEST.bin`) tracking active segments and the retention watermark.
  - Background compaction (L0 → L1 merge) with safe concurrent reads.
- Indexing & cardinality:
  - Tag filters use the tag index (bitmap intersection); configurable series cardinality hard limit per scope with explicit error and metrics.
- Retention/TTL:
  - Immediate logical deletion via tombstone watermark.
  - Physical removal via compaction guarantees.
- Query & APIs (daemon / library):
  - PromQL-like query surface: vector/range selectors, label matchers (`=`, `!=`, `=~`, `!~`), window functions (`rate`, `increase`, `avg_over_time`, …), aggregations with grouping (`sum by`, `avg without`, …); IEEE 754 semantics (NaN propagation, min/max special case).
  - Query planner with `explain()`; vectorized execution with an optional parallelism cap (`query_max_parallel_series`).
  - PromQL library API (`ugnos::promql`): `query_instant`, `query_range`, `labels`, `label_values`, `series` with typed results and `PromqlError`; optional `parse_eval_time`/`parse_step` for config/CLI parity with the HTTP API.
  - Prometheus HTTP API v1: `GET /api/v1/query`, `GET /api/v1/query_range`, `GET /api/v1/labels`, `GET /api/v1/label/<name>/values`, `GET /api/v1/series` (Grafana Prometheus datasource compatible).
  - Prometheus Remote Write: `POST /api/v1/write` (Snappy-compressed protobuf); backpressure/cardinality limit → 429 and metrics.
  - gRPC (Tonic) for ingest/query/admin; AuthN/AuthZ deny-by-default for HTTP and gRPC (e.g. `http_write_token`, gRPC auth).
- Observability hooks:
  - No stdout logging in core hot paths.
  - Structured `DbEvent` stream via `DbConfig.event_listener`.
- Bench suite reports segment size per encoding configuration; tests assert p99 query latency within target for encoded/compressed segments.
- Acceptance and break-it tests covering format layout validation, checksum/version enforcement, roundtrip correctness, and corruption detection.
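The timestamp delta + varint encoding mentioned above can be sketched in a few lines of std-only Rust. This is an illustration of the general technique, not ugnos's actual block format (which adds headers, checksums, and versioning):

```rust
/// LEB128-style unsigned varint: 7 bits per byte, high bit = "more follows".
fn write_varint(mut v: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

fn read_varint(buf: &[u8], pos: &mut usize) -> u64 {
    let (mut v, mut shift) = (0u64, 0);
    loop {
        let byte = buf[*pos];
        *pos += 1;
        v |= ((byte & 0x7f) as u64) << shift;
        if byte & 0x80 == 0 {
            return v;
        }
        shift += 7;
    }
}

/// Encode ascending timestamps as a count, a first value, then deltas.
fn encode_timestamps(ts: &[u64]) -> Vec<u8> {
    let mut out = Vec::new();
    write_varint(ts.len() as u64, &mut out);
    let mut prev = 0u64;
    for &t in ts {
        write_varint(t - prev, &mut out); // small deltas -> few bytes
        prev = t;
    }
    out
}

fn decode_timestamps(buf: &[u8]) -> Vec<u64> {
    let mut pos = 0;
    let n = read_varint(buf, &mut pos) as usize;
    let mut prev = 0u64;
    (0..n)
        .map(|_| {
            prev += read_varint(buf, &mut pos);
            prev
        })
        .collect()
}

fn main() {
    let ts = vec![1_700_000_000_000, 1_700_000_000_250, 1_700_000_000_500];
    let encoded = encode_timestamps(&ts);
    assert_eq!(decode_timestamps(&encoded), ts);
    // Three 8-byte timestamps shrink: each 250 ms delta fits in 2 varint bytes.
    println!("raw = {} bytes, encoded = {} bytes", ts.len() * 8, encoded.len());
}
```

The win comes from regular sampling intervals: after the first timestamp, every delta is small and encodes in one or two bytes instead of eight.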
All persistence lives under `DbConfig.data_dir`:

- `wal/`: `wal.log` (current WAL); `wal_*.log` (rotated WAL segments; may exist briefly)
- `snapshots/`: `snapshot_<timestamp>.bin` (atomic, checksummed snapshots)
- `engine/segments/`: `MANIFEST.bin` (atomic + checksummed); `seg_<id>_l0.seg`, `seg_<id>_l1.seg`, ...

- `DbCore::flush()` blocks until the flush is complete.
- `DbCore::snapshot()` blocks until the snapshot is written (when enabled).
- With segments enabled, `DbCore::recover()`:
  - Uses the segment max-seq to replay only the WAL tail.
  - Truncates `wal.log` back to just the header (bounded restart cost).
- Query results are not guaranteed to be globally sorted across multiple segments; sort by timestamp if you need ordering.
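Since results can span multiple segments unsorted, a caller that needs ordering can concatenate and sort by timestamp. A minimal sketch, assuming `(u64, f64)` timestamp/value pairs as in the query example later in this README:

```rust
/// Concatenate per-segment query results, then sort globally by timestamp.
fn merge_by_timestamp(segments: &[Vec<(u64, f64)>]) -> Vec<(u64, f64)> {
    let mut merged: Vec<(u64, f64)> = segments.iter().flatten().copied().collect();
    merged.sort_by_key(|(ts, _)| *ts); // stable sort: equal timestamps keep order
    merged
}

fn main() {
    // Each segment is time-sorted internally, but their concatenation is not.
    let seg_l0 = vec![(300u64, 0.9), (400, 0.7)];
    let seg_l1 = vec![(100u64, 0.5), (200, 0.6), (350, 0.8)];

    let merged = merge_by_timestamp(&[seg_l0, seg_l1]);
    let timestamps: Vec<u64> = merged.iter().map(|(ts, _)| *ts).collect();
    assert_eq!(timestamps, vec![100, 200, 300, 350, 400]);
}
```

For large result sets a k-way merge over the already-sorted per-segment runs would avoid the full re-sort, but a plain sort is correct and simple.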
`DbConfig` is intended to be explicit and production-friendly:
```rust
use std::path::PathBuf;
use std::time::Duration;
use tempfile::TempDir;
use ugnos::{DbConfig, DbCore};
use ugnos::encoding::{BlockCompression::Zstd, FloatEncoding::GorillaXor};

let dir = TempDir::new().unwrap();
let mut cfg = DbConfig::default();
cfg.data_dir = PathBuf::from(dir.path());

// Durability toggles
cfg.enable_wal = true;
cfg.enable_snapshots = true;
cfg.enable_segments = true; // segment engine + compaction + retention

// Tuning
cfg.wal_buffer_size = 1_000;
cfg.flush_interval = Duration::from_millis(250);
cfg.snapshot_interval = Duration::from_secs(60 * 15);

// Retention (optional): makes data older than now - ttl invisible,
// and compaction reclaims disk.
cfg.retention_ttl = Some(Duration::from_secs(60 * 60 * 24 * 7));
cfg.retention_check_interval = Duration::from_secs(1);

// Encoding & compression (series blocks in segments):
// float (Raw64 | GorillaXor), tag dictionary, LZ4/Zstd.
cfg.segment_store.encoding.float_encoding = GorillaXor;
cfg.segment_store.encoding.compression = Zstd { level: 3 };

// Cardinality (optional): hard limit for distinct series per scope;
// scope is derived from tags[cardinality_scope_tag_key].
// When exceeded, insert returns DbError::SeriesCardinalityLimitExceeded
// and metrics:
// - ugnos_cardinality_limit_rejections
// - ugnos_series_cardinality
// cfg.max_series_cardinality = Some(100);
// cfg.cardinality_scope_tag_key = Some("tenant".to_string());

let mut db = DbCore::with_config(cfg).unwrap();
db.recover().unwrap();
```

Core emits structured events via `DbConfig.event_listener`. With cardinality limits enabled, telemetry exposes `ugnos_cardinality_limit_rejections` and `ugnos_series_cardinality` (when using the Prometheus recorder).
```rust
use std::sync::{Arc, Mutex};
use ugnos::{DbEvent, DbEventListener};

#[derive(Debug)]
struct MemoryEvents(Arc<Mutex<Vec<DbEvent>>>);

impl DbEventListener for MemoryEvents {
    fn on_event(&self, event: DbEvent) {
        self.0.lock().unwrap().push(event);
    }
}
```

A complete, runnable example:

```rust
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use std::time::Duration;
use tempfile::TempDir;
use ugnos::{DbConfig, DbCore, DbEvent, DbEventListener, TagSet};

#[derive(Debug)]
struct MemoryEvents(Arc<Mutex<Vec<DbEvent>>>);

impl DbEventListener for MemoryEvents {
    fn on_event(&self, event: DbEvent) {
        self.0.lock().unwrap().push(event);
    }
}

fn main() -> Result<(), ugnos::DbError> {
    let events = Arc::new(Mutex::new(Vec::new()));
    let dir = TempDir::new().unwrap();

    let mut cfg = DbConfig::default();
    cfg.data_dir = PathBuf::from(dir.path());
    cfg.enable_wal = true;
    cfg.enable_snapshots = true;
    cfg.enable_segments = true;
    cfg.retention_ttl = Some(Duration::from_secs(60 * 60 * 24 * 30)); // 30 days
    cfg.event_listener = Arc::new(MemoryEvents(events.clone()));

    let mut db = DbCore::with_config(cfg)?;
    db.recover()?;

    let mut tags = TagSet::new();
    tags.insert("host".to_string(), "server1".to_string());
    tags.insert("region".to_string(), "us-east".to_string());

    db.insert("cpu_usage", 100, 0.75, tags.clone())?;
    db.insert("cpu_usage", 200, 0.80, tags.clone())?;
    db.flush()?; // blocks until durable

    let mut results = db.query("cpu_usage", 0..u64::MAX, Some(&tags))?;
    results.sort_by_key(|(ts, _)| *ts);
    assert_eq!(results.len(), 2);
    Ok(())
}
```

The `ugnosd` binary runs UGNOS as a production daemon. Configuration is layered (later overrides earlier):
- Defaults: built-in `DbConfig` defaults
- Config file: TOML at `--config <path>` or, if omitted, `ugnosd.toml` in the current directory (if present)
- Environment: `UGNOS__*` variables (e.g. `UGNOS__DATA_DIR`, `UGNOS__HTTP_BIND`, `UGNOS__HTTP_WRITE_TOKEN`, `UGNOS__HTTP_READ_TOKEN`, `UGNOS__SEGMENT_STORE__COMPACTION_CHECK_INTERVAL_SECS`; use `__` for nested keys)
- CLI: `--config`, `--data-dir`, `--http-bind`, `--no-config`, `--validate-config`
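The precedence just described reduces to a chain of optional overrides where later layers win. A hypothetical sketch of that merge logic (field names illustrative, not ugnosd's actual loader):

```rust
// Illustrative layered-config merge: each layer may set a subset of keys;
// later layers override earlier ones, field by field.
#[derive(Clone, Debug, PartialEq)]
struct Layer {
    data_dir: Option<String>,
    http_bind: Option<String>,
}

/// Field-wise merge: take `over`'s value when present, else keep `base`'s.
fn merge(base: Layer, over: Layer) -> Layer {
    Layer {
        data_dir: over.data_dir.or(base.data_dir),
        http_bind: over.http_bind.or(base.http_bind),
    }
}

fn main() {
    let defaults = Layer { data_dir: Some("./data".into()), http_bind: Some("127.0.0.1:8080".into()) };
    let file = Layer { data_dir: Some("/var/lib/ugnos".into()), http_bind: None };
    let env = Layer { data_dir: None, http_bind: Some("0.0.0.0:8080".into()) };
    let cli = Layer { data_dir: None, http_bind: None };

    // defaults < file < env < cli
    let effective = [file, env, cli].into_iter().fold(defaults, merge);
    assert_eq!(effective.data_dir.as_deref(), Some("/var/lib/ugnos")); // from file
    assert_eq!(effective.http_bind.as_deref(), Some("0.0.0.0:8080")); // from env
}
```

Keeping every layer as `Option`s makes "unset" distinguishable from "set to the default", which is what lets a file value survive an empty environment layer.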
HTTP auth (deny-by-default): when `http_write_token` or `http_read_token` is unset, the corresponding endpoints return 401. To enable auth: set `http_write_token` and/or `http_read_token` in TOML, or set `UGNOS__HTTP_WRITE_TOKEN` and/or `UGNOS__HTTP_READ_TOKEN` in the environment. Clients must send `Authorization: Bearer <token>`. `http_write_token` protects `POST /api/v1/write` (Remote Write); `http_read_token` protects the Prometheus read API (query, query_range, labels, series).
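Deny-by-default means an endpoint with no configured token rejects every request. A hypothetical sketch of that check (not ugnosd's implementation):

```rust
/// Returns an HTTP-style status for a request against a token-guarded endpoint.
/// `configured` is the server-side token (None = auth not configured);
/// `header` is the incoming Authorization header, if any.
fn authorize(configured: Option<&str>, header: Option<&str>) -> u16 {
    match configured {
        None => 401, // deny-by-default: no token configured -> reject everything
        Some(token) => match header {
            Some(h) if h == format!("Bearer {token}") => 200,
            _ => 401, // missing or wrong bearer token
        },
    }
}

fn main() {
    assert_eq!(authorize(None, None), 401);
    assert_eq!(authorize(None, Some("Bearer secret")), 401); // still rejected
    assert_eq!(authorize(Some("secret"), None), 401);
    assert_eq!(authorize(Some("secret"), Some("Bearer wrong")), 401);
    assert_eq!(authorize(Some("secret"), Some("Bearer secret")), 200);
}
```

The second case is the one that surprises people: sending a token to a server with no token configured is still a 401, because an unset token disables the endpoint rather than opening it.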
Safe startup: Before opening the database, the daemon checks that data_dir exists (creates it if missing) and is writable. If config is invalid, the data directory is unusable, or recovery fails, the process exits with a non-zero status and an error message.
HTTP endpoints: the daemon serves on the address given by `http_bind` (default `127.0.0.1:8080`; use `0.0.0.0:8080` for Docker/Kubernetes):

- Ops: `GET /healthz` (liveness), `GET /readyz` (readiness). Readiness is 200 after DB open and recovery; 503 otherwise.
- Prometheus: `POST /api/v1/write` (Remote Write), `GET /api/v1/query`, `GET /api/v1/query_range`, `GET /api/v1/labels`, `GET /api/v1/label/<name>/values`, `GET /api/v1/series`. Auth is deny-by-default via `http_write_token`/`http_read_token` (see HTTP auth).
Graceful shutdown: On SIGINT (Ctrl+C) or SIGTERM, the daemon stops accepting new HTTP connections, waits for in-flight requests to finish (up to 30s), flushes the database buffer, then sends shutdown to the background flush thread (which performs a final flush and closes the WAL). The segment store’s compaction loop is stopped when the process exits. This guarantees WAL flush and a clean compaction stop as per the acceptance criteria.
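The shutdown ordering (stop intake, drain, flush, stop the background flush thread) can be sketched with std threads and channels. This illustrates the sequencing only; ugnosd's actual shutdown is signal-driven:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

enum Msg {
    Flush,    // flush the write buffer now
    Shutdown, // perform a final flush, then exit
}

/// Background flush thread: flushes periodically (or on demand) until a
/// Shutdown message arrives; returns how many flushes it performed.
fn spawn_flusher(rx: mpsc::Receiver<Msg>) -> thread::JoinHandle<u32> {
    thread::spawn(move || {
        let mut flushes = 0u32;
        loop {
            match rx.recv_timeout(Duration::from_millis(50)) {
                Ok(Msg::Flush) | Err(mpsc::RecvTimeoutError::Timeout) => flushes += 1,
                Ok(Msg::Shutdown) | Err(mpsc::RecvTimeoutError::Disconnected) => {
                    flushes += 1; // final flush before closing the WAL
                    break;
                }
            }
        }
        flushes
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let flusher = spawn_flusher(rx);

    // Simulated SIGTERM: stop intake and drain in-flight requests (elided),
    // then flush and tell the background thread to finish.
    tx.send(Msg::Flush).unwrap();
    tx.send(Msg::Shutdown).unwrap();
    let flushes = flusher.join().unwrap(); // join = wait for the final flush
    assert!(flushes >= 2);
}
```

The key property is that `join` happens after `Shutdown` is sent, so the process cannot exit before the final flush has completed.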
Example:

```shell
cargo build --release
./target/release/ugnosd --data-dir /var/lib/ugnos

# Bind the health server to all interfaces (e.g. for containers):
./target/release/ugnosd --data-dir /var/lib/ugnos --http-bind 0.0.0.0:8080

# Or validate config without starting the DB:
./target/release/ugnosd --validate-config --config /etc/ugnosd.toml
```

See `ugnosd.toml.example` for a full TOML template.
The build produces a single, mostly static binary per platform (no separate runtime or config artifacts required):

```shell
cargo build --release --bin ugnosd
# Artifact: target/release/ugnosd
```

- systemd: copy `deploy/systemd/ugnosd.service` to `/etc/systemd/system/`, create user `ugnos`, install the binary to `/usr/local/bin/ugnosd`, set `data_dir` (e.g. `/var/lib/ugnos`) and optionally `--config /etc/ugnosd.toml`. Then `systemctl daemon-reload && systemctl enable --now ugnosd`. Use `TimeoutStopSec=35` so SIGTERM allows WAL flush before kill.
- Docker: see Docker below; the image runs `ugnosd` as PID 1 with config and env overrides.
- Kubernetes: use the manifests under `deploy/k8s/` (Deployment, Service, optional ConfigMap). Set `livenessProbe` to `GET /healthz` and `readinessProbe` to `GET /readyz` on port 8080; give the pod a `terminationGracePeriodSeconds` of at least 35 so graceful shutdown can flush the WAL.
Build and run ugnosd in a container for local evaluation or deployment. The image runs as a non-root user, exposes port 8080 for /healthz and /readyz, and uses exec-form entrypoint so the daemon is PID 1 and receives SIGTERM for graceful shutdown (WAL flush, compaction stop).
From the ugnos project root:
```shell
docker build -t ugnosd:latest .
```

The image uses Rust 1.93 by default (edition 2024 requires 1.85+). To pin a different version:

```shell
docker build --build-arg RUST_VERSION=1.93 -t ugnosd:latest .
```

Default: data in a named volume, health server on `0.0.0.0:8080`:
```shell
docker run -d --name ugnosd -p 8080:8080 -v ugnos_data:/var/lib/ugnos ugnosd:latest
curl -s http://localhost:8080/healthz
curl -s http://localhost:8080/readyz
```

With a config file (mount the TOML, plus optional env overrides):
```shell
docker run -d --name ugnosd -p 8080:8080 \
  -v ugnos_data:/var/lib/ugnos \
  -v /path/to/ugnosd.toml:/etc/ugnosd.toml:ro \
  -e UGNOS__HTTP_BIND=0.0.0.0:8080 \
  -e UGNOS__HTTP_WRITE_TOKEN=secret \
  -e UGNOS__HTTP_READ_TOKEN=secret \
  ugnosd:latest --config /etc/ugnosd.toml
```

Give the daemon time to shut down cleanly (the default Docker stop timeout is 10s; the daemon may need up to 30s to drain connections and flush the WAL):

```shell
docker stop -t 35 ugnosd
```

From the ugnos project root:
```shell
docker compose up -d
curl -s http://localhost:8080/healthz
curl -s http://localhost:8080/readyz
docker compose down
```

Compose defines a healthcheck and `stop_grace_period: 35s`, so `docker compose down` sends SIGTERM and allows the daemon to flush before exit.
Tag and push to your registry (e.g. GitHub Container Registry or Docker Hub):

```shell
# Example: GHCR
docker tag ugnosd:latest ghcr.io/YOUR_ORG/ugnosd:0.5.0
docker push ghcr.io/YOUR_ORG/ugnosd:0.5.0

# Example: Docker Hub
docker tag ugnosd:latest YOUR_USER/ugnosd:0.5.0
docker push YOUR_USER/ugnosd:0.5.0
```

- Liveness/readiness: after `docker compose up -d`, `GET /healthz` and `GET /readyz` must return 200. If readiness returns 503, the DB may still be recovering; wait a few seconds.
- Graceful shutdown: run the container, write data (when ingest APIs exist), then `docker compose down` or `docker stop -t 35 ugnosd`. The container should exit 0; on the next start, data should still be present (persistence across restart).
- Invalid config: override with an invalid `UGNOS__HTTP_BIND` (e.g. `not-a-host`) and confirm the container exits non-zero and does not serve traffic.
From the project root you can run an automated verification script (builds the image, brings up compose, checks health endpoints, asserts invalid config fails, then tears down):

```shell
./scripts/verify-docker.sh
```

From the ugnos project root:

```shell
cargo build --release
cargo test
```

The `examples/` folder contains runnable demos:
| Example | What it demonstrates |
|---|---|
| `persistence_demo` | `DbConfig`, insert, flush, snapshot, query, recover; restart flow |
| `encoding_compression_demo` | Segment encoding (GorillaXor, Zstd), `TagSet`, insert/query |
| `event_listener_demo` | `DbEventListener`, event stream; asserts on flush/snapshot events |
| `cardinality_demo` | `max_series_cardinality`, `cardinality_scope_tag_key`; limit-exceeded error |
| `retention_demo` | `retention_ttl`; TTL expiry and compaction |
| `query_tag_filter_demo` | `db.query` with tag filter; multiple series, filtered counts |
| `gen_minimal_write_request` | Emits a Snappy `WriteRequest` to stdout; pipe to curl for Remote Write |
| `prometheus_api_client_demo` | GET query/query_range/labels/series against `ugnosd`; Grafana datasource reference |
| `promql_library_demo` | Run PromQL from Rust: instant/range query, labels, label_values, series; SLO-style request observability |
Library examples (persistence, encoding, event_listener, cardinality, retention, query_tag_filter, promql_library_demo) run standalone: `cargo run --example <name>`.
Server examples require a running `ugnosd` with data. Use the scripts:

```shell
./scripts/run-prometheus-api-client-demo.sh  # Prometheus read API demo
./scripts/verify-remote-write-query.sh       # Write → query assertion
```

To run the benchmarks:

```shell
cargo bench -p ugnos
NOWAL=1 cargo bench -p ugnos
```

Benchmark results are saved in `target/criterion/`. Benchmarks default to the in-memory engine (segments disabled) to keep IO minimal.
This project is licensed under either of
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
at your option.