zw-fast-quantile

A Rust implementation of the Zhang-Wang fast approximate quantile algorithm, as described in An Efficient Quantile Computation Technique for Approximate Query Processing (SSDBM 2007). This is the original algorithm that inspired the quantile techniques used in TensorFlow Boosted Trees. For more context, see this blog post.

Features

Approximate quantile computation with tunable error bound (epsilon)
Two variants: fixed-size (known stream length) and unbounded (unknown stream length)
Generic over any Clone + Ord element type
Sub-linear memory usage — space grows logarithmically with stream size
Optional serde support for serialization/deserialization
Query caching for efficient repeated lookups

Installation

Add this to your Cargo.toml:

[dependencies]
zw-fast-quantile = "1.0"

Usage

FixedSizeEpsilonSummary

Use when the total number of elements is known ahead of time. This allows tighter memory allocation.

use zw_fast_quantile::FixedSizeEpsilonSummary;

let epsilon = 0.01;  // 1% error tolerance
let n = 10000;       // expected stream size
let mut summary = FixedSizeEpsilonSummary::new(n, epsilon).unwrap();

for value in 1..=n {
    summary.update(value);
}

// Query the median (rank 0.5)
let median = summary.query(0.5).unwrap();

// Query any quantile from 0.0 (min) to 1.0 (max)
let p99 = summary.query(0.99).unwrap();

UnboundEpsilonSummary

Use when the stream size is not known in advance. It dynamically manages internal summaries as the stream grows.

use zw_fast_quantile::UnboundEpsilonSummary;

let epsilon = 0.01;
let mut summary = UnboundEpsilonSummary::new(epsilon).unwrap();

// Feed values from any iterator or stream
for value in 1..=10000 {
    summary.update(value);
}

let median = summary.query(0.5).unwrap();
let count = summary.size();  // number of elements inserted

Error Handling

Both constructors and query() return Result<_, QuantileError>:

use zw_fast_quantile::{FixedSizeEpsilonSummary, QuantileError};

// Invalid parameters
assert!(FixedSizeEpsilonSummary::<usize>::new(0, 0.1).is_err());     // n must be > 0
assert!(FixedSizeEpsilonSummary::<usize>::new(10, -0.1).is_err());   // epsilon must be positive
assert!(FixedSizeEpsilonSummary::<usize>::new(10, f64::NAN).is_err()); // epsilon must be finite

// Query errors
let s = FixedSizeEpsilonSummary::<usize>::new(10, 0.1).unwrap();
assert!(s.query(0.5).is_err());  // EmptySummary — no values inserted

Choosing Epsilon

The epsilon parameter controls the trade-off between accuracy and memory:

Epsilon	Max Error	Relative Memory
0.1	10%	Lowest
0.01	1%	Moderate
0.001	0.1%	Higher

A query for rank r on a stream of n elements will return an element whose true rank is within r ± epsilon (i.e., the result is within epsilon * n positions of the exact answer).

Benchmarks

Benchmarked against GK01 from the quantiles crate. Inserting 5000 values with epsilon = 0.01:

Update (insertion) performance — ZW is ~2.6x faster:

zw unbound quantile update
                        time:   [60.780 us 60.855 us 60.936 us]

gk quantile update      time:   [156.84 us 157.02 us 157.24 us]

Query performance (10 quantiles) — ZW is ~1.5x faster:

zw unbound quantile query
                        time:   [229.62 ns 230.16 ns 230.77 ns]

gk quantile query       time:   [350.21 ns 350.48 ns 350.76 ns]

Run benchmarks yourself with:

cargo bench

Optional Features

serde — Enable serialization/deserialization support via serde. Cache fields are skipped during serialization and lazily rebuilt on the next query.
```
[dependencies]
zw-fast-quantile = { version = "1.0", features = ["serde"] }
```

Minimum Supported Rust Version

The MSRV is 1.65.

Documentation

API docs on docs.rs
Algorithm internals — how the data structures and operations work
Architecture overview — code organization and design decisions

Related Projects

zw-fast-quantile-py — Python bindings for this library

License

Licensed under Apache-2.0.

Name		Name	Last commit message	Last commit date
Latest commit History 80 Commits
.github		.github
benches		benches
docs		docs
src		src
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
rustfmt.toml		rustfmt.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

zw-fast-quantile

Features

Installation

Usage

FixedSizeEpsilonSummary

UnboundEpsilonSummary

Error Handling

Choosing Epsilon

Benchmarks

Optional Features

Minimum Supported Rust Version

Documentation

Related Projects

License

About

Uh oh!

Releases

Sponsor this project

Uh oh!

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

zw-fast-quantile

Features

Installation

Usage

FixedSizeEpsilonSummary

UnboundEpsilonSummary

Error Handling

Choosing Epsilon

Benchmarks

Optional Features

Minimum Supported Rust Version

Documentation

Related Projects

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages