test: add 26 tests and clean up stale docs#27
Merged
Conversation
- 10 evaluate_ab_experiment tests (manual conclude, max/min duration, insufficient samples, Prometheus failures, statistical significance) - 2 CDEvents A/B tests (experiment concluded, A/B initialization) - 5 prometheus_ab edge cases (empty results, zero rates, se=0, consensus A) - 2 Prometheus AB query builder tests - 2 default_service_port tests - 1 validation negative deadline test - 2 status tests (A/B initialization, invalid timestamp) - 3 occurrence tests (missing namespace, Initializing/Experimenting suffixes) - Enhanced MockPrometheusClient with response queue for multi-query tests Test count: 268 → 294 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Delete stale/aspirational documents: - docs/adr/001-004 (unimplemented vision docs) - docs/PLAN.md (outdated roadmap) - docs/CODE_REVIEW.md (stale line references) Update CLAUDE.md: - Architecture diagram with new module structure - A/B testing, FALSE Protocol, clock in feature list - File locations table with all new files - Test count 294+, ulid dependency Update README.md: - A/B Testing: Planned → Yes - Added FALSE Protocol feature - Test badge 294+, Rust 1.85+ requirement Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1570277 to
e0163d4
Compare
Contributor
There was a problem hiding this comment.
Pull request overview
Adds A/B testing support and expands controller/test infrastructure while cleaning up documentation and updating CI/tooling for the newer Rust toolchain.
Changes:
- Introduces A/B testing strategy (CRD types, controller strategy, Prometheus-based significance evaluation) plus supporting status/traffic/replicaset utilities.
- Refactors controller dependencies via traits (EventSink, MetricsQuerier, Clock) and adds FALSE Protocol occurrence emission.
- Expands test coverage and updates CRD/README/CI to reflect new features and Rust 1.85+.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| tests/stress_test.rs | Updates Rollout test builders for new strategy fields (ab_testing, port). |
| tests/seppo_integration_test.rs | Updates integration tests for new strategy fields (ab_testing, port). |
| sykli.rs | Bumps Rust container image used by the Sykli pipeline to 1.85. |
| src/main.rs | Switches to new HTTP-based sinks/clients and injects Clock into controller Context. |
| src/crd/rollout_test.rs | Adds serialization/deserialization tests for A/B strategy and new phases/status fields. |
| src/crd/rollout.rs | Extends CRD Rust types with A/B strategy, ports, new phases, and A/B experiment status structs. |
| src/crd/conversion_test.rs | Updates conversion tests to account for new strategy/port fields. |
| src/controller/strategies/simple.rs | Updates strategy trait signature and status fields to include clock “now” and ab_experiment. |
| src/controller/strategies/mod.rs | Adds A/B strategy module and updates RolloutStrategy trait to accept now. |
| src/controller/strategies/canary.rs | Wires now through status computation and updates tests for new fields. |
| src/controller/strategies/blue_green.rs | Updates trait signature to accept now and updates tests for new fields. |
| src/controller/strategies/ab_testing.rs | New A/B strategy implementation: ReplicaSets + HTTPRoute rules + status progression. |
| src/controller/rollout/validation.rs | New runtime validation + duration parsing helpers used by reconcile/webhook paths. |
| src/controller/rollout/traffic.rs | New traffic helpers (ports + backend refs/weights + HTTPRoute mutation utilities). |
| src/controller/rollout/status.rs | New status state-machine helpers (init/progress/pause/requeue/deadline checks). |
| src/controller/rollout/replicaset.rs | New ReplicaSet builders, surge parsing, and pod-template hashing. |
| src/controller/rollout/reconcile.rs | New main reconcile loop using the split rollout modules, metrics evaluation, and A/B evaluation. |
| src/controller/prometheus_ab.rs | New Z-test based statistical analysis utilities for A/B experiment conclusions. |
| src/controller/prometheus.rs | Introduces MetricsQuerier trait, HttpPrometheusClient, and queued-response mock client. |
| src/controller/occurrence.rs | New FALSE Protocol occurrence emission (file writer + mapping from phases/strategies). |
| src/controller/mod.rs | Registers new controller submodules (clock/occurrence/prometheus_ab). |
| src/controller/clock.rs | New Clock abstraction (SystemClock + MockClock). |
| src/controller/cdevents_test.rs | Updates tests to use the new EventSink trait + adds A/B concluded/deployed event coverage. |
| src/controller/cdevents.rs | Refactors to EventSink trait with HTTP and mock implementations; adds experiment conclusion event data. |
| deploy/crd.yaml | Regenerates CRD schema with A/B strategy + ports + new status fields/phases. |
| README.md | Updates feature matrix and counts (tests, Rust version, A/B testing, FALSE Protocol). |
| Cargo.toml | Adds ulid dependency for FALSE Protocol occurrence IDs. |
| Cargo.lock | Locks ulid and transitive dependencies. |
| CLAUDE.md | Updates repo/module structure docs and feature list/test counts. |
| .github/workflows/sykli-ci.yml | Reworks CI workflow and installs/runs Sykli. |
Comments suppressed due to low confidence (2)
src/controller/occurrence.rs:169
build_occurrence_typemaps strategy names using underscore variants (e.g. "blue_green", "ab_testing"), but the actual strategy names returned by handlers are hyphenated (e.g. "blue-green", "ab-testing"). This will produce unexpected FALSE Protocol types likeblue-green.rollout.completedinstead ofbluegreen.rollout.completed. Align the mapping (and the doc comment/tests) with the realRolloutStrategy::name()values.
/// Build the full occurrence type from strategy name and phase transition
///
/// Maps strategy names to FALSE Protocol type prefixes:
/// - "canary" → "canary.rollout.*"
/// - "blue_green" → "bluegreen.rollout.*"
/// - "ab_testing" → "abtesting.rollout.*"
/// - "simple" → "rolling.rollout.*"
fn build_occurrence_type(strategy: &str, old_phase: Option<&Phase>, new_phase: &Phase) -> String {
let prefix = match strategy {
"blue_green" => "bluegreen",
"ab_testing" => "abtesting",
"simple" => "rolling",
other => other, // "canary" passes through
};
let suffix = phase_to_occurrence_suffix(old_phase, new_phase);
format!("{}.rollout.{}", prefix, suffix)
src/controller/prometheus_ab.rs:210
determine_experiment_conclusionhas overlapping/likely-unreachable branches: thea_wins > 0 && b_wins == 0andb_wins > 0 && a_wins == 0cases already cover the scenario where all significant results agree, making the subsequenta_wins == significant_results.len() || b_wins == significant_results.len()branch redundant. Consider simplifying the logic to avoid confusion and ensure the chosenABConclusionReasonis intentional.
pub fn determine_experiment_conclusion(
results: &[ABMetricResult],
) -> Option<(ABVariant, ABConclusionReason)> {
if results.is_empty() {
return None;
}
// Check if all significant metrics agree on a winner
let significant_results: Vec<&ABMetricResult> =
results.iter().filter(|r| r.is_significant).collect();
if significant_results.is_empty() {
return None;
}
// Count winners
let mut a_wins = 0;
let mut b_wins = 0;
for result in &significant_results {
match &result.winner {
Some(ABVariant::A) => a_wins += 1,
Some(ABVariant::B) => b_wins += 1,
None => {}
}
}
// All significant metrics agree
if a_wins > 0 && b_wins == 0 {
Some((ABVariant::A, ABConclusionReason::ConsensusReached))
} else if b_wins > 0 && a_wins == 0 {
Some((ABVariant::B, ABConclusionReason::ConsensusReached))
} else if a_wins == significant_results.len() || b_wins == significant_results.len() {
let winner = if a_wins > b_wins {
ABVariant::A
} else {
ABVariant::B
};
Some((winner, ABConclusionReason::SignificanceReached))
} else {
None
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Follow-up to #26 (merged). Addresses test coverage gaps and doc cleanup:
Test plan
cargo test)cargo clippy -- -D warningscleancargo fmtclean🤖 Generated with Claude Code