feat: A/B testing, FALSE Protocol, and architectural improvements #26
Merged
Replace pinned v0.1.3 binary with install.sh for always-latest. Simplify workflow: let sykli auto-detect sykli.rs instead of manual cargo build + --emit pipe. Drop develop branch trigger.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement full A/B testing deployment strategy:
- ABStrategy with reconcile_replicasets, reconcile_traffic, compute_next_status
- Z-test for proportions in prometheus_ab.rs (statistical significance)
- CRD types: ABStrategy, ABExperimentStatus, ABMetricResult, ABVariant enum
- Configurable port, traffic headers, confidence level, experiment duration
- 50+ new tests covering strategy, CRD, and statistical analysis
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
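For readers unfamiliar with the statistic named in this commit, here is a minimal sketch of a two-sided, pooled two-proportion z-test. It is not the code in prometheus_ab.rs: the function name, the `ZTestResult` struct, the local `ABVariant` definition, and the "higher proportion wins" rule are all assumptions made for illustration. The critical value is passed in by the caller (1.96 corresponds to 95% two-sided confidence).

```rust
/// Hypothetical stand-in for the PR's ABVariant enum: a typed winner
/// instead of a free-form string such as "A" or "variant-b".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ABVariant {
    A,
    B,
}

/// Hypothetical result type for this sketch (not taken from the PR).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ZTestResult {
    pub z_score: f64,
    pub significant: bool,
    /// Set only when the difference is significant.
    /// Assumption: the higher proportion is the better outcome.
    pub winner: Option<ABVariant>,
}

/// Two-sided z-test for the difference of two proportions, using the
/// pooled standard error. Returns None when a sample is empty, the counts
/// are inconsistent, or the pooled variance is zero.
pub fn two_proportion_z_test(
    successes_a: u64,
    total_a: u64,
    successes_b: u64,
    total_b: u64,
    z_critical: f64,
) -> Option<ZTestResult> {
    if total_a == 0 || total_b == 0 || successes_a > total_a || successes_b > total_b {
        return None;
    }
    let (n_a, n_b) = (total_a as f64, total_b as f64);
    let p_a = successes_a as f64 / n_a;
    let p_b = successes_b as f64 / n_b;

    // Pooled proportion under the null hypothesis p_a == p_b.
    let pooled = (successes_a + successes_b) as f64 / (n_a + n_b);
    let std_err = (pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b)).sqrt();
    if std_err == 0.0 {
        return None;
    }

    let z_score = (p_b - p_a) / std_err;
    let significant = z_score.abs() >= z_critical;
    let winner = match (significant, z_score > 0.0) {
        (false, _) => None,
        (true, true) => Some(ABVariant::B),
        (true, false) => Some(ABVariant::A),
    };
    Some(ZTestResult { z_score, significant, winner })
}
```

Returning `Option` keeps degenerate inputs (no traffic yet, or identical all-zero counts) from being reported as a conclusive result.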
Replace #[cfg(test)] coupling with proper trait abstractions:
- Clock trait with SystemClock (production) and MockClock (test)
- EventSink trait replacing CDEventsSink's #[cfg(test)] fields
- MetricsQuerier trait replacing PrometheusClient's #[cfg(test)] fields
- Update Context to hold Arc<dyn Trait> for all dependencies
- Wire SystemClock and production implementations in main.rs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
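The injection pattern described in this commit can be sketched roughly as follows. The type names `Clock`, `SystemClock`, `MockClock`, and `Context` come from the commit message, but their fields and methods here are assumptions: the sketch uses `std::time::SystemTime` to stay dependency-free, and `experiment_expired` is a hypothetical helper added only to show a caller.

```rust
use std::sync::{Arc, Mutex};
use std::time::{Duration, SystemTime};

/// Source of "now" that production and test code can both implement.
pub trait Clock: Send + Sync {
    fn now(&self) -> SystemTime;
}

/// Production clock: delegates to the operating system.
pub struct SystemClock;

impl Clock for SystemClock {
    fn now(&self) -> SystemTime {
        SystemTime::now()
    }
}

/// Test clock: time only moves when the test advances it.
pub struct MockClock {
    now: Mutex<SystemTime>,
}

impl MockClock {
    pub fn new(start: SystemTime) -> Self {
        Self { now: Mutex::new(start) }
    }

    pub fn advance(&self, by: Duration) {
        let mut now = self.now.lock().expect("mock clock mutex poisoned");
        *now += by;
    }
}

impl Clock for MockClock {
    fn now(&self) -> SystemTime {
        *self.now.lock().expect("mock clock mutex poisoned")
    }
}

/// Assumed shape: the real Context holds more dependencies than this.
pub struct Context {
    pub clock: Arc<dyn Clock>,
}

/// Hypothetical helper: has the experiment run for at least `max_duration`?
pub fn experiment_expired(ctx: &Context, started_at: SystemTime, max_duration: Duration) -> bool {
    ctx.clock
        .now()
        .duration_since(started_at)
        .map(|elapsed| elapsed >= max_duration)
        .unwrap_or(false) // start time in the future: not expired
}
```

Because `Context` only sees `Arc<dyn Clock>`, the same reconcile code runs against `SystemClock` in main.rs and `MockClock` in tests, with no `#[cfg(test)]` fields on the struct.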
Add AI-native observability via FALSE Protocol:
- Occurrence struct with Error, Reasoning, History blocks
- Strategy-aware types (canary.rollout.*, bluegreen.rollout.*, etc.)
- Configurable output dir via KULTA_OCCURRENCE_DIR env var
- 10MB file size cap with truncation
- Graceful handling of missing metadata, non-fatal write failures
- 10 tests covering serialization, phase mapping, edge cases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
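One way the size cap and non-fatal writes listed above could fit together is sketched below. This is an illustration, not the code in occurrence.rs: the function names are hypothetical, the interpretation of "truncation" (start the file over once the next line would exceed the cap) is an assumption, and `eprintln!` stands in for whatever warning log the controller actually emits. Only the `KULTA_OCCURRENCE_DIR` variable name and the 10MB figure come from the commit message.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// 10MB cap from the commit message.
pub const MAX_FILE_BYTES: u64 = 10 * 1024 * 1024;

/// Output directory override, if the environment variable is set.
pub fn occurrence_dir_override() -> Option<PathBuf> {
    std::env::var_os("KULTA_OCCURRENCE_DIR").map(PathBuf::from)
}

/// Append one line; if the file would grow past `max_bytes`, truncate it
/// first so the new line becomes the only content (assumed behaviour).
pub fn append_capped(path: &Path, line: &str, max_bytes: u64) -> io::Result<()> {
    // A missing file counts as empty.
    let current = fs::metadata(path).map(|m| m.len()).unwrap_or(0);
    let incoming = line.len() as u64 + 1; // +1 for the trailing newline

    let mut opts = OpenOptions::new();
    opts.create(true);
    if current + incoming > max_bytes {
        opts.write(true).truncate(true);
    } else {
        opts.append(true);
    }
    let mut file = opts.open(path)?;
    writeln!(file, "{line}")
}

/// Non-fatal wrapper: a failed write is reported and swallowed so it
/// cannot fail the reconcile loop. Returns whether the write succeeded.
pub fn emit_non_fatal(path: &Path, line: &str) -> bool {
    match append_capped(path, line, MAX_FILE_BYTES) {
        Ok(()) => true,
        Err(err) => {
            // Stand-in for a structured warning log.
            eprintln!("warn: failed to write occurrence to {}: {err}", path.display());
            false
        }
    }
}
```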
Break the reconciliation god file into a module directory:
- reconcile.rs: main reconcile loop, Context, A/B experiment evaluation
- replicaset.rs: RS building (consolidated into build_replicaset_core), hashing
- status.rs: phase state machine, status computation, requeue intervals
- traffic.rs: Gateway API HTTPRoute weight management
- validation.rs: rollout spec validation, duration parsing
- rollout.rs becomes thin re-export preserving the public API
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
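As an example of the kind of small, testable function that fits in validation.rs after the split, here is a sketch of duration parsing. The accepted format (a non-negative integer followed by `s`, `m`, or `h`, such as `30s` or `5m`) and the name `parse_duration` are assumptions; the PR does not show the actual grammar.

```rust
use std::time::Duration;

/// Parse strings like "30s", "5m", or "2h" into a Duration.
/// Returns None for empty input, missing digits, unknown units,
/// signs, fractions, or values that overflow u64 seconds.
pub fn parse_duration(input: &str) -> Option<Duration> {
    let input = input.trim();
    let unit = input.chars().last()?;
    let digits = &input[..input.len() - unit.len_utf8()];

    // Reject "", "-5", "1.5", and anything else that is not plain digits.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let value: u64 = digits.parse().ok()?;

    let secs_per_unit = match unit {
        's' => 1,
        'm' => 60,
        'h' => 3600,
        _ => return None,
    };
    value.checked_mul(secs_per_unit).map(Duration::from_secs)
}
```

Returning `Option` rather than panicking lets spec validation reject a malformed duration with a clear error before the reconcile loop ever uses it.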
- Update canary, blue_green, simple strategies for Clock + trait injection
- Add A/B ReplicaSet creation tests, occurrence edge case tests
- Update integration and stress tests for new Context construction
- Fix Prometheus error handling (replace unwrap_or with proper warn + return)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
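The last item, replacing `unwrap_or` with a warning and an early return, can be illustrated with the sketch below. Everything here is hypothetical except the `MetricsQuerier` name: the trait method, `MetricsError`, `Decision`, `MockQuerier`, and `evaluate` are invented for the example, and `eprintln!` stands in for a real warning log. The point is that a failed query should not be silently read as "zero errors".

```rust
/// Hypothetical error type for a failed metrics query.
#[derive(Debug, Clone, PartialEq)]
pub enum MetricsError {
    QueryFailed(String),
}

/// Assumed shape of the querier trait; the real one may differ.
pub trait MetricsQuerier {
    fn error_rate(&self, revision: &str) -> Result<f64, MetricsError>;
}

/// Hypothetical outcome of one evaluation step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Promote,
    Rollback,
    Requeue,
}

/// Test double that returns a canned result.
pub struct MockQuerier(pub Result<f64, String>);

impl MetricsQuerier for MockQuerier {
    fn error_rate(&self, _revision: &str) -> Result<f64, MetricsError> {
        self.0.clone().map_err(MetricsError::QueryFailed)
    }
}

/// Before (problematic): `querier.error_rate(rev).unwrap_or(0.0)` treats an
/// unreachable Prometheus as a perfectly healthy revision.
/// After: warn and return early so the rollout is retried later.
pub fn evaluate(querier: &dyn MetricsQuerier, revision: &str, threshold: f64) -> Decision {
    let rate = match querier.error_rate(revision) {
        Ok(rate) => rate,
        Err(err) => {
            // Stand-in for a structured warning log.
            eprintln!("warn: metrics query failed for {revision}: {err:?}");
            return Decision::Requeue;
        }
    };
    if rate > threshold {
        Decision::Rollback
    } else {
        Decision::Promote
    }
}
```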
Pull request overview
This PR introduces A/B testing strategy with statistical significance analysis, FALSE Protocol observability, and significant architectural improvements including trait-based dependency injection replacing #[cfg(test)] coupling.
Changes:
- A/B testing strategy with Z-test statistical analysis, header/cookie-based routing, and experiment lifecycle management
- FALSE Protocol occurrences for AI-native observability with strategy-aware event types
- Clock/EventSink/MetricsQuerier trait abstractions replacing conditional compilation
- Rollout module split from 2137 lines into 5 focused modules (reconcile, replicaset, status, traffic, validation)
- Consolidated ReplicaSet builders and proper error handling
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/crd/rollout.rs | Added A/B testing strategy types, new phases (Experimenting/Concluded), ABExperimentStatus |
| src/controller/strategies/ab_testing.rs | Complete A/B testing strategy implementation with HTTPRoute rules and ReplicaSet management |
| src/controller/prometheus_ab.rs | Z-test statistical analysis for A/B experiment evaluation |
| src/controller/rollout/reconcile.rs | Main reconcile logic extracted to focused module with A/B experiment evaluation |
| src/controller/rollout/replicaset.rs | ReplicaSet building logic extracted and consolidated |
| src/controller/rollout/status.rs | Status computation logic with clock dependency |
| src/controller/rollout/traffic.rs | Traffic routing logic extraction |
| src/controller/rollout/validation.rs | Validation logic extraction |
| src/controller/occurrence.rs | FALSE Protocol occurrence emission |
| src/controller/clock.rs | Clock trait abstraction for testable time |
| src/controller/prometheus.rs | MetricsQuerier trait with Http/Mock implementations |
| src/controller/cdevents.rs | EventSink trait with Http/Mock implementations |
| src/main.rs | Updated to use new trait-based types |
| deploy/crd.yaml | CRD schema updated with A/B testing fields |
| tests/* | Test fixtures updated with ab_testing: None and port: None fields |
kube 2.0.1 requires Rust edition 2024, which is only available in Cargo 1.85+. The CI was using rust:1.83.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- FALSE Protocol occurrences with strategy-aware types (`canary.rollout.*`, `bluegreen.rollout.*`, etc.) with Error/Reasoning/History blocks
- Replace `#[cfg(test)]` coupling with `EventSink`, `MetricsQuerier`, and `Clock` traits
- Split `rollout.rs` (2137 lines) into 5 focused modules: reconcile, replicaset, status, traffic, validation
- `ABVariant` enum replacing stringly-typed winners, consolidated ReplicaSet builders (~180 lines saved), proper Prometheus error handling

Test plan
- Tests pass (`cargo test`)
- `cargo clippy -- -D warnings` clean
- `cargo fmt` clean
- No `.unwrap()` in production code
- No `println!` in production code

🤖 Generated with Claude Code