test: Establish test safety net for controller2 refactoring#23
Draft
anj-s wants to merge 4 commits into
Draft
Conversation
This change implements the integration of the Antigravity agent as a harness in AX Controller V2, satisfying the roadmap item for built-in harness support. Key changes: 1. Modified the example Antigravity agent in examples/antigravity_agent to accept prompts via command-line arguments, enabling dynamic interaction. 2. Implemented AntigravityHarness in Go (internal/harness/antigravity.go) which runs the Python agent as a subprocess, captures stdout, and streams the response. Added a TODO to transition this to a gRPC server to avoid subprocess overhead in the future. 3. Created BuildHarness in internal/controller2/builder.go to construct the appropriate harness. It performs environment checks (python3 availability and script existence) and falls back to the test harness (harnesstest) if the environment is not ready. 4. Updated Controller V2 (internal/controller2/controller.go) to use BuildHarness based on the requested AgentId, and made the Antigravity script path configurable. 5. Added a unit test in internal/controller2/controller_test.go to verify the fallback behavior when a non-existent script path is configured. TAG=agy CONV=7b9cc74f-9324-48ed-b259-e9d751866419
Based on user feedback, refactored the harness creation to use a Builder pattern. Key changes: 1. Introduced `harness.HarnessConfig` to hold configuration options for various harnesses. 2. Implemented `harness.AntigravityHarnessBuilder` which builds `AntigravityHarness` using the config. 3. Updated `BuildHarness` in `controller2` to accept `HarnessConfig` and use the builder. 4. Updated `Controller V2` and its unit tests to use the new `HarnessConfig` structure. TAG=agy CONV=7b9cc74f-9324-48ed-b259-e9d751866419
This change implements the feedback to make Registry the Single Source of Truth (SSOT) for both agents and harnesses, and passes the Registry as an input to the Controller (Dependency Injection). Key changes: 1. Updated `internal/config/config.go` to support `Harnesses` configurations in `RegistryConfig` (parsed from `ax.yaml`). 2. Updated `Registry` in `internal/controller2/registry.go` to support registering and retrieving `Harness` instances. 3. Refactored `Controller V2` (`internal/controller2/controller.go`) to receive `Registry` as an input Config, removing internal `HarnessConfig` mapping. 4. Updated `Controller.Exec` to retrieve the harness from the `Registry` using `AgentId`, falling back to the test harness at runtime if not found. 5. Updated all unit tests (`controller_test.go`, `fork_test.go`) to adapt to the new `Registry` injection. Added a new runtime fallback test. 6. Created `e2e.go` in the repository root to demonstrate all three execution paths: Runtime Fallback, Build-time Fallback, and Antigravity Subprocess Execution. TAG=agy CONV=7b9cc74f-9324-48ed-b259-e9d751866419
- Make harnesstest configurable by adding DefaultRunFunc to Harness and RunFunc to Execution. - Refactor controller2 to use HarnessBuilder in Config, enabling registry-based harness creation while maintaining testability. - Migrate comprehensive test suite from controller to controller2_test.go, skipping unimplemented features to document gaps. TAG=agy CONV=6e584df6-30c8-484d-946b-36f477b364d7
cc8a3a2 to
0dd5a31
Compare
e5d6fee to
7b7d8bd
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Context & Goal
As part of the ongoing Substrate refactoring (initiated in PR #22), a new controller version (
internal/controller2) is being introduced to replace the oldcontrollerandexecutorpackages. However, V2 currently lacks robust tests and key orchestrator features, risking regression during its incremental development.This PR establishes a comprehensive Test Safety Net for
controller2by migrating the entire, robust test suite from the old controller, adapting it to V2's new Registry-based harness lookup design, and clearly mapping out V2's feature gaps using skipped tests.Key Changes
1. Configurable Mock Harness (
internal/harness/harnesstest/harnesstest.go)DefaultRunFunctoHarnessandRunFunctoExecutionto allow tests to inject custom, dynamic run behaviors (e.g., mock confirmation questions, specific text responses, or errors).2. Registry-based Test Injection (
internal/controller2/controller_test.go)Registry.mockHarnessin theRegistry(reg.RegisterHarness("mock-agent", mockHarness)) and request it viaAgentId: "mock-agent". This avoids any temporary dependency-injection clutter in production code.3. Test Suite Migration & Gap Mapping
t.Skip()), establishing a clear, compiler-enforced map of remaining work to reach parity.TestController2_Exec_ResumptionAndIDGeneration(Event Logging & Resumption gap)TestController2_Exec_LastSeq_Empty(History Playback gap)TestController2_Exec_LastSeq(History Playback gap)TestController2_Exec_LastSeq_NotFound(History Playback gap)TestController2_Exec_WaitsForConfirmation(Confirmation Handling gap)TestController2_Exec_InternalOnly(InternalOnly Filtering gap)Verification
internal/controller2/...compile and run cleanly.