Context
Week 8 from taskdeck-12-week-roadmap-v4.md.
Parent: #972
Depends on: #976, #979
This issue expands evaluation and privacy controls around the new proposal pipeline. It treats exfiltration safety as separate from mutation safety.
Scope
Expand the golden dataset with clarification and safety/refusal cases.
Add Microsoft.Extensions.AI.Evaluation or a documented fallback integrated with dotnet test.
Add prompt regression with promptfoo using --no-share and PR-visible diffs.
Add a WireMock.Net MITM integration test for the full capture -> proposal -> agent flow that fails on outbound hosts outside EgressEnvelope.
Build the disclosure registry/source-generation path for Settings -> Where your data goes.
Add TelemetryGuard.Validate at emit and export boundaries with allowlist and fuzz rejection tests.
Add local Insights metrics for proposal acceptance/edit/reject cohorts without storing user content.
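The core assertion of the egress MITM test can be sketched in a few lines. This is a language-neutral illustration in Python, not the WireMock.Net test itself. The issue names EgressEnvelope but does not define its shape, so the `ALLOWED_HOSTS` set, the example hostnames, and the `hosts_outside_envelope` helper are all hypothetical. The idea is that the proxy records every outbound URL during the capture -> proposal -> agent run, and the test fails if any recorded host falls outside the envelope.

```python
from urllib.parse import urlsplit

# Hypothetical stand-in for EgressEnvelope: the issue does not specify its
# shape, so a plain set of allowed hostnames is assumed here.
ALLOWED_HOSTS = frozenset({"api.example-llm.test", "localhost"})


def host_of(url: str) -> str:
    """Extract the lowercase hostname from a recorded outbound URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    return host.lower()


def hosts_outside_envelope(recorded_urls, allowed_hosts=ALLOWED_HOSTS):
    """Return the sorted hosts seen by the proxy that are not in the envelope."""
    seen = {host_of(url) for url in recorded_urls}
    return sorted(seen - set(allowed_hosts))


# Example: URLs a MITM proxy might have recorded during one test run.
recorded = [
    "https://api.example-llm.test/v1/chat",
    "http://localhost:5000/health",
    "https://attacker.invalid/collect?d=secret",
]
violations = hosts_outside_envelope(recorded)
# The integration test would fail whenever `violations` is non-empty.
```

Comparing on hostname alone keeps the check independent of path and query string, so a payload smuggled into a query parameter still trips the test as long as the destination host is not in the envelope.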
Acceptance Criteria
Eval harness includes happy-path, clarification, refusal, safety, and prompt-injection cases.
Prompt regression runs locally and in CI without sharing data externally.
Egress MITM test fails on an attempted attacker host and passes for configured envelope entries.
Where-your-data-goes registry enumerates every outbound site, its payload category, and the tool or agent that uses it.
Telemetry guard rejects long strings, URLs, email-like strings, unknown keys, and non-finite metrics.
Local Insights shows content-free acceptance/edit/reject trends by prompt version.
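The telemetry guard criterion above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual TelemetryGuard.Validate implementation. The allowlisted keys, the 64-character limit, and the URL and email patterns are assumptions chosen for illustration; the issue does not specify them.

```python
import math
import re

# Hypothetical allowlist and length limit; the issue does not define these.
ALLOWED_KEYS = frozenset({"event", "prompt_version", "outcome", "latency_ms"})
MAX_STRING_LENGTH = 64

# Deliberately broad patterns: a guard should prefer false rejections
# over letting user content leak into telemetry.
URL_PATTERN = re.compile(r"[a-z][a-z0-9+.-]*://|www\.", re.IGNORECASE)
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate(payload: dict) -> list[str]:
    """Return a list of rejection reasons; an empty list means the payload passes."""
    errors = []
    for key, value in payload.items():
        if key not in ALLOWED_KEYS:
            errors.append(f"unknown key: {key}")
        elif isinstance(value, (bool, int)):
            continue  # booleans and integers are always finite
        elif isinstance(value, float):
            if not math.isfinite(value):
                errors.append(f"non-finite metric: {key}")
        elif isinstance(value, str):
            if len(value) > MAX_STRING_LENGTH:
                errors.append(f"string too long: {key}")
            elif URL_PATTERN.search(value):
                errors.append(f"url-like string: {key}")
            elif EMAIL_PATTERN.search(value):
                errors.append(f"email-like string: {key}")
        else:
            errors.append(f"unsupported type: {key}")
    return errors
```

Error messages name the offending key but never echo the value, so a rejected payload cannot leak content through logs. A fuzz rejection test could feed randomly generated strings and numbers through `validate` and assert that anything resembling a URL, an email address, or a non-finite number yields a non-empty result.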
Suggested Verification
Run dotnet test with the eval, egress, and telemetry filters
Run promptfoo locally with --no-share
Frontend tests for Settings disclosure and Insights displays