feat: Add QA evaluation structured outputs for Starlight (Brent Council) #5
Open
roshan-vapi wants to merge 1 commit into main
Add 5 structured output YAML files for automated post-call QA evaluation of Brent Council Housing Benefits calls:
- starlight-qa-engagement.yml: 7 questions (3 auto-fail: 1.3, 1.4, 1.5)
- starlight-qa-right-first-time.yml: 8 questions (3 auto-fail: 2.3, 2.4, 2.5)
- starlight-qa-signposting.yml: 2 questions (no auto-fail)
- starlight-qa-explaining.yml: 2 questions (no auto-fail)
- starlight-wrap-up-code.yml: call classification into 19 wrap-up codes
Each QA structured output evaluates per-question with result (yes/no/not_applicable), reasoning, and transcript evidence.
Auto-fail logic: if ANY auto-fail question receives "no", the entire evaluation fails across all categories.
All outputs include multilingual transcript support, AI agent adaptation notes, and the full Brent Council Housing Benefits glossary.
Closes PRO-846
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Adds 5 structured output YAML files for automated post-call QA evaluation of Brent Council Housing Benefits calls (Starlight project).
Linear Issue
PRO-846
Files Created
- resources/structuredOutputs/starlight-qa-engagement.yml
- resources/structuredOutputs/starlight-qa-right-first-time.yml
- resources/structuredOutputs/starlight-qa-signposting.yml
- resources/structuredOutputs/starlight-qa-explaining.yml
- resources/structuredOutputs/starlight-wrap-up-code.yml
Schema Design
Each QA structured output produces per-question evaluations with:
- `result`: `yes` / `no` / `not_applicable`
- `reasoning`: explanation referencing the conversation
- `evidence`: array of `{ message_text, timestamp }` excerpts
Top-level fields:
- `auto_fail`: `true` if ANY auto-fail question received `no`
- `overall_pass`: `true` only if `auto_fail` is `false`
- `category_score`: fraction string, e.g. `"5/7"`
Auto-fail logic: If any auto-fail question in ANY of the 4 categories receives `no`, the ENTIRE call evaluation fails. Each structured output sets its own `auto_fail` flag; the consuming application must check across all 4.
Key Design Decisions
- `gpt-4.1` at `temperature: 0` for deterministic, accurate QA evaluation
- `not_applicable` guidance
- `assistant_ids: []`: Empty because Starlight assistant configs are not yet in the gitops repo; will be populated when they are added
- `secondary_classification_notes` field for pending tier definitions
Line Count Note
This PR is 778 lines, which exceeds the 500-line guideline. However, all additions are declarative YAML data files with repetitive per-question schema structure. The 5 files are logically atomic units that cannot be meaningfully split -- each represents a single structured output definition. No code was modified.
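Consumer-side sketch (illustrative only, not part of this PR): the cross-category auto-fail rule under Schema Design leaves the final check to the consuming application. The field names `auto_fail`, `overall_pass`, and `category_score` come from this description; the `evaluateCall` function, the `CallVerdict` shape, the `category` labels, and the array input are hypothetical. It also assumes a call passes only when no category auto-failed and every category reports `overall_pass: true`.

```typescript
// Per-category output. Field names follow the PR description;
// the `category` label is a hypothetical addition for reporting.
interface CategoryEvaluation {
  category: string;
  auto_fail: boolean;     // true if any auto-fail question in this category received "no"
  overall_pass: boolean;  // per the PR: true only if auto_fail is false
  category_score: string; // fraction string, e.g. "5/7"
}

// Hypothetical combined result for one call.
interface CallVerdict {
  autoFailed: boolean;
  failedCategories: string[];
  passed: boolean;
}

// Combine the per-category outputs: a single auto-fail in any category
// fails the whole call.
function evaluateCall(evaluations: CategoryEvaluation[]): CallVerdict {
  const failedCategories = evaluations
    .filter((evaluation) => evaluation.auto_fail)
    .map((evaluation) => evaluation.category);
  const autoFailed = failedCategories.length > 0;
  // Assumption: passing also requires every category to report overall_pass.
  const passed = !autoFailed && evaluations.every((evaluation) => evaluation.overall_pass);
  return { autoFailed, failedCategories, passed };
}

// Example with made-up results for the four QA categories.
const verdict = evaluateCall([
  { category: "engagement", auto_fail: false, overall_pass: true, category_score: "7/7" },
  { category: "right_first_time", auto_fail: true, overall_pass: false, category_score: "6/8" },
  { category: "signposting", auto_fail: false, overall_pass: true, category_score: "2/2" },
  { category: "explaining", auto_fail: false, overall_pass: true, category_score: "2/2" },
]);
console.log(verdict.passed, verdict.failedCategories);
```

Returning the list of failed categories, rather than a bare boolean, lets a reviewer see which structured output triggered the failure.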
How to Test
- `yaml` npm package
- `schema.type` is always a simple string (not an array) per AGENTS.md warning
- Run `npm run push:dev`, then verify structured outputs appear in the dashboard
Validation
- `name`, `type`, `target`, `description`, `model`, `schema`, `assistant_ids`, `workflow_ids`
- `schema.type` confirmed as simple string `"object"` in all files (avoids `.toLowerCase()` crash)
- `result`, `reasoning`, and `evidence` sub-properties
- `name` fields follow `snake_case` convention per AGENTS.md
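Illustrative check (not part of this PR): two of the validation points, that `schema.type` is a plain string and that `name` is snake_case, could be automated along these lines. The `ParsedStructuredOutput` shape, the `checkStructuredOutput` function, and the example `name` value are hypothetical; the sketch assumes each YAML file has already been parsed into an object.

```typescript
// Hypothetical minimal view of one parsed structured output file.
interface ParsedStructuredOutput {
  name: string;
  schema: { type: unknown };
}

// Returns a list of problems; an empty list means both checks passed.
function checkStructuredOutput(doc: ParsedStructuredOutput): string[] {
  const problems: string[] = [];
  // An array such as ["object", "null"] has no .toLowerCase(), which is
  // the crash the PR description refers to.
  if (typeof doc.schema.type !== "string") {
    problems.push(doc.name + ": schema.type must be a simple string");
  }
  // snake_case: lowercase letters and digits, separated by single underscores.
  if (!/^[a-z0-9]+(_[a-z0-9]+)*$/.test(doc.name)) {
    problems.push(doc.name + ": name is not snake_case");
  }
  return problems;
}

// Example with a made-up name.
const problems = checkStructuredOutput({
  name: "starlight_qa_engagement",
  schema: { type: "object" },
});
console.log(problems.length === 0 ? "ok" : problems.join("\n"));
```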