Skip to content

Validate nested response shapes (content blocks, tool_use, choices) #42

@stackbilt-admin

Description

@stackbilt-admin

Problem

PR #40 added envelope schema validation for top-level fields (`id`, `content`, `model`, `stop_reason`, `usage.*`). Element shapes inside arrays are not validated, and that's exactly where providers reshape most often.

Example — Anthropic `formatResponse`:

```ts
const textContent = data.content
.filter(block => block.type === 'text') // not validated
.map(block => block.text) // not validated
.join('');

const toolUses = data.content.filter(block => block.type === 'tool_use');
const raw: ToolCall[] = toolUses.map(tool => ({
id: tool.id!, // non-null assertion
type: 'function' as const,
function: {
name: tool.name!, // non-null assertion
arguments: JSON.stringify(tool.input)
}
}));
```

If Anthropic:

  • Renames `tool_use` blocks to `function_call` → `filter(block => block.type === 'tool_use')` returns `[]`, tools silently disappear from the response. No error. No hook. No fallback.
  • Moves `tool.id` to `tool.metadata.id` → `tool.id!` becomes `undefined`, passes validation, stringified as `'undefined'` into the downstream ToolCall. Broken tool execution with no signal.
  • Changes `block.text` to `block.content` → `textContent` is empty string, response looks like the model returned nothing. User thinks the model is broken.

The same pattern exists across all providers:

  • OpenAI: `choices[0].message.content`, `choices[0].message.tool_calls[*]`
  • Groq / Cerebras / Cloudflare: same OpenAI-compatible `choices` envelope

The PR #40 hole

The current `validateSchema` only walks top-level paths. When PR #40 claims "defense against silent API deprecations," this is the gap: the validator gives a false sense of security for the nested shapes where most drift actually happens in practice.

Proposal

Extend the schema validator with an `items` field for arrays:

```ts
interface SchemaField {
path: string;
type: SchemaFieldType;
optional?: boolean;
/**

  • For type: 'array' — validate each element against a conditional schema.
  • Element schemas can be discriminated by a field (e.g. content[].type)
  • so each variant gets its own validation.
    */
    items?: {
    discriminator?: string; // e.g. 'type'
    variants?: Record<string, SchemaField[]>; // 'text' | 'tool_use' -> per-variant fields
    shape?: SchemaField[]; // for non-discriminated arrays
    };
    }
    ```

Then Anthropic's schema becomes:

```ts
const ANTHROPIC_RESPONSE_SCHEMA: SchemaField[] = [
{ path: 'id', type: 'string' },
{
path: 'content',
type: 'array',
items: {
discriminator: 'type',
variants: {
text: [{ path: 'text', type: 'string' }],
tool_use: [
{ path: 'id', type: 'string' },
{ path: 'name', type: 'string' },
{ path: 'input', type: 'object' },
],
},
},
},
{ path: 'model', type: 'string' },
{ path: 'stop_reason', type: 'string' },
{ path: 'usage.input_tokens', type: 'number' },
{ path: 'usage.output_tokens', type: 'number' },
];
```

Unknown discriminator values (e.g. a new block type Anthropic adds) should be ignored, not rejected — we want to tolerate additive changes without triggering fallback on every new feature rollout.

Tests

  • Anthropic with reshaped `tool_use.id` → SchemaDriftError at `content[*].id`
  • Anthropic with renamed `tool_use` → `tool_use` type SchemaDriftError with actual = 'function_call'
  • OpenAI with missing `choices[0].message` → SchemaDriftError
  • Unknown content block type in Anthropic response → passes validation (forward-compat)
  • After fix, the non-null assertions in `formatResponse` can be replaced with schema-checked access

Priority

HIGH. This is the main review finding from PR #40. Without it, the drift shield is incomplete for the case that actually matters most.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions