feat(timmy): structured-output + scoped tool dispatch + URL egress for Timmy tool calls (T13 part 2) #384

@ericfitz

Threat reference

T13 (Prompt injection via attacker-hosted document content) — see docs/THREAT_MODEL.md §4.

Context

This is the remainder of #353 (closed). Part 1 — per-threat-model retrieval isolation + system-prompt + fences for untrusted content — landed in bcbbb37. The mitigations below were deferred because Timmy's LangChainGo config does not yet wire any tools, so there is no tool dispatcher to harden.

What's still missing

When Timmy starts using LLM tool calls (the function-calling / tool_use API path), three guards apply:

  1. Tightly typed tool schemas. Every tool exposed to the model must have a JSON Schema that sets additionalProperties to false and uses explicit enums for every field with a closed value set. The dispatcher rejects any tool call whose arguments do not validate against the schema.
  2. URL egress through the SafeHTTPClient. Any tool argument that is a URL (e.g., for a fetch tool) goes through the same SSRF/egress allowlist as user-supplied URLs (T3; see #364, "feat(api): migrate legacy direct http.Client callers to SafeHTTPClient", for the legacy http.Client migration). Tools must not call http.DefaultClient.
  3. Invoker-scoped tool authorization. Tools that touch the database use the invoker's effective permissions, not Timmy's service identity — i.e., go through the same access-check helper that the OpenAPI handlers use, not directly through the GORM store.
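
As a rough illustration of guard 1, the sketch below shows one way a dispatcher could reject malformed arguments before a tool runs. It uses only the standard library as a stand-in for a real JSON Schema validator: `DisallowUnknownFields` approximates `additionalProperties: false`, and a lookup set approximates an enum. The `fetchArgs` struct, the `format` field, and `validateFetchArgs` are hypothetical names invented for this example; they are not part of Timmy or LangChainGo.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// fetchArgs is a hypothetical argument shape for an illustrative "fetch" tool.
type fetchArgs struct {
	URL    string `json:"url"`
	Format string `json:"format"` // closed value set, see allowedFormats
}

// allowedFormats plays the role of a JSON Schema enum (assumed values).
var allowedFormats = map[string]bool{"text": true, "markdown": true}

// validateFetchArgs strictly decodes raw tool-call arguments and rejects
// unknown fields, wrong types, trailing data, and out-of-enum values.
func validateFetchArgs(raw []byte) (fetchArgs, error) {
	var args fetchArgs
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields() // approximates additionalProperties: false
	if err := dec.Decode(&args); err != nil {
		return fetchArgs{}, fmt.Errorf("invalid arguments: %w", err)
	}
	// Anything other than end of input after the object is an error.
	if _, err := dec.Token(); err != io.EOF {
		return fetchArgs{}, errors.New("invalid arguments: trailing data")
	}
	if args.URL == "" {
		return fetchArgs{}, errors.New("invalid arguments: url is required")
	}
	if !allowedFormats[args.Format] {
		return fetchArgs{}, fmt.Errorf("invalid arguments: format %q not in enum", args.Format)
	}
	return args, nil
}

func main() {
	cases := []string{
		`{"url":"https://docs.example.com/a","format":"text"}`,
		`{"url":"https://docs.example.com/a","format":"text","extra":1}`,
		`{"url":42,"format":"text"}`,
		`{"url":"https://docs.example.com/a","format":"pdf"}`,
	}
	for _, c := range cases {
		_, err := validateFetchArgs([]byte(c))
		fmt.Printf("%s -> err=%v\n", c, err)
	}
}
```

Only the first case should be accepted. One caveat of this stand-in: `encoding/json` matches field names case-insensitively, so a dedicated JSON Schema validator would be stricter than this sketch.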

Acceptance criteria

  • A schema-validation unit test that rejects a synthetic tool call with extra fields, wrong types, or out-of-enum values.
  • A red-team prompt-injection integration test: an attacker-supplied document chunk asks Timmy to fetch https://evil.example/exfil?..., and the request is blocked at egress, not at the LLM.
  • A test that a tool call attempting to read a threat model the invoker does not have reader access to returns the same authorization decision as a direct GET on that resource.

Effort

S–M (depends on which tool surface ships first).
