Create schemas library #619

@tombeckenham

Problem

The existing per-provider model-meta.ts files encode coarse model facts (context windows, modalities, pricing) but do not describe the rich per-endpoint / per-model constraint surface — e.g. which video durations Veo3 accepts, which image sizes Flux2 accepts, prompt length caps per endpoint, valid voice IDs for ElevenLabs.

Type safety covers one axis; runtime validation is another. Schemas go beyond both: they let runtime code and LLMs shape data the way each model requires, let video creators determine which durations a model accepts, expose maximum prompt lengths, and provide a basis for adapting a prompt written for one model to another.
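To make the kinds of questions above concrete, here is a minimal sketch of reading constraints out of a JSON Schema. The schema fragment, its field names and values, and the helpers `allowedValues` and `checkInput` are all hypothetical; they are not taken from any provider's real spec or from this package.

```typescript
// Hypothetical schema fragment; field names and values are illustrative only.
type JsonSchema = {
  type?: string
  enum?: ReadonlyArray<string | number>
  maxLength?: number
  properties?: Record<string, JsonSchema>
  required?: ReadonlyArray<string>
}

const exampleVideoInput: JsonSchema = {
  type: 'object',
  required: ['prompt'],
  properties: {
    prompt: { type: 'string', maxLength: 2000 },
    duration: { type: 'string', enum: ['4s', '6s', '8s'] },
  },
}

// Allowed values for a field, or undefined when the schema does not restrict it.
// A UI could feed this straight into a dropdown.
function allowedValues(
  schema: JsonSchema,
  field: string,
): ReadonlyArray<string | number> | undefined {
  return schema.properties?.[field]?.enum
}

// Collect human-readable violations for required, enum and maxLength only.
// A real validator (for example Ajv) covers far more keywords than this.
function checkInput(schema: JsonSchema, input: Record<string, unknown>): string[] {
  const problems: string[] = []
  for (const name of schema.required ?? []) {
    if (!(name in input)) problems.push(`${name} is required`)
  }
  for (const [name, prop] of Object.entries(schema.properties ?? {})) {
    const value = input[name]
    if (value === undefined) continue
    if (prop.enum && !prop.enum.includes(value as string | number)) {
      problems.push(`${name} must be one of ${prop.enum.join(', ')}`)
    }
    // JSON Schema counts maxLength in characters (code points), not UTF-16 units.
    if (
      typeof value === 'string' &&
      prop.maxLength !== undefined &&
      Array.from(value).length > prop.maxLength
    ) {
      problems.push(`${name} exceeds ${prop.maxLength} characters`)
    }
  }
  return problems
}

console.log(allowedValues(exampleVideoInput, 'duration'))
console.log(checkInput(exampleVideoInput, { prompt: 'a cat', duration: '10s' }))
```

The same lookup works against any self-contained schema, which is why shipping plain JSON Schema alongside the Zod form is useful for non-validation consumers such as form builders.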

Solution

A new package @tanstack/ai-schemas, separate from the main client libraries, that ships:

  • JSON Schema definitions per (provider, category, endpoint) — self-contained, $defs-bundled, ready for LLM tool APIs / Ajv / dropdown rendering / cross-model comparison.
  • Zod equivalents (optional zod ^4 peer dep) for runtime validation before requests hit the network.
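"Self-contained, $defs-bundled" can be read as: each emitted schema carries exactly the definitions it transitively references. The sketch below shows one way to compute that closure. The function names `collectRefs` and `bundleDefs` are hypothetical, and the code assumes every reference has the local form `#/$defs/<name>` with no JSON Pointer escaping.

```typescript
type Schema = { [key: string]: unknown }

const DEFS_PREFIX = '#/$defs/'

// Recursively gather every "#/$defs/<name>" reference reachable from a node.
function collectRefs(node: unknown, found: Set<string>): void {
  if (Array.isArray(node)) {
    for (const item of node) collectRefs(item, found)
  } else if (node !== null && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key === '$ref' && typeof value === 'string' && value.startsWith(DEFS_PREFIX)) {
        found.add(value.slice(DEFS_PREFIX.length))
      } else {
        collectRefs(value, found)
      }
    }
  }
}

// Return a copy of `root` carrying only the definitions it transitively uses.
function bundleDefs(root: Schema, allDefs: Record<string, Schema>): Schema {
  const needed = new Map<string, Schema>()
  const direct = new Set<string>()
  collectRefs(root, direct)
  const queue: string[] = Array.from(direct)

  while (queue.length > 0) {
    const name = queue.pop() as string
    if (needed.has(name)) continue // already visited; also guards against cycles
    const def = allDefs[name]
    if (def === undefined) throw new Error(`Missing definition: ${name}`)
    needed.set(name, def)
    const nested = new Set<string>()
    collectRefs(def, nested)
    queue.push(...Array.from(nested))
  }

  if (needed.size === 0) return { ...root }
  // Sort names so the generated output is stable between runs.
  const $defs: Record<string, Schema> = {}
  for (const name of Array.from(needed.keys()).sort()) {
    $defs[name] = needed.get(name) as Schema
  }
  return { ...root, $defs }
}

// Usage: "Unused" is dropped, "Size" is pulled in through "Image".
const sharedDefs: Record<string, Schema> = {
  Size: { type: 'string', enum: ['square', 'landscape'] },
  Image: { type: 'object', properties: { size: { $ref: '#/$defs/Size' } } },
  Unused: { type: 'number' },
}
const bundled = bundleDefs(
  { type: 'object', properties: { image: { $ref: '#/$defs/Image' } } },
  sharedDefs,
)
console.log(Object.keys(bundled['$defs'] as object))
```

Sorting the definition names is a design choice aimed at the nightly workflow: stable key order keeps the generated diff limited to real upstream changes.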

Sources & automation

Every schema is generated from the provider's official OpenAPI / equivalent spec — nothing hand-maintained. A nightly CI workflow re-fetches the upstream specs and opens a PR with the diff.

Provider sources:

  • FAL: https://api.fal.ai/v1/models?status=active&expand=openapi-3.0 (per-model OpenAPI, paginated, needs FAL_KEY).
  • OpenAI: https://raw.githubusercontent.com/openai/openai-openapi/master/openapi.yaml (public).
  • Anthropic: official OpenAPI from the anthropic-sdk-typescript repo.
  • Gemini: https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta (Google Discovery doc).
  • ElevenLabs: https://api.elevenlabs.io/openapi.json (public).

OpenAI-compatible providers (Groq, xAI/Grok) reuse OpenAI's spec.
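The source list above could be captured as a small typed registry that a multi-provider fetcher iterates over. This is only a sketch: the `SpecSource` shape, the `SOURCES` and `ALIASES` tables, and `resolveSource` are hypothetical names, and the Anthropic entry has no URL because the issue names a repository rather than a file.

```typescript
type SpecFormat = 'openapi' | 'google-discovery'
type SourceId = 'fal' | 'openai' | 'anthropic' | 'gemini' | 'elevenlabs'

interface SpecSource {
  format: SpecFormat
  // URL as listed in the issue; omitted where only a repository is named.
  url?: string
  // Environment variable holding the API key, when the endpoint needs one.
  authEnv?: string
  note?: string
}

const SOURCES: Record<SourceId, SpecSource> = {
  fal: {
    format: 'openapi',
    url: 'https://api.fal.ai/v1/models?status=active&expand=openapi-3.0',
    authEnv: 'FAL_KEY',
    note: 'per-model OpenAPI, paginated',
  },
  openai: {
    format: 'openapi',
    url: 'https://raw.githubusercontent.com/openai/openai-openapi/master/openapi.yaml',
  },
  anthropic: {
    format: 'openapi',
    note: 'official OpenAPI from the anthropic-sdk-typescript repo',
  },
  gemini: {
    format: 'google-discovery',
    url: 'https://generativelanguage.googleapis.com/$discovery/rest?version=v1beta',
  },
  elevenlabs: {
    format: 'openapi',
    url: 'https://api.elevenlabs.io/openapi.json',
  },
}

// OpenAI-compatible providers reuse OpenAI's spec instead of fetching their own.
const ALIASES = new Map<string, SourceId>([
  ['groq', 'openai'],
  ['xai', 'openai'],
])

function isSourceId(id: string): id is SourceId {
  return Object.prototype.hasOwnProperty.call(SOURCES, id)
}

// Map a provider name to the spec source that should be fetched for it.
function resolveSource(provider: string): SpecSource {
  const id = ALIASES.get(provider) ?? provider
  if (!isSourceId(id)) throw new Error(`No spec source registered for ${provider}`)
  return SOURCES[id]
}

console.log(resolveSource('groq').url)
```

Keeping the alias table separate from the sources means an OpenAI-compatible provider can later get its own entry without touching the fetcher.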

Prior art to port

The architecture is established in fal-ai/fal-js PR #212 (feat(schemas): @fal-ai/schemas Zod + JSON Schema library). This issue ports that pipeline into the TanStack monorepo and extends the same OpenAPI-driven approach to every provider above.

Key components ported verbatim or generalised:

  • scripts/fetch-openapi-models.ts → multi-provider fetch-schemas.ts.
  • scripts/merge-openapi-specs.ts → provider-aware preprocessor (schema renames, missing-$ref injection, default coercion).
  • scripts/generate-endpoint-maps.ts → emits per-category endpoint-{schema,zod}-map.ts + barrels, bundles $defs closures.
  • openapi-ts.config.ts → drives @hey-api/openapi-ts codegen (@hey-api/schemas + zod plugins).
  • libs/schemas/src/openai-strict.ts → toOpenAIStrict(schema) helper.
  • .github/workflows/update-openapi-schemas.yml → adapted as .github/workflows/sync-schemas.yml following the existing sync-models.yml PR-bot pattern.
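For readers unfamiliar with why a strict-mode helper is needed: OpenAI's strict structured outputs require every object schema to set additionalProperties to false and to list all of its properties as required, with optional fields expressed as a union with null. The sketch below illustrates that kind of transformation. It is an assumption about what such a helper does, not the fal-js implementation, and the name `strictifySketch` is hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }
type JsonObject = { [key: string]: Json }

function isObject(value: Json | undefined): value is JsonObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Rewrite a schema so that every object lists all properties as required and
// forbids extras. Only schema-bearing keywords are visited, so enum, const and
// default values are left untouched.
function strictifySketch(schema: JsonObject): JsonObject {
  const out: JsonObject = { ...schema }

  // Keywords whose value is a map of name -> subschema.
  for (const key of ['properties', '$defs']) {
    const map = schema[key]
    if (isObject(map)) {
      const next: JsonObject = {}
      for (const [name, sub] of Object.entries(map)) {
        next[name] = isObject(sub) ? strictifySketch(sub) : sub
      }
      out[key] = next
    }
  }

  // Keywords whose value is a single subschema or a list of subschemas.
  const items = schema['items']
  if (isObject(items)) out['items'] = strictifySketch(items)
  for (const key of ['anyOf', 'oneOf', 'allOf']) {
    const list = schema[key]
    if (Array.isArray(list)) {
      out[key] = list.map((sub) => (isObject(sub) ? strictifySketch(sub) : sub))
    }
  }

  const props = out['properties']
  if (isObject(props)) {
    const required = schema['required']
    const alreadyRequired = new Set<Json>(Array.isArray(required) ? required : [])
    const next: JsonObject = {}
    for (const [name, sub] of Object.entries(props)) {
      // Previously optional fields become "value or null" so they can be required.
      next[name] =
        alreadyRequired.has(name) || !isObject(sub) ? sub : { anyOf: [sub, { type: 'null' }] }
    }
    out['properties'] = next
    out['required'] = Object.keys(next)
    out['additionalProperties'] = false
  }
  return out
}

const strict = strictifySketch({
  type: 'object',
  properties: { prompt: { type: 'string' }, seed: { type: 'integer' } },
  required: ['prompt'],
})
console.log(JSON.stringify(strict, null, 2))
```

Wrapping optional fields in anyOf rather than widening `type` keeps enum-constrained fields valid, since a null value would otherwise fail the original enum.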

Out of scope for the initial PR

  • Cross-provider prompt-portability helpers (translatePrompt(fromModel, toModel)) — follow-up; the schemas package enables this but does not ship the helper.
  • Adapter integration — @tanstack/ai-schemas stands alone initially; per-adapter consumption (validate before send) lands incrementally.
  • Cross-field validations beyond what OpenAPI expresses (e.g. the rule "Hailuo-02 rejects 10s at 1080p" stays in field descriptions rather than becoming a schema constraint).

Secrets to add to the repo

  • FAL_KEY — required for the FAL fetcher in the nightly workflow.
  • Other providers' specs are public.
