feat(go/plugins/vertexai/modelgarden): add Claude Opus 4.5, Llama, shared scaffolding #5296

Open: cabljac wants to merge 6 commits into main from feat/go-modelgarden-compat-oai

cabljac commented May 11, 2026

Extends the Vertex AI Model Garden Go plugin with new Anthropic Claude models, a new Meta Llama plugin, and shared scaffolding used by all modelgarden plugins.

This PR is the first half of the work originally proposed in #5175. The Mistral plugin (which requires a different transport approach because Vertex does not serve Mistral via the OpenAI-compatible endpoint) is stacked on top of this PR in a follow-up.

Models

  • Anthropic (existing plugin, more models): claude-opus-4-5@20251101, claude-opus-4-1-20250805, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, plus the previously supported Claude 3.5 / 3.7 / Opus 4 / Sonnet 4 entries.
  • Meta Llama (MaaS, new plugin): meta/llama-4-maverick-17b-128e-instruct-maas, meta/llama-4-scout-17b-16e-instruct-maas, meta/llama-3.3-70b-instruct-maas. Llama 4 variants register as Multimodal; Llama 3.3 70B is text-only. Routed through compat_oai against the Vertex MaaS OpenAI-compatible endpoint.

Plugin scaffolding

  • New Llama plugin type alongside the existing Anthropic, with Init, Name, DefineModel(name, opts), and a top-level LlamaModel(g, id) lookup mirroring the Anthropic shape.
  • Shared resolveVertexMaasEnv helper for GOOGLE_CLOUD_PROJECT / GCLOUD_PROJECT / GOOGLE_CLOUD_LOCATION / GOOGLE_CLOUD_REGION resolution.
  • Shared provider = "vertexai" constant so every modelgarden plugin registers models under the same vertexai/<model> namespace.
  • Anthropic.DefineModel now takes the same mutex and initted guard as Llama.DefineModel, so calling it before Init returns an explicit "not initialized" error instead of using a zero-value client.
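
The PR text does not show the body of resolveVertexMaasEnv, so the following is only a minimal sketch of the fallback order it describes. It assumes that explicit arguments win, that the first variable in each pair (GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION) is the primary one, and it returns an error where the real helper reportedly panics. The names resolveEnvSketch and firstNonEmpty and the injected getenv parameter are hypothetical.

```go
package main

import "errors"

// Environment variable names as listed in the PR description. Treating the
// first of each pair as "primary" is an assumption.
const (
	envProjectPrimary    = "GOOGLE_CLOUD_PROJECT"
	envProjectSecondary  = "GCLOUD_PROJECT"
	envLocationPrimary   = "GOOGLE_CLOUD_LOCATION"
	envLocationSecondary = "GOOGLE_CLOUD_REGION"
)

// firstNonEmpty returns the first non-empty string, or "" if all are empty.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveEnvSketch is a hypothetical stand-in for resolveVertexMaasEnv.
// Explicit arguments win, then the primary variable, then the secondary one.
// getenv is injected (os.Getenv in real use) so the logic can be tested
// without touching the process environment.
func resolveEnvSketch(project, location string, getenv func(string) string) (string, string, error) {
	p := firstNonEmpty(project, getenv(envProjectPrimary), getenv(envProjectSecondary))
	if p == "" {
		return "", "", errors.New("project not set: pass it explicitly or set " + envProjectPrimary)
	}
	l := firstNonEmpty(location, getenv(envLocationPrimary), getenv(envLocationSecondary))
	if l == "" {
		return "", "", errors.New("location not set: pass it explicitly or set " + envLocationPrimary)
	}
	return p, l, nil
}
```

In real use the third argument would be os.Getenv; injecting it keeps the fallback order easy to cover in a table test.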

Sample

go/samples/modelgarden registers two flows for Dev UI smoke testing:

  • opus45Flow: claude-opus-4-5@20251101
  • llamaFlow: meta/llama-3.3-70b-instruct-maas

Each plugin is constructed with an explicit Location because Vertex MaaS regional availability differs per publisher (Anthropic in us-east5, Llama in us-central1).

Tests

  • define_model_test.go — pre-Init error path for both plugins.
  • internal_test.go — nil ai.ModelOptions branch of DefineModel for both plugins.
  • models_test.go: resolveVertexMaasEnv covered for explicit args, primary / secondary env fallback, and panic paths.
  • llama_live_test.go — basic + streaming generation against Llama 3.3 70B (credential-gated).
  • anthropic_live_test.go — Opus 4.5 subtest (credential-gated).

Testing

(two screenshots attached)

…ared scaffolding

Extends the Vertex AI Model Garden Go plugin with new Anthropic Claude
models and a new Meta Llama plugin, plus shared scaffolding used by all
modelgarden plugins.

Models:
- Anthropic: claude-opus-4-5@20251101, claude-opus-4-1-20250805,
  claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, plus the
  previously supported 3.5 / 3.7 / Opus 4 / Sonnet 4 entries.
- Meta Llama (MaaS, new plugin):
  meta/llama-4-maverick-17b-128e-instruct-maas,
  meta/llama-4-scout-17b-16e-instruct-maas,
  meta/llama-3.3-70b-instruct-maas. Llama 4 variants register as
  Multimodal; Llama 3.3 70B is text-only. Routed through compat_oai
  against the Vertex MaaS OpenAI-compatible endpoint.

Plugin scaffolding:
- New Llama plugin alongside the existing Anthropic, with Init, Name,
  DefineModel(name, opts), and a top-level LlamaModel(g, id) lookup
  mirroring the Anthropic shape.
- Shared resolveVertexMaasEnv helper for GOOGLE_CLOUD_PROJECT /
  GCLOUD_PROJECT / GOOGLE_CLOUD_LOCATION / GOOGLE_CLOUD_REGION
  resolution.
- Shared provider = "vertexai" constant so every modelgarden plugin
  registers models under the same vertexai/<model> namespace.
- Anthropic.DefineModel now takes the same mutex and initted guard as
  Llama.DefineModel, so calling it before Init returns an explicit
  "not initialized" error instead of using a zero-value client.

Sample:
- go/samples/modelgarden registers an Anthropic and a Llama flow for
  Dev UI smoke testing. Each plugin is constructed with an explicit
  Location because Vertex MaaS regional availability differs per
  publisher (Anthropic in us-east5, Llama in us-central1).

Tests:
- define_model_test.go covers the pre-Init error path for both plugins.
- internal_test.go covers the nil ai.ModelOptions branch of DefineModel
  for both plugins.
- models_test.go covers resolveVertexMaasEnv for explicit args, primary
  and secondary env fallback, and panic paths.
- llama_live_test.go exercises basic + streaming generation against
  meta/llama-3.3-70b-instruct-maas (credential-gated).
- anthropic_live_test.go adds a claude-opus-4-5@20251101 subtest
  (credential-gated).

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, which leverages OpenAI-compatible endpoints and Google OAuth2 authentication. It also refactors environment variable resolution into a shared helper function used by both the Anthropic and Llama plugins. Additionally, the Anthropic plugin was updated with initialization checks in DefineModel and support for a new model version. Comprehensive unit, white-box, and live tests were added to ensure the reliability of the new features. Feedback was provided regarding the Llama plugin's initialization, specifically suggesting the use of context.Background() for the OAuth2 client to prevent issues with token refreshes if the initial context is short-lived.

Comment thread go/plugins/vertexai/modelgarden/llama.go Outdated
Match the existing per-plugin convention seen in go/samples/anthropic
and go/samples/compat_oai/{anthropic,custom,openai}. The existing
go/samples/modelgarden sample stays single-flow Anthropic-only; a new
go/samples/modelgarden-llama sample exercises the new Llama plugin in
isolation.
@cabljac cabljac marked this pull request as draft May 11, 2026 16:25
…a oauth2 client

The oauth2.NewClient and TokenSource outlive Init's ctx because every
later generate call goes through them. If a caller passes a short-lived
ctx to Init (e.g. one with a timeout, or one cancelled after plugin
setup), token refresh on later calls would fail with the original ctx
cancelled. Bind both to context.Background().

Reported by gemini-code-assist on #5296.
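
The failure mode this commit describes can be reproduced with the standard library alone. The sketch below is illustrative: refresherSketch is a hypothetical stand-in for a token source that keeps the context it was built with, and it does not use the real oauth2 package.

```go
package main

import "context"

// refresherSketch is a hypothetical stand-in for a token source that stores
// the context it was created with and consults it on every refresh.
type refresherSketch struct {
	ctx context.Context
}

// Refresh fails once the stored context has been cancelled.
func (r refresherSketch) Refresh() error {
	return r.ctx.Err()
}

// initSketch mimics a plugin Init that receives a short-lived context.
// It returns one refresher bound to that context and one bound to
// context.Background(), after the Init context has been cancelled.
func initSketch() (boundToInit, boundToBackground refresherSketch) {
	ctx, cancel := context.WithCancel(context.Background())
	boundToInit = refresherSketch{ctx: ctx}
	boundToBackground = refresherSketch{ctx: context.Background()}
	cancel() // the caller is done with setup, so ctx is now cancelled
	return boundToInit, boundToBackground
}
```

The refresher bound to the Init context reports context.Canceled on every later call, while the one bound to context.Background() keeps working.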

cabljac commented May 11, 2026

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, leveraging OpenAI-compatible endpoints, and refactors the Anthropic plugin to share environment variable resolution logic. It also adds initialization guards and mutex locking to model definition methods. Feedback highlights a potential issue with the Llama baseURL construction regarding missing publisher segments and suggests using safe type assertions when registering models to prevent potential panics.
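
The review thread itself is collapsed, so the exact assertion being discussed is not visible here. As a general illustration of the "safe type assertion" suggestion, the sketch below uses the comma-ok form to turn a would-be panic into an error. The actionSketch and modelSketch interfaces and the concrete types are hypothetical, not Genkit's.

```go
package main

import "fmt"

// actionSketch and modelSketch are hypothetical interfaces standing in for
// whatever a registry lookup returns and what the caller needs.
type actionSketch interface{ Name() string }

type modelSketch interface {
	actionSketch
	Generate(prompt string) string
}

type echoModel struct{ name string }

func (m echoModel) Name() string                  { return m.name }
func (m echoModel) Generate(prompt string) string { return m.name + ": " + prompt }

type plainAction struct{ name string }

func (a plainAction) Name() string { return a.name }

// asModel converts a registry value to a model without panicking. A bare
// a.(modelSketch) would panic on a mismatch; the comma-ok form reports it.
func asModel(a actionSketch) (modelSketch, error) {
	if a == nil {
		return nil, fmt.Errorf("action is nil")
	}
	m, ok := a.(modelSketch)
	if !ok {
		return nil, fmt.Errorf("action %q has type %T, which is not a model", a.Name(), a)
	}
	return m, nil
}
```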

Comment thread go/plugins/vertexai/modelgarden/llama.go
Comment thread go/plugins/vertexai/modelgarden/llama.go

cabljac commented May 12, 2026

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new Llama plugin for Vertex AI Model Garden, leveraging OpenAI-compatible endpoints, and refactors environment variable resolution into a shared utility used by both Llama and Anthropic plugins. Additionally, it updates the Anthropic model list with Claude 4.5 Opus and adds several test suites and a usage sample. Feedback suggests improving the Llama plugin's initialization by using context.WithoutCancel to preserve context values and wrapping errors with more descriptive information for better diagnosability.

Comment thread go/plugins/vertexai/modelgarden/llama.go Outdated

@adesinah adesinah left a comment

I have a concern about a backwards-incompatible API change with no migration path:
This PR renames claude-opus-4@20250514 to claude-opus-4 (along with 11 other ID changes), which will silently break every existing caller that pins to a dated ID: the same code that worked before now fails for no visible reason.

My suggestion would be to alias the dated keys with Stage: ModelStageDeprecated — this keeps existing code working and emits a clear deprecation warning per call so users know what to migrate to. Drop the aliases in the next major release:

var AnthropicModels = map[string]ai.ModelOptions{
    // Canonical (floating) IDs:
    "claude-opus-4":   {Label: "Claude Opus 4", Supports: &internal.Multimodal},
    "claude-opus-4-5": {Label: "Claude Opus 4.5", Supports: &internal.Multimodal},
    "claude-sonnet-4": {Label: "Claude Sonnet 4", Supports: &internal.Multimodal},
    // ... rest of the new entries

    // Deprecated dated aliases, preserved for backwards compat with callers
    // that pinned to a specific version before the 2026-05 rename. The
    // ModelStageDeprecated stage causes model_middleware.go to log a WARN on
    // every call, giving users a clear migration signal. Remove in the next
    // major version.
    "claude-opus-4@20250514":     {Label: "Claude Opus 4 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-opus-4-5@20251101":   {Label: "Claude Opus 4.5 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-sonnet-4@20250514":   {Label: "Claude Sonnet 4 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-3-5-sonnet@20240620": {Label: "Claude 3.5 Sonnet (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-3-sonnet@20240229":   {Label: "Claude 3 Sonnet (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-3-haiku@20240307":    {Label: "Claude 3 Haiku (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-3-opus@20240229":     {Label: "Claude 3 Opus (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-3-7-sonnet@20250219": {Label: "Claude 3.7 Sonnet (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-opus-4-1-20250805":   {Label: "Claude Opus 4.1 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-sonnet-4-5-20250929": {Label: "Claude Sonnet 4.5 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
    "claude-haiku-4-5-20251001":  {Label: "Claude Haiku 4.5 (legacy ID)", Supports: &internal.Multimodal, Stage: ai.ModelStageDeprecated},
}
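
The aliasing idea can also be shown without Genkit types. The following is a self-contained sketch, not the plugin's code: stageSketch, modelInfoSketch, modelsSketch, and lookupSketch are hypothetical, and only two of the IDs from the suggestion are included.

```go
package main

// stageSketch is a hypothetical stand-in for a model lifecycle stage.
type stageSketch string

const (
	stageStable     stageSketch = "stable"
	stageDeprecated stageSketch = "deprecated"
)

// modelInfoSketch is a hypothetical, trimmed-down model entry.
type modelInfoSketch struct {
	Label string
	Stage stageSketch
}

// modelsSketch holds one canonical ID and its deprecated dated alias.
var modelsSketch = map[string]modelInfoSketch{
	"claude-opus-4":          {Label: "Claude Opus 4", Stage: stageStable},
	"claude-opus-4@20250514": {Label: "Claude Opus 4 (legacy ID)", Stage: stageDeprecated},
}

// lookupSketch returns the entry for id, whether the caller should be warned
// that the ID is deprecated, and whether the ID is known at all.
func lookupSketch(id string) (info modelInfoSketch, deprecated, ok bool) {
	info, ok = modelsSketch[id]
	return info, ok && info.Stage == stageDeprecated, ok
}
```

A test in the spirit of this suggestion would assert that every dated alias resolves and is flagged as deprecated, so the legacy surface cannot disappear unnoticed.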

…as deprecated aliases

The undated rename in 2c488d4 silently broke callers pinned to date-suffixed
IDs (e.g. claude-opus-4@20250514) because the map key is the Vertex model ID
sent on the wire — looking up the old key now returns nil from the registry.

Re-add the 8 @DATE keys that shipped before the rename, marked
ModelStageDeprecated. Vertex accepts both undated and @DATE forms for the
same underlying model (verified live + matches JS plugin's KNOWN_MODELS), so
existing callers continue to work and get a deprecation warning via the
model_middleware Warn log.

Adds TestAnthropicModels_DeprecatedAliases to pin the surface.

The 3 bare-hyphen keys that shipped (claude-opus-4-1-20250805 etc.) are
deliberately not re-added — Vertex always required @ not - for the date
suffix, so those keys never worked and have no real callers to preserve.

cabljac commented May 27, 2026

@adesinah good catch — addressed in 4d1c130.

Re-added the 8 @DATE keys that shipped before the rename, marked Stage: ai.ModelStageDeprecated. Vertex accepts both undated and @DATE forms for the same underlying model (verified live + matches JS plugin's KNOWN_MODELS), so existing callers keep working and get a Warn log via model_middleware.

One nuance from a live probe: the 3 bare-hyphen keys that shipped (claude-opus-4-1-20250805, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001) were always rejected by Vertex, which seems to require @ rather than - before the date suffix. Those keys never worked in practice, so I haven't aliased them.

@adesinah adesinah left a comment

Looks good to me!

…lama oauth2

Detach the token source and oauth2 client from Init's cancellation
without dropping ctx values — refresh calls now propagate trace IDs
and loggers from the caller's Init ctx instead of starting from a
fresh background ctx.
@cabljac cabljac requested a review from apascal07 May 28, 2026 12:17
@cabljac cabljac marked this pull request as ready for review May 28, 2026 12:17