
Changes for adding gemma model config #90

Draft
vavarshn wants to merge 2 commits into main from vvarshney/gemma4_ai_tier


@vavarshn
Collaborator

Summary

This PR updates the AI Tier SAIA model deployment config to use Gemma 4 as the primary hosted SAIA v2 model while keeping GPT-OSS 20B available for supporting flows.
Changes include:

  • Replace the GptOss120b Ray Serve application with Gemma431bIt (gemma4_31b_it).
  • Keep GptOss20b deployed for the field-description, conversation-title, and metadata-description paths used by saia-service.
  • Update SAIA feature replica defaults to scale Gemma431bIt and GptOss20b.
  • Update the Ray builder config test to expect Gemma431bIt + GptOss20b as the text generation apps.
  • Update k0s quickstart model artifact documentation from gpt-oss-120b to gemma-4-31b-it.
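
As a rough illustration of the app swap listed above, the following Python sketch maps the previous set of text generation apps to the new one. It assumes the earlier config deployed GptOss120b alongside GptOss20b; the function name and the list representation are hypothetical and do not reflect the operator's actual config format, which is not shown in this description.

```python
# Hypothetical sketch: app names come from the summary above, but the list
# layout and helper name are assumptions, not the operator's real schema.
OLD_TEXT_GENERATION_APPS = ["GptOss120b", "GptOss20b"]


def migrate_text_generation_apps(apps):
    """Replace the GptOss120b app with Gemma431bIt and keep every other app."""
    replacements = {"GptOss120b": "Gemma431bIt"}
    return [replacements.get(name, name) for name in apps]


NEW_TEXT_GENERATION_APPS = migrate_text_generation_apps(OLD_TEXT_GENERATION_APPS)
```

The updated Ray builder config test would then be expected to see Gemma431bIt and GptOss20b, and no GptOss120b entry.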

Why

SAIA v2 hosted model selection now defaults to Gemma 4 when use_gpt_oss_120b is false or missing. This operator change makes the corresponding Gemma 4 Ray endpoint available in AI Tier and removes the GPT-OSS 120B app from the local deployment config.
GPT-OSS 20B is intentionally retained because saia-service still uses it for auxiliary CMP flows such as field descriptions, title generation, and metadata descriptions.
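
The default selection described above (Gemma 4 unless use_gpt_oss_120b is explicitly true) can be sketched as follows. This is a minimal illustration, not the SAIA service code: the function name, the settings dictionary, and the "gpt_oss_120b" return value are assumptions made for the example; only the use_gpt_oss_120b flag and the gemma4_31b_it identifier come from this description.

```python
def select_hosted_model(settings: dict) -> str:
    """Return the hosted SAIA v2 model identifier for the given settings.

    Hypothetical helper: a missing flag is treated the same as False,
    so Gemma 4 is the default.
    """
    if settings.get("use_gpt_oss_120b", False):
        # Assumed identifier for the GPT-OSS 120B path.
        return "gpt_oss_120b"
    return "gemma4_31b_it"


default_model = select_hosted_model({})
explicit_off = select_hosted_model({"use_gpt_oss_120b": False})
explicit_on = select_hosted_model({"use_gpt_oss_120b": True})
```

Under this sketch, any deployment that never sets the flag resolves to Gemma 4, which is why the corresponding Ray endpoint must exist in AI Tier.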

@vavarshn vavarshn force-pushed the vvarshney/gemma4_ai_tier branch from 7952ecc to 7c0722d Compare May 21, 2026 06:05
@vavarshn vavarshn force-pushed the vvarshney/gemma4_ai_tier branch from 7c0722d to da63931 Compare May 22, 2026 14:35

3 participants