BUG: Vertex AI Batch Prediction rejects valid url_context payload (Schema validation mismatch with Online Prediction) #2123

@atndk1

Overview

There is a schema validation mismatch between the standard Vertex AI Online Prediction API and the Vertex AI Batch Prediction API regarding tools with no fields, such as url_context.

When using the url_context tool (currently in v1beta1), the Gemini API requires it to be passed as an empty object within the tools array: {"url_context": {}}. Online Prediction accepts this empty object and works correctly, but the Vertex AI Batch Prediction service rejects the exact same payload because its strict server-side JSONL schema validation expects nested fields.
NB: For the google_search tool, adding "timeRangeFilter": null can be used as a workaround (see Community Post 2 linked below). This does not help for other tools such as url_context.
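For reference, the google_search workaround mentioned in the note could be written into a batch row roughly as follows. This is only a sketch: the helper name `google_search_workaround_row` and the prompt text are hypothetical, and the claim that the explicit null field gets past batch validation comes from the note above, not from anything verified here.

```python
import json


def google_search_workaround_row(prompt: str) -> str:
    """Return one JSONL line whose google_search tool carries an explicit null field."""
    request = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        # None serializes to JSON null, so the google_search struct is
        # no longer an object with zero fields.
        "tools": [{"google_search": {"timeRangeFilter": None}}],
    }
    return json.dumps({"request": request})


# Hypothetical prompt, used only to show the shape of the row.
line = google_search_workaround_row("What changed in the latest release?")
print(line)
```

url_context has no comparable field to set, which is why the same trick is not available for it.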

Environment details

Steps to reproduce

  1. Create a JSONL file for a Vertex AI Batch Prediction job.
  2. Format a row to include the url_context tool, e.g.:
{"request": {"contents":[{"role": "user", "parts": [{"text": "Summarize the content of this URL: https://example.com"}]}],"tools": [{"url_context": {}}]}}
  3. Upload the JSONL file to Google Cloud Storage.
  4. Submit the Batch Prediction job using the v1beta1 endpoint.
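The JSONL file from the first two steps can be generated with the standard library alone. The sketch below is one way to do it; the helper names `build_url_context_row` and `write_jsonl` and the file name `payloads.jsonl` are made up for illustration, and the upload to Google Cloud Storage is left out.

```python
import json
import tempfile
from pathlib import Path


def build_url_context_row(prompt: str) -> dict:
    """Build one batch request row that enables the url_context tool."""
    return {
        "request": {
            "contents": [{"role": "user", "parts": [{"text": prompt}]}],
            # The empty object is what the batch parser currently rejects.
            "tools": [{"url_context": {}}],
        }
    }


def write_jsonl(path: Path, rows: list) -> int:
    """Write one JSON object per line and return the number of rows written."""
    with path.open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row) + "\n")
    return len(rows)


rows = [build_url_context_row("Summarize the content of this URL: https://example.com")]
out_path = Path(tempfile.mkdtemp()) / "payloads.jsonl"  # hypothetical file name
written = write_jsonl(out_path, rows)
print(out_path.read_text(encoding="utf-8"))
```

The resulting file is what would then be uploaded and referenced as the batch job source.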

Sample code to reproduce on Colab

from google.colab import auth
auth.authenticate_user()

from google import genai
from google.genai.types import CreateBatchJobConfig

# replace with relevant values
PROJECT_ID = ""
BUCKET_NAME = ""
INPUT_PATH = ""
OUTPUT_PATH = ""
JSONL_PAYLOADS_NAME = ""

LOCATION = "global"
MODEL_ID = "gemini-3-flash-preview"

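# GCS URI of the JSONL file containing the batch requests
INPUT_DATA = f"gs://{BUCKET_NAME}/{INPUT_PATH}/{JSONL_PAYLOADS_NAME}"
# GCS prefix where the batch job writes its results
BUCKET_URI = f"gs://{BUCKET_NAME}/{OUTPUT_PATH}/"

# Vertex AI client bound to the project and location above
client = genai.Client(vertexai=True, project=PROJECT_ID, location=LOCATION)

# Submit the batch job; it fails while parsing the url_context row
gcs_batch_job = client.batches.create(
    model=MODEL_ID,
    src=INPUT_DATA,
    config=CreateBatchJobConfig(dest=BUCKET_URI),
)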

Expected Behavior

The Batch Prediction parser should accept the empty url_context object and process the batch job, mirroring the behavior of the online generateContent API.

Actual Behavior

The batch job fails immediately during the initial parsing phase with the following error:
Query error: Cannot store struct 'request.tools.url_context' with no fields at [1:1]
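Until the parser is fixed, affected rows can at least be detected locally before a job is submitted. The sketch below scans JSONL text for tool entries whose value is an empty object. It assumes that any field-less tool struct triggers the same error as url_context, which the error message suggests but which has not been confirmed for other tools; the function name `find_empty_tool_structs` is hypothetical.

```python
from __future__ import annotations

import json


def find_empty_tool_structs(jsonl_text: str) -> list[tuple[int, str]]:
    """Return (line_number, tool_name) for every tool whose value is {}."""
    findings = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        request = json.loads(line).get("request", {})
        for tool in request.get("tools", []):
            for name, value in tool.items():
                if value == {}:
                    findings.append((line_no, name))
    return findings


sample = "\n".join(
    [
        json.dumps({"request": {"contents": [], "tools": [{"url_context": {}}]}}),
        json.dumps({"request": {"contents": [], "tools": [{"google_search": {"timeRangeFilter": None}}]}}),
    ]
)
print(find_empty_tool_structs(sample))
```

This only flags the rows; it does not make them acceptable to the batch service.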

Metadata

Labels

priority: p2 (Moderately-important priority. Fix may not be included in next release.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
