fix(models): pass (filename, bytes, mime_type) tuple to acreate_file for OpenAI/Azure providers #4603

Open
alejandro-sotoca-bts wants to merge 2 commits into google:main from alejandro-sotoca-bts:fix/litellm-file-upload-mime-type

@alejandro-sotoca-bts

Summary

Fixes #4174

When a user attaches a non-image file (e.g. a PDF) to a chat using adk web or the load_artifacts tool with an OpenAI or Azure model via LiteLLM, the request fails with:

litellm.BadRequestError: OpenAIException - Invalid file data: 'file_id'.
Expected a file with an application/pdf MIME type, but got unsupported MIME type 'None'.

Root cause

In src/google/adk/models/lite_llm.py, the acreate_file call for _FILE_ID_REQUIRED_PROVIDERS (openai, azure) was passing only raw bytes as the file argument:

# Before
file_response = await litellm.acreate_file(
    file=part.inline_data.data,   # raw bytes — no filename or MIME type
    purpose="assistants",
    custom_llm_provider=provider,
)

LiteLLM and the OpenAI Files API expect a multipart upload in the form (filename, bytes, content_type). Without a filename, the Content-Type header is not set, so the stored file ends up with MIME type None and the chat completions API rejects it.

This only affects OpenAI and Azure because other providers (Gemini, Anthropic) send files inline as data URIs and do not use acreate_file.
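
To make the root cause concrete, here is a minimal stdlib-only sketch of how a multipart/form-data part could be encoded from either raw bytes or a (filename, bytes, content_type) tuple. It is an illustrative model, not the encoder that LiteLLM or the OpenAI SDK actually uses; the function name encode_file_part and the sample filename report.pdf are assumptions made for this example.

```python
from typing import Optional, Tuple, Union

# Either raw bytes, or the (filename, bytes, content_type) tuple form.
FileArg = Union[bytes, Tuple[str, bytes, Optional[str]]]


def encode_file_part(field: str, file: FileArg, boundary: str) -> bytes:
    """Encode a single multipart/form-data part (illustrative only)."""
    if isinstance(file, tuple):
        filename, data, content_type = file
    else:
        # Raw bytes carry no name or type information at all.
        filename, data, content_type = None, file, None

    disposition = f'Content-Disposition: form-data; name="{field}"'
    if filename:
        disposition += f'; filename="{filename}"'
    headers = [disposition]
    if content_type:
        headers.append(f"Content-Type: {content_type}")

    head = f"--{boundary}\r\n" + "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("utf-8") + data + b"\r\n"


pdf = b"%PDF-1.7 ..."
raw_part = encode_file_part("file", pdf, "b")
tuple_part = encode_file_part("file", ("report.pdf", pdf, "application/pdf"), "b")
```

In this model, raw_part has no Content-Type header and no filename, while tuple_part carries both, which is the difference the fix relies on.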

Fix

Pass the full (filename, bytes, mime_type) tuple to acreate_file, using the part's display_name when available, or a sensible default filename derived from the MIME type via the new _filename_for_mime helper:

# After
display_name = (
    getattr(part.inline_data, "display_name", None)
    or _filename_for_mime(part.inline_data.mime_type)
)
file_response = await litellm.acreate_file(
    file=(display_name, part.inline_data.data, part.inline_data.mime_type),
    purpose="assistants",
    custom_llm_provider=provider,
)

Testing

Validated locally with adk web using openai/gpt-4o and azure/gpt-4o: PDF attachments are now processed correctly and the MIME type error no longer occurs.

@gemini-code-assist
Contributor

Summary of Changes

Hello @alejandro-sotoca-bts, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical bug preventing the successful upload of non-image files to OpenAI and Azure models when using LiteLLM. By correctly formatting the file upload request to include essential metadata like filename and MIME type, it ensures that these models can properly process attached documents, enhancing the robustness of file handling within the system.

Highlights

  • File Upload Fix for OpenAI/Azure Models: Resolved an issue where non-image files (e.g., PDFs) attached to chats using adk web or load_artifacts with OpenAI or Azure models via LiteLLM would fail with an "Invalid file data" error reporting the unsupported MIME type 'None'.
  • Root Cause Identification: Identified that the acreate_file call for _FILE_ID_REQUIRED_PROVIDERS (OpenAI, Azure) was incorrectly passing only raw bytes, instead of the expected (filename, bytes, content_type) tuple, leading to a missing Content-Type header and a None MIME type.
  • Implementation of Correct File Upload Format: Modified the acreate_file call to pass the full (filename, bytes, mime_type) tuple. This includes using the part's display_name or a new _filename_for_mime helper to ensure the correct filename and MIME type are provided during upload.
  • Scope of the Fix: Confirmed that this fix specifically targets OpenAI and Azure providers, as other providers (Gemini, Anthropic) handle files differently by sending them inline as data URIs, not utilizing acreate_file.

Changelog

  • src/google/adk/models/lite_llm.py
    • Added a new dictionary _MIME_TO_FILENAME to map common MIME types to default filenames, crucial for providers that derive MIME types from file extensions.
    • Introduced a helper function _filename_for_mime to retrieve a default filename based on a given MIME type, falling back to 'document.bin' if no specific mapping exists.
    • Updated the acreate_file call within the _get_content function for _FILE_ID_REQUIRED_PROVIDERS (OpenAI, Azure) to pass a tuple containing the filename, raw bytes, and MIME type, instead of just the raw bytes.
    • Integrated the _filename_for_mime helper to supply a filename when the display_name is not available from the inline data.

Activity

  • The author validated the fix locally using adk web with openai/gpt-4o and azure/gpt-4o, confirming that PDF attachments are now processed correctly and the MIME type error no longer occurs.

@google-cla

google-cla bot commented Feb 24, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

adk-bot added the models label ([Component] Issues related to model support) on Feb 24, 2026

@adk-bot
Collaborator

adk-bot commented Feb 24, 2026

Response from ADK Triaging Agent

Hello @alejandro-sotoca-bts, thank you for your contribution!

Before we can merge this PR, you'll need to sign the Contributor License Agreement (CLA). You can do so by following the instructions in the "Details" link of the cla/google check at the bottom of the PR.

Thanks!

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly fixes a bug where file uploads to OpenAI/Azure providers were failing due to missing MIME types. The solution of passing a (filename, bytes, mime_type) tuple to litellm.acreate_file is appropriate. The logic to determine the filename, by preferring display_name and falling back to a generated name based on the MIME type, is robust. My main feedback is regarding testing. The change introduces new logic and modifies an existing call, which should be covered by unit tests to prevent future regressions. I've left a specific comment with suggestions for updating existing tests and adding a new one.

…for OpenAI/Azure providers

When uploading non-image file attachments (e.g. PDFs) via the LiteLLM
integration, the `acreate_file` call for OpenAI and Azure providers was
passing only raw bytes as the `file` argument. LiteLLM and the OpenAI
Files API expect a multipart upload with a `(filename, bytes, content_type)`
tuple so the Content-Type header is set correctly. Passing raw bytes caused
the stored file to have MIME type `None`, which the chat completions API
then rejected with:

  Invalid file data: 'file_id'. Expected a file with an application/pdf
  MIME type, but got unsupported MIME type 'None'.

Fix: pass `(display_name, data, mime_type)` to `acreate_file`, using the
part's `display_name` when available, or a sensible default filename
derived from the MIME type via the new `_filename_for_mime` helper.

Fixes google#4174

alejandro-sotoca-bts force-pushed the fix/litellm-file-upload-mime-type branch from ecec4f0 to d072250 on February 24, 2026 11:54

…roviders

Update existing tests that assert acreate_file is called with raw bytes
to instead expect the (filename, bytes, mime_type) tuple required by the
LiteLLM/OpenAI multipart upload API.

Add test_get_content_pdf_openai_uses_display_name_as_filename to verify
that part.inline_data.display_name is used as the filename when available,
falling back to _filename_for_mime otherwise.
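
The test code itself is not included in this conversation, so the following is only a rough sketch of how such an assertion could be written with unittest.mock.AsyncMock. The upload_inline_file function is a hypothetical stand-in for the upload step in lite_llm.py, and the fake client and part objects are assumptions made for this example; only the call shape (file tuple, purpose, custom_llm_provider) comes from the PR description.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def upload_inline_file(client, part, provider, default_name="document.bin"):
    """Hypothetical stand-in for the upload step; mirrors the call shape in the PR."""
    filename = getattr(part.inline_data, "display_name", None) or default_name
    return await client.acreate_file(
        file=(filename, part.inline_data.data, part.inline_data.mime_type),
        purpose="assistants",
        custom_llm_provider=provider,
    )


def check_display_name_is_used_as_filename():
    # Fake client whose acreate_file records how it was awaited.
    client = SimpleNamespace(
        acreate_file=AsyncMock(return_value=SimpleNamespace(id="file-123"))
    )
    part = SimpleNamespace(
        inline_data=SimpleNamespace(
            data=b"%PDF-1.7",
            mime_type="application/pdf",
            display_name="report.pdf",
        )
    )
    response = asyncio.run(upload_inline_file(client, part, "openai"))
    # The upload must receive the full tuple, not raw bytes.
    client.acreate_file.assert_awaited_once_with(
        file=("report.pdf", b"%PDF-1.7", "application/pdf"),
        purpose="assistants",
        custom_llm_provider="openai",
    )
    return response.id
```

Asserting on the exact keyword arguments means a regression back to passing raw bytes would fail the test immediately.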
Labels

models [Component] Issues related to model support

Development

Successfully merging this pull request may close these issues.

[BUG] LiteLLM (OpenAI/gpt-4o-mini): load_artifacts PDF attachment fails with “Invalid file data: 'file_id' … expected application/pdf, got None”

2 participants