Skip to content

fix: agent managed identity support in semantic_kernel_loader (#769)#770

Open
vivche wants to merge 9 commits intomicrosoft:Developmentfrom
vivche:fix/agent-managed-identity-sk-loader
Open

fix: agent managed identity support in semantic_kernel_loader (#769)#770
vivche wants to merge 9 commits intomicrosoft:Developmentfrom
vivche:fix/agent-managed-identity-sk-loader

Conversation

@vivche
Copy link
Contributor

@vivche vivche commented Mar 5, 2026

Pull Request: fix/agent-managed-identity-sk-loader

Title: fix: agent managed identity support in semantic_kernel_loader (#769)

GitHub Issue: #769 — Agents fail silently when using Managed Identity authentication


Summary

Fixes #769 — agents configured with only a deployment name silently fail to load when the application uses Azure Managed Identity (MI) for Azure OpenAI authentication.

Problem

When an agent has only the deployment name set in its Model & Connection configuration, resolve_agent_config() merges the partial agent config with global settings (case 4 of the decision tree). After the merge, key is None because MI auth stores no API key. The gate condition in semantic_kernel_loader.py requires key to be truthy — so the agent is never constructed and chat silently falls back to plain GPT-4.1 with no tools or instructions.

Changes

application/single_app/semantic_kernel_loader.py

  • Import DefaultAzureCredential and get_bearer_token_provider from azure.identity
  • Detect MI auth before each AzureChatCompletion creation gate:
    use_managed_identity = (auth_type == 'managed_identity') and not apim_enabled and not agent_config.get("key")
  • Update gate condition to accept MI in place of a key:
    if AzureChatCompletion and agent_config["endpoint"] and (agent_config["key"] or use_managed_identity) and agent_config["deployment"]:
  • Add elif use_managed_identity: branch at all 3 AzureChatCompletion construction sites (single agent, multi-agent specialist, multi-agent orchestrator) using ad_token_provider with DefaultAzureCredential
  • Auto-detect Azure Government vs commercial cloud scope from endpoint URL

application/single_app/config.py

  • Version bump: 0.238.0240.238.025

Testing

  1. Configure app to use Managed Identity for Azure OpenAI (no API key in settings)
  2. Create an agent with only the deployment name set in Model & Connection
  3. Start a chat — agent should load with full instructions and tools
  4. Verify API actions are called (not fabricated responses)

Files Changed

  • application/single_app/semantic_kernel_loader.py
  • application/single_app/config.py
  • docs/explanation/fixes/v0.238.025/AGENT_MANAGED_IDENTITY_SK_LOADER_FIX.md

Copilot AI review requested due to automatic review settings March 5, 2026 03:49
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a silent failure mode in semantic_kernel_loader.py where agents configured with only a deployment name would fail to load when the application uses Azure Managed Identity (MI) for Azure OpenAI authentication (issue #769). The root cause was a gate condition requiring agent_config["key"] to be truthy, but MI authentication stores no API key. The fix adds MI detection before each of the three AzureChatCompletion construction sites and introduces a new elif use_managed_identity: branch at each site.

Changes:

  • semantic_kernel_loader.py: Imports DefaultAzureCredential/get_bearer_token_provider with a try/except fallback, adds MI detection and a new MI construction branch for single-agent, multi-agent specialist, and multi-agent orchestrator code paths.
  • config.py: Version bump from 0.238.024 to 0.238.025.
  • docs/explanation/fixes/v0.238.025/AGENT_MANAGED_IDENTITY_SK_LOADER_FIX.md: New fix documentation following the established versioned-docs convention.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
application/single_app/semantic_kernel_loader.py Core fix: adds MI import guard, MI detection flags, updated gate conditions, and new elif MI branches at all 3 AzureChatCompletion construction sites
application/single_app/config.py Version bump to 0.238.025
docs/explanation/fixes/v0.238.025/AGENT_MANAGED_IDENTITY_SK_LOADER_FIX.md Fix documentation with root-cause analysis, code samples, and auth flow diagram

Chen, Vivien added 2 commits March 5, 2026 00:28
When azure.identity fails to import, DefaultAzureCredential and
get_bearer_token_provider are set to None. Previously, use_managed_identity
could still evaluate to True based solely on auth_type and absence of a key,
causing the gate condition to pass and eventually calling
get_bearer_token_provider(None(), scope)  raising:
  TypeError: 'NoneType' object is not callable

Fix: add 'and bool(DefaultAzureCredential)' guard to all three managed
identity flag computations (lines 767, 1552, 1675) so managed identity
is only attempted when the credential class is actually available.

Identified by GitHub Copilot code review on PR microsoft#770.
Copilot AI review requested due to automatic review settings March 5, 2026 05:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

Chen, Vivien added 2 commits March 5, 2026 00:47
Security fix identified by GitHub Copilot code review on PR microsoft#770.

When allow_user_custom_agent_endpoints / allow_group_custom_agent_endpoints
is enabled, agent_config['endpoint'] can be an attacker-controlled URL.
With azure_openai_gpt_authentication_type=managed_identity, the code was
calling get_bearer_token_provider(DefaultAzureCredential(), scope) and
passing the token to AzureChatCompletion with that attacker-supplied
endpoint  leaking the managed identity bearer token to an external host.

Fix:
- Track endpoint_is_user_supplied in resolve_agent_config for each of the
  6 endpoint resolution branches (cases 1/3/4 = user-supplied True,
  cases 2/5/6 = system-controlled False)
- Add 'and not agent_config.get(endpoint_is_user_supplied, False)' guard
  to all three managed identity flag computations (use_managed_identity,
  _ma_use_mi, _orch_use_mi) so managed identity tokens are only ever
  sent to system-controlled Azure endpoints.
…detection duplication

The managed identity scope detection and token-provider creation logic was
duplicated verbatim at three call sites (lines ~814, ~1602, ~1730). Extract
into a single _build_mi_token_provider(endpoint) helper that selects the
correct Azure Cognitive Services scope based on whether the endpoint is in
the US Government cloud (.azure.us) or the commercial cloud (.azure.com),
then replace all three sites with a call to the helper.

This makes future sovereign cloud changes (e.g. China .azure.cn) a
single-line update instead of a three-place change.
Copilot AI review requested due to automatic review settings March 5, 2026 14:52
The parenthetical '(matches config.py app.config[VERSION])' was accurate
when the doc was first written at v0.238.025 but config.py has since moved
to v0.239.003, making the claim misleading. Remove the parenthetical;
the version number itself (0.238.025) remains correct.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

- Expand code example to show all five use_managed_identity guards
  (bool(DefaultAzureCredential) and not endpoint_is_user_supplied)
- Add corresponding bullet points explaining the two new guards
- Correct auth-flow example from case 4 to case 6 (global GPT fallback,
  endpoint_is_user_supplied=False) which is the actual path that permits MI
@paullizer paullizer requested review from Bionic711 and paullizer March 5, 2026 16:01
…ity finding

Functional test (functional_tests/test_agent_managed_identity_endpoint_flag.py):
- Covers all three logic paths per Copilot review finding CodingGuidelineID 1000005
- endpoint_is_user_supplied=False for Cases 2/5/6 and True for Cases 1/3/4 (6 assertions)
- Group agent scenarios: no custom fields -> MI permitted; custom endpoint -> MI blocked (2 assertions)
- use_managed_identity evaluates correctly across all five guard conditions (6 assertions)
- AzureChatCompletion gate admits MI and blocks when user_supplied=True (5 assertions)

Fix documentation (docs/explanation/fixes/v0.239.002/AGENT_MANAGED_IDENTITY_SK_LOADER_FIX.md):
- Moved from v0.238.025/ to v0.239.002/ to match current config.py VERSION
- Corrected Fixed/Implemented version header to 0.239.002
- Added 'Security Vulnerability' section documenting the Copilot-identified
  credential-theft risk: 4-guard use_managed_identity could send MI bearer tokens
  to attacker-controlled endpoints when allow_group_custom_agent_endpoints=True
- Added Case 1-6 table showing which cases are system-controlled vs agent-supplied
- Numbered all five guards in Fix section 2 with cross-reference to security section
Copilot AI review requested due to automatic review settings March 5, 2026 17:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants