fix(auth): fail-fast on invalid or non-workload certificate configs in agent identity discovery #17116
nbayati wants to merge 3 commits into
The `GOOGLE_API_CERTIFICATE_CONFIG` environment variable is shared between Managed Workload Identity (MWLID) token-binding and other flows like Enterprise Certificate Provider (ECP) configs (e.g., PKCS#11).

When this env var is set, agent identity discovery is triggered. If the config file exists but lacks the `"workload"` section (as with ECP configurations), we should exit early and return `None` to avoid delaying non-workload flows.

In addition, if the config file on disk had syntax errors or invalid JSON, the previous logic entered a 30-second blocking retry loop before failing with `RefreshError`. To resolve this, the lookup logic now assumes that if the config file exists on disk, it is in its final format. If the file exists but lacks a `"workload"` section, has syntax errors, or is unreadable, we return `None` immediately to fail fast and avoid startup delays.
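The fail-fast lookup described above could look roughly like the sketch below. This is an illustration only, not the code from this PR: the helper name `read_workload_cert_path` is hypothetical, and the config layout (`cert_configs` → `workload` → `cert_path`) is an assumption based on the description.

```python
import json


def read_workload_cert_path(config_path):
    """Return the workload cert path from a certificate config, or None.

    Hypothetical helper: every problem with the config is treated as final,
    so the caller can return immediately instead of entering a retry loop.
    """
    try:
        with open(config_path, "r", encoding="utf-8") as f:
            config = json.load(f)
    except (OSError, ValueError):
        # Unreadable file or invalid JSON (JSONDecodeError is a ValueError).
        return None

    if not isinstance(config, dict):
        return None

    cert_configs = config.get("cert_configs")
    if not isinstance(cert_configs, dict):
        return None

    workload = cert_configs.get("workload")
    if not isinstance(workload, dict):
        # For example, an ECP (PKCS#11) config with no "workload" section.
        return None

    cert_path = workload.get("cert_path")
    if isinstance(cert_path, str) and cert_path:
        return cert_path
    return None
```

Returning `None` on every malformed shape keeps non-workload flows such as ECP from paying any startup delay. The trade-off is that a genuinely broken workload config is skipped silently rather than reported.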
Code Review
This pull request refactors the `get_agent_identity_certificate_path` function to support a well-known workload directory fallback and improves the robustness of configuration parsing with better type checking and fail-fast logic. New test cases were added to cover various invalid configuration scenarios. Feedback was provided to add an explicit type check for the `cert_configs` dictionary to prevent a potential `AttributeError` and to raise an error instead of returning `None` for malformed configurations.
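To illustrate the type-check feedback: if `cert_configs` parses as a list or string, calling `.get("workload")` on it raises `AttributeError`. A minimal sketch of the suggested guard follows. The helper name `get_workload_section` is hypothetical, and `ValueError` stands in for whatever exception the library would actually raise.

```python
def get_workload_section(config):
    """Return the "workload" section of a parsed config dict, if present.

    Hypothetical helper showing an explicit type check on cert_configs.
    """
    cert_configs = config.get("cert_configs")
    if not isinstance(cert_configs, dict):
        # Without this check, a list or string value here would make the
        # .get("workload") call below raise AttributeError.
        raise ValueError("cert_configs must be a JSON object")
    return cert_configs.get("workload")
```

Raising here surfaces a malformed config to the caller, whereas returning `None` treats it the same as a config that has no workload section.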
    # The config was parsed, but the cert file is not ready yet
    target_path = cert_path

    # Path B: Config is NOT set, fallback to the the well-known path
nit: # Path B: Config is NOT set, fallback to the well-known path
I'm not sure I'm following what this nit is about. Can you please clarify?
Ah I see, thanks for flagging it, Done!
agrawalradhika-cell left a comment
Provided some minor comments, rest looks good!
b/512912028