Skip to content

auth: extract ?o=/?a= from DATABRICKS_HOST env var#5337

Merged
simonfaltum merged 6 commits into
mainfrom
simonfaltum/databricks-host-env-normalize
May 27, 2026
Merged

auth: extract ?o=/?a= from DATABRICKS_HOST env var#5337
simonfaltum merged 6 commits into
mainfrom
simonfaltum/databricks-host-env-normalize

Conversation

@simonfaltum
Copy link
Copy Markdown
Member

Why

Pasting a SPOG URL from the Databricks UI (e.g. https://acme.azuredatabricks.net/?o=12345) into DATABRICKS_HOST drops the workspace identifier before any API call goes out. The SDK strips path and query from Host in fixHostIfNeeded without promoting ?o= to WorkspaceID, so the request goes to the SPOG hostname without an X-Databricks-Org-Id header. The server can't route it and answers with the login HTML page, which surfaces as:

$ DATABRICKS_HOST=https://acme.azuredatabricks.net/?o=12345 databricks bundle validate
Error: received HTML response instead of JSON

The bundle YAML workspace.host field is already normalized via NormalizeHostURL in bundle/config/workspace.go, and databricks api handles ?o= per call (#5137). The env-var path was the remaining gap.

Changes

Before: DATABRICKS_HOST=https://acme.databricks.net/?o=12345 reached the SDK with the query intact, the SDK dropped it, and WorkspaceID stayed empty.

Now: auth.NormalizeDatabricksHostEnv runs once at the top of root.Execute, before the SDK reads anything. It uses the existing ExtractHostQueryParams helper to promote ?o= / ?workspace_id= to DATABRICKS_WORKSPACE_ID and ?a= / ?account_id= to DATABRICKS_ACCOUNT_ID (only when those env vars are unset), then rewrites DATABRICKS_HOST without the query string.

A follow-up PR will push the same normalization into the SDK's fixHostIfNeeded so Python/Java/JS SDK users and any direct Go-SDK callers get the same fix without going through the CLI.

Test plan

  • go test ./libs/auth/ covers the new helper with table-driven cases: ?o= promotion, ?a= + ?o= together, existing DATABRICKS_WORKSPACE_ID is preserved, hosts without query are untouched, non-numeric ?o= is dropped, unset DATABRICKS_HOST is a no-op.
  • go test ./cmd/root/ passes.
  • ./task checks.
  • Manual end-to-end repro with a local SPOG-shaped test server: before the fix the SDK sent no X-Databricks-Org-Id header and got HTML back; after the fix the header is 258628866953061 and bundle validate proceeds to the workspace API.

Pasting a SPOG URL from the Databricks UI into DATABRICKS_HOST drops the
workspace identifier on the way into the SDK: the SDK strips query
parameters in fixHostIfNeeded without promoting them to WorkspaceID. API
calls then hit the SPOG without an X-Databricks-Org-Id header and the
server answers with HTML, surfacing as "received HTML response instead
of JSON". The bundle YAML host field is already normalized via
NormalizeHostURL, but the env var bypassed that path.

Run the existing ExtractHostQueryParams normalization on DATABRICKS_HOST
at the top of root.Execute before any SDK code runs. Promote ?o= and ?a=
to DATABRICKS_WORKSPACE_ID and DATABRICKS_ACCOUNT_ID when those env vars
are unset, and rewrite DATABRICKS_HOST without the query string.

Co-authored-by: Isaac
Co-authored-by: Isaac
Copy link
Copy Markdown
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is how we want to approach this? Should we not fix the parser in the SDK? This seems a bit hacky?

Comment thread libs/auth/host_env.go Outdated
os.Setenv(envHost, params.Host)
if params.WorkspaceID != "" {
if _, set := os.LookupEnv(envWorkspaceID); !set {
os.Setenv(envWorkspaceID, params.WorkspaceID)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the environment variables are set but conflict with the ?o parameter? Should we error in that case?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Today existing env vars always win: if DATABRICKS_WORKSPACE_ID is set, we leave it alone and just strip the query from DATABRICKS_HOST. Same behavior as the SDK PR (existing Config.WorkspaceID is preserved).

I'd lean against erroring, the env var is the more explicit signal, and a hard error here would also diverge from the SDK behavior. Happy to revisit if there's a concrete scenario where silent-preserve bites users.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, lets add a todo here so some future agent or human can remove this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in 9db4b3b, right next to the env-var-precedence branch so it's visible at the relevant code site (the top-of-file TODO covers deleting the whole normalizer on the next SDK bump).

One small behavior fix piggybacked: an explicitly-empty DATABRICKS_WORKSPACE_ID / DATABRICKS_ACCOUNT_ID is now treated the same as unset, so a paste-empty env doesn't block ?o= / ?a= promotion. That also makes the t.Setenv switch in the test cleaner (see the other thread).

@eng-dev-ecosystem-bot
Copy link
Copy Markdown
Collaborator

eng-dev-ecosystem-bot commented May 27, 2026

Commit: 8a44c5e

Run: 26516869827

@simonfaltum
Copy link
Copy Markdown
Member Author

Yeah, I agree the SDK is the right long-term home, and I've got that fix open in databricks/databricks-sdk-go#1699, same logic, in fixHostIfNeeded, promoting ?o= / ?a= into Config.WorkspaceID / Config.AccountID.

Doing both because:

  • The SDK fix only reaches CLI users after we bump the SDK pin, which is on a slower cadence.
  • Other SDK consumers (Python, Java, JS, direct Go callers) get the fix via the SDK PR; CLI users get unblocked immediately via this PR.

Happy to add a TODO in host_env.go flagging this as a stopgap and pointing at #1699 so we delete it on the next SDK bump.

Stopgap until the SDK fix lands and the CLI bumps to a version that
includes it; then this normalizer can be deleted.

Co-authored-by: Isaac
Copy link
Copy Markdown
Contributor

@shreyas-goenka shreyas-goenka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving to unblock, thanks!

Comment thread libs/auth/host_env_test.go Outdated
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if tt.preserveHostUnset {
os.Unsetenv(envHost)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use t.Setenv and set to "".

Benefits:

  1. blocks parallel runs for this test.
  2. Automatically cleaned up once hte test runs. os.UnsetEnvs can handle around and might affect integration tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 9db4b3b, t.Setenv throughout, os.Unsetenv gone. Also collapsed the redundant wantXxxSet bools since the function now treats empty and unset identically and we can just compare against os.Getenv directly.

Comment thread libs/auth/host_env.go Outdated
os.Setenv(envHost, params.Host)
if params.WorkspaceID != "" {
if _, set := os.LookupEnv(envWorkspaceID); !set {
os.Setenv(envWorkspaceID, params.WorkspaceID)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, lets add a todo here so some future agent or human can remove this.

- Treat an explicitly-empty DATABRICKS_WORKSPACE_ID/ACCOUNT_ID the same
  as unset, so a paste-empty env doesn't block ?o=/?a= promotion.
- Add a TODO at the env-var-precedence branch so the silent-preserve
  behavior is easy to revisit when the SDK fix lands.
- Use t.Setenv exclusively in the table-driven test (parallel-safe,
  auto-cleanup) and assert against os.Getenv directly.

Co-authored-by: Isaac
Comment thread libs/auth/host_env.go Outdated
Comment on lines +22 to +23
// NormalizeDatabricksHostEnv extracts ?o=/?workspace_id= and ?a=/?account_id=
// from DATABRICKS_HOST and promotes them to DATABRICKS_WORKSPACE_ID and
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's already do w=?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does ?w= come from? I checked the codebase and don't see it referenced anywhere (?o= is the SPOG convention, ?workspace_id= is the explicit alias). The SDK PR (#1699) only handles ?o= and ?workspace_id= too. Happy to add ?w= as another alias if it's a real URL pattern users paste, just want to make sure I'm picking up the right convention.

Comment thread cmd/root/root.go Outdated
Comment on lines +134 to +137
// Promote ?o=/?a= query parameters in DATABRICKS_HOST to
// DATABRICKS_WORKSPACE_ID/DATABRICKS_ACCOUNT_ID before the SDK reads the
// env var. Without this, SPOG URLs pasted from the Databricks UI lose
// their workspace routing identifier and API calls return HTML.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the normalization be scoped to the CLI execution rather than changing the actual environment value? Do we see this potentially changing environment variable in unintended ways?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, refactored in 8a44c5e.

Before: NormalizeDatabricksHostEnv() mutated os.Setenv for DATABRICKS_HOST/WORKSPACE_ID/ACCOUNT_ID at the top of root.Execute.

Now: NormalizeDatabricksConfigFromEnv(ctx, cfg) sets the SDK Config fields directly. Same end result (?o=/?a= promoted, host stripped), but the process environment is left untouched. Called from the three places that actually build a Config from env: MustAccountClient, MustWorkspaceClient, and cmd/api's makeCommand.

Subprocesses (terraform) still get the normalized values via auth.Env(cfg) -> b.AuthEnv(), since terraform's env is derived from cfg, not os.Environ.

Confirmed the SDK env loader (config_attributes.go:84) skips fields that are already set, so our pre-populated cfg.Host survives without being clobbered by the unmodified DATABRICKS_HOST env value.

Replaces NormalizeDatabricksHostEnv (which called os.Setenv on
DATABRICKS_HOST/WORKSPACE_ID/ACCOUNT_ID at the top of root.Execute) with
NormalizeDatabricksConfigFromEnv(ctx, cfg), which sets the SDK Config
fields directly. Same end result (?o=/?a= promoted, host stripped), but
the process environment is left untouched.

Called from the three places that build a Config from env: MustAccountClient,
MustWorkspaceClient, and cmd/api's makeCommand. Subprocesses (terraform)
still get the normalized values via auth.Env(cfg) -> b.AuthEnv(), which
derives env from cfg rather than reading os.Environ.

Co-authored-by: Isaac
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 27, 2026 14:16 — with GitHub Actions Inactive
@simonfaltum simonfaltum temporarily deployed to test-trigger-is May 27, 2026 14:16 — with GitHub Actions Inactive
@simonfaltum simonfaltum added this pull request to the merge queue May 27, 2026
Merged via the queue into main with commit a57fe14 May 27, 2026
26 of 27 checks passed
@simonfaltum simonfaltum deleted the simonfaltum/databricks-host-env-normalize branch May 27, 2026 15:42
@eng-dev-ecosystem-bot
Copy link
Copy Markdown
Collaborator

Commit: a57fe14

Run: 26521824636

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants