[CJ-75153][lakebox] CLI: follow pagination on ListSandboxes #3

Open
akshaysingla-db wants to merge 2 commits into shuochen0311:lakebox-cli from akshaysingla-db:akshay/cj-75153-lakebox-cli-pagination

@akshaysingla-db

Summary

Depends on databricks-eng/universe#1941694, which makes handlers::sandbox::list actually paginate instead of collapsing to one response (default page_size=100, clamped to 1000, real next_page_token from ESM).

Today (*lakeboxAPI).list does a single GET and returns result.Sandboxes. That works only because the manager pre-aggregates server-side; once the server PR lands, callers that ignore next_page_token silently cap at 100 sandboxes per user.

Loop on the token, asking for page_size=100&page_token=... each round. doRequest learned to split a leading ?... off path into RawQuery so it merges with the host's ?o=<wsid> workspace selector instead of getting URL-encoded into the path.
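
The token loop described above can be sketched as follows. This is a minimal illustration, not the code in cmd/lakebox/api.go: `sandbox`, `listResponse`, `fetchPage`, and `listAll` are hypothetical names, the `id` field is invented for the example, and only the `sandboxes` and `next_page_token` keys and the `page_size=100` / `page_token` parameters come from the description. It also assumes the server signals the last page with an empty token.

```go
package main

import (
	"fmt"
	"net/url"
)

// sandbox and listResponse are hypothetical shapes for this sketch.
type sandbox struct {
	ID string `json:"id"` // invented field, for illustration only
}

type listResponse struct {
	Sandboxes     []sandbox `json:"sandboxes"`
	NextPageToken string    `json:"next_page_token"`
}

// fetchPage stands in for one GET against the list endpoint. It receives
// the query string (including the leading "?") for that round.
type fetchPage func(query string) (listResponse, error)

// listAll requests pages until the server returns an empty next_page_token
// (assumed here to mean "last page") and concatenates the results.
func listAll(fetch fetchPage) ([]sandbox, error) {
	var all []sandbox
	token := ""
	for {
		q := url.Values{}
		q.Set("page_size", "100")
		if token != "" {
			q.Set("page_token", token)
		}
		page, err := fetch("?" + q.Encode())
		if err != nil {
			return nil, fmt.Errorf("list sandboxes: %w", err)
		}
		all = append(all, page.Sandboxes...)
		if page.NextPageToken == "" {
			return all, nil
		}
		token = page.NextPageToken
	}
}
```

Passing the fetcher in as a function keeps the loop testable without a server: a fake that returns a token on the first call and none on the second is enough to check that both pages are collected.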

Stacked on #2 ([CJ-75152]). Until that lands, this PR's diff shows both commits; the pagination changes are the second commit, isolated to cmd/lakebox/api.go.

Test plan

  • lakebox list works with fleets ≤ 100 (single round-trip).
  • lakebox list works with fleets > 100 once universe#1941694 is deployed (verify in staging).
  • lakebox list --json | jq '.sandboxes | length' returns full count, not 100.

Jira: CJ-75153

This pull request and its description were written by Isaac.

… surface name/timestamps

The auto-create path on `lakebox ssh` fails with:

  INVALID_PARAMETER_VALUE: JSON decode error:
  unknown field `public_key`, expected `sandbox`

`CreateSandboxRequest { Sandbox sandbox = 1 }` has `body: "*"`, so the
wire body must be wrapped as `{"sandbox": {...}}`. The CLI was sending an
unwrapped `{"public_key": "..."}` payload. `Sandbox` has no `public_key`
field anywhere on the create path (proto or handler), so the field was
dead end-to-end. `lakebox create` worked only because its default empty
publicKey was stripped by `omitempty` before the request went out.
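
To make the two wire shapes concrete, here is a small sketch of how `encoding/json` renders them. The struct names are hypothetical and `sandboxSpec` carries only `name`, the one `Sandbox` field this message mentions; the real CLI types may differ.

```go
package main

import "encoding/json"

// sandboxSpec is a hypothetical subset of the Sandbox message.
type sandboxSpec struct {
	Name string `json:"name,omitempty"`
}

// createRequest mirrors CreateSandboxRequest { Sandbox sandbox = 1 }:
// the sandbox is nested under a "sandbox" key.
type createRequest struct {
	Sandbox sandboxSpec `json:"sandbox"`
}

// legacyRequest is the old unwrapped shape. With omitempty, an empty
// PublicKey disappears from the encoded body entirely.
type legacyRequest struct {
	PublicKey string `json:"public_key,omitempty"`
}

// encode marshals v and returns the JSON text.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}
```

With these types, `encode(legacyRequest{})` yields `{}`, which is why `lakebox create` never tripped the decoder, while `encode(legacyRequest{PublicKey: "k"})` yields `{"public_key":"k"}`, the body the server rejects. `encode(createRequest{Sandbox: sandboxSpec{Name: "dev"}})` yields the wrapped form `{"sandbox":{"name":"dev"}}`.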

While here, plug a few related gaps in the CLI surface:

- Wrap the create body as `{"sandbox": {...}}`; drop dead `public_key`
  field and `--public-key-file` flag.
- Surface `name`, `createTime`, `lastStartTime` on `sandboxEntry` so
  `lakebox status --json` and `lakebox list --json` stop silently
  dropping these fields.
- Add `--name` to `lakebox create` and `lakebox config` (proto + handler
  accept name on create + update; CLI had no way to set it).
- Print `name` in human `lakebox status` output when set.
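
The "silently dropping" behaviour follows from how Go's `encoding/json` treats unknown keys: they are ignored on decode, so a struct that lacks a field loses it on re-encode. A rough sketch, assuming the JSON keys match the names listed above; `narrowEntry`, the `id` field, and `reencode` are hypothetical, and the real `sandboxEntry` has other fields not shown here.

```go
package main

import "encoding/json"

// narrowEntry models an entry type that predates this change.
type narrowEntry struct {
	ID string `json:"id"` // hypothetical field
}

// sandboxEntry adds the three fields named in the list above.
type sandboxEntry struct {
	ID            string `json:"id"` // hypothetical field
	Name          string `json:"name,omitempty"`
	CreateTime    string `json:"createTime,omitempty"`
	LastStartTime string `json:"lastStartTime,omitempty"`
}

// reencode decodes raw into the given struct pointer and marshals it back,
// which is roughly what a --json output path does with a server response.
func reencode(raw string, into any) (string, error) {
	if err := json.Unmarshal([]byte(raw), into); err != nil {
		return "", err
	}
	out, err := json.Marshal(into)
	return string(out), err
}
```

Given `{"id":"sb-1","name":"dev","createTime":"2024-01-01T00:00:00Z"}`, re-encoding through `&narrowEntry{}` returns only `{"id":"sb-1"}` with no error, while `&sandboxEntry{}` preserves `name` and `createTime`.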

Jira: CJ-75152

Co-authored-by: Isaac

Depends on universe#1941694, which makes `handlers::sandbox::list` pass
through pagination instead of collapsing to one response (default
`page_size=100`, clamped to `1000`, real `next_page_token` from ESM).

Today `(*lakeboxAPI).list` does a single GET and returns
`result.Sandboxes`. That works only because the manager pre-aggregates;
once the server PR lands, callers that ignore `next_page_token` silently
cap at 100 sandboxes per user.

Loop on the token, asking for `page_size=100&page_token=...` each round.
`doRequest` learned to split a leading `?...` off `path` into `RawQuery`
so it merges with the host's `?o=<wsid>` workspace selector instead of
getting URL-encoded into the path.
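
One way to do this kind of merge with `net/url` is sketched below. `buildURL` is a hypothetical helper, not the actual `doRequest`, and the host and path in the usage note are made up. The point it illustrates: assigning a string containing `?` to `url.URL.Path` gets the `?` escaped as `%3F`, so the query has to be split off and merged into `RawQuery`.

```go
package main

import (
	"net/url"
	"strings"
)

// buildURL joins a host URL (which may already carry query parameters such
// as a workspace selector) with a request path that may carry its own.
func buildURL(host, path string) (string, error) {
	u, err := url.Parse(host)
	if err != nil {
		return "", err
	}

	// Split the query string off the request path, if there is one.
	path, rawQuery, _ := strings.Cut(path, "?")
	extra, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}

	// Start from the host's own parameters and add the request's.
	q := u.Query()
	for k, vs := range extra {
		for _, v := range vs {
			q.Add(k, v)
		}
	}

	u.Path = strings.TrimSuffix(u.Path, "/") + path
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}
```

For example, `buildURL("https://example.test/?o=123", "/sandboxes?page_size=100&page_token=abc")` returns `https://example.test/sandboxes?o=123&page_size=100&page_token=abc`.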

Jira: CJ-75153

Co-authored-by: Isaac