fix(create): clean up orphan vectoria kb on pipeline failure (0.5.2)#5
Merged
Conversation
InitTask's insufficient_credits handler returned a plain fmt.Errorf which cobra mapped to its default exit 1, while the same condition surfacing later in the stream path explicitly os.Exit(5). The documented contract (AGENTS.md, README) is 5 = business failure; agents branching on exit codes had to know about two paths. Caught during real-backend smoke for 0.5.0 (depleted credits during testing made the inconsistency observable). Match the stream-side pattern verbatim: print the i18n message to stderr, os.Exit(5). Add an integration test that mocks /v1/tasks/init returning envelope code 100001 and asserts the binary exits 5.
Before this change, every failed `vk create` invocation left a
freshly-created vectoria knowledgebase orphaned. Real-backend smoke
during 0.5.0 alone accumulated 6+ such kbs from a handful of test
runs; the user's tenant had accrued 424 kbs in total.
Three failure boundaries now clean up the kb the CLI just created:
1. uploadFile / uploadURL — kb cleanup deferred if the subsequent
doc upload or doc-status poll fails inside the helper. The kb
leak window is closed before the function returns ("","",err).
2. RunE — kb cleanup deferred from the moment kbID is known until
InitTask succeeds. Covers script_invalid (returns clerr.Validation
→ defer runs), generic InitTask errors (returns err → defer runs),
and the `--mode script` no-doc reject path.
3. insufficient_credits handler — os.Exit(5) skips defers, so the
same cleanup is invoked inline before os.Exit. The comment in
create.go documents this asymmetry.
Past InitTask success, the backend task owns the kb's lifecycle and
the CLI must not delete it (even on task.failed, stream interrupted,
--async detach, etc.) because the backend may still be reading from
it. The `initTaskClaimed` flag fences this.
Adds client/vectoria.Client.DeleteKB wrapping the backend's existing
DELETE /v1/knowledgebases/{id} endpoint. Best-effort: errors are
swallowed so cleanup hygiene doesn't mask the real failure.
Extends the 0.5.1 integration test to drive the upload path and
assert DELETE /v1/knowledgebases/{id} fires after the 100001
rejection.
…ence Apply findings from /simplify review on the orphan-kb cleanup: - cmd/create.go: add 5s context timeout to every DeleteKB cleanup callsite. Previously context.Background() had no deadline; a hung backend during cleanup would silently stall the user, especially on the synchronous inline path before os.Exit(5). - cmd/create.go: replace the standalone `initTaskClaimed bool` flag with `task != nil` (declaring `task` earlier so the defer can reference it). Same semantics, one less state variable to reason about. Trim the over-narrated comments around the defer. - tests/integration/video_flow_test.go: introduce `buildProfile(t, endpoints map[string]string)` as the core helper; `buildVideoProfile` becomes a one-line wrapper. Removes 90%-duplicate YAML/file-writing logic that buildProfileFigVect (now deleted) had reinvented. - tests/integration/create_credits_test.go: use buildProfile directly, drop buildProfileFigVect. Tighten DELETE-path assertion to full path equality (loud failure if upload/poll ever mis-routes), add inline comments on the exact-vs-prefix mux pattern. No behavior change; tests still green.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Stacked on #4 (0.5.1). Fixes the orphan-kb leak that has been silently accumulating on every failed
vk createinvocation since this command shipped.Problem:
vk create --from <file>always creates a fresh kb in vectoria, uploads the doc, then callstasks/init. If anything fails between kb creation andtask.succeeded, the kb is left orphaned forever. Real-backend smoke during 0.5.0 alone left 6+ orphans; the user's tenant had accrued 424 kbs in total.Fix: Three failure boundaries now clean up:
uploadFile/uploadURL— kb cleanup deferred if doc upload or status poll fails before the helper returns. Previously the helper returned("", "", err)and the outer caller never knew a kb was orphaned.script_invalid, genericInitTaskerrors, the--mode scriptno-doc reject. Cleared onceInitTaskreturns OK (backend task now owns kb lifecycle).os.Exit(5)— theinsufficient_creditspath needs explicit cleanup becauseos.Exitskips defers. Comment documents this asymmetry.Past
InitTasksuccess — task.failed, stream interrupted,--asyncdetach, etc. — the CLI does not delete the kb. Backend may still be reading.Diff
client/vectoria/knowledgebase.go+ test: newDeleteKB(ctx, kbID)wrapping the existing backend endpoint (probed live, returns 204).cmd/create.go: 3 cleanup boundaries described above +cleanupOrphanKBbest-effort helper.tests/integration/create_credits_test.go: extends the 0.5.1 test to drive the upload path and assertDELETE /v1/knowledgebases/{id}fires.CHANGELOG.md+ version bump 0.5.1 → 0.5.2.Test plan
go test ./...green including extended integration testgo build ./...cleanDELETE /v1/knowledgebases/<orphan-id>verified to work (204 No Content) before writing the fixvk create --mode scriptwith invalid docx, verify kb count inGET /v1/knowledgebasesdoesn't incrementStacking note
Branches off #4 (
fix/credits-exit-code). If #4 merges first, this rebases trivially onto main. If both are open, review #4 first.Out of scope
vk kb list/deleteCLI subcommands; tracked as follow-up.vk kb prunecommand in 0.6.0.