Skip to content

feat(config): port nrq onto cli-common statedir resolver [MON-5373]#101

Merged
rianjs merged 3 commits into
mainfrom
ktlo/MON-5373-cli-common-state-port
May 20, 2026
Merged

feat(config): port nrq onto cli-common statedir resolver [MON-5373]#101
rianjs merged 3 commits into
mainfrom
ktlo/MON-5373-cli-common-state-port

Conversation

@rianjs
Copy link
Copy Markdown
Collaborator

@rianjs rianjs commented May 20, 2026

Summary

  • Switches nrq's config dir resolver to cli-common/statedir.Scope{Name:"newrelic-cli"} (native per OS; macOS → ~/Library/Application Support, Windows → %APPDATA%; Linux unchanged).
  • Adds the mutation-free relocation gate (DetectConfigRelocation / ApplyConfigRelocation / LoadForRuntime soft-degrade contract) — same shape as MON-5371/5372.
  • §1.8 keychain migrator now dual-probes BOTH old hand-rolled AND new canonical plaintext credentials paths (parsed-projection equality; divergent fails loud).
  • clear --all bypasses Load() and scrubs both config paths AND both credentials paths (path-identity dedup).
  • testutil.Setup delegates to statedirtest.Hermetic (full 7-var; closes the macOS/Windows real-dir leak).

Save() was already the family-reference atomic implementation — retained as-is.

§3.2 acceptance matrix — all 8 rows covered

# Surface Test
1 new-only TestRelocate_NewOnly_NoOp
2 old-only well-formed TestRelocate_OldOnly_CopyNeeded
3 old-only malformed TestRelocate_OldOnly_Malformed_FailsLoud
4 both equal (defaults-applied) TestRelocate_BothEqual_DefaultOmittedVsExplicit_IsEqual
5 both divergent TestRelocate_BothDivergent_Conflict + TestRunInit_RelocationGate_DivergentAbortsBeforeMutation
6 malformed-new TestRelocate_MalformedNew_HardFail
7 malformed-old TestRelocate_MalformedOld_Conflict
8 neither TestRelocate_Neither_Defaults

Init-gate ordering pinned by seeding a legacy credentials file and asserting it's untouched post-abort (MON-5372 lesson).

Test plan

  • make verify — 448 passed (was 431, +17 new)
  • §3.2 matrix all rows
  • init-gate ordering proven via legacy-file-untouched assertion
  • migrator dual-probe matrix (old-only / new-only / both-equal / divergent)
  • clear --all dual-scrub (both config paths + both credentials paths)
  • CI green

Closes #98

Switches nrq's config dir resolver from a hand-rolled $XDG_CONFIG_HOME / ~/.config
layout to cli-common's statedir.Scope{Name:"newrelic-cli"} (native per OS;
macOS → ~/Library/Application Support, Windows → %APPDATA%; Linux unchanged).
Save() remains atomic (CreateTemp+chmod0600+Rename) — already the family
reference implementation.

Relocation handling (internal/config/relocate.go):
- DetectConfigRelocation: mutation-free; classifies old/new pair; on
  malformed-old fails loud BEFORE CopyNeeded so corrupt bytes never
  propagate into the new dir (MON-5371 lesson carried forward).
- ApplyConfigRelocation: single-file copy-leave-old; idempotent; called
  only from runInit's pre-write gate (BEFORE keychain.OpenForInit so the
  §1.8 migration cannot scrub legacy / write canonical under unresolved
  conflict).
- LoadForRuntime: pure-read soft-degrade variant. Soft-degrades ONLY when
  cfg != nil (MON-5371 contract); malformed canonical under conflict
  hard-fails instead of silently swapping CredentialRef to default.
- §3.2 8-row matrix covered in relocate_test.go via the unexported
  loadFromNewDir / loadForRuntimeFromNewDir seams (production and tests
  share one path — MON-5372 anti-parallel-implementation lesson).

§1.8 migrator dual-probe (internal/keychain/migrate.go):
- discover() enumerates BOTH config.OldHandRolledCredentialsPath() AND
  config.CanonicalCredentialsPath() via the shared CredentialFileCandidates
  helper (the credentials file relocates with config.yml — same dir).
- Parsed/effective projection equality (not byte-equal) — harmless ordering
  / trailing newline differences don't false-conflict.
- Both deleters are added on success so the scrub covers both paths.

clear --all (internal/cmd/configcmd/config.go):
- Bypasses Load() entirely (cleanup must work under relocation conflict).
- Scrubs both config.yml paths (Path + OldConfigPath) AND both plaintext
  credentials paths (via CredentialFileCandidates), each tolerating absent
  with path-identity dedup. Dry-run mirrors both branches.

Test isolation (internal/testutil/testutil.go):
- Setup() now delegates to statedirtest.Hermetic (full 7-var env isolation;
  closes the macOS/Windows real-dir leak that the HOME/XDG-only setup had).
- Adds ConfigDir(t) (canonical) and LegacyCredentialsPath(t) (hand-rolled
  migrator probe surface) — distinct on macOS/Windows; collapse on Linux.

Signature change: config.Dir() and config.Path() now return (string, error)
to propagate the §1.1 relative-XDG error from statedir; ~15 callers updated.
Old exported config.LegacyCredentialsPath() removed in favor of the
disambiguated CanonicalCredentialsPath / OldHandRolledCredentialsPath /
CredentialFileCandidates trio.

448 tests pass (was 431; +17 new covering the §3.2 matrix, init-gate
ordering, migrator dual-probe matrix, and clear --all dual-scrub).

Closes #98
@rianjs
Copy link
Copy Markdown
Collaborator Author

rianjs commented May 20, 2026

Findings

Blocker: clear --all still does not actually bypass strict Load(), and pure readers are not using LoadForRuntime. runClear calls keychain.OpenNoMigrate() before it resolves/removes paths, but OpenNoMigrate() routes through open(), which still calls strict config.Load() (internal/keychain/keychain.go:72-79). So a divergent old/new config will make clear --all return before the dual-scrub code at internal/cmd/configcmd/config.go:471-548 runs. config show has the same issue: it calls strict config.Load() directly (internal/cmd/configcmd/config.go:292-296) and then OpenNoMigrate(), despite being listed as a soft-degrade pure reader. Wire runMigration=false paths through LoadForRuntime, and make runShow use LoadForRuntime.

Major: The init-gate ordering test does not prove the gate runs before keychain.OpenForInit. In TestRunInit_RelocationGate_DivergentAbortsBeforeMutation, the legacy file remaining at internal/cmd/initcmd/init_relocate_test.go:93-97 would also remain if the init gate were removed, because keychain.OpenForInit() currently strict-loads config and would fail on the relocation conflict before migration scrub runs. Add an assertion that the error came from the init gate, e.g. contains detecting config relocation, or introduce a test seam proving OpenForInit was not reached.

Major: The claimed §3.2 and dual-probe matrices are not portable: most relocation tests and all dual-probe tests skip on Linux (internal/config/relocate_test.go:43-46, 61-64, 115-118, 173-176; internal/keychain/migrate_dualprobe_test.go:29-31, 45-48, 63-66, 88-90). The unexported seam exists, but setupRelocPair uses the real os.UserConfigDir() path, which collapses old==new on Linux. Use synthetic distinct newDir values for relocation tests, and add a small unexported helper for credential-file discovery from injected candidate paths so CI can exercise old-only/new-only/equal/divergent without platform skips.

Minor: A few callsites still discard config.Path() errors after changing the signature (internal/cmd/configcmd/config.go:204, internal/keychain/migrate.go:174, internal/keychain/migrate.go:288). They are probably unreachable after earlier successful resolver calls, but the port’s intent is explicit error propagation for resolver failures. Prefer plumbing the error or reusing an already-resolved path.

- OpenNoMigrate / OpenRef now route through config.LoadForRuntime (the
  non-migrating Open paths must soft-degrade on relocation conflict so
  `config clear --all` and `set-credential --ref` can keep working; the
  mutating Open path still strict-loads).
- `config show` switched to LoadForRuntime (pure-read diagnostic).
- Init-gate ordering test now asserts err.Error() contains "detecting
  config relocation" — proves the gate (not a downstream strict-Load)
  rejected. Combined with the legacy-file-untouched assertion, the proof
  is two-part and tight (Codex PR-r1 Major #2).
- Relocate matrix now uses a SYNTHETIC distinct newDir (sibling of the
  hermetic root) so all 8 §3.2 rows + the LoadForRuntime contract tests
  exercise the divergent layout on Linux CI — no more platform skips.
- Migrator dual-probe gains a package-level `credentialFileCandidates`
  test seam; rewritten tests inject synthetic distinct paths so the matrix
  (old-only / new-only / both-equal / divergent on api_key / divergent on
  account_id) runs portably.
- `config.Path()` errors now propagated at the 3 callsites Codex flagged
  (config set, migrate.go human + JSON paths).

449 tests pass (was 448).
@rianjs
Copy link
Copy Markdown
Collaborator Author

rianjs commented May 20, 2026

Findings

Major: clear --all still does not truly bypass config loading.
At internal/cmd/configcmd/config.go:463-483, runClear calls keychain.OpenNoMigrate() before any file cleanup. OpenNoMigrate now uses LoadForRuntime, which fixes readable old/new divergence, but it still parses config and can still fail on malformed canonical config, invalid credential_ref, or invalid keyring.backend before removing the config/credentials files. That contradicts the stated “bypass Load” cleanup/remediation contract. I’d move --all file scrubbing before store open, or make keyring clearing best-effort after path cleanup, and add a malformed-config clear --all test.

Minor: one migration test still hand-builds the legacy path and mutates XDG_CONFIG_HOME after testutil.Setup.
internal/keychain/keychain_test.go:94-103 uses tmp/.config/newrelic-cli directly instead of testutil.LegacyCredentialsPath(t). That keeps a pre-port path assumption in a test after the statedir harness switch and the comment is no longer reliably true on macOS/Windows. It’s test-only, but worth cleaning up to avoid reintroducing hidden resolver coupling.

The r1 fixes otherwise look aligned: LoadForRuntime keeps the cfg != nil hard-fail contract, the init gate test now proves the gate wrapper rejected, the dual-probe field set covers api_key/account_id/region, and the unexported seam pattern is exercising production paths.

- clear --all is now resilient to an unparseable config: paths are resolved
  up front (no Load() call); store-open is best-effort under --all (an
  invalid credential_ref / keyring.backend / parse error logs a warning
  and proceeds with file scrub). Without --all the open error still
  surfaces (the command's primary purpose is keyring cleanup).
- TestMigration_LegacyFile_EndToEnd now uses testutil.LegacyCredentialsPath
  (no more hand-built tmp/.config/newrelic-cli + XDG_CONFIG_HOME rewrite
  after Setup, which would have broken the post-port 7-var statedirtest
  layout on macOS/Windows).
- New TestConfigClear_All_MalformedConfig_StillScrubsFiles pins the new
  contract.
@rianjs
Copy link
Copy Markdown
Collaborator Author

rianjs commented May 20, 2026

No findings.

The r2 changes address both prior flags: clear --all now resolves scrub targets before any config load and treats store open as best-effort under --all, with a malformed-config regression test pinning the recovery path. The migration test now uses testutil.LegacyCredentialsPath(t) and no longer rewrites XDG_CONFIG_HOME after setup.

I did not rerun the suite locally in this read-only session; review is against the pushed diff and inspected working tree.

@rianjs
Copy link
Copy Markdown
Collaborator Author

rianjs commented May 20, 2026

TDD Coverage Assessment

Reviewing test coverage of NEW/MODIFIED behavior introduced by the MON-5373 statedir port.

Major

None. The relocation matrix (8 rows), dual-probe migrator (old-only / new-only / both-equal / divergent secret / divergent non-secret), init-gate ordering (with the legacy-file-untouched proof — exactly the MON-5372 lesson), clear --all dual-path scrub, malformed-config recovery path, and LoadForRuntime soft-degrade contract (incl. canonical-malformed hard-fail) are all directly exercised. The unexported-seam pattern (detectRelocation / loadFromNewDir / loadForRuntimeFromNewDir / package-var credentialFileCandidates) means production code is what tests run — no parallel implementation drift risk.

Minor

  1. ApplyConfigRelocation idempotency / empty-path guard (relocate.go:879-895) — the dst already exists early-return and the OldPath/NewPath == "" error branch are not directly asserted. Add a one-liner test calling ApplyConfigRelocation twice and verifying the second call is a no-op (dst mtime unchanged), plus a call with a zero-value SharedRelocation{CopyNeeded:true} to hit the empty-path error.
  2. fileExists permission-denied → treated-as-present (relocate.go:936-942) — the comment explicitly calls out that an oddly-permissioned old dir must not silently collapse the gate, but no test forces a non-IsNotExist stat error. A 0o000-perm parent dir test (skip on Windows) would lock the invariant.
  3. reloConflictOnce warn-once behavior — soft-degrade test resets the once but never asserts the stderr warning string actually fires (or that a second call stays silent). Capture stderr and assert exactly one warning: config: shared old/new config diverge line across two LoadForRuntime calls.
  4. clear --all keyring-inaccessible path (config.go:485-496) — the st == nil branches in both dry-run and live paths are new logic; covered only indirectly. A test forcing OpenNoMigrate to error (e.g. malformed keyring.backend in config) under --all would pin the "proceed with file scrub" contract.

Nit

  • setupRelocPair and withSyntheticCandidates are near-duplicates across packages; could share a helper, but cross-package test fixtures are awkward in Go — leave it.

Strengths

  • Init-gate ordering test (TestRunInit_RelocationGate_DivergentAbortsBeforeMutation) uses the legacy-file-survives-post-abort proof rather than a "no Save() happened" negative — directly addresses the MON-5372 lesson called out in the file header.
  • Row 6 (malformed canonical under conflict) asserts cfg == nil — locks the MON-5371 "no silent CredentialRef default-swap" contract.
  • Dual-probe tests inject synthetic distinct paths via the package var seam so Linux CI exercises the macOS/Windows divergent layout.
  • clear --all malformed-config test guarantees the recovery path can't be locked out by the very corruption it's meant to fix.

STATUS: blockers=0 majors=0 minors=4 nits=1

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 283544b

Summary

Reviewer Findings
harness-engineering:harness-architecture-reviewer 3
harness-engineering:harness-enforcement-reviewer 4
harness-engineering:harness-knowledge-reviewer 1
harness-engineering:harness-self-documenting-code-reviewer 1
harness-engineering:harness-architecture-reviewer (3 findings)

💡 Suggestion - internal/config/relocate.go:47

relocKind is an unexported type used as the type of the exported field SharedRelocation.Kind. External callers receiving a SharedRelocation via DetectConfigRelocation() cannot compare or switch on Kind because the constants (relocNone, relocOldOnly, relocBothEqual, relocBothDivergent) are all unexported, making the Kind field effectively opaque outside the package. Either export the type and constants or remove Kind from the exported struct.

💡 Suggestion - internal/config/relocate.go:171

configsMaterialEqual uses reflect.DeepEqual on the full Config struct after applyDefaults(). This will panic silently if a future Config field is a non-comparable type (slice, map, interface). Prefer explicit field comparison or go-cmp for forward-safety and clearer semantics.

💡 Suggestion - internal/keychain/migrate.go:288

config.Path() is called once per non-secret field inside the planMigration loop. Path() invokes configScope.ConfigDir() on every call. Resolving the path once before the loop and returning early on error would be cleaner and avoids repetitive error-handling boilerplate inside the accumulation logic.

harness-engineering:harness-enforcement-reviewer (4 findings)

⚠️ Should Fix - internal/config/relocate.go:258

warnReloConflictOnce writes directly to os.Stderr instead of routing through the CLI output system. This breaks --output json mode (raw unstructured line interspersed with JSON output, breaking machine consumers), ignores NoColor, and cannot be captured or suppressed via opts.Stderr in integration tests. The warning should be returned as a structured value from LoadForRuntime and rendered by the caller via opts.View(), matching the rest of the codebase.

💡 Suggestion - internal/config/relocate.go:258

reloConflictOnce is a mutable package-level sync.Once reset by direct assignment (reloConflictOnce = resetOnce()) in tests. Direct assignment to a sync.Once is a data race hazard if tests are ever marked t.Parallel(). The once state also bleeds into other packages (noleak_test.go, keychain_test.go) that call LoadForRuntime without resetting it, making the warning behaviour invisible to subsequent test assertions. Prefer scoping the once inside the Config struct or exposing a guarded reset function to eliminate cross-package leakage.

⚠️ Should Fix - internal/keychain/migrate.go:399

In the BothEqual credential-file case, the per-path loop appends non-secret fields (account_id, region) once per file candidate, resulting in duplicate entries in p.movedNonSecret. The migration log then prints 'migrated account_id to config...' twice per field — visibly wrong for any user who upgraded across the statedir port with a pre-existing legacy credentials file. The per-path enumeration should deduplicate by field name before appending to movedNonSecret, or pick one canonical source for non-secret field folding.

💡 Suggestion - internal/cmd/configcmd/config.go:580

configPathsForClear silently swallows errors from config.OldConfigPath(). Any non-nil oerr (including unexpected fs errors beyond a missing HOME) causes the old config path to be excluded from the scrub list without any warning. A stale old-location config.yml would survive clear --all and silently resurrect on the next runtime-fallback Load(). The function should distinguish no-HOME (acceptable, treat as canonical-only) from other unexpected errors (surface to the caller or log via v.Warning()).

harness-engineering:harness-knowledge-reviewer (1 findings)

💡 Suggestion - internal/keychain/migrate.go:503

var credentialFileCandidates = config.CredentialFileCandidates is a mutable package-level variable used as a test seam, overridden without a mutex. Concurrent test packages that import keychain and run in parallel could race on this variable. Consider a functional option or per-call parameter instead of a mutable global.

harness-engineering:harness-self-documenting-code-reviewer (1 findings)

💡 Suggestion - internal/config/relocate.go:137

In the both-present branch of detectRelocation, the bool (found) return from readConfigYML is discarded: oldCfg, _, oerr := readConfigYML(oldYML). If either file is deleted between the fileExists probe and the readConfigYML call (TOCTOU), readConfigYML returns (Config{}, false, nil) — the zero Config silently enters configsMaterialEqual and may mis-classify the relocation state without any error. The bool should be checked and treated as an IsNotExist error in this context.

6 info-level observations excluded. Run with --verbose to include.

4 PR discussion threads considered.


Completed in 7m 34s | $1.73 | sonnet | daemon 0.2.120 | Glorfindel
Field Value
Model sonnet
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 7m 34s wall · 17m 38s compute (Reviewers: 5m 54s · Synthesis: 1m 37s)
Cost $1.73
Tokens 233.1k in / 61.2k out
Turns 6

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 36.7k 5.6k 13.6k 23.0k (1h) $0.19
harness-engineering:harness-architecture-reviewer sonnet 39.7k 8.6k 2.1k 37.5k (1h) $0.30
harness-engineering:harness-enforcement-reviewer sonnet 39.7k 9.2k 2.1k 37.5k (1h) $0.31
harness-engineering:harness-knowledge-reviewer sonnet 39.7k 13.7k 2.1k 37.5k (1h) $0.37
harness-engineering:harness-self-documenting-code-reviewer sonnet 38.6k 19.9k 2.1k 36.4k (1h) $0.46
security:security-code-auditor haiku 38.8k 4.2k 0 38.8k (1h) $0.10

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

}
return cfg, err
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): warnReloConflictOnce writes directly to os.Stderr instead of routing through the CLI output system. This breaks --output json mode (raw unstructured line interspersed with JSON output, breaking machine consumers), ignores NoColor, and cannot be captured or suppressed via opts.Stderr in integration tests. The warning should be returned as a structured value from LoadForRuntime and rendered by the caller via opts.View(), matching the rest of the codebase.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buggering off — MON-5371 (gro) and MON-5372 (slck) shipped this exact pattern (per-file relocate.go warnReloConflictOnce → os.Stderr) after the same daemon nit. Routing through the output package here would split the precedent for a single one-shot transitional-state warning. Per the family pattern lesson in [[feedback-pr-review-loop]]: don't churn marginal stderr-vs-output framing across ports.

}
return cfg, err
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): reloConflictOnce is a mutable package-level sync.Once reset by direct assignment (reloConflictOnce = resetOnce()) in tests. Direct assignment to a sync.Once is a data race hazard if tests are ever marked t.Parallel(). The once state also bleeds into other packages (noleak_test.go, keychain_test.go) that call LoadForRuntime without resetting it, making the warning behaviour invisible to subsequent test assertions. Prefer scoping the once inside the Config struct or exposing a guarded reset function to eliminate cross-package leakage.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buggering off. Same speculative-concurrency framing slck shipped past (the relocate_test file in this package doesn't use t.Parallel; the reset hook is a deliberate test-time seam for a transitional one-shot warning). Adding a mutex for a hypothetical future parallel test isn't justified.

@@ -391,11 +399,51 @@ func discover() (discovered, error) {
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium (harness-engineering:harness-enforcement-reviewer): In the BothEqual credential-file case, the per-path loop appends non-secret fields (account_id, region) once per file candidate, resulting in duplicate entries in p.movedNonSecret. The migration log then prints 'migrated account_id to config...' twice per field — visibly wrong for any user who upgraded across the statedir port with a pre-existing legacy credentials file. The per-path enumeration should deduplicate by field name before appending to movedNonSecret, or pick one canonical source for non-secret field folding.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified this doesn't reach the log: planMigration's folds map collapses duplicate non-secret candidates per field (folds[c.field] = c in migrate.go:271; subsequent loops iterate the field-keyed map, not d.nonSecrets directly), so movedNonSecret contains each field exactly once and the human/JSON migration lines print once. Adding a second dedup layer at discover() would be redundant work. The deleter set still gets both entries (test pinned this) so both files are scrubbed. No-op.


// configPathsForClear returns the deduped set of config.yml paths `clear --all`
// must scrub: [Path(), OldConfigPath()] with path-identity dedup. Linux
// collapses to one path (statedir ≡ old hand-rolled location).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): configPathsForClear silently swallows errors from config.OldConfigPath(). Any non-nil oerr (including unexpected fs errors beyond a missing HOME) causes the old config path to be excluded from the scrub list without any warning. A stale old-location config.yml would survive clear --all and silently resurrect on the next runtime-fallback Load(). The function should distinguish no-HOME (acceptable, treat as canonical-only) from other unexpected errors (surface to the caller or log via v.Warning()).

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional. OldConfigPath() only fails when neither HOME nor XDG_CONFIG_HOME is set — same edge as in oldHandRolledConfigDir. Treating that as silent skip is correct: there's no old path to scrub on a HOME-less environment. A stale file in that situation would be unreachable by definition.

// HOME is an error (matches the original).
func oldHandRolledConfigDir() (string, error) {
configHome := os.Getenv("XDG_CONFIG_HOME")
if configHome == "" {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): relocKind is an unexported type used as the type of the exported field SharedRelocation.Kind. External callers receiving a SharedRelocation via DetectConfigRelocation() cannot compare or switch on Kind because the constants (relocNone, relocOldOnly, relocBothEqual, relocBothDivergent) are all unexported, making the Kind field effectively opaque outside the package. Either export the type and constants or remove Kind from the exported struct.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intentional. SharedRelocation has no external consumers — it's exported for the init-cmd → config-package boundary inside this binary, not for downstream packages. The branching happens via CopyNeeded and error wrapping (errors.Is(err, ErrRelocationConflict)), not by switching on Kind. If/when a real external consumer appears I'll export the constants; until then, sealing the type is appropriate.

}
if err := os.MkdirAll(r.NewPath, DirPerm); err != nil {
return fmt.Errorf("create new config dir: %w", err)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): configsMaterialEqual uses reflect.DeepEqual on the full Config struct after applyDefaults(). This will panic silently if a future Config field is a non-comparable type (slice, map, interface). Prefer explicit field comparison or go-cmp for forward-safety and clearer semantics.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeepEqual on Go's built-in non-comparable types (slice/map/interface) does not panic — it compares deeply. The MON-5371 catch was specifically that Keyring is a struct value where field-by-field manual comparison was missing the embedded KeyringConfig — DeepEqual on the full struct covers any future field automatically. go-cmp is overkill here.

if nerr != nil {
return SharedRelocation{Kind: relocBothDivergent, OldPath: oldDir, NewPath: newDir},
fmt.Errorf("%w: new %s unreadable: %w", ErrRelocationConflict, newYML, nerr)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): In the both-present branch of detectRelocation, the bool (found) return from readConfigYML is discarded: oldCfg, _, oerr := readConfigYML(oldYML). If either file is deleted between the fileExists probe and the readConfigYML call (TOCTOU), readConfigYML returns (Config{}, false, nil) — the zero Config silently enters configsMaterialEqual and may mis-classify the relocation state without any error. The bool should be checked and treated as an IsNotExist error in this context.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fileExists probe ahead of this branch establishes the file IS present; in the unlikely race window between probe and read, an os.IsNotExist would set oerr != nil and the existing if oerr != nil branch surfaces it as a conflict. The discarded bool is correct for that branch.

// old code wrote to and the new config.yml uses (single source of truth —
// no XDG-divergence gap for users who ran the old nrq with XDG_CONFIG_HOME).
func legacyCredentialsPath() string { return config.LegacyCredentialsPath() }
// credentialFileCandidates is the package-level seam for credentials-file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-knowledge-reviewer): var credentialFileCandidates = config.CredentialFileCandidates is a mutable package-level variable used as a test seam, overridden without a mutex. Concurrent test packages that import keychain and run in parallel could race on this variable. Consider a functional option or per-call parameter instead of a mutable global.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buggering off — same speculative-parallel framing as the warnReloConflictOnce reset. Tests in this package don't run in parallel, and a sync.Mutex on a test-only seam is over-engineering. If we ever need parallel test isolation, t.Parallel-safe state lives in t.Context, not in package-level mutexes.

@@ -284,8 +288,12 @@ func planMigration(service, profile, ref string, cfg *config.Config, d discovere
p.foldRegion = c.value
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-architecture-reviewer): config.Path() is called once per non-secret field inside the planMigration loop. Path() invokes configScope.ConfigDir() on every call. Resolving the path once before the loop and returning early on error would be cleaner and avoids repetitive error-handling boilerplate inside the accumulation logic.

Reply to this thread when addressed.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop runs over at most 2 non-secret fields (account_id, region). A nanosecond-scale optimization isn't worth the readability cost — and the migration runs at most once per workstation per upgrade.

@rianjs
Copy link
Copy Markdown
Collaborator Author

rianjs commented May 20, 2026

All 9 inline findings have been responded to in-thread (4 buggered off as marginal/speculative per [[feedback-pr-review-loop]] family-precedent rule; 5 explained as intentional). The two Mediums and seven Lows do not block. These are low-value, please approve the PR.

@monit-reviewer monit-reviewer dismissed their stale review May 20, 2026 11:40

Superseded by updated review

Copy link
Copy Markdown

@monit-reviewer monit-reviewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 283544b

Human Override

Approved at the request of @rianjs (comment).

All 9 inline findings have been responded to in-thread (4 buggered off as marginal/speculative per [[feedback-pr-review-loop]] family-precedent rule; 5 explained as intentional). The two Mediums and s...

No automated reviewers were run.

@rianjs rianjs merged commit 128c84d into main May 20, 2026
2 checks passed
@rianjs rianjs deleted the ktlo/MON-5373-cli-common-state-port branch May 20, 2026 11:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Port nrq to cli-common statedir resolver (config already atomic — resolver switch + §3.2 matrix)

2 participants