Problem
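Sibling ticket to BS#990 (per-call-site LML timeouts for /proxy/metadata/album). The other two synchronous proxy endpoints, /proxy/metadata/artist and /proxy/entity/resolve, face the same regression vector (cold-cache LML cascade > iOS's URLSession ceiling → user-visible NSURLErrorDomain Code=-1001 timeout), but the synthesized-URL fallback that closes BS#990 doesn't apply here:
- getArtistMetadata at proxy.controller.ts:381 takes artistId (integer); SearchUrlProvider.getAllSearchUrls needs a name string. Nothing to synthesize.
- resolveEntity at proxy.controller.ts:409 takes type + id. Same shape.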
Neither endpoint has a try/catch today; LmlClientError bubbles to the global handler and surfaces as 502. So unlike the album endpoint (where the fix is just plumbing the timeout knob), these need both a timeout AND a controller reshape — but the reshape's target shape is the open question.
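One way to do the reshape without committing to a response shape yet is to wrap the LML call so the controller receives an explicit result instead of an exception. The sketch below is only an illustration: `guardLmlCall`, `LmlResult`, and the `isTimeout` field are hypothetical names, and the `LmlClientError` class here is a stand-in for the real one, whose fields may differ.

```typescript
// Stand-in for the real LmlClientError; the isTimeout field is an assumption.
class LmlClientError extends Error {
  isTimeout: boolean;
  constructor(message: string, isTimeout: boolean) {
    super(message);
    this.name = "LmlClientError";
    this.isTimeout = isTimeout;
  }
}

// Checks the error name rather than instanceof so the guard still works
// when the class is compiled down to an ES5 target.
function isLmlClientError(err: unknown): err is LmlClientError {
  return err instanceof Error && err.name === "LmlClientError";
}

// Hypothetical result type: the controller decides later how each failure
// reason maps onto the response shape chosen for this ticket.
type LmlResult<T> =
  | { ok: true; value: T }
  | { ok: false; reason: "timeout" | "upstream_error" };

async function guardLmlCall<T>(call: () => Promise<T>): Promise<LmlResult<T>> {
  try {
    return { ok: true, value: await call() };
  } catch (err) {
    if (isLmlClientError(err)) {
      return { ok: false, reason: err.isTimeout ? "timeout" : "upstream_error" };
    }
    // Anything that is not an LML failure still reaches the global handler.
    throw err;
  }
}
```

With a wrapper like this, picking between options (a) to (e) becomes a small mapping from `reason` to a status and body, rather than a change to the error-handling structure.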
Design question (must decide before implementing)
When LmlClient times out from a synchronous proxy endpoint that can't fall back to synthesized search URLs, what does iOS see?
Candidate shapes:
| Option | Status | Body | iOS UX |
| --- | --- | --- | --- |
| (a) empty / null body | 200 | { discogsArtistId: null, bio: null, ... } | Listener renders empty bio + no avatar; reads next time may succeed |
| (b) explicit fallback flag | 200 | { ..., _lookupFailed: true } | Same UX but client can show retry affordance |
| (c) 504 + Retry-After hint | 504 | { error: "lml_timeout", retryAfterMs: 30000 } | Listener handles 5xx explicitly; can retry or show error |
| (d) keep current bubble-to-502 | 502 | (global error shape) | Today's behavior; iOS shows generic "couldn't load" |
| (e) hybrid | 200 if any partial result, 504 otherwise | varies | Most listener-friendly; most controller complexity |

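To make the trade-offs between the candidate shapes concrete, here is a rough TypeScript stand-in for the decision logic a client would need. It is a sketch under assumptions: the real client is the iOS app, the `ClientOutcome` and `interpret` names are hypothetical, and only the fields shown in the candidate shapes are modeled.

```typescript
// Fields come from the candidate shapes; everything else is illustrative.
interface ArtistMetadata {
  discogsArtistId: number | null;
  bio: string | null;
  _lookupFailed?: boolean; // present only under option (b)
}

interface LmlTimeoutBody {
  error: "lml_timeout";
  retryAfterMs: number;
}

type ClientOutcome =
  | { kind: "data"; metadata: ArtistMetadata }
  | { kind: "empty" } // (a): "unknown artist" and "timed out" look identical
  | { kind: "lookupFailed" } // (b): client can offer a retry affordance
  | { kind: "retryLater"; afterMs: number } // (c)
  | { kind: "genericError"; status: number }; // (d) and any other failure

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

function isTimeoutBody(v: unknown): v is LmlTimeoutBody {
  return isRecord(v) && v.error === "lml_timeout" && typeof v.retryAfterMs === "number";
}

function isArtistMetadata(v: unknown): v is ArtistMetadata {
  return isRecord(v) && "discogsArtistId" in v && "bio" in v;
}

function interpret(status: number, body: unknown): ClientOutcome {
  if (status === 504 && isTimeoutBody(body)) {
    return { kind: "retryLater", afterMs: body.retryAfterMs };
  }
  if (status !== 200 || !isArtistMetadata(body)) {
    return { kind: "genericError", status };
  }
  if (body._lookupFailed === true) {
    return { kind: "lookupFailed" };
  }
  if (body.discogsArtistId === null && body.bio === null) {
    return { kind: "empty" };
  }
  return { kind: "data", metadata: body };
}
```

The `empty` branch is where option (a) loses information: nothing in the body tells the client whether a retry is worthwhile.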
Inputs needed before picking:
- Current iOS-side handling of 200-with-empty vs 504 vs 502: check wxyc-ios-64 for getArtistMetadata / resolveEntity callers and what expectations they encode.
- Whether the client wants to differentiate "no data exists" (genuinely unknown artist) from "we tried but LML didn't answer in time" — (b) or (c) make that explicit; (a) collapses them.
The right shape isn't obvious from the BS side alone; this needs an iOS-side coordination pass.
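Where
- apps/backend/services/lml/lml.client.ts:55, the global TIMEOUT_MS. Same change as BS#990 (default + per-call override). If BS#990 lands first, this ticket inherits the knob.
- apps/backend/controllers/proxy.controller.ts:381 (getArtistMetadata): currently no try/catch around the LML call.
- apps/backend/controllers/proxy.controller.ts:409 (resolveEntity): currently no try/catch around the LML call.
- wxyc-ios-64 proxy callers: pin which expectation the chosen shape needs to match.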
Constraints
- Same as BS#990's project-board placement: not a feature add within Epic A's blast radius — defense-in-depth tuning of an existing client-side budget, not a behavior change to the LML cascade itself. Cleared against the "stop adding features in the touched files" rule of project #32.
- Decide before implementing: file a comment on this ticket documenting the chosen option (a/b/c/d/e) with the iOS-side context, then write the PR against that decision. Don't ship the controller-reshape PR without that comment landed.
- Keep the timeout value coordinated with BS#990 (start at 8s; if BS#990 measures something else, match).
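The "default + per-call override" knob could look roughly like the following. This is a minimal sketch, not the actual LmlClient code: `withLmlTimeout`, `effectiveTimeoutMs`, and `LmlTimeoutError` are hypothetical names, and the only values taken from this ticket are the 30s global ceiling and the 8s starting point.

```typescript
// Values from this ticket: 30s global ceiling (BS#971), 8s proxy starting point.
const DEFAULT_TIMEOUT_MS = 30_000;
const PROXY_TIMEOUT_MS = 8_000;

// Hypothetical error type; the real client may signal timeouts differently.
class LmlTimeoutError extends Error {
  constructor(timeoutMs: number) {
    super(`LML call exceeded ${timeoutMs}ms`);
    this.name = "LmlTimeoutError";
  }
}

// A per-call override wins; otherwise fall back to the global default.
function effectiveTimeoutMs(perCallMs?: number): number {
  return perCallMs ?? DEFAULT_TIMEOUT_MS;
}

async function withLmlTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  perCallMs?: number,
): Promise<T> {
  const timeoutMs = effectiveTimeoutMs(perCallMs);
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      // Reject first so callers see the timeout error, then cancel the work.
      reject(new LmlTimeoutError(timeoutMs));
      controller.abort();
    }, timeoutMs);
  });
  try {
    return await Promise.race([work(controller.signal), deadline]);
  } finally {
    // Always clear the timer so a fast response leaves nothing pending.
    clearTimeout(timer);
  }
}
```

A proxy endpoint would pass `PROXY_TIMEOUT_MS` as the second argument and hand the `signal` to its HTTP call, while every other caller keeps the 30s default by omitting it.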
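Acceptance criteria
- LmlClient.lookupArtist (and any other entry point used by these endpoints) accepts the per-call timeoutMs parameter (inherits from BS#990 if landed first; otherwise add it here).
- /proxy/metadata/artist and /proxy/entity/resolve cap LML calls at the chosen timeout.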
Related
- Sibling: BS#990, the same regression for /proxy/metadata/album, but with a clean synthesized-URL fallback. Implement-then-measure independently of this ticket.
- Parent / context: BS#873 (closed) — the original cold-cascade incident; full prod measurements live there.
- Predecessor PR: BS#971 — raised the global timeout to 30s.
- Upstream fix that would make this unnecessary: LML#338 (Epic A in project #32). When LML's cold-cascade lands in <5s, the 30s ceiling is fine everywhere.
- Trigger event: discogs-etl#223 (closed) — Railway PG image swap dropped LML's connection pool; cold cache → user-visible timeouts.