release: v0.6.0 — RAG singleton, retry/backoff, parallel polish, on-disk polish cache by silversurfer562 · Pull Request #7 · Smart-AI-Memory/attune-author

silversurfer562 · 2026-05-01T09:20:37Z

Summary

Four performance and resilience improvements on the polish/RAG path, plus a new cache CLI subcommand for managing the on-disk cache this release introduces.

- rag_hook: process-level RagPipeline singleton (thread-safe via threading.Lock + double-checked locking). Corpus loads once per process instead of once per template kind; a measurable win on --all-kinds runs.
- doc_gen/_anthropic: retry with 1s / 2s / 4s exponential backoff for 429, 529, and APIConnectionError. Non-retryable SDK errors raise immediately. Credential redaction and __cause__ stripping preserved.
- generator: three-phase render → polish → write. Polish is now ThreadPoolExecutor-parallel (max 4 workers), so wall-clock time drops to roughly the slowest single LLM call instead of the sum (with more than 4 templates, the extra calls queue behind the 4 workers).
- polish: on-disk cache at ~/.attune/polish_cache/ (overridable via env). Key includes content + source_summary + template_type + system_prompt + augmented_context + model, so any input or model change invalidates entries. _cache_get bumps mtime on hit so heat is observed reliably even on noatime mounts. Lazy mtime-based prune (default TTL 30d, env-tunable, 0 disables) piggybacked on _cache_put. Manual nuke via attune-author cache clear.

Why

Re-runs of attune-author regenerate --all-kinds were paying full LLM cost even on unchanged source. The cache, plus parallel polish, plus prompt caching downstream in attune-rag#3, bring the bill and the wall-clock down by an order of magnitude on the warm path. The retry/backoff and singleton are robustness wins independent of cost.

New CLI surface

attune-author cache clear    # delete every cached polish entry

New env vars

| Var | Default | Purpose |
| --- | --- | --- |
| `ATTUNE_AUTHOR_POLISH_CACHE` | `~/.attune/polish_cache` | Cache directory override |
| `ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS` | `2592000` (30d) | Mtime TTL; `0` disables prune |
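To make the interaction between these two variables and the cache concrete, here is a rough sketch of a sha256-keyed file cache with an mtime bump on hit and a lazy prune on write. It is an illustration under assumptions, not the PR's implementation: the function names (`cache_dir`, `cache_ttl_seconds`, `cache_key`, `cache_get`, `cache_put`, `cache_prune`), the one-file-per-key layout, the length-prefixed hashing, and the treatment of negative TTLs as invalid are all hypothetical.

```python
import hashlib
import os
import time
from pathlib import Path
from typing import Optional

DEFAULT_TTL_SECONDS = 30 * 24 * 60 * 60  # 2592000 seconds, i.e. 30 days


def cache_dir() -> Path:
    """Resolve the cache directory, honouring the env override."""
    override = os.environ.get("ATTUNE_AUTHOR_POLISH_CACHE")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".attune" / "polish_cache"


def cache_ttl_seconds() -> int:
    """Read the TTL; unparsable or negative values fall back to the default (assumption)."""
    raw = os.environ.get("ATTUNE_AUTHOR_POLISH_CACHE_TTL_SECONDS")
    try:
        ttl = int(raw) if raw is not None else DEFAULT_TTL_SECONDS
    except ValueError:
        return DEFAULT_TTL_SECONDS
    return ttl if ttl >= 0 else DEFAULT_TTL_SECONDS


def cache_key(*parts: str) -> str:
    """Hash every input; length prefixes keep the field boundaries unambiguous."""
    digest = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def cache_get(key: str) -> Optional[str]:
    """Return the cached text, or None on a miss."""
    path = cache_dir() / key
    try:
        text = path.read_text(encoding="utf-8")
        os.utime(path, None)  # bump mtime so a hit marks the entry as recently used
    except OSError:
        return None
    return text


def cache_prune(now: Optional[float] = None) -> int:
    """Delete entries whose mtime is older than the TTL; a TTL of 0 disables pruning."""
    ttl = cache_ttl_seconds()
    directory = cache_dir()
    if ttl == 0 or not directory.is_dir():
        return 0
    cutoff = (time.time() if now is None else now) - ttl
    removed = 0
    for entry in directory.iterdir():
        try:
            if entry.is_file() and entry.stat().st_mtime < cutoff:
                entry.unlink()
                removed += 1
        except OSError:
            continue  # entry vanished or is unreadable; skip it
    return removed


def cache_put(key: str, text: str) -> None:
    """Store an entry, then prune lazily as a side effect of the write."""
    directory = cache_dir()
    directory.mkdir(parents=True, exist_ok=True)
    (directory / key).write_text(text, encoding="utf-8")
    cache_prune()
```

Relying on mtime rather than atime means the "recently used" signal does not depend on how the filesystem is mounted, which is the point made about noatime mounts.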

Version

0.5.1 → 0.6.0 (minor — additive but new behavior + new subcommand).

Dependencies

- Allows but doesn't require attune-rag#3 (>=0.1.0,<0.2 pin satisfies 0.1.10).
- Allows but doesn't require attune-help#4 (>=0.10.0 pin satisfies 0.10.0).

Test plan

- tests/test_polish_cache.py (new, 12 tests): hit/miss, mtime bump on hit, model in key, prune by mtime, TTL=0 disables, invalid TTL falls back to default, clear_cache, end-to-end polish_template cache hit skips LLM
- tests/test_anthropic_retry.py (new, 9 tests): 429/529/APIConnectionError retries, exponential schedule, gives up after _MAX_RETRIES, non-retryable raises immediately, credential redaction, __cause__ stripped
- tests/conftest.py: autouse fixture resets RagPipeline singleton between tests
- Full suite: 518 passed, 37 skipped (was 497, +21 new tests)
- Smoke-tested attune-author cache clear against the live cache (deleted 16 entries)
- Smoke-tested TTL prune via _cache_put against a 1-second TTL: hot entries survive, expired entries die
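As an illustration of how a retry schedule like the one exercised by tests/test_anthropic_retry.py can be checked without real waiting, here is a hedged sketch with an injectable sleep function. It does not use the Anthropic SDK: `StatusError` and `ConnectionFailed` are hypothetical stand-ins for the SDK's error types, `call_with_backoff` is an invented name, and reading "up to 3 retries" as four total attempts is an assumption. Credential redaction and `__cause__` stripping are deliberately left out.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

_MAX_RETRIES = 3  # retries after the first attempt (assumed reading of the PR text)
_RETRYABLE_STATUS = {429, 529}


class StatusError(Exception):
    """Hypothetical stand-in for an SDK error that carries an HTTP status code."""

    def __init__(self, status_code: int):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class ConnectionFailed(Exception):
    """Hypothetical stand-in for the SDK's APIConnectionError."""


def _is_retryable(exc: Exception) -> bool:
    if isinstance(exc, ConnectionFailed):
        return True
    return isinstance(exc, StatusError) and exc.status_code in _RETRYABLE_STATUS


def call_with_backoff(
    fn: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying retryable failures with delays of 1s, 2s, then 4s."""
    for attempt in range(_MAX_RETRIES + 1):
        try:
            return fn()
        except Exception as exc:
            # Non-retryable errors, and the final failed attempt, propagate as-is.
            if not _is_retryable(exc) or attempt == _MAX_RETRIES:
                raise
            sleep(2.0 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep=delays.append` in a test records the schedule instead of sleeping, so the sequence of delays can be asserted directly.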

🤖 Generated with Claude Code

…isk polish cache

A four-pronged perf and resilience pass on the polish/RAG path, plus a new "cache" CLI subcommand for the on-disk cache it introduces.

rag_hook: process-level RagPipeline singleton
---------------------------------------------
RagPipeline construction loads the corpus, which is heavy enough that doing it once per template kind (15+ times in --all-kinds runs) was visibly slow. _get_pipeline() now caches the pipeline behind a threading.Lock with double-checked locking, so cost is paid once per process. tests/conftest.py resets the singleton between tests so existing patches still intercept construction.

doc_gen/_anthropic: retry with exponential backoff
--------------------------------------------------
call_anthropic now distinguishes retryable (429, 529, APIConnectionError) from non-retryable SDK errors and retries the former up to 3 times with 1s/2s/4s backoff. Non-retryable errors raise immediately. Credential redaction and __cause__ stripping are preserved.

generator: parallel polish
--------------------------
generate_feature_templates is now a three-phase pipeline: render (sequential, fast), polish (concurrent via ThreadPoolExecutor, max 4 workers), write (sequential, ordered). Saturates LLM-bound wall time for --all-kinds runs while staying under Anthropic rate limits.

polish: on-disk cache with mtime TTL prune + clear
---------------------------------------------------
polish_template now consults a sha256-keyed on-disk cache before calling the LLM. Key includes content + source_summary + template_type + system_prompt + augmented_context + model so any input change invalidates the entry. Default location is ~/.attune/polish_cache/ (overridable via env). _cache_get bumps mtime on hit so the prune sweeper treats hot entries as hot even on noatime mounts. _cache_prune deletes entries older than the TTL (default 30d, env-tunable, 0 disables) and runs lazily piggybacked on _cache_put. clear_cache() is exposed for manual nukes; the new "attune-author cache clear" subcommand calls it.

Tests
-----
- tests/test_polish_cache.py (new, 12 tests): hit/miss, mtime bump on hit, model in key, prune by mtime, TTL=0 disables, invalid TTL falls back, clear_cache, polish_template skips LLM on cache hit.
- tests/test_anthropic_retry.py (new, 9 tests): retries on 429 / 529 / APIConnectionError, exponential schedule, gives up after _MAX_RETRIES, non-retryable raises immediately, credential redaction, __cause__ stripped.

Full suite: 518 passed, 37 skipped (was 497, +21 new tests).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
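The three-phase shape described in the commit message above (sequential render, concurrent polish, ordered write) might look roughly like the sketch below. This is not the body of generate_feature_templates: `run_three_phase` and its `render`, `polish`, and `write` callables are hypothetical placeholders, and only the worker cap of 4 comes from the text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

_MAX_POLISH_WORKERS = 4  # the "max 4 workers" figure stated in the PR


def run_three_phase(
    kinds: Sequence[str],
    render: Callable[[str], str],
    polish: Callable[[str], str],
    write: Callable[[str, str], None],
) -> None:
    """Render sequentially, polish concurrently, then write in the original order."""
    # Phase 1: rendering is local and fast, so a plain loop is enough.
    drafts = [render(kind) for kind in kinds]

    # Phase 2: polish calls are network-bound, so they overlap in a thread pool.
    # Executor.map yields results in input order, whichever call finishes first.
    with ThreadPoolExecutor(max_workers=_MAX_POLISH_WORKERS) as pool:
        polished = list(pool.map(polish, drafts))

    # Phase 3: writes happen one at a time, in the same order as the input kinds.
    for kind, text in zip(kinds, polished):
        write(kind, text)
```

Keeping the write phase sequential means output order stays deterministic even though polish calls complete in arbitrary order; an exception from any polish call surfaces when its result is consumed, before anything is written.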

silversurfer562 merged commit f32efba into main May 1, 2026
12 checks passed

silversurfer562 deleted the release/0.6.0-polish-cache-and-perf branch May 1, 2026 10:07

silversurfer562 merged 1 commit into main from release/0.6.0-polish-cache-and-perf
