feat: multi-provider LLM support (#545–549) #552
Completes the remaining subissues of #542 (multi-provider LLM):

## #546/#547 — Worker agents use LLMProvider abstraction

- Add async_complete() + common exceptions (LLMAuthError, LLMRateLimitError, LLMConnectionError) to LLMProvider ABC in base.py
- Implement async_complete() in AnthropicProvider (via AsyncAnthropic) and OpenAIProvider (via openai.AsyncOpenAI) with error mapping
- Refactor worker_agent.py: remove AsyncAnthropic import, add llm_provider constructor param, use llm_provider.async_complete() in execute_task
- Refactor frontend_worker_agent.py and test_worker_agent.py: same pattern

## #545 — CLI flags --llm-provider and --llm-model

- Add --llm-provider / --llm-model to `cf work start` and `cf work batch run`
- Thread params through execute_agent() in runtime.py
- Add llm_provider/llm_model fields to BatchRun dataclass and start_batch()
- Pass flags through _execute_task_subprocess() subprocess invocations

## #549 — Workspace config llm: block

- Add LLMConfig dataclass to config.py with provider/model/base_url fields
- Add llm: Optional[LLMConfig] = None to EnvironmentConfig
- Update from_dict() to deserialise llm: block
- runtime.py resolves provider with priority: CLI flag > env var > config > default

Priority chain: config file < CODEFRAME_LLM_PROVIDER env < --llm-provider CLI

Closes #545, #546, #547, #549
Part of #542
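The priority chain above can be sketched in a few lines; `resolve_llm_provider` and its parameter names are illustrative, not the actual runtime.py function:

```python
import os

def resolve_llm_provider(cli_flag=None, config_value=None, default="anthropic"):
    """Resolve the provider with priority: CLI flag > env var > config > default.

    Illustrative sketch of the precedence described in the PR body; the real
    resolution lives in runtime.py.
    """
    return (
        cli_flag
        or os.getenv("CODEFRAME_LLM_PROVIDER")
        or config_value
        or default
    )
```

The same ordering applies to the model and base_url fields: each layer only fills in values the higher-priority layers left unset.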
> ⚠️ Caution: Review failed — the pull request was closed or merged during review.

**Walkthrough**

Adds async completion APIs and unified LLM error types; implements provider-based async calls in the Anthropic/OpenAI adapters; refactors agents to use an LLMProvider abstraction; exposes per-run CLI LLM flags and workspace LLM config; updates runtime/conductor wiring and tests.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Agent as Agent / Worker
    participant Runtime as Runtime / Conductor
    participant Provider as LLMProvider
    participant SDK as Async SDK
    participant Parser as Response Parser
    Agent->>Runtime: start/execute_task(..., llm_provider/llm_model)
    Runtime->>Provider: async_complete(messages, purpose, max_tokens, system)
    Provider->>SDK: await messages.create(...) / chat.completions.create(...)
    SDK-->>Provider: raw SDK response
    Provider->>Parser: _parse_response(raw)
    Parser-->>Provider: LLMResponse (content, usage)
    Provider-->>Agent: return LLMResponse
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 5 passed
## Code Review — PR #552: Multi-provider LLM support (#545–549)

This is a well-structured refactor that successfully decouples agent code from the Anthropic SDK. The provider abstraction, exception hierarchy, config layering, and test updates are all correct in intent. A few issues need attention before merge.

| Issue | Severity |
|---|---|
| `get_event_loop()` → `get_running_loop()` | High — deprecated, can fail in Python 3.10+ |
| Timeout removed from LLM calls | Medium — potential hang under slow/unresponsive providers |
| Fragile `hasattr` guard for `_async_client` | Low — cleanup |
| Missing fallback in frontend/test agents | Low — document the intent clearly |
| Stale test name | Low — cosmetic |

Fix the `get_running_loop()` issue and add back timeout protection before merging.
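The two high-severity findings can be addressed together; a minimal sketch, assuming a wrapper around the provider call (the helper name and the 120 s default are illustrative, not project API):

```python
import asyncio

async def call_with_timeout(coro, timeout_s: float = 120.0):
    """Run an awaitable with timeout protection (illustrative default)."""
    # get_running_loop() raises RuntimeError when there is no running loop,
    # instead of implicitly creating one like the deprecated get_event_loop().
    loop = asyncio.get_running_loop()
    start = loop.time()
    # wait_for cancels the call if the provider hangs past the deadline.
    result = await asyncio.wait_for(coro, timeout=timeout_s)
    return result, loop.time() - start

async def _fake_llm_call():
    # Stand-in for provider.async_complete(...)
    await asyncio.sleep(0)
    return "ok"

result, elapsed = asyncio.run(call_with_timeout(_fake_llm_call()))
```

A slow call then surfaces as `asyncio.TimeoutError` rather than hanging the agent indefinitely.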
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
tests/agents/test_frontend_worker_agent.py (1)

186-198: ⚠️ Potential issue | 🟡 Minor — Make the fallback deterministic.

`_generate_react_component()` now goes through `self.llm_provider.async_complete()`, so `frontend_agent.client = None` no longer triggers the fallback path. As written, this can pass without ever exercising the exception handling the test is named for.

🧪 Suggested change

```diff
 async def test_generate_component_api_fallback(self, frontend_agent):
     """Test component generation falls back on API failure."""
-    # Set client to None to trigger fallback
-    frontend_agent.client = None
+    failing_provider = Mock()
+    failing_provider.async_complete = AsyncMock(side_effect=RuntimeError("boom"))
+    frontend_agent._llm_provider = failing_provider
     spec = {"name": "FallbackComponent", "description": "Component with API failure"}
     code = await frontend_agent._generate_react_component(spec)
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/agents/test_frontend_worker_agent.py` around lines 186 - 198, The test test_generate_component_api_fallback is no longer exercising the fallback because setting frontend_agent.client = None doesn't make _generate_react_component() raise; instead force the LLM call to fail by making frontend_agent.llm_provider.async_complete raise an exception (e.g., set a side_effect or patch it to raise) so _generate_react_component(spec) enters the exception/fallback branch and returns the basic template; update the test to patch/mock frontend_agent.llm_provider.async_complete (or the relevant async method) to throw an error before calling _generate_react_component.

codeframe/cli/app.py (1)
2393-2404: ⚠️ Potential issue | 🔴 Critical — Provider-specific auth is still hardcoded to Anthropic.

Both preflight checks call `require_anthropic_api_key()` for every non-external engine before the LLM provider is resolved. That means `cf work start --execute --llm-provider openai ...` and the batch equivalent still fail unless an Anthropic key is set, which breaks the new multi-provider path. Validate credentials against the resolved provider (CLI > env > config), not just the engine type.

Also applies to: 3765-3775

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/cli/app.py` around lines 2393 - 2404, The preflight auth checks hardcode Anthropic for any non-external engine; instead resolve the LLM provider the same way the CLI does (check CLI flag, then env, then config) and call the corresponding validator (require_openai_api_key, require_e2b_api_key, require_anthropic_api_key) based on that resolved provider; update the execute preflight block (the code around the engine check and is_external_engine(...) usage) and the similar block at the other location (lines ~3765-3775) to compute provider = resolve_llm_provider(...) and switch on provider to invoke the right require_* validator rather than always calling require_anthropic_api_key for non-external engines.

codeframe/adapters/llm/base.py (1)
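The dispatch suggested above can be sketched with a provider-to-validator table; `preflight_credentials` is a hypothetical helper standing in for the `require_*` functions the review names:

```python
def preflight_credentials(provider: str, validators: dict) -> None:
    """Validate credentials for the resolved provider (illustrative sketch).

    validators maps a provider name to a zero-arg callable that raises on
    missing/invalid credentials, mirroring the require_* helpers.
    """
    validator = validators.get(provider)
    if validator is None:
        raise ValueError(f"Unknown LLM provider: {provider}")
    validator()

# Example wiring with stub validators standing in for the real checks.
checked: list[str] = []
validators = {
    "anthropic": lambda: checked.append("anthropic"),
    "openai": lambda: checked.append("openai"),
}
preflight_credentials("openai", validators)
```

The key point is that the table is keyed by the *resolved* provider (CLI > env > config), so an OpenAI run never demands an Anthropic key.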
45-50: ⚠️ Potential issue | 🟠 Major — Update the GENERATION default model.

`DEFAULT_GENERATION_MODEL` is still `claude-3-5-haiku-20241022`, so callers using the shared selector won't follow the repository's required GENERATION mapping.

As per coding guidelines, `codeframe/adapters/llm/**/*.py`: Model selection must follow the pattern: PLANNING → claude-sonnet-4-20250514, EXECUTION → claude-sonnet-4-20250514, GENERATION → claude-haiku-4-20250514.

♻️ Suggested change

```diff
-DEFAULT_GENERATION_MODEL = "claude-3-5-haiku-20241022"
+DEFAULT_GENERATION_MODEL = "claude-haiku-4-20250514"
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/adapters/llm/base.py` around lines 45 - 50, The DEFAULT_GENERATION_MODEL constant is out of date; update DEFAULT_GENERATION_MODEL in the adapter constants (alongside DEFAULT_PLANNING_MODEL, DEFAULT_EXECUTION_MODEL, DEFAULT_CORRECTION_MODEL, DEFAULT_SUPERVISION_MODEL) to use the required generation mapping "claude-haiku-4-20250514" so that callers using the shared selector follow the repository's GENERATION model policy; locate and replace the current value of DEFAULT_GENERATION_MODEL with "claude-haiku-4-20250514".

codeframe/agents/frontend_worker_agent.py (1)
33-63: ⚠️ Potential issue | 🟠 Major — `FrontendWorkerAgent` still has no public provider injection path.

This refactor switched generation to `self.llm_provider`, but the constructor still doesn't accept/forward `llm_provider`. Callers and tests now have to mutate `_llm_provider` privately to control the backend, and `api_key` is only retained as an attribute here. Please expose provider injection in `__init__`, and if `api_key` remains supported, consume it during provider creation instead of just storing it.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/agents/frontend_worker_agent.py` around lines 33 - 63, The FrontendWorkerAgent constructor lacks a public injection for the LLM provider and currently only stores api_key; update __init__ to accept an llm_provider parameter (or provider_config) and use it to set up self.llm_provider instead of requiring callers to mutate _llm_provider privately, and if api_key is still supported consume it when constructing the provider; specifically, add an llm_provider argument to FrontendWorkerAgent.__init__, pass/forward that into the provider creation logic used by the base class or set self.llm_provider appropriately (using api_key to initialize the provider if provided), and remove the expectation that callers must mutate _llm_provider directly.

codeframe/core/conductor.py (1)
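The suggested constructor shape can be sketched as follows; the class name, `_default_provider`, and the dict-based provider are simplified stand-ins, not the real agent or factory:

```python
from typing import Optional

class FrontendWorkerAgentSketch:
    """Illustrative constructor with public provider injection."""

    def __init__(self, agent_id: str, api_key: Optional[str] = None, llm_provider=None):
        self.agent_id = agent_id
        # An injected provider wins; otherwise build one, consuming api_key
        # during construction instead of merely storing it.
        self._llm_provider = llm_provider or self._default_provider(api_key)

    @staticmethod
    def _default_provider(api_key: Optional[str]):
        # Placeholder for get_provider(...) / AnthropicProvider(api_key=...)
        return {"type": "anthropic", "api_key": api_key}

    @property
    def llm_provider(self):
        return self._llm_provider
```

Tests can then pass a mock provider through the public parameter rather than reaching into `_llm_provider`.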
2174-2194: ⚠️ Potential issue | 🟠 Major — Persist `llm_provider`/`llm_model` with the batch record.

`BatchRun` now carries these fields, but `_save_batch()` still writes only the old schema and `_row_to_batch()` still reconstructs only the old fields. After any DB round-trip, most notably `resume_batch()`, the override is lost and tasks rerun with env/default selection instead.

Also applies to: 2201-2216

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/core/conductor.py` around lines 2174 - 2194, The DB schema persistence and reconstruction for BatchRun miss the new llm_provider and llm_model fields: update _save_batch() to include llm_provider and llm_model in the INSERT OR REPLACE INTO batch_runs columns and values (add corresponding entries to the parameter tuple using batch.llm_provider and batch.llm_model) and update _row_to_batch() to read these two columns from the cursor row and set them on the reconstructed BatchRun (or pass them into the BatchRun constructor), ensuring resumed batches retain their overridden provider/model; also apply the same additions for the duplicate block referenced around lines 2201–2216.

codeframe/agents/test_worker_agent.py (1)
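The round-trip fix amounts to adding the two columns to both the write and read paths; an in-memory SQLite sketch with the schema simplified to the relevant columns (the table and column names mirror the review, not the actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batch_runs ("
    "batch_id TEXT PRIMARY KEY, status TEXT, "
    "llm_provider TEXT, llm_model TEXT)"
)
# Write path: _save_batch() must include the new fields in the tuple.
conn.execute(
    "INSERT OR REPLACE INTO batch_runs (batch_id, status, llm_provider, llm_model) "
    "VALUES (?, ?, ?, ?)",
    ("b1", "running", "openai", "gpt-4o"),
)
# Read path: _row_to_batch() must read them back so resume_batch() keeps the override.
row = conn.execute(
    "SELECT llm_provider, llm_model FROM batch_runs WHERE batch_id = ?", ("b1",)
).fetchone()
```

Without the read-path half, the write is harmless but useless: the resumed batch still falls back to env/default selection.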
39-70: ⚠️ Potential issue | 🟠 Major — Expose `llm_provider` on this subclass too.

`WorkerAgent` now supports provider injection, but `TestWorkerAgent.__init__()` still only accepts `provider`/`api_key` and never forwards an injected provider. That forces callers to mutate `_llm_provider` or rely on env state, which defeats the new multi-provider path.

Suggested change

```diff
-from codeframe.adapters.llm.base import Purpose
+from codeframe.adapters.llm.base import Purpose, LLMProvider
 ...
     def __init__(
         self,
         agent_id: str,
         provider: str = "anthropic",
         maturity: AgentMaturity = AgentMaturity.D1,
         api_key: Optional[str] = None,
+        llm_provider: Optional[LLMProvider] = None,
         websocket_manager=None,
         max_correction_attempts: int = 3,
         db=None,
     ):
     ...
         super().__init__(
             agent_id=agent_id,
             agent_type="test",
             provider=provider,
             maturity=maturity,
             system_prompt=self._build_system_prompt(),
             db=db,
+            llm_provider=llm_provider,
         )
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@codeframe/agents/test_worker_agent.py` around lines 39 - 70, TestWorkerAgent.__init__ should accept and forward an injected llm_provider so callers can use multi-provider injection: add a new parameter llm_provider: Optional[...] = None to the TestWorkerAgent.__init__ signature (keep existing provider and api_key for backwards compatibility), pass that through to super().__init__ as llm_provider=llm_provider (instead of relying solely on provider/env), and if necessary set self.llm_provider = llm_provider when provided so the subclass exposes the injected provider; update any docstring/parameter list accordingly.
🧹 Nitpick comments (1)
tests/agents/test_worker_agent.py (1)
424-428: Rename this test to match its actual expectation.

The test currently validates successful execution with injected `MockProvider`, but its name still says `rejects_invalid_format`, which is misleading.

Suggested rename

```diff
-async def test_api_key_validation_rejects_invalid_format(self, db):
+async def test_api_key_validation_ignored_with_injected_provider(self, db):
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/agents/test_worker_agent.py` around lines 424 - 428, The test function currently named something like test_rejects_invalid_format in tests/agents/test_worker_agent.py is misleading because it actually verifies successful execution when a MockProvider is injected; rename the test to a clear name such as test_executes_with_injected_MockProvider or test_accepts_injected_MockProvider_without_key_validation, update the function definition and any references/assert messages to the new name, and run the tests to ensure no other code relies on the old name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@codeframe/adapters/llm/anthropic.py`:
- Around line 123-173: The sync method complete() currently fails to map
Anthropic SDK exceptions to the adapter's base exceptions (AuthenticationError →
LLMAuthError, RateLimitError → LLMRateLimitError, APIConnectionError →
LLMConnectionError) unlike async_complete(), causing inconsistent contracts;
update complete() to import the same SDK exceptions (AuthenticationError,
RateLimitError, APIConnectionError) and wrap its call to
self._client.messages.create(...) in a try/except that catches those three
exceptions and re-raises them as LLMAuthError, LLMRateLimitError, and
LLMConnectionError respectively (preserve original message and use "from exc"),
mirroring the async_complete() behavior and keeping function name complete() and
parsing logic (_parse_response) unchanged.
In `@codeframe/adapters/llm/openai.py`:
- Around line 149-197: The sync path (complete()) currently maps OpenAI SDK
exceptions to ValueError while async_complete() maps them to LLMAuthError /
LLMRateLimitError / LLMConnectionError, causing divergent contracts; update
complete() to import and raise the same adapter exceptions (LLMAuthError,
LLMRateLimitError, LLMConnectionError) when catching
_openai.AuthenticationError, _openai.RateLimitError, and
_openai.APIConnectionError respectively so both complete() and async_complete()
present the same exception types to callers (refer to the methods complete and
async_complete and the exception classes LLMAuthError, LLMRateLimitError,
LLMConnectionError).
In `@codeframe/agents/worker_agent.py`:
- Around line 279-284: The issue is that model_name overrides are not propagated
into the actual LLM call, causing cost/audit/metadata drift; update
_call_llm_with_retry (and the other call sites noted around execute_task) to
pass the model_name override through to self.llm_provider.async_complete (add
the model_name argument to the async_complete call) so the provider uses the
correct model, and ensure execute_task uses that same model_name when computing
cost limits, populating audit logs, and building the returned payload/metadata
(so reported model and cost guardrails match the real request).
- Around line 104-111: The llm_provider property currently ignores the
constructor-provided self.provider when lazily creating the adapter; update the
llm_provider getter so that when self._llm_provider is None it first uses
self.provider (if truthy) as the provider_type, otherwise falls back to
os.getenv("CODEFRAME_LLM_PROVIDER", "anthropic"), and then calls
get_provider(provider_type) to set self._llm_provider; reference the
llm_provider property, self.provider field, and get_provider to locate and
change the logic.
In `@codeframe/core/runtime.py`:
- Around line 672-679: The Anthropic branch ignores llm_model because
provider_kwargs (with model/base_url) are only passed into get_provider for
non-external engines; update the Anthropic path so the constructed provider
receives those kwargs — either by calling get_provider(provider_type,
**provider_kwargs) for provider_type == "anthropic" or by passing
provider_kwargs into the AnthropicProvider constructor used in the else branch;
look for is_external_engine, provider_type, provider_kwargs, get_provider, and
AnthropicProvider and ensure model/base_url from
model_override/base_url_override are forwarded to the provider.
- Around line 643-661: The precedence for resolving base_url is inconsistent:
update base_url_override so environment variables win over workspace config
(match provider_type and model_override) by using the OPENAI_BASE_URL env var
before falling back to _cfg_base_url; locate base_url_override (and related
symbols _cfg_base_url, provider_type, model_override) in runtime.py and swap the
order so the env var is checked first, e.g., use the OPENAI_BASE_URL env value
or else _cfg_base_url.
In `@tests/agents/test_worker_agent_provider.py`:
- Around line 3-6: The test module imports unused symbols causing lint
failures—remove the unused imports MagicMock and patch from the unittest.mock
import, and remove AgentMaturity from the codeframe.core.models import so only
the actually used imports (MockProvider, WorkerAgent, and Task) remain; update
the import lines that contain MagicMock, patch, and AgentMaturity accordingly to
keep the file lint-clean.
In `@tests/agents/test_worker_agent.py`:
- Around line 665-677: The test currently only verifies that a warning was
logged for prompt sanitization but doesn't assert that execution continued;
update the test around agent.execute_task(task) to explicitly assert the
non-blocking outcome by checking the returned result (variable result) or the
expected success indicator (e.g., result.success or result.status ==
"completed") or that the downstream execution method was invoked, using the
existing agent.execute_task and task objects; add a concise assertion after the
logger check to ensure the task completed successfully.
- Around line 19-20: This test module is missing the required v2 pytest marker;
add the marker by either setting module-level pytestmark = pytest.mark.v2 at the
top of the module or decorating the test functions with `@pytest.mark.v2` so the
tests that import MockProvider, LLMRateLimitError, and LLMConnectionError are
correctly marked as v2; ensure you import pytest if adding pytestmark or using
the decorator.
In `@tests/core/test_cli_llm_flags.py`:
- Line 4: The test imports unused mocks: remove the dead imports "patch" and
"MagicMock" from the import statement so the file no longer imports unused
symbols; either delete the entire "from unittest.mock import patch, MagicMock"
line or replace it with only mocks actually used in the file (e.g., remove both
symbols if none are used) to satisfy ruff.
In `@tests/core/test_config_llm.py`:
- Around line 3-4: Remove the unused imports tempfile and Path from the import
block at the top of the test module; locate the import line that currently reads
"import tempfile" and "from pathlib import Path" and delete those import tokens
so the test file no longer imports unused symbols (tempfile, Path) and Ruff/CI
will stop failing.
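The worker_agent.py `llm_provider` property fix described above (constructor value first, env var as fallback) can be sketched as follows; the class and `get_provider` factory are simplified stand-ins for the real code:

```python
import os

def get_provider(provider_type: str):
    """Stand-in for the real adapter factory (hypothetical)."""
    return {"type": provider_type}

class WorkerAgentSketch:
    def __init__(self, provider: str = ""):
        self.provider = provider
        self._llm_provider = None

    @property
    def llm_provider(self):
        # Lazily build the adapter: the constructor-supplied provider wins,
        # and only when absent does the env var (then default) apply.
        if self._llm_provider is None:
            provider_type = self.provider or os.getenv(
                "CODEFRAME_LLM_PROVIDER", "anthropic"
            )
            self._llm_provider = get_provider(provider_type)
        return self._llm_provider
```

This keeps the lazy-construction behavior while ensuring an explicitly constructed agent never silently follows ambient env state.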
📒 Files selected for processing (16)

- codeframe/adapters/llm/anthropic.py
- codeframe/adapters/llm/base.py
- codeframe/adapters/llm/openai.py
- codeframe/agents/frontend_worker_agent.py
- codeframe/agents/test_worker_agent.py
- codeframe/agents/worker_agent.py
- codeframe/cli/app.py
- codeframe/core/conductor.py
- codeframe/core/config.py
- codeframe/core/runtime.py
- tests/adapters/test_llm_async.py
- tests/agents/test_frontend_worker_agent.py
- tests/agents/test_worker_agent.py
- tests/agents/test_worker_agent_provider.py
- tests/core/test_cli_llm_flags.py
- tests/core/test_config_llm.py
Reviewed hunk — `async_complete()` in codeframe/adapters/llm/anthropic.py:

```python
async def async_complete(
    self,
    messages: list[dict],
    purpose: Purpose = Purpose.EXECUTION,
    tools: Optional[list[Tool]] = None,
    max_tokens: int = 4096,
    temperature: float = 0.0,
    system: Optional[str] = None,
) -> LLMResponse:
    """True async completion via AsyncAnthropic.

    Raises LLMAuthError / LLMRateLimitError / LLMConnectionError on failure.
    """
    from anthropic import AsyncAnthropic
    from anthropic import (
        AuthenticationError,
        RateLimitError,
        APIConnectionError,
    )
    from codeframe.adapters.llm.base import (
        LLMAuthError,
        LLMRateLimitError,
        LLMConnectionError,
    )

    if not hasattr(self, "_async_client") or self._async_client is None:
        self._async_client = AsyncAnthropic(api_key=self.api_key)

    model = self.get_model(purpose)
    kwargs: dict = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": self._convert_messages(messages),
    }
    if temperature > 0:
        kwargs["temperature"] = temperature
    if system:
        kwargs["system"] = system
    if tools:
        kwargs["tools"] = self._convert_tools(tools)

    try:
        response = await self._async_client.messages.create(**kwargs)
        return self._parse_response(response)
    except AuthenticationError as exc:
        raise LLMAuthError(str(exc)) from exc
    except RateLimitError as exc:
        raise LLMRateLimitError(str(exc)) from exc
    except APIConnectionError as exc:
        raise LLMConnectionError(str(exc)) from exc
```
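A caller-side sketch of why the uniform error mapping matters: rate limits are transient and retryable, while auth and connection failures should fail fast instead of silently degrading. The exception classes here are local stand-ins for the ones in base.py:

```python
import asyncio

class LLMAuthError(Exception): pass
class LLMRateLimitError(Exception): pass
class LLMConnectionError(Exception): pass

async def classify_failure(call):
    """Map a provider call's outcome to a retry decision (illustrative)."""
    try:
        await call()
        return "ok"
    except LLMRateLimitError:
        return "retry"   # transient: back off and retry
    except (LLMAuthError, LLMConnectionError):
        return "abort"   # fail fast; surface the error to the operator

async def _raises_rate_limit():
    raise LLMRateLimitError("429")

decision = asyncio.run(classify_failure(_raises_rate_limit))
```

With divergent sync/async contracts, the same caller logic would work on the async path but miss SDK-specific exceptions on the sync path, which is exactly the inconsistency the review flags.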
🧩 Analysis chain

🏁 Scripts executed:

```shell
# Check the current state of the complete() method in anthropic.py
cd codeframe/adapters/llm && wc -l anthropic.py

# Get the full anthropic.py to see both complete() and async_complete() methods
cat -n codeframe/adapters/llm/anthropic.py | head -150

# Search for the complete() method to see if it has exception mapping
rg "def complete" codeframe/adapters/llm/anthropic.py -A 30

# Check the base adapter interface to understand the contract
cat codeframe/adapters/llm/base.py

# Search for SupervisorResolver and _classify_with_supervision
fd -t f "*.py" | xargs rg "SupervisorResolver" --files-with-matches

# Look at the SupervisorResolver implementation
rg "class SupervisorResolver" -A 50

# Search for _classify_with_supervision in conductor.py
rg "_classify_with_supervision" codeframe/core/conductor.py -A 20

# Get more context around SupervisorResolver methods
rg "def " codeframe/core/conductor.py | grep -A 5 -B 5 "SupervisorResolver"

# Check what exceptions the base interface expects complete() to handle or raise
rg "def complete" codeframe/adapters/llm/base.py -A 15

# Get the full _classify_with_supervision method to see exception handling
rg "_classify_with_supervision" codeframe/core/conductor.py -A 40
```
Apply the same exception mapping to complete().
The sync complete() method (lines 77–121) does not normalize Anthropic SDK exceptions to the base adapter interface, while async_complete() implements full exception mapping (AuthenticationError → LLMAuthError, RateLimitError → LLMRateLimitError, APIConnectionError → LLMConnectionError).
This creates divergent error contracts between sync and async paths. SupervisorResolver._classify_with_supervision() calls complete() and catches all exceptions broadly, falling back to a heuristic—behavior that masks authentication and rate-limit failures, which should fail fast rather than silently degrade. Per coding guidelines, adapters must implement the base interface uniformly for integration consistency.
Reviewed hunk — `async_complete()` in codeframe/adapters/llm/openai.py:

```python
async def async_complete(
    self,
    messages: list[dict],
    purpose: Purpose = Purpose.EXECUTION,
    tools: Optional[list[Tool]] = None,
    max_tokens: int = 4096,
    temperature: float = 0.0,
    system: Optional[str] = None,
) -> LLMResponse:
    """True async completion via openai.AsyncOpenAI.

    Raises LLMAuthError / LLMRateLimitError / LLMConnectionError on failure.
    """
    import openai as _openai
    from codeframe.adapters.llm.base import (
        LLMAuthError,
        LLMRateLimitError,
        LLMConnectionError,
    )

    if not hasattr(self, "_async_client") or self._async_client is None:
        self._async_client = _openai.AsyncOpenAI(
            api_key=self.api_key, base_url=self.base_url
        )

    converted = self._convert_messages(messages)
    if system:
        converted = [{"role": "system", "content": system}] + converted

    kwargs: dict = {
        "model": self.get_model(purpose),
        "max_tokens": max_tokens,
        "messages": converted,
        "temperature": temperature,
    }
    if tools:
        kwargs["tools"] = self._convert_tools(tools)
        kwargs["tool_choice"] = "auto"

    try:
        response = await self._async_client.chat.completions.create(**kwargs)
        return self._parse_response(response)
    except _openai.AuthenticationError as exc:
        raise LLMAuthError(str(exc)) from exc
    except _openai.RateLimitError as exc:
        raise LLMRateLimitError(str(exc)) from exc
    except _openai.APIConnectionError as exc:
        raise LLMConnectionError(str(exc)) from exc
```
🧩 Analysis chain
🏁 Scripts executed (repository: frankbria/codeframe):

```shell
# First, let's examine the complete() method in openai.py
rg -A 50 "def complete\(" codeframe/adapters/llm/openai.py

# Also check the base adapter interface to understand the contract
cat -n codeframe/adapters/llm/base.py | head -100

# Let's get the full structure of the openai.py file to understand the class
wc -l codeframe/adapters/llm/openai.py

# Let's check the base class structure to see if there are abstract method definitions
rg -B 5 -A 20 "def complete|def async_complete" codeframe/adapters/llm/base.py

# Let's also verify the OpenAIProvider class definition to see if it implements the interface
rg -B 2 "class OpenAIProvider" codeframe/adapters/llm/openai.py

# Let's verify what the LLMProvider base class looks like to confirm the contract
rg -B 5 -A 30 "class LLMProvider" codeframe/adapters/llm/base.py

# Let's check if there are other adapter implementations to see how they handle exceptions
fd "\.py$" codeframe/adapters/llm/ | grep -v base.py | grep -v openai.py

# Check the Anthropic adapter's exception handling for both sync and async
rg -B 5 -A 30 "def complete\(|def async_complete\(" codeframe/adapters/llm/anthropic.py | head -150

# Check the mock adapter as well to understand the pattern
rg -B 5 -A 30 "def complete\(|def async_complete\(" codeframe/adapters/llm/mock.py

# Let's also check if there are any callers of the sync complete() method to understand the impact
rg "\.complete\(" codeframe/ --type py | grep -v "async_complete" | head -20
```
Sync and async paths have divergent exception contracts.
async_complete() raises LLMAuthError, LLMRateLimitError, and LLMConnectionError, but complete() catches the same OpenAI SDK exceptions and converts them to ValueError. Callers using the sync path will handle different exception types than those using the async path, violating the uniform adapter interface requirement stated in the coding guidelines: "Adapters (LLM providers in codeframe/adapters/llm/) must implement the base adapter interface for uniform integration."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/adapters/llm/openai.py` around lines 149 - 197, The sync path
(complete()) currently maps OpenAI SDK exceptions to ValueError while
async_complete() maps them to LLMAuthError / LLMRateLimitError /
LLMConnectionError, causing divergent contracts; update complete() to import and
raise the same adapter exceptions (LLMAuthError, LLMRateLimitError,
LLMConnectionError) when catching _openai.AuthenticationError,
_openai.RateLimitError, and _openai.APIConnectionError respectively so both
complete() and async_complete() present the same exception types to callers
(refer to the methods complete and async_complete and the exception classes
LLMAuthError, LLMRateLimitError, LLMConnectionError).
```python
@property
def llm_provider(self) -> LLMProvider:
    """Return the LLM provider, initialising from env vars if not supplied."""
    if self._llm_provider is None:
        import os
        provider_type = os.getenv("CODEFRAME_LLM_PROVIDER", "anthropic")
        self._llm_provider = get_provider(provider_type)
    return self._llm_provider
```
Honor the constructor provider when lazily creating the adapter.
When _llm_provider is absent, this property ignores self.provider and resolves only from CODEFRAME_LLM_PROVIDER or "anthropic". That makes WorkerAgent(provider="openai") instantiate the wrong backend unless the env var happens to match.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/agents/worker_agent.py` around lines 104 - 111, The llm_provider
property currently ignores the constructor-provided self.provider when lazily
creating the adapter; update the llm_provider getter so that when
self._llm_provider is None it first uses self.provider (if truthy) as the
provider_type, otherwise falls back to os.getenv("CODEFRAME_LLM_PROVIDER",
"anthropic"), and then calls get_provider(provider_type) to set
self._llm_provider; reference the llm_provider property, self.provider field,
and get_provider to locate and change the logic.
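A minimal sketch of the corrected resolution order — constructor argument first, env var as fallback. Names mirror the review's description, but `get_provider` is stubbed with a string so the snippet runs standalone:

```python
import os

def get_provider(provider_type):
    # Stand-in for codeframe's real factory; returns the name for illustration.
    return provider_type

class Agent:
    def __init__(self, provider=None):
        self.provider = provider
        self._llm_provider = None

    @property
    def llm_provider(self):
        if self._llm_provider is None:
            # Constructor argument wins; the env var is only a fallback.
            provider_type = self.provider or os.getenv(
                "CODEFRAME_LLM_PROVIDER", "anthropic"
            )
            self._llm_provider = get_provider(provider_type)
        return self._llm_provider
```

With this ordering, `Agent(provider="openai")` resolves to the OpenAI backend regardless of what `CODEFRAME_LLM_PROVIDER` holds.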
codeframe/agents/worker_agent.py (outdated)

```python
return await self.llm_provider.async_complete(
    messages=messages,
    purpose=Purpose.EXECUTION,
    max_tokens=max_tokens,
    system=system,
    timeout=timeout,
)
```
model_name is now bookkeeping only.
execute_task() still uses model_name for cost limits, audit logs, and the returned payload, but _call_llm_with_retry() never passes that override into async_complete(). The request uses whatever model the provider was constructed with, so cost guardrails and reported metadata can drift from the actual call.
Also applies to: 419-478
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/agents/worker_agent.py` around lines 279 - 284, The issue is that
model_name overrides are not propagated into the actual LLM call, causing
cost/audit/metadata drift; update _call_llm_with_retry (and the other call sites
noted around execute_task) to pass the model_name override through to
self.llm_provider.async_complete (add the model_name argument to the
async_complete call) so the provider uses the correct model, and ensure
execute_task uses that same model_name when computing cost limits, populating
audit logs, and building the returned payload/metadata (so reported model and
cost guardrails match the real request).
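The pass-through the prompt describes can be sketched as follows. This is a hedged illustration with a stub provider — the real `async_complete()` signature lives in the LLMProvider ABC, and `call_llm` stands in for `_call_llm_with_retry`:

```python
import asyncio

class StubProvider:
    """Stand-in provider that reports which model the request actually used."""
    def __init__(self, default_model="provider-default"):
        self.default_model = default_model

    async def async_complete(self, messages, model=None, **kwargs):
        return {"model": model or self.default_model, "content": "ok"}

async def call_llm(provider, messages, model_name=None):
    # Thread the per-task override into the request, so the model used for
    # cost limits and audit logs matches the model in the actual call.
    return await provider.async_complete(messages, model=model_name)

resp = asyncio.run(
    call_llm(StubProvider(), [{"role": "user", "content": "hi"}], model_name="gpt-4o")
)
```

Without the `model=model_name` argument, the provider silently falls back to its construction-time default and the reported metadata drifts from the real request.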
```python
# Resolve LLM provider: CLI flag → env var → workspace config → default "anthropic"
from codeframe.core.config import load_environment_config as _load_cfg
_env_cfg = _load_cfg(workspace.repo_path)
_cfg_provider = _env_cfg.llm.provider if (_env_cfg and _env_cfg.llm) else None
_cfg_model = _env_cfg.llm.model if (_env_cfg and _env_cfg.llm) else None
_cfg_base_url = _env_cfg.llm.base_url if (_env_cfg and _env_cfg.llm) else None

provider_type = (
    llm_provider
    or os.getenv("CODEFRAME_LLM_PROVIDER")
    or _cfg_provider
    or "anthropic"
)
model_override = (
    llm_model
    or os.getenv("CODEFRAME_LLM_MODEL")
    or _cfg_model
)
base_url_override = _cfg_base_url or os.getenv("OPENAI_BASE_URL")
```
Keep base_url precedence consistent with the rest of LLM resolution.
base_url_override = _cfg_base_url or os.getenv("OPENAI_BASE_URL") makes workspace config win over env, while provider_type and model_override use env over config. That blocks env-level overrides for local endpoints.
Suggested change

```diff
-    base_url_override = _cfg_base_url or os.getenv("OPENAI_BASE_URL")
+    base_url_override = os.getenv("OPENAI_BASE_URL") or _cfg_base_url
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@codeframe/core/runtime.py` around lines 643 - 661, The precedence for
resolving base_url is inconsistent: update base_url_override so environment
variables win over workspace config (match provider_type and model_override) by
using the OPENAI_BASE_URL env var before falling back to _cfg_base_url; locate
base_url_override (and related symbols _cfg_base_url, provider_type,
model_override) in runtime.py and swap the order so the env var is checked
first, e.g., use the OPENAI_BASE_URL env value or else _cfg_base_url.
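The env-over-config precedence requested here matches the chain already used for `provider_type` and `model_override`; a single helper makes it hard to invert by accident. A minimal sketch — the helper name is illustrative, only `OPENAI_BASE_URL` comes from the PR:

```python
import os

def resolve(cli_value, env_name, cfg_value, default=None):
    # Lowest → highest precedence: workspace config < env var < CLI flag.
    return cli_value or os.getenv(env_name) or cfg_value or default

# Env var should beat the workspace config for local endpoints.
os.environ["OPENAI_BASE_URL"] = "http://localhost:8000/v1"
base_url = resolve(None, "OPENAI_BASE_URL", "https://cfg.example/v1")
```

Using `resolve()` for all three values (provider, model, base_url) keeps the documented CLI > env > config ordering consistent.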
```python
from codeframe.adapters.llm import MockProvider
from codeframe.adapters.llm.base import LLMRateLimitError, LLMConnectionError
```
Add the required v2 marker to this test module.
This updated test module under tests/** is missing @pytest.mark.v2 (or module-level pytestmark = pytest.mark.v2), which violates the repo test-marking rule.
Suggested patch

```diff
 import pytest
 import os
 from unittest.mock import Mock, AsyncMock, patch
 from codeframe.agents.worker_agent import WorkerAgent
 from codeframe.core.models import Task, AgentMaturity, CallType, TaskStatus
 from codeframe.persistence.database import Database
 from codeframe.adapters.llm import MockProvider
 from codeframe.adapters.llm.base import LLMRateLimitError, LLMConnectionError
+
+pytestmark = pytest.mark.v2
```

As per coding guidelines: tests/**/*.py: new v2 Python tests must be marked with the `@pytest.mark.v2` decorator or `pytestmark = pytest.mark.v2`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/agents/test_worker_agent.py` around lines 19 - 20, This test module is
missing the required v2 pytest marker; add the marker by either setting
module-level pytestmark = pytest.mark.v2 at the top of the module or decorating
the test functions with `@pytest.mark.v2` so the tests that import MockProvider,
LLMRateLimitError, and LLMConnectionError are correctly marked as v2; ensure you
import pytest if adding pytestmark or using the decorator.
```python
# Should log warning but still execute (sanitization is defensive, not blocking)
with patch("codeframe.agents.worker_agent.logger") as mock_logger:
    result = await agent.execute_task(task)

    # Check that warning was logged
    mock_logger.warning.assert_any_call(
        "Potential prompt injection detected",
        extra={
            "event": "prompt_injection_attempt",
            "phrase": "ignore all previous instructions",
            "agent_id": "test-001"
        }
    )
```
Assert the non-blocking outcome explicitly.
Line 665 says sanitization should warn and still execute, but the test only checks logging. Add a completion assertion so the behavior is actually enforced.
Suggested patch

```diff
     with patch("codeframe.agents.worker_agent.logger") as mock_logger:
         result = await agent.execute_task(task)
+        assert result["status"] == "completed"

         # Check that warning was logged
         mock_logger.warning.assert_any_call(
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/agents/test_worker_agent.py` around lines 665 - 677, The test currently
only verifies that a warning was logged for prompt sanitization but doesn't
assert that execution continued; update the test around agent.execute_task(task)
to explicitly assert the non-blocking outcome by checking the returned result
(variable result) or the expected success indicator (e.g., result.success or
result.status == "completed") or that the downstream execution method was
invoked, using the existing agent.execute_task and task objects; add a concise
assertion after the logger check to ensure the task completed successfully.
PR Review: feat: multi-provider LLM support (#545–549)

Solid implementation overall. The abstraction is clean, tests are well-refactored, and the priority chain (CLI > env > config > default) is exactly right. A few issues worth addressing before merge:

Bug: Empty `content` fallback
Follow-up Review — open items from previous reviews

The latest commits update test files, CI config, and pytest.ini. The test refactoring looks clean. However, the three issues flagged in my prior two reviews are still unresolved in the diff:

Still open: get_event_loop() deprecation

codeframe/adapters/llm/base.py still uses asyncio.get_event_loop() inside an async def method. This is deprecated in Python 3.10+ and can raise RuntimeError in Python 3.12+. Since a running loop is always present inside an async method, change to asyncio.get_running_loop().

Still open: empty content fallback

In codeframe/agents/worker_agent.py the refactored block sets content = response.content and logs a warning if falsy, but no longer assigns content = "" as the original did. Downstream code receives None where it expects a string. Fix: content = response.content or ""

Still open: timeout regression needs a comment

The CRITICAL-1 adaptive timeout has been removed with no replacement or tracking note. The TimeoutError in the retry decorator is now dead weight for the WorkerAgent path. If intentionally deferred, add a comment referencing the tracking issue so it does not look accidental.

The lifecycle test updates and pytestmark additions in the latest push look correct. The two blockers above (get_running_loop and content or "") are one-liner fixes worth landing before merge.
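The default `async_complete()` pattern discussed here — wrapping the sync `complete()` in the running loop's thread executor via `asyncio.get_running_loop()` — can be sketched as below. Method names follow the PR description; the bodies are illustrative stand-ins, not the actual base.py code:

```python
import asyncio
from functools import partial

class LLMProvider:
    """Sketch of the ABC's default async path (illustrative, not the real class)."""
    def complete(self, messages, **kwargs):
        raise NotImplementedError

    async def async_complete(self, messages, **kwargs):
        # get_running_loop() is safe here: an async def always executes
        # inside a running loop, so no deprecated get_event_loop() needed.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(
            None, partial(self.complete, messages, **kwargs)
        )

class EchoProvider(LLMProvider):
    def complete(self, messages, **kwargs):
        # Trivial sync implementation: echo the last user message.
        return messages[-1]["content"]

out = asyncio.run(EchoProvider().async_complete([{"role": "user", "content": "ping"}]))
```

Providers with native async SDKs (AsyncAnthropic, openai.AsyncOpenAI) override `async_complete()` directly; everyone else inherits this executor-based default.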
- base.py: use asyncio.get_running_loop() (not deprecated get_event_loop()), move asyncio import to top-level
- worker_agent.py: content = response.content or "" (prevent None downstream), restore CRITICAL-1 adaptive timeout via asyncio.wait_for()
- anthropic.py + openai.py: declare _async_client=None in __init__, simplify lazy-init guard (hasattr not needed)
- test_worker_agent.py: match async_complete() signatures in error mock subclasses; rename stale test to test_injected_provider_bypasses_key_validation
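The `asyncio.wait_for()` approach named in the commit can be sketched like this. The wrapper name and timeout value are illustrative assumptions, not the actual worker_agent.py code:

```python
import asyncio

async def complete_with_timeout(coro_factory, timeout_s=60.0):
    """Bound a provider call so a hung HTTP request cannot stall a task forever."""
    try:
        return await asyncio.wait_for(coro_factory(), timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        # Surface a plain TimeoutError so the retry decorator can act on it.
        raise TimeoutError(f"LLM call exceeded {timeout_s}s") from exc

async def fast_call():
    await asyncio.sleep(0)
    return "done"

result = asyncio.run(complete_with_timeout(fast_call, timeout_s=1.0))
```

This keeps the TimeoutError path in the retry decorator meaningful again: a hung request is cancelled and converted into a retryable failure instead of blocking indefinitely.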
Review — Multi-Provider LLM Support

Good architectural work overall. CodeRabbit's review already covers the main structural issues (hardcoded Anthropic preflight auth, missing DB persistence of

Here are a few additional issues I found that weren't flagged:

1.

2. Timeout regression — tasks can now hang indefinitely

The original

The stall-detector operates at a coarser granularity and won't catch a hung HTTP request mid-flight. At minimum the timeout should be reinstated — either as an optional

3. Both

4.

Minor:

The
Follow-up Review — final open items

Good progress on this PR. The main blockers from my previous reviews have all been addressed:

Two items from my last review are still open:

1.
```python
@property
def llm_provider(self) -> LLMProvider:
    if self._llm_provider is None:
        provider_type = os.getenv("CODEFRAME_LLM_PROVIDER", "anthropic")
        self._llm_provider = get_provider(provider_type)  # ← model/base_url never applied
    return self._llm_provider
```
```python
model = os.getenv("CODEFRAME_LLM_MODEL")
base_url = os.getenv("OPENAI_BASE_URL")
self._llm_provider = get_provider(
    provider_type,
    **({"model": model} if model else {}),
    **({"base_url": base_url} if base_url else {}),
)
```

2.
```python
agent._llm_provider = ErrorMockProvider(default_response="Task completed")
result = await agent.execute_task(task)
assert result["status"] == "failed"
assert "Failed after 3 retry attempts" in result["output"]
# ← no call_count assertion here
```
3. Both

Overall: The PR is very close to merge-ready. Item 1 is a genuine behavioral gap; items 2 and 3 are lower-priority but should not be forgotten.
Review: feat: multi-provider LLM support (#545–549)

The abstraction is solid — clean exception hierarchy, proper async implementations for both providers, and MockProvider making tests much simpler. The threading of

Bug: In

```python
base_url_override = _cfg_base_url or os.getenv("OPENAI_BASE_URL")
```

This makes the config file win over the env var, contradicting the documented priority (CLI > env > config). Fix:

```python
base_url_override = os.getenv("OPENAI_BASE_URL") or _cfg_base_url
```

Dev artifact committed to the repo
Missing catch-all for base

Fallback behavior change is subtle but works

The old

Minor: lifecycle test comment still references
What's good
The base_url priority inversion and the demo file are the two things that need to land before this ships. Everything else is informational.
Summary
Completes subissues #545–549 of the multi-provider LLM initiative (#542). Issues #543 and #544 (OpenAI adapter and factory) were already merged.
- `worker_agent.py` → `LLMProvider` abstraction (no more `import anthropic`)
- `frontend_worker_agent.py`, `test_worker_agent.py` → same refactor
- `--llm-provider` / `--llm-model` CLI flags on `cf work start` and `cf work batch run`
- `llm:` block in `.codeframe/config.yaml` (provider, model, base_url)

How it works
Provider priority (lowest → highest): config file < `CODEFRAME_LLM_PROVIDER` env < `--llm-provider` CLI

What was added to the LLMProvider ABC
- `LLMError`, `LLMAuthError`, `LLMRateLimitError`, `LLMConnectionError` common exceptions
- `async_complete()` default implementation (thread-executor wrapping `complete()`)
- `AnthropicProvider.async_complete()` — true async via `AsyncAnthropic`
- `OpenAIProvider.async_complete()` — true async via `openai.AsyncOpenAI`

Issue #548 (streaming_chat.py)
Not included in this PR. The issue notes that making streaming abstract is acceptable as a future step; `streaming_chat.py` still uses `AsyncAnthropic` internally but its lazy import is already isolated. Will be tracked separately.

Test plan
- `uv run pytest tests/adapters/test_llm_async.py` — async provider methods (7 new tests)
- `uv run pytest tests/agents/test_worker_agent_provider.py` — WorkerAgent + MockProvider (2 new)
- `uv run pytest tests/core/test_cli_llm_flags.py` — CLI flag presence (4 new)
- `uv run pytest tests/core/test_config_llm.py` — config.yaml `llm:` block (4 new)
- `uv run pytest tests/agents/test_worker_agent.py tests/agents/test_frontend_worker_agent.py` — 57 existing tests pass
- `uv run ruff check .` — no lint errors
Closes #546
Closes #547
Closes #549
Part of #542
Summary by CodeRabbit
New Features
Improvements
Tests
Documentation