refactor: Strangler Stage 1 -- tame worst complexity outliers#1464
Conversation
Tighten Ruff thresholds to Stage 1 targets and refactor all functions that exceed them. All changes are behaviour-preserving. Complexity decomposition (C901/PLR0912/PLR0915): - commands/compile/cli.py: extract _validate_project, _run_validation_mode, _run_watch_mode, _run_compilation (C901 64->~45) - integration/mcp_integrator_install.py: extract _resolve_target_runtimes, _install_self_defined_deps, _print_mcp_summary (C901 78->~45) - integration/mcp_integrator.py: extract _clean_json_mcp_config, _clean_toml_mcp_config, _clean_claude_config (C901 70->~45) - install/validation.py: extract per-host validators (C901 78->~45) - install/phases/resolve.py: extract 6 phase helpers (C901 59->~20) - commands/pack.py: extract _parse_path_overrides, _parse_marketplace_filter (C901 51->~45) Argument reduction (PLR0913): - install/services.py: introduce IntegratorBundle frozen dataclass, replace 6 integrator params with single bundle (18->13 args) - commands/install.py: remove redundant dry_run param (16->15) - install/mcp/command.py: derive verbose/manifest_path from context (17->15) - install/mcp/conflicts.py: merge transport flags into any_transport_flag (16->14) Return reduction (PLR0911): - marketplace/publisher.py: extract _load_consumer_manifest, _check_ref_guards (16->9 returns) - marketplace/semver.py: table-driven _CMP_OPS dispatch (14->11) - policy/policy_checks.py: consolidate tail checks into loop (13->12) Threshold updates: - pyproject.toml: max-complexity=50, max-branches=60, max-statements=200, max-args=15, max-returns=12 - ci.yml: MAX_LINES 2450->2000 Test helpers: - Convert 18-arg _make_dep_ref to **kwargs pass-through in 2 test files Closes #1077 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tage-1-tame-the-worst-outlier-97a585
There was a problem hiding this comment.
Pull request overview
This PR implements “Strangler Stage 1” by refactoring the worst Ruff complexity outliers (complexity/branches/statements/args/returns) into smaller helpers and tightening the repo’s Ruff thresholds + CI file-length gate to prevent regressions.
Changes:
- Refactors several high-complexity functions into smaller private helpers (install/compile/integration/marketplace/policy paths) while aiming to preserve behavior.
- Introduces
IntegratorBundleto reduce integrator plumbing argument counts and updates call sites + tests accordingly. - Tightens Ruff thresholds (PLR/C901) and reduces CI’s max-file-length guardrail (2450 -> 2000).
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/integration/test_data_driven_dispatch.py | Updates dispatch tests to pass integrators via IntegratorBundle. |
| tests/unit/integration/test_command_integrator.py | Migrates command integrator tests to IntegratorBundle calls. |
| tests/unit/install/test_services.py | Updates service tests to pass integrators=IntegratorBundle(...) instead of **integrators. |
| tests/unit/install/test_services_rendering.py | Updates rendering tests to use IntegratorBundle conversion helper. |
| tests/unit/install/test_services_phase3.py | Updates phase3 service tests for new integrator bundling. |
| tests/unit/install/test_services_hook_scope.py | Updates hook-scope tests to use IntegratorBundle. |
| tests/unit/install/test_services_branches.py | Updates branch tests to use IntegratorBundle conversion helper. |
| tests/unit/install/test_mcp_conflicts.py | Adjusts tests for transport-flag conflict validation to use any_transport_flag. |
| tests/unit/commands/test_pack.py | Adds unit tests for new pack CLI parsing helpers. |
| tests/integration/test_install_services_phase3w5.py | Updates integration tests to pass integrators as a bundle. |
| tests/integration/test_install_services_orchestration.py | Updates orchestration integration tests to use IntegratorBundle. |
| tests/integration/test_deps_models_validation.py | Simplifies test helper _make_dep_ref using kwargs + defaults. |
| tests/integration/test_deps_models_phase3c.py | Same _make_dep_ref refactor for phase3c coverage suite. |
| tests/integration/test_coverage_phase4.py | Updates MCP install command invocations after signature simplification. |
| src/apm_cli/policy/policy_checks.py | Consolidates tail checks into a loop to reduce return-count pressure. |
| src/apm_cli/marketplace/semver.py | Replaces chained comparison-operator parsing with table-driven dispatch. |
| src/apm_cli/marketplace/publisher.py | Extracts manifest loading + ref-guard checks into helpers to reduce returns/complexity. |
| src/apm_cli/integration/mcp_integrator.py | Extracts per-runtime stale MCP config cleanup into reusable helpers. |
| src/apm_cli/integration/mcp_integrator_install.py | Splits MCP install into runtime-resolution + self-defined install + summary helpers. |
| src/apm_cli/install/validation.py | Breaks package existence validation into backend-specific helpers (local/virtual/git/api/fallback). |
| src/apm_cli/install/template.py | Updates template integration call sites to pass IntegratorBundle. |
| src/apm_cli/install/services.py | Adds IntegratorBundle dataclass and updates integration entry points to consume it. |
| src/apm_cli/install/phases/resolve.py | Refactors resolve phase into focused helpers and restores --only/intended-key steps as separate functions. |
| src/apm_cli/install/mcp/conflicts.py | Reduces args by collapsing transport flags into any_transport_flag. |
| src/apm_cli/install/mcp/command.py | Derives verbose/manifest_path from logger/apm_dir to reduce args. |
| src/apm_cli/install/drift.py | Updates drift replay integration to pass integrators via IntegratorBundle. |
| src/apm_cli/commands/pack.py | Extracts parsing helpers for --marketplace-path and --marketplace filter. |
| src/apm_cli/commands/install.py | Removes redundant dry_run plumbing (use logger.dry_run) and updates MCP conflict call site. |
| src/apm_cli/commands/compile/cli.py | Decomposes compile command flow into validation/watch/compile helpers to reduce complexity. |
| pyproject.toml | Tightens Stage 1 Ruff thresholds and adds per-file ignore for pack_cmd arg-count. |
| .github/workflows/ci.yml | Tightens MAX_LINES guardrail (2450 -> 2000) as part of Stage 1. |
APM Review Panel:
|
| Persona | B | R | N | Takeaway |
|---|---|---|---|---|
| Python Architect | 0 | 0 | 2 | Sound, principled complexity reduction; extract-helper and frozen-dataclass patterns correctly applied; one nit on IntegratorBundle field typing; thresholds are staged and honest. |
| CLI Logging Expert | 0 | 0 | 2 | All user-visible output paths preserved faithfully; one minor pattern inconsistency in the new TOML stale-cleanup helper; no correctness regression. |
| DevX UX Expert | 0 | 0 | 1 | No user-facing CLI surface changes; --dry-run flag preserved; error messages improved; one nit on transport flag collapse losing per-flag identity. |
| Supply Chain Security Expert | 0 | 0 | 1 | All security-critical paths preserved with correct ordering; file-level PLR0913 suppression on pack.py is the only nit. |
| OSS Growth Hacker | 0 | 2 | 1 | Stage 1 is a net positive for contributor onboarding; max-complexity=50 is still high and Stage 2 work needs a public tracking issue to avoid silent drift. |
| Auth Expert | 0 | 0 | 1 | Pure structural refactor; all auth logic faithfully preserved in extracted functions; one pre-existing private-method coupling nit, not a regression. |
| Test Coverage Expert | 0 | 1 | 2 | All critical-promise surfaces have regression traps at the right tier; recommended: integration-tier smoke for the consolidated policy-check loop. |
B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.
Top 5 follow-ups
- [Test Coverage Expert] Add integration-tier smoke test for the consolidated policy_checks.py loop covering hash-pin, fetch-failure, and mcp check categories through the real install pipeline. -- Surface is tagged secure-by-default and governed-by-policy; missing integration evidence on a governed surface ranks above recommended-tier opinion findings, and without it the next refactor could silently drop a check category with no automated guardrail.
- [OSS Growth Hacker] Open a public Stage 2 tracking issue (max-complexity <= 20, max-branches <= 25) before or immediately after merge, tagged help-wanted or good-first-issue. -- No tracking artifact means the pyproject.toml roadmap comment is the only signal; silent drift from Stage 1 thresholds is the most likely failure mode for a staged refactor program.
- [OSS Growth Hacker] Add a CHANGELOG.md entry under 'Internal / DX' for this PR and add a one-line comment above max-complexity explaining the staged roadmap. -- Contributors and adopters reading the changelog should be able to find this milestone; without it the contributor-friendliness story is invisible externally.
- [Python Architect] Replace Any-typed IntegratorBundle fields with a BaseIntegrator protocol (or concrete imports under TYPE_CHECKING) in Stage 2. -- The frozen dataclass loses most of its static-analysis value with Any fields; this is the right Stage 2 companion to the structural work done here.
- [CLI Logging Expert] Align _clean_toml_mcp_config output contract with sibling _clean_json_mcp_config by adding optional logger/use_rich parameters. -- Asymmetric output contracts in sibling helpers accumulate into inconsistent UX over time; a small fix now prevents a larger cleanup later.
Architecture
classDiagram
direction LR
class IntegratorBundle {
<<ValueObject>>
+prompt Any
+agent Any
+skill Any
+instruction Any
+command Any
+hook Any
}
class integrate_package_primitives {
<<Pure>>
+integrators IntegratorBundle
+targets Any
+force bool
+managed_files Any
+diagnostics DiagnosticCollector
}
class InstallContext {
<<DataObject>>
+existing_lockfile LockFile
+apm_modules_dir Path
+downloader GitHubPackageDownloader
+auth_resolver AuthResolver
+deps_to_install list
+dependency_graph Any
}
class _load_lockfile {
<<IOBoundary>>
+ctx InstallContext
}
class _ensure_modules_dir {
<<IOBoundary>>
+ctx InstallContext
}
class _setup_downloader {
<<IOBoundary>>
+ctx InstallContext
}
class _resolve_dependencies {
<<IOBoundary>>
+ctx InstallContext
}
class _clean_json_mcp_config {
<<Pure>>
+config_path Path
+stale_names set
}
class _clean_toml_mcp_config {
<<Pure>>
+config_path Path
+stale_names set
}
class _clean_claude_config {
<<Pure>>
+config_path Path
+stale_names set
}
class _CMP_OPS {
<<TableDispatch>>
}
integrate_package_primitives *-- IntegratorBundle : receives
integrate_package_primitives ..> InstallContext : reads targets
_load_lockfile ..> InstallContext : mutates
_ensure_modules_dir ..> InstallContext : mutates
_setup_downloader ..> InstallContext : mutates
_resolve_dependencies ..> InstallContext : mutates
class IntegratorBundle:::touched
class _load_lockfile:::touched
class _ensure_modules_dir:::touched
class _setup_downloader:::touched
class _resolve_dependencies:::touched
class _clean_json_mcp_config:::touched
class _clean_toml_mcp_config:::touched
class _clean_claude_config:::touched
class _CMP_OPS:::touched
classDef touched fill:#fff3b0,stroke:#d47600
flowchart TD
A(["apm install CLI entry"]) --> B["run_install_pipeline\ninstall/pipeline.py"]
B --> C["resolve_phase\ninstall/phases/resolve.py"]
C --> C1["_load_lockfile\nreads apm.lock.yaml"]
C --> C2["_ensure_modules_dir\nmkdir apm_modules/"]
C --> C3["_setup_downloader\nAuthResolver + GitHubPackageDownloader"]
C --> C4["_resolve_dependencies\nAPMDependencyResolver"]
C1 & C2 & C3 & C4 --> D["InstallContext populated"]
D --> E["integrate_phase\ninstall/services.py"]
E --> F["integrate_package_primitives\nreceives IntegratorBundle"]
F --> G1["prompt integrator"]
F --> G2["agent integrator"]
F --> G3["skill integrator"]
F --> G4["instruction integrator"]
F --> G5["command integrator"]
F --> G6["hook integrator"]
G1 & G2 & G3 & G4 & G5 & G6 --> H["deployed_files list"]
H --> I["lockfile write"]
B2(["mcp uninstall path"]) --> J["MCPIntegrator.clean\nmcp_integrator.py"]
J --> J1["_clean_json_mcp_config"]
J --> J2["_clean_toml_mcp_config"]
J --> J3["_clean_claude_config"]
B3(["semver constraint check"]) --> K["_satisfies_single\nsemver.py"]
K --> K1["_CMP_OPS table dispatch\nge/gt/le/lt lambdas"]
K --> K2["caret range handler"]
K --> K3["wildcard handler"]
K --> K4["exact match"]
Recommendation
Merge when ready. The panel found no blocking issues and strong test evidence that all security-critical and policy-enforcement paths are preserved. Before or immediately after merge, file two follow-up issues: (1) integration-tier smoke for the consolidated policy_checks.py loop (test-coverage-expert finding, elevated by secure-by-default / governed-by-policy tagging), and (2) Stage 2 complexity roadmap tracking issue tagged help-wanted (oss-growth-hacker). Everything else is a nit addressable in Stage 2 or a future PR.
Full per-persona findings
Python Architect
-
[nit] IntegratorBundle fields typed as Any, losing the static-analysis benefit of the dataclass at
src/apm_cli/install/services.py:43
All six fields (prompt, agent, skill, instruction, command, hook) are typed Any. A frozen dataclass is a value-object pattern -- its primary benefit is discoverability and type-safety at call sites. With Any on every field, mypy/pyright cannot catch incorrect integrator assignments.
Suggested: Introduce a minimal BaseIntegrator protocol or import concrete integrator types under TYPE_CHECKING and annotate each field with its real type, keeping runtime imports lazy. -
[nit] max-complexity=50 / max-branches=60 are Stage 1 but still 3-5x industry defaults at
pyproject.toml:99
The thresholds are honestly labelled Stage 1 with prior values documented. The staged approach is pragmatic and correct. Noting for the Stage 2 planning horizon so targets continue to ratchet downward rather than ossify.
Suggested: File a Stage 2 issue targeting max-complexity <= 20 / max-branches <= 25 (McCabe standard) once the six over-length files are split.
CLI Logging Expert
-
[nit] _clean_toml_mcp_config hardwires _rich_success directly while sibling _clean_json_mcp_config exposes logger/use_rich parameters at
src/apm_cli/integration/mcp_integrator.py
Stale-removal notices from _clean_toml_mcp_config bypass the logger contract entirely and can never be suppressed by a NullCommandLogger or future quiet-mode flag. No regression -- but the extracted helper is the right place to normalise the pattern.
Suggested: Add optional logger and use_rich: bool = True parameters to _clean_toml_mcp_config, matching _clean_json_mcp_config's signature. -
[nit] Gemini stale-removal dead-code _rich_success fallback silently dropped -- behaviour identical but confirm integration test covers gemini stale output with real logger at
src/apm_cli/integration/mcp_integrator.py
The old gemini stale-removal block had a dead-code else: _rich_success(...) branch. The new code silently drops the fallback. Behaviour is identical.
Suggested: No code change needed; confirm an integration test covers the gemini stale-removal output path with a real logger.
DevX UX Expert
- [nit] any_transport_flag collapses three distinct flags into one boolean, losing per-flag identity for future error messaging at
src/apm_cli/commands/install.py
The refactor merges use_ssh, use_https, and allow_protocol_fallback into a single any_transport_flag boolean. No current regression -- but it pre-emptively narrows the recovery message space.
Suggested: If a transport conflict error is ever added, thread the original flag names alongside the boolean, or keep the three booleans inside a small TransportFlags dataclass.
Supply Chain Security Expert
- [nit] File-level PLR0913 suppression for pack.py silences arg-count lint for the entire module at
pyproject.toml
The suppression targets src/apm_cli/commands/pack.py at file scope rather than the specific pack_cmd function. Any future function added to pack.py would inherit the suppression silently.
Suggested: Prefer an inline # noqa: PLR0913 on pack_cmd itself, or add a comment in pyproject.toml explicitly listing the function name.
OSS Growth Hacker
-
[recommended] max-complexity=50 is still twice the industry-standard contributor-friendly threshold of 25 at
pyproject.toml
Projects like httpie, ruff, and uv target C901<=20. Setting 50 as the new ceiling signals 'we tolerate very complex functions' to first-time contributors. Ratcheting to 25 as the Stage 2 target and advertising it in CONTRIBUTING.md would make good-first-issue candidates self-evident.
Suggested: Add a comment above max-complexity explaining the stage roadmap (Stage 1=50, Stage 2=25). -
[recommended] Stage 2 deferred work has no tracking artifact -- risk of silent drift
The PR body says 'File-length reduction to 1400 lines -- tracked for Stage 2' but the PR closes Strangler Stage 1: tame the worst outliers #1077. If no open issue exists for Stage 2, the work will fall through the cracks.
Suggested: Open a follow-up issue referencing this PR before merging, tag it good-first-issue or help-wanted, and link it from the PR description. -
[nit] The PR title and body contain no user-facing story angle for the CHANGELOG
Release notes written 'refactor: Strangler Stage 1' are maintainer-readable, not user-readable.
Suggested: Add a CHANGELOG entry under an 'Internal / DX' section naming the concrete benefit to contributors.
Auth Expert
- [nit] auth_resolver._build_git_env calls a private method across module boundary -- pre-existing, not introduced by this PR at
src/apm_cli/install/validation.py
The extracted _validate_ado_git_package function calls auth_resolver._build_git_env(...) directly. This pre-existed in the monolithic version. Extracting the function makes the coupling more visible.
Suggested: Add a public build_bearer_env(token, host_kind) method to AuthResolver, or document the private usage with a TODO in auth.py.
Doc Writer -- inactive
Pure refactor with no doc-writer surface touched. CONTRIBUTING.md, CHANGELOG.md, and docs/ contain no references to the changed thresholds (MAX_LINES=2450, max-complexity). No README claims, skill prose, or natural-language artifacts are affected by this PR.
Test Coverage Expert
-
[recommended] policy_checks.py loop consolidation: no integration-tier smoke for the consolidated loop at
tests/unit/policy/test_run_dependency_policy_checks.py
20+ unit scenarios cover individual check paths. The PR consolidated these into a unified loop. Grepped tests/integration/ for 'policy_check' and 'run_dependency_policy_checks' -- no integration-tier hit.
Proof (missing at integration-with-fixtures):tests/integration/test_policy_install_e2e.py-- proves: Consolidated policy-check loop catches every check category (hash-pin, fetch-failure, mcp) through the real install pipeline [secure-by-default, governed-by-policy] -
[nit] IntegratorBundle frozen constraint is not explicitly tested (mutation attempt) at
tests/unit/install/test_services.py
test_services.py constructs IntegratorBundle and exercises field access. No test asserts that attempting to mutate a field raises AttributeError. Grepped for 'frozen', 'TypeError', 'immutable' -- no hits.
Proof (missing at unit):tests/unit/install/test_services.py::test_integrator_bundle_is_immutable-- proves: IntegratorBundle fields cannot be mutated after construction (frozen=True contract) [DevX] -
[nit] semver _CMP_OPS dispatch table: all four operators (>=, >, <=, <) exercised by existing tests (affirming) at
tests/unit/marketplace/test_semver.py
test_gte, test_gt, test_lte, test_lt cover all four prefix entries in the dispatch table. No gap found.
Proof (passed, unit):tests/unit/marketplace/test_semver.py::test_gte / test_gt / test_lte / test_lt-- proves: All four comparison operators in _CMP_OPS dispatch table produce correct results [DevX, Vendor-neutral]
This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.
Generated by PR Review Panel for issue #1464 · ● 2.9M · ◷
Copilot findings: - pack.py: remove __error__ sentinel from _parse_marketplace_filter; function now returns None after _emit_json_error_or_raise (which always raises or exits). Fix non-ASCII chars in docstrings. - publisher.py: tighten _check_ref_guards new_sv type from object to SemVer | None with TYPE_CHECKING import. Panel R (recommended): - pyproject.toml: add Stage 2 roadmap comments above thresholds (max-complexity<=20, max-branches<=25 targets). - Add integration regression test for consolidated policy_checks tail_checks loop covering mcp, compilation-target, and explicit-includes categories in a single call. Panel N (nits): - IntegratorBundle: type fields as BaseIntegrator instead of Any, using TYPE_CHECKING import. - _clean_toml_mcp_config: add logger/use_rich params matching _clean_json_mcp_config signature. - Move PLR0913 suppression from file-level pyproject.toml to inline noqa on pack_cmd. - Add IntegratorBundle frozen immutability test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TL;DR
Tighten Ruff complexity thresholds to Stage 1 targets and refactor all 13 functions that exceed them. All changes are behaviour-preserving; 14,987 unit tests pass, 0 failures.
Problem
Issue #1077 identifies functions with excessive cyclomatic complexity, branch counts, statement counts, argument counts, and return counts. The current Ruff thresholds are too permissive, allowing complexity to accumulate unchecked.
Approach
Decompose oversized functions into focused helpers, introduce dataclasses to reduce argument counts, and apply table-driven dispatch to reduce return counts. Then tighten thresholds so new code cannot regress.
Changes
Complexity decomposition (C901/PLR0912/PLR0915)
compile/cli.pymcp_integrator_install.pymcp_integrator.pyvalidation.pyresolve.pypack.pyArgument reduction (PLR0913)
services.pyIntegratorBundlefrozen dataclassinstall.pydry_runparamcommand.pyconflicts.pyReturn reduction (PLR0911)
publisher.pysemver.py_CMP_OPSdispatchpolicy_checks.pyThreshold updates
pyproject.toml: max-complexity=50, max-branches=60, max-statements=200, max-args=15, max-returns=12ci.yml: MAX_LINES 2450 -> 2000Deferred
Validation
Closes #1077