feat: Add maintenance workflow tooling #11
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aintain:test
- maintenance-check.yml: weekly scheduled + manual, runs fetch/diff/audit
- integration-tests.yml: manual dispatch with scope selection, needs secrets
Maintenance check now creates an 'api-drift' labeled issue with the diff report and coverage summary when spec changes are found. Updates the existing issue if one is already open. Closes it when spec matches baseline.
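The create/update/close lifecycle described above can be sketched as a pure decision function (the function and return labels below are illustrative; the actual workflow drives `gh` CLI calls):

```python
def drift_issue_action(spec_changed: bool, open_issue_exists: bool) -> str:
    """Decide what the maintenance workflow should do with the
    'api-drift' issue after comparing the fetched spec to baseline."""
    if spec_changed:
        # Refresh the existing report rather than filing duplicates.
        return "update" if open_issue_exists else "create"
    # Spec matches baseline again: resolve any open drift issue.
    return "close" if open_issue_exists else "noop"
```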
Posts a sticky SDK coverage comment on every PR push showing coverage %, missing endpoints, parameter drift, and stale enums in collapsible sections. Updates the same comment on subsequent pushes instead of creating new ones.
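A minimal sketch of the sticky-comment upsert, assuming the bot comment is tagged with a hidden HTML marker so later runs can find it (the marker string and function name are hypothetical, not the workflow's actual implementation):

```python
MARKER = "<!-- sdk-coverage-report -->"  # hypothetical hidden marker

def upsert_sticky_comment(existing_comments: list, body: str) -> tuple:
    """Return (action, comment_id, tagged_body): ('update', id, ...) if a
    previous bot comment carries the marker, else ('create', None, ...).
    existing_comments is shaped like the GitHub issue-comments API response."""
    tagged = body if MARKER in body else f"{MARKER}\n{body}"
    for comment in existing_comments:
        if MARKER in comment.get("body", ""):
            return ("update", comment["id"], tagged)
    return ("create", None, tagged)
```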
Claude Code requires each skill to be a directory with a SKILL.md file,
not arbitrary .md filenames. Renamed maintain/{check,audit,test}.md to
maintain-{check,audit,test}/SKILL.md.
Excludes tests/, scripts/, specs/, docs/, .claude/, .github/, and dev config files from sdist and wheel builds.
**SDK Coverage Report**

Coverage: 88.3%

- Missing Endpoints (12)
- Parameter Drift (41 operations)
- Stale Enums (5)

*Auto-generated by SDK Coverage Check*
No longer triggers on every PR push. Now requires adding the 'sdk-check' label to a PR, or manual workflow_dispatch with a PR number.
Integration tests require live Etsy API credentials and are intended for local development only, not CI.
Pull request overview
Adds a maintenance toolchain around Etsy’s OpenAPI spec (fetch/diff/audit), plus live integration tests and supporting GitHub Actions workflows to surface SDK drift and coverage information.
Changes:
- Added spec maintenance scripts: fetch latest OAS, generate a structured diff report, and audit SDK coverage (plus PR-comment formatter).
- Added pytest-based live integration tests with session-scoped fixtures, readonly/write markers, and `.env` credential loading.
- Added GitHub Actions workflows for scheduled maintenance checks, manual integration test runs, and PR coverage reporting; updated packaging manifest to exclude maintenance/dev artifacts from PyPI builds.
Reviewed changes
Copilot reviewed 24 out of 27 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_user.py | Adds readonly integration coverage for user endpoints. |
| tests/test_taxonomy.py | Adds readonly integration coverage for taxonomy endpoints. |
| tests/test_shop.py | Adds readonly integration coverage for shop endpoints. |
| tests/test_shipping_profile.py | Adds readonly integration coverage for shipping profile read endpoints. |
| tests/test_receipt.py | Adds readonly integration coverage for receipts/transactions/payments. |
| tests/test_misc.py | Adds readonly integration coverage for ping and token scopes. |
| tests/test_listing.py | Adds readonly listing reads plus write-mode listing CRUD lifecycle test. |
| tests/conftest.py | Provides session-scoped fixtures, env var loading, and EtsyClient construction for tests. |
| specs/.gitkeep | Keeps specs/ directory present in git. |
| scripts/format_pr_comment.py | Condenses audit report into a PR-friendly markdown comment. |
| scripts/fetch_spec.py | Fetches latest Etsy OAS spec and detects baseline drift via exit codes. |
| scripts/diff_spec.py | Produces structured markdown diff between baseline and latest OAS specs. |
| scripts/audit_sdk.py | AST-scans SDK resources/enums/models to estimate OAS-to-SDK coverage and drift. |
| requirements-dev.txt | Adds dev/test dependencies (pytest, python-dotenv). |
| pytest.ini | Configures test discovery and defines readonly/write markers. |
| docs/plans/2026-02-28-maintenance-workflow-plan.md | Implementation plan documenting intended workflow and steps. |
| docs/plans/2026-02-28-maintenance-workflow-design.md | Design doc describing architecture, scripts, and operational flow. |
| MANIFEST.in | Excludes tests/scripts/specs/docs/.claude/.github and other dev files from sdist. |
| CLAUDE.md | Updates maintainer guidance and adds maintenance workflow commands. |
| .gitignore | Ignores ephemeral spec artifacts (latest.json, diff/audit reports). |
| .github/workflows/pr-coverage.yml | Posts/updates a sticky PR comment with SDK coverage report. |
| .github/workflows/maintenance-check.yml | Scheduled/manual drift check that files/updates an api-drift issue when changes are detected. |
| .github/workflows/integration-tests.yml | Manual workflow to run readonly/write/all integration tests using repo secrets. |
| .env.example | Documents required env vars for local integration test execution. |
| .claude/skills/maintain-test/SKILL.md | Adds Claude Code skill to run integration tests. |
| .claude/skills/maintain-check/SKILL.md | Adds Claude Code skill to fetch+diff spec changes. |
| .claude/skills/maintain-audit/SKILL.md | Adds Claude Code skill to run and interpret SDK audit report. |
Comments suppressed due to low confidence (1)
tests/test_listing.py:94
- The CRUD test verifies the delete call returns a `Response` but never asserts the expected HTTP status. `ListingResource.delete_listing()` will typically return 204 (`NO_RESPONSE_CODES` includes 204), so this test can pass even if deletion fails or returns an unexpected code. Add an assertion for the expected delete status (and optionally assert the message for 204 is 'OK').
```python
finally:
    # DELETE (always cleanup)
    delete_response = listing_resource.delete_listing(listing_id)
    assert isinstance(delete_response, Response)
```
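A strengthened cleanup block along the lines the comment suggests might look like this; the `Response` dataclass below is a minimal stand-in for the SDK's real type, assumed to expose `.code` and `.message`:

```python
from dataclasses import dataclass

@dataclass
class Response:  # minimal stand-in for the SDK's Response type
    code: int
    message: object

def assert_deleted(delete_response: Response) -> None:
    """Fail loudly unless the delete actually returned 204."""
    assert isinstance(delete_response, Response)
    assert delete_response.code == 204, f"unexpected delete status: {delete_response.code}"
    # Per the review comment, the SDK reports 'OK' as the message for 204.
    assert delete_response.message == "OK"
```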
```python
def count_items(section_lines: List[str]) -> int:
    """Count bullet items in a section."""
    return sum(1 for line in section_lines if line.startswith("- **"))
```
count_items() is defined but never used. Consider removing it (or using it) to keep the script minimal and avoid implying behavior that doesn't exist.
Suggested change:

```diff
-def count_items(section_lines: List[str]) -> int:
-    """Count bullet items in a section."""
-    return sum(1 for line in section_lines if line.startswith("- **"))
```
> **Goal:** Build on-demand tooling (scripts + Claude Code skills) to detect Etsy API changes, audit SDK coverage, and run live integration tests.
>
> **Architecture:** Three Python scripts (`fetch_spec.py`, `diff_spec.py`, `audit_sdk.py`) provide the core logic. A pytest test suite covers live API integration. Three Claude Code skills (`/maintain:check`, `/maintain:audit`, `/maintain:test`) orchestrate these tools conversationally. A `specs/` directory stores baseline and latest OAS specs.
This plan document references Claude skill commands using /maintain:check (colon syntax), but the PR implements skills as /maintain-check, /maintain-audit, /maintain-test under .claude/skills/maintain-*/SKILL.md. Update the plan to match the actual command names/paths so readers can follow it without confusion.
Suggested change:

```diff
-**Architecture:** Three Python scripts (`fetch_spec.py`, `diff_spec.py`, `audit_sdk.py`) provide the core logic. A pytest test suite covers live API integration. Three Claude Code skills (`/maintain:check`, `/maintain:audit`, `/maintain:test`) orchestrate these tools conversationally. A `specs/` directory stores baseline and latest OAS specs.
+**Architecture:** Three Python scripts (`fetch_spec.py`, `diff_spec.py`, `audit_sdk.py`) provide the core logic. A pytest test suite covers live API integration. Three Claude Code skills (`/maintain-check`, `/maintain-audit`, `/maintain-test`, implemented under `.claude/skills/maintain-*/SKILL.md`) orchestrate these tools conversationally. A `specs/` directory stores baseline and latest OAS specs.
```
> 1. **Detect API changes** — On-demand tooling to fetch the latest Etsy OAS spec and produce a structured diff report against a known baseline.
> 2. **Audit SDK coverage** — Cross-reference the OAS spec against SDK resource/model/enum code to identify gaps, stale parameters, and missing endpoints.
> 3. **Test against live API** — Full CRUD integration test suite hitting a real Etsy test shop to validate the SDK works against the current API.
> 4. **Claude Code integration** — Skills that orchestrate these tools conversationally via `/maintain:check`, `/maintain:audit`, `/maintain:test`.
This design doc describes skills as /maintain:check, /maintain:audit, /maintain:test, but the repository changes add skills named /maintain-check, /maintain-audit, /maintain-test. Please align the design doc’s command names with what’s actually shipped to avoid confusion for maintainers.
```
.claude/skills/maintain/
    check.md   # /maintain:check skill
    audit.md   # /maintain:audit skill
    test.md    # /maintain:test skill
```
The component overview and skill sections list .claude/skills/maintain/check.md etc, but the PR adds .claude/skills/maintain-check/SKILL.md, .claude/skills/maintain-audit/SKILL.md, etc. Update these paths in the design doc so the documentation matches the repository layout.
Suggested change:

```diff
-.claude/skills/maintain/
-    check.md # /maintain:check skill
-    audit.md # /maintain:audit skill
-    test.md  # /maintain:test skill
+.claude/skills/
+    maintain-check/
+        SKILL.md # /maintain:check skill
+    maintain-audit/
+        SKILL.md # /maintain:audit skill
+    maintain-test/
+        SKILL.md # /maintain:test skill
```
tests/test_listing.py (Outdated)
```python
nodes = response.message.get("results", [])
# Pick first leaf node or first node
for node in nodes:
    if node.get("children_ids") == []:
        return node["id"]
return nodes[0]["id"] if nodes else 1
```
taxonomy_id fixture uses response.message.get(...) without first asserting response.code == 200 and that response.message is a dict. If the API returns a non-200 non-exception status (or an unexpected payload), this will raise an AttributeError and obscure the real failure. Assert the response status/type before reading message, and consider skipping the write test if no taxonomy nodes are returned.
Suggested change:

```diff
-nodes = response.message.get("results", [])
-# Pick first leaf node or first node
-for node in nodes:
-    if node.get("children_ids") == []:
-        return node["id"]
-return nodes[0]["id"] if nodes else 1
+assert isinstance(response, Response)
+assert response.code == 200
+assert isinstance(response.message, dict)
+nodes = response.message.get("results") or []
+if not nodes:
+    pytest.skip("No taxonomy nodes available from API; skipping listing write test.")
+# Pick first leaf node or first node
+for node in nodes:
+    if node.get("children_ids") == []:
+        return node["id"]
+return nodes[0]["id"]
```
```python
report_path = project_root / "specs" / "diff-report.md"
with open(report_path, "w", encoding="utf-8") as f:
    f.write(report)
print(f"\nReport saved to {report_path}")
```
This script advertises markdown output to stdout, but it appends a non-markdown status line ("Report saved to ...") to stdout after printing the report. That makes it hard to safely redirect stdout to a markdown file or embed the output elsewhere. Print the save message to stderr (or add a --quiet/--no-save-message flag) so stdout remains the report only.
| print(f"\nReport saved to {report_path}") | |
| print(f"\nReport saved to {report_path}", file=sys.stderr) |
```python
operations = get_operations(spec)
resources = scan_resource_methods(resources_dir)
method_index = build_method_index(resources)
sdk_enums = scan_enum_values(enums_dir)
sdk_models = scan_model_fields(models_dir)
```
sdk_models = scan_model_fields(models_dir) is computed but never used in the report generation, which adds unnecessary work and suggests an incomplete feature (model drift detection). Either remove this scan, or incorporate it into the parameter-drift logic (e.g., compare requestBody properties to model __init__ params) so the report reflects model-level drift.
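One way to put the scan to use, as the comment suggests, is a plain set comparison between requestBody properties and the scanned model fields. The function below is an illustrative sketch with hypothetical names, not the audit script's actual code:

```python
def model_drift(spec_body_props: dict, sdk_models: dict) -> dict:
    """Compare OAS requestBody properties against scanned model __init__
    params. Returns {model_name: {'missing': [...], 'extra': [...]}} for
    models whose fields have drifted from the spec."""
    drift = {}
    for model_name, sdk_fields in sdk_models.items():
        spec_fields = spec_body_props.get(model_name)
        if spec_fields is None:
            continue  # no matching schema in the spec; nothing to compare
        missing = sorted(set(spec_fields) - set(sdk_fields))
        extra = sorted(set(sdk_fields) - set(spec_fields))
        if missing or extra:
            drift[model_name] = {"missing": missing, "extra": extra}
    return drift
```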
scripts/audit_sdk.py (Outdated)
```python
# Try to find matching SDK enum by checking if spec values overlap
best_match = None
best_overlap = 0
for sdk_name, sdk_values in sdk_enums.items():
    sdk_value_set = {str(v).lower() for v in sdk_values}
    spec_value_set = {str(v).lower() for v in spec_values}
    overlap = len(sdk_value_set & spec_value_set)
    if overlap > best_overlap:
        best_overlap = overlap
        best_match = sdk_name

if best_match and best_overlap > 0:
    sdk_value_set = {str(v).lower() for v in sdk_enums[best_match]}
    spec_value_set = {str(v).lower() for v in spec_values}
```
sdk_enum_values_map is built but never referenced. This is dead code and makes it harder to understand the enum matching logic. Remove it, or use it for its intended purpose (e.g., quickly mapping spec values back to candidate SDK enums) to avoid O(N*M) scanning for every spec enum.
Suggested change:

```diff
-# Try to find matching SDK enum by checking if spec values overlap
-best_match = None
-best_overlap = 0
-for sdk_name, sdk_values in sdk_enums.items():
-    sdk_value_set = {str(v).lower() for v in sdk_values}
-    spec_value_set = {str(v).lower() for v in spec_values}
-    overlap = len(sdk_value_set & spec_value_set)
-    if overlap > best_overlap:
-        best_overlap = overlap
-        best_match = sdk_name
-if best_match and best_overlap > 0:
-    sdk_value_set = {str(v).lower() for v in sdk_enums[best_match]}
-    spec_value_set = {str(v).lower() for v in spec_values}
+# Try to find matching SDK enum by checking where spec values appear
+spec_value_set = {str(v).lower() for v in spec_values}
+best_match = None
+best_overlap = 0
+# Count overlaps per SDK enum using the precomputed value -> enums map
+candidate_counts = {}
+for val in spec_value_set:
+    for sdk_enum_name in sdk_enum_values_map.get(val, []):
+        candidate_counts[sdk_enum_name] = candidate_counts.get(sdk_enum_name, 0) + 1
+for sdk_enum_name, overlap in candidate_counts.items():
+    if overlap > best_overlap:
+        best_overlap = overlap
+        best_match = sdk_enum_name
+if best_match and best_overlap > 0:
+    sdk_value_set = {str(v).lower() for v in sdk_enums[best_match]}
```
```python
report_path = project_root / "specs" / "audit-report.md"
with open(report_path, "w", encoding="utf-8") as f:
    f.write(report)
print(f"\nReport saved to {report_path}")
```
Like diff_spec.py, this script prints the markdown report to stdout and then appends a status line ("Report saved to ...") to stdout. That breaks the expectation that stdout is pure markdown (e.g., when redirecting to a file). Send the save message to stderr or make it optional via a flag.
| print(f"\nReport saved to {report_path}") | |
| print(f"\nReport saved to {report_path}", file=sys.stderr) |
Tests required too many OAuth tokens and manual setup. Removed:
- tests/, conftest.py, pytest.ini, .env.example, requirements-dev.txt
- /maintain-test skill
- References in CLAUDE.md and MANIFEST.in

Maintenance scripts (fetch/diff/audit) and workflows are unaffected.
Rewrite the /maintain-audit skill from a simple script-runner to a full pipeline: run audit script, verify findings against actual SDK code and OAS spec, prepare categorized change list, then let user decide whether to implement directly or review items first. Update design doc to match.
Add comprehensive pytest-based unit tests covering all SDK layers:
- Utils (generate_get_uri, todict) and Request model validation
- EtsyClient (HTTP dispatch, token refresh, error handling, rate limits)
- All 25 resource classes with mock session injection
- All request model classes with mandatory/nullable field validation
- Factory functions producing realistic response dicts matching OAS schemas

Also excludes tests from PyPI package via setup.py and MANIFEST.in, and updates CLAUDE.md testing section with new commands.
Audit script improvements:
- Wire up model field comparison (sdk_models was dead code)
- Detect NotImplementedError stubs, exclude from coverage %
- Check __init__.py exports for missing resource classes
- Lint implicit string concatenation via tokenize module
- Improve enum matching (name-based first, overlap threshold)
- Detect deprecation from description text ([DEPRECATED])
- Show request body schema for missing endpoints
- Enhanced coverage summary with stub count and effective %

Split Parameter Drift into Query/Path Drift + Request Body Drift, eliminating 22 false positives from the old combined comparison.

SKILL.md updates:
- Add spec fetch step before audit (fetch_spec.py -> latest.json)
- Rename Phase 2 to AI-Driven Code Review with hands-on code reading
- Add serialization, type mismatch, URL/method correctness checks
- Promote Code Issues to automatic Must Fix in Phase 3
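The implicit-string-concatenation lint mentioned above can be done with the stdlib `tokenize` module by flagging adjacent STRING tokens (usually a missing comma in a list of strings). This is a sketch of the approach, not necessarily the script's exact logic:

```python
import io
import tokenize

def find_implicit_concat(source: str) -> list:
    """Return line numbers where two string literals sit adjacent with no
    operator between them. NL (line breaks inside brackets) and comments
    are skipped so concatenation across lines is still detected."""
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.STRING and prev is not None and prev.type == tokenize.STRING:
            hits.append(tok.start[0])
        if tok.type not in (tokenize.NL, tokenize.COMMENT):
            prev = tok
    return hits
```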
Remove Task 9b (separate /maintain:review skill) and update the audit skill definition to the 4-phase pipeline (run script, verify findings, prepare change list, user decision). Aligns plan with the already implemented changes in SKILL.md and design doc.
…kill

Adds tooling to track and analyze Etsy Open API GitHub releases:
- scripts/check_releases.py: fetches releases, filters new ones, generates report
- specs/last-release-check.json: committed state tracking last checked release
- /maintain-release-check skill: analyzes release notes impact on SDK
- /maintain-audit updated to cross-reference release notes in Phase 2
Phase 1 of /maintain-audit now gathers all fresh data first: fetches the latest OAS spec, checks for new Etsy GitHub release notes, and diffs against baseline — so the audit always runs against current API state. Release notes context is used to prioritize review in Phase 2.

Also fixes stale issues across the repo:
- maintenance-check.yml: fix old /maintain:audit and /maintain:test refs
- Design doc: replace stale integration test listing with actual unit tests, expand audit_sdk.py report sections to all 10, update deps
- CLAUDE.md: add all maintenance scripts to Scripts table, add /maintain-release-check to skills listing
…ation workflow Updated the /maintain-audit skill to include three additional phases: branch setup, implementation with test verification, and post-implementation workflow. The new structure guides users through establishing the correct git branch, implementing changes, running tests, and managing the subsequent steps, making SDK audits an end-to-end, iterative process.
Runs pytest on Python 3.8 and 3.12 on every PR push. Posts a coverage report comment with per-file breakdown, and updates the same comment on subsequent pushes to avoid polluting the PR.
`gh pr comment` requires a git repo checkout but the coverage-report job only downloads the coverage artifact. Switch to `gh api` POST, which works without a local repo context.
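For reference, the REST call that `gh api` wraps needs only an endpoint path and a payload, no checkout; PR conversation comments live on the *issues* endpoint. The helper below is an illustrative sketch of building that request:

```python
def build_pr_comment_request(owner: str, repo: str, pr_number: int, body: str) -> dict:
    """Build the GitHub REST call `gh api` would issue: PR conversation
    comments are created via POST /repos/{owner}/{repo}/issues/{number}/comments,
    which works from any environment with a token, no local git repo needed."""
    return {
        "method": "POST",
        "path": f"/repos/{owner}/{repo}/issues/{pr_number}/comments",
        "payload": {"body": body},
    }
```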
**Test Coverage Report**

Overall: 93% (1525/1635 statements covered)

Coverage by file

*Updated by PR Tests*
Added concurrency control to the PR coverage and test workflows to prevent overlapping runs. Improved error handling in the spec fetching process to use fallback mechanisms when fetching fails. Updated comment handling to first attempt to update existing comments before creating new ones, reducing comment clutter in PRs. Additionally, enhanced type hints in several scripts for better code clarity and maintainability.
…e 2025-10-24)
- Replace implicit **kwargs with explicit query_params dict in make_request
- Fix boolean query parameter serialization (True → true, False → false)
- Add new enums, models, and fields from latest API spec
- Add ProcessingProfile resource
- Add runtime DeprecationWarning for 4 renamed methods
- Fix todict() nullable logic and shipping profile string concatenation bugs
- 219 tests passing
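The boolean serialization fix above amounts to mapping Python bools to lowercase strings before the query is built; a minimal sketch (dropping `None` values is an assumption here, not confirmed by the commit message):

```python
def serialize_query_params(params: dict) -> dict:
    """Serialize query params for the API: Python booleans become the
    lowercase strings 'true'/'false' instead of 'True'/'False', and
    None values are dropped (assumed behavior for unset params)."""
    out = {}
    for key, value in params.items():
        if value is None:
            continue  # unset parameter: omit from the query string
        if isinstance(value, bool):
            out[key] = "true" if value else "false"
        else:
            out[key] = value
    return out
```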
Summary
- `fetch_spec.py` downloads the latest Etsy OAS spec
- `diff_spec.py` produces a structured markdown diff report highlighting new/removed/changed endpoints, schema changes, and deprecations
- `audit_sdk.py` auto-maps OAS operations to SDK methods via AST scanning, reporting coverage % (currently 88.3%), missing endpoints, parameter drift, and stale enums
- `.env`-based credential management
- `/maintain-check`, `/maintain-audit`, `/maintain-test` for interactive maintenance workflows
- `maintenance-check.yml` — Weekly scheduled + manual, runs spec fetch/diff/audit, auto-creates/updates a GitHub issue labeled `api-drift` when changes are detected
- `integration-tests.yml` — Manual dispatch with scope selection (readonly/write/all), uses repository secrets for Etsy API credentials
- `pr-coverage.yml` — Posts a sticky SDK coverage comment on every PR push with progress bar, summary table, and collapsible detail sections
- `MANIFEST.in` to exclude dev/maintenance files from published package

Test plan
- `fetch_spec.py` fetches spec, saves to `specs/latest.json`, detects no-change correctly (exit code 1)
- `diff_spec.py` produces clean "no changes" report when baseline == latest
- `audit_sdk.py` produces full coverage report (88.3%, 91/103 ops mapped)
- `pytest --co -v` collects all 17 tests with no import errors
- `pytest --co -m readonly` selects 16 tests, `-m write` selects 1
- `format_pr_comment.py` produces clean condensed markdown
- `api-drift` label is created on first maintenance-check run
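Per the test plan, `fetch_spec.py` signals "no change" with exit code 1. A hash-comparison sketch of that convention (the helper name is hypothetical, and the 0-means-changed mapping is inferred from the test plan rather than stated outright):

```python
import hashlib

def fetch_exit_code(baseline_bytes: bytes, latest_bytes: bytes) -> int:
    """Mirror the drift-detection convention: exit 1 when the fetched spec
    is byte-identical to the baseline (no change), exit 0 when it differs."""
    same = hashlib.sha256(baseline_bytes).hexdigest() == hashlib.sha256(latest_bytes).hexdigest()
    return 1 if same else 0
```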
fetch_spec.pyfetches spec, saves tospecs/latest.json, detects no-change correctly (exit code 1)diff_spec.pyproduces clean "no changes" report when baseline == latestaudit_sdk.pyproduces full coverage report (88.3%, 91/103 ops mapped)pytest --co -vcollects all 17 tests with no import errorspytest --co -m readonlyselects 16 tests,-m writeselects 1format_pr_comment.pyproduces clean condensed markdownapi-driftlabel is created on first maintenance-check run🤖 Generated with Claude Code