
chore: add constrained Crabbox setup#104

Merged
vincentkoc merged 1 commit into main from chore/setup-baseline-safe-20260523
May 22, 2026

@vincentkoc
Member

Summary

  • Problem: Crabbox setup was missing from the repo baseline, but the previous all-in-one baseline PR was too broad to land as-is.
  • What changed: copied the exact Crabbox skill from openclaw/openclaw, added constrained Crabbox config/workflow, added actionlint runner-label config, added CODEOWNERS coverage for the new setup files, and added package scripts for the copied skill command surface.
  • What this does not cover: no CodeQL, stale automation, licensing, Dependabot, or fixture/report policy changes.

Fixture impact

  • Adds or updates fixture manifest entries
  • Updates submodule pins
  • Changes contract/smoke logic
  • Docs only

No fixture manifests, submodule pins, or contract logic changed.

Verification

  • git diff --check
  • Ruby YAML parse for .crabbox.yaml, .github/actionlint.yaml, and .github/workflows/crabbox-hydrate.yml
  • actionlint -config-file .github/actionlint.yaml .github/workflows/crabbox-hydrate.yml
  • Crabbox skill SHA-256 matched openclaw/openclaw: ed512c0b0385fae7f6c5c14a7e9e6236ab68936506687a99ca976873492bdc43
  • Package script presence check for check:changed, test:changed, and crabbox:*
  • Private-path scan for new public files, excluding existing baseline/report artifacts
  • npm test
  • node scripts/sync-fixtures.mjs --check
  • node scripts/run-contract-smoke.mjs
  • strict fixture smoke, if submodules were materialized
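
Two of the static checks above (the skill SHA-256 match and the package script presence check) can be sketched roughly as follows. This is a minimal sketch, not the commands that were actually run for this PR: the helper names `sha256Hex`, `matchesDigest`, and `missingScripts` are hypothetical, and treating `crabbox:*` as a name prefix is an assumption.

```javascript
// Sketch only: helper names are hypothetical, not part of crabpot or Crabbox.
const { createHash } = require("node:crypto");

// Hex-encoded SHA-256 of a string or Buffer.
function sha256Hex(data) {
  return createHash("sha256").update(data).digest("hex");
}

// True when the content hashes to the expected digest (case-insensitive).
function matchesDigest(data, expectedHex) {
  return sha256Hex(data) === expectedHex.toLowerCase();
}

// Returns the required script names that a package.json "scripts" object
// does not satisfy. A trailing "*" is treated as a prefix match (assumption).
function missingScripts(scripts, required) {
  const names = Object.keys(scripts);
  return required.filter((want) =>
    want.endsWith("*")
      ? !names.some((name) => name.startsWith(want.slice(0, -1)))
      : !names.includes(want)
  );
}
```

Calling `matchesDigest` on the copied skill's bytes with the digest quoted above would reproduce the SHA-256 check only if that digest covers a single file. The PR body does not say which bytes were hashed.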

Notes

npm test was attempted but is not a valid proof in this fresh worktree without materialized plugin submodules: 10 fixture/capture/report tests failed because expected plugin entrypoints and fixture captures were absent. No live Crabbox lease was started for this setup-only patch.

@clawsweeper

clawsweeper Bot commented May 22, 2026

Codex review: found issues before merge.

Latest ClawSweeper review: 2026-05-22 21:44 UTC / May 22, 2026, 5:44 PM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The PR adds a constrained Crabbox setup for crabpot, including a Crabbox skill, .crabbox.yaml, a self-hosted hydrate workflow, actionlint/CODEOWNERS coverage, and package scripts.

Reproducibility: yes, for the review finding. Source inspection of the PR head shows the skill references scripts/crabbox-wrapper.mjs and .github/workflows/ci-check-testbox.yml, while current main has neither path. There is no product bug report to reproduce.

PR rating
Overall: 🦐 gold shrimp
Proof: 🌊 off-meta tidepool
Patch quality: 🦐 gold shrimp
Summary: The patch is plausible infrastructure work, but repo-specific skill correctness and the self-hosted workflow boundary still need resolution.

Rank-up moves:

  • Replace the OpenClaw-only Testbox helper commands with crabpot-valid commands or remove that lane.
  • Have maintainers confirm dispatch permissions and runner isolation for the self-hosted Crabbox hydrate workflow.

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Not applicable: The external-contributor proof gate does not apply to this member-authored setup PR; the body also says no live Crabbox lease was started.

Risk before merge

  • The copied skill still contains OpenClaw-only helper paths and workflow names, so agents following the Blacksmith/Testbox lane in crabpot will fail before reaching Crabbox.
  • The new workflow is a manual workflow_dispatch path on self-hosted runners that checks out an input ref and runs npm install; maintainers should confirm who can dispatch it and what runner isolation/secrets are available.
  • The PR body says no live Crabbox lease was started, so the hydrate and attach loop has static validation but not target-repo smoke proof yet.

Maintainer options:

  1. Fix the repo-specific skill lane first (recommended)
    Adapt or remove the OpenClaw-only Testbox wrapper section, then run static workflow validation plus one maintainer-approved Crabbox hydrate smoke before merge.
  2. Accept as maintainer-only infrastructure
    Maintainers may intentionally accept the self-hosted workflow once they have confirmed dispatch permissions, ephemeral runner isolation, and absence of unintended secrets or shared state.
  3. Pause until the runner boundary is settled
    If the self-hosted workflow policy is not ready, keep the PR open or split the non-runner files from the workflow-bearing change.

Next step before merge
The remaining work combines a narrow skill fix with maintainer approval of a self-hosted runner policy boundary, so it should stay in human review rather than cleanup-close or automated repair.

Security
Needs attention: The diff introduces a workflow_dispatch path on self-hosted runners, so maintainers need to confirm the input-ref and runner-isolation boundary before merge.

Review findings

  • [P2] Replace the OpenClaw-only Testbox helper — .agents/skills/crabbox/SKILL.md:199-203
Review details

Best possible solution:

Land a crabpot-specific Crabbox baseline after replacing OpenClaw-only skill commands with repo-valid commands and after maintainers approve the self-hosted runner dispatch/ref boundary.

Do we have a high-confidence way to reproduce the issue?

Yes for the review finding: source inspection of the PR head shows the skill references scripts/crabbox-wrapper.mjs and .github/workflows/ci-check-testbox.yml, while current main has neither path. There is no product bug report to reproduce.

Is this the best way to solve the issue?

No; copying the OpenClaw skill unchanged is not the best crabpot solution because it leaves non-existent helper paths. The narrower maintainable path is to adapt the skill to the files and workflows this repo actually provides.

Label justifications:

  • P2: This is a normal-priority repository automation/setup PR with limited runtime blast radius but real merge-review work remaining.
  • merge-risk: 🚨 automation: The PR adds repo automation instructions and package scripts, and one copied skill path currently points at helpers that do not exist in crabpot.
  • merge-risk: 🚨 security-boundary: The PR introduces a self-hosted workflow dispatch path that runs an input ref on a runner.
  • rating: 🦐 gold shrimp: Current PR rating is 🦐 gold shrimp because proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp. The patch is plausible infrastructure work, but repo-specific skill correctness and the self-hosted workflow boundary still need resolution.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Real behavior proof is not applicable: the external-contributor proof gate does not apply to this member-authored setup PR, and the body also says no live Crabbox lease was started.

Full review comments:

  • [P2] Replace the OpenClaw-only Testbox helper — .agents/skills/crabbox/SKILL.md:199-203
    This new skill tells agents to run node scripts/crabbox-wrapper.mjs with .github/workflows/ci-check-testbox.yml, but crabpot has neither path on main. Agents following this Blacksmith/Testbox lane will fail before reaching Crabbox, so the copied section needs to use commands that exist here or be removed.
    Confidence: 0.93
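
A check for this kind of finding could look something like the sketch below, which scans skill text for repo-relative script and workflow paths and reports any that are not in a list of known paths. The regex, the function names, and the idea of passing in a path list (for example from `git ls-files`) are all assumptions for illustration, not how ClawSweeper performed its source inspection.

```javascript
// Sketch only: the pattern and function names are illustrative assumptions.

// Collect paths that look like scripts/... or .github/workflows/... files.
function referencedPaths(skillText) {
  const pattern = /(?:scripts|\.github\/workflows)\/[\w.\/-]+\.(?:mjs|js|ya?ml)\b/g;
  // De-duplicate while keeping first-seen order.
  return [...new Set(skillText.match(pattern) ?? [])];
}

// Return referenced paths that are absent from the repo's known paths.
function missingPaths(skillText, existingPaths) {
  const known = new Set(existingPaths);
  return referencedPaths(skillText).filter((p) => !known.has(p));
}
```

Run against the copied skill with the paths present on main, a non-empty result would flag sections such as the Blacksmith/Testbox lane that need adapting or removing.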

Overall correctness: patch is incorrect
Overall confidence: 0.86

Security concerns:

  • [medium] Confirm self-hosted ref execution boundary — .github/workflows/crabbox-hydrate.yml:42
    The hydration workflow checks out ${{ inputs.ref || github.ref }} and runs npm install on a self-hosted runner. Even with read-only repository permissions, maintainers should explicitly approve who may dispatch arbitrary refs and what isolation or secrets are available on those runners.
    Confidence: 0.84
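
To make the boundary concrete, the sketch below models the `inputs.ref || github.ref` fallback, where an empty input falls through to the triggering ref, alongside a purely hypothetical allowlist gate. The allowlist contents and the function names are assumptions; the workflow in this PR is not known to include any such gate, and a ref check would not replace dispatch permissions or runner isolation.

```javascript
// Sketch only: models the expression fallback; the policy below is hypothetical.

// Mirrors `inputs.ref || github.ref`: an empty or missing input
// falls through to the ref that triggered the workflow.
function resolveRef(inputRef, githubRef) {
  return inputRef || githubRef;
}

// Hypothetical policy: only the default branch may be hydrated.
const ALLOWED_REFS = new Set(["main", "refs/heads/main"]);

function isAllowedRef(ref) {
  return ALLOWED_REFS.has(ref);
}
```

With no gate in place, any ref a dispatcher supplies is what gets checked out before `npm install` runs, which is why the reviewer asks who may dispatch and what the runner can reach.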

Acceptance criteria:

  • actionlint -config-file .github/actionlint.yaml .github/workflows/crabbox-hydrate.yml
  • One maintainer-approved Crabbox hydrate smoke against the PR head after the skill commands are adapted

What I checked:

  • Current main does not already contain this setup: Current main has no .crabbox.yaml, crabbox-wrapper.mjs, ci-check-testbox.yml, .github/actionlint.yaml, or CODEOWNERS file, and package.json only exposes the existing scripts through test on lines 10-39. (package.json:10, 2680adb3aaf9)
  • Copied skill references missing crabpot paths: The PR head skill instructs agents to run node scripts/crabbox-wrapper.mjs and .github/workflows/ci-check-testbox.yml, but those paths do not exist on current main. (.agents/skills/crabbox/SKILL.md:199, 35c6e4869287)
  • Self-hosted workflow boundary is security-sensitive: The added hydrate workflow runs on self-hosted Crabbox labels, checks out an input ref, and runs npm install, so dispatch permissions and runner isolation need maintainer confirmation. (.github/workflows/crabbox-hydrate.yml:39, 35c6e4869287)
  • PR body notes no live Crabbox smoke: The PR verification lists static checks and explicitly says no live Crabbox lease was started for this setup-only patch.
  • Related setup baseline was closed unmerged: GitHub search shows the prior, broader setup baseline ("chore: add maintainer setup baseline", #103) is closed and unmerged, so this narrower PR is not superseded by a landed baseline.
  • Workflow/package ownership history: Git blame and log show Vincent Koc authored the current package/workflow baseline and later workflow PNPM/dashboard fixes in the relevant area. (package.json:10, 9d8c0f473d31)

Likely related people:

  • vincentkoc: Current-main blame and workflow history show Vincent Koc introduced the crabpot package/workflow baseline and made recent CI workflow changes in the same automation surface. (role: recent area contributor; confidence: high; commits: 9d8c0f473d31, 59b90b8b36aa, fb2264049c7a; files: package.json, .github/workflows/check.yml, .github/workflows/openclaw-ref-compat.yml)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 2680adb3aaf9.

clawsweeper Bot added the following labels on May 22, 2026:

  • rating: 🦐 gold shrimp: Decent PR readiness signal, but merge confidence is limited.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action.
  • P2: Normal priority bug or improvement with limited blast radius.
  • merge-risk: 🚨 security-boundary: 🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.
  • merge-risk: 🚨 automation: 🚨 Merging this PR could break CI, automerge, proof capture, label sync, or automation.
@clawsweeper

clawsweeper Bot commented May 22, 2026

ClawSweeper PR egg

🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
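
Read literally, the three rules above reduce to "merged, or carrying one of the hatchable labels". A minimal sketch of that reading follows; the function name and input shape are hypothetical, and for closed unmerged PRs the `labels` array is assumed to come from the durable record instead of the live label set.

```javascript
// Sketch only: a literal reading of the hatchability rules, not ClawSweeper's code.
const HATCHABLE_LABELS = [
  "status: 👀 ready for maintainer look",
  "status: 🚀 automerge armed",
  "clawsweeper:automerge",
];

// `merged` is a boolean; `labels` is an array of label names.
function isHatchable({ merged, labels }) {
  if (merged) return true; // merged PRs are always hatchable
  // Open and closed-unmerged PRs both depend on a hatchable label being present.
  return labels.some((label) => HATCHABLE_LABELS.includes(label));
}
```

Under this reading, a merged PR such as this one is hatchable even while it still carries the waiting-on-author status label.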
What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

vincentkoc marked this pull request as ready for review May 22, 2026 21:39
vincentkoc merged commit 4fa495f into main May 22, 2026
14 checks passed
vincentkoc deleted the chore/setup-baseline-safe-20260523 branch May 22, 2026 21:57
