fix(inspect): harden Docker discovery with batched inspect, fix find, add tests#103
Conversation
… recursion filters
Move all business logic from cmd/create/chat_archive.go (313 lines) to pkg/chatarchive/ (6 files). cmd/ layer is now ~60 lines of thin orchestration per the cmd/ vs pkg/ enforcement rule. Cross-platform fixes: - Use filepath.ToSlash() for all path pattern matching (Windows compat) - Add platform-aware default sources (Claude Code, Codex, Windsurf, Cursor) - Use existing parse.ExpandHome instead of private duplicate (DRY) - Bounded JSON validation reads (4KB max, prevents OOM on large files) New features: - Add `eos backup chats` command as convenience alias - Idempotent archival: loads existing manifest to skip re-copying - Structured logging via otelzap (replaces fmt.Printf in business logic) - Error context wrapping on all error paths (OWASP A10) Testing (90.3% coverage): - Unit tests: sanitize, hash, manifest, discover, defaults (table-driven) - Integration tests: full flow, idempotent re-run, dry-run, error paths - Race detector: clean pass Follow-up issues created: - #84: fix golangci-lint config for v2 - #85: add file locking for concurrent manifest writes - #86: add Windows/Linux CI matrix - #87: add symlink loop detection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…very P0 fixes: - Fix dead hash test assertion (wantHash declared but never compared) - Add post-copy byte count verification (defense-in-depth for NFS/FUSE) P1 fixes: - MergeEntries no longer mutates input manifest (returns new pointer) - Add manifest schema version field (version: 1) for future migration - Tighten isCandidate /dev/ match with home-directory boundary check - Add --exclude flag for operator false-positive suppression P2 fixes: - Pre-commit hook runs build + unit tests only (~5s vs ~60s) - Remove Node.js dependency from CI coverage thresholds (pure bash) - Make isExcludedArchiveDir data-driven instead of hardcoded Test evidence: - Unit coverage: 71.3% (above 70% floor) - Combined unit+integration coverage: 90.0% (meets 90% floor) - Race detector: clean - E2E smoke tests: clean - npm run ci: all stages green Follow-up issues: #91 (concurrent locking), #92 (JSON output), #93 (incremental mtime tracking), #94 (golangci-lint wiring) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rison - CI workflow: replace hardcoded go-version '1.25' with go-version-file to stay in sync with go.mod (was 1.24.6/1.24.7) - CI script: move coverage threshold checks before summary JSON write so artifacts are not written with "passed" status when thresholds fail - discover.go: use errors.Is(err, io.EOF) instead of bare == comparison per Go best practice for wrapped error chains Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
During rebase onto origin/main, the cmd/backup/chats.go conflict was resolved incorrectly — the thin chatarchive alias was kept instead of main's full chatbackup (restic) implementation. This commit restores main's version and removes the misleading "eos backup chats (alias)" reference from chat_archive.go since the two commands now serve different purposes: chatarchive (dedup) vs chatbackup (encrypted backup). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The rebase carried forward chatarchive-specific replacements for .github/hooks/pre-commit and setup-hooks.sh. Main's versions are more comprehensive and should be preserved — the chatarchive CI has its own workflow file. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…command, and add tests - Fix P0 bug: `find` command passed `2>/dev/null` as literal arg and had ungrouped `-type f` that only applied to last `-name` pattern - Fix P0 perf: batch `docker inspect` to 2 execs instead of N+1 - Fix P1 security: add MaxComposeFileSize guard (10MB) to prevent OOM - Fix P1 correctness: ParseDockerSize now uses SI/decimal units matching Docker's actual format (1 GB = 1e9, not 1024^3) - Fix P1 style: remove emoji from all log messages in inspect package - Add deterministic output: sort networks, ports, volumes for stable diffs - Extract pure parsing functions for testability - Add 24 tests: unit (70%), integration (20%), e2e (10%) - Add constants for compose search paths, file names, sensitive keywords Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| name: Chat Archive CI (${{ matrix.os }}) | ||
| runs-on: ${{ matrix.os }} | ||
| strategy: | ||
| fail-fast: false | ||
| matrix: | ||
| os: [ubuntu-latest, macos-latest, windows-latest] | ||
|
|
||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Set up Go | ||
| uses: actions/setup-go@v5 | ||
| with: | ||
| go-version-file: 'go.mod' | ||
|
|
||
| - name: Set up Node | ||
| uses: actions/setup-node@v4 | ||
| with: | ||
| node-version: '20' | ||
|
|
||
| - name: Download Go modules | ||
| run: go mod download | ||
| shell: bash | ||
|
|
||
| - name: Run chat archive CI | ||
| run: npm run ci | ||
| shell: bash | ||
|
|
||
| - name: Upload verification summary | ||
| if: always() | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: chatarchive-summary-${{ matrix.os }} | ||
| path: outputs/chatarchive-ci/ | ||
|
|
||
| - name: Publish job summary | ||
| if: always() | ||
| shell: bash | ||
| run: | | ||
| if [[ -f outputs/chatarchive-ci/summary.txt ]]; then | ||
| cat outputs/chatarchive-ci/summary.txt >> "$GITHUB_STEP_SUMMARY" | ||
| fi |
Check warning
Code scanning / CodeQL
Workflow does not contain permissions Medium
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 24 days ago
In general, to fix this class of issue you explicitly declare a permissions block at the workflow level (applies to all jobs) or at the specific job level (overrides workflow defaults) that grants only the minimum scopes required. For a pure CI workflow that only checks out code, runs tests, and uploads artifacts, contents: read is typically sufficient; actions/upload-artifact does not require elevated GitHub API permissions beyond what contents: read provides for the checkout step.
For this specific workflow, the simplest and safest fix without changing existing functionality is to add a workflow-level permissions block right after the name declaration and before the on: trigger. This block should set contents: read, which is sufficient for actions/checkout@v4 to read the repository and does not grant write access to any resources. No job-specific overrides are necessary because the single chatarchive-ci job only needs read access. No additional imports or methods are required, as this is purely a YAML configuration change.
Concretely:
-
Edit
.github/workflows/chatarchive-quality.yml. -
Insert:
permissions: contents: read
between the existing line
name: Chat Archive Qualityand theon:block. -
Leave the rest of the workflow unchanged.
| @@ -1,5 +1,8 @@ | ||
| name: Chat Archive Quality | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| on: | ||
| pull_request: | ||
| paths: |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d80d09c4be
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - name: Run chat archive CI | ||
| run: npm run ci |
There was a problem hiding this comment.
Invoke the chatarchive pipeline from the new workflow
This job never runs the new chatarchive checks. I checked package.json, and npm run ci still maps to npm run ci:debug --silent, so scripts/chatarchive-ci.sh is never executed here. As a result the dedicated unit/integration/e2e/coverage gates for pkg/chatarchive/** are skipped, and the workflow usually will not produce the outputs/chatarchive-ci/ artifacts it tries to upload.
Useful? React with 👍 / 👎.
| if err := os.Rename(tmpPath, dst); err != nil { | ||
| return fmt.Errorf("replace destination: %w", err) |
There was a problem hiding this comment.
Replace existing archive files in a Windows-safe way
The manifest-recovery flow in Archive reuses the same hash-based destination names after RecoverManifest, so it will hit pre-existing archived files if the manifest is corrupt but the payload files are still present. On Windows, os.Rename does not overwrite an existing destination, so this path turns into a replace destination failure for every previously archived transcript and the manifest cannot be rebuilt there.
Useful? React with 👍 / 👎.
| if _, ok := seen[path]; ok { | ||
| continue | ||
| } | ||
| seen[path] = struct{}{} |
There was a problem hiding this comment.
Normalize source roots before deduplicating them
defaultSourcesForPlatform now adds both ~/Dev and ~/dev, but ResolveOptions keys seen by the raw absolute path only. On case-insensitive hosts such as Windows and default macOS, those two defaults resolve to the same directory with different casing, so discovery walks the tree twice and every transcript under that folder is counted as a duplicate of itself. Canonicalizing the path before deduping would avoid that false-duplicate behavior in the default configuration.
Useful? React with 👍 / 👎.
…eplace find with WalkDir - Extract CommandRunner interface for dependency injection (testability seam) - Replace shell `find` with filepath.WalkDir for compose file discovery (portable, depth-limited, no shell escaping issues) - Batch network and volume inspects (2 commands instead of N+1) - Fix parseMemoryInfo to handle Swap lines (4 fields, not 7) - Extract constants: CommandTimeout, ComposeSearchMaxDepth - Downgrade per-command logging from Info to Debug - Add splitNonEmpty DRY helper for newline-separated output parsing - Add composeFileNameSet for O(1) compose file name matching - Skip symlinks in compose file discovery (security) - Use os.Lstat in readComposeFile for symlink detection Test coverage for changed files: docker.go ~94%, inspector.go ~100% (up from 4.6% overall). 60+ tests covering pure functions, mock-based Inspector methods, realistic Docker JSON, and e2e data flows. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Fixes #102
findcommand was passing2>/dev/nullas a literal argument and had ungrouped-type f- now uses proper\( -name X -o -name Y \)groupingdocker inspect(2 execs instead of N+1) with graceful fallback to individual inspectMaxComposeFileSize(10MB) guard to prevent OOM from oversized/malicious compose filesParseDockerSizenow uses SI/decimal units matching Docker's actual format (1 GB = 1e9)parseContainerInspectJSON,parseEnvVars,isSensitiveEnvVar,ParseDockerSize)Test plan
pkg/inspect/docker_test.goParseDockerSize,parseEnvVars,isSensitiveEnvVar,parseContainerInspectJSONgo build -o /tmp/eos-build ./cmd/passesgo vet ./pkg/inspect/passesgofmt -lreturns nothing for changed filesgolangci-lint run ./pkg/inspect/- zero issues in changed filesGenerated with Claude Code