sync: comprehensive audit fixes — PYTHONPATH, SQL injection, era mapper, test coverage#37

Merged: simongonzalezdc merged 5 commits into main from sync/audit-fixes-may-2026 on May 26, 2026

@simongonzalezdc (Member) commented May 26, 2026

Summary

Synced from DEV-ARCH after comprehensive audit pass (2026-05-26).

Fixes included

  • SQL injection — parameterized queries in agent_benchmark.py, null-safe era_row access
  • PYTHONPATH injection — subprocess build-db calls now propagate the package root so tests pass regardless of CWD
  • Era mapper ISO dates: load_eras now handles the "2026-01-01 to 2026-01-02" format alongside the original "Jan 1 - Jan 5" format
  • visualize command — commit_eras data now included in inline PROJECT_DATA JS object (was silently empty)
  • signals UX — exits nonzero with clear message when DB missing; distinguishes 0-signal result from missing DB
  • local_pipeline.py — added timeout=300 to subprocess call
  • Duplicate serve command — renamed dashboard server to dashboard to stop silently overwriting the datasette server
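
As an illustration of the SQL injection fix listed above, here is a minimal sketch of a parameterized, null-safe era lookup. The table name `commit_eras`, its columns, and the helper names `fetch_era_row` and `era_dates` are assumptions made for the example; the actual code in agent_benchmark.py may look different.

```python
import sqlite3


def fetch_era_row(conn, era_name):
    """Look up one era by name with a bound parameter instead of an f-string."""
    cur = conn.execute(
        "SELECT name, dates FROM commit_eras WHERE name = ?",  # hypothetical schema
        (era_name,),
    )
    return cur.fetchone()


def era_dates(conn, era_name, default=""):
    """Null-safe access: a missing row or a NULL column falls back to the default."""
    era_row = fetch_era_row(conn, era_name)
    if era_row is None or era_row[1] is None:
        return default
    return era_row[1]


# Small in-memory database so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE commit_eras (name TEXT, dates TEXT)")
conn.execute("INSERT INTO commit_eras VALUES (?, ?)", ("Era 1", "Jan 1 - Jan 5"))
```

Because the era name is passed as a bound value, a string such as `x' OR '1'='1` is compared literally and matches no row, so `era_dates` returns the default rather than leaking data.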

Test coverage

  • New file: tests/test_cli_coverage.py — 26 tests for 17 previously untested CLI commands (93 total, all passing)

Summary by CodeRabbit

  • New Features

    • Added theme switching and improved navigation to project dashboard
    • Added comprehensive analysis deliverables including agent mastery, architecture timelapse, creative patterns, and cognitive load insights
  • Improvements

    • Enhanced session handling reliability when data is unavailable
    • Updated local inference configuration to use localhost for easier development
    • Added subprocess timeout protection for data processing
  • Tests

    • Added CLI command test coverage for error handling and workflows

- Replace hardcoded improvements list in run_source_archaeologist with
  _derive_improvements(): scores recommendations from flapping hotspots,
  todo/stub signals, decomposition momentum, and quality ratio
- Add _approximate_sessions() fallback in agentic-workflow vector:
  groups commits via 2-hour inactivity gap when sessions table is absent
- Fix modelTimeline in visualization template: was hardcoded to 7 eras,
  now derived from D.commit_eras — era scanner no longer flags stale entries
Synced from DEV-ARCH 4cf5261
@simongonzalezdc simongonzalezdc merged commit f825557 into main May 26, 2026
2 of 12 checks passed
@simongonzalezdc simongonzalezdc deleted the sync/audit-fixes-may-2026 branch May 26, 2026 20:32
coderabbitai Bot commented May 26, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2749967d-3889-412c-9509-b5f7705335bc

📥 Commits

Reviewing files that changed from the base of the PR and between 9638ae6 and eb8f858.

📒 Files selected for processing (28)
  • .cursorrules
  • .github/copilot-instructions.md
  • .windsurfrules
  • AGENTS.md
  • archaeology/analysis_runner.py
  • archaeology/cli.py
  • archaeology/era_mapper.py
  • archaeology/local_pipeline.py
  • archaeology/visualization/agent_benchmark.py
  • archaeology/visualization/template.html
  • projects/demo-project/deliverables/archaeology.html
  • projects/demo-project/deliverables/opportunity/opportunity-ai-agent-mastery.json
  • projects/demo-project/deliverables/opportunity/opportunity-architecture-timelapse.json
  • projects/demo-project/deliverables/opportunity/opportunity-before-after-snapshot.json
  • projects/demo-project/deliverables/opportunity/opportunity-commit-cognitive-load.json
  • projects/demo-project/deliverables/opportunity/opportunity-creative-dna.json
  • projects/demo-project/deliverables/opportunity/opportunity-cross-repo-transfer.json
  • projects/demo-project/deliverables/opportunity/opportunity-frustration-to-automation.json
  • projects/demo-project/deliverables/opportunity/opportunity-knowledge-gap.json
  • projects/demo-project/deliverables/opportunity/opportunity-learning-velocity.json
  • projects/demo-project/deliverables/opportunity/opportunity-model-selection-advisor.json
  • projects/demo-project/deliverables/opportunity/opportunity-neurodivergent-profile.json
  • projects/demo-project/deliverables/opportunity/opportunity-session-quality.json
  • projects/demo-project/deliverables/opportunity/opportunity-token-efficiency.json
  • projects/demo-project/deliverables/opportunity/opportunity-youtube-learning-graph.json
  • projects/demo-project/deliverables/visuals/archaeology.html
  • scripts/data/generate_missing_deliverables.py
  • tests/test_cli_coverage.py

📝 Walkthrough

This PR hardens the devarch framework for robustness and flexibility while modernizing the demo visualization. Local inference endpoints migrate from IP addresses to localhost for better portability. The analysis engine now derives sessions and recommendations dynamically when database tables are missing. Date parsing becomes flexible across multiple formats. SQL queries use parameterization to prevent injection. Charts render dynamically from runtime data. The demo HTML gains a theme system, navigation scroll-spy, and semantic terminology updates from primary/other to liminal/other. A comprehensive CLI test suite validates all command paths.

Changes

Local Inference Endpoint Migration

| Layer | File(s) | Summary |
|---|---|---|
| Inference endpoint configuration updates | .cursorrules, .github/copilot-instructions.md, .windsurfrules, AGENTS.md, scripts/data/generate_missing_deliverables.py | Four documentation files and one Python script update local LLM server addresses from 100.66.225.85:1234 to localhost:1234 for portability across different network/tunnel configurations. |

Framework Robustness and Visualization

| Layer | File(s) | Summary |
|---|---|---|
| CLI environment and command registration fixes | archaeology/cli.py | Subprocess calls in demo --build-db, build-db, cascade, and sync commands now pass the PYTHONPATH environment variable to ensure package discovery. The signals command validates database existence before processing. The duplicate serve command that launched the dashboard server is renamed to dashboard. The visualize command initializes era data explicitly and merges era metadata into embedded JSON. |
| Analysis engine robustness for missing data | archaeology/analysis_runner.py | AnalysisRunner now derives session groupings from commit timestamps when the sessions table is absent, using a 2-hour inactivity heuristic. The source-archaeologist analysis computes hotspots via SQL and generates improvements dynamically from commit signals including flapping patterns, TODOs, and quality keywords instead of hardcoded recommendations. |
| Flexible era date parsing and pipeline timeout | archaeology/era_mapper.py, archaeology/local_pipeline.py | load_eras() now parses era dates in multiple formats (ISO, month-only, Jan DD style) and supports multiple range separators, handles single-date eras, skips unparseable dates, and supports year wraparound. The local pipeline runner enforces a 300-second subprocess timeout. |
| SQL injection prevention in era queries | archaeology/visualization/agent_benchmark.py | Era metadata retrieval switches from f-string SQL interpolation to parameterized queries with null guards. Era mapping construction safely parses dates strings only when they contain the expected range separator. |
| Dynamic chart data from runtime context | archaeology/visualization/template.html | Model Adoption Timeline chart now derives era entries, colors, and model strings dynamically from D.commit_eras and D.telemetry_agents instead of using hardcoded era mappings, making the visualization responsive to actual analysis data. |
| Demo HTML theme system, navigation, and styling | projects/demo-project/deliverables/archaeology.html | Redesigned with editorial branding metadata, CSS theme token variables for dark/light mode and accessibility, sticky navigation bar with scroll-spy highlighting, theme switcher for persistent user preference via localStorage, and improved focus indicators. Section labels updated to reflect development eras terminology. |
| Chart semantic updates | projects/demo-project/deliverables/archaeology.html | Monthly Commit Velocity chart changes from primary/other to liminal/other terminology. Session Depth Gradient adds name formatter and enhanced axis label shortening. Topic Evolution eras dataset updated. Model Adoption Timeline header comments reflect new era terminology. |
| Demo project sample deliverable JSON files | projects/demo-project/deliverables/opportunity/* | Fourteen JSON files provide initialized/sample analysis outputs for ai-agent-mastery, architecture-timelapse, before-after-snapshot, commit-cognitive-load, creative-dna, cross-repo-transfer, frustration-to-automation, knowledge-gap, learning-velocity, model-selection-advisor, neurodivergent-profile, session-quality, token-efficiency, and youtube-learning-graph analyses. |
| Comprehensive CLI test suite | tests/test_cli_coverage.py | New pytest module exercises Click CLI error paths (missing repos, invalid configs, missing databases), happy paths (visualize, fetch-github, publish-static, build-db with PYTHONPATH), and command behavior validation (cascade --dry-run, sync, dashboard, analyze) using mocking, filesystem fixtures, and CliRunner isolation. |
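
To make the flexible era date parsing described above concrete, the following is a rough sketch of a parser that accepts both the ISO range form and the "Jan 1 - Jan 5" form. The function names `parse_era_range` and `_parse_one`, the `default_year` parameter, and the set of separators are assumptions for illustration; the real load_eras() in era_mapper.py may be structured differently. Note that `%b` relies on English month abbreviations in the active locale.

```python
import re
from datetime import date, datetime

# Range separators surrounded by whitespace: "to", hyphen, or en dash.
# ISO dates are not split because their hyphens have no surrounding spaces.
_SEPARATORS = re.compile(r"\s+(?:to|-|–)\s+")


def _parse_one(text, default_year):
    """Parse a single date in ISO form or 'Jan 5' form; return None if neither fits."""
    text = text.strip()
    try:
        return date.fromisoformat(text)
    except ValueError:
        pass
    try:
        return datetime.strptime(f"{text} {default_year}", "%b %d %Y").date()
    except ValueError:
        return None


def parse_era_range(text, default_year):
    """Return (start, end) for an era date string, or None when it cannot be parsed."""
    parts = _SEPARATORS.split(text.strip(), maxsplit=1)
    start = _parse_one(parts[0], default_year)
    end = _parse_one(parts[-1], default_year)  # single-date eras reuse the same part
    if start is None or end is None:
        return None
    if end < start:
        # Year wraparound such as "Dec 28 - Jan 3": retry the end date in the next year.
        end = _parse_one(parts[-1], default_year + 1)
        if end is None or end < start:
            return None
    return start, end
```

Returning None for unparseable input lets the caller skip that era instead of raising, which matches the "skips unparseable dates" behavior in the summary.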

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eb8f858c6c

Comment on lines +204 to +205
if prev_ts is None or (ts - prev_ts).total_seconds() > GAP_HOURS * 3600:
    session_id = ts.strftime("%Y%m%d-%H%M%S")

P1: Guard mixed timestamp types before computing session gaps

When a commit row has an unparsable date, prev_ts is set to a string (YYYY-MM-DD), but the next parseable row produces a datetime, and this subtraction path executes ts - prev_ts. That raises TypeError and aborts run_agentic_workflow for repositories without a sessions table but with mixed/dirty date rows (e.g., one blank or malformed date among normal commits). This turns the new fallback logic into a runtime crash instead of a graceful approximation.
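
One possible way to apply the guard is sketched below: unparseable rows are skipped so that `prev_ts` is only ever None or a datetime. The names `approximate_sessions` and `_parse_ts` are hypothetical stand-ins, not the PR's actual `_approximate_sessions()` code, and the sketch assumes timestamps arrive sorted ascending and are either all timezone-naive or all timezone-aware.

```python
from datetime import datetime

GAP_HOURS = 2  # inactivity gap that starts a new session


def _parse_ts(raw):
    """Return a datetime for an ISO timestamp string, or None for blank/malformed input."""
    if not raw:
        return None
    try:
        return datetime.fromisoformat(raw)
    except (TypeError, ValueError):
        return None


def approximate_sessions(raw_timestamps):
    """Group commit timestamps into sessions separated by more than GAP_HOURS."""
    sessions = {}
    prev_ts = None
    session_id = None
    for raw in raw_timestamps:
        ts = _parse_ts(raw)
        if ts is None:
            continue  # skip dirty rows so prev_ts never becomes a string
        if prev_ts is None or (ts - prev_ts).total_seconds() > GAP_HOURS * 3600:
            session_id = ts.strftime("%Y%m%d-%H%M%S")
        sessions.setdefault(session_id, []).append(raw)
        prev_ts = ts
    return sessions
```

Skipping dirty rows is only one option; assigning them to the current session would also avoid the TypeError, at the cost of grouping commits whose time is unknown.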

assert db_builder_calls, "db.builder was never invoked"
env = db_builder_calls[0]["env"]
assert "PYTHONPATH" in env
assert "archaeology" in env["PYTHONPATH"] or "DEV-ARCH" in env["PYTHONPATH"]

P2: Remove repository-name assumptions from PYTHONPATH test

This assertion hardcodes path substrings ("archaeology" or "DEV-ARCH") that are unrelated to correctness, so the test fails in valid checkouts whose root path has a different name (for example, /workspace/devarch-framework). The command can propagate PYTHONPATH correctly and still fail this test, creating a false-negative CI failure.
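
A checkout-independent version of the check could compare PYTHONPATH entries against the package root computed at runtime, as in this sketch. Both helpers, `build_subprocess_env` and `pythonpath_contains`, are hypothetical names introduced for illustration and are not taken from archaeology/cli.py or the test module.

```python
import os
from pathlib import Path


def build_subprocess_env(package_root, base_env=None):
    """Copy the environment and prepend package_root to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    root = str(package_root)
    env["PYTHONPATH"] = root if not existing else os.pathsep.join([root, existing])
    return env


def pythonpath_contains(env, package_root):
    """True when some PYTHONPATH entry resolves to package_root, whatever it is named."""
    target = Path(package_root).resolve()
    entries = env.get("PYTHONPATH", "").split(os.pathsep)
    return any(entry and Path(entry).resolve() == target for entry in entries)
```

In a test, the assertion would then read `assert pythonpath_contains(env, expected_root)`, where `expected_root` is derived from the test file's own location rather than from a directory name.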
