tryscribeco/scribe

Scribe

AI-powered SEO content platform for small businesses. Scribe generates niche-specific blog articles and DALL-E featured images, then publishes them into brand-specific blog experiences served from a shared root domain.

Live: tryscribe.co | Platform: app.tryscribe.co


Overview

Scribe has two main runtime surfaces:

  • Platform: the Next.js app that powers the marketing site, dashboard, API routes, and blog rendering
  • Worker: the long-running job processor that polls MongoDB, spawns agent sessions, writes article output, and handles image generation workflows

The platform and worker do not talk to each other directly over a private API. They coordinate through MongoDB job and content records.

Architecture

graph TD
    subgraph Platform ["☁️ Vercel (Single Project)"]
        A["tryscribe.co — Marketing + Blog Subfolders"]
        AA["app.tryscribe.co — Dashboard + API"]
    end

    subgraph Worker ["🖥️ EC2 Worker (us-east-1)"]
        C["Scroll Worker<br/>(Job Poller)"]
        D["OpenClaw Gateway"]
        E["Scribe Walker<br/>(Claude Opus 4.6)"]
        C -->|"spawns agent session"| D
        D -->|"runs"| E
    end

    subgraph Data ["💾 Data Layer"]
        B["MongoDB Atlas<br/>ScribeCluster<br/>(Articles, Jobs, Sites)"]
    end

    subgraph Media ["🖼️ Media Pipeline"]
        F["OpenAI API<br/>(DALL-E 3)"]
        G["Vercel Blob<br/>(Image CDN)"]
        F -->|"compressed + uploaded"| G
    end

    AA -->|"writes job to DB"| B
    A -->|"reads articles"| B
    A -->|"serves images from"| G
    C -->|"polls for pending jobs"| B
    E -->|"writes articles"| B
    E -->|"generates images"| F

Key: There is no direct API connection between the platform (Vercel) and the EC2 worker. They communicate through MongoDB only. The Scroll Worker, OpenClaw Gateway, and Scribe Walker all run privately on the EC2 instance and are not exposed to the public internet.

End-to-end flow: User clicks "Summon Scribe" → API creates job + placeholder articles in MongoDB → Scroll Worker polls and picks up job → worker spawns Scribe Walker via OpenClaw Gateway → agent writes articles + generates images → results are saved to MongoDB → processed images are uploaded to CDN → completion email is sent.

URL Architecture (Subfolder Model)

All client blog content is served under tryscribe.co/{brand}/ subfolders to consolidate SEO authority on the root domain. The dashboard lives on app.tryscribe.co.

| URL | Purpose |
| --- | --- |
| tryscribe.co/ | Marketing site |
| tryscribe.co/{brand}/ | Brand blog home (e.g., tryscribe.co/sallys-spa) |
| tryscribe.co/{brand}/{slug} | Individual article |
| app.tryscribe.co/dashboard | User dashboard |
| app.tryscribe.co/onboarding | New user onboarding |

Why subfolders over subdomains: tryscribe.co is a new domain. Subfolders consolidate keyword authority, backlink equity, and content value under one root domain, whereas search engines may treat each subdomain as a separate site and split that value. Legacy subdomain URLs (brand.tryscribe.co) are 301-redirected to the subfolder equivalent.
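The legacy redirect rule can be illustrated with a small pure function. This is a minimal sketch, not the platform's actual middleware: the helper name `legacyRedirectTarget`, the `ROOT_DOMAIN` constant, and the set of reserved subdomains (`www`, `app`) are assumptions made for illustration.

```javascript
// Hypothetical helper: maps a legacy brand subdomain URL to its subfolder URL.
// ROOT_DOMAIN and RESERVED_SUBDOMAINS are assumptions, not values from the repo.
const ROOT_DOMAIN = "tryscribe.co";
const RESERVED_SUBDOMAINS = new Set(["www", "app"]);

function legacyRedirectTarget(requestUrl) {
  const url = new URL(requestUrl);
  const suffix = "." + ROOT_DOMAIN;

  // Only hosts of the form <something>.tryscribe.co are candidates.
  if (!url.hostname.endsWith(suffix)) return null;

  const brand = url.hostname.slice(0, -suffix.length);

  // Skip reserved hosts and nested subdomains such as a.b.tryscribe.co.
  if (brand === "" || brand.includes(".") || RESERVED_SUBDOMAINS.has(brand)) {
    return null;
  }

  // Drop a bare "/" so the brand home maps to tryscribe.co/{brand}.
  const path = url.pathname === "/" ? "" : url.pathname;
  return `https://${ROOT_DOMAIN}/${brand}${path}${url.search}`;
}
```

In a Next.js middleware, a non-null result could be passed to `NextResponse.redirect(target, 301)`; a `null` result means the request should be handled normally.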


Project Structure

At a high level, the repo is split between the user-facing platform, the worker runtime, and deployment or progress documentation.

scribe/
├── platform/           # Next.js app (dashboard, API, blog renderer)
│   ├── src/app/        # Pages + API routes
│   ├── src/lib/        # Shared libs (db, auth, image processing)
│   └── .vercel/        # Vercel project config (platform-try-scribe)
├── worker/             # Scroll Worker (job poller + agent orchestration)
│   ├── job-worker.js   # Main worker process
│   ├── prompts/        # Centralized prompt modules
│   │   ├── article-writing.js   # buildScribePrompt()
│   │   ├── quality-rules.js     # SEO quality rules, word count
│   │   ├── dalle-image.js       # DALL-E prompt builder
│   │   └── tags.js              # Article tag generation
│   └── agent-workspace/         # Version-controlled agent identity files
│       ├── SOUL.md, AGENTS.md, IDENTITY.md, TOOLS.md, USER.md, HEARTBEAT.md
│       └── seo/SCRIBE-WALKER-CONTEXT.md
├── public/             # Marketing site (tryscribe.co) static files
├── PROGRESS.md         # Build tracker + daily work log
├── JOURNEY-PITCH.md    # Development journey for pitch material
└── vercel.json         # Marketing site Vercel config

Deployment

Scribe uses a split deployment model:

  • Vercel serves the platform, dashboard, API routes, and marketing surface
  • EC2 runs the worker, OpenClaw gateway, and agent runtime

Platform — tryscribe.co + app.tryscribe.co (Single Project)

  • Vercel project: platform-try-scribe
  • Domains: tryscribe.co, www.tryscribe.co, app.tryscribe.co, *.tryscribe.co (legacy redirect)
  • Deploys: Manual CLI only via cd platform && npx vercel --prod --yes
  • Root directory: platform/
  • ⚠️ NOT auto-deploying on git push — Always deploy manually after pushing changes.
  • Build time: ~45-55 seconds
  • Framework: Next.js 14 (App Router)
  • Auth required: Must be logged into Vercel CLI (npx vercel login). Token can expire — if deploy fails with "Not authorized", re-login.

Note: The old try-scribe Vercel project previously served the static marketing site. As of Mar 9, 2026, the marketing site is bundled into the platform project (served from platform/public/marketing.html via middleware rewrite). The try-scribe project can be safely deleted.

EC2 Worker — Scroll Worker + Scribe Walker

  • Stack: t3.small, Ubuntu 24.04, us-east-1
  • Services: Two systemd units:
    • openclaw-gateway.service — OpenClaw agent runtime
    • scribe-worker.service — Scroll Worker (job poller)
  • Access: See worker/DEPLOYMENT.md for SSH, instance details, and credentials

Updating EC2 Worker

Every EC2 update follows: pull → (install) → (deploy workspace) → restart.

| What changed | Steps |
| --- | --- |
| Prompt JS or worker code | git pull → restart scribe-worker |
| Worker dependencies (worker/package.json) | git pull → cd worker && npm install → restart scribe-worker |
| Workspace .md files (SOUL, AGENTS, etc.) | git pull → copy files to ~/.openclaw/workspace/ → restart openclaw-gateway AND scribe-worker |
| Everything | git pull → npm install → copy workspace → restart both services |

⚠️ Gateway Cache Gotcha: OpenClaw gateway caches workspace files on startup. git pull does NOT update the agent's workspace — files must be copied from worker/agent-workspace/ to ~/.openclaw/workspace/. Then restart the gateway. Without this, the agent uses stale identity/context files.

⚠️ Don't forget git pull: Restarting services without pulling first just restarts the old code.

See worker/DEPLOYMENT.md for exact SSH commands, paths, and troubleshooting.
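As a quick reference, the update rules in this section can be encoded as a function that returns the ordered steps for a given change. This is only an illustrative sketch: `ec2UpdateSteps` and its option names are hypothetical, and the step strings are descriptions, not the exact commands from worker/DEPLOYMENT.md.

```javascript
// Illustrative only: returns the ordered update steps for a given kind of change.
// The function name and option names are hypothetical.
function ec2UpdateSteps({ code = false, deps = false, workspace = false } = {}) {
  const steps = ["git pull"]; // always pull first, or the old code is restarted

  if (deps) steps.push("cd worker && npm install");

  if (workspace) {
    // The gateway caches workspace files on startup, so copy before restarting.
    steps.push("copy worker/agent-workspace/ files to ~/.openclaw/workspace/");
    steps.push("restart openclaw-gateway");
  }

  if (code || deps || workspace) steps.push("restart scribe-worker");
  return steps;
}
```

For example, `ec2UpdateSteps({ workspace: true })` lists the copy step before both restarts, which is the ordering the gateway cache warning requires.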

EC2 Environment

Required env vars (in /etc/scribe/.env, root-owned, chmod 600):

MONGODB_URI, OPENAI_API_KEY, BLOB_READ_WRITE_TOKEN, EMAIL_SERVER, WORKER_SECRET, PLATFORM_URL, MAX_CONCURRENT, POLL_INTERVAL, NODE_ENV

See worker/DEPLOYMENT.md for full paths, credentials locations, and agent config.
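A startup check along these lines can catch a missing variable before the worker begins polling. This is a sketch under assumptions: the helper `missingEnvVars` is hypothetical, and treating blank values as missing is an illustrative choice. Only the variable names come from the list above.

```javascript
// Variable names are taken from the list above; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = [
  "MONGODB_URI",
  "OPENAI_API_KEY",
  "BLOB_READ_WRITE_TOKEN",
  "EMAIL_SERVER",
  "WORKER_SECRET",
  "PLATFORM_URL",
  "MAX_CONCURRENT",
  "POLL_INTERVAL",
  "NODE_ENV",
];

// Returns the names of required variables that are unset or blank in `env`.
function missingEnvVars(env) {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Possible usage at process start (commented out so the sketch has no side effects):
// const missing = missingEnvVars(process.env);
// if (missing.length > 0) {
//   console.error(`Missing env vars: ${missing.join(", ")}`);
//   process.exit(1);
// }
```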

Local setup

Environment templates are included for local setup:

  • platform/.env.example
  • worker/.env.example

Copy them to local env files and fill in real values locally. Do not commit populated env files.


Worker Routing

Jobs can be routed to EC2 (default) or Mac Mini (dev/testing):

| Target | How | Worker Filter |
| --- | --- | --- |
| EC2 (default) | Normal Summon Scribe | { status: "pending", worker: { $ne: "local" } } |
| Mac Mini | ?worker=local in dashboard URL | { status: "pending", worker: "local" } |

  • Dashboard URL app.tryscribe.co/dashboard?worker=local passes param to generate API
  • Job document gets worker: "local" field

Limitations:

  • Only article generation supports routing. Image regeneration jobs always go to EC2.
  • Mac Mini worker is not a systemd service — it must be started manually:
    cd worker && WORKER_TARGET=local AGENT_ID=scribe-walker node job-worker.js
  • EC2 worker runs automatically via systemd (auto-restart on failure)
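The two worker filters can be sketched as follows, together with one way a poller might claim a job. This is not the code in job-worker.js: the function names, the `createdAt` and `startedAt` fields, and the use of `findOneAndUpdate` to claim jobs are assumptions. The `jobs` argument is expected to behave like a MongoDB Node.js driver `Collection`.

```javascript
// Filters copied from the routing table; WORKER_TARGET=local selects the Mac Mini queue.
function pendingJobFilter(workerTarget) {
  return workerTarget === "local"
    ? { status: "pending", worker: "local" }
    : { status: "pending", worker: { $ne: "local" } };
}

// Claims the oldest matching job by flipping it to "processing" in one operation.
// Assumptions: `createdAt` and `startedAt` are hypothetical field names, and the
// real worker may claim jobs differently.
async function claimNextJob(jobs, workerTarget) {
  return jobs.findOneAndUpdate(
    pendingJobFilter(workerTarget),
    { $set: { status: "processing", startedAt: new Date() } },
    { sort: { createdAt: 1 }, returnDocument: "after" }
  );
}
```

Because `findOneAndUpdate` modifies a single document atomically, two workers polling the same queue would not both receive the same job. With driver v6 and later the call resolves to the updated document or `null`; earlier versions wrap it in a result object.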

Prompt Modules

All article-generation rules and prompt assembly live in worker/prompts/:

| File | Exports | Purpose |
| --- | --- | --- |
| article-writing.js | buildScribePrompt(job) | Full agent prompt assembly |
| quality-rules.js | buildQualityBlock(), buildCtaBlock(), buildBrandSeoBlock() | SEO rules, word count (1400-1700 target, 1200 min, 1900 max) |
| dalle-image.js | buildDallePrompt(), buildDalleRulesBlock() | Image generation prompts |
| tags.js | buildTagsBlock() | Article tagging rules |

Design decision: Prompts stay in JS files, not in MongoDB. This keeps prompt logic version-controlled and deployable through the normal code path.
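The word count limits lend themselves to a simple check. This is a sketch, not the contents of quality-rules.js: `countWords`, `classifyWordCount`, and the label strings are hypothetical, and treating the min and max as inclusive bounds is an assumption.

```javascript
// Limits from the table above; whether the bounds are inclusive is an assumption.
const WORD_COUNT = { min: 1200, targetLow: 1400, targetHigh: 1700, max: 1900 };

// Counts whitespace-separated words in a plain-text article body.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Classifies a word count against the limits. Label strings are hypothetical.
function classifyWordCount(count) {
  if (count < WORD_COUNT.min) return "too_short";
  if (count > WORD_COUNT.max) return "too_long";
  if (count >= WORD_COUNT.targetLow && count <= WORD_COUNT.targetHigh) {
    return "on_target";
  }
  return "acceptable"; // within min/max but outside the target band
}
```

Keeping the limits in one constant mirrors the design decision above: changing a rule is a code change that goes through version control and the normal deploy path.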


Image Pipeline

  1. DALL-E 3 generates at 1792x1024 (landscape enforced)
  2. sharp compresses to JPEG Q85
  3. Vercel Blob stores on CDN
  4. 3-layer padding detection rejects bad images and auto-queues regeneration:
    • Square/portrait check (width ≤ height)
    • Solid-color edge bars (20px scan → 100px confirmation, any color)
    • Soft padding (edge-vs-center color distance > 40)
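Two of the three detection layers can be sketched as pure functions over image dimensions and RGB samples. This is a minimal sketch under assumptions: the source gives only the threshold (40), so the Euclidean RGB distance, the averaging of sampled pixels, and all function names here are illustrative choices rather than the pipeline's actual implementation.

```javascript
// Layer 1: reject anything that is not landscape (width ≤ height).
function isSquareOrPortrait(width, height) {
  return width <= height;
}

// Euclidean distance between two [r, g, b] colors.
// Assumption: the source does not say which distance metric is used.
function colorDistance([r1, g1, b1], [r2, g2, b2]) {
  return Math.sqrt((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2);
}

// Mean color of a non-empty list of [r, g, b] pixels.
function averageColor(pixels) {
  const sum = pixels.reduce(
    (acc, [r, g, b]) => [acc[0] + r, acc[1] + g, acc[2] + b],
    [0, 0, 0]
  );
  return sum.map((channel) => channel / pixels.length);
}

// Layer 3: flag soft padding when edge and center colors differ by more than 40.
function hasSoftPadding(edgePixels, centerPixels, threshold = 40) {
  const distance = colorDistance(
    averageColor(edgePixels),
    averageColor(centerPixels)
  );
  return distance > threshold;
}
```

An image failing either check would be rejected and queued for regeneration. The solid-color edge bar scan (layer 2) is omitted here because its exact procedure is not described beyond the 20px and 100px scan depths.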

AWS Infrastructure

  • Region: us-east-1 (permanent — Identity Center requirement)
  • Budget: $150/month with alerts at 80% actual + 100% forecast
  • Access: SSO via IAM Identity Center, CLI profile scribe-admin
  • Details: See worker/DEPLOYMENT.md for account IDs, SSO portal URL, and setup

Key Decisions

See PROGRESS.md for the full decision log. Highlights:

  • Scribe Walker = main agent on EC2 — full OpenClaw lifecycle (compaction, hooks, model updates)
  • Stateless jobs — unique session ID per job, no memory carryover
  • DB stores DATA, JS stores RULES — site configs in MongoDB, prompt rules in code
  • Exact dedup only — no semantic/contextual dedup (businesses want multiple angles on same topic)
  • Claude Opus 4.6 for article writing, OpenAI for DALL-E only
  • t3.small x86 over ARM — universal compatibility, battle-tested

Troubleshooting

flowchart TD
    A["Articles not generating?"] --> B{"Check job status<br/>in MongoDB"}
    B -->|"status: pending (stuck)"| C{"Is EC2 worker running?"}
    B -->|"status: failed"| D["Check job.error field<br/>for error message"]
    B -->|"status: processing<br/>(for >10 min)"| E["Agent may be hung.<br/>Restart scribe-worker"]
    B -->|"No job exists"| F["Platform issue —<br/>check Vercel function logs"]
    
    C -->|No| G["SSH in and check<br/>systemctl status scribe-worker"]
    C -->|Yes| H{"Is gateway running?"}
    
    H -->|No| I["Restart openclaw-gateway"]
    H -->|Yes| J["Check worker logs<br/>via journalctl"]
    
    D --> K{"Common errors"}
    K -->|"Unknown agent id"| L["Wrong AGENT_ID or<br/>gateway needs restart"]
    K -->|"askFallback: deny"| M["tools.exec.security<br/>must be 'full' on EC2"]
    K -->|"DALL-E error"| N["OpenAI issue —<br/>retry, usually transient"]
    K -->|"session history"| O["Unique session ID<br/>not working — check<br/>job-worker.js"]

Quick checks:

  • Worker alive? → sudo systemctl status scribe-worker (via SSH)
  • Gateway alive? → sudo systemctl status openclaw-gateway (via SSH)
  • Recent logs? → sudo journalctl -u scribe-worker --since "5 min ago" (via SSH)
  • Job stuck? → Check MongoDB jobs collection for status and error fields
  • Images failing? → Check if OpenAI API key is valid, DALL-E errors are usually transient
  • Articles too short/long? → Check worker/prompts/quality-rules.js word count settings
  • Wrong demographic in images? → DALL-E overrides ethnicity prompts — this is expected. Images prefer no-people shots.
  • Workspace changes not taking effect? → Copy the files from worker/agent-workspace/ to ~/.openclaw/workspace/, then restart openclaw-gateway (it caches .md files on startup)
  • Platform deploy fails "Not authorized"? → Run npx vercel login to refresh token

Common Operations

A few routine commands are enough for most day-to-day deploy and maintenance work.

# Deploy platform (tryscribe.co + app.tryscribe.co) — manual deploy required
cd platform && npx vercel --prod --yes

# Deploy EC2 worker — see worker/DEPLOYMENT.md for full SSH commands
# Short version: SSH in → git pull → restart services

Monitoring

  • Worker health endpoint: GET /api/worker/health — worker pings every 60s
  • EC2 worker logs: sudo journalctl -u scribe-worker -f (via SSH)
  • Gateway logs: sudo journalctl -u openclaw-gateway -f (via SSH)
  • Job status: Check MongoDB jobs collection — status field shows pending, processing, complete, or failed
  • Failed jobs: Worker auto-retries up to 3 times. Check error field on failed jobs for diagnosis.
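The retry behavior can be pictured as a small state transition on the job document. This is a sketch under assumptions: the `failureCount` field and the function name are hypothetical, and "up to 3 times" is read here as three retries after the initial attempt, which the source does not confirm.

```javascript
const MAX_RETRIES = 3;

// Returns the job document as it might look after a failed attempt.
// Assumptions: `failureCount` is a hypothetical field, and a job is marked
// "failed" only once the initial attempt and all three retries have failed.
function statusAfterFailure(job, errorMessage) {
  const failureCount = (job.failureCount || 0) + 1;
  const exhausted = failureCount > MAX_RETRIES;
  return {
    ...job,
    failureCount,
    status: exhausted ? "failed" : "pending", // "pending" puts it back in the poll queue
    error: errorMessage,
  };
}
```

Under this reading, a job whose status is `failed` has already used every retry, so its `error` field reflects the last attempt.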

For full EC2 operations, see worker/DEPLOYMENT.md.


Documentation

| Doc | Location | Purpose |
| --- | --- | --- |
| Architecture | platform/docs/ARCHITECTURE.md | Full system design, agent lifecycle, decision log |
| SEO Checklist | platform/docs/SEO-QUALITY-CHECKLIST.md | 22-point article quality gate |
| Deployment | worker/DEPLOYMENT.md | EC2 ops, SSH, credentials, troubleshooting |
| Progress | PROGRESS.md | Build tracker + daily work log |
| Journey | JOURNEY-PITCH.md | Development story for pitch material |

About

Scribe — Autonomous SEO content engine that writes, optimizes, and publishes articles for your business automatically. Proven on 1,285+ articles in 60 days. Zero setup, zero CMS, zero human edits. Tell us your business, watch your first articles appear in minutes. Free tier included. Built by operators who scaled it on their own site first.
