Last updated: 2026-03-09
graph TD
User("🌐 User Browser")
User -->|"Sign in and access dashboard"| AppPlatform
AppPlatform -->|"Write structured job request"| JobsCollection
AppPlatform -->|"Read and write site config"| SitesUsers
AppPlatform -.->|"Poll article status every 2.5s"| ArticlesCollection
ArticlesCollection -.->|"Serve published content"| ContentSite
subgraph MongoDB["🗄️ MongoDB Atlas"]
JobsCollection("Jobs Collection")
ArticlesCollection("Articles Collection")
SitesUsers("Sites and Users Collections")
end
subgraph OpenClaw["🤖 OpenClaw Engine"]
ScribeWalker("Scribe Walker, Agent Orchestration")
ClaudeOpus("🧠 Claude Opus 4.6")
DallE("🎨 DALL-E 3")
WebResearch("🔍 Web Research")
end
JobsCollection -->|"Poll and pick up pending jobs"| ScribeWalker
subgraph Vercel["☁️ Vercel Platform (Single Project)"]
ContentSite("tryscribe.co — Marketing + Blog Subfolders")
AppPlatform("app.tryscribe.co — Dashboard + API")
end
ScribeWalker -->|"Generate SEO articles"| ClaudeOpus
ScribeWalker -->|"Generate featured images"| DallE
ScribeWalker -->|"Research trending topics"| WebResearch
ScribeWalker -->|"Write completed articles"| ArticlesCollection
style User fill:#b3e5fc,stroke:#333,stroke-width:3px,color:#000
style Vercel fill:#fff8f0,stroke:#d4920a,stroke-width:2px,color:#000
style AppPlatform fill:#fff,stroke:#d4920a,stroke-width:3px,color:#000
style ContentSite fill:#fff,stroke:#7a8c6e,stroke-width:2px,color:#000
style MongoDB fill:#e8f5e9,stroke:#4caf50,stroke-width:2px,color:#000
style JobsCollection fill:#c8e6c9,stroke:#333,stroke-width:2px,color:#000
style ArticlesCollection fill:#c8e6c9,stroke:#333,stroke-width:2px,color:#000
style SitesUsers fill:#c8e6c9,stroke:#333,stroke-width:2px,color:#000
style OpenClaw fill:#fff3e0,stroke:#d4920a,stroke-width:2px,color:#000
style ScribeWalker fill:#ffe0b2,stroke:#d4920a,stroke-width:3px,color:#000
style ClaudeOpus fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
style DallE fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
style WebResearch fill:#f5f5f5,stroke:#333,stroke-width:2px,color:#000
linkStyle default interpolate basis
Flow: User signs in → App writes structured job → OpenClaw polls and picks it up → Scribe Walker generates articles autonomously → Dashboard polls and displays results in real-time
Color Legend: 🔵 Light blue = User entry point · 🟠 Amber = Vercel platform · 🟢 Green = MongoDB data layer · 🟡 Warm = OpenClaw engine · ⚪ Gray = AI tools
Core Principle: MongoDB is the ONLY bridge between the app and OpenClaw. They never communicate directly. The app writes structured job requests; OpenClaw picks them up and executes autonomously.
- Content URL: `tryscribe.co` (marketing site + blog subfolders)
- Dashboard URL: `app.tryscribe.co` (auth, onboarding, dashboard, billing)
- Deployment: Single Vercel project serves both domains
- Role: User-facing platform — auth, onboarding, dashboard, billing, blog content
- Responsibilities:
- User authentication (NextAuth: magic link + Google OAuth)
- Onboarding flow (brand name, niche, location)
- Writing job requests to MongoDB
- Polling article status and displaying results
- Serving blog content via subfolders (`tryscribe.co/{brand}/{slug}`)
- Marketing site at root (`tryscribe.co/`)
- Stripe billing and usage tracking
- Does NOT: Generate articles, call AI APIs, run any agent logic
Why subfolders over subdomains:
- `tryscribe.co` is a new domain with zero authority
- Every article under `tryscribe.co/{brand}/` consolidates keyword and backlink authority on the root domain
- Subdomains (`brand.tryscribe.co`) would scatter SEO value across isolated domains
- Sources: Cloudflare, Ahrefs, Semrush all lean subfolder for new domains
URL Structure:
| URL | Purpose |
|---|---|
| `tryscribe.co/` | Marketing site (static HTML served via middleware rewrite) |
| `tryscribe.co/{brand}/` | Brand blog home (e.g., `tryscribe.co/sallys-spa`) |
| `tryscribe.co/{brand}/{slug}` | Individual article page |
| `app.tryscribe.co/dashboard` | User dashboard |
| `app.tryscribe.co/onboarding` | New user onboarding |
Middleware Routing (src/middleware.ts):
- Legacy subdomain requests (`brand.tryscribe.co`) → 301 redirect to `tryscribe.co/{brand}/`
- App routes on content domain (`tryscribe.co/dashboard`) → 302 redirect to `app.tryscribe.co/dashboard`
- Root path on content domain (`tryscribe.co/`) → rewrite to `/marketing.html`
- Brand slug detection → rewrite `/{brand}/{slug}` to internal `/blog/{brand}/{slug}` route
- Reserved paths (`api`, `auth`, `_next`, etc.) pass through unchanged
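The routing rules above can be sketched as a pure decision function. This is a hypothetical, framework-free illustration rather than the contents of `src/middleware.ts`: the `route` function, the `RouteDecision` type, and the `APP_ROUTES` list are assumptions, and the reserved-path set contains only the three names the list gives.

```typescript
// Hypothetical sketch of the routing rules; not the real src/middleware.ts.
type RouteDecision =
  | { kind: "redirect"; status: 301 | 302; location: string }
  | { kind: "rewrite"; path: string }
  | { kind: "pass" };

const CONTENT_HOST = "tryscribe.co";
const APP_HOST = "app.tryscribe.co";
// Only the reserved names listed in this document; the "etc." is not filled in.
const RESERVED = new Set(["api", "auth", "_next"]);
// Assumed set of dashboard routes that belong on the app domain.
const APP_ROUTES = new Set(["dashboard", "onboarding"]);

function route(host: string, pathname: string): RouteDecision {
  const segments = pathname.split("/").filter(Boolean);

  // The app domain is served as-is.
  if (host === APP_HOST) return { kind: "pass" };

  // Legacy subdomain: brand.tryscribe.co/... -> 301 to tryscribe.co/brand/...
  if (host.endsWith("." + CONTENT_HOST)) {
    const brand = host.slice(0, -(CONTENT_HOST.length + 1));
    const location = `https://${CONTENT_HOST}/${brand}/${segments.join("/")}`;
    return { kind: "redirect", status: 301, location };
  }

  // Unknown hosts (for example localhost) are left alone in this sketch.
  if (host !== CONTENT_HOST) return { kind: "pass" };

  // Root path on the content domain serves the marketing page.
  if (segments.length === 0) return { kind: "rewrite", path: "/marketing.html" };

  const first = segments[0] ?? "";
  if (RESERVED.has(first)) return { kind: "pass" };
  if (APP_ROUTES.has(first)) {
    return { kind: "redirect", status: 302, location: `https://${APP_HOST}${pathname}` };
  }

  // Anything else is treated as a brand slug and mapped to the internal blog route.
  return { kind: "rewrite", path: `/blog/${segments.join("/")}` };
}
```

Keeping the decision separate from the framework response objects makes each rule easy to unit test without a running server.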
Internal Route Structure:
Blog pages live at src/app/blog/[subdomain]/ internally (the subdomain param name is kept for backward compatibility but represents the brand slug in the subfolder URL).
Domain Separation:
- `tryscribe.co` = content site only (marketing + blog articles). Dashboard routes redirect to `app.tryscribe.co`.
- `app.tryscribe.co` = dashboard + API. Dashboard deploys here don't risk breaking the content site.
- Both served from the same Vercel project with middleware-based routing.
- Cluster: ScribeCluster (currently M0 Free, AWS us-east-1)
- Role: Shared data layer and job queue
- Collections:
- `users`: user accounts, plans, referrals
- `sites`: brand configurations (niche, location, subdomain)
- `articles`: generated content (status: generating/published/failed)
- `jobs`: article generation job queue (NEW)
- `sessions`, `accounts`: NextAuth session management
- Current host: Mac Mini (local development)
- Future host: Linux VM (production)
- Role: AI orchestration engine — the brains
- Responsibilities:
- Polling `jobs` collection for pending work
- Running Scribe Walker agent sessions for each job
- Research, writing, image generation, quality checks
- Writing completed articles to `articles` collection
- Updating job status (pending → processing → complete/failed)
- Does NOT: Serve web traffic, handle user auth, manage billing
interface Job {
_id: ObjectId;
// Who requested it
userId: ObjectId;
siteId: ObjectId;
// What to generate
action: "generate"; // Typed enum — no freeform actions
params: {
brandName: string; // From site record
niche: string; // From site record
location?: string; // From site seoConfig
tone?: string; // "professional" | "casual" | "authoritative"
count: number; // Number of articles (default: 3)
topicStyles: string[]; // ["how-to", "tips", "why", "listicle", "guide"]
};
// Job lifecycle
status: "pending" | "processing" | "complete" | "failed";
priority: number; // Lower = higher priority (default: 10)
attempts: number; // Retry count (default: 0)
maxAttempts: number; // Max retries (default: 3)
// Results
articleIds: ObjectId[]; // Populated as articles are created
error?: string; // Error message if failed
// Timestamps
createdAt: Date;
startedAt?: Date;
completedAt?: Date;
}
Only these actions are valid. OpenClaw rejects anything else:
| Action | Description | Params |
|---|---|---|
| `generate` | Generate new articles for a site | brandName, niche, location, tone, count, topicStyles |
| `rewrite` | Rewrite an existing article | articleId, instructions (from predefined set) |
| `refresh` | Generate more articles for existing site | same as `generate` |
No freeform prompts. No shell commands. No tool instructions. The job contains DATA, not INSTRUCTIONS. OpenClaw constructs its own prompts internally using SCRIBE-WALKER-CONTEXT.md and its agent reasoning.
1. User clicks "Summon Your Scribe ✒️"
2. App validates user auth + plan limits
3. App creates Job doc (status: "pending")
4. App creates placeholder Article docs (status: "generating")
5. App returns immediately — dashboard starts polling articles
6. OpenClaw polls jobs collection (every 5-10 seconds)
7. Picks up pending job, sets status: "processing", sets startedAt
8. Spawns Scribe Walker session with structured params
9. Scribe Walker:
a. Researches relevant topics for the niche/location
b. Writes SEO-optimized articles (Claude Opus 4.6)
c. Generates DALL-E 3 featured images
d. Quality checks (word count, SEO meta, no em dashes, etc.)
10. Updates Article docs: content, SEO meta, images, status: "published"
11. Updates Job doc: status: "complete", completedAt
12. Dashboard polling picks up published articles in real-time
13. User sees articles appear one by one (2-3 second poll interval)
If an attacker gains access to the app's MongoDB connection string, they could write malicious jobs.
OpenClaw validates every job against the typed schema before processing:
- `action` must be in the allowed enum
- `params` must match the expected shape for that action
- All string fields have max length limits
- No nested objects beyond one level
- Any invalid job is rejected and logged as a security event
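A minimal sketch of the validation checks listed above, assuming a hand-written validator. The `validateJob` name, the `MAX_STRING_LENGTH` value, and the reading of "one level" as "no objects inside `params`" are all assumptions; the document does not give a concrete length limit or library.

```typescript
// Hypothetical validator; names and the length limit are illustrative assumptions.
const ALLOWED_ACTIONS = ["generate", "rewrite", "refresh"] as const;
const MAX_STRING_LENGTH = 200; // assumed; the document only says limits exist

type ValidationResult = { ok: true } | { ok: false; reason: string };

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function validateJob(job: unknown): ValidationResult {
  if (!isPlainObject(job)) return { ok: false, reason: "job is not an object" };

  // 1. action must be in the allowed enum
  const action = job["action"];
  if (typeof action !== "string" || !(ALLOWED_ACTIONS as readonly string[]).includes(action)) {
    return { ok: false, reason: "action not in allowed enum" };
  }

  const params = job["params"];
  if (!isPlainObject(params)) return { ok: false, reason: "params must be an object" };

  // 2. no nested objects inside params, and all strings are length-limited
  for (const [key, value] of Object.entries(params)) {
    const items: unknown[] = Array.isArray(value) ? value : [value];
    for (const item of items) {
      if (typeof item === "object" && item !== null) {
        return { ok: false, reason: `params.${key} is nested too deeply` };
      }
      if (typeof item === "string" && item.length > MAX_STRING_LENGTH) {
        return { ok: false, reason: `params.${key} exceeds max length` };
      }
    }
  }

  // 3. params must match the expected shape (only generate/refresh sketched here)
  if (action === "generate" || action === "refresh") {
    const count = params["count"];
    if (typeof params["brandName"] !== "string" || typeof params["niche"] !== "string") {
      return { ok: false, reason: "brandName and niche are required strings" };
    }
    if (typeof count !== "number" || !Number.isInteger(count) || count < 1) {
      return { ok: false, reason: "count must be a positive integer" };
    }
  }
  return { ok: true };
}
```

A rejected job would then be logged as a security event by the caller; that logging step is outside this sketch.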
The job never contains prompts, instructions, or commands for the agent. OpenClaw uses the structured data fields (brandName, niche, location) to fill in its OWN hardcoded workflow. The agent's behavior is defined by SCRIBE-WALKER-CONTEXT.md, not by job data.
Think of it as: generateArticles(niche="plumbing", location="Salt Lake City") — a function call with typed parameters.
- App DB user: Write access to `jobs` only. Read access to `articles`, `sites`, `users`. No access to system collections.
- OpenClaw DB user: Full access to `jobs`, `articles`. Read access to `sites`, `users`.
- Even with compromised app credentials, attacker cannot modify articles or users directly.
- Per-user: Max 5 jobs per hour (configurable per plan)
- Global: Max 20 concurrent processing jobs
- Retry cap: Max 3 attempts per job, then permanent failure
- Enforced at both app level (before writing) and OpenClaw level (before processing)
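The three limits above reduce to small predicates that both the app and OpenClaw could share. This is a sketch under assumptions: the function names are hypothetical, and the per-user limit is modeled as a sliding one-hour window, which the document does not specify.

```typescript
// Hypothetical helpers; the sliding-window interpretation is an assumption.
const WINDOW_MS = 60 * 60 * 1000; // one hour

// Per-user: at most maxPerHour jobs created inside the trailing hour.
function canSubmitJob(recentJobTimes: Date[], now: Date, maxPerHour = 5): boolean {
  const cutoff = now.getTime() - WINDOW_MS;
  const inWindow = recentJobTimes.filter((t) => t.getTime() > cutoff).length;
  return inWindow < maxPerHour;
}

// Global: at most maxConcurrent jobs in the "processing" state.
function canStartProcessing(currentlyProcessing: number, maxConcurrent = 20): boolean {
  return currentlyProcessing < maxConcurrent;
}

// Retry cap: once attempts reaches maxAttempts the job fails permanently.
function canRetry(attempts: number, maxAttempts = 3): boolean {
  return attempts < maxAttempts;
}
```

The defaults mirror the numbers in the list; a per-plan override would pass a different `maxPerHour`.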
- App signs each job with HMAC-SHA256 using a shared secret
- `signature = HMAC(jobId + siteId + action + timestamp, SECRET)`
- OpenClaw verifies signature before processing
- Unsigned or invalid-signature jobs are rejected
- Protects against direct DB manipulation even with full DB access
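As an illustration of the signing scheme above, here is a sketch using Node's built-in `crypto` module. The `SignableJob` shape, the hex encoding, and the function names are assumptions; only the concatenation order and HMAC-SHA256 come from the list.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical shape: the four fields named in the signature formula, as strings.
interface SignableJob {
  jobId: string;
  siteId: string;
  action: string;
  timestamp: string;
}

// App side: sign the concatenated fields with the shared secret.
function signJob(job: SignableJob, secret: string): string {
  const payload = job.jobId + job.siteId + job.action + job.timestamp;
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// OpenClaw side: recompute and compare in constant time.
function verifyJob(job: SignableJob, signature: string, secret: string): boolean {
  const expected = Buffer.from(signJob(job, secret), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```

One design note: plain concatenation can be ambiguous when field lengths vary, so inserting a fixed delimiter between fields may be worth considering. That would be a change to the formula above, not something this document specifies.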
- OpenClaw instance is NOT publicly accessible — no open ports, no API endpoints
- Only outbound connections: OpenClaw connects TO MongoDB, Anthropic, OpenAI. Nothing connects TO OpenClaw.
- Job expiry: Jobs older than 1 hour auto-expire (prevents queue poisoning)
- Audit log: All job state transitions logged with timestamps
- Single OpenClaw instance on Taha's Mac Mini
- Handles 10 beta users easily
- Scribe Walker already proven (1,285+ articles)
- Limitation: tied to local machine uptime
- AWS EC2 t3.small (us-east-1), Ubuntu 24.04 LTS
- Scribe Walker as main OpenClaw agent (not sub-agent)
- OpenClaw gateway service (systemd, loopback)
- Prompt intelligence in `worker/prompts/*.js` (version-controlled)
- Worker routing (#67) for parallel testing with Mac Mini
- Cost: ~$20/mo (t3.small)
- Multiple OpenClaw instances polling the same job queue
- MongoDB's `findOneAndUpdate` with atomic status transitions prevents double-processing
- Each instance picks up different jobs, giving natural load balancing
- Can scale horizontally by adding VMs
- Trigger: when single instance can't keep up with job volume
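A sketch of the atomic claim that makes multi-instance polling safe. `buildClaim` returns the arguments a worker might pass to the MongoDB Node driver's `findOneAndUpdate`; `claimInMemory` is a stand-in that mimics the same find-and-flip step on an array so the behavior can be seen without a database. Both names and the sort order (priority, then age) are assumptions.

```typescript
type Status = "pending" | "processing" | "complete" | "failed";

interface QueuedJob {
  id: string;
  status: Status;
  priority: number; // lower = higher priority
  createdAt: Date;
  startedAt?: Date;
}

// Arguments for: jobs.findOneAndUpdate(claim.filter, claim.update, claim.options)
function buildClaim(now: Date) {
  return {
    filter: { status: "pending" as Status },
    update: { $set: { status: "processing" as Status, startedAt: now } },
    options: { sort: { priority: 1, createdAt: 1 }, returnDocument: "after" as const },
  };
}

// In-memory stand-in for the same step: find the best pending job and flip it.
// In MongoDB the match and the update happen as one atomic operation.
function claimInMemory(queue: QueuedJob[], now: Date): QueuedJob | null {
  const next = queue
    .filter((j) => j.status === "pending")
    .sort((a, b) => a.priority - b.priority || a.createdAt.getTime() - b.createdAt.getTime())[0];
  if (!next) return null;
  next.status = "processing";
  next.startedAt = now;
  return next;
}
```

Because the filter requires `status: "pending"` and the update changes it in the same operation, a second instance issuing the same call cannot match a job that has already been claimed.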
- Mac VMs are expensive ($100-200+/mo via MacStadium/AWS)
- Scribe's article generation doesn't need macOS-specific features
- No iMessage, no Apple Contacts, no macOS UI automation needed
- Linux gives us everything: Node.js, headless browser, API access
- Decision: Linux VM for production
The quality comes from the agentic orchestration, not just the model:
- Research phase — Agent browses web, checks trends, finds angles
- Topic differentiation — Checks existing articles to avoid duplicates
- Writing with reasoning — Claude Opus reasons about structure, SEO, audience
- Image matching — Agent crafts DALL-E prompts that specifically match article content
- Quality gate — Self-checks word count, SEO meta completeness, no banned patterns
- Context awareness — Uses SCRIBE-WALKER-CONTEXT.md for consistent style/rules
- Each job spawns an isolated Scribe Walker session (sub-agent)
- Session receives: brand context (name, niche, location) + SCRIBE-WALKER-CONTEXT.md base rules
- Sessions are isolated — one user's generation doesn't affect another's
- Single agent, multiple sessions — not multi-agent (simpler, sufficient for MVP)
- `seo/SCRIBE-WALKER-CONTEXT.md`: writing rules, quality gates, image procedures
- OpenClaw config (`openclaw.json`): agent settings, auth profiles
- Anthropic auth (setup-token or API key)
- OpenAI API key (for DALL-E)
- MongoDB connection string
- Any learned patterns from `memory/` files relevant to article quality
- Polling interval: How often should OpenClaw check for new jobs? 5s? 10s? Webhook-triggered?
- Article count per plan: Free tier gets 10+5/mo — do we enforce this at app level, OpenClaw level, or both?
- Concurrent generation: Should we limit to 1 job at a time per instance, or allow parallel sessions?
- Error handling UX: What does the user see if generation fails? Auto-retry? Manual retry button?
- Image storage: Resolved. Vercel Blob CDN with sharp JPEG Q85 compression (#51)
- Research depth: Resolved. Tiered: Free = evergreen only, Pro = seasonal, Scale = web research
- Subdomain SSL: Resolved. Migrated to subfolder model (Mar 9, 2026). No wildcard certs needed.
| Date | Decision | Rationale |
|---|---|---|
| 2026-03-04 | MongoDB as job queue (not REST API) | Decoupled, no direct access to OpenClaw, easier to scale |
| 2026-03-04 | Typed job schema, no prompt passthrough | Security — prevents command injection via DB |
| 2026-03-04 | Linux VM over Mac VM for production | Cheaper, sufficient features, Scribe doesn't need macOS |
| 2026-03-04 | Single agent, multiple sessions | Simpler than multi-agent, sufficient for MVP scale |
| 2026-03-04 | Claude Opus 4.6 for all tiers | Quality first, cost modeling later |
| 2026-03-04 | DALL-E 3 for featured images | Proven quality from 1,285+ articles on tahaabbasi.com |
| 2026-03-07 | Scribe Walker as main agent on EC2 | Full OpenClaw lifecycle (compaction, hooks, model updates) without custom plumbing |
| 2026-03-07 | Migration + evolution, not 1:1 copy | Mac prototype proven; EC2 must incorporate all quality intelligence patterns |
| 2026-03-07 | Prompt modules as centralized intelligence | worker/prompts/*.js = single source of truth for quality rules across article gen + image regen |
| 2026-03-07 | System-level systemd for gateway | openclaw gateway install fails over SSH; manual unit file matches Hetzner docs pattern |
| 2026-03-07 | 80% evergreen / 20% seasonal-timely | Evergreen is the backbone for local business SEO; trending = "relevant now" not news slop |
| 2026-03-07 | Exact/near-exact dedup only (no contextual) | Contextual dedup backfires — businesses WANT multiple articles on same topic from different angles |
| 2026-03-07 | Dedup window scales with plan | Free=all(15), Pro=all(50), Scale=last 100, Agency=configurable |
| 2026-03-07 | Rename Business tier to Scale (#64) | Better name for the 150 articles/mo tier |
| 2026-03-07 | Services field for Pro+ only (#65) | Free stays frictionless; Pro+ gets targeted articles via services list |
| 2026-03-07 | Prompt modules stay in JS files | Security > hot-reload. Deploy = git pull + restart. No DB-stored prompts. |
| 2026-03-07 | Stateless agent (no persistent memory) | Consistent with proven Mac Mini pattern. Each job independent. |
| 2026-03-07 | Worker routing for migration testing (#67) | Default = EC2, ?worker=local = Mac Mini. Temporary. |
| 2026-03-07 | IndexNow with dev mode gate (#66) | Submit for subdomains, defer custom domains, never submit in dev/test |
| 2026-03-09 | Subfolder model over subdomains (#80) | New domain needs consolidated SEO authority; every article under tryscribe.co/{brand}/ strengthens root domain |
| 2026-03-09 | Separate content site from dashboard | tryscribe.co = content + marketing, app.tryscribe.co = dashboard. Same Vercel project, middleware-separated. Dashboard deploys don't risk content site. |
| 2026-03-09 | No 301 redirects for old subdomains (test data) | Test sites only, no shared links exist. Legacy subdomain middleware handles any stray hits with 301. |
Added: 2026-03-07 — Documents the evolution from prototype to production
The Scribe Walker concept was proven on Taha Abbasi's Mac Mini, where it operated as a sub-agent within the "Walker Posse" — a family of specialized agents orchestrated by Benny J Walker (the primary OpenClaw agent).
How it worked on Mac Mini:
- Benny (main agent) ran cron jobs that spawned ephemeral Scribe Walker sessions
- Each session received a task message + the full `seo/SCRIBE-WALKER-CONTEXT.md` (~700 lines)
- The session wrote articles, published them, and terminated
- Benny's own agent backing (SOUL.md, MEMORY.md, identity, reliability patterns) provided implicit quality
- OpenClaw managed session lifecycle, compaction, error handling
What made it effective (proven over 1,285+ articles on tahaabbasi.com):
| Capability | How It Worked | Why It Mattered |
|---|---|---|
| Quality gates | Word count enforcement (1000+ min), pre-publish checklist, self-review | Prevented thin/low-quality content from going live |
| Duplicate prevention | Last 100 titles checked contextually (not just exact slug match) | Avoided writing "Why Microneedling Works" 4 articles apart |
| Brand SEO integration | Brand in title, first paragraph, 3-5x naturally, CTAs, author bio | Core product value — what makes Scribe different from generic AI |
| Topic research | Industry awareness, seasonal relevance, niche-specific trends | Timely articles supplement strong evergreen foundation |
| Image-topic matching | DALL-E prompts crafted to match specific article content, not generic | Featured images that actually represent the article topic |
| Content restrictions | Configurable no-go list (topics already published, off-brand content) | Prevented brand damage and redundancy |
| Writing style enforcement | No em dashes, no "crucial"/"utilize", varied sentence length, human voice | Articles read as human-written, not AI-generated |
| Readability & engagement | Conversational tone, relatable scenarios, questions for flow, white space | Readers actually finish articles, not bounce |
| Source attribution | All claims linked to credible sources, original synthesis required | SEO authority, no plagiarism risk |
| CTA structure | Every article ends with warm, varied call-to-action | Drives business for the brand |
The EC2 deployment is NOT a 1:1 migration. It evolves the prototype into a multi-tenant product where Scribe Walker is the main agent on its own dedicated server.
MAC MINI (Prototype):
Benny (main) → spawns ephemeral Scribe Walker → single brand (Taha)
EC2 (Production):
Scribe Walker (main) → spawns article sessions → any brand (multi-tenant)
Scribe Walker on EC2 is equivalent to what Benny is on the Mac Mini — the primary agent with full OpenClaw capabilities: identity, memory, session management, compaction, hooks, model updates.
| Benefit | Description |
|---|---|
| Full OpenClaw lifecycle | Compaction, session memory, command logging — all built-in |
| Model updates for free | New Claude/OpenAI models = openclaw onboard update, no code changes |
| Security updates | OpenClaw security patches apply directly |
| Monitoring | openclaw health, openclaw status, gateway dashboard |
| Identity persistence | SOUL.md, AGENTS.md define consistent behavior across all sessions |
| Hook system | command-logger for diagnostics, session-memory for compaction resilience |
graph TD
subgraph EC2["🖥️ AWS EC2 (t3.small, us-east-1)"]
subgraph SystemD["systemd Services"]
GW["openclaw-gateway.service"]
WK["scribe-worker.service (Scroll Worker)"]
end
subgraph OpenClaw["🤖 OpenClaw Gateway"]
MainAgent["Scribe Walker (main agent)"]
SOUL["SOUL.md — Identity & Principles"]
AGENTS["AGENTS.md — Security & Operations"]
Hooks["Hooks: command-logger, session-memory"]
MainAgent --> ArticleSession1["Article Session (Brand A)"]
MainAgent --> ArticleSession2["Article Session (Brand B)"]
MainAgent --> ArticleSession3["Article Session (Brand C)"]
end
subgraph Worker["📜 Scroll Worker (job-worker.js)"]
Poller["MongoDB Poller"]
PromptBuilder["buildScribePrompt()"]
Modules["Prompt Modules"]
end
WK --> Worker
GW --> OpenClaw
Poller -->|"openclaw agent --agent main"| MainAgent
PromptBuilder --> Modules
end
subgraph PromptModules["📝 Prompt Intelligence (worker/prompts/)"]
AW["article-writing.js — Orchestration"]
QR["quality-rules.js — Quality gates, readability, CTA"]
DI["dalle-image.js — Image generation rules"]
TG["tags.js — Standard tag taxonomy"]
end
subgraph MongoDB["🗄️ MongoDB Atlas"]
Jobs["jobs collection"]
Articles["articles collection"]
Sites["sites collection — brand config"]
end
subgraph External["🌐 External APIs"]
Claude["Claude Opus 4.6"]
DallE["DALL-E 3"]
WebSearch["Web Search (topic research)"]
end
Poller -->|"poll pending jobs"| Jobs
Sites -->|"brand, niche, location, tone, demographics"| PromptBuilder
PromptBuilder -->|"assembled prompt"| Poller
ArticleSession1 --> Claude
ArticleSession1 --> DallE
ArticleSession1 --> WebSearch
ArticleSession1 -->|"write completed articles"| Articles
Modules --> PromptModules
The Scribe Walker's article-writing intelligence is distributed across four layers:
Files in the agent's workspace directory that define WHO the agent is:
| File | Purpose |
|---|---|
| `SOUL.md` | Core identity, principles, writing philosophy |
| `AGENTS.md` | Security rules, operational boundaries, allowed/disallowed actions |
| `IDENTITY.md` | Name, role, platform context |
| `TOOLS.md` | Environment details, available tools |
These are loaded by OpenClaw for every session. They provide the persistent "personality" and guardrails.
Centralized, version-controlled prompt components assembled per-job:
| Module | What It Contains | Used By |
|---|---|---|
| `article-writing.js` | Main orchestration prompt, workflow, MongoDB instructions | Article generation |
| `quality-rules.js` | Word count, readability, engagement rules, CTA format, brand SEO | Article generation, regeneration |
| `dalle-image.js` | Image style rules, demographic matching, size/format requirements | Article generation, image regeneration |
| `tags.js` | Standard tag taxonomy | Article generation |
Key design: These modules are the single source of truth for quality rules. Both article generation and image regeneration call the same functions, ensuring consistency.
Per-customer data that customizes each job:
interface SiteConfig {
brandName: string; // "Sally's Spa"
niche: string; // "Med Spa"
location?: string; // "Daybreak, South Jordan, UT"
tone?: string; // "professional" | "casual" | "authoritative"
topicStyles: string[]; // ["how-to", "tips", "why"]
website?: string; // "https://sallysspa.com"
socials?: { // Social media links for CTAs
facebook?: string;
instagram?: string;
x?: string;
};
demographicProfile?: { // For image generation demographic matching
primaryDemo: string; // "caucasian women"
diversity: string; // "moderate"
region: string; // "suburban"
typicalAge: string; // "30-55"
notes?: string;
};
contentRestrictions?: { // Things the brand does NOT offer/want
excludeTopics?: string[]; // ["botox", "surgery"]
excludeCompetitors?: string[];
requiredDisclosures?: string[];
};
}
These are the proven patterns from the Mac Mini that must be incorporated as agent-level capabilities, not just prompt text:
Problem: Without dedup, the agent writes "5 Benefits of Microneedling" every few runs.
Mac Mini approach: Fetch last 100 titles + slugs, contextual matching (not just exact), reject topic-level duplicates.
EC2 approach (refined):
- Before writing, query MongoDB for the site's existing article titles
- Exact/near-exact title match ONLY — "Why Microneedling Works" and "Why Microneedling Works!" = duplicate. But "Why Microneedling Works" and "Benefits of Microneedling for Your Skin" = ALLOWED (different angle, both valuable)
- No contextual/semantic dedup — this backfires. Businesses WANT multiple articles covering the same topic from different angles. A med spa should have articles about microneedling benefits, preparation, aftercare, comparisons, etc.
- Dedup window scales with plan: Free (15 articles) = check all. Pro (50) = check all. Scale (150) = last 100 cap. Agency = configurable.
- Token cost: Titles only, ~500 tokens for 50 titles. Negligible.
- Implementation: Title matching done in code (Scroll Worker / job-worker.js), NOT passed to Claude. Avoids Claude being overly conservative.
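A minimal sketch of code-side title dedup as described above. The document does not define "near-exact", so this example assumes it means "equal after lowercasing and stripping punctuation"; `normalizeTitle` and `isDuplicateTitle` are hypothetical names, and `existingTitles` is assumed to be ordered oldest to newest.

```typescript
// Collapse case, punctuation, and extra whitespace so trivial variants compare equal.
function normalizeTitle(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// windowSize limits the check to the most recent N titles (e.g. 100 for Scale);
// omit it to check every existing title (Free and Pro).
function isDuplicateTitle(candidate: string, existingTitles: string[], windowSize?: number): boolean {
  const recent = windowSize ? existingTitles.slice(-windowSize) : existingTitles;
  const key = normalizeTitle(candidate);
  return recent.some((title) => normalizeTitle(title) === key);
}
```

With this rule, "Why Microneedling Works!" is rejected when "Why Microneedling Works" already exists, while "Benefits of Microneedling for Your Skin" is allowed as a different angle on the same topic.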
Problem: Generic articles are fine but timely, relevant articles drive more traffic.
Mac Mini approach: Web searches for breaking news in the niche before each run.
EC2 approach (tiered):
- Content mix: 80% evergreen / 20% seasonal-timely. Evergreen is the backbone for local business SEO. "How to Choose the Right Roofing Material" has value for years. Trending = "relevant to their customers right now" (e.g., "Spring Roof Maintenance Checklist"), NOT news slop.
- Free tier: No web research. Evergreen articles only (cheaper, still high quality).
- Pro tier: Light seasonal awareness (time of year, common seasonal topics for niche).
- Scale/Agency: Web research enabled for timely content alongside evergreen.
- Business-specific (Pro+ with services field, see #65): Only write about services/products the brand actually offers. Free tier writes generically about the niche without claiming the brand offers specific services.
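One way to express the tiering above in code, as a sketch only. It assumes the 80/20 mix is applied per job by rounding, and that Free-tier jobs are entirely evergreen; the document states the ratio and the tiers but not how they are computed. `Plan`, `ContentPlan`, and `planContent` are hypothetical names.

```typescript
type Plan = "free" | "pro" | "scale" | "agency";

interface ContentPlan {
  evergreen: number;    // articles with long-lived value
  timely: number;       // seasonal or currently relevant articles
  webResearch: boolean; // whether live web research is enabled
}

function planContent(plan: Plan, count: number): ContentPlan {
  // Free tier is evergreen only; other tiers target roughly 20% timely content.
  const timely = plan === "free" ? 0 : Math.round(count * 0.2);
  return {
    evergreen: count - timely,
    timely,
    // Only Scale and Agency get web research; Pro relies on seasonal awareness.
    webResearch: plan === "scale" || plan === "agency",
  };
}
```

For small jobs the rounding matters: a three-article Pro job yields one timely article, which is above 20%, so tracking the ratio across a site's history may be a better fit than per-job rounding.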
Problem: Articles without strong brand presence don't build SEO authority.
Mac Mini approach: Brand name in title, first paragraph, 3-5x naturally, backlink CTA, author bio.
EC2 approach (carried forward — already in quality-rules.js):
- Brand name in article title (when it fits naturally)
- Brand mentioned in first paragraph as the local expert
- Brand in SEO meta description
- Brand + location combos 2-3x naturally throughout
- CTA section at article end with website/social links
- NOT over-stuffed — natural and helpful
Mac Mini approach: Extensive checklist, word count verification, style rules.
EC2 approach (carried forward — already in quality-rules.js):
- Minimum 1200 words (target 1200-1800)
- No em dashes, no "crucial"/"utilize"
- Varied sentence length, conversational tone
- Relatable scenarios, questions for flow
- Subheadings, bullets, white space for readability
- Original synthesis — not copied from sources
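The mechanically checkable rules in this list can be sketched as a small function. This is an illustration only and not the contents of `quality-rules.js`: `checkArticle` is a hypothetical name, and it covers just the word count, em dash, and banned-word rules, not tone or readability.

```typescript
const MIN_WORDS = 1200;
const BANNED_WORDS = ["crucial", "utilize"];
const EM_DASH = "\u2014";

// Returns a list of problems; an empty list means these checks passed.
function checkArticle(body: string): string[] {
  const problems: string[] = [];

  const wordCount = body.trim().split(/\s+/).filter(Boolean).length;
  if (wordCount < MIN_WORDS) {
    problems.push(`word count ${wordCount} is below ${MIN_WORDS}`);
  }

  if (body.includes(EM_DASH)) {
    problems.push("contains an em dash");
  }

  for (const word of BANNED_WORDS) {
    // Whole-word, case-insensitive match.
    if (new RegExp(`\\b${word}\\b`, "i").test(body)) {
      problems.push(`contains banned word "${word}"`);
    }
  }
  return problems;
}
```

Running checks like these in code, after the model returns, gives a deterministic backstop for rules that are also stated in the prompt.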
Problem: Generic stock-photo-style images that don't match the article topic.
Mac Mini approach: Detailed DALL-E prompts describing the specific subject, never brand names (DALL-E blocks them).
EC2 approach (carried forward — already in dalle-image.js):
- Prompts crafted to match specific article content
- Describe distinctive visual features instead of brand names
- Demographic matching when profile available
- 1792x1024 landscape, realistic stock photo style
- No text, logos, or watermarks
Mac Mini approach: IndexNow ping, published log, delivery announce.
EC2 approach (see #66):
- Update article status in MongoDB (already done)
- Email notification to site owner (already done via Resend)
- Subdomain articles (*.tryscribe.co): Submit to tryscribe.co Google Search Console, Bing Webmaster, IndexNow
- Custom domain articles: Separate workflow, deferred until #7 ships
- ⚠️ DEV MODE GATE: All search submissions gated behind `NODE_ENV=production` AND `ENABLE_SEARCH_SUBMISSION=true`. Both must be true. No test articles in search indices.
- Analytics tracking (future)
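The dev mode gate is a two-condition check. A sketch, assuming the environment is passed in as a plain record so the function stays testable; `searchSubmissionEnabled` is a hypothetical name.

```typescript
// Both variables must be set exactly as shown; anything else disables submission.
function searchSubmissionEnabled(env: Record<string, string | undefined>): boolean {
  return env["NODE_ENV"] === "production" && env["ENABLE_SEARCH_SUBMISSION"] === "true";
}
```

In the worker this would typically be called with `process.env` before any IndexNow or webmaster submission. Because the check is an exact string comparison, a missing variable or a value such as `"1"` keeps submissions off.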
The 22-point quality check is the product's quality standard. This is what differentiates Scribe from AI slop generators. Every article MUST pass this checklist before publishing.
Source of truth: platform/docs/SEO-QUALITY-CHECKLIST.md (replicated from tryscribe.co/seo-guidelines.html — update both when changing).
The full checklist must be incorporated into quality-rules.js as the authoritative quality gate.
Data model:
// In MongoDB site config
services?: string[]; // From curated niche list + custom approved
pendingServices?: string[]; // Custom entries awaiting admin review
excludeServices?: string[]; // Services to explicitly avoid
Prompt pattern: DB stores DATA (list of services). JS stores RULES (how to use that data).
- Prompt: "Only write about services this brand offers: ${services}. Never claim they offer unlisted services."
- If no services configured (free tier): "Write generally about the niche without making specific claims about what this brand offers."
Security:
- Curated services list per niche category, plus "Other" free-text
- Automated blocklist on submission (illegal/inappropriate terms)
- Niche-mismatch soft flag: custom service doesn't match niche → flag for admin, don't block
- Custom "Other" entries go to `pendingServices` and are NOT used in prompts until admin-approved
- Attack surface: users can TYPE anything, but unapproved entries never affect article output
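To illustrate the data/rules split and the approval boundary, here is a sketch of a prompt fragment builder. `buildServicesBlock` and `SiteServices` are hypothetical names; the two sentences are the prompt texts quoted earlier in this section. Handling of `excludeServices` is left out because the document does not say how it enters the prompt.

```typescript
interface SiteServices {
  services?: string[];        // admin-approved or curated entries
  pendingServices?: string[]; // user-typed entries awaiting review
}

// Only approved services reach the prompt; pendingServices is never read here.
function buildServicesBlock(site: SiteServices): string {
  const approved = site.services ?? [];
  if (approved.length === 0) {
    return "Write generally about the niche without making specific claims about what this brand offers.";
  }
  return `Only write about services this brand offers: ${approved.join(", ")}. Never claim they offer unlisted services.`;
}
```

Since the function never touches `pendingServices`, text a user types into the "Other" field cannot influence article output until an admin moves it into `services`.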
Job arrives from MongoDB
│
▼
buildScribePrompt(job)
│
├── Brand details (from job.params / site config)
├── Article IDs to update (from job.articleIds)
├── buildTagsBlock() — standard tag taxonomy
├── buildBrandSeoBlock() — brand SEO integration rules
├── Article structure template
├── buildQualityBlock() — quality rules, readability, engagement
├── buildDalleRulesBlock() — image generation rules
└── Image upload API instructions
│
▼
Complete prompt sent to:
openclaw agent --agent main --message <prompt>
│
▼
OpenClaw spawns article session with:
- Agent identity (SOUL.md, AGENTS.md)
- Assembled prompt (from buildScribePrompt)
- Tools: MongoDB access, DALL-E API, web search, image upload API
│
▼
Agent executes autonomously:
1. Research trending topics for niche/location
2. Check last 100 titles for duplicates
3. Write articles with full quality gates
4. Generate matched DALL-E images
5. Upload images via CDN API
6. Update article docs in MongoDB
7. Report completion
/home/ubuntu/
├── scribe/ # Git repo (tryscribeco/scribe)
│ ├── platform/ # Next.js app (deployed to Vercel)
│ │ └── docs/
│ │ └── ARCHITECTURE.md # This document
│ └── worker/
│ ├── job-worker.js # Scroll Worker — job poller + session spawner
│ └── prompts/ # Prompt intelligence modules
│ ├── article-writing.js # Main prompt builder
│ ├── quality-rules.js # Quality, CTA, brand SEO
│ ├── dalle-image.js # Image generation rules
│ └── tags.js # Standard tag taxonomy
│
├── .openclaw/
│ ├── openclaw.json # OpenClaw config (main agent = Scribe Walker)
│ ├── workspace/ # Agent workspace
│ │ ├── SOUL.md # Scribe Walker identity
│ │ ├── AGENTS.md # Security rules, operational boundaries
│ │ ├── IDENTITY.md # Name, role, platform
│ │ ├── TOOLS.md # Environment details
│ │ └── HEARTBEAT.md # No proactive tasks (headless)
│ └── agents/
│ └── main/
│ ├── agent/
│ │ └── auth-profiles.json # Anthropic auth
│ └── sessions/ # Session history
│
├── /etc/scribe/.env # Secrets (root:root, 600)
└── /etc/systemd/system/
├── openclaw-gateway.service # OpenClaw gateway (always running)
└── scribe-worker.service # Scroll Worker (always running)
| Phase | Status | Details |
|---|---|---|
| AWS Account Foundation (#63) | ✅ Complete | Org, IAM Identity Center, Production account, budget |
| EC2 Provisioning | ✅ Complete | t3.small, Ubuntu 24.04, hardened, Elastic IP |
| Node.js + Repo | ✅ Complete | Node 22, npm install, env file |
| OpenClaw Install + Onboard | ✅ Complete | v2026.3.2, Opus 4.6, hooks enabled |
| OpenClaw Gateway Service | ✅ Complete | System-level systemd, RPC probe OK |
| Agent Workspace (initial) | ✅ Complete | SOUL.md, AGENTS.md deployed |
| Architecture Review | 🔄 In Progress | This document — awaiting approval |
| Agent Workspace (enhanced) | ⬜ Pending | Incorporate quality intelligence from architecture |
| Scroll Worker Service | ✅ Complete | systemd unit for job-worker.js (scribe-worker.service) |
| Cutover & Testing | ⬜ Pending | Test articles, 24h monitor, kill Mac worker |
| Post-Migration Hardening | ⬜ Pending | Structured logging, graceful shutdown, alerting |
| # | Question | Decision |
|---|---|---|
| 1 | Topic research scope | Tiered: Free = none (evergreen only). Pro = seasonal awareness. Scale/Agency = web research. |
| 2 | Duplicate prevention window | Scales with plan: Free = all (15). Pro = all (50). Scale = last 100. Agency = configurable. Exact/near-exact match only. |
| 3 | Content restrictions storage | DB for data (services list), JS for rules (how to use that data). Custom entries require admin approval. |
| 4 | IndexNow | Submit for tryscribe.co subdomains. Custom domains deferred to #7. Dev mode gate required. See #66. |
| 5 | Prompt module updates | Keep in JS files. Deploy = git pull + restart (~10 sec). Security > hot-reload convenience. |
| 6 | Memory across jobs | Stateless. Each job independent. Mac Mini was stateless too (ephemeral sessions). Consistent. |
Worker routing via job document field:
- Default (no field): EC2 picks up the job
- `worker: "local"`: Mac Mini picks up the job (triggered via `?worker=local` API param)
- Both workers run simultaneously during migration validation
- Compare article quality side by side
- Remove routing code after EC2 validated and Mac worker decommissioned
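The routing rule above can be sketched as a predicate plus the matching MongoDB filter each worker would poll with. `WorkerKind`, `shouldClaim`, and `pendingFilter` are hypothetical names; `$exists` is a standard MongoDB query operator.

```typescript
type WorkerKind = "ec2" | "local";

// Decide whether a given worker should pick up a job, based on its worker field.
function shouldClaim(job: { worker?: string }, kind: WorkerKind): boolean {
  return kind === "local" ? job.worker === "local" : job.worker === undefined;
}

// Equivalent query filter for polling the jobs collection.
function pendingFilter(kind: WorkerKind): Record<string, unknown> {
  return kind === "local"
    ? { status: "pending", worker: "local" }
    : { status: "pending", worker: { $exists: false } };
}
```

Keeping the rule in one small function should make it straightforward to delete once the Mac worker is decommissioned.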