feat: outbound image/file attachments from agent → Discord by DrVictorChen · Pull Request #300 · openabdev/openab

DrVictorChen · 2026-04-13T17:06:57Z

What problem does this solve?

Agents running through OpenAB can receive images from users (PR #158), but have no native path to send images/files back. The only workaround is agents calling the Discord REST API directly via curl with the bot token — requiring the agent to know the token and channel ID, sending messages outside OpenAB's thread context, and breaking portability across agents.

Closes #298

At a Glance

┌──────────────────────┐
│   CLI Agent Process   │
│  (Claude Code / Codex │
│   / Cursor / Copilot) │
└──────────┬───────────┘
           │ ACP response text:
           │ "Here's the screenshot
           │  ![img](/tmp/screen.png)"
           ▼
┌──────────────────────────────────┐
│            OpenAB                │
│                                  │
│  1. extract_outbound_attachments │
│     ├─ regex: ![...](/path)      │
│     ├─ allowlist: /tmp/, /var/   │
│     ├─ size ≤ 25MB               │
│     └─ strip marker from text    │
│                                  │
│  2. Text reply (marker removed)  │
│     → edit_message(text)         │
│                                  │
│  3. File attachment              │
│     → CreateAttachment::path()   │
│     → channel.send_message(file) │
└──────────────┬───────────────────┘
               ▼
┌──────────────────────┐
│   Discord Channel     │
│                       │
│   📝 cleaned text     │
│   📎 native attachment│
└───────────────────────┘

Prior Art & Industry Research

OpenClaw:

Uses a MEDIA: /path/to/file text directive pattern. The agent writes MEDIA: <path-or-url> inline in its response text. The gateway's splitMediaFromOutput() (src/media/parse.ts) extracts these via regex, then routes through a 4-stage pipeline: parse → load (with SSRF protection, HEIC conversion, size capping) → normalize → channel delivery. Each channel plugin implements sendPhoto/sendDocument/etc.

Key lessons from OpenClaw:

- The text-directive approach is simple but causes false positives: MEDIA: appearing in tool results or docs triggers unintended file loading (#18780, #16935)
- A security advisory (GHSA-r8g4-86fx-92mq) was issued because MEDIA: could reference arbitrary local files; it was fixed with the mediaLocalRoots directory allowlist
- Relative paths break due to working directory changes before async delivery (#8759)
- Multiple images are sent as separate messages, not albums (#14027)
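
To make the false-positive lesson concrete, here is a minimal sketch of a naive directive scanner. It is a stand-in written for illustration, not OpenClaw's actual parser; the function name `media_paths` is hypothetical.

```rust
/// Naive stand-in for a `MEDIA:` scanner (hypothetical, not OpenClaw's code).
/// It treats everything after the first `MEDIA:` on a line as a path or URL.
fn media_paths(text: &str) -> Vec<&str> {
    text.lines()
        // Keep only lines that contain the directive somewhere.
        .filter_map(|line| line.split_once("MEDIA:"))
        // The remainder of the line is taken as the media reference.
        .map(|(_, rest)| rest.trim())
        .filter(|reference| !reference.is_empty())
        .collect()
}
```

Calling `media_paths("The docs say: write MEDIA: <path-or-url> inline")` returns one "path" even though the sentence only describes the directive, which is the kind of unintended match the issues above report.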

Hermes Agent:

Uses a nearly identical MEDIA:/path/to/file tag convention. BasePlatformAdapter.extract_media() (gateway/platforms/base.py) parses output, returns (media_files_list, cleaned_text). Tags are stripped from displayed text. File routing is extension-based: .png/.jpg → send_image_file(), .ogg/.mp3 → send_voice(), .mp4 → send_video(), everything else → send_document(). Each platform adapter overrides the methods it supports, with graceful fallback to text.
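
The extension-based routing described above can be sketched as follows. This is an illustration in Rust rather than Hermes's Python code; the `Delivery` enum and `route_by_extension` are hypothetical names, and the case-insensitive match is an assumption.

```rust
use std::path::Path;

/// Hypothetical delivery kinds mirroring the four routes described above.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Delivery {
    Image,
    Voice,
    Video,
    Document,
}

/// Pick a delivery kind from the file extension.
/// Assumption: matching is case-insensitive; unknown or missing
/// extensions fall back to a generic document.
fn route_by_extension(path: &str) -> Delivery {
    let ext = Path::new(path)
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("png") | Some("jpg") => Delivery::Image,
        Some("ogg") | Some("mp3") => Delivery::Voice,
        Some("mp4") => Delivery::Video,
        _ => Delivery::Document,
    }
}
```

For example, `route_by_extension("/tmp/report.pdf")` yields `Delivery::Document`. OpenAB skips this step because every file goes through a single CreateAttachment path.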

Key lessons from Hermes:

- Fallback chain: unsupported media types degrade to text (send URL/path), never crash
- Post-stream extraction: MEDIA: tags are stripped from streamed text immediately, but file delivery happens after the stream completes
- Agents lack explicit send_image tools: they embed MEDIA: in text and rely on post-processing (#4701)

Other references — the agent runtimes we tested with:

OpenAB bridges CLI agents to Discord via ACP (Agent Client Protocol). The four agents we tested are:

| Agent Runtime | Vendor | How it produces media references |
| --- | --- | --- |
| Claude Code (`@agentclientprotocol/claude-agent-acp`) | Anthropic | Naturally produces markdown `![alt](path)` when referencing files. Can generate images via tools and reference them in output. |
| Codex CLI | OpenAI | Produces markdown image syntax. File I/O via shell commands; references output files in markdown. |
| Cursor (CLI agent mode) | Anysphere | Produces markdown when describing files. Has filesystem access and can create/read images. |
| GitHub Copilot (CLI agent) | Microsoft | Produces markdown output. Can invoke shell commands to create files and reference them. |

All four agents naturally output ![description](path) when referencing local files — this is standard LLM markdown behavior, which is why we chose markdown syntax over a custom MEDIA: directive.

Comparison table:

| Aspect | OpenClaw | Hermes Agent | OpenAB (this PR) |
| --- | --- | --- | --- |
| Marker syntax | `MEDIA: /path` | `MEDIA:/path` | `![alt](/path)` (markdown) |
| Extraction | Regex on text output | Regex on text output | Regex on text output |
| Path security | `mediaLocalRoots` allowlist (post-CVE) | No explicit allowlist | `/tmp/`, `/var/folders/` allowlist |
| Size limit | 50MB images, 16MB audio/video | Platform-dependent | 25MB (Discord limit) |
| File type routing | Extension → channel-specific method | Extension → adapter method | Single path (CreateAttachment) |
| Multi-file | Separate messages (no batching) | Per-file delivery | Separate messages |
| False positive risk | High (any `MEDIA:` in text) | Moderate | Low (markdown `![]()` is structural) |
| Platforms | 7+ (Telegram, Discord, Slack…) | 17+ | Discord only |
| Tested agents | n/a (built-in agent) | n/a (built-in agent) | Claude Code, Codex, Cursor, Copilot |

Proposed Solution

Outbound Attachment Detection & Upload

1. An agent response containing ![alt](/path/to/file) is intercepted by extract_outbound_attachments() in discord.rs.
2. Security validation: the path must start with /tmp/ or /var/folders/, and the file must exist, be a regular file, and be ≤ 25MB.
3. Valid files are uploaded via serenity's CreateAttachment::path() + CreateMessage::new().add_file().
4. Markers are stripped from the text reply, and leftover blank lines are cleaned up.
5. Attachments are sent as follow-up messages to the channel.
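
The detection, validation, and stripping steps above can be sketched with the standard library alone. This is not the code in discord.rs: the PR uses a regex, whereas this sketch swaps in a hand-rolled scanner so it has no dependencies. The names `extract_outbound`, `parse_marker`, `is_sendable`, and `collapse_blank_lines` are hypothetical, and the upload step is omitted.

```rust
use std::fs;
use std::path::PathBuf;

/// Prefixes and size cap as stated in the PR description.
const ALLOWED_PREFIXES: [&str; 2] = ["/tmp/", "/var/folders/"];
const MAX_BYTES: u64 = 25 * 1024 * 1024;

/// True when the path is allowlisted, exists, is a regular file, and fits the cap.
fn is_sendable(path: &str) -> bool {
    if !ALLOWED_PREFIXES.iter().any(|p| path.starts_with(*p)) {
        return false;
    }
    match fs::metadata(path) {
        Ok(meta) => meta.is_file() && meta.len() <= MAX_BYTES,
        Err(_) => false,
    }
}

/// `s` starts with "!["; returns (path_start, path_len) in bytes if it is
/// a full `![alt](path)` marker.
fn parse_marker(s: &str) -> Option<(usize, usize)> {
    let alt_end = s[2..].find(']')? + 2; // index of the closing ']'
    if !s[alt_end + 1..].starts_with('(') {
        return None;
    }
    let path_start = alt_end + 2;
    let len = s[path_start..].find(')')?;
    Some((path_start, len))
}

/// Collapse runs of blank lines left behind after markers are removed.
fn collapse_blank_lines(text: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    for line in text.lines() {
        let blank = line.trim().is_empty();
        // Skip leading blank lines and blank lines that follow a blank line.
        if blank && out.last().map_or(true, |prev| prev.trim().is_empty()) {
            continue;
        }
        out.push(line);
    }
    out.join("\n").trim_end().to_string()
}

/// Returns the cleaned text plus the files that passed validation.
/// Markers that fail validation are left in the text untouched.
fn extract_outbound(text: &str) -> (String, Vec<PathBuf>) {
    let mut cleaned = String::new();
    let mut files = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find("![") {
        let after = &rest[start..];
        match parse_marker(after) {
            Some((path_start, len)) if is_sendable(&after[path_start..path_start + len]) => {
                files.push(PathBuf::from(&after[path_start..path_start + len]));
                cleaned.push_str(&rest[..start]); // text before the marker
                rest = &after[path_start + len + 1..]; // skip past ')'
            }
            _ => {
                // Not a sendable attachment: keep "![" as text and continue.
                cleaned.push_str(&rest[..start + 2]);
                rest = &rest[start + 2..];
            }
        }
    }
    cleaned.push_str(rest);
    (collapse_blank_lines(&cleaned), files)
}
```

With an existing /tmp/screen.png, `extract_outbound("Here's the screenshot\n![img](/tmp/screen.png)")` returns the text without the marker and a one-element file list; a marker pointing at /etc/passwd stays in the text and yields no file. Note that this sketch checks the raw path string, so it shares the symlink and `..` weaknesses raised later in the review.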

macOS Build Fix (Makefile)

cargo build on macOS 26.3+ produces adhoc-signed binaries that hang at _dyld_start due to AMFI enforcement. make build / make install auto-run codesign --force --sign - on Darwin. No-op on Linux.

Why this approach?

We chose markdown image syntax (![alt](/path)) over MEDIA: for three reasons:

1. Lower false positive risk: both OpenClaw and Hermes suffer from MEDIA: appearing in tool outputs, documentation, or release notes and triggering unintended file loads. Markdown ![]() syntax is structurally distinct and rarely appears in agent prose outside of intentional use.
2. Natural for all 4 tested agent runtimes: Claude Code, Codex, Cursor, and Copilot all produce markdown ![description](path) when referencing images. No special prompting is needed because the agents already know this syntax.
3. Minimal code change: OpenAB is Discord-only, so we don't need the multi-platform routing complexity of OpenClaw (4-stage pipeline) or Hermes (per-platform adapter chain). A single regex + CreateAttachment covers the use case in ~70 lines of Rust.

The path allowlist approach was directly informed by OpenClaw's CVE (GHSA-r8g4-86fx-92mq) — we enforce it from day one rather than retrofitting after a security incident.

Alternatives Considered

| Alternative | Why not chosen |
| --- | --- |
| `MEDIA:` directive (OpenClaw/Hermes style) | Higher false positive risk; not native markdown, so agents would need explicit prompting to use it |
| Dedicated XML tag `<attachment path="..." />` | Unambiguous but requires agents to learn a custom syntax; LLMs don't naturally produce this |
| ACP content block extension | Future-proof but ACP spec doesn't yet support file/image output blocks; would require upstream spec changes |
| Agent calls Discord API directly (current workaround) | Works but requires bot token + channel ID in agent context; messages land outside OpenAB's session/thread management |
| serenity `EditMessage` with attachment | serenity's `EditMessage` doesn't support adding attachments to existing messages; must use `CreateMessage` for follow-up |

Validation

- cargo check passes
- cargo test: 39/39 pass, including 5 new outbound attachment tests:
  - outbound_no_markers_passthrough: text without markers unchanged
  - outbound_extracts_tmp_file: /tmp/ file correctly extracted
  - outbound_blocks_non_allowlisted_path: /etc/passwd blocked
  - outbound_ignores_nonexistent_file: missing file kept as text
  - outbound_handles_multiple_attachments: multiple files in one response
- Live tested with 4 different agent runtimes on macOS via launchd multi-agent deployment:
  - Claude Code (Anthropic, via @agentclientprotocol/claude-agent-acp)
  - Codex CLI (OpenAI)
  - Cursor (Anysphere, IDE agent mode)
  - GitHub Copilot (Microsoft, CLI agent)
- Verified in Discord: image appears as native attachment, text marker stripped, non-allowlisted paths blocked

Observation from live testing:

Some agents (notably Codex and Copilot) attempted to bypass the native mechanism by calling the Discord REST API directly via curl with the bot token. This confirms the original problem statement — agents resort to hacky workarounds when no native path exists. Notably, even when an agent used the curl workaround, OpenAB's outbound handler still triggered simultaneously on the ![](/path) markers in the same response, successfully uploading the file through the native path. After the agents discovered the native mechanism (by reading discord.rs), they switched to using ![alt](/path) exclusively.

Log output confirming successful upload:

INFO openab::discord: outbound attachment found path=/tmp/test_outbound.png size=588
INFO openab::discord: outbound attachment sent path=/tmp/test_outbound.png
WARN openab::discord: outbound attachment blocked: path not in allowlist path=/path/to/file

🤖 Generated with Claude Code

Add extract_outbound_attachments() to detect ![alt](/path/to/file) markers in agent responses and upload matching files via CreateAttachment. Security: path allowlist (/tmp/, /var/folders/), 25MB size cap. Includes unit tests for extraction, blocking, and edge cases.
Ref: openabdev#298
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

macOS 26.3+ AMFI enforcement causes cargo-built adhoc-signed binaries to hang at _dyld_start when launched outside the original build session. `make build` and `make install` auto-run `codesign --force --sign -` on Darwin to prevent this. No-op on Linux.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

chaodu-agent · 2026-04-14T22:43:32Z

Thank you for this well-researched PR — the prior art analysis and cross-agent testing are especially thorough.

After reviewing the implementation, we've decided not to move forward with this feature at this stage. Opening a direct path from the local filesystem to a public Discord channel introduces security surface area (path traversal, symlink following, unintended data exfiltration) that we're not comfortable shipping right now.

We appreciate the effort and the detailed write-up. If there's strong community interest, we'd love to revisit this in the future — upvotes on the linked issue (#298) are welcome and will help us prioritize.

masami-agent

Agreeing with chaodu-agent's assessment — the security surface area is the right concern to prioritize here.

That said, this is a real user need (agents have no native path to send files back), and the research in this PR is valuable. Here's a suggested path forward:

Security hardening needed before this can ship

1. Symlink resolution: std::fs::metadata() follows symlinks, so ln -s /etc/passwd /tmp/innocent.txt bypasses the allowlist. Fix: use std::fs::canonicalize() before checking the prefix:
```
// Resolve symlinks and `..` first; canonicalize() fails if the file is missing.
let canonical = std::fs::canonicalize(&path).ok()?;
// Note: on macOS /tmp is a symlink to /private/tmp, so the allowed prefixes
// must be canonicalized as well or every /tmp/ file will be rejected.
let allowed = OUTBOUND_ALLOWED_PREFIXES
    .iter()
    .any(|prefix| canonical.starts_with(prefix));
```
2. Path traversal: ![img](/tmp/../etc/passwd) could bypass prefix checks. canonicalize() also resolves .. components, so fix #1 covers this too.
3. Make allowlist configurable: hardcoded /tmp/ and /var/folders/ works for local dev but not for containerized deployments where agents write to /home/agent/output/ or similar. Add outbound.allowed_dirs to config.toml.
4. Opt-in, not opt-out: this feature should be disabled by default and require explicit outbound.enabled = true in config. Operators should consciously decide to allow filesystem → Discord uploads.
5. Rate limiting: consider a per-message or per-minute cap on outbound attachments to prevent an agent from flooding a channel.
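
A slightly fuller sketch of the canonicalize-then-check idea, assuming the allowlist arrives as a list of directories (for example from the proposed outbound.allowed_dirs). The function name `resolve_allowed` is hypothetical. It canonicalizes the allowed roots as well as the candidate, which matters on macOS, where /tmp resolves to /private/tmp.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Resolve `candidate` and return its canonical path only if it is a regular
/// file located under one of `allowed_dirs`. Hypothetical helper, not PR code.
fn resolve_allowed(candidate: &Path, allowed_dirs: &[PathBuf]) -> Option<PathBuf> {
    // canonicalize() resolves symlinks and `..`; it errors if the path is missing.
    let canonical = fs::canonicalize(candidate).ok()?;
    let allowed = allowed_dirs.iter().any(|dir| {
        // Canonicalize the root too, so both sides are compared in resolved form.
        // Path::starts_with compares whole components, so /tmpfoo is not under /tmp.
        fs::canonicalize(dir)
            .map(|root| canonical.starts_with(&root))
            .unwrap_or(false)
    });
    if allowed && canonical.is_file() {
        Some(canonical)
    } else {
        None
    }
}
```

Uploading the returned canonical path, rather than the original string, narrows the window in which a symlink could be swapped between the check and the read, although it does not close it entirely.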

Suggested next steps

- Open a follow-up issue capturing these security requirements
- The core implementation (regex extraction, CreateAttachment, marker stripping, tests) is solid and can be reused once the security layer is in place
- The Makefile change (macOS codesign fix) is unrelated and should be a separate PR; it's useful on its own

@DrVictorChen — thank you for the thorough research and cross-agent testing. The prior art analysis (OpenClaw CVE, Hermes fallback chain) directly informed the security concerns above. Would you be interested in opening a follow-up issue and iterating on the security hardening?

masami-agent · 2026-04-15T06:45:29Z

Hi @DrVictorChen,

We've opened a follow-up issue #355 that captures the security requirements for this feature. The core implementation in this PR is solid — it just needs the security hardening layer before it can ship.

When you're ready, you can either:

- Update this PR to address the 5 security items listed in #355 (feat: outbound file attachments from agent → Discord (with security hardening))
- Or close this and open a fresh PR against #355

The key changes needed:

- std::fs::canonicalize() before prefix check (symlink + path traversal)
- Configurable outbound.allowed_dirs in config.toml
- Opt-in (outbound.enabled = false by default)
- Rate limiting on outbound attachments
- Makefile change should be a separate PR
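
One way the opt-in flag and the rate limit could fit together is sketched below. Only the outbound.enabled and outbound.allowed_dirs keys come from this thread; the `max_per_minute` field, its default of 5, the 60-second sliding window, and the type names are assumptions for illustration. Parsing config.toml is left out.

```rust
use std::collections::VecDeque;
use std::path::PathBuf;
use std::time::{Duration, Instant};

/// Hypothetical settings mirroring the proposed `outbound.*` keys.
#[derive(Debug, Clone)]
struct OutboundConfig {
    enabled: bool,
    allowed_dirs: Vec<PathBuf>,
    max_per_minute: usize, // assumed key, not named in the thread
}

impl Default for OutboundConfig {
    fn default() -> Self {
        // Opt-in: uploads stay off until an operator enables them.
        Self { enabled: false, allowed_dirs: Vec::new(), max_per_minute: 5 }
    }
}

/// Sliding-window limiter: at most `max` uploads in any 60-second window.
struct UploadLimiter {
    max: usize,
    window: Duration,
    sent: VecDeque<Instant>,
}

impl UploadLimiter {
    fn new(max: usize) -> Self {
        Self { max, window: Duration::from_secs(60), sent: VecDeque::new() }
    }

    /// Records the upload and returns true if the window still has room.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Drop timestamps that have aged out of the window.
        while let Some(&oldest) = self.sent.front() {
            if now.duration_since(oldest) >= self.window {
                self.sent.pop_front();
            } else {
                break;
            }
        }
        if self.sent.len() < self.max {
            self.sent.push_back(now);
            true
        } else {
            false
        }
    }
}

/// Gate for a single upload: the feature must be enabled and under the cap.
fn may_upload(cfg: &OutboundConfig, limiter: &mut UploadLimiter, now: Instant) -> bool {
    // Short-circuits, so a disabled feature never consumes limiter budget.
    cfg.enabled && limiter.try_acquire(now)
}
```

A caller would build the limiter once per channel with `UploadLimiter::new(cfg.max_per_minute)` and check `may_upload(&cfg, &mut limiter, Instant::now())` before each send. Passing `now` in explicitly keeps the limiter easy to test without sleeping.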

Looking forward to the next iteration! Let us know if you have any questions.

DrVictorChen and others added 2 commits April 14, 2026 01:06

DrVictorChen requested a review from thepagent as a code owner April 13, 2026 17:06

chaodu-agent mentioned this pull request Apr 13, 2026

RFC 002: PR Contribution Guidelines #302

Open

chaodu-agent added the p2 Medium — planned work label Apr 14, 2026

masami-agent reviewed Apr 15, 2026

masami-agent mentioned this pull request Apr 15, 2026

feat: outbound file attachments from agent → Discord (with security hardening) #355

Open

obrutjack added the feature label Apr 15, 2026

DrVictorChen wants to merge 2 commits into openabdev:main from DrVictorChen:feat/outbound-attachments-clean


5 participants
