Summary
coven-desktop-use currently supports macOS only (via Peekaboo) and returns
"unsupported" on Linux/Windows. Requesting a Linux backend (X11 + Wayland) so
the OpenClaw desktop_use tool works for users running Ubuntu / Linux as
their primary OS.
Motivation
Several OpenCoven users run Linux as their daily driver (embedded engineering,
headless agents, dev VMs, CI). Without a Linux backend the desktop_use tool
has to be disabled in plugin config, which removes a significant capability
from the agent on those machines.
Scope
Lean v1 that mirrors the existing JSON envelope and --confirm policy, with
per-session backends instead of a single bundled tool:
- X11: scrot / maim (capture), xdotool (input), wmctrl (focus)
- Wayland: grim (capture), wtype / ydotool (input), swaymsg (focus on Sway)
doctor returns a tool inventory and the exact apt install line for any
missing pieces. macOS path is unchanged.
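To make the per-session idea concrete, here is a minimal sketch of how backend selection and the doctor inventory could work. The tool names come from the list above. Everything else is an assumption for illustration, not the existing coven-desktop-use code: the function names (detect_session, doctor), the dictionary shape, the "first candidate is the one to install" rule, and the tool-to-package mapping.

```python
import os
import shutil

# Tools per session type, taken from the Scope list. The structure is hypothetical.
BACKEND_TOOLS = {
    "x11": {"capture": ["scrot", "maim"], "input": ["xdotool"], "focus": ["wmctrl"]},
    "wayland": {"capture": ["grim"], "input": ["wtype", "ydotool"], "focus": ["swaymsg"]},
}

# Assumption: Debian/Ubuntu package names match the binary names,
# except swaymsg, which ships in the sway package.
APT_PACKAGES = {"swaymsg": "sway"}


def detect_session(env=None):
    """Pick a backend from the usual session environment variables."""
    env = os.environ if env is None else env
    session = env.get("XDG_SESSION_TYPE", "").lower()
    if session in BACKEND_TOOLS:
        return session
    # Fall back to display sockets when XDG_SESSION_TYPE is unset or "tty".
    if env.get("WAYLAND_DISPLAY"):
        return "wayland"
    if env.get("DISPLAY"):
        return "x11"
    return "unsupported"


def doctor(session, which=shutil.which):
    """Report which tools are present and build an apt line for missing roles."""
    inventory = {}
    missing = []
    for role, candidates in BACKEND_TOOLS.get(session, {}).items():
        found = [tool for tool in candidates if which(tool)]
        inventory[role] = found
        if not found:
            # Suggest the first candidate for a role that has no tool installed.
            first = candidates[0]
            missing.append(APT_PACKAGES.get(first, first))
    install = "sudo apt install " + " ".join(missing) if missing else None
    return {"session": session, "inventory": inventory, "install": install}
```

Passing `which` as a parameter keeps the inventory logic testable without touching the real PATH. A role counts as satisfied when any one of its candidate tools is installed, so a machine with maim but no scrot would not be told to install scrot.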
Out of scope (deferred)
- AT-SPI element-tree annotation: Linux inspect returns a screenshot but
no B1/T2 element ids in v1. Callers use --coords x,y instead.
- Active-window capture on vanilla Wayland (grim has no concept of focused
window).
- Real scroll-wheel events on Wayland: degrades to Page_Up/Page_Down
via wtype, marked degraded in the response.
- Window focus on GNOME Mutter / KDE KWin: no public CLI exists.
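The scroll fallback described in the list above could look roughly like the sketch below. It assumes xdotool's mouse buttons 4 and 5 for wheel up and down on X11, and wtype's -k option for the Page_Up/Page_Down key presses on Wayland. The function name scroll_command and the returned dictionary shape are hypothetical; only the degraded marker comes from this proposal.

```python
def scroll_command(session, direction, steps=1):
    """Build the argv for a scroll action and flag whether it is a fallback."""
    if direction not in ("up", "down"):
        raise ValueError(f"unknown direction: {direction}")
    if session == "x11":
        # X11 maps wheel up/down to mouse buttons 4 and 5.
        button = "4" if direction == "up" else "5"
        argv = ["xdotool", "click", "--repeat", str(steps), button]
        return {"argv": argv, "degraded": False}
    if session == "wayland":
        # No real wheel events in v1: press Page_Up/Page_Down once per step.
        key = "Page_Up" if direction == "up" else "Page_Down"
        argv = ["wtype"]
        for _ in range(steps):
            argv += ["-k", key]
        return {"argv": argv, "degraded": True}
    raise ValueError(f"unsupported session: {session}")
```

Returning the degraded flag alongside the command lets the caller copy it straight into the JSON response, so an agent can tell a page jump from a true wheel scroll.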
Compatibility
- macOS path unchanged.
- TS plugin schema unchanged; new fields are additive.
- Typed-text redaction extended to cover xdotool/wtype's -- separator.
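As an illustration of the redaction point, the sketch below masks every argument after a -- separator before a command is logged or echoed back. This is a hypothetical helper (redact_argv) written for this issue, not the existing redaction code, and it assumes the typed text is always passed after the separator.

```python
def redact_argv(argv, placeholder="[REDACTED]"):
    """Return a copy of argv with everything after the first "--" masked."""
    if "--" not in argv:
        return list(argv)
    idx = argv.index("--")
    # Keep the command and its options; mask each typed-text argument.
    return list(argv[: idx + 1]) + [placeholder] * (len(argv) - idx - 1)
```

For example, `redact_argv(["xdotool", "type", "--", "hunter2"])` returns `["xdotool", "type", "--", "[REDACTED]"]`, while a command without the separator passes through unchanged.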