From db34fa11c47e99c7191f5de1d1bc517826eeaf08 Mon Sep 17 00:00:00 2001
From: Cmochance <3216202644@qq.com>
Date: Wed, 20 May 2026 19:56:29 +0800
Subject: [PATCH 1/4] fix(adapters): make the apply_patch diff UI work on chat-completions providers (#235)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

## Symptom

When users connect Codex Desktop through App Transfer + DeepSeek (or Kimi /
MiMo), every API call returns 200, but apply_patch tool calls are consistently
aborted, the Codex Desktop frontend never renders the +/- diff card, and file
editing is completely broken. shell_command / fetch work normally.

## Root cause (verified against upstream source at openai/codex @ 000bf5c)

Codex CLI registers apply_patch as a freeform tool:

- `codex-rs/protocol/src/openai_models.rs:202-206`: the `ApplyPatchToolType`
  enum currently has **only the `Freeform`** variant (community proposal #14046
  to add a Function variant has not been merged)
- `codex-rs/core/src/tools/handlers/apply_patch_spec.rs`: the wire shape is
  `ToolSpec::Freeform { format: { type:"grammar", syntax:"lark" } }`
- `codex-rs/core/src/tools/router.rs:90-134`: the response side routes by wire
  item type:
  `ResponseItem::FunctionCall` → `ToolPayload::Function { arguments }`,
  `ResponseItem::CustomToolCall` → `ToolPayload::Custom { input }`
- `codex-rs/core/src/tools/handlers/apply_patch.rs:324`: the apply_patch handler
  strictly requires `ToolPayload::Custom`; given a Function payload it returns
  `"apply_patch handler received unsupported payload"` → abort

On the response side, this repo's adapter renders every `tool_calls[]` from chat
upstreams such as DeepSeek as `function_call` wire, so the Codex CLI router
mismatches immediately → abort. In addition, when the request side lowers a
custom tool to a function, the upstream "do not wrap the patch in JSON"
description actively misleads the model on the chat path, and no V4A format
sample is provided.

## Fix (option B: bidirectional bridging in the adapter)

### Request side `responses/request/tools.rs`

Special-case `name == "apply_patch"` when lowering custom → function:

- Replace the outer description with V4A guidance that is accurate for the chat
  path (`*** Begin Patch`, file operation headers, hunk markers, relative paths,
  newlines written as `\n` escapes inside the JSON string)
- Make the `input` parameter description mirror the key V4A constraints

### Response side `responses/converter.rs`

Special-case `name == "apply_patch"` and emit Responses `custom_tool_call` wire
instead of `function_call`:

- `output_item.added` uses `type:"custom_tool_call"` (empty `input`)
- Intermediate args deltas are **not** emitted (this avoids streaming extraction
  of `input` from a JSON string that is still accumulating)
- On close, emit `response.custom_tool_call_input.delta` + `.done` +
  `output_item.done` (`type:"custom_tool_call"`) in one shot
- Input extraction: decode it from the `{"input":""}` JSON; if the args are not
  JSON or lack an `input` field, pass the whole string through unchanged (so
  Codex CLI's parse_patch reports a readable error instead of aborting silently)
- The final envelope `output[]` uses the same input string (cached on
  PendingToolCall to prevent drift between close and envelope build)
- When interrupted (no finish_reason and not terminated by [DONE]), emit
  `status:"incomplete"` and **skip** `input.done`, so strict clients do not
  execute a partial patch when the stream is cut off midway (a safety guard for
  a destructive tool)
- `call_id` is frozen once `output_item.added` has been emitted and is no longer
  overwritten by later chunks (this avoids exposing two different call_ids for
  the same item)
- Add tracing telemetry: shim engaged (info), late-arriving name (warn), empty
  args (warn), and JSON parse failures split by cause (debug for bare V4A /
  warn for genuinely broken input)

### Request-side multi-turn replay `responses/request.rs` (BLOCKER)

At turn N+1, Codex CLI replays the previous turn's
`ResponseItem::CustomToolCall` / `CustomToolCallOutput` to us through `input[]`.
The original `input_item_to_messages` only handled `function_call` /
`function_call_output`, so these two item types silently fell into the `_ =>`
fallback and were dropped → multi-turn context was lost. This commit adds two
branches:

- `custom_tool_call` → `role:assistant` + `tool_calls[]` (function-call shape,
  with arguments wrapped as a `{"input":""}` JSON string, consistent with the
  first-turn lowering shape)
- `custom_tool_call_output` → `role:tool` + `tool_call_id` + content

## Tests

Added 8 regression tests (6 response side + 2 request side):

- chat tool_calls(apply_patch) → custom_tool_call wire
- JSON args / bare V4A fallback / missing input field
- interrupted stream → status=incomplete + skip input.done
- streaming output_item.done.input == envelope.output[].input
- custom_tool_call input → assistant.tool_calls (multi-turn replay)
- custom_tool_call_output → role:tool (multi-turn replay)
- request-side V4A description injection (apply_patch vs ordinary custom tools)

`cargo test --workspace`: the full suite passes (506 adapter unit + 12+10+3
integration, matching the original repo; the only intermittent concurrency
flake, `gemini_oauth::cancel_slot_epoch_*`, is unrelated to this commit and
passes when run serially).

## Note

The official Codex / GPT login path is unaffected (it uses the native Responses
API and does not go through the chat adapter conversion). This fix strictly
targets the chat completions provider → Responses direction.

Refs #235
---
 README.en.md                                  |   1 +
 README.md                                     |   1 +
 crates/adapters/src/responses/converter.rs    |
574 ++++++++++++++++--
 crates/adapters/src/responses/request.rs      |  64 ++
 .../adapters/src/responses/request/tests.rs   | 126 ++++
 .../adapters/src/responses/request/tools.rs   |  72 ++-
 6 files changed, 800 insertions(+), 38 deletions(-)

diff --git a/README.en.md b/README.en.md
index fde25872..0217d372 100644
--- a/README.en.md
+++ b/README.en.md
@@ -41,6 +41,7 @@ With any provider enabled, Codex CLI's model picker shows ` / ,
+}
+
+/// Codex CLI registers `apply_patch` as a freeform tool
+/// (`codex-rs/core/src/tools/handlers/apply_patch_spec.rs`:
+/// `ToolSpec::Freeform { name: "apply_patch", ... }`). The response-side router
+/// (`codex-rs/core/src/tools/router.rs:92-130`) routes by wire item type:
+/// `ResponseItem::FunctionCall` → `ToolPayload::Function { arguments }`,
+/// `ResponseItem::CustomToolCall` → `ToolPayload::Custom { input }`. The
+/// apply_patch handler strictly requires `ToolPayload::Custom`; given a
+/// Function payload it returns
+/// `"apply_patch handler received unsupported payload"` → abort
+/// (`codex-rs/core/src/tools/handlers/apply_patch.rs:324`). This adapter
+/// renders the `tool_calls[]` coming back from chat completions providers
+/// (DeepSeek / Kimi / MiMo, etc.) as `function_call` wire by default, so
+/// apply_patch must be special-cased: Codex CLI only avoids the abort when it
+/// receives `custom_tool_call` wire.
+///
+/// The name check is centralized here so the string stays identical to
+/// `request/tools.rs::APPLY_PATCH_TOOL_NAME` (the request-side description
+/// special case and the response-side wire repackaging must trigger on the
+/// same name).
+fn is_apply_patch_tool_name(name: &str) -> bool {
+    name == "apply_patch"
 }

 #[derive(Debug)]
@@ -489,6 +524,13 @@ impl ChatToResponsesConverter {
                 .clone()
                 .unwrap_or_else(|| format!("call_{}_{}", self.fc_id_seed, openai_index));
             let name = tc.function.name.clone().unwrap_or_default();
+            // **Trade-off**: the wire shape (function_call vs custom_tool_call)
+            // is decided once at open time from the **first-frame name**; a name
+            // backfilled by a later frame does not change the wire. In testing,
+            // DeepSeek / Kimi / MiMo all send the name in the first frame. In the
+            // extreme case where the first-frame name is empty and apply_patch
+            // only arrives later, this falls back to function_call wire (same as
+            // the current behavior; Codex CLI will still abort that apply_patch
+            // call once), which is no worse than before the fix.
+            let is_apply_patch = is_apply_patch_tool_name(&name);
             self.tool_calls.insert(
                 openai_index,
PendingToolCall { @@ -498,35 +540,79 @@ impl ChatToResponsesConverter { name: name.clone(), args_acc: String::new(), closed: false, + is_apply_patch, + output_item_added_emitted: false, + apply_patch_input: None, }, ); - // 如果 function name 来自 namespace 包(从 original_request.tools - // 反查表查到),给 item 加 `namespace` 字段 — Codex.app 客户端 - // dispatch namespace 工具时这是必要字段(strings 实证 binary 含 - // `dynamic tool namespace must not be empty for` 校验,缺字段会 - // 报 `unsupported call: `)。 - let namespace = self.lookup_namespace_for(&name).map(str::to_owned); - let mut item = json!({ - "type": "function_call", - "id": fc_id, - "call_id": call_id, - "name": name, - "arguments": "", - "status": "in_progress", - }); - if let Some(ns) = namespace.as_ref() { - item["namespace"] = Value::String(ns.clone()); + // apply_patch:wire 必须是 `custom_tool_call`(裸 `input` 字段)。 + // 中间增量 delta **不 emit** — chat 上游给的 args 是 JSON 字符串 + // 增量(`{"input": "*** Begin Patch\n..."`),从 JSON 字符串拼接 + // 过程中流式提取 `input` 字段值需要专门的 streaming JSON state + // machine,本提交不引入。退而求其次:close 时一次性解 args 再 + // emit input.delta + output_item.done,代价是客户端看不到逐字 + // 流出的 diff(一次性出现整段 patch)。对一个长期完全不工作的 + // 功能,这是合理的第一步;后续可优化为真流式。 + if is_apply_patch { + tracing::info!( + target = "adapters::apply_patch", + call_id = %call_id, + "apply_patch shim engaged: rewriting chat function_call wire to Responses custom_tool_call", + ); + let item = json!({ + "type": "custom_tool_call", + "id": fc_id, + "call_id": call_id, + "name": name, + "input": "", + "status": "in_progress", + }); + emit_event( + out, + &mut self.sequence_number, + "response.output_item.added", + json!({ + "type": "response.output_item.added", + "output_index": output_index, + "item": item, + }), + ); + } else { + // 如果 function name 来自 namespace 包(从 original_request.tools + // 反查表查到),给 item 加 `namespace` 字段 — Codex.app 客户端 + // dispatch namespace 工具时这是必要字段(strings 实证 binary 含 + // `dynamic tool namespace must not be empty for` 校验,缺字段会 + // 报 `unsupported call: `)。 + 
let namespace = self.lookup_namespace_for(&name).map(str::to_owned); + let mut item = json!({ + "type": "function_call", + "id": fc_id, + "call_id": call_id, + "name": name, + "arguments": "", + "status": "in_progress", + }); + if let Some(ns) = namespace.as_ref() { + item["namespace"] = Value::String(ns.clone()); + } + emit_event( + out, + &mut self.sequence_number, + "response.output_item.added", + json!({ + "type": "response.output_item.added", + "output_index": output_index, + "item": item, + }), + ); + } + // output_item.added 已 emit。后续帧 backfill `id` 不应再换 call_id + // (否则 `output_item.added` 与后续 `input.delta` / `output_item.done` + // 用不同 call_id,严格客户端会两次解读为不同 item)。同样地, + // apply_patch 的 `is_apply_patch` 决策也已固定。 + if let Some(pending) = self.tool_calls.get_mut(&openai_index) { + pending.output_item_added_emitted = true; } - emit_event( - out, - &mut self.sequence_number, - "response.output_item.added", - json!({ - "type": "response.output_item.added", - "output_index": output_index, - "item": item, - }), - ); } // 后续帧可能补全 name(罕见但兼容) @@ -535,16 +621,31 @@ impl ChatToResponsesConverter { if let Some(pending) = self.tool_calls.get_mut(&openai_index) { if pending.name.is_empty() { pending.name = name.to_owned(); + if is_apply_patch_tool_name(name) && !pending.is_apply_patch { + // 罕见极端:首帧 name 为空,后续才补 apply_patch。 + // `output_item.added` 已经 emit `function_call` wire, + // 不能回退。这一调用 Codex CLI 仍会 abort,但起码我们 + // 在日志里能看到根因。 + tracing::warn!( + target = "adapters::apply_patch", + call_id = %pending.call_id, + "apply_patch tool name arrived AFTER first frame; wire stays function_call and Codex CLI will reject. 
Investigate upstream provider chunking.", + ); + } } } } } - // call_id 也可能在后续帧才出现 + // call_id 也可能在后续帧才出现 — 但只在 `output_item.added` 还没 emit + // 时才允许替换。已 emit 后再换会让客户端看到同一 item 用两个不同 + // call_id。 if let Some(id) = tc.id.as_deref() { if !id.is_empty() { if let Some(pending) = self.tool_calls.get_mut(&openai_index) { - // 只在首次给出 id 时覆盖(避免相同 index 不同 id 的混乱) - if pending.call_id.starts_with("call_") && pending.call_id.contains('_') { + if pending.output_item_added_emitted { + // 不再覆盖 — 同 item 已经对外暴露 call_id。 + } else if pending.call_id.starts_with("call_") && pending.call_id.contains('_') + { // 兜底生成的 call_id 形如 `call__`,真 id 来了就替换 if !pending.call_id.starts_with(id) && pending.call_id != id { pending.call_id = id.to_owned(); @@ -554,11 +655,16 @@ impl ChatToResponsesConverter { } } - // arguments delta(增量字符串) + // arguments delta(增量字符串)。apply_patch 路径**只**累积不 emit + // (理由见上文 open 处注释);非 apply_patch 仍逐 chunk emit + // `function_call_arguments.delta` 让客户端看到逐字流。 if let Some(args) = tc.function.arguments.as_deref() { if !args.is_empty() { if let Some(pending) = self.tool_calls.get_mut(&openai_index) { pending.args_acc.push_str(args); + if pending.is_apply_patch { + return; + } let item_id = pending.fc_id.clone(); let output_index = pending.output_index; emit_event( @@ -577,10 +683,10 @@ impl ChatToResponsesConverter { } } - fn close_tool_call(&mut self, openai_index: u32, out: &mut Vec) { + fn close_tool_call(&mut self, openai_index: u32, interrupted: bool, out: &mut Vec) { // 先把所有需要的字段 clone 出来,避免 mutable borrow 跟 // self.lookup_namespace_for 的 immutable borrow 冲突 - let (fc_id, call_id, name, args_acc, output_index, already_closed) = { + let (fc_id, call_id, name, args_acc, output_index, already_closed, is_apply_patch) = { let Some(pending) = self.tool_calls.get(&openai_index) else { return; }; @@ -591,11 +697,132 @@ impl ChatToResponsesConverter { pending.args_acc.clone(), pending.output_index, pending.closed, + pending.is_apply_patch, ) }; if already_closed { 
return; } + + if is_apply_patch { + // 从累积的 chat function args(标准形态 `{"input":""}`) + // 提取裸 V4A 文本。降级:模型可能直接吐裸 V4A(不包 JSON)— 历史 + // 上 freeform 工具的输出就是这个形态,某些 chat 上游可能没把它 + // 重新包成 JSON。fallback 把 args_acc 整段当 input,让上游能看到 + // 解析失败的具体内容(对调试 + 让 apply_patch parser 给出可读 + // 错误而不是静默 abort 都有用)。 + if args_acc.trim().is_empty() { + tracing::warn!( + target = "adapters::apply_patch", + call_id = %call_id, + "apply_patch tool was called with empty arguments — model likely misbehaving or provider stripped args", + ); + } + let input = extract_apply_patch_input(&args_acc); + // 缓存 input 到 pending,供 `tool_call_item_completed`(envelope + // output[] 终态)读,避免重复 parse 与潜在 drift。 + if let Some(pending) = self.tool_calls.get_mut(&openai_index) { + pending.apply_patch_input = Some(input.clone()); + } + // interrupted 中断时,patch 文本可能 mid-stream 被截断 — emit + // `status="incomplete"` 让 Codex CLI 看到 apply_patch handler 不 + // 应该执行 partial patch(apply_patch destructive,partial 执行 + // 可能在意外目标上写入意外内容)。同时 skip `input.done`(很多 + // 严格客户端在 `.done` 才触发执行)。 + if interrupted { + tracing::warn!( + target = "adapters::apply_patch", + call_id = %call_id, + args_len = args_acc.len(), + "apply_patch tool call cut off mid-stream (no finish_reason and not from [DONE]). 
Emitting output_item with status=incomplete; skipping input.done to prevent partial patch execution.", + ); + let item = json!({ + "type": "custom_tool_call", + "id": fc_id, + "call_id": call_id, + "name": name, + "input": input, + "status": "incomplete", + }); + emit_event( + out, + &mut self.sequence_number, + "response.output_item.done", + json!({ + "type": "response.output_item.done", + "output_index": output_index, + "item": item, + }), + ); + // 不存 cache(下一轮如果引用此 call_id 重建会拿到 incomplete + // 上下文,反而误导;让 orphan repair 路径补占位)。 + if let Some(pending) = self.tool_calls.get_mut(&openai_index) { + pending.closed = true; + } + return; + } + // open 阶段 emit 了空 input 的 output_item.added;这里一次性补 + // input.delta + output_item.done,让 Codex CLI 的 streaming + // parser(`StreamingPatchParser`)拿到完整 patch 文本后 finish。 + emit_event( + out, + &mut self.sequence_number, + "response.custom_tool_call_input.delta", + json!({ + "type": "response.custom_tool_call_input.delta", + "item_id": fc_id, + "output_index": output_index, + "call_id": call_id, + "delta": input, + }), + ); + emit_event( + out, + &mut self.sequence_number, + "response.custom_tool_call_input.done", + json!({ + "type": "response.custom_tool_call_input.done", + "item_id": fc_id, + "output_index": output_index, + "call_id": call_id, + "input": input, + }), + ); + let item = json!({ + "type": "custom_tool_call", + "id": fc_id, + "call_id": call_id, + "name": name, + "input": input, + "status": "completed", + }); + emit_event( + out, + &mut self.sequence_number, + "response.output_item.done", + json!({ + "type": "response.output_item.done", + "output_index": output_index, + "item": item, + }), + ); + // ToolCallCache 用于下一轮 Codex CLI 发 tool output 时重建工具 + // 调用上下文。回灌走 chat completions(messages.tool_calls.function + // .arguments 是 JSON 字符串),所以这里仍存原始 args_acc(JSON 形态) + // 而不是 input 裸文本,与 `assistant_message` 的 tool_calls 形态对齐。 + global_tool_call_cache().save( + &call_id, + ToolCallEntry { + name: name.clone(), + 
arguments: args_acc.clone(), + }, + ); + if let Some(pending) = self.tool_calls.get_mut(&openai_index) { + pending.closed = true; + } + return; + } + emit_event( out, &mut self.sequence_number, @@ -811,6 +1038,28 @@ impl ChatToResponsesConverter { } fn tool_call_item_completed(&self, pending: &PendingToolCall) -> Value { + if pending.is_apply_patch { + // envelope.output[] 终态必须和流式 `response.output_item.done` + // 的 item 一致(见 close_tool_call apply_patch 分支),否则严格 + // 客户端会两次解读为不同 item。读 close 时缓存好的 input, + // 不重新 parse args_acc — 万一 args_acc 在 close 与 envelope + // 构造之间发生意外变化(目前看不会,但防御性写法),两侧仍一致。 + // 缓存缺失时(理论上 close 一定先于 envelope build 跑,不应触发) + // fallback 到 raw args_acc,而不是再次 parse,避免重复 emit + // 任何 telemetry。 + let input = pending + .apply_patch_input + .clone() + .unwrap_or_else(|| pending.args_acc.clone()); + return json!({ + "type": "custom_tool_call", + "id": pending.fc_id, + "call_id": pending.call_id, + "name": pending.name, + "input": input, + "status": "completed", + }); + } let mut item = json!({ "type": "function_call", "id": pending.fc_id, @@ -1054,10 +1303,16 @@ impl ChatToResponsesConverter { if self.message_open && !self.message_closed { self.close_message(out); } - // tool_calls 按 OpenAI index 顺序闭合(BTreeMap 自然有序) + // tool_calls 按 OpenAI index 顺序闭合(BTreeMap 自然有序)。 + // `interrupted` = 没有 finish_reason **且**不是因 `[DONE]` 自然结束。 + // 这是用于让 apply_patch 在 close_tool_call 里 emit + // `status="incomplete"` 而不是 `completed`,防止严格客户端在 stream + // 半截断时仍把 partial patch 当成完整 tool 调用执行 + // (apply_patch 是 destructive,partial 执行风险高)。 + let interrupted = self.finish_reason.is_none() && !from_done; let tc_indices: Vec = self.tool_calls.keys().copied().collect(); for idx in tc_indices { - self.close_tool_call(idx, out); + self.close_tool_call(idx, interrupted, out); } // finish_reason → status / incomplete_details 映射。保留现有 5 路径 @@ -1252,6 +1507,60 @@ fn emit_event(out: &mut Vec, seq: &mut u64, event_name: &str, payload: Value emit_sse_event(out, seq, event_name, 
payload); }
+/// Extracts the bare V4A text from chat function args (canonical shape
+/// `{"input": ""}`) for use as the `custom_tool_call.input` field.
+///
+/// There are three fallback paths, distinguished via tracing so they can be
+/// told apart after the fact:
+///
+/// 1. **JSON parse fails and the args look like bare V4A** (they start with
+///    `*** Begin Patch`): debug level, an expected happy path in which the
+///    upstream chat provider passed the freeform tool output through verbatim
+///    without wrapping it in JSON.
+/// 2. **JSON parse fails and the args do not look like V4A**: warn level,
+///    usually stream truncation, UTF-8 corruption, or a markdown fence added
+///    upstream. Passing the whole string through to Codex CLI at least lets
+///    the user see `apply_patch verification failed: ` instead of an abort
+///    with no clue.
+/// 3. **Valid JSON but no `input` field**: warn level, usually schema drift
+///    (for example, the model used `patch` instead of `input`). The whole
+///    string is passed through so the real failure is visible.
+///
+/// Draws on the V4A format constraints in upstream
+/// `codex-rs/apply-patch/apply_patch_tool_instructions.md`. No V4A syntax
+/// validation is done here; that is left to `parse_patch` on the Codex CLI side.
+fn extract_apply_patch_input(args_acc: &str) -> String {
+    let trimmed = args_acc.trim();
+    if trimmed.is_empty() {
+        return String::new();
+    }
+    match serde_json::from_str::(trimmed) {
+        Ok(parsed) => match parsed.get("input").and_then(Value::as_str) {
+            Some(s) => s.to_owned(),
+            None => {
+                tracing::warn!(
+                    target = "adapters::apply_patch",
+                    args_preview = %args_acc.chars().take(120).collect::(),
+                    "apply_patch args parsed as JSON but missing `input` string field; passing raw args to Codex CLI",
+                );
+                args_acc.to_owned()
+            }
+        },
+        Err(err) => {
+            if trimmed.starts_with("*** Begin Patch") {
+                tracing::debug!(
+                    target = "adapters::apply_patch",
+                    "apply_patch args are bare V4A (no JSON wrapper); passthrough",
+                );
+            } else {
+                tracing::warn!(
+                    target = "adapters::apply_patch",
+                    error = %err,
+                    args_len = args_acc.len(),
+                    args_preview = %args_acc.chars().take(120).collect::(),
+                    "apply_patch args failed JSON parse and don't look like bare V4A; falling back to raw passthrough — likely truncation or schema drift",
+                );
+            }
+            args_acc.to_owned()
+        }
+    }
+}
+
 fn drain_one_frame(buf: &mut BytesMut) -> Option {
     let pos = find_double_newline(buf)?;
     Some(buf.split_to(pos + 2).freeze())
 }
@@ -2501,6 +2810,203 @@ data:
{"choices":[{"delta":{},"finish_reason":"tool_calls"}]} assert_eq!(done.1["arguments"], "{\"a\":1}"); } + #[test] + fn apply_patch_tool_call_emits_custom_tool_call_wire_not_function_call() { + // 回归保护(issue #235):chat 上游(DeepSeek 等)用 function call 返回 + // apply_patch 时,adapter 必须把 wire 重打包成 Codex CLI 期望的 + // `custom_tool_call` 形态(上游 router 按 wire type 路由,apply_patch + // handler 硬要求 `ToolPayload::Custom { input }`)。 + // patch 文本走标准 JSON 转义:在 args.arguments 字符串里,V4A 原文的 + // `\n` 被双重转义成 `\\n`(JSON 字符串里写 `\n`)。 + let mut c = fixed(); + // 真实 chat 上游 wire 中,tool_call.arguments 是 JSON 字符串字面值, + // patch 里的换行必须双重转义(SSE outer JSON 的 string value 里写 + // `\\n`,解码后是 `\n` 字面;`arguments` 值再被 client 当 JSON 解一次 + // 得到 `*** Begin Patch\n...` 真换行的 V4A patch)。 + let chunks = concat!( + r#"data: {"choices":[{"index":0,"delta":{"tool_calls":[{"index":0,"id":"call_ap","type":"function","function":{"name":"apply_patch","arguments":"{\"input\":\"*** Begin Patch\\n*** Update File: foo.py\\n@@\\n-old\\n+new\\n*** End Patch\\n\"}"}}]}}]}"#, + "\n\n", + r#"data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}"#, + "\n\n", + "data: [DONE]\n\n", + ); + let out = c.feed(chunks.as_bytes()); + let events = parse_emitted(&out); + let kinds = names(&events); + // open 必须用 custom_tool_call 而不是 function_call + let added = events + .iter() + .find(|(n, _)| n == "response.output_item.added") + .expect("应当 emit output_item.added"); + assert_eq!( + added.1["item"]["type"], "custom_tool_call", + "apply_patch wire 必须是 custom_tool_call,实际 events: {kinds:?}" + ); + assert_eq!(added.1["item"]["name"], "apply_patch"); + // 中间不应有 function_call_arguments.delta(apply_patch 路径 close 时 + // 一次性 emit custom_tool_call_input.delta) + assert!( + !kinds.contains(&"response.function_call_arguments.delta"), + "apply_patch 路径不应 emit function_call_arguments.delta,events: {kinds:?}" + ); + // close 必须 emit custom_tool_call_input.delta + .done + let input_delta = events + .iter() + .find(|(n, _)| n == 
"response.custom_tool_call_input.delta") + .expect("应当 emit custom_tool_call_input.delta"); + let expected_v4a = + "*** Begin Patch\n*** Update File: foo.py\n@@\n-old\n+new\n*** End Patch\n"; + assert_eq!(input_delta.1["delta"], expected_v4a); + assert_eq!(input_delta.1["call_id"], "call_ap"); + // envelope.output[] 终态也必须是 custom_tool_call + let completed = events + .iter() + .rev() + .find(|(n, _)| n == "response.completed") + .unwrap(); + let output = &completed.1["response"]["output"][0]; + assert_eq!(output["type"], "custom_tool_call"); + assert_eq!(output["input"], expected_v4a); + assert_eq!(output["call_id"], "call_ap"); + } + + #[test] + fn apply_patch_falls_back_to_raw_args_when_not_json() { + // 模型直接吐裸 V4A 而不包 JSON(某些 chat 上游可能这样转译 freeform)。 + // adapter 必须把整段 args_acc 当 input 而不是空字符串,让 Codex CLI + // 至少能看到 patch 内容并尝试解析。 + let raw_v4a = "*** Begin Patch\n*** Add File: a.md\n+hi\n*** End Patch\n"; + // serde_json::to_string 自动产生合法 JSON 字符串 escape(`\n` → `\\n` + // 字面、引号转义、反斜杠转义),比手工 replace 链可靠且贴近真实 wire。 + let args_json_string = serde_json::to_string(raw_v4a).unwrap(); + let mut c = fixed(); + let frame = format!( + "data: {{\"choices\":[{{\"index\":0,\"delta\":{{\"tool_calls\":[{{\"index\":0,\"id\":\"call_ap\",\"type\":\"function\",\"function\":{{\"name\":\"apply_patch\",\"arguments\":{args_json_string}}}}}]}}}}]}}\n\ndata: {{\"choices\":[{{\"delta\":{{}},\"finish_reason\":\"tool_calls\"}}]}}\n\ndata: [DONE]\n\n", + args_json_string = args_json_string, + ); + let out = c.feed(frame.as_bytes()); + let events = parse_emitted(&out); + let delta = events + .iter() + .find(|(n, _)| n == "response.custom_tool_call_input.delta") + .expect("custom_tool_call_input.delta 应当 emit"); + assert_eq!( + delta.1["delta"], raw_v4a, + "非 JSON args 应整段当 input(裸 V4A 兜底)" + ); + } + + #[test] + fn apply_patch_interrupted_stream_emits_incomplete_status_skips_input_done() { + // 回归保护:apply_patch 是 destructive 工具,stream 中途断开 → close + // 必须 emit `status="incomplete"` 且 skip 
`custom_tool_call_input.done`, + // 让 Codex CLI 看到不完整状态而不是执行 partial patch。 + let partial_v4a = "*** Begin Patch\n*** Update File: foo.py\n@@\n-old\n"; // 截断在 @@ 之后 + let inner = serde_json::to_string(&json!({ "input": partial_v4a })).unwrap(); + let args_json_string = serde_json::to_string(&inner).unwrap(); + let mut c = fixed(); + // 仅 emit tool_call 增量与 lifecycle 开头,不 emit finish_reason / [DONE], + // 模拟 upstream EOF 中断。 + let frame = format!( + "data: {{\"choices\":[{{\"index\":0,\"delta\":{{\"tool_calls\":[{{\"index\":0,\"id\":\"call_ap\",\"type\":\"function\",\"function\":{{\"name\":\"apply_patch\",\"arguments\":{args_json_string}}}}}]}}}}]}}\n\n", + args_json_string = args_json_string, + ); + let _ = c.feed(frame.as_bytes()); + let out = c.finish(); + let events = parse_emitted(&out); + let kinds = names(&events); + // 必须 NOT 出现 .delta 或 .done 的 custom_tool_call_input(防止 client + // 在 .done 时触发执行 partial patch) + assert!( + !kinds.contains(&"response.custom_tool_call_input.done"), + "interrupted 时禁止 emit custom_tool_call_input.done,events: {kinds:?}" + ); + assert!( + !kinds.contains(&"response.custom_tool_call_input.delta"), + "interrupted 时禁止 emit custom_tool_call_input.delta(避免提前触发执行),events: {kinds:?}" + ); + // output_item.done item 必须含 status=incomplete + let done = events + .iter() + .find(|(n, _)| n == "response.output_item.done") + .expect("interrupted 仍应 emit output_item.done"); + assert_eq!(done.1["item"]["type"], "custom_tool_call"); + assert_eq!( + done.1["item"]["status"], "incomplete", + "interrupted apply_patch 必须 status=incomplete" + ); + // envelope 也是 incomplete + interrupted + let completed = events + .iter() + .rev() + .find(|(n, _)| n == "response.completed") + .unwrap(); + assert_eq!(completed.1["response"]["status"], "incomplete"); + assert_eq!( + completed.1["response"]["incomplete_details"]["reason"], + "interrupted" + ); + } + + #[test] + fn apply_patch_streaming_input_matches_envelope_output() { + // 
防御性回归:`response.output_item.done` 的 `item.input` 必须跟 + // `response.completed.output[].input` 完全一致,避免两次 emit 路径 + // 在未来重构时 drift。 + let patch = "*** Begin Patch\n*** Update File: x.txt\n@@\n-a\n+b\n*** End Patch\n"; + // chat wire 里 `arguments` 是 JSON-string(双重编码):先把 V4A 包成 + // `{"input": ""}` JSON 文本,再 JSON-quote 一次作为字符串值。 + let inner = serde_json::to_string(&json!({ "input": patch })).unwrap(); + let args_json_string = serde_json::to_string(&inner).unwrap(); + let mut c = fixed(); + let frame = format!( + "data: {{\"choices\":[{{\"index\":0,\"delta\":{{\"tool_calls\":[{{\"index\":0,\"id\":\"call_match\",\"type\":\"function\",\"function\":{{\"name\":\"apply_patch\",\"arguments\":{args_json_string}}}}}]}}}}]}}\n\ndata: {{\"choices\":[{{\"delta\":{{}},\"finish_reason\":\"tool_calls\"}}]}}\n\ndata: [DONE]\n\n", + args_json_string = args_json_string, + ); + let out = c.feed(frame.as_bytes()); + let events = parse_emitted(&out); + let done = events + .iter() + .find(|(n, v)| { + n == "response.output_item.done" && v["item"]["type"] == "custom_tool_call" + }) + .expect("应当有 custom_tool_call output_item.done"); + let streamed_input = done.1["item"]["input"].as_str().unwrap().to_owned(); + let completed = events + .iter() + .rev() + .find(|(n, _)| n == "response.completed") + .unwrap(); + let envelope_input = completed.1["response"]["output"][0]["input"] + .as_str() + .unwrap() + .to_owned(); + assert_eq!( + streamed_input, envelope_input, + "streamed output_item.done.input 必须跟 envelope.output[].input 完全一致" + ); + assert_eq!(streamed_input, patch); + } + + #[test] + fn extract_apply_patch_input_extracts_or_falls_back() { + // happy path:`{input: string}` 提出 string 字段 + assert_eq!( + extract_apply_patch_input(r#"{"input":"*** Begin Patch\nfoo"}"#), + "*** Begin Patch\nfoo" + ); + // 非 JSON:整段透传 + let raw = "*** Begin Patch\nfoo\n*** End Patch\n"; + assert_eq!(extract_apply_patch_input(raw), raw); + // JSON 但无 input 字段:整段透传 + assert_eq!( + 
extract_apply_patch_input(r#"{"other":"x"}"#), + r#"{"other":"x"}"# + ); + // 空字符串:返回空 + assert_eq!(extract_apply_patch_input(""), ""); + } + #[test] fn message_then_tool_call_keeps_output_index_order() { let mut c = fixed(); diff --git a/crates/adapters/src/responses/request.rs b/crates/adapters/src/responses/request.rs index 621e4cfa..76a16cbd 100644 --- a/crates/adapters/src/responses/request.rs +++ b/crates/adapters/src/responses/request.rs @@ -518,6 +518,70 @@ fn input_item_to_messages(item: &serde_json::Map) -> Vec { "content": output_str, })] } + "custom_tool_call" => { + // Codex CLI 把 freeform apply_patch 的回放 wire 包成 + // `ResponseItem::CustomToolCall { name, input, call_id, ... }` + // (`codex-rs/protocol/src/models.rs:824-832`)。我们在 turn N 通过 + // `converter.rs::close_tool_call` apply_patch 分支 emit 了它; + // Codex CLI 在 turn N+1 把同一 item 通过 `input[]` 回放给我们。 + // 转下游 chat completions 时必须重新打包成 `assistant.tool_calls` + // 的 `type:"function"` 形态(chat 端不认 custom_tool_call),且 + // `function.arguments` 必须是 JSON 字符串 `{"input":""}` + // (与首轮在 `tools.rs::convert_responses_tool_to_chat_tool` 的 + // `"custom" =>` 分支 lowering 形态保持一致)—— 模型才不会因 + // wire 形态变化失忆。 + let call_id = item + .get("call_id") + .and_then(|v| v.as_str()) + .or_else(|| item.get("id").and_then(|v| v.as_str())) + .unwrap_or("") + .to_owned(); + let name = item.get("name").and_then(|v| v.as_str()).unwrap_or(""); + let input_text = item.get("input").and_then(|v| v.as_str()).unwrap_or(""); + // arguments 必须是 chat function-call 的标准 JSON 字符串形态。 + // serde_json::to_string 自动处理换行 / 引号 / 反斜杠等所有转义。 + let arguments_json = serde_json::to_string(&json!({ "input": input_text })) + .unwrap_or_else(|_| { + // to_string 在 input 是 valid UTF-8 string 时不会失败;若 + // 真发生,fallback 到空对象保持下游 chat schema 合法。 + "{}".to_owned() + }); + let arguments = sanitize_tool_arguments_json_string(&arguments_json); + vec![json!({ + "role": "assistant", + "content": "", + "tool_calls": [{ + "id": if call_id.is_empty() { 
"call_unknown".to_owned() } else { call_id }, + "type": "function", + "function": { "name": name, "arguments": arguments }, + }], + })] + } + "custom_tool_call_output" => { + // `ResponseItem::CustomToolCallOutput { call_id, output, ... }` + // (`codex-rs/protocol/src/models.rs:839-847`)使用与 function_call_output + // 相同的 `output` payload encoding(string 或 content_items array)。 + // 转 chat 时只需把 wire item type 对齐到普通 `role:"tool"` message, + // tool_call_id 来源仍按 call_id / tool_call_id / id 三级兜底。 + let call_id = item + .get("call_id") + .and_then(|v| v.as_str()) + .or_else(|| item.get("tool_call_id").and_then(|v| v.as_str())) + .or_else(|| item.get("id").and_then(|v| v.as_str())) + .unwrap_or("") + .to_owned(); + let output_value = item + .get("output") + .cloned() + .unwrap_or(Value::String(String::new())); + let output_str = + normalize_tool_output_for_context(Some(call_id.as_str()), output_value); + vec![json!({ + "role": "tool", + "tool_call_id": call_id, + "content": output_str, + })] + } "input_image" => { let image_url = item .get("image_url") diff --git a/crates/adapters/src/responses/request/tests.rs b/crates/adapters/src/responses/request/tests.rs index 0bb2d70e..5d9470f4 100644 --- a/crates/adapters/src/responses/request/tests.rs +++ b/crates/adapters/src/responses/request/tests.rs @@ -1952,6 +1952,81 @@ fn function_call_output_becomes_tool_message_with_placeholder_assistant() { assert_eq!(messages[1]["content"], "sunny"); } +#[test] +fn custom_tool_call_input_item_lowered_to_assistant_tool_calls() { + // 回归保护(issue #235):turn N+1 Codex CLI 回放上一轮的 + // `ResponseItem::CustomToolCall { name, input, call_id }`,我们必须把它 + // 转成 chat completions 的 `assistant.tool_calls` 形态(function-call), + // 否则模型完全看不到上一轮 apply_patch 调用 → 多轮上下文丢失。 + // arguments 必须是 JSON 字符串 `{"input":""}`,与首轮在请求侧 + // lowering 的形态保持一致,模型才不失忆。 + let patch_text = "*** Begin Patch\n*** Update File: a.py\n@@\n-x\n+y\n*** End Patch\n"; + let out = convert(json!({ + "input": [{ + "type": 
"custom_tool_call", + "id": "ctc_1", + "call_id": "call_ap_1", + "name": "apply_patch", + "input": patch_text, + "status": "completed", + }] + })); + let messages = out["messages"].as_array().unwrap(); + let assistant = messages + .iter() + .find(|m| m["role"] == "assistant" && m["tool_calls"].is_array()) + .expect("custom_tool_call 应当映射成 assistant.tool_calls"); + let tc = &assistant["tool_calls"][0]; + assert_eq!(tc["type"], "function"); + assert_eq!(tc["id"], "call_ap_1"); + assert_eq!(tc["function"]["name"], "apply_patch"); + // arguments 是 JSON 字符串值。serde_json 解一次得到 {input: }, + // 再 V4A 的换行已被正常 JSON-escape(`\n` 字面值)。 + let args_str = tc["function"]["arguments"].as_str().unwrap(); + let parsed: serde_json::Value = + serde_json::from_str(args_str).expect("arguments 必须是合法 JSON"); + assert_eq!(parsed["input"], patch_text); +} + +#[test] +fn custom_tool_call_output_input_item_lowered_to_role_tool() { + // 回归保护(issue #235):`ResponseItem::CustomToolCallOutput { call_id, output }` + // 回放时必须转成 chat 端的 `role:"tool"` message,tool_call_id 跟前面的 + // assistant.tool_calls.id 配对,否则 chat 上游会因 orphan tool message 400。 + let out = convert(json!({ + "input": [ + { + "type": "custom_tool_call", + "call_id": "call_ap_2", + "name": "apply_patch", + "input": "*** Begin Patch\n*** Add File: b.md\n+hi\n*** End Patch\n", + }, + { + "type": "custom_tool_call_output", + "call_id": "call_ap_2", + "output": "Patch applied successfully", + } + ] + })); + let messages = out["messages"].as_array().unwrap(); + let tool_msg = messages + .iter() + .find(|m| m["role"] == "tool") + .expect("custom_tool_call_output 应当映射成 role:tool"); + assert_eq!(tool_msg["tool_call_id"], "call_ap_2"); + assert_eq!(tool_msg["content"], "Patch applied successfully"); + // 同 PR 还要保证 assistant 在 tool 前(orphan repair 不会插占位) + let assistant_idx = messages + .iter() + .position(|m| m["role"] == "assistant") + .unwrap(); + let tool_idx = messages.iter().position(|m| m["role"] == "tool").unwrap(); + assert!( + 
assistant_idx < tool_idx, + "assistant.tool_calls 必须在 role:tool 之前出现" + ); +} + #[test] fn function_call_output_non_string_is_json_serialized() { // 走完整 convert 路径(global cache 在生产里就这条路); @@ -2644,6 +2719,57 @@ fn tools_custom_type_is_lowered_to_function_with_input() { "string" ); assert_eq!(tool["function"]["parameters"]["required"][0], "input"); + // 非 apply_patch 的 custom 工具仍透传 outer description,input 用泛指 + // 兜底描述,不注入 V4A 提示。 + assert_eq!(tool["function"]["description"], "anything"); + assert!( + tool["function"]["parameters"]["properties"]["input"]["description"] + .as_str() + .unwrap_or_default() + .contains("verbatim"), + "非 apply_patch 应保留泛指 input 描述,实际:{}", + tool["function"]["parameters"]["properties"]["input"]["description"] + ); +} + +#[test] +fn tools_custom_apply_patch_injects_v4a_format_hint() { + // 回归保护(issue #235):chat 上游(DeepSeek 等)拿到 freeform apply_patch + // 时,上游的 "do not wrap in JSON" 描述会误导模型;且原始描述里没有 V4A + // 格式样例。adapter 必须替换描述为 chat 路径准确的 V4A 指引,模型才能 + // 正确填充 `input` 字段。 + let out = convert(json!({ + "input": "hi", + "tools": [{ + "type": "custom", + "name": "apply_patch", + "description": "Use the `apply_patch` tool to edit files. This is a FREEFORM tool, so do not wrap the patch in JSON." 
+ }] + })); + let tool = &out["tools"][0]; + assert_eq!(tool["type"], "function"); + assert_eq!(tool["function"]["name"], "apply_patch"); + + // outer description 必须替换(不能保留误导性的 "do not wrap" 文本) + let outer = tool["function"]["description"].as_str().unwrap_or_default(); + assert!(!outer.contains("do not wrap"), "误导性原描述未替换:{outer}"); + assert!( + outer.contains("V4A"), + "outer description 应当包含 V4A 关键字:{outer}" + ); + assert!( + outer.contains("*** Begin Patch"), + "outer description 应当含 V4A 边界标记:{outer}" + ); + + // input 参数描述必须含 V4A 格式约束(provider 可能更看 parameter desc) + let input_desc = tool["function"]["parameters"]["properties"]["input"]["description"] + .as_str() + .unwrap_or_default(); + assert!( + input_desc.contains("V4A") && input_desc.contains("*** Begin Patch"), + "input description 应含 V4A 与边界标记:{input_desc}" + ); } #[test] diff --git a/crates/adapters/src/responses/request/tools.rs b/crates/adapters/src/responses/request/tools.rs index 040c01fe..58264e6b 100644 --- a/crates/adapters/src/responses/request/tools.rs +++ b/crates/adapters/src/responses/request/tools.rs @@ -3,6 +3,42 @@ use serde_json::{json, Value}; use super::provider_looks_like; +/// Codex freeform tool name we special-case. See the `"custom" =>` arm in +/// `convert_responses_tool_to_chat_tool` below for the request-side rewrite +/// rationale, and `converter.rs::close_tool_call` for the response-side +/// wire re-shape — they must trigger on the exact same tool name. +pub(crate) const APPLY_PATCH_TOOL_NAME: &str = "apply_patch"; + +/// Chat-path replacement for Codex CLI's freeform `apply_patch` description. +/// Original upstream text says "do not wrap the patch in JSON" because the +/// Responses API freeform/lark grammar accepts raw text — but on the +/// chat-completions path the model MUST emit a function call whose `input` +/// argument is a JSON string containing the V4A patch. 
We rewrite the +/// description so the model sees instructions consistent with the wire +/// format it has to produce. +pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( + "Edit files using the apply_patch tool. ", + "Call this function with a single `input` string containing a V4A patch. ", + "The patch must start with `*** Begin Patch` and end with `*** End Patch`. ", + "Each file operation header is one of `*** Add File: `, ", + "`*** Update File: ` (optionally followed by `*** Move to: `), ", + "or `*** Delete File: `. ", + "Within Update hunks, use `@@ @@` markers, prefix unchanged lines ", + "with a single space, removed lines with `-`, and added lines with `+`. ", + "Use relative paths only (never absolute). ", + "Embed real newlines as `\\n` inside the JSON string value for `input`." +); + +/// Chat-path replacement for the freeform `input` parameter description. +/// Mirrors `APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT` but at the parameter level, +/// so the model sees the format constraint regardless of whether providers +/// surface tool-level or parameter-level descriptions more prominently. +pub(crate) const APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT: &str = concat!( + "A V4A patch starting with `*** Begin Patch` and ending with `*** End Patch`. ", + "Use `*** Add File:`, `*** Update File:`, or `*** Delete File:` headers and ", + "`@@ ... @@` hunks with ` `/`+`/`-` line prefixes. Relative paths only." +); + /// Responses tool 定义 → Chat tool 定义. 
/// 把单个 Responses API tool 转成零或多个 Chat Completions tool。 /// @@ -53,23 +89,51 @@ pub fn convert_responses_tool_to_chat_tool( })] } "custom" => { - // Custom tool(无 JSON schema)降级为接受单字符串 input 的 function + // Custom tool(Responses API freeform tool,无 JSON schema)降级为 + // 接受单字符串 input 的 function tool — chat completions 不认 + // `type:"custom"`,DeepSeek / Kimi / MiMo 等 chat 上游必须走 function。 + // + // **apply_patch 特判**:Codex CLI 把 apply_patch 作为 freeform 工具 + // 注册,wire description 是 "Use the `apply_patch` tool to edit files. + // This is a FREEFORM tool, so do not wrap the patch in JSON." + // (上游 `codex-rs/core/src/tools/handlers/apply_patch_spec.rs` 实证)。 + // 经 chat function-call 反而**必须**把 patch 包进 JSON 字符串值 —— + // 上游的 "do not wrap in JSON" 指令在 chat 路径下会误导模型, + // 且原 description 没给 V4A 格式样例。这里替换成对 chat 路径准确 + // 的指引,把 V4A 关键字 / 文件操作头 / hunk 标记列清楚,让 DeepSeek + // 之类的模型知道 input 字段该填什么。 + // 响应侧(converter.rs::close_tool_call)对 name==apply_patch 特判, + // 把模型回来的 function_call 重新打包成 custom_tool_call wire, + // 让 Codex CLI router (`ResponseItem::CustomToolCall`) 正确路由到 + // apply_patch handler(handler 硬要求 `ToolPayload::Custom { input }`, + // 见 `codex-rs/core/src/tools/handlers/apply_patch.rs:324`)。 let name = obj.get("name").and_then(|v| v.as_str()).unwrap_or(""); - let description = obj + let original_description = obj .get("description") .and_then(|v| v.as_str()) .unwrap_or(""); + let (tool_description, input_description) = if name == APPLY_PATCH_TOOL_NAME { + ( + APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT.to_owned(), + APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT.to_owned(), + ) + } else { + ( + original_description.to_owned(), + "Free-form input passed verbatim to the tool.".to_owned(), + ) + }; vec![json!({ "type": "function", "function": { "name": name, - "description": description, + "description": tool_description, "parameters": { "type": "object", "properties": { "input": { "type": "string", - "description": "Free-form input passed verbatim to the tool.", + "description": 
input_description, } }, "required": ["input"],
From cde491a59cae24fecd57d6c22c10452df96a4adf Mon Sep 17 00:00:00 2001
From: Cmochance <3216202644@qq.com>
Date: Wed, 20 May 2026 21:05:44 +0800
Subject: [PATCH 2/4] fix(adapters): make hunk anchor semantics explicit in the apply_patch tool description (#235)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Testing on a real setup (the user prompted the model to insert a block of markdown between lines 5 and 10 of the README) showed that the wire bridge works (25+ shim triggers, zero aborts). The model, however, stumbled on the V4A hunk header for 20 straight minutes and 25+ retries, and only finished after falling back to sed.

Root cause: the meaning of the space-prefixed lines after `@@ @@` cannot go wrong in the constrained decoding space of the freeform/lark grammar (the model can only produce grammatically valid sequences). On the chat function-call path that hard constraint is gone, and the description only mentioned the ` `/`+`/`-` prefixes; **it never said that space-prefixed lines correspond to the lines *after* the anchor**. DeepSeek kept repeating the anchor as a space-prefixed line, parse_patch could not find such a doubled line in the file, and the patch was rejected.

Fix: add an explicit "CRITICAL HUNK SEMANTICS" section to `APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT`, plus a minimal executable V4A example (changing the value of a `let` binding) showing that the anchor appears only in `@@ ... @@` and is not repeated as a space-prefixed line. `APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT` (parameter level) gets a compact version of the same rule, in case a provider truncates the tool-level description on long histories.

Tests: `tools_custom_apply_patch_injects_v4a_format_hint` gains five assertions (four on the tool description, one on the parameter description) that lock in the anchor-semantics explanation, the minimal example, and the compact parameter-level version, so the description cannot be dropped by accident in a future refactor. The full `cargo test --workspace` suite passes.

Note: `parse_patch` failures on the Codex CLI side never reach the proxy log. The error is emitted to the model as a tool error inside the client tool runtime, so the monitoring in place before this PR could not see it. This follow-up relies entirely on manual testing feedback from a user on a real setup (thank you).

Refs #235
---
.../adapters/src/responses/request/tests.rs | 30 ++++++++++++++ .../adapters/src/responses/request/tools.rs | 41 ++++++++++++++++++- 2 files changed, 69 insertions(+), 2 deletions(-) diff --git a/crates/adapters/src/responses/request/tests.rs b/crates/adapters/src/responses/request/tests.rs index 5d9470f4..56813fcd 100644 --- a/crates/adapters/src/responses/request/tests.rs +++ b/crates/adapters/src/responses/request/tests.rs @@ -2770,6 +2770,36 @@ fn tools_custom_apply_patch_injects_v4a_format_hint() { input_desc.contains("V4A") && input_desc.contains("*** Begin Patch"), "input description 应含 V4A 与边界标记:{input_desc}" ); + + // 回归保护(issue #235 真机验证暴露的二级问题):tool 描述必须显式解释 + // hunk semantics —— context 锚点 vs space-prefixed 行的区别。DeepSeek 在没 + // 有 lark grammar 强约束的 chat 路径上反复栽在这里(把 anchor 当 space 行 + // 重复一次),花 20 分钟、25+ 次 retry 最后 fallback 到 sed。description + // 必须含可执行的最小示例 + 显式的"do not repeat the anchor"指引。 + let outer_lc = outer.to_lowercase(); + assert!( + outer.contains("@@") && outer.contains("anchor"), + "tool description 必须解释 hunk anchor 语义:{outer}" + ); + assert!( + outer.contains("AFTER the anchor") || outer.contains("after it"), + "必须显式说明 space 行对应 anchor *之后* 的位置:{outer}" + ); + assert!( + outer_lc.contains("do not repeat the anchor") || outer_lc.contains("not again as a space"), + "必须显式禁止把 anchor 当 space 行重复:{outer}" + ); + assert!( + outer.contains("*** Update File:") && outer.contains("@@ fn main()"), + "必须包含一个最小可执行 V4A 示例:{outer}" + ); + + // 参数描述同样必须保留紧凑版的语义提示 + assert!( + input_desc.contains("anchor") + &&
(input_desc.contains("AFTER") || input_desc.contains("after")), + "input description 必须保留 anchor 语义紧凑版:{input_desc}" + ); } #[test] diff --git a/crates/adapters/src/responses/request/tools.rs b/crates/adapters/src/responses/request/tools.rs index 58264e6b..1d7f0c4e 100644 --- a/crates/adapters/src/responses/request/tools.rs +++ b/crates/adapters/src/responses/request/tools.rs @@ -16,6 +16,18 @@ pub(crate) const APPLY_PATCH_TOOL_NAME: &str = "apply_patch"; /// argument is a JSON string containing the V4A patch. We rewrite the /// description so the model sees instructions consistent with the wire /// format it has to produce. +/// +/// **重要:hunk body 的 space-prefixed 行语义** — 上游 freeform 工具用 lark +/// grammar 强制约束,模型在受约束的解码空间里不会搞错;但 chat function-call +/// 没有 grammar 约束,只剩 description。实测(issue #235 真机)DeepSeek +/// 反复在一个具体语义上栽跟头: +/// +/// > `@@ @@` 标记后的 space-prefixed 行 = 文件中 context 锚点 +/// > **之后**的行,**不是** context 行本身的重复 +/// +/// 不显式说清这个,模型会把 context 行当成 space 行再写一次,parse_patch +/// 找不到双行 → 整个 patch 拒收。本 description 通过显式规则 + 一个最小 +/// 可执行的更新文件 example 让模型看到正确形态。 pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( "Edit files using the apply_patch tool. ", "Call this function with a single `input` string containing a V4A patch. ", @@ -26,17 +38,42 @@ pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( "Within Update hunks, use `@@ @@` markers, prefix unchanged lines ", "with a single space, removed lines with `-`, and added lines with `+`. ", "Use relative paths only (never absolute). ", - "Embed real newlines as `\\n` inside the JSON string value for `input`." + "Embed real newlines as `\\n` inside the JSON string value for `input`.\n\n", + "CRITICAL HUNK SEMANTICS (the most common cause of patch rejection):\n", + "`@@ @@` is an anchor that names ONE existing line in the file. ", + "Every space-prefixed line that follows the `@@` marker corresponds to lines ", + "AFTER the anchor in the file (not the anchor itself). 
", + "Do NOT repeat the anchor line as the first space-prefixed line — the parser will reject it.\n\n", + "EXAMPLE — to change `let x = 1;` to `let x = 2;` in a file whose lines around the change read:\n", + " fn main() {\n", + " let x = 1;\n", + " println!(\"{}\", x);\n", + " }\n", + "The correct patch is:\n", + "*** Begin Patch\n", + "*** Update File: src/main.rs\n", + "@@ fn main() {\n", + "- let x = 1;\n", + "+ let x = 2;\n", + " println!(\"{}\", x);\n", + "*** End Patch\n", + "Notice: `fn main() {` appears in `@@ ... @@` once as the anchor, NOT again as a space-prefixed line below. ", + "The first content line under the anchor is the line immediately after `fn main() {` in the file." ); /// Chat-path replacement for the freeform `input` parameter description. /// Mirrors `APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT` but at the parameter level, /// so the model sees the format constraint regardless of whether providers /// surface tool-level or parameter-level descriptions more prominently. +/// Same anchor-vs-space-line gotcha called out here in compact form (some +/// providers truncate or de-emphasize tool-level descriptions on long +/// histories — keep the rule visible at parameter level too). pub(crate) const APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT: &str = concat!( "A V4A patch starting with `*** Begin Patch` and ending with `*** End Patch`. ", "Use `*** Add File:`, `*** Update File:`, or `*** Delete File:` headers and ", - "`@@ ... @@` hunks with ` `/`+`/`-` line prefixes. Relative paths only." + "`@@ @@` hunks with ` `/`+`/`-` line prefixes. Relative paths only. ", + "CRITICAL: in an Update hunk the `@@ @@` anchor is a SINGLE existing file line; ", + "the space-prefixed lines following the anchor describe lines AFTER it (do not repeat the anchor)." ); /// Responses tool 定义 → Chat tool 定义. 
From 9b7a7fc3a7081dbdd05466c02d887c42f5909340 Mon Sep 17 00:00:00 2001
From: Cmochance <3216202644@qq.com>
Date: Wed, 20 May 2026 22:07:54 +0800
Subject: [PATCH 3/4] fix(adapters): inject apply_patch chat-path workarounds into both the system message and the descriptions (#235)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In the DeepSeek stability test (all 10 levels passed; see the PR comments on #235), the model worked out on its own four non-trivial behaviors of apply_patch on the chat path, spending an average of 1-3 minutes per task on detours:

1. A `@@ <non-empty text> @@` anchor often fails to match on the chat path, so the model seeds a blank line with `printf '\n' >> file` to use as the anchor, applies the patch, then cleans up the extra blank lines.
2. `*** Add File: foo` conflicts with `*** Update File: foo` in the same patch (Update reads the file before the new file has been written to disk), so the model switches to pre-creating a file that contains an anchor line.
3. A completely empty target file cannot be updated directly with `*** Update File:`; a line has to be seeded through the shell first.
4. In a multi-line file, bare `+` lines after the anchor append rather than replace; replacing a line takes a paired `-` and `+`.

All of these are actual behaviors of parse_patch on the Codex CLI side, which the adapter cannot fix at the wire layer. What it can do is tell the model about these workarounds up front on the request side, raising the first-attempt success rate and cutting wasted tokens.

Implementation:
- `crates/adapters/src/responses/request.rs`: adds the `tools_register_apply_patch()` check, the `APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE` text, and the `apply_patch_chat_guidance_message()` constructor. `build_messages_from_input` appends the guidance right after the Codex CLI instructions system message, **only when** the current turn's tools array actually registers apply_patch (type:custom + name:apply_patch). Turns without apply_patch pay no token cost.
- `crates/adapters/src/responses/request/tools.rs`: appends compact versions of the four workarounds to the end of `APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT`; `APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT` (parameter level) gets an even more compact backup. The three layers of redundancy (system / tool description / parameter description) keep the model from losing all guidance when an upstream provider truncates or down-weights one layer.

Design trade-offs:
- The guidance is injected as a separate system message (not merged into the original Codex instructions), which keeps responsibilities separate and makes later adjustment or replacement easy.
- The text is in English, matching the style of the existing descriptions and aligning better with downstream model vocabularies.
- Detection uses `type:"custom" && name:"apply_patch"` rather than the lowered `type:"function"`, because `build_messages_from_input` sees the original Responses body (`convert_responses_tool_to_chat_tool` is called on the `tools` conversion path, which runs parallel to the `messages` construction path).

Tests: three new unit tests cover:
- injection when apply_patch is registered (Codex instructions are not overwritten, all four workarounds are present, the marker is present)
- no injection when apply_patch is not registered (no additional system message, no guidance marker)
- converting the same body three times yields a guidance count of 1 each time (guards against accumulation in post-processing such as merge_consecutive_system_messages)

`cargo test --workspace`: 509 adapter unit tests + 25 integration tests all pass (506→509). `cargo fmt --all -- --check` is clean.

Refs #235
---
crates/adapters/src/responses/request.rs | 52 ++++++++++ .../adapters/src/responses/request/tests.rs | 95 +++++++++++++++++++ .../adapters/src/responses/request/tools.rs | 15 ++- 3 files changed, 160 insertions(+), 2 deletions(-) diff --git a/crates/adapters/src/responses/request.rs b/crates/adapters/src/responses/request.rs index 76a16cbd..c0082549 100644 --- a/crates/adapters/src/responses/request.rs +++ b/crates/adapters/src/responses/request.rs @@ -296,6 +296,14 @@ fn build_messages_from_input( messages.push(msg); } + // 紧跟 Codex CLI 自带 instructions 之后注入 apply_patch chat-path 指引 + // (仅当本 turn 真正注册了 apply_patch 工具时)。位置选择:Codex 系统 + // 指令之后,user input 之前 — 既不污染 Codex 原指令,又确保模型在 + // 读完工具列表准备调 apply_patch 时已经见过 chat-path 限制。 + if tools_register_apply_patch(body) { + messages.push(apply_patch_chat_guidance_message()); + } + let current_messages = body .get("input") .map(input_field_to_messages) @@ -2271,4 +2279,48 @@ mod tests; use tools::{ contains_kimi_web_search_tool, convert_responses_tool_to_chat_tool, normalize_tool_choice, + APPLY_PATCH_TOOL_NAME, }; + +/// chat-path 实战指引,作为独立 `role:"system"` 注入,仅在该 turn 的 tools +/// 数组里注册了 `apply_patch` 时启用。理由参见 issue #235 真机稳定性测试 +/// (DeepSeek 跑 10 个 Level 共发现的 4 个 chat-path 行为):tool/参数 +/// description 同时含紧凑版作 fallback,但 system message 在多数 chat +/// 上游里被赋予更高权重,且模型在 system 块里读到的指引更难被遗忘 / 截断。 +const APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE: &str = concat!( + "[apply_patch chat-path guidance — injected by codex-app-transfer adapter because the upstream lark grammar constraint is unavailable on chat function-call providers]\n", + "When you call the `apply_patch` tool, follow these rules empirically observed with non-OpenAI chat providers:\n", + "\n", + "1.
Use an EMPTY LINE as the `@@` anchor whenever possible. Non-empty anchors (e.g. `@@ Hello World!`) frequently fail to match on this path. ", + "If the target file lacks a blank line near your hunk, first run `printf '\\n' >> ` via shell to seed one, then use `@@` with empty content as the anchor, and clean up extra blank lines after the patch lands.\n", + "\n", + "2. Do NOT combine `*** Add File: ` and `*** Update File: ` for the same path in a single patch. ", + "The Update step reads the file before the Add step lands on disk, so it sees an empty file and fails. Either: (a) make `*** Add File:` write the final content in one shot, or (b) split into two separate `apply_patch` invocations.\n", + "\n", + "3. `*** Update File:` cannot operate on a totally empty file. If the target is empty, first use shell (e.g. `printf '\\n' > `) to write at least one line, then call `apply_patch`.\n", + "\n", + "4. In a multi-line file, lone `+` lines following an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. To change an existing line, you must include BOTH a `-` line to remove the old content AND a `+` line to add the new content. Do not omit the `-` line.\n", + "\n", + "Following these rules avoids retry storms and improves the success rate on first attempt." 
+); + +/// 检测 Responses request body 的 tools 数组是否注册了 `apply_patch` 工具。 +/// `apply_patch` 在 Responses 协议里以 `type:"custom", name:"apply_patch"` 出现, +/// 在被 [`convert_responses_tool_to_chat_tool`] 降级前。 +/// 用于决定本 turn 是否注入 [`APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE`]。 +fn tools_register_apply_patch(body: &Value) -> bool { + let Some(tools) = body.get("tools").and_then(Value::as_array) else { + return false; + }; + tools.iter().any(|t| { + t.get("name").and_then(Value::as_str) == Some(APPLY_PATCH_TOOL_NAME) + && t.get("type").and_then(Value::as_str) == Some("custom") + }) +} + +fn apply_patch_chat_guidance_message() -> Value { + json!({ + "role": "system", + "content": APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE, + }) +} diff --git a/crates/adapters/src/responses/request/tests.rs b/crates/adapters/src/responses/request/tests.rs index 56813fcd..43f658e7 100644 --- a/crates/adapters/src/responses/request/tests.rs +++ b/crates/adapters/src/responses/request/tests.rs @@ -1952,6 +1952,101 @@ fn function_call_output_becomes_tool_message_with_placeholder_assistant() { assert_eq!(messages[1]["content"], "sunny"); } +#[test] +fn apply_patch_chat_path_guidance_injected_when_tool_registered() { + // 真机稳定性测试发现:即使 wire 桥接通了 + tool description 有 V4A + // 规则,DeepSeek 在 chat-path 上仍会反复尝试错误的 anchor / Add+Update + // 组合 / 空文件 Update 等无效路径,平均每次任务摸索 1-3 分钟。为节省 + // tokens 和提升首次成功率,adapter 在 tools 数组里注册了 apply_patch + // 的 turn 注入一段独立 system message 告知 chat-path 实战 workaround。 + let out = convert(json!({ + "input": [{"type": "message", "role": "user", "content": "edit foo.py"}], + "instructions": "You are a coding assistant.", + "tools": [{ + "type": "custom", + "name": "apply_patch", + "description": "Use the `apply_patch` tool to edit files." 
+ }] + })); + let messages = out["messages"].as_array().unwrap(); + + // Codex CLI 原 instructions 必须保留在第一条 + assert_eq!(messages[0]["role"], "system"); + assert!( + messages[0]["content"] + .as_str() + .unwrap_or_default() + .contains("coding assistant"), + "Codex 原 instructions 不应被覆盖" + ); + + // 紧跟在 Codex instructions 之后必须有一条 adapter-injected guidance + assert_eq!(messages[1]["role"], "system"); + let guidance = messages[1]["content"].as_str().unwrap_or_default(); + assert!( + guidance.contains("apply_patch chat-path guidance"), + "注入的指引必须带可识别 marker:{guidance}" + ); + // 4 个核心 workaround 都要含进去 + assert!(guidance.contains("empty line") || guidance.contains("EMPTY LINE")); + assert!(guidance.contains("Add File") && guidance.contains("Update File")); + assert!(guidance.contains("empty file") || guidance.contains("totally empty")); + assert!(guidance.contains("APPEND") || guidance.contains("append")); +} + +#[test] +fn apply_patch_chat_path_guidance_skipped_when_tool_not_registered() { + // 非 apply_patch 任务不应注入指引,避免污染 token / 模型注意力 + let out = convert(json!({ + "input": [{"type": "message", "role": "user", "content": "list files"}], + "instructions": "You are a coding assistant.", + "tools": [{ + "type": "function", + "name": "shell_command", + "description": "Run a shell command", + "parameters": {"type": "object", "properties": {}} + }] + })); + let messages = out["messages"].as_array().unwrap(); + let has_guidance = messages.iter().any(|m| { + m["content"] + .as_str() + .unwrap_or_default() + .contains("apply_patch chat-path guidance") + }); + assert!( + !has_guidance, + "无 apply_patch 注册时不应注入 chat-path guidance" + ); +} + +#[test] +fn apply_patch_chat_path_guidance_idempotent_across_turns() { + // 防止 merge_consecutive_system_messages 把 adapter-injected guidance + // 跟 Codex instructions 拼到一起后,反复 convert 时被重复累积(连发 3 个 + // turn,每 turn 转换出的 messages 里仍只含 1 段 guidance)。 + let one_turn = json!({ + "input": [{"type": "message", "role": "user", "content": "edit"}], + 
"instructions": "You are helpful.", + "tools": [{ + "type": "custom", + "name": "apply_patch", + "description": "edit" + }] + }); + for _ in 0..3 { + let out = convert(one_turn.clone()); + let guidance_count = out["messages"] + .as_array() + .unwrap() + .iter() + .map(|m| m["content"].as_str().unwrap_or_default()) + .filter(|c| c.contains("apply_patch chat-path guidance")) + .count(); + assert_eq!(guidance_count, 1, "每次 convert 仅注入一次 guidance"); + } +} + #[test] fn custom_tool_call_input_item_lowered_to_assistant_tool_calls() { // 回归保护(issue #235):turn N+1 Codex CLI 回放上一轮的 diff --git a/crates/adapters/src/responses/request/tools.rs b/crates/adapters/src/responses/request/tools.rs index 1d7f0c4e..c8ef490d 100644 --- a/crates/adapters/src/responses/request/tools.rs +++ b/crates/adapters/src/responses/request/tools.rs @@ -58,7 +58,15 @@ pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( " println!(\"{}\", x);\n", "*** End Patch\n", "Notice: `fn main() {` appears in `@@ ... @@` once as the anchor, NOT again as a space-prefixed line below. ", - "The first content line under the anchor is the line immediately after `fn main() {` in the file." + "The first content line under the anchor is the line immediately after `fn main() {` in the file.\n\n", + "CHAT-PATH GOTCHAS (the lark grammar is gone here; observed empirically with non-OpenAI providers):\n", + "1. Prefer matching an empty line as the `@@` anchor. Non-empty anchors often fail to match on this path. ", + "If needed, run `printf '\\n' >> ` first to seed a blank line, then `@@` with empty content, then clean up extra blank lines afterward.\n", + "2. Do NOT combine `*** Add File: foo` and `*** Update File: foo` in the SAME patch — Update reads the file before Add lands on disk. ", + "Either make Add File write the final content in one shot, or split into two separate patches.\n", + "3. `*** Update File:` cannot operate on a completely empty file. 
Use shell to write at least one line first, then apply_patch.\n", + "4. In a multi-line file, lone `+` lines AFTER an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. ", + "To change a line, use `-` to remove the old line AND `+` to add the new one; do not omit the `-`." ); /// Chat-path replacement for the freeform `input` parameter description. @@ -73,7 +81,10 @@ pub(crate) const APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT: &str = concat!( "Use `*** Add File:`, `*** Update File:`, or `*** Delete File:` headers and ", "`@@ @@` hunks with ` `/`+`/`-` line prefixes. Relative paths only. ", "CRITICAL: in an Update hunk the `@@ @@` anchor is a SINGLE existing file line; ", - "the space-prefixed lines following the anchor describe lines AFTER it (do not repeat the anchor)." + "the space-prefixed lines following the anchor describe lines AFTER it (do not repeat the anchor). ", + "Chat-path gotchas: prefer empty-line anchors (seed with `printf '\\n' >> file` if needed); ", + "do not Add+Update the same path in one patch; Update cannot operate on a totally empty file; ", + "lone `+` lines after `@@` APPEND below the anchor (use `-` + `+` to replace a line)." ); /// Responses tool 定义 → Chat tool 定义. 
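Two of the chat-path rules spelled out in the descriptions above (do not repeat the `@@` anchor as the first space-prefixed line, and do not Add and then Update the same path in one patch) can be sketched as a small pre-flight lint. This is only an illustration under stated assumptions: `lint_patch` and `PatchWarning` are hypothetical names, the check is a line-based heuristic using only the standard library, and nothing here is taken from the adapter or from Codex CLI's `parse_patch`. A repeated anchor can also be legitimate when the file really contains the same line twice in a row, so a warning is a hint, not a verdict.

```rust
use std::collections::HashSet;

/// Hypothetical warnings produced by the sketch below (not adapter code).
#[derive(Debug, PartialEq)]
enum PatchWarning {
    /// The hunk repeats its `@@` anchor as the first space-prefixed line.
    RepeatedAnchor(String),
    /// The same path is added and later updated inside one patch.
    AddThenUpdate(String),
}

/// Heuristic, line-based lint over a V4A patch string.
fn lint_patch(patch: &str) -> Vec<PatchWarning> {
    let mut warnings = Vec::new();
    let mut added: HashSet<&str> = HashSet::new();
    let mut lines = patch.lines().peekable();

    while let Some(line) = lines.next() {
        if let Some(path) = line.strip_prefix("*** Add File: ") {
            // Remember every path created by this patch.
            added.insert(path.trim());
        } else if let Some(path) = line.strip_prefix("*** Update File: ") {
            // An Update of a path that an earlier Add in this patch created.
            if added.contains(path.trim()) {
                warnings.push(PatchWarning::AddThenUpdate(path.trim().to_string()));
            }
        } else if let Some(rest) = line.strip_prefix("@@") {
            // Accept both `@@ anchor` and `@@ anchor @@`; a bare `@@` has no anchor.
            let anchor = rest.trim().trim_end_matches("@@").trim();
            // Only the line directly below the header can repeat the anchor.
            let repeated = lines
                .peek()
                .and_then(|next| next.strip_prefix(' '))
                .map_or(false, |context| context.trim() == anchor);
            if !anchor.is_empty() && repeated {
                warnings.push(PatchWarning::RepeatedAnchor(anchor.to_string()));
            }
        }
    }
    warnings
}
```

A caller might run `lint_patch` on the extracted `input` string before forwarding it and log any warnings; whether such a check belongs in the adapter at all is a design question this series does not address.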
From 345234a40d2cd64d78648ec31568a18e7933f3a6 Mon Sep 17 00:00:00 2001
From: Cmochance <3216202644@qq.com>
Date: Thu, 21 May 2026 16:11:25 +0800
Subject: [PATCH 4/4] fix(adapters): improve apply_patch chat-path prompt quality: verbatim upstream V4A walkthrough + 3 examples + byte-exact rule (#235)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PR #236 fixed the wire layer (custom_tool_call SSE bridging, multi-turn history replay, and the first version of the system / description injection). A capture from a real issue #235 session (28 turns / 26MB / DeepSeek + Kimi) nevertheless shows that 6 of 7 apply_patch tool calls were still rejected by the validator on the Codex Desktop side because of the quality of the V4A patch content the model generated:

- 3 turns: the model emitted raw Python code, and Codex Desktop reported `invalid hunk at line 3, 'def main():' is not a valid hunk header`
- 1 turn: the V4A format was correct but the `-` lines did not match the file byte-exactly, giving `Failed to find expected lines`
- 3 turns: the model chose `exec_command` instead of `apply_patch`

This PR targets the quality of the model's output; the wire layer is unchanged.

Main changes

- `crates/adapters/src/responses/apply_patch_v4a_reference.md` (new): a verbatim mirror of upstream Codex CLI `codex-rs/core/prompt_with_apply_patch_instructions.md` L277-L351 @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88` (Apache-2.0, Copyright 2025 OpenAI). An adapter note at the top explicitly overrides the "shell command" wording with "function-call tool"; the rest of the original text is unchanged.
- `crates/adapters/src/responses/request.rs::APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE`: rewritten into three parts: (1) top-level tool-selection guidance (to counter the preference for exec_command); (2) the V4A walkthrough above, embedded with include_str!; (3) five gotchas observed on the chat path (byte-exact matching / empty-line anchors only when a blank line exists / Add+Update in the same patch / empty files / bare `+` lines do not replace).
- `crates/adapters/src/responses/request/tools.rs::APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT`: extended with the top-level tool-selection guidance, the BYTE-EXACT matching rule, three positive examples (modify a line / Add File / multi-hunk Update), and an anti-pattern reminder ("NEVER pass raw source code").
- `APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT` (parameter-level mirror): some providers down-weight tool-level descriptions in long histories, so compact versions of the byte-exact rule and the anti-pattern reminder are added at the parameter level to keep them visible.
- License compliance: adds a NOTICE file; the acknowledgement sections of ACKNOWLEDGEMENTS.md / README.md / README.en.md gain the matching upstream attribution (full 40-char SHA + L277-L351 + Apache-2.0).
- docs/CHANGELOG.md and docs/investigation/protocol-conversion-3way-comparison.md are updated to reflect this change.
- Five new or extended unit-test assertions: tool selection / evidence of the verbatim V4A reference (the `Patch := Begin { FileOp } End` EBNF block and the `@@ class BaseClass` double-anchor example) / byte-exact / the three positive examples / the anti-pattern reminder. 509 tests pass.

The size of the success-rate improvement will be known once regression testing on a real setup after the push produces round-2 data; the concrete numbers will then be added to the PR description.

Refs #235
---
ACKNOWLEDGEMENTS.md | 10 ++- NOTICE | 60 ++++++++++++++ README.en.md | 2 +- README.md | 2 +- .../responses/apply_patch_v4a_reference.md | 79 +++++++++++++++++++ crates/adapters/src/responses/request.rs | 55 ++++++++++--- .../adapters/src/responses/request/tests.rs | 72 ++++++++++++++++- .../adapters/src/responses/request/tools.rs | 60 +++++++++++--- docs/CHANGELOG.md | 11 ++- .../protocol-conversion-3way-comparison.md | 4 +- 10 files changed, 324 insertions(+), 31 deletions(-) create mode 100644 NOTICE create mode 100644 crates/adapters/src/responses/apply_patch_v4a_reference.md diff --git a/ACKNOWLEDGEMENTS.md b/ACKNOWLEDGEMENTS.md index 044e65a6..70eb5cc7 100644 --- a/ACKNOWLEDGEMENTS.md +++ b/ACKNOWLEDGEMENTS.md @@ -124,8 +124,8 @@ - **Link**: https://github.com/openai/codex - **License**: Apache-2.0 -- **借鉴形式**: Prompt 蓝本(精简移植)+ 协议反查(数据模式参照) -- **首次借鉴 PR / 时间**: v2.0.x 起协议结构反查;fix/219 起 prompt 结构借鉴 +- **借鉴形式**: Prompt 蓝本(精简移植)+ 协议反查(数据模式参照)+
V4A apply_patch system guidance(verbatim 引用) +- **首次借鉴 PR / 时间**: v2.0.x 起协议结构反查;fix/219 起 compact prompt 结构借鉴;PR #235 后续(post-#236)起 apply_patch V4A chat-path guidance verbatim 引用 - **借鉴清单**: - `COMPACT_SUMMARIZATION_PROMPT` 基础骨架 → `crates/adapters/src/responses/compact.rs:82-92` (源文件:`codex-rs/core/templates/compact/prompt.md`,~460 chars) @@ -137,6 +137,12 @@ - `CompactHistoryResponse { output: Vec }` + `ResponseItem::Compaction { encrypted_content }` 响应结构 → `compact.rs` 序列化路径 (源文件:`codex-rs/codex-api/src/endpoint/compact.rs` + `codex-rs/protocol/src/models.rs:882`) + - apply_patch V4A chat-path system guidance(verbatim 引用)→ + `crates/adapters/src/responses/apply_patch_v4a_reference.md`(verbatim 镜像,头部加 adapter note 显式 override "shell command" 字眼为 "function-call tool",其余原文未改动) + 由 `crates/adapters/src/responses/request.rs::APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE`(`include_str!`)拼装注入 + 源文件:`codex-rs/core/prompt_with_apply_patch_instructions.md` 的 L277-L351 @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88` + 动机:issue #235 真机 capture 28-turn / 26MB DeepSeek+Kimi 数据,模型生成 V4A patch 在 chat-path 失败率 6 / 7 个 turn,需要在 prompt 注入完整 V4A 教学(envelope / hunks / `@@` 锚点 / 3-line context / EBNF / 多操作 example)。chat-path 无 lark grammar 兜底,只能靠 prompt 引导。 + 上游 SHA 升级时:同步更新本文件 + `apply_patch_v4a_reference.md` 头部 + `request.rs::APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE` doc comment + `README.md` / `README.en.md` 致谢段 + `NOTICE` 文件共 6 处 commit SHA,再 fresh slice 覆盖 reference 正文。 - **本项目差异 / 扩展**: - prompt 补两条 Claude Code 关键 bullet("All user messages verbatim" + "Next Step verbatim quote"), 借鉴自 Piebald-AI/claude-code-system-prompts 反编译公开版本第 6 / 9 段(见下方同名 entry) diff --git a/NOTICE b/NOTICE new file mode 100644 index 00000000..b37a7e13 --- /dev/null +++ b/NOTICE @@ -0,0 +1,60 @@ +Codex App Transfer +Copyright (c) 2026 Codex App Transfer + +This project is licensed under the MIT License (see LICENSE.txt). 
+ +============================================================================= +Third-party attribution +============================================================================= + +This project includes prose / prompt content derived from upstream open-source +projects. The following attributions are required by the upstream licenses. + +----------------------------------------------------------------------------- +openai/codex — Apache License 2.0 +----------------------------------------------------------------------------- + +Copyright 2025 OpenAI + +Licensed under the Apache License, Version 2.0 (the "License"); you may not +use this file except in compliance with the License. You may obtain a copy +of the License at: + + https://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +License for the specific language governing permissions and limitations +under the License. + +Derived content in this repository: + + * crates/adapters/src/responses/apply_patch_v4a_reference.md + Verbatim mirror of the V4A apply_patch walkthrough section + (envelope grammar / `@@` anchors / 3-line context / EBNF / multi-op + example) from upstream: + Source repo: https://github.com/openai/codex + Source commit: 0b4f86095c8005d8f74e9c62b971d72c1670aa88 + Source file: codex-rs/core/prompt_with_apply_patch_instructions.md + Source lines: L277-L351 (75 lines) + The mirror file adds a one-line "Adapter note" preamble (authored by + codex-app-transfer maintainers) clarifying that in this chat-completions + environment apply_patch is a function-call tool, not a shell command. + The borrowed prose itself is otherwise unmodified. 
+ + The mirror file is embedded into the system prompt at runtime via + `include_str!` from + `crates/adapters/src/responses/request.rs::APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE`. + + * crates/adapters/src/responses/compact.rs (prior borrowing, retained) + `COMPACT_SUMMARIZATION_PROMPT` skeleton and `COMPACT_SUMMARY_PREFIX` + derived from + `codex-rs/core/templates/compact/{prompt.md,summary_prefix.md}`. + See ACKNOWLEDGEMENTS.md for full borrowing list and file:line mapping. + +----------------------------------------------------------------------------- + +For the complete list of upstream projects this repository borrows from +(including non-Apache projects), see ACKNOWLEDGEMENTS.md and the credits +section of README.md / README.en.md. diff --git a/README.en.md b/README.en.md index 0217d372..6a2c4bc3 100644 --- a/README.en.md +++ b/README.en.md @@ -225,7 +225,7 @@ Some experimental providers (Grok Web / Gemini CLI OAuth / Antigravity OAuth) in - [`lonr-6/cc-desktop-switch`](https://github.com/lonr-6/cc-desktop-switch) — v1.x desktop shell skeleton + README structure reference - [`BerriAI/litellm`](https://github.com/BerriAI/litellm) — bidirectional protocol translation patterns - [`tauri-apps/tauri`](https://tauri.app/) — v2 + `cas://` architecture base -- [`openai/codex`](https://github.com/openai/codex) — autocompact prompt base structure + compact protocol reverse-reference (Apache-2.0) +- [`openai/codex`](https://github.com/openai/codex) — autocompact prompt base structure + compact protocol reverse-reference + verbatim apply_patch V4A chat-path system guidance (`codex-rs/core/prompt_with_apply_patch_instructions.md` L277-L351 @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88`, Apache-2.0) - [`Piebald-AI/claude-code-system-prompts`](https://github.com/Piebald-AI/claude-code-system-prompts) — autocompact anchor bullets (All user messages verbatim + Next Step verbatim) - [`7as0nch/mimo2codex`](https://github.com/7as0nch/mimo2codex) — MiMo protocol reference - 
[`router-for-me/CLIProxyAPI`](https://github.com/router-for-me/CLIProxyAPI) — Gemini OAuth wire-level reference diff --git a/README.md b/README.md index ad9a6a00..a630924f 100644 --- a/README.md +++ b/README.md @@ -233,7 +233,7 @@ v2.1.12+ clients **enforce** RSA-3072 PKCS#1-v1.5-SHA256 signature verification of `latest.json - [`lonr-6/cc-desktop-switch`](https://github.com/lonr-6/cc-desktop-switch) — v1.x desktop shell skeleton + README structure reference - [`BerriAI/litellm`](https://github.com/BerriAI/litellm) — bidirectional protocol translation approach - [`tauri-apps/tauri`](https://tauri.app/) — v2 + `cas://` architecture base -- [`openai/codex`](https://github.com/openai/codex) — autocompact prompt skeleton + compact protocol structure reverse lookup (Apache-2.0) +- [`openai/codex`](https://github.com/openai/codex) — autocompact prompt skeleton + compact protocol structure reverse lookup + apply_patch V4A chat-path system guidance quoted verbatim (`codex-rs/core/prompt_with_apply_patch_instructions.md` L277-L351 @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88`, Apache-2.0) - [`Piebald-AI/claude-code-system-prompts`](https://github.com/Piebald-AI/claude-code-system-prompts) — autocompact prompt anchor bullets (All user messages verbatim + Next Step verbatim) - [`7as0nch/mimo2codex`](https://github.com/7as0nch/mimo2codex) — MiMo protocol reference - [`router-for-me/CLIProxyAPI`](https://github.com/router-for-me/CLIProxyAPI) — Gemini OAuth wire reference diff --git a/crates/adapters/src/responses/apply_patch_v4a_reference.md b/crates/adapters/src/responses/apply_patch_v4a_reference.md new file mode 100644 index 00000000..5b9f136e --- /dev/null +++ b/crates/adapters/src/responses/apply_patch_v4a_reference.md @@ -0,0 +1,79 @@ +> Adapter note (codex-app-transfer): in this chat-completions environment `apply_patch` is exposed as a function-call tool, NOT a shell command. Read every "`apply_patch` shell command" / "invoke apply_patch like `shell {"command":[...]}`" reference in the prose below as "the `apply_patch` function-call tool" — the patch envelope, hunk format, and V4A grammar are identical; only the invocation surface differs.
Specifically: call the `apply_patch` function with a single `input` string containing the V4A patch (no `shell` wrapper, no JSON-quoting tricks); the trailing shell-style invocation example at the very end is for the upstream Codex CLI's shell-tool path and does not apply here. +> +> Source: verbatim mirror of openai/codex @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88`, file `codex-rs/core/prompt_with_apply_patch_instructions.md`, lines L277-L351. Apache-2.0, Copyright 2025 OpenAI. Stored at `crates/adapters/src/responses/apply_patch_v4a_reference.md` in this repository. See `NOTICE` + `ACKNOWLEDGEMENTS.md` for full attribution. + +## `apply_patch` + +Use the `apply_patch` shell command to edit files. +Your patch language is a stripped‑down, file‑oriented diff format designed to be easy to parse and safe to apply. You can think of it as a high‑level envelope: + +*** Begin Patch +[ one or more file sections ] +*** End Patch + +Within that envelope, you get a sequence of file operations. +You MUST include a header to specify the action you are taking. +Each operation starts with one of three headers: + +*** Add File: - create a new file. Every following line is a + line (the initial contents). +*** Delete File: - remove an existing file. Nothing follows. +*** Update File: - patch an existing file in place (optionally with a rename). + +May be immediately followed by *** Move to: if you want to rename the file. +Then one or more “hunks”, each introduced by @@ (optionally followed by a hunk header). +Within a hunk each line starts with: + +For instructions on [context_before] and [context_after]: +- By default, show 3 lines of code immediately above and 3 lines immediately below each change. If a change is within 3 lines of a previous change, do NOT duplicate the first change’s [context_after] lines in the second change’s [context_before] lines. 
+- If 3 lines of context is insufficient to uniquely identify the snippet of code within the file, use the @@ operator to indicate the class or function to which the snippet belongs. For instance, we might have: +@@ class BaseClass +[3 lines of pre-context] +- [old_code] ++ [new_code] +[3 lines of post-context] + +- If a code block is repeated so many times in a class or function such that even a single `@@` statement and 3 lines of context cannot uniquely identify the snippet of code, you can use multiple `@@` statements to jump to the right context. For instance: + +@@ class BaseClass +@@ def method(): +[3 lines of pre-context] +- [old_code] ++ [new_code] +[3 lines of post-context] + +The full grammar definition is below: +Patch := Begin { FileOp } End +Begin := "*** Begin Patch" NEWLINE +End := "*** End Patch" NEWLINE +FileOp := AddFile | DeleteFile | UpdateFile +AddFile := "*** Add File: " path NEWLINE { "+" line NEWLINE } +DeleteFile := "*** Delete File: " path NEWLINE +UpdateFile := "*** Update File: " path NEWLINE [ MoveTo ] { Hunk } +MoveTo := "*** Move to: " newPath NEWLINE +Hunk := "@@" [ header ] NEWLINE { HunkLine } [ "*** End of File" NEWLINE ] +HunkLine := (" " | "-" | "+") text NEWLINE + +A full patch can combine several operations: + +*** Begin Patch +*** Add File: hello.txt ++Hello world +*** Update File: src/app.py +*** Move to: src/main.py +@@ def greet(): +-print("Hi") ++print("Hello, world!") +*** Delete File: obsolete.txt +*** End Patch + +It is important to remember: + +- You must include a header with your intended action (Add/Delete/Update) +- You must prefix new lines with `+` even when creating a new file +- File references can only be relative, NEVER ABSOLUTE. 
+ +You can invoke apply_patch like: + +``` +shell {"command":["apply_patch","*** Begin Patch\n*** Add File: hello.txt\n+Hello, world!\n*** End Patch\n"]} +``` diff --git a/crates/adapters/src/responses/request.rs b/crates/adapters/src/responses/request.rs index c0082549..a3f192eb 100644 --- a/crates/adapters/src/responses/request.rs +++ b/crates/adapters/src/responses/request.rs @@ -2283,25 +2283,56 @@ use tools::{ }; /// Practical chat-path guidance, injected as a standalone `role:"system"` message, enabled only when this turn's tools -/// array registers `apply_patch`. For the rationale see the issue #235 real-device stability test -/// (4 chat-path behaviors found while running DeepSeek through 10 levels): the tool/parameter -/// descriptions also carry a compact version as a fallback, but most chat -/// upstreams give the system message more weight, and guidance the model reads in the system block is harder to forget / truncate. +/// array registers `apply_patch`. For the rationale see the issue #235 real-device stability test: +/// chat-completions providers have no lark grammar as a backstop, and model-generated V4A content +/// failed in 6 of 7 turns (emitting Python code directly / `-` lines not byte-exact / choosing exec_command). +/// +/// Structure = three parts: +/// (1) Top-level tool selection guidance (counters the exec_command preference) +/// (2) [include_str!] Full V4A walkthrough block — borrowed verbatim from upstream Codex CLI +/// openai/codex @ commit `0b4f86095c8005d8f74e9c62b971d72c1670aa88`, +/// codex-rs/core/prompt_with_apply_patch_instructions.md L277-L351, +/// Apache-2.0 licensed, Copyright 2025 OpenAI. +/// Attribution also appears in the NOTICE file + the Chinese and English README credits sections + +/// ACKNOWLEDGEMENTS.md + the adapter note at the top of +/// `apply_patch_v4a_reference.md`. +/// When upstream cuts a release, update the commit SHA in all 6 places **together**: +/// - the adapter note at the top of `apply_patch_v4a_reference.md` +/// - the doc comment above this constant (here) +/// - the openai/codex section of `ACKNOWLEDGEMENTS.md` +/// - the `README.md` credits section +/// - the `README.en.md` credits section +/// - the third-party attribution section of the `NOTICE` file +/// Then overwrite the body of `apply_patch_v4a_reference.md` with a fresh upstream slice. +/// (3) Chat-path specific gotchas — 5 failure-mode workarounds distilled from the +/// real-device capture (issue #235), covering non-OpenAI provider differences that the generic V4A rules do not address. +/// +/// Total length is about 3KB. Checked against the Anthropic `Define tools` best practice ("at least 3-4 +/// sentences per tool description, more if the tool is complex"). const
APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE: &str = concat!( "[apply_patch chat-path guidance — injected by codex-app-transfer adapter because the upstream lark grammar constraint is unavailable on chat function-call providers]\n", - "When you call the `apply_patch` tool, follow these rules empirically observed with non-OpenAI chat providers:\n", "\n", - "1. Use an EMPTY LINE as the `@@` anchor whenever possible. Non-empty anchors (e.g. `@@ Hello World!`) frequently fail to match on this path. ", - "If the target file lacks a blank line near your hunk, first run `printf '\\n' >> ` via shell to seed one, then use `@@` with empty content as the anchor, and clean up extra blank lines after the patch lands.\n", + "## Tool selection\n", + "\n", + "When the user asks you to create, edit, refactor, or delete file content, you MUST use the `apply_patch` tool. Do NOT shell out to `sed` / `awk` / `cat < file` / `printf > file` to write or modify file content — those produce inconsistent diffs across reruns, bypass the diff UI, and frequently leave whitespace mismatches that break subsequent edits. Reserve shell tools for reads (`cat`, `sed -n '1,80p' `, `ls`), execution, and tests.\n", + "\n", + "## V4A patch format\n", + "\n", + include_str!("apply_patch_v4a_reference.md"), + "\n", + "## Chat-path specific gotchas (codex-app-transfer adapter, empirically observed with non-OpenAI providers)\n", + "\n", + "1. Match `-` lines BYTE-EXACT to the file's current content — same leading whitespace, no trimmed trailing spaces, no normalized newlines. If you are not certain what the lines look like, first run `cat ` or `sed -n '1,80p' ` via shell to read the file, then compose the patch from the actual bytes. Guessing `-` content is the #1 cause of `Failed to find expected lines` errors on this path.\n", + "\n", + "2. Empty-line `@@` anchors only work if a blank line actually exists at that position in the file. If you need an anchor in a file with no blank lines, prefer a non-empty `@@
` anchor (e.g. `@@ def add(a, b):`) over forcing a blank line.\n", "\n", - "2. Do NOT combine `*** Add File: ` and `*** Update File: ` for the same path in a single patch. ", - "The Update step reads the file before the Add step lands on disk, so it sees an empty file and fails. Either: (a) make `*** Add File:` write the final content in one shot, or (b) split into two separate `apply_patch` invocations.\n", + "3. Do NOT combine `*** Add File: ` and `*** Update File: ` for the same path in a single patch. The Update step reads the file before the Add step lands on disk, so it sees an empty file and fails. Either: (a) make `*** Add File:` write the final content in one shot, or (b) split into two separate `apply_patch` invocations.\n", "\n", - "3. `*** Update File:` cannot operate on a totally empty file. If the target is empty, first use shell (e.g. `printf '\\n' > `) to write at least one line, then call `apply_patch`.\n", + "4. `*** Update File:` cannot operate on a totally empty file. If the target is empty, first use shell (e.g. `printf '\\n' > `) to write at least one line, then call `apply_patch`.\n", "\n", - "4. In a multi-line file, lone `+` lines following an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. To change an existing line, you must include BOTH a `-` line to remove the old content AND a `+` line to add the new content. Do not omit the `-` line.\n", + "5. In a multi-line file, lone `+` lines AFTER an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. To change a line, you must include BOTH a `-` line to remove the old content AND a `+` line to add the new one; do not omit the `-`.\n", "\n", - "Following these rules avoids retry storms and improves the success rate on first attempt." + "Following the V4A grammar above plus these chat-path rules avoids retry storms and improves first-attempt success rate substantially on non-OpenAI providers." 
); /// Detects whether the tools array of a Responses request body registers the `apply_patch` tool. diff --git a/crates/adapters/src/responses/request/tests.rs b/crates/adapters/src/responses/request/tests.rs index 43f658e7..822fc73b 100644 --- a/crates/adapters/src/responses/request/tests.rs +++ b/crates/adapters/src/responses/request/tests.rs @@ -1987,8 +1987,35 @@ fn apply_patch_chat_path_guidance_injected_when_tool_registered() { guidance.contains("apply_patch chat-path guidance"), "injected guidance must carry a recognizable marker: {guidance}" ); - // all 4 core workarounds must be included - assert!(guidance.contains("empty line") || guidance.contains("EMPTY LINE")); + let guidance_lc = guidance.to_lowercase(); + // (1) Top-level tool selection guidance (counters the exec_command preference) + assert!( + guidance.contains("Tool selection") && guidance.contains("Do NOT shell out"), + "guidance must contain the top-level tool selection guidance: {guidance}" + ); + // (2) The verbatim upstream V4A reference has been assembled in via include_str! + assert!( + guidance.contains("*** Begin Patch") && guidance.contains("*** End Patch"), + "guidance must contain the full V4A envelope walkthrough: {guidance}" + ); + assert!( + guidance.contains("Patch := Begin { FileOp } End"), + "guidance must contain the upstream EBNF grammar block (evidence of verbatim quoting): {guidance}" + ); + assert!( + guidance.contains("@@ class BaseClass"), + "guidance must contain the upstream double-anchor example (evidence of verbatim quoting): {guidance}" + ); + // (3) the 4 chat-path gotchas + the byte-exact rule + assert!( + guidance_lc.contains("byte-exact") || guidance_lc.contains("byte-for-byte"), + "guidance must contain the byte-exact matching rule (fix for the turn 0010 failure): {guidance}" + ); + // Empty-line anchor gotcha #2 is one of the key chat-path rules; pin the exact wording to prevent wording drift + assert!( + guidance.contains("Empty-line `@@` anchors only work"), + "chat-path gotcha #2 (empty-line anchor only when a blank line exists) must be kept: {guidance}" + ); assert!(guidance.contains("Add File") && guidance.contains("Update File")); assert!(guidance.contains("empty file") || guidance.contains("totally empty")); assert!(guidance.contains("APPEND") || guidance.contains("append")); @@ -2895,6 +2922,47 @@ fn
tools_custom_apply_patch_injects_v4a_format_hint() { && (input_desc.contains("AFTER") || input_desc.contains("after")), "input description must keep the compact version of the anchor semantics: {input_desc}" ); + + // Regression guard (triggered by the issue #235 real-device 28-turn capture: 6 of the 7 apply_patch + // tool_calls failed → 85.7% failure rate): + // the tool description must give explicit guidance on the four new dimensions (a)-(d) below. + let outer_lc = outer.to_lowercase(); + + // (a) Top-level tool selection guidance (counters the model choosing exec_command over apply_patch) + assert!( + outer.contains("Use this tool for ANY file content change") + && outer_lc.contains("do not shell out"), + "outer description must explicitly steer toward apply_patch rather than editing files via shell (sed/echo/python -c etc.): {outer}" + ); + + // (b) Byte-exact matching rule (counters turn 0010 \"Failed to find expected lines\") + assert!( + outer_lc.contains("byte-exact") || outer_lc.contains("byte-for-byte"), + "outer description must contain the byte-exact matching rule for `-` lines: {outer}" + ); + assert!( + input_desc.to_lowercase().contains("byte-exact"), + "the compact input description must also contain the byte-exact rule: {input_desc}" + ); + + // (c) At least 3 positive examples (Anthropic best practice + borrowed from upstream): + // Example 1: change one line + // Example 2: create a new file with Add File + // Example 3: multi-hunk Update (two anchors) + assert!( + outer.contains("EXAMPLE 1") && outer.contains("EXAMPLE 2") && outer.contains("EXAMPLE 3"), + "outer description must contain 3 positive examples: {outer}" + ); + assert!( + outer.contains("*** Add File: hello.py"), + "outer description must contain the Add File example: {outer}" + ); + + // (d) Anti-pattern reminder (counters turns 0016/0019/0022 emitting Python code directly) + assert!( + outer.contains("NOT pass raw source code"), + "outer description must explicitly warn against passing raw source code: {outer}" + ); } #[test] diff --git a/crates/adapters/src/responses/request/tools.rs b/crates/adapters/src/responses/request/tools.rs index c8ef490d..797485c3 100644 --- a/crates/adapters/src/responses/request/tools.rs +++ b/crates/adapters/src/responses/request/tools.rs @@ -29,7 +29,11 @@ pub(crate) const APPLY_PATCH_TOOL_NAME: &str = "apply_patch"; /// the doubled line is not found → the whole patch is rejected. This description relies on explicit rules + a minimal /// runnable update-file example
so that the model sees the correct form. pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( - "Edit files using the apply_patch tool. ", + "Edit files using the apply_patch tool. Use this tool for ANY file content ", + "change — creating, modifying, refactoring, or deleting code. Do NOT shell out ", + "to `sed`, `awk`, `cat < file`, or `printf > file` for ", + "edits; those bypass the diff UI and frequently leave whitespace mismatches. ", + "Reserve shell tools for reads (`cat`, `sed -n '1,80p' `), execution, and tests.\n\n", "Call this function with a single `input` string containing a V4A patch. ", "The patch must start with `*** Begin Patch` and end with `*** End Patch`. ", "Each file operation header is one of `*** Add File: `, ", @@ -44,12 +48,17 @@ pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( "Every space-prefixed line that follows the `@@` marker corresponds to lines ", "AFTER the anchor in the file (not the anchor itself). ", "Do NOT repeat the anchor line as the first space-prefixed line — the parser will reject it.\n\n", - "EXAMPLE — to change `let x = 1;` to `let x = 2;` in a file whose lines around the change read:\n", + "BYTE-EXACT MATCHING (#1 cause of `Failed to find expected lines` on chat-completions providers):\n", + "Every `-` line MUST match the file's current line byte-for-byte — same leading whitespace, ", + "no trimmed trailing spaces, exact characters. If you are not certain what the line looks like, ", + "first run `cat ` or `sed -n '1,80p' ` via shell to read it, then compose the patch ", + "from the actual bytes. 
Guessing or paraphrasing `-` content WILL fail.\n\n", + "EXAMPLE 1 — change `let x = 1;` to `let x = 2;` in a file whose lines around the change read:\n", " fn main() {\n", " let x = 1;\n", " println!(\"{}\", x);\n", " }\n", - "The correct patch is:\n", + "Correct patch:\n", "*** Begin Patch\n", "*** Update File: src/main.rs\n", "@@ fn main() {\n", @@ -59,14 +68,40 @@ pub(crate) const APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT: &str = concat!( "*** End Patch\n", "Notice: `fn main() {` appears in `@@ ... @@` once as the anchor, NOT again as a space-prefixed line below. ", "The first content line under the anchor is the line immediately after `fn main() {` in the file.\n\n", + "EXAMPLE 2 — create a brand new file `hello.py` with two lines of content. ", + "For Add File there are NO hunks and NO `@@` markers; every line of the new file's contents is prefixed with `+`:\n", + "*** Begin Patch\n", + "*** Add File: hello.py\n", + "+def greet(name: str) -> str:\n", + "+ return f\"Hello, {name}!\"\n", + "*** End Patch\n\n", + "EXAMPLE 3 — refactor four top-level functions into one class. Multiple hunks, each independently anchored:\n", + "*** Begin Patch\n", + "*** Update File: src/calc.py\n", + "@@ def add(a: int, b: int) -> int:\n", + "-def add(a: int, b: int) -> int:\n", + "- return a + b\n", + "+class Calculator:\n", + "+ def add(self, a: int, b: int) -> int:\n", + "+ return a + b\n", + "@@ def sub(a: int, b: int) -> int:\n", + "-def sub(a: int, b: int) -> int:\n", + "- return a - b\n", + "+ def sub(self, a: int, b: int) -> int:\n", + "+ return a - b\n", + "*** End Patch\n", + "Notice: each hunk has its OWN `@@` anchor naming an existing line, and the `-`/`+` lines change just that region.\n\n", "CHAT-PATH GOTCHAS (the lark grammar is gone here; observed empirically with non-OpenAI providers):\n", - "1. Prefer matching an empty line as the `@@` anchor. Non-empty anchors often fail to match on this path. 
", - "If needed, run `printf '\\n' >> ` first to seed a blank line, then `@@` with empty content, then clean up extra blank lines afterward.\n", - "2. Do NOT combine `*** Add File: foo` and `*** Update File: foo` in the SAME patch — Update reads the file before Add lands on disk. ", + "1. `-` lines must be byte-exact to file content (see BYTE-EXACT MATCHING above). When in doubt, read the file via shell first.\n", + "2. Empty-line `@@` anchors only work if a blank line actually exists at that position in the file. ", + "Prefer a non-empty `@@
` anchor (e.g. `@@ def add(a, b):`) over forcing a blank line.\n", + "3. Do NOT combine `*** Add File: foo` and `*** Update File: foo` in the SAME patch — Update reads the file before Add lands on disk. ", "Either make Add File write the final content in one shot, or split into two separate patches.\n", - "3. `*** Update File:` cannot operate on a completely empty file. Use shell to write at least one line first, then apply_patch.\n", - "4. In a multi-line file, lone `+` lines AFTER an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. ", - "To change a line, use `-` to remove the old line AND `+` to add the new one; do not omit the `-`." + "4. `*** Update File:` cannot operate on a completely empty file. Use shell to write at least one line first, then apply_patch.\n", + "5. In a multi-line file, lone `+` lines AFTER an `@@` anchor APPEND below the anchor — they do NOT replace the anchor line. ", + "To change a line, use `-` to remove the old line AND `+` to add the new one; do not omit the `-`.\n\n", + "REMEMBER: the `input` value MUST be a V4A patch enclosed in `*** Begin Patch` / `*** End Patch`. ", + "Do NOT pass raw source code (e.g. `def main():\\n pass`) — that is not a valid hunk header and will be rejected." ); /// Chat-path replacement for the freeform `input` parameter description. @@ -82,9 +117,12 @@ pub(crate) const APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT: &str = concat!( "`@@ @@` hunks with ` `/`+`/`-` line prefixes. Relative paths only. ", "CRITICAL: in an Update hunk the `@@ @@` anchor is a SINGLE existing file line; ", "the space-prefixed lines following the anchor describe lines AFTER it (do not repeat the anchor). ", - "Chat-path gotchas: prefer empty-line anchors (seed with `printf '\\n' >> file` if needed); ", + "`-` lines MUST be byte-exact to the file's current content (read via `cat ` first if unsure); ", + "guessing or paraphrasing them is the #1 cause of `Failed to find expected lines`. 
", + "Chat-path gotchas: prefer non-empty `@@
` anchors over forcing blank-line anchors; ", "do not Add+Update the same path in one patch; Update cannot operate on a totally empty file; ", - "lone `+` lines after `@@` APPEND below the anchor (use `-` + `+` to replace a line)." + "lone `+` lines after `@@` APPEND below the anchor (use `-` + `+` to replace a line). ", + "NEVER pass raw source code — only a V4A patch envelope is accepted." ); /// Responses tool definition → Chat tool definition. diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index 895ff97a..0851e345 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -2,12 +2,21 @@ Per-version highlights. For detailed changes see [GitHub Releases](https://github.com/Cmochance/codex-app-transfer/releases) and `docs/release-notes/v*.md`. -## Unreleased — PR #153 draft +## Unreleased — PR #153 draft + post-#236 apply_patch chat-path prompt quality **Anthropic Messages protocol adaptation**: adds the canonical `apiFormat=anthropic_messages`, converts Codex CLI Responses requests to Anthropic `/v1/messages`, and restores Anthropic Messages SSE back into Responses SSE. The current PR already covers text, thinking, tool_use, tool_result repair, `previous_response_id`, compact response, upstream error, provider test/model list, and the UI save/display path. The Claude preset is not exposed yet: it will be added to the default presets only after P7 verification against real Claude passes for text, tool-call, `previous_response_id`, and upstream error. +**apply_patch chat-path prompt-quality improvements** (post-PR #236 follow-up, issue #235): PR #236 fixed the wire layer (`custom_tool_call` SSE bridging + multi-turn history replay + first version of the system injection + description improvements), but a real-device capture (28-turn / 26MB / DeepSeek + Kimi) showed that 6 of 7 apply_patch tool_calls were rejected by the validator on the Codex Desktop side because of the quality of the model-generated V4A patch content (emitting Python code directly / `-` lines not byte-exact / choosing `exec_command` instead of `apply_patch`). This change targets model generation quality; the wire layer does not change further. Changes: + +- `APPLY_PATCH_CHAT_PATH_SYSTEM_GUIDANCE` (system message content rewritten): structure = tool selection guidance (counters the exec_command preference) + the full upstream V4A walkthrough (verbatim, openai/codex @ `0b4f86095c8005d8f74e9c62b971d72c1670aa88`, Apache-2.0, embedded via `include_str!`) + 5 empirically observed chat-path gotchas (byte-exact / empty-line anchor only when a blank line exists / Add+Update in the same patch / empty file / lone `+` lines do not replace). +- 
`APPLY_PATCH_TOOL_DESCRIPTION_FOR_CHAT` (custom tool description extended): adds top-level tool selection guidance, the BYTE-EXACT matching rule, 3 positive examples (modify line / Add File / multi-hunk Update), and an anti-pattern reminder ("Do NOT pass raw source code"). +- `APPLY_PATCH_INPUT_DESCRIPTION_FOR_CHAT` (parameter-level description mirror): some providers down-weight the tool-level description in long histories, so the parameter level carries a compact version of the byte-exact + anti-pattern rules to keep them visible. +- New `crates/adapters/src/responses/apply_patch_v4a_reference.md`: verbatim mirror of upstream L277-L351, with an adapter note at the top that explicitly overrides the "shell command" wording with "function-call tool". Attribution is added in sync to the NOTICE file + ACKNOWLEDGEMENTS.md + the Chinese and English README credits sections. + +The size of the success-rate improvement is pending round2 data from real-device regression testing after this PR is pushed; concrete numbers will then be added to the PR description / later release notes. + ## v2.1.6 — 2026-05-12 **Key fixes**: MiniMax `role=system` failing the whole request with 400 (close #139) / grok_web multi-turn history made complete (`assistant.tool_calls` flatten + `session_cache` foot-gun ruled out at the type level) / cloud_code (Gemini OAuth) multi-turn history silent-loss prod bug. diff --git a/docs/investigation/protocol-conversion-3way-comparison.md b/docs/investigation/protocol-conversion-3way-comparison.md index 5c0fe1e3..4ca29487 100644 --- a/docs/investigation/protocol-conversion-3way-comparison.md +++ b/docs/investigation/protocol-conversion-3way-comparison.md @@ -24,7 +24,7 @@ | instructions(dict.text/content) | ✓ | ✓ | ✓ | | | max_output_tokens → max_tokens | ✓ | ✓ | ✓ | | | tools.function | ✓ | ✓ | ✓ | a missing `type` in parameters is auto-filled as object | -| tools.custom → function (input:string) | ✓ | partial | ✓ | litellm takes a single apply_patch branch | +| tools.custom → function (input:string) | ✓ | partial | ✓ | litellm takes a single apply_patch branch; since PR #236 Rust special-cases apply_patch and rewrites its description, extended further post-#236 (full upstream V4A walkthrough borrowed verbatim + BYTE-EXACT rule + 3 positive examples + 5 gotchas + top-level tool selection guidance) | | tools.web_search/file_search/mcp/computer_use etc. dropped | ✓ | ✗ (translated) | ✓ | not needed by Codex | | tool_choice "auto"/"none"/"required" | ✓ | ✓ | ✓ | | | tool_choice {type:function, function:{name}} | ✓ | ✓ | ✓ | | @@ -32,6 +32,8 @@ | input.message (including multimodal blocks) | ✓ | ✓ | ✓ | | | input.function_call | ✓ | ✓ | ✓ | | | 
input.function_call_output (call_id falls back to the aliases tool_call_id/id) | ✓ | ✓ | ✓ | | +| input.custom_tool_call (apply_patch multi-turn replay → assistant.tool_calls) | ✗ | ✗ | ✓ | added in PR #236; `call_id`/`id` fallback; arguments serialized as a `{"input":""}` JSON string | +| input.custom_tool_call_output (apply_patch result replay → role:tool) | ✗ | ✗ | ✓ | added in PR #236; three-level `call_id`/`tool_call_id`/`id` fallback; output goes through the same normalize_tool_output_for_context | | input.input_image / input_file / input_audio / input_video | ✓ | ✓ | ✓ | | | input.reasoning (opaque) attached to the next assistant message | ✓ | partial | ✓ | Rust uses a single-space placeholder | | consecutive user / assistant messages merged | ✓ | partial | ✓ | |
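
Reviewer note on the two new `input.custom_tool_call*` rows: the replay lowering they describe can be sketched roughly as below. This is a minimal std-only illustration, not the adapter's code. The function names `json_escape`, `wrap_patch_as_arguments`, and `resolve_call_id` are hypothetical, and the real implementation presumably uses a JSON library rather than hand-rolled escaping. The sketch assumes the patch text is wrapped under a single `input` key and that `call_id` takes precedence over `id`.

```rust
// Hypothetical sketch: none of these names come from the adapter itself.

/// Escape a string for embedding inside a JSON string literal (std-only).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Wrap a raw V4A patch as the `arguments` string of a chat function call,
/// i.e. a JSON object with a single `input` key (assumed shape).
fn wrap_patch_as_arguments(patch: &str) -> String {
    format!("{{\"input\":\"{}\"}}", json_escape(patch))
}

/// Pick the tool-call id for replay: prefer `call_id`, fall back to `id`.
fn resolve_call_id<'a>(call_id: Option<&'a str>, id: Option<&'a str>) -> Option<&'a str> {
    call_id.or(id)
}
```

Keeping the replayed `arguments` in the same wrapped shape as the first-turn lowering means the upstream chat model sees one consistent tool-call format across turns.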