Skip to content

Yuyuyu/relocation#13

Closed
vhjcgja-789 wants to merge 4 commits intomainfrom
yuyuyu/relocation
Closed

Yuyuyu/relocation#13
vhjcgja-789 wants to merge 4 commits intomainfrom
yuyuyu/relocation

Conversation

@vhjcgja-789
Copy link
Copy Markdown
Contributor

@vhjcgja-789 vhjcgja-789 commented May 6, 2026

接入重定位

PR摘要:集成重定位功能

概述

本PR将外部重定位仓库 rmcs_relocation 集成到 rmcs-navigation 中,新增重定位服务调用链与 Lua 接口,支持初始(Initial)、局部(Local)和广域(Wide)三种重定位模式,并在CI中检出该依赖仓库以供构建。

主要改动

依赖与构建

  • package.xml:新增依赖 <depend>rmcs_relocation</depend>
  • CMakeLists.txt:
    • cmake_minimum_required 改为 3.28。
    • 新增 find_package(rmcs_relocation REQUIRED)。
    • 将 ${rmcs_relocation_INCLUDE_DIRS} 加入 target_include_directories,${rmcs_relocation_LIBRARIES} 加入 target_link_libraries。
  • CI (.github/workflows/ci.yml):新增步骤克隆 rmcs_relocation 仓库到 RMCS workspace(/tmp/RMCS/...),并在 colcon 构建前包含该包。

C++实现

  • src/cxx/util/localization/engine.hh / engine.cc

    • 全新基于 rmcs_relocation 服务的 Localization 实现(替换此前基于 PCL/NDT 的实现)。
    • 新增类型:
      • enum RelocalizeState { IDLE, IN_FLIGHT, SUCCEEDED, FAILED }
      • enum RelocalizeMode { Initial = 0, Local = 1, Wide = 2 }
      • struct RelocalizeStatus(包含 state、success、message、fitness_score、confidence、估计位姿等字段)
    • 新增 public API:
      • Localization(Config{ rclcpp::Node&, service_name, request_timeout_sec })
      • bool relocalize(RelocalizeMode mode, double x, double y, double yaw) — 发起重定位请求(异步,返回是否接收)
      • RelocalizeStatus relocalize_status() const — 获取当前状态快照
    • 实现细节:
      • 使用 rclcpp::Client<rmcs_relocation::srv::Relocalize> 发起异步请求,维护 Session 管理 pending 请求 / 超时 / 响应处理,过滤迟到响应并转换服务响应为 RelocalizeStatus。
  • src/cxx/component.cc

    • 引入 Localization 头并添加成员 std::unique_ptr localization 和 enable 标志 relocalization_enabled。
    • 基于参数条件实例化本地化引擎(由 enable_relocalization 控制,若不可用则调用被忽略)。
    • 在 Lua API 注入中新增重定位绑定(通过 bind_relocalize):
      • relocalize_initial, relocalize_local, relocalize_wide:调用 localization->relocalize(...)(若被禁用或 localization 为空则返回 false)。
      • relocalize_status:将 RelocalizeStatus 转换为 Lua table 返回(包含 state、success、message、fitness_score、confidence 以及估计位姿字段)。

Lua层

  • src/lua/action.lua

    • 新增重定位流程管理函数 send_and_await(发起请求并轮询 relocalize_status 直至终态)。
    • 新增公开动作方法:
      • action:relocalize_initial(x, y, yaw)
      • action:relocalize_local() — 使用 blackboard.user 作为 validator(缺失时跳过)
      • action:relocalize_wide() — 使用 blackboard.user 作 prior,缺失时回退到原点
      • action:try_relocalize_local_then_wide() — 局部重定位尝试两次失败后降级为广域重定位
      • action:relocalize_status() — 返回 api.relocalize_status()
    • 行为细节:局部模式在 validator 缺失时跳过;send_and_await 会在 IN_FLIGHT 时轮询并根据 SUCCEEDED/FAILED 返回结果。
  • src/lua/api.lua

    • 在 Lua API 注释/类型注解中添加:
      • relocalize_initial(x, y, yaw): boolean
      • relocalize_local(x, y, yaw): boolean
      • relocalize_wide(x, y, yaw): boolean
      • relocalize_status(): { state, success, message, fitness_score, confidence, ... }

移除/替换

  • 移除了原有基于 PCL/NDT 的点云加载、收集与 NDT 本地化实现,替换为基于 rmcs_relocation 服务的实现(engine.hh/engine.cc 已实现新的轻量 API)。

代码变更量(大致)

  • package.xml:+2/-1
  • CMakeLists.txt:+4/-1
  • .github/workflows/ci.yml:+8/-0
  • src/cxx/component.cc:+52/-1
  • src/cxx/util/localization/engine.hh:+29/-17
  • src/cxx/util/localization/engine.cc:+194/-64
  • src/lua/action.lua:+72/-0
  • src/lua/api.lua:+4/-0

注意点 / 后续检查建议

  • 确认运行时 rmcs_relocation 服务接口与请求/响应字段兼容(服务名与消息常量 mode 映射已在代码中使用)。
  • 验证在服务不可用或超时时的行为(Localization 使用超时定时器与 in-flight 保护)。
  • CI 中拉取的 rmcs_relocation 是否在该工作流环境能正确构建并被 colcon 发现。

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 10cf6e7d-86e5-4d66-8a0f-ce498cfaa4ec

📥 Commits

Reviewing files that changed from the base of the PR and between 88076fd and b04bac7.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml
  • CMakeLists.txt

Walkthrough

本PR引入基于 ROS2 服务的重定位支持:新增 rmcs_relocation 依赖、替换旧的 PCL/NDT 实现为以 Relocalize 服务驱动的 Localization,添加 C++ 与 Lua 的 API 绑定及相应构建/CI 调整(CMake、package.xml、CI 步骤)。

Changes

重定位功能集成(单一变更 DAG)

Layer / File(s) Summary
数据结构与 API 定义
src/cxx/util/localization/engine.hh
新增 RelocalizeStateRelocalizeModeRelocalizeStatus;引入 Localization::Config 构造参数;公开 relocalize(mode,x,y,yaw)relocalize_status() 方法,移除旧的采集/本地化多步 API。
服务协议与工具函数
src/cxx/util/localization/engine.cc
新增模式到消息映射、状态构建工具与响应转换函数;提供将 (x,y,yaw) 转为 Pose 的助手。
会话管理与实现
src/cxx/util/localization/engine.cc
新增 Session 用于管理飞行请求、互斥/超时、结果快照;实现 Localization::Impl 使用 ROS2 Relocalize 客户端,包含 send()、超时装配与响应回调逻辑。
公开接口实现
src/cxx/util/localization/engine.cc
实现 Localization::relocalize()(发起服务请求)与 Localization::relocalize_status()(返回 Session 快照)。
组件集成与 Lua 绑定
src/cxx/component.cc
添加 Localization 成员与 relocalization_enabled 参数读取;根据参数实例化 Localization 并在 Lua API 中绑定 relocalize_initialrelocalize_localrelocalize_widerelocalize_status,含 guard helper。
Lua 操作与类型注解
src/lua/action.lua, src/lua/api.lua
新增 send_and_await 辅助函数及四个 action:relocalize_initialrelocalize_localrelocalize_widetry_relocalize_local_then_wide;在 api.lua 添加相应类型注解字段(包括 relocalize_status 返回表结构)。
构建与依赖
CMakeLists.txt, package.xml
在 CMake 中新增 find_package(rmcs_relocation REQUIRED) 并链接/包含其头库;在 package.xml 中新增 <depend>rmcs_relocation</depend>
CI 工作流
.github/workflows/ci.yml
新增 CI 步骤以克隆 rmcs_relocation 仓库(替换原先的 tar 解包步骤)。

Sequence Diagram(s)

sequenceDiagram
    participant Lua as Lua Action
    participant Nav as Navigation Component
    participant Loc as Localization (client)
    participant Service as Relocalize ROS2 Service

    Lua->>Nav: call relocalize_initial(x,y,yaw)
    Nav->>Loc: relocalize(Initial, x,y,yaw)
    Loc->>Service: async request (mode, pose)
    activate Service
    Note over Service: 处理重定位请求并计算指标
    Service-->>Loc: response (success, pose, fitness, confidence)
    deactivate Service

    Loc->>Loc: 更新 Session 状态(SUCCEEDED / FAILED)
    Lua->>Loc: poll relocalize_status()
    Loc-->>Lua: RelocalizeStatus {state, success, message, fitness_score, confidence, pose}
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 我在草丛轻嗅风,服务来替旧引擎,

初次、本地、广域一声问,
会话守护忙不惊,超时与回调都记心,
Lua 轻唤,导航又回春。

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.59% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The pull request title "Yuyuyu/relocation" is vague and does not clearly convey the actual changes. It appears to be a branch name rather than a descriptive title explaining what was integrated or modified. 改进标题以清晰描述主要变化,例如"集成rmcs_relocation以支持重定位功能"或"添加初始化、本地和广域重定位支持"。
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch yuyuyu/relocation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
src/lua/api.lua (1)

23-23: 💤 Low value

relocalize_status 类型注解漏写 estimated_ 字段。*

src/cxx/component.cc 的绑定(114-140 行)实际把 estimated_x/y/zestimated_qx/qy/qz/qw 一并塞进了返回 table;这里的 @field 只声明了 5 个字段,会让下游在 try_relocalize_local_then_wide 等流程里用 LSP 取这些坐标时丢失补全/类型校验。

📝 建议补全注解
---- `@field` relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number }
+--- `@field` relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number, estimated_x: number, estimated_y: number, estimated_z: number, estimated_qx: number, estimated_qy: number, estimated_qz: number, estimated_qw: number }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/api.lua` at line 23, The relocalize_status Lua type annotation is
missing the estimated pose fields; update the `@field` declaration for
relocalize_status in src/lua/api.lua to include estimated_x, estimated_y,
estimated_z and estimated_qx, estimated_qy, estimated_qz, estimated_qw (all as
numbers) so LSP/completion and type checks used by callers like
try_relocalize_local_then_wide can see the estimated_* fields returned by the
C++ binding (component.cc bind code around the relocalize return).
src/lua/action.lua (1)

8-13: 💤 Low value

Minor: Lua 侧 RelocalizeState 枚举值与 C++ enum class RelocalizeState 是手工镜像的,存在漂移风险。

如果 src/cxx/util/localization/engine.hhRelocalizeState 顺序/值改动(目前是 IDLE=0/IN_FLIGHT=1/SUCCEEDED=2/FAILED=3,经 static_cast<int> 暴露),这里需要同步修改,否则 send_and_await 的轮询判断会静默失效。建议在 C++ 绑定层额外暴露一个常量表(例如在 relocalize_status() 返回中包含 state_name 或单独 api.relocalize_state 表),或在该文件顶部加一行强约束注释引用 C++ 枚举位置。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/action.lua` around lines 8 - 13, 当前 Lua 文件中硬编码的 RelocalizeState 枚举可能与
C++ enum class RelocalizeState 不同步,导致 send_and_await 的轮询判断静默失败;请修改绑定层以从 C++
暴露一个常量表(例如在 relocalize_status() 返回中附带 state_name 或单独导出
api.relocalize_state),然后在本 Lua 文件中用该导出的表替代硬编码的 RelocalizeState 常量(并更新所有引用该表的地方,如
send_and_await 的状态判断),或者至少在文件顶部添加一条强约束注释指向 C++ 枚举定义位置以提醒同步;确保使用的标识符包括
RelocalizeState 和 send_and_await 以便定位修改点。
src/cxx/util/localization/engine.cc (1)

168-173: 💤 Low value

可简化冗余的服务可用性检查

service_is_ready() 已是非阻塞探测;后置 wait_for_service(std::chrono::seconds(0)) 亦为非阻塞(零超时时立刻返回,等价于再次检查当前状态),故该 AND 条件冗余。建议直接保留单一检查:

♻️ 建议简化
-        if (!session->client->service_is_ready()
-            && !session->client->wait_for_service(std::chrono::seconds(0))) {
+        if (!session->client->service_is_ready()) {
             warn("relocalize service unavailable: {}", config.service_name);
             session->end(failed_status("service unavailable: " + config.service_name));
             return false;
         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.cc` around lines 168 - 173, The service
availability check is redundant because session->client->service_is_ready() and
session->client->wait_for_service(std::chrono::seconds(0)) are both
non-blocking; remove the duplicate check and keep a single call (preferably
session->client->service_is_ready()) in the relocalization precondition, leaving
the warn(...) and session->end(failed_status(...)) behavior unchanged; update
the conditional that currently references service_is_ready() and
wait_for_service(...) so it only calls session->client->service_is_ready().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cxx/util/localization/engine.cc`:
- Around line 148-160: arm_timeout installs a timer that can fire before the
session's request_id is written, so the timer lambda may see *pending_id == -1
and call s->end(...) for the wrong id; to fix, delay creating/assigning the
timer until after the request id is known (i.e., call arm_timeout or create the
create_wall_timer and assign s->timeout_timer only after
track(result.request_id) / after writing the real request id), ensure the timer
closure captures the shared_ptr pending_id that points to the finalized id, and
keep the existing cancellation logic (s->timeout_timer->cancel() /
s->timeout_timer = ...) around the new placement so the timeout always observes
the correct request id when invoking s->end(failed_status(...), *pending_id).
- Around line 180-203: The async callback can run before *pending_id is set,
causing the lambda to call s->end(..., -1) and have Session::pending_id still
nullopt so the response is dropped; fix by making request_id assignment and
session tracking atomic before the call returns — i.e., obtain result.request_id
and set *pending_id and session->pending_id (or call
session->track(result.request_id)) before yielding control where the callback
may execute (move the assignment of *pending_id and the call to session->track
to occur synchronously prior to or as part of invoking
session->client->async_send_request, or protect the sequence with the same lock
used by end()), referencing pending_id, async_send_request, result.request_id,
session->track, and s->end.

In `@src/lua/action.lua`:
- Around line 92-95: When fn(x,y,yaw) returns false, do not always log "reloc
skip %s (in_flight)" and return false,nil; instead call api.relocalize_status()
to fetch the actual terminal status, include that status in the warning (use the
existing self:warn call) and return false along with the status so
try_relocalize_local_then_wide can distinguish "in_flight" from "service
unavailable" (i.e., replace the unconditional warn/return in the fn failure
branch with a call to api.relocalize_status(), log the returned status, and
return it).

---

Nitpick comments:
In `@src/cxx/util/localization/engine.cc`:
- Around line 168-173: The service availability check is redundant because
session->client->service_is_ready() and
session->client->wait_for_service(std::chrono::seconds(0)) are both
non-blocking; remove the duplicate check and keep a single call (preferably
session->client->service_is_ready()) in the relocalization precondition, leaving
the warn(...) and session->end(failed_status(...)) behavior unchanged; update
the conditional that currently references service_is_ready() and
wait_for_service(...) so it only calls session->client->service_is_ready().

In `@src/lua/action.lua`:
- Around line 8-13: 当前 Lua 文件中硬编码的 RelocalizeState 枚举可能与 C++ enum class
RelocalizeState 不同步,导致 send_and_await 的轮询判断静默失败;请修改绑定层以从 C++ 暴露一个常量表(例如在
relocalize_status() 返回中附带 state_name 或单独导出 api.relocalize_state),然后在本 Lua
文件中用该导出的表替代硬编码的 RelocalizeState 常量(并更新所有引用该表的地方,如 send_and_await
的状态判断),或者至少在文件顶部添加一条强约束注释指向 C++ 枚举定义位置以提醒同步;确保使用的标识符包括 RelocalizeState 和
send_and_await 以便定位修改点。

In `@src/lua/api.lua`:
- Line 23: The relocalize_status Lua type annotation is missing the estimated
pose fields; update the `@field` declaration for relocalize_status in
src/lua/api.lua to include estimated_x, estimated_y, estimated_z and
estimated_qx, estimated_qy, estimated_qz, estimated_qw (all as numbers) so
LSP/completion and type checks used by callers like
try_relocalize_local_then_wide can see the estimated_* fields returned by the
C++ binding (component.cc bind code around the relocalize return).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9364807-486b-41b2-aef4-ab117ab96179

📥 Commits

Reviewing files that changed from the base of the PR and between a0bcb9f and 88076fd.

📒 Files selected for processing (6)
  • package.xml
  • src/cxx/component.cc
  • src/cxx/util/localization/engine.cc
  • src/cxx/util/localization/engine.hh
  • src/lua/action.lua
  • src/lua/api.lua

Comment on lines +148 to +160
auto arm_timeout(const std::shared_ptr<Session>& s, const std::shared_ptr<std::int64_t>& pending_id)
-> void {
const auto timeout = std::chrono::duration_cast<std::chrono::nanoseconds>(
std::chrono::duration<double>{std::max(0.1, config.request_timeout_sec)});
auto timer = node.create_wall_timer(timeout, [s, pending_id]() {
if (s->end(failed_status("request timeout: " + s->service_name), *pending_id))
RCLCPP_WARN(s->logger, "relocalize request timeout: %s", s->service_name.c_str());
});
auto lock = std::scoped_lock{s->mutex};
if (s->timeout_timer)
s->timeout_timer->cancel();
s->timeout_timer = std::move(timer);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Minor: arm_timeout 在拿到 request_id 之前装填,超时回调里读到 *pending_id == -1

定时器在 arm_timeout 内即开始计时;如果在 *pending_id = result.request_id(第 201 行)之前先触发(极端情况下 0.1s 也可能赶上 executor 抖动 / 进程冻结恢复),s->end(failed_status(...), -1) 会与 Session::pending_id(此时仍为 nullopt 或已被 track 写入真实 id)做对比,导致超时无法生效或对错误的请求生效。

建议把 arm_timeout 推迟到 track(result.request_id) 之后再装上,这样定时器闭包看到的 *pending_id 始终是真实 id;上一条 critical 修复同时解决此处。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.cc` around lines 148 - 160, arm_timeout
installs a timer that can fire before the session's request_id is written, so
the timer lambda may see *pending_id == -1 and call s->end(...) for the wrong
id; to fix, delay creating/assigning the timer until after the request id is
known (i.e., call arm_timeout or create the create_wall_timer and assign
s->timeout_timer only after track(result.request_id) / after writing the real
request id), ensure the timer closure captures the shared_ptr pending_id that
points to the finalized id, and keep the existing cancellation logic
(s->timeout_timer->cancel() / s->timeout_timer = ...) around the new placement
so the timeout always observes the correct request id when invoking
s->end(failed_status(...), *pending_id).

Comment on lines +180 to +203
auto pending_id = std::make_shared<std::int64_t>(-1);
arm_timeout(session, pending_id);
auto s = session;
auto result = session->client->async_send_request(
std::move(request), [s, pending_id](RelocalizeClient::SharedFuture future) {
auto response = future.get();
if (!response) {
RCLCPP_WARN(s->logger, "relocalize response is null");
s->end(failed_status("response is null"), *pending_id);
return;
}
if (!response->success) {
RCLCPP_WARN(s->logger, "relocalize failed: %s", response->message.c_str());
s->end(status_from_response(*response, false), *pending_id);
return;
}
RCLCPP_INFO(
s->logger, "relocalize success: score=%.4f, confidence=%.3f",
response->fitness_score, response->confidence);
s->end(status_from_response(*response, true), *pending_id);
});
*pending_id = result.request_id;
session->track(result.request_id);
return true;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

rclcpp Client async_send_request callback thread invocation timing multi-threaded executor

💡 Result:

In rclcpp, when using Client::async_send_request(request, callback) with a user-provided callback, the callback is scheduled as regular work in the executor and invoked by one of the executor's threads when the response arrives. This applies to MultiThreadedExecutor, where the exact thread depends on scheduling and callback group rules (MutuallyExclusive or Reentrant).[1][2][3][4] The response is detected via the RMW layer during executor's wait_for_work (rcl_wait/take), triggering handle_response in the executor thread, which sets the future and queues the user callback if provided.[1][5][6] Timing: The callback executes promptly after response receipt, subject to executor load, wait timeouts, and callback group contention. In MultiThreadedExecutor, multiple threads poll waitsets concurrently; whichever detects the response executes the associated work.[3] Without user callback (just future), use spin_until_future_complete to block until handled.[1] Official docs confirm callbacks from async_send_request are "scheduled in the executor".[1][2]

Citations:


🏁 Script executed:

# First, find the engine.cc file and examine the Session::end() implementation
find . -name "engine.cc" -o -name "engine.h" | head -20

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 116


🏁 Script executed:

# Look at the directory structure to understand the layout
ls -la src/cxx/util/localization/ 2>/dev/null || find . -type f -name "*engine*" | head -10

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 308


🏁 Script executed:

# Search for Session struct and end() method implementation
rg -A 20 "bool\s+end\(" --type cpp --type h

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 60


🏁 Script executed:

# Search for the track() method implementation
rg -A 5 "void\s+track\(" --type cpp --type h

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 60


🏁 Script executed:

# Search for arm_timeout implementation
rg -B 2 -A 15 "arm_timeout" --type cpp --type h

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 3180


🏁 Script executed:

cat -n src/cxx/util/localization/engine.cc | head -220

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 9455


🏁 Script executed:

cat -n src/cxx/util/localization/engine.hh

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 1638


🏁 Script executed:

# Check for brace initialization compliance in the snippet lines 180-203
sed -n '175,205p' src/cxx/util/localization/engine.cc | grep -E "(make_shared|auto.*=)"

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 370


🏁 Script executed:

# Search for any other async_send_request usages in the codebase to see if this pattern is repeated
rg "async_send_request" --type cpp --type h -B 3 -A 10

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 1368


关键:异步回调与 pending_id 赋值存在竞态,导致响应被丢弃。

session->client->async_send_request(...) 在第 183 行调用后即可在 executor 线程触发回调(尤其在多线程 executor 中),但 *pending_id = result.request_id 的赋值延迟到第 201 行。若回调在赋值前先执行:

  • lambda 内 *pending_id 仍为 -1,调用 s->end(status, -1)
  • 此时 Session::pending_id 仍为 std::nullopttrack() 未执行),触发 end() 中的过滤条件(行 104):pending_id != *expected_idnullopt != -1 → true,返回 false;
  • 响应被静默丢弃last_status 滞留 IN_FLIGHT,最终仅由 arm_timeout 触发超时失败。

建议在 async_send_request 前同步设置 *pending_idsession->pending_id,确保回调读取时已是正确的 request_id:

修复思路
        auto pending_id = std::make_shared<std::int64_t>(-1);
-       arm_timeout(session, pending_id);
        auto s          = session;
+       auto result     = session->client->async_send_request(...)
+           [返回后同步赋值]
+       *pending_id = result.request_id;
+       session->track(result.request_id);
+       arm_timeout(session, pending_id);  // 在赋值后创建超时

或在持锁状态下完成「dispatch + 记录 id」的原子化,防止回调抢先读取 -1。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.cc` around lines 180 - 203, The async
callback can run before *pending_id is set, causing the lambda to call
s->end(..., -1) and have Session::pending_id still nullopt so the response is
dropped; fix by making request_id assignment and session tracking atomic before
the call returns — i.e., obtain result.request_id and set *pending_id and
session->pending_id (or call session->track(result.request_id)) before yielding
control where the callback may execute (move the assignment of *pending_id and
the call to session->track to occur synchronously prior to or as part of
invoking session->client->async_send_request, or protect the sequence with the
same lock used by end()), referencing pending_id, async_send_request,
result.request_id, session->track, and s->end.

Comment thread src/lua/action.lua
Comment on lines +92 to +95
if not fn(x, y, yaw) then
self:warn(string.format("reloc skip %s (in_flight)", mode))
return false, nil
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Minor: fn(...)==false 时日志/返回值与 C++ 真实失败原因不一致。

src/cxx/util/localization/engine.ccImpl::send 在以下两种情况都返回 false

  1. session->begin() 失败(确实 in_flight);
  2. service 不可用(且会同步把 last_status 标为 FAILED:"service unavailable: ...")。

这里直接打 "reloc skip %s (in_flight)"return false, nil,会让上层 try_relocalize_local_then_wide 也丢掉「service unavailable」的诊断状态,无法区分这两类。建议在 fn 返回 false 时再 api.relocalize_status() 取一次末态返回,便于上层告警/降级判断:

📝 建议
 	if not fn(x, y, yaw) then
-		self:warn(string.format("reloc skip %s (in_flight)", mode))
-		return false, nil
+		local st = api.relocalize_status()
+		self:warn(string.format("reloc skip %s | %s", mode, tostring(st and st.message or "in_flight")))
+		return false, st
 	end
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/action.lua` around lines 92 - 95, When fn(x,y,yaw) returns false, do
not always log "reloc skip %s (in_flight)" and return false,nil; instead call
api.relocalize_status() to fetch the actual terminal status, include that status
in the warning (use the existing self:warn call) and return false along with the
status so try_relocalize_local_then_wide can distinguish "in_flight" from
"service unavailable" (i.e., replace the unconditional warn/return in the fn
failure branch with a call to api.relocalize_status(), log the returned status,
and return it).

@vhjcgja-789 vhjcgja-789 closed this May 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant