Skip to content

Reapply "Yuyuyu/relocation"#16

Open
vhjcgja-789 wants to merge 11 commits intomainfrom
yuyuyu/reloc
Open

Reapply "Yuyuyu/relocation"#16
vhjcgja-789 wants to merge 11 commits intomainfrom
yuyuyu/reloc

Conversation

@vhjcgja-789
Copy link
Copy Markdown
Contributor

@vhjcgja-789 vhjcgja-789 commented May 7, 2026

接入重定位——重新提了PR

重定位功能集成 PR 重新提交

概述

本 PR 重新应用了重定位功能集成,涉及与 rmcs_relocation 依赖库的接入。项目中 main 分支之前的错误操作已被撤销。

主要变更

依赖项和构建系统

  • CI 工作流 (.github/workflows/ci.yml):在测试步骤中新增一个步骤,克隆 rmcs_relocation 依赖到 RMCS 工作区 (/tmp/RMCS/rmcs_ws/src/skills/rmcs_relocation),使用浅克隆 (--depth 1)。

  • CMakeLists.txt

    • CMake 最低版本从 3.30 降至 3.28
    • 添加 rmcs_relocation 依赖并配置其包含目录和链接库
  • package.xml:添加了 rmcs_relocation 的运行时依赖声明

C++ 实现层面

本地化引擎重构 (src/cxx/util/localization/)

  • engine.hhengine.cc:完全重构本地化实现,从基于 NDT/点云的方案改为基于 ROS2 服务的重定位引擎
    • 引入 RelocalizeStateRelocalizeMode 枚举
    • 添加 RelocalizeStatus 结构体,包含位置、旋转、适应度和置信度信息
    • 简化 Localization::Config 为仅包含服务名称和超时配置
    • 移除旧的 start_collecting()start_localizing() 方法
    • 新增 relocalize(mode, x, y, yaw)relocalize_status() 接口
    • 实现会话管理和超时控制机制

导航组件 (src/cxx/component.cc)

  • 新增可选的 Localization 实例和 relocalization_enabled 参数
  • 在构造期间根据 enable_relocalization 参数(默认启用)初始化重定位功能
  • 为 Lua 注册三个重定位函数:relocalize_initialrelocalize_localrelocalize_wide
  • 新增 relocalize_status() Lua API,用于查询重定位状态

Lua 脚本层面

动作层 (src/lua/action.lua)

  • 新增 RelocalizeState 枚举
  • 实现通用的"发送 → 轮询 → 终止"控制流(send_and_await 辅助函数)
  • 新增三个公开方法:
    • relocalize_initial(x, y, yaw):执行初始重定位
    • relocalize_local():进行本地重定位,使用黑板中的用户位置作为初始值
    • relocalize_wide():进行全局重定位,如无有效位置则使用原点
  • 新增 relocalize_status() 方法用于查询重定位状态

API 层 (src/lua/api.lua)

  • 扩展 Api 类型注解,新增重定位相关字段:relocalize_initialrelocalize_localrelocalize_widerelocalize_status

主要特点

  • 采用 ROS2 服务架构进行重定位,提高了系统的模块化和可配置性
  • 支持三种重定位模式(初始、本地、全局),满足不同的定位场景
  • 完整的 Lua API 绑定,使上层脚本能够便捷调用重定位功能
  • 实现了请求超时和会话管理,增强了系统的鲁棒性

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 7, 2026

概述

该拉取请求通过ROS2服务集成了rmcs_relocation包,添加了从Lua层到底层的重定位功能,包括构建依赖声明、C++接口定义、服务驱动的实现、组件级集成和Lua工作流编排。

更改

重定位服务集成

层级 / 文件(s) 摘要
API契约定义
src/cxx/util/localization/engine.hh
替换点云收集接口为relocalize(RelocalizeMode, x, y, yaw) → boolrelocalize_status() → RelocalizeStatus。引入RelocalizeStateRelocalizeMode枚举和包含状态、成功标志、适应度及位姿估计的RelocalizeStatus结构。简化Config以仅保留节点引用、服务名称和请求超时。
服务客户端实现
src/cxx/util/localization/engine.cc
使用ROS2服务客户端替代NDT/PCL实现。添加Session结构以序列化在途请求,引入助手函数用于姿态转换和响应映射,实现基于壁计时器的超时,通过请求ID过滤延迟响应,提供异步服务回调处理。
组件配置与绑定
src/cxx/component.cc
新增std::unique_ptr<Localization>成员和relocalization_enabled标志。在构造器中从ROS参数读取配置(enable_relocalization默认为真),并条件性地初始化Localization实例。通过共享lambda在Lua API中注册三个重定位函数和状态查询,执行在途门控和禁用检查。
构建与部署配置
.github/workflows/ci.yml, CMakeLists.txt, package.xml
CI工作流添加rmcs_relocation的浅克隆至RMCS工作空间。CMakeLists.txt降低CMake最低版本需求(3.30→3.28),声明find_package(rmcs_relocation REQUIRED)并链接其头文件与库。package.xml添加运行时依赖。
Lua工作流编排
src/lua/action.lua, src/lua/api.lua
action.lua实现send_and_await()通用机制以发送请求、轮询状态直至终止状态。添加三个公开方法relocalize_initial()relocalize_local()(带验证器检查)、relocalize_wide()(备用位姿逻辑)及状态透传。api.lua扩展类型注解以暴露新的重定位接口。

序列图

sequenceDiagram
    participant Lua as Lua脚本
    participant Action as action模块
    participant Component as C++组件
    participant Localization as Localization
    participant Service as rmcs_relocation<br/>服务
    
    Lua->>Action: relocalize_initial(x,y,yaw)
    Action->>Action: send_and_await()检查在途门控
    Action->>Component: api.relocalize_initial()
    Component->>Localization: relocalize(mode,x,y,yaw)
    Localization->>Service: 发送异步服务请求
    Service-->>Localization: 异步响应回调
    Localization->>Localization: 映射响应→RelocalizeStatus
    
    loop 轮询直至终止
        Action->>Component: api.relocalize_status()
        Component->>Localization: relocalize_status()
        Localization-->>Component: 返回当前状态快照
        Component-->>Action: 状态表
        alt 状态==SUCCEEDED或失败
            Action->>Action: 返回(ok, status)
        else 状态==IN_FLIGHT
            Action->>Action: 继续轮询
        end
    end
    
    Action-->>Lua: (success_bool, status_table)
Loading

评估代码审查工作量

🎯 3 (中等) | ⏱️ ~20 分钟

🐰 一声短促的跳跃声,
重定位的魔法降临——
服务相互握手,Lua轻声呼唤,
导航系统找到自己,
在迷茫的坐标里重生!

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch yuyuyu/reloc

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (7)
package.xml (1)

31-33: 💤 Low value

<depend> 顺序与缩进微调建议

新增的 <depend>rmcs_relocation</depend> 被放在 <exec_depend> 组之后,与上方其他 <depend> 项分隔开了;同时 Line 33 末尾有多余空格。建议把它移回到上方 <depend> 组(紧挨 rmcs_msgs/nav2_msgs),以保持一致的分组顺序。

♻️ 建议改动
   <depend>rmcs_msgs</depend>
   <depend>nav2_msgs</depend>
+  <depend>rmcs_relocation</depend>

   <exec_depend>launch</exec_depend>
   ...
   <exec_depend>rmcs_local_map</exec_depend>
-  <depend>rmcs_relocation</depend>
-   
   <export>
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@package.xml` around lines 31 - 33, The new <depend>rmcs_relocation</depend>
entry is misplaced after the <exec_depend>rmcs_local_map</exec_depend> and has a
trailing space; move the <depend>rmcs_relocation</depend> line back into the
main <depend> group next to rmcs_msgs and nav2_msgs to preserve grouping order
and remove the extra trailing whitespace at the end of the line; ensure
indentation matches the other <depend> entries.
.github/workflows/ci.yml (1)

70-76: 💤 Low value

CI 跟随 rmcs_relocationmain 分支,非确定性构建

这里 --branch main --depth 1 直接拉取上游 rmcs_relocation/main。一旦上游有非兼容改动,与本仓库无关的 PR 也会突然在 CI 里失败。短期可以接受,长期建议改成 pin 一个稳定的 tag/commit,并在 env: 里以 RMCS_RELOCATION_REF 暴露便于后续升级。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/ci.yml around lines 70 - 76, The CI step "Clone
rmcs_relocation dependency" currently clones the upstream main branch causing
non-deterministic builds; change it to use an environment variable
RMCS_RELOCATION_REF (exposed in the workflow env:) and pin the clone to that ref
(a stable tag or commit) instead of hardcoding --branch main; update the step to
use RMCS_RELOCATION_REF when cloning (or fetch+checkout that ref) so maintainers
can update the pinned version intentionally and CI remains deterministic.
src/lua/action.lua (2)

8-13: 💤 Low value

Lua 与 C++ 的 RelocalizeState 数值需保持手工同步

这里硬编码了 IDLE=0/IN_FLIGHT=1/SUCCEEDED=2/FAILED=3,与 engine.hhenum class RelocalizeState : std::uint8_t 对齐。两侧任何一边调整顺序都会让 send_and_await 误判终态而不会有任何警告。建议在 C++ 侧通过 make_api_injection 把这几个常量也注入到 Lua(例如 api.RelocalizeState.SUCCEEDED),让 Lua 这里直接消费而不是各写一份。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/action.lua` around lines 8 - 13, The RelocalizeState numeric values
are hardcoded in Lua and must instead consume the constants injected from C++ to
avoid drift; remove the local RelocalizeState table and replace its usages in
this file (e.g., comparisons inside send_and_await and any references to
RelocalizeState.IDLE/IN_FLIGHT/SUCCEEDED/FAILED) with the injected
api.RelocalizeState.* symbols (ensure the C++ side provides these via
make_api_injection so api.RelocalizeState.SUCCEEDED etc. exist), and update any
local references or imports so the code reads from api.RelocalizeState rather
than duplicating the enum.

100-114: ⚡ Quick win

轮询缺少上限,依赖 C++ 端必须按时退出 IN_FLIGHT

while true do ... request:sleep(0.1) end 在 Lua 侧没有任何兜底超时。一旦 C++ 端的 session 因为 engine.cc 的 timer/race 问题卡在 IN_FLIGHT(参见对 engine.cc 的另一条评论),这个协程会永远占着调度。建议加一个 wall-clock 上限(例如 request_timeout_sec * 2 + 安全余量),到时强制返回 false, st

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/action.lua` around lines 100 - 114, The loop polling
api.relocalize_status() has no wall-clock timeout and can spin forever if state
stays IN_FLIGHT; update the relocalization loop in action.lua (the while true
block that calls api.relocalize_status(), checks RelocalizeState, and uses
request:sleep) to track elapsed real time and break after a configured limit
(for example request_timeout_sec * 2 plus a small safety margin), returning
false, st when the timeout is reached and preserving the existing self:warn
logging (include the timeout reason in the message). Ensure the timeout uses a
monotonic/wall-clock timer available to the coroutine and does not rely on the
C++ side to flip IN_FLIGHT.
src/cxx/util/localization/engine.hh (1)

10-21: 💤 Low value

两个枚举的命名风格不一致

RelocalizeStateIDLE/IN_FLIGHT/SUCCEEDED/FAILED(全大写下划线),而紧挨着的 RelocalizeModeInitial/Local/Wide(PascalCase)。同一文件、同一前缀的两个枚举类风格不一致,跨文件使用时容易写错。建议统一其中一种。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.hh` around lines 10 - 21, The two enum
classes RelocalizeState and RelocalizeMode use inconsistent naming styles; pick
a single style (either ALL_CAPS_WITH_UNDERSCORES or PascalCase) and rename the
enum members in one of them to match the other (e.g., change
RelocalizeMode::Initial/Local/Wide to INITIAL/LOCAL/WIDE to match
RelocalizeState, or change RelocalizeState::IDLE/IN_FLIGHT/SUCCEEDED/FAILED to
Idle/InFlight/Succeeded/Failed to match RelocalizeMode); update all
references/usages of these symbols (RelocalizeState, RelocalizeMode and their
members) across the codebase to the chosen style to avoid build errors and keep
naming consistent.
src/cxx/component.cc (1)

277-280: 💤 Low value

has_parameter + get_parameter_or 二选一即可

get_parameter_or<bool>("enable_relocalization", true) 本身就在参数缺失时返回默认值,不需要再额外 has_parameter 三目;现在的写法逻辑等价,但读起来像在做不同的判断。

♻️ 建议简化
-        relocalization_enabled =
-            has_parameter("enable_relocalization")
-                ? get_parameter_or<bool>("enable_relocalization", true)
-                : true;
+        relocalization_enabled = get_parameter_or<bool>("enable_relocalization", true);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/component.cc` around lines 277 - 280, The ternary using has_parameter
with get_parameter_or for relocalization_enabled is redundant; replace the
expression with a single call to get_parameter_or<bool>("enable_relocalization",
true) so relocalization_enabled is set directly from get_parameter_or (remove
has_parameter and the conditional) — update the assignment for
relocalization_enabled in component.cc to rely solely on get_parameter_or.
src/cxx/util/localization/engine.cc (1)

168-173: 💤 Low value

service_is_ready + wait_for_service(0s) 的双重判断基本等价

wait_for_service(std::chrono::seconds(0)) 实际上是一次非阻塞探测,与 service_is_ready() 在效果上重叠。这里写成 !ready && !wait(0) 并没有比单独 !ready 多检测什么状态。如果想要"短超时再等一次",建议给 wait_for_service 一个非零的小窗口(例如 50–200ms),否则可以直接简化成 service_is_ready()

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.cc` around lines 168 - 173, The current
check uses both session->client->service_is_ready() and
wait_for_service(std::chrono::seconds(0)), which overlap because
wait_for_service(0) is a non-blocking probe; simplify by using a single check or
make the wait non-zero: either replace the condition with if
(!session->client->service_is_ready()) { ... } to keep the fast non-blocking
check, or call session->client->wait_for_service(std::chrono::milliseconds(100))
(or another small 50–200ms value) instead of seconds(0) to allow a brief wait
before failing, keeping the same error path that calls warn(...) and
session->end(failed_status("service unavailable: " + config.service_name)).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cxx/component.cc`:
- Around line 134-146: The disabled branch in the
api.set_function("relocalize_status") lambda returns RelocalizeStatus{.message =
"disabled"} which leaves state at the default IDLE; update that branch to return
a status with an explicit terminal state (e.g., RelocalizeState::FAILED or add a
new RelocalizeState::DISABLED) and a suitable message so callers of
relocalize_status() see a consistent terminal state; change the returned
RelocalizeStatus construction in the disabled path inside the relocalize_status
lambda (or add a new enum value if choosing DISABLED) so the "state" field is
not left as IDLE.

In `@src/cxx/util/localization/engine.cc`:
- Around line 148-202: The timeout timer can repeatedly fire and pending_id
races because arm_timeout is called before we obtain and track the actual
request_id; fix send by calling session->client->async_send_request(...) first,
capture result.request_id, call session->track(result.request_id), assign
*pending_id = result.request_id, then call arm_timeout(session, pending_id) so
the timer sees the real id; inside arm_timeout's lambda capture the timer (e.g.,
store the created timer in a shared_ptr and capture it) and call timer->cancel()
at the start of the callback (in addition to s->timeout_timer cancel handling)
to ensure the timer is stopped even if s->end(...) returns false, referencing
functions/symbols: send, async_send_request, result.request_id, session->track,
arm_timeout, create_wall_timer, timeout_timer, pending_id.

In `@src/lua/action.lua`:
- Around line 95-99: The false return from fn(...) in send_and_await conflates
"in-flight" and "service unavailable" cases; modify send_and_await to call
api.relocalize_status() when fn(x,y,yaw) returns false, inspect the returned
status/state from that API, and log a distinct message (e.g., "reloc skip <mode>
(service_unavailable: ...)" vs the existing "reloc skip <mode> (in_flight)") and
propagate the same false,nil return; use the function send_and_await and
api.relocalize_status() as the insertion points to read the accurate state and
include it in the process warning.

In `@src/lua/api.lua`:
- Line 23: relocalize_status 的注解缺少组件实际返回的 estimated_* 字段:在声明
relocalize_status(Lua 注解的 fun() 返回表)中补充 estimated_x、estimated_y、estimated_z 和
estimated_qx、estimated_qy、estimated_qz、estimated_qw 字段并把类型设为 number,以与
src/cxx/component.cc 中实际返回的 table 匹配,从而让 LSP 补全与静态检查识别这些字段。

---

Nitpick comments:
In @.github/workflows/ci.yml:
- Around line 70-76: The CI step "Clone rmcs_relocation dependency" currently
clones the upstream main branch causing non-deterministic builds; change it to
use an environment variable RMCS_RELOCATION_REF (exposed in the workflow env:)
and pin the clone to that ref (a stable tag or commit) instead of hardcoding
--branch main; update the step to use RMCS_RELOCATION_REF when cloning (or
fetch+checkout that ref) so maintainers can update the pinned version
intentionally and CI remains deterministic.

In `@package.xml`:
- Around line 31-33: The new <depend>rmcs_relocation</depend> entry is misplaced
after the <exec_depend>rmcs_local_map</exec_depend> and has a trailing space;
move the <depend>rmcs_relocation</depend> line back into the main <depend> group
next to rmcs_msgs and nav2_msgs to preserve grouping order and remove the extra
trailing whitespace at the end of the line; ensure indentation matches the other
<depend> entries.

In `@src/cxx/component.cc`:
- Around line 277-280: The ternary using has_parameter with get_parameter_or for
relocalization_enabled is redundant; replace the expression with a single call
to get_parameter_or<bool>("enable_relocalization", true) so
relocalization_enabled is set directly from get_parameter_or (remove
has_parameter and the conditional) — update the assignment for
relocalization_enabled in component.cc to rely solely on get_parameter_or.

In `@src/cxx/util/localization/engine.cc`:
- Around line 168-173: The current check uses both
session->client->service_is_ready() and
wait_for_service(std::chrono::seconds(0)), which overlap because
wait_for_service(0) is a non-blocking probe; simplify by using a single check or
make the wait non-zero: either replace the condition with if
(!session->client->service_is_ready()) { ... } to keep the fast non-blocking
check, or call session->client->wait_for_service(std::chrono::milliseconds(100))
(or another small 50–200ms value) instead of seconds(0) to allow a brief wait
before failing, keeping the same error path that calls warn(...) and
session->end(failed_status("service unavailable: " + config.service_name)).

In `@src/cxx/util/localization/engine.hh`:
- Around line 10-21: The two enum classes RelocalizeState and RelocalizeMode use
inconsistent naming styles; pick a single style (either
ALL_CAPS_WITH_UNDERSCORES or PascalCase) and rename the enum members in one of
them to match the other (e.g., change RelocalizeMode::Initial/Local/Wide to
INITIAL/LOCAL/WIDE to match RelocalizeState, or change
RelocalizeState::IDLE/IN_FLIGHT/SUCCEEDED/FAILED to
Idle/InFlight/Succeeded/Failed to match RelocalizeMode); update all
references/usages of these symbols (RelocalizeState, RelocalizeMode and their
members) across the codebase to the chosen style to avoid build errors and keep
naming consistent.

In `@src/lua/action.lua`:
- Around line 8-13: The RelocalizeState numeric values are hardcoded in Lua and
must instead consume the constants injected from C++ to avoid drift; remove the
local RelocalizeState table and replace its usages in this file (e.g.,
comparisons inside send_and_await and any references to
RelocalizeState.IDLE/IN_FLIGHT/SUCCEEDED/FAILED) with the injected
api.RelocalizeState.* symbols (ensure the C++ side provides these via
make_api_injection so api.RelocalizeState.SUCCEEDED etc. exist), and update any
local references or imports so the code reads from api.RelocalizeState rather
than duplicating the enum.
- Around line 100-114: The loop polling api.relocalize_status() has no
wall-clock timeout and can spin forever if state stays IN_FLIGHT; update the
relocalization loop in action.lua (the while true block that calls
api.relocalize_status(), checks RelocalizeState, and uses request:sleep) to
track elapsed real time and break after a configured limit (for example
request_timeout_sec * 2 plus a small safety margin), returning false, st when
the timeout is reached and preserving the existing self:warn logging (include
the timeout reason in the message). Ensure the timeout uses a
monotonic/wall-clock timer available to the coroutine and does not rely on the
C++ side to flip IN_FLIGHT.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2a58e101-2f11-4c70-a3cf-6e205a570397

📥 Commits

Reviewing files that changed from the base of the PR and between 3370655 and 0f7dd4d.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml
  • CMakeLists.txt
  • package.xml
  • src/cxx/component.cc
  • src/cxx/util/localization/engine.cc
  • src/cxx/util/localization/engine.hh
  • src/lua/action.lua
  • src/lua/api.lua

Comment thread src/cxx/component.cc
Comment on lines +134 to +146
// status 转 lua table:disabled/enabled。
api.set_function("relocalize_status", [this]() {
const auto status = (relocalization_enabled && localization)
? localization->relocalize_status()
: RelocalizeStatus{.message = "disabled"};
return lua->create_table_with(
"state", static_cast<int>(status.state), "success", status.success,
"message", status.message, "fitness_score", status.fitness_score, "confidence",
status.confidence, "estimated_x", status.estimated_x, "estimated_y",
status.estimated_y, "estimated_z", status.estimated_z, "estimated_qx",
status.estimated_qx, "estimated_qy", status.estimated_qy, "estimated_qz",
status.estimated_qz, "estimated_qw", status.estimated_qw);
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

disabled 分支返回的 stateIDLE 而非终态

RelocalizeStatus{.message = "disabled"} 会得到默认的 state = IDLE,但语义上"功能被关闭"应该是一个终态(FAILED 或新增的 DISABLED)。Lua 端 send_and_await 不会走到这里(disabled 时 relocalize_* 已直接返回 false),但任何只查询 relocalize_status() 的调用方会看到 state==IDLE / message=="disabled" 的奇怪组合。

🛠️ 建议
-            const auto status = (relocalization_enabled && localization)
-                                  ? localization->relocalize_status()
-                                  : RelocalizeStatus{.message = "disabled"};
+            const auto status = (relocalization_enabled && localization)
+                                  ? localization->relocalize_status()
+                                  : RelocalizeStatus{
+                                        .state   = RelocalizeState::FAILED,
+                                        .success = false,
+                                        .message = "disabled",
+                                    };
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/component.cc` around lines 134 - 146, The disabled branch in the
api.set_function("relocalize_status") lambda returns RelocalizeStatus{.message =
"disabled"} which leaves state at the default IDLE; update that branch to return
a status with an explicit terminal state (e.g., RelocalizeState::FAILED or add a
new RelocalizeState::DISABLED) and a suitable message so callers of
relocalize_status() see a consistent terminal state; change the returned
RelocalizeStatus construction in the disabled path inside the relocalize_status
lambda (or add a new enum value if choosing DISABLED) so the "state" field is
not left as IDLE.

Comment on lines +148 to +202
auto arm_timeout(const std::shared_ptr<Session>& s, const std::shared_ptr<std::int64_t>& pending_id)
-> void {
const auto timeout = std::chrono::duration_cast<std::chrono::nanoseconds>(
std::chrono::duration<double>{std::max(0.1, config.request_timeout_sec)});
auto timer = node.create_wall_timer(timeout, [s, pending_id]() {
if (s->end(failed_status("request timeout: " + s->service_name), *pending_id))
RCLCPP_WARN(s->logger, "relocalize request timeout: %s", s->service_name.c_str());
});
auto lock = std::scoped_lock{s->mutex};
if (s->timeout_timer)
s->timeout_timer->cancel();
s->timeout_timer = std::move(timer);
}

auto send(RelocalizeMode mode, double x, double y, double yaw) -> bool {
if (!session->begin()) {
warn("relocalize skipped: previous request is still in flight");
return false;
}

if (!session->client->service_is_ready()
&& !session->client->wait_for_service(std::chrono::seconds(0))) {
warn("relocalize service unavailable: {}", config.service_name);
session->end(failed_status("service unavailable: " + config.service_name));
return false;
}

auto request = std::make_shared<Relocalize::Request>();
request->mode = mode_to_msg(mode);
request->initial_guess_world_base = yaw_to_pose(x, y, yaw);

// executor 异步触发 callback。
auto pending_id = std::make_shared<std::int64_t>(-1);
arm_timeout(session, pending_id);
auto s = session;
auto result = session->client->async_send_request(
std::move(request), [s, pending_id](RelocalizeClient::SharedFuture future) {
auto response = future.get();
if (!response) {
RCLCPP_WARN(s->logger, "relocalize response is null");
s->end(failed_status("response is null"), *pending_id);
return;
}
if (!response->success) {
RCLCPP_WARN(s->logger, "relocalize failed: %s", response->message.c_str());
s->end(status_from_response(*response, false), *pending_id);
return;
}
RCLCPP_INFO(
s->logger, "relocalize success: score=%.4f, confidence=%.3f",
response->fitness_score, response->confidence);
s->end(status_from_response(*response, true), *pending_id);
});
*pending_id = result.request_id;
session->track(result.request_id);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

Does rclcpp Node::create_wall_timer create a one-shot timer or a recurring timer in ROS 2 Jazzy?

💡 Result:

In ROS 2 Jazzy, rclcpp::Node::create_wall_timer creates a recurring (periodic) timer, not a one-shot timer. Evidence: the rclcpp timer that create_wall_timer builds on is described as a “Generic timer. Periodically executes a user-specified callback” [1]. The create_wall_timer API takes a “period – period to execute callback” [2], and the WallTimer docs describe the callback as “to execute every interval” [3]. If you want one-shot behavior, you typically cancel the timer after the first callback (or use cancel/reset appropriately); rclcpp timers expose cancel and reset APIs [4]. So, create_wall_timer’s default behavior is periodic/recurring; it does not automatically fire only once.

Citations:


🏁 Script executed:

# Find the Session struct and examine end() method
rg -A 30 "struct Session" src/cxx/util/localization/engine.cc | head -100

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 1182


🏁 Script executed:

# Search for the end() method implementation
rg -B 5 -A 20 "bool.*end\(" src/cxx/util/localization/engine.cc

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 60


🏁 Script executed:

# Search for clear_locked implementation
rg -B 5 -A 15 "clear_locked" src/cxx/util/localization/engine.cc

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 1969


🏁 Script executed:

# Search for track() method
rg -B 3 -A 10 "void.*track\(" src/cxx/util/localization/engine.cc

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 60


🏁 Script executed:

# Check the full arm_timeout and send method context
sed -n '140,210p' src/cxx/util/localization/engine.cc

Repository: Alliance-Algorithm/rmcs-navigation

Length of output: 3161


超时定时器存在重复触发与 pending_id 的竞态隐患

代码确实存在两个紧密相关的问题:

  1. create_wall_timer 是周期定时器,当 end() 失败时不会取消。 定时器回调调用 s->end(failed_status(...), *pending_id),若 expected_idpending_id 不匹配,end() 直接返回 false 而不调用 clear_locked(),定时器会持续按 request_timeout_sec 周期触发。

  2. *pending_idresult.request_id 间存在竞态。 L180 初始化 pending_id = -1,L181 已将其传入 arm_timeout 的 lambda 和 L183 响应回调,但真正赋值是 L201。若任何回调在 L202 前执行(快速服务、同进程调用):

    • 回调捕获的 *pending_id == -1
    • 此时 session->pending_id 仍为空(track() 在 L202)
    • end()expected_id 检查失败:if (expected_id && pending_id != *expected_id) 返回 false
    • clear_locked() 不被调用,定时器持续触发
    • 响应或超时状态被丢弃,session 永远卡在 IN_FLIGHT 状态

建议先调用 async_send_request 并获取 result.request_id,随后再调用 track()arm_timeout();同时在定时器回调内显式调用 timer->cancel() 确保不会重复触发。

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cxx/util/localization/engine.cc` around lines 148 - 202, The timeout
timer can repeatedly fire and pending_id races because arm_timeout is called
before we obtain and track the actual request_id; fix send by calling
session->client->async_send_request(...) first, capture result.request_id, call
session->track(result.request_id), assign *pending_id = result.request_id, then
call arm_timeout(session, pending_id) so the timer sees the real id; inside
arm_timeout's lambda capture the timer (e.g., store the created timer in a
shared_ptr and capture it) and call timer->cancel() at the start of the callback
(in addition to s->timeout_timer cancel handling) to ensure the timer is stopped
even if s->end(...) returns false, referencing functions/symbols: send,
async_send_request, result.request_id, session->track, arm_timeout,
create_wall_timer, timeout_timer, pending_id.

Comment thread src/lua/action.lua
Comment on lines +95 to +99
local function send_and_await(self, mode, fn, x, y, yaw)
if not fn(x, y, yaw) then
self:warn(string.format("reloc skip %s (in_flight)", mode))
return false, nil
end
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

fn(...)==false 的两种语义被合并成同一条日志

C++ 侧 Localization::send 返回 false 有两种情况:

  • 已有请求在飞 (session->begin() 失败);
  • 服务不可用 (service_is_ready() 检查失败)。

第二种情况下 session->end(failed_status("service unavailable: ...")) 已经把状态写成 FAILED 了,但这里只输出 reloc skip <mode> (in_flight),会误导排障。建议在 false 分支再读一次 api.relocalize_status(),按 state 输出更准确的原因。

🛠️ 建议
-	if not fn(x, y, yaw) then
-		self:warn(string.format("reloc skip %s (in_flight)", mode))
-		return false, nil
-	end
+	if not fn(x, y, yaw) then
+		local st = api.relocalize_status()
+		self:warn(string.format("reloc skip %s | state=%d msg=%s",
+			mode, st.state, tostring(st.message)))
+		return false, st
+	end
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/action.lua` around lines 95 - 99, The false return from fn(...) in
send_and_await conflates "in-flight" and "service unavailable" cases; modify
send_and_await to call api.relocalize_status() when fn(x,y,yaw) returns false,
inspect the returned status/state from that API, and log a distinct message
(e.g., "reloc skip <mode> (service_unavailable: ...)" vs the existing "reloc
skip <mode> (in_flight)") and propagate the same false,nil return; use the
function send_and_await and api.relocalize_status() as the insertion points to
read the accurate state and include it in the process warning.

Comment thread src/lua/api.lua
--- @field relocalize_initial fun(x: number, y: number, yaw: number): boolean
--- @field relocalize_local fun(x: number, y: number, yaw: number): boolean
--- @field relocalize_wide fun(x: number, y: number, yaw: number): boolean
--- @field relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

relocalize_status 注解漏了 estimated_* 字段

src/cxx/component.ccrelocalize_status 实际返回的 table 还包含 estimated_x/y/zestimated_qx/qy/qz/qw,但这里只声明了 state/success/message/fitness_score/confidence。LSP 补全和静态检查都会少识别 7 个字段。

🛠️ 建议补全注解
---- `@field` relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number }
+--- `@field` relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number, estimated_x: number, estimated_y: number, estimated_z: number, estimated_qx: number, estimated_qy: number, estimated_qz: number, estimated_qw: number }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
--- @field relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number }
--- `@field` relocalize_status fun(): { state: integer, success: boolean, message: string, fitness_score: number, confidence: number, estimated_x: number, estimated_y: number, estimated_z: number, estimated_qx: number, estimated_qy: number, estimated_qz: number, estimated_qw: number }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lua/api.lua` at line 23, relocalize_status 的注解缺少组件实际返回的 estimated_*
字段:在声明 relocalize_status(Lua 注解的 fun() 返回表)中补充
estimated_x、estimated_y、estimated_z 和
estimated_qx、estimated_qy、estimated_qz、estimated_qw 字段并把类型设为 number,以与
src/cxx/component.cc 中实际返回的 table 匹配,从而让 LSP 补全与静态检查识别这些字段。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant