fix: use mixed arch for sectioned sync kernels #484

Merged
zhangstevenunity merged 3 commits into hw-native-sys:main from HecreReed:codex/fix-a5-remote-arch
Apr 15, 2026

@HecreReed
Collaborator

Summary

  • treat sectioned DAV cube/vector kernels with explicit pipe sync as mixed kernels in generate_testcase.py
  • select dav-c310/dav-c220 for those kernels instead of forcing dav-c310-vec/dav-c220-vec
  • keep single-section and non-sync sectioned kernels on the existing vector-arch path

Why

run_remote_npu_validation.sh was generating testcase CMake for some A5 sectioned kernels with --cce-aicore-arch=dav-c310-vec while still forcing both -D__DAV_CUBE__ and -D__DAV_VEC__. For kernels that also use explicit set_flag/wait_flag or set_intra_block/wait_intra_block across those sections, that diverges from PTO-ISA's mixed-kernel build mode and can trigger compile-time illegal sync parameter errors.

This change aligns those sectioned-sync kernels with the mixed-kernel compile path without broadening the behavior for ordinary single-section kernels.
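
The selection rule described above can be sketched roughly as follows. This is an illustration only, not the actual `_infer_aicore_arch()` in generate_testcase.py: the function name `pick_aicore_arch`, its `base_arch` parameter, and the exact detection heuristics are assumptions made for this example.

```python
import re


def pick_aicore_arch(kernel_text: str, base_arch: str) -> str:
    """Hypothetical sketch: choose a --cce-aicore-arch value for one kernel.

    base_arch is the mixed-mode arch for the target SoC (for example
    "dav-c310" or "dav-c220"). Mapping a SoC name to base_arch is out of
    scope for this sketch.
    """
    # A "sectioned" kernel guards code with both the cube and vector macros.
    has_both_sections = (
        "__DAV_CUBE__" in kernel_text and "__DAV_VEC__" in kernel_text
    )
    # Explicit pipe sync (set_flag/wait_flag) or intra-block sync.
    has_sync = re.search(
        r"\b(?:set|wait)_(?:flag|intra_block)\s*\(", kernel_text
    ) is not None

    if has_both_sections and has_sync:
        return base_arch           # mixed-kernel compile path
    return base_arch + "-vec"      # existing vector-arch path
```

Under these assumptions, a kernel that contains both macros and a `set_flag(...)` call would get `dav-c310` on A5, while a single-section kernel or a sectioned kernel without sync would stay on `dav-c310-vec`.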

Validation

  • imported generate_testcase.py and verified _infer_aicore_arch() returns dav-c310 for A5 mixed-section kernels with set_flag/wait_flag
  • ran generate_testcase.py on a synthetic A5 mixed-section kernel and confirmed generated CMakeLists.txt uses --cce-aicore-arch=dav-c310
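
A check like the second bullet could be automated along these lines. This is a sketch under stated assumptions: the helper name `extract_aicore_arch` and the sample CMake line are hypothetical and are not taken from the generated CMakeLists.txt.

```python
import re


def extract_aicore_arch(cmake_text: str):
    """Return the value passed to --cce-aicore-arch, or None if absent."""
    match = re.search(r"--cce-aicore-arch=([A-Za-z0-9_-]+)", cmake_text)
    return match.group(1) if match else None


# Hypothetical line standing in for the generated CMakeLists.txt content.
sample = (
    "target_compile_options(demo_kernel PRIVATE "
    "--cce-aicore-arch=dav-c310 -D__DAV_CUBE__ -D__DAV_VEC__)"
)
print(extract_aicore_arch(sample))  # prints dav-c310
```

Comparing the extracted value against the expected arch (for example `dav-c310` rather than `dav-c310-vec`) would turn the manual confirmation into a repeatable assertion.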

Scope

This PR only targets the compile-time arch selection issue in remote validation. It does not attempt to address separate runtime MTE OOR failures.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request refines the AI Core architecture inference logic by introducing a check for mixed section synchronization, which now accounts for both intra-block and explicit pipe synchronization flags. This ensures that sectioned kernels on A5 and 910B platforms are correctly compiled using the dav-c310 mixed-kernel mode. Review feedback suggests improving the robustness of synchronization function detection by using regular expressions instead of simple string matching to handle whitespace variations. Additionally, there is a recommendation to ensure consistent architecture assignment for other platforms, such as Ascend910, when mixed section synchronization is identified to avoid potential compilation errors.

Comment on lines +934 to 935
if has_mixed_section_sync:
return "dav-c310"

high

This logic correctly identifies the need for dav-c310 on A5/910B platforms when has_mixed_section_sync is true, but it fails to handle other platforms (like Ascend910) which should use dav-c220 in this case. This is inconsistent with the logic implemented in generate_testcase (lines 1425-1432) and may lead to compilation errors on those platforms. Consider updating this helper to return dav-c220 for non-A5/910B SoCs when mixed section synchronization is detected.

#
# IMPORTANT: the default arch depends on the Ascend SoC.
has_mix_macros = "__DAV_CUBE__" in kernel_text and "__DAV_VEC__" in kernel_text
has_flag_sync = "set_flag(" in kernel_text or "wait_flag(" in kernel_text

medium

The string matching for set_flag( and wait_flag( is fragile as it does not account for potential whitespace between the function name and the parenthesis (e.g., set_flag (PIPE_V, ...)). Using a regular expression with word boundaries and optional whitespace is more robust and consistent with other checks in this script (like has_packed_pred_mask).

Suggested change
has_flag_sync = "set_flag(" in kernel_text or "wait_flag(" in kernel_text
has_flag_sync = re.search(r"\b(set|wait)_flag\s*\(", kernel_text) is not None
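
The difference between the two checks can be seen in a small standalone comparison. The helper names below are hypothetical and exist only for this illustration; the regular expression is the one from the suggestion above.

```python
import re

FLAG_SYNC_RE = re.compile(r"\b(set|wait)_flag\s*\(")


def has_flag_sync_substring(text: str) -> bool:
    # Original check: exact substring, no whitespace tolerance.
    return "set_flag(" in text or "wait_flag(" in text


def has_flag_sync_regex(text: str) -> bool:
    # Suggested check: word boundary plus optional whitespace before "(".
    return FLAG_SYNC_RE.search(text) is not None


spaced = "set_flag (PIPE_MTE2, PIPE_MTE1, EVENT_ID0);"
prefixed = "my_set_flag(x);"  # hypothetical unrelated helper name

print(has_flag_sync_substring(spaced), has_flag_sync_regex(spaced))      # False True
print(has_flag_sync_substring(prefixed), has_flag_sync_regex(prefixed))  # True False
```

Besides tolerating whitespace, the word boundary also avoids matching identifiers that merely end in `set_flag`. Neither variant skips comments or string literals, so both remain heuristics.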

has_packed_pred_mask = re.search(r"\bTCMPS?\s*\(", raw_kernel_for_analysis) is not None
has_dav_cube = "__DAV_CUBE__" in raw_kernel
has_dav_vec = "__DAV_VEC__" in raw_kernel
has_flag_sync = "set_flag(" in raw_kernel or "wait_flag(" in raw_kernel

medium

Similar to the check in _infer_aicore_arch, this string matching is fragile. Using a regular expression would be more robust against variations in code formatting.

Suggested change
has_flag_sync = "set_flag(" in raw_kernel or "wait_flag(" in raw_kernel
has_flag_sync = re.search(r"\b(set|wait)_flag\s*\(", raw_kernel) is not None

@reedhecre

reedhecre commented Apr 15, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: fix: use mixed arch for sectioned sync kernels #484
  • Author: HecreReed
  • Base/Head: main / codex/fix-a5-remote-arch
  • Head SHA: 63cf3d92ea3d
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-04-15T07:09:24Z
  • Previous Head SHA: c8cca2a42497
  • Status: completed

Summary

No issues were detected in PR #484; the review returned findings=[].

Findings

No issues found.

@HecreReed
Collaborator Author

/run a5 qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19 --pto-level=level3

@reedhecre

A5 board test failed

  • Trigger mode: manual
  • Source commit: 7893e5cc2494
  • Source strategy: origin/main + PR merge commit 7893e5cc2494
  • Result summary: OK 14 / FAIL 6 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_140005_manual_pr484.log
  • Manual command: /run a5 qwen3_decode_layer_incore_0 qwen3_decode_layer_incore_1 qwen3_decode_layer_incore_2 qwen3_decode_layer_incore_3 qwen3_decode_layer_incore_4 qwen3_decode_layer_incore_5 qwen3_decode_layer_incore_6 qwen3_decode_layer_incore_7 qwen3_decode_layer_incore_8 qwen3_decode_layer_incore_9 qwen3_decode_layer_incore_10 qwen3_decode_layer_incore_11 qwen3_decode_layer_incore_12 qwen3_decode_layer_incore_13 qwen3_decode_layer_incore_14 qwen3_decode_layer_incore_15 qwen3_decode_layer_incore_16 qwen3_decode_layer_incore_17 qwen3_decode_layer_incore_18 qwen3_decode_layer_incore_19 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: fix: use mixed arch for sectioned sync kernels #484 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • qwen3_decode_layer_incore_8 (run, exit=2)
  • qwen3_decode_layer_incore_6 (run, exit=1)
  • qwen3_decode_layer_incore_5 (run, exit=2)
  • qwen3_decode_layer_incore_4 (run, exit=2)
  • qwen3_decode_layer_incore_3 (run, exit=2)
  • qwen3_decode_layer_incore_17 (run, exit=1)
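
For triage, the failed cases can be grouped by exit code. In this run, the exit=2 cases show compile errors in the detail log and the exit=1 cases show runtime errors. That mapping is an observation about this log only, not a documented convention of the board-test bot. The helper name `group_by_exit` and the parsing pattern below are assumptions for illustration.

```python
import re
from collections import defaultdict

# Failed-case lines copied from the report above (bullets removed).
FAILED_CASES = """\
qwen3_decode_layer_incore_8 (run, exit=2)
qwen3_decode_layer_incore_6 (run, exit=1)
qwen3_decode_layer_incore_5 (run, exit=2)
qwen3_decode_layer_incore_4 (run, exit=2)
qwen3_decode_layer_incore_3 (run, exit=2)
qwen3_decode_layer_incore_17 (run, exit=1)
"""

LINE_RE = re.compile(r"^\s*(\S+) \((\w+), exit=(\d+)\)\s*$")


def group_by_exit(report: str) -> dict:
    """Map each exit code to the list of case names that reported it."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match:
            name, _stage, code = match.groups()
            groups[int(code)].append(name)
    return dict(groups)


for code, names in sorted(group_by_exit(FAILED_CASES).items()):
    print(f"exit={code}: {', '.join(names)}")
```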

@reedhecre

A5 board test failure details: PR #484

qwen3_decode_layer_incore_8

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:97:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:100:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:105:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:105:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:108:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:108:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:110:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_FIX, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:110:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_FIX, EVENT_ID0);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:114:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_M, PIPE_FIX, EVENT_ID0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_8/qwen3_decode_layer_incore_8_kernel.cpp:114:21: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_M, PIPE_FIX, EVENT_ID0);
                    ^
10 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/qwen3_decode_layer_incore_8_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_8_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 14:02:39] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_8

qwen3_decode_layer_incore_6

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_6/main.cpp:141)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 141733] 2026-04-15-14:03:08.376.247 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 13, there is an aivec error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800000, current: 0x1000408004ac, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, fault kernel info ext=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, program id=0, hash=1851969691321044957.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 14:03:13] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_6

qwen3_decode_layer_incore_5

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:93:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID2);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:94:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID3);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:95:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID4);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:96:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID5);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:97:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:97:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:99:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:106:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:109:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:111:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:114:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:116:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:116:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:117:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:120:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:120:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:122:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:122:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_5/qwen3_decode_layer_incore_5_kernel.cpp:123:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/qwen3_decode_layer_incore_5_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_5_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 14:03:15] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_5

qwen3_decode_layer_incore_4

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:93:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID2);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:94:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID3);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:95:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID4);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:96:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID5);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:97:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:97:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:99:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:106:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:109:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:111:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:114:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:116:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:116:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:117:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:120:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:120:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:122:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:122:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_4/qwen3_decode_layer_incore_4_kernel.cpp:123:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/qwen3_decode_layer_incore_4_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_4_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 14:03:17] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_4

qwen3_decode_layer_incore_3

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:92:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID2);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:93:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID3);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:94:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID4);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:95:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID5);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:96:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:96:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID1);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:98:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:105:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:108:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:110:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:113:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE2, PIPE_MTE1, EVENT_ID1);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:115:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:115:23: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                      ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:116:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID1);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:119:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
            ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:119:24: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0);
                       ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:121:12: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
           ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:121:20: error: the ranges of 2nd parameter must be [0, 1], [4, 5]
  set_flag(PIPE_M, PIPE_MTE1, EVENT_ID0);
                   ^
/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_3/qwen3_decode_layer_incore_3_kernel.cpp:122:13: error: the ranges of 1st parameter must be [0, 1], [4, 5]
  wait_flag(PIPE_MTE1, PIPE_MTE2, EVENT_ID0);
            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/build.make:76: CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/qwen3_decode_layer_incore_3_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/qwen3_decode_layer_incore_3_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-15 14:03:19] ERROR: testcase failed (exit 2): qwen3_decode_layer_incore_3

qwen3_decode_layer_incore_17

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507015 (/tmp/ptoas-board-monitor-a5/runs/20260415_140005_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_17/main.cpp:108)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 146616] 2026-04-15-14:04:57.451.130 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 14, there is an aicore error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800008, current: 0x10004080016c, sc error info: 0xffffffffffff, su error info: 0xfefffeec1efe9387,0x7fddfebff8007fff, mte error info: 0x22601000002005d, vec error info: 0, cube error info: 0, l1 error info: 0xffbf0017f6ee, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 15, there is an aivec error exception, core id is 0, error code = 0, dump info: pc start: 0x100040800764, current: 0x1000408008b8, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       The extend info: errcode:(0) errorStr: timeout or trap error. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 15, there is an aivec error exception, core id is 1, error code = 0, dump info: pc start: 0x100040800764, current: 0x100040800b68, sc error info: 0xffffffffffff, su error info: 0x2985b4fc1dfeefdf,0xe64ef56bc000acdb, mte error info: 0xdebf63730007defe, vec error info: 0x4d6c3f7f001cfccf, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AICORE Kernel task happen error, retCode=0x26.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:(no result)[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, fault kernel info ext=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, program id=0, hash=583490179710791017.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507015[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 14:05:01] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_17

@HecreReed
Collaborator Author

/run a5 qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19 --pto-level=level3

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 9ad088ee2511
  • Source strategy: origin/main + PR merge commit 9ad088ee2511
  • Result summary: OK 18 / FAIL 2 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_142907_manual_pr484.log
  • Manual command: /run a5 qwen3_decode_layer_incore_0 qwen3_decode_layer_incore_1 qwen3_decode_layer_incore_2 qwen3_decode_layer_incore_3 qwen3_decode_layer_incore_4 qwen3_decode_layer_incore_5 qwen3_decode_layer_incore_6 qwen3_decode_layer_incore_7 qwen3_decode_layer_incore_8 qwen3_decode_layer_incore_9 qwen3_decode_layer_incore_10 qwen3_decode_layer_incore_11 qwen3_decode_layer_incore_12 qwen3_decode_layer_incore_13 qwen3_decode_layer_incore_14 qwen3_decode_layer_incore_15 qwen3_decode_layer_incore_16 qwen3_decode_layer_incore_17 qwen3_decode_layer_incore_18 qwen3_decode_layer_incore_19 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: fix: use mixed arch for sectioned sync kernels #484 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • qwen3_decode_layer_incore_6 (run, exit=1)
  • qwen3_decode_layer_incore_17 (run, exit=1)

@reedhecre

A5 board test failure details: PR #484

qwen3_decode_layer_incore_6

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor-a5/runs/20260415_142907_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_6/main.cpp:141)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 169312] 2026-04-15-14:32:22.232.869 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 16, there is an aivec error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800000, current: 0x1000408004ac, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, fault kernel info ext=_Z27qwen3_decode_layer_incore_6PfS_Pu6__bf16S_S_S_S0_S_ii, program id=0, hash=1851969691321044957.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 14:32:27] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_6

qwen3_decode_layer_incore_17

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507015 (/tmp/ptoas-board-monitor-a5/runs/20260415_142907_manual_pr484/npu_validation/Qwen3Tilelet/qwen3_decode_layer_incore_17/main.cpp:108)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 176164] 2026-04-15-14:35:09.257.957 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 17, there is an aicore error exception, core id is 0, error code = 95, dump info: pc start: 0x100040800008, current: 0x10004080016c, sc error info: 0xffffffffffff, su error info: 0xfefffeec1efe9387,0x7fddfebff8007fff, mte error info: 0x22601000002005d, vec error info: 0, cube error info: 0, l1 error info: 0xffbf0017f6ee, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0x80000000.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
        TraceBack (most recent call last):
       The extend info: errcode:(95) errorStr: The DDR address of the MTE instruction is out of range. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 18, there is an aivec error exception, core id is 0, error code = 0, dump info: pc start: 0x100040800764, current: 0x1000408008b8, sc error info: 0xffffffffffff, su error info: 0xf7f7d23d139c5bd7,0xcc3fd0e010009bfd, mte error info: 0x200a1, vec error info: 0xe7dbff9e0017db84, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       The extend info: errcode:(0) errorStr: timeout or trap error. subErrType: 0x4.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:583]
       The error from device(chipId:0, dieId:0), serial number is 18, there is an aivec error exception, core id is 1, error code = 0, dump info: pc start: 0x100040800764, current: 0x100040800b68, sc error info: 0xffffffffffff, su error info: 0x2985b4fc1dfeefdf,0xe64ef56bc000acdb, mte error info: 0xdebf63730007defe, vec error info: 0x4d6c3f7f001cfccf, cube error info: 0, l1 error info: 0, aic error mask: 0x395856, para base: 0x100040200000, mte error: 0.[FUNC:ProcessDavidStarsCoreErrorInfo][FILE:device_error_proc_c.cc][LINE:580]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1728]
       AICORE Kernel task happen error, retCode=0x26.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [AIC_INFO] after execute:(no result)[FUNC:GetError][FILE:stream.cc][LINE:1478]
       [DFX_INFO]Aicore kernel execute failed, device_id=1, stream_id=62, report_stream_id=62, task_id=0, flip_num=0, fault kernel_name=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, fault kernel info ext=_Z28qwen3_decode_layer_incore_17Pu6__bf16S_S_S_i, program id=0, hash=583490179710791017.[FUNC:GetError][FILE:stream.cc][LINE:1478]
       rtStreamSynchronize execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507015[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-04-15 14:35:14] ERROR: testcase failed (exit 1): qwen3_decode_layer_incore_17

@HecreReed
Collaborator Author

/run a5 qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19 --pto-level=level3

@HecreReed marked this pull request as ready for review April 15, 2026 07:09
@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 1bce1d471dad
  • Source strategy: origin/main + PR merge commit 1bce1d471dad
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_151004_manual_pr484.log
  • Manual command: /run a5 qwen3_decode_layer_incore_0 qwen3_decode_layer_incore_1 qwen3_decode_layer_incore_2 qwen3_decode_layer_incore_3 qwen3_decode_layer_incore_4 qwen3_decode_layer_incore_5 qwen3_decode_layer_incore_6 qwen3_decode_layer_incore_7 qwen3_decode_layer_incore_8 qwen3_decode_layer_incore_9 qwen3_decode_layer_incore_10 qwen3_decode_layer_incore_11 qwen3_decode_layer_incore_12 qwen3_decode_layer_incore_13 qwen3_decode_layer_incore_14 qwen3_decode_layer_incore_15 qwen3_decode_layer_incore_16 qwen3_decode_layer_incore_17 qwen3_decode_layer_incore_18 qwen3_decode_layer_incore_19 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: fix: use mixed arch for sectioned sync kernels #484 (comment)
  • Failed stage: fetch-source-archive-fallback / exit=128

Log tail

nual_pr484/repo
cd /tmp/ptoas-board-monitor-a5/runs/20260415_151004_manual_pr484/repo
git init -q .
git remote add origin https://github.com/hw-native-sys/PTOAS.git
rc=0
for attempt in 1 2 3; do
  if timeout --signal=TERM --kill-after=10 180s git -c http.version=HTTP/1.1 -c http.lowSpeedLimit=1 -c http.lowSpeedTime=60 fetch --depth 1 --no-tags origin +refs/pull/484/merge:refs/remotes/origin/pr/484/merge; then
    git checkout -q -f refs/remotes/origin/pr/484/merge
    exit 0
  else
    rc=$?
  fi
  sleep $((attempt * 2))
done
if timeout --signal=TERM --kill-after=10 60s git -c http.version=HTTP/1.1 -c http.lowSpeedLimit=1 -c http.lowSpeedTime=30 ls-remote --exit-code origin refs/pull/484/head >/dev/null 2>&1; then
  echo 'merge conflict against origin/main; board validation skipped' >&2
  exit 86
fi
exit "$rc"
fatal: unable to access 'https://github.com/hw-native-sys/PTOAS.git/': Failed to connect to 127.0.0.1 port 17890 after 0 ms: Couldn't connect to server
fatal: unable to access 'https://github.com/hw-native-sys/PTOAS.git/': Failed to connect to 127.0.0.1 port 17890 after 0 ms: Couldn't connect to server
fatal: unable to access 'https://github.com/hw-native-sys/PTOAS.git/': connection to proxy closed
===== END STAGE fetch-source rc=128 @ 2026-04-15 15:10:47 =====
git fetch path failed for fetch-source with rc=128; falling back to archive 1bce1d471dadd9ad7427679d89b012db2910faed

===== STAGE fetch-source-archive-fallback @ 2026-04-15 15:10:47 =====
download https://codeload.github.com/hw-native-sys/PTOAS/tar.gz/1bce1d471dadd9ad7427679d89b012db2910faed
archive fallback failed: curl: (7) Failed to connect to codeload.github.com port 443 via 127.0.0.1 after 0 ms: Could not connect to server
===== END STAGE fetch-source-archive-fallback rc=128 @ 2026-04-15 15:10:47 =====
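
The control flow visible in the log tail above (three fetch attempts with a growing sleep, a special exit code when the PR head is reachable but the merge ref is not, then an archive download as a last resort) can be sketched roughly as follows. This is a minimal illustration, not the board monitor's actual code: the function names and the injected callables are hypothetical, and the assumption that exit code 86 does not trigger the archive fallback is inferred from the log, not confirmed by it.

```python
import time
from typing import Callable, Tuple

# Exit code the logged script uses for "merge conflict; board validation skipped".
MERGE_CONFLICT_RC = 86


def fetch_with_retries(
    fetch_merge_ref: Callable[[], int],   # hypothetical: runs the git fetch, returns its exit code
    pr_head_exists: Callable[[], bool],   # hypothetical: runs the ls-remote probe for the PR head
    attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return 0 on success, 86 if only the PR head is reachable, else the last fetch exit code."""
    rc = 0
    for attempt in range(1, attempts + 1):
        rc = fetch_merge_ref()
        if rc == 0:
            return 0
        sleep(attempt * 2)  # linear backoff after each failure: 2 s, 4 s, 6 s
    if pr_head_exists():
        # The head ref exists but the merge ref could not be fetched.
        return MERGE_CONFLICT_RC
    return rc


def fetch_source(
    fetch_merge_ref: Callable[[], int],
    pr_head_exists: Callable[[], bool],
    download_archive: Callable[[], int],  # hypothetical: downloads the commit tarball
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, int]:
    """Return (stage name, exit code) for the stage that decided the outcome."""
    rc = fetch_with_retries(fetch_merge_ref, pr_head_exists, sleep=sleep)
    if rc in (0, MERGE_CONFLICT_RC):
        # Assumption: a merge conflict is reported as-is and not retried via the archive.
        return ("fetch-source", rc)
    return ("fetch-source-archive-fallback", download_archive())
```

In the run above both paths failed for the same reason (the local proxy at 127.0.0.1 refused connections), which is why the reported failed stage is the fallback rather than the original fetch.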

@HecreReed
Collaborator Author

/run a5 qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19 --pto-level=level3

@reedhecre

A5 board test passed

  • Trigger: manual
  • Source commit: 1bce1d471dad
  • Source strategy: origin/main + PR merge commit 1bce1d471dad
  • Result summary: OK 20 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260415_152115_manual_pr484.log
  • Results TSV: /root/ptoas-board-monitor-a5/logs/20260415_152115_manual_pr484.tsv
  • Manual command: /run a5 qwen3_decode_layer_incore_0 qwen3_decode_layer_incore_1 qwen3_decode_layer_incore_2 qwen3_decode_layer_incore_3 qwen3_decode_layer_incore_4 qwen3_decode_layer_incore_5 qwen3_decode_layer_incore_6 qwen3_decode_layer_incore_7 qwen3_decode_layer_incore_8 qwen3_decode_layer_incore_9 qwen3_decode_layer_incore_10 qwen3_decode_layer_incore_11 qwen3_decode_layer_incore_12 qwen3_decode_layer_incore_13 qwen3_decode_layer_incore_14 qwen3_decode_layer_incore_15 qwen3_decode_layer_incore_16 qwen3_decode_layer_incore_17 qwen3_decode_layer_incore_18 qwen3_decode_layer_incore_19 --pto-level=level3
  • Triggered by: HecreReed
  • Selected cases: qwen3_decode_layer_incore_0,qwen3_decode_layer_incore_1,qwen3_decode_layer_incore_2,qwen3_decode_layer_incore_3,qwen3_decode_layer_incore_4,qwen3_decode_layer_incore_5,qwen3_decode_layer_incore_6,qwen3_decode_layer_incore_7,qwen3_decode_layer_incore_8,qwen3_decode_layer_incore_9,qwen3_decode_layer_incore_10,qwen3_decode_layer_incore_11,qwen3_decode_layer_incore_12,qwen3_decode_layer_incore_13,qwen3_decode_layer_incore_14,qwen3_decode_layer_incore_15,qwen3_decode_layer_incore_16,qwen3_decode_layer_incore_17,qwen3_decode_layer_incore_18,qwen3_decode_layer_incore_19
  • PTOAS arguments: --pto-level=level3
  • Triggering comment: fix: use mixed arch for sectioned sync kernels #484 (comment)

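
The three reports in this thread show that the `OK n / FAIL n / SKIP n` line alone does not decide the outcome: the run that never got past source fetching reported `OK 0 / FAIL 0 / SKIP 0` and was still a failure. A small sketch of reading that line and combining it with the failed-stage field might look like this. The helper names are hypothetical, and the pass rule (no failed stage, no failing case, at least one passing case) is an assumption, not the bot's documented logic.

```python
import re
from typing import NamedTuple, Optional


class Summary(NamedTuple):
    ok: int
    fail: int
    skip: int


# Matches the counts as they appear in the bot reports, e.g. "OK 18 / FAIL 2 / SKIP 0".
_SUMMARY_RE = re.compile(r"OK (\d+) / FAIL (\d+) / SKIP (\d+)")


def parse_summary(report: str) -> Optional[Summary]:
    """Extract the OK/FAIL/SKIP counts from a report, or None if the line is absent."""
    match = _SUMMARY_RE.search(report)
    if match is None:
        return None
    return Summary(*(int(group) for group in match.groups()))


def run_passed(summary: Optional[Summary], failed_stage: Optional[str] = None) -> bool:
    """Assumed rule: pass only if no stage failed, no case failed, and at least one case ran."""
    if failed_stage is not None or summary is None:
        return False
    return summary.fail == 0 and summary.ok > 0
```

Applied to the reports above, the first run (18 OK, 2 FAIL) and the fetch-failure run (all zeros, failed stage set) both count as failures, and only the final 20 OK run counts as a pass.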
@zhangstevenunity merged commit 122796b into hw-native-sys:main Apr 15, 2026
11 checks passed
3 participants