unify api calling #49

Merged
wine99 merged 4 commits into ravi9:dev_backend_openvino from zhaixuejun1993:xuejun/unify-api-get_ov_output_tensor
Mar 3, 2026

@zhaixuejun1993 (Collaborator) commented Feb 28, 2026

  1. Add a new interface, "create_ov_output_tensor()", for creating the output tensors.
  2. Fix the issue caused by the output shape changing at runtime in llama-bench.
  3. Unify the call sites so the same code snippet is no longer repeated in many places.
  4. Prepare for a future change (replacing the name index with a hash index).
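
A minimal sketch of the idea behind items 1 and 3, written with stand-in types rather than OpenVINO classes: one helper builds every output tensor, and each call site goes through it instead of repeating the setup. The names `OutputTensor`, `create_output_tensor` and `wrap_outputs` are hypothetical and are not taken from the PR's diff.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for a backend tensor: a shape plus a non-owning pointer to host memory.
struct OutputTensor {
    std::vector<std::size_t> shape;
    float * data;
};

// Hypothetical single entry point for building output tensors.
// Keeping the construction logic here means a later change (for example to
// how outputs are indexed) only has to be made in one place.
OutputTensor create_output_tensor(const std::vector<std::size_t> & shape, float * host_buffer) {
    assert(!shape.empty() && "an output is expected to have at least one dimension");
    return OutputTensor{shape, host_buffer};
}

// Example call site: wrap every named output buffer with the shared helper.
std::map<std::string, OutputTensor> wrap_outputs(
        const std::map<std::string, std::vector<std::size_t>> & shapes,
        std::map<std::string, std::vector<float>> & buffers) {
    std::map<std::string, OutputTensor> outputs;
    for (const auto & entry : shapes) {
        const std::string & name = entry.first;
        // at() throws if a shape has no matching buffer, which surfaces setup mistakes early.
        outputs.emplace(name, create_output_tensor(entry.second, buffers.at(name).data()));
    }
    return outputs;
}
```

The map keyed by name mirrors the "name index" mentioned in item 4; swapping the key type would be a local change in a layout like this.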

@wine99 (Collaborator) commented Feb 28, 2026

3259921#diff-f03e5dcc8ad94468267d70b7a2ff7926977ecbc1395285e416ee5ad3b13fab78R357-R359

LGTM. The modified lines were originally introduced in the commit above by @cavusmustafa — could you take a look?

@cavusmustafa (Collaborator) left a comment

The earlier change was to fix llama-bench -d 1024 failure on NPU. After this change it fails again on my side.

@cavusmustafa cavusmustafa self-requested a review March 2, 2026 21:02
@zhaixuejun1993 (Collaborator, Author) commented

> The earlier change was to fix llama-bench -d 1024 failure on NPU. After this change it fails again on my side.

@cavusmustafa
The failure is caused by the output shape of "result_norm" changing between the first and second iterations of the prefill phase ([1, 1, 1, 2048] -> [1, 1, 0, 2048]). I will adopt your solution of getting the shape from the infer request to avoid this kind of issue.
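
To illustrate why the two shapes quoted above are not interchangeable, here is a small sketch with stand-in types (no OpenVINO dependency). It computes the element count for each shape and sizes a host buffer from whatever the request reports for the current run. `FakeInferRequest` and `allocate_output` are hypothetical names, and the thread does not say which component reported which shape, so this only shows the general pattern of asking the request at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

using Shape = std::vector<std::size_t>;

// Number of elements described by a shape: the product of its dimensions.
std::size_t element_count(const Shape & shape) {
    return std::accumulate(shape.begin(), shape.end(), std::size_t{1},
                           std::multiplies<std::size_t>());
}

// Stand-in for an inference request that reports its output shape for the current run.
struct FakeInferRequest {
    Shape current_output_shape;
    Shape output_shape() const { return current_output_shape; }
};

// Hypothetical helper: size the host buffer from the shape the request
// reports now, rather than from a shape remembered from an earlier iteration.
std::vector<float> allocate_output(const FakeInferRequest & request) {
    const Shape shape = request.output_shape();
    assert(!shape.empty() && "an output is expected to have at least one dimension");
    return std::vector<float>(element_count(shape), 0.0f);
}
```

With these definitions, `[1, 1, 1, 2048]` describes 2048 elements while `[1, 1, 0, 2048]` describes none, so a buffer sized from one shape cannot be reused safely with the other.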

@zhaixuejun1993 (Collaborator, Author) commented

Verified with llama-simple and llama-bench (-d 1024): PASS.
Please review again.

@wine99 wine99 merged commit bc87902 into ravi9:dev_backend_openvino Mar 3, 2026
58 of 75 checks passed