benchncnn_llm#6711
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d4da0f8dd2
@codex review
Codex Review: Didn't find any major issues. Can't wait for the next one!
Pull request overview
Adds a new LLM-focused benchmark target (benchncnn_llm) to the benchmark/ suite, along with embedded NCNN decoder param assets for several small LLMs so they can be benchmarked without external model files.
Changes:
- Introduce a benchncnn_llm executable that runs prefill (1k tokens) and decode (1 token against a 1k KV cache) timing for multiple decoder graphs.
- Extend benchmark/CMakeLists.txt to generate and embed .ncnn.param headers for the LLM decoder models (benchncnn_llm_param_data.h) and build the new target.
- Add new LLM decoder .ncnn.param files under benchmark/models/llm/.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| benchmark/CMakeLists.txt | Adds param-header generation + build rules for new benchncnn_llm benchmark target. |
| benchmark/benchncnn_llm.cpp | New benchmark driver that loads embedded LLM decoder params, builds inputs, runs warmup, and reports timing/tokens-per-second. |
| benchmark/models/llm/tinyllama_1.1b_decoder.ncnn.param | Adds embedded TinyLlama decoder graph for benchmarking. |
| benchmark/models/llm/qwen2.5_0.5b_decoder.ncnn.param | Adds embedded Qwen2.5 0.5B decoder graph for benchmarking. |
| benchmark/models/llm/llama3.2_1b_decoder.ncnn.param | Adds embedded Llama 3.2 1B decoder graph for benchmarking. |