unroll 16 for convolution x86 bf16s avx512 #6680

Open
nihui wants to merge 10 commits into Tencent:master from nihui:convolution-x86-bf16s-2

Conversation

@nihui
Member

@nihui nihui commented Apr 14, 2026

No description provided.

@github-actions github-actions Bot added the x86 label Apr 14, 2026
@tencent-adm
Member

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@codecov-commenter

codecov-commenter commented Apr 14, 2026

Codecov Report

❌ Patch coverage is 93.76580% with 74 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.93%. Comparing base (10cee2a) to head (24aefe3).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layer/x86/convolution_im2col_gemm.h | 94.62% | 42 Missing ⚠️ |
| src/layer/x86/convolution_3x3_winograd.h | 92.11% | 32 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6680      +/-   ##
==========================================
+ Coverage   93.92%   93.93%   +0.01%     
==========================================
  Files         933      933              
  Lines      300879   303335    +2456     
==========================================
+ Hits       282599   284942    +2343     
- Misses      18280    18393     +113     

@nihui nihui closed this May 9, 2026
@nihui nihui reopened this May 9, 2026
@nihui nihui requested a review from Copilot May 9, 2026 07:53
@nihui
Member Author

nihui commented May 9, 2026

@codex review

Contributor

Copilot AI left a comment

Pull request overview

This PR updates x86 AVX512 convolution GEMM kernels to use a wider N-dimension unroll of 16 (vs 12), and adjusts tiling/packing paths accordingly to better match AVX512 register width and improve throughput in im2col+GEMM and Winograd codepaths.

Changes:

  • Update AVX512F GEMM microkernels to compute/store 16 output columns per inner loop (new sum registers, updated pointer increments, transpose/store paths).
  • Add AVX512F-specific jj+=16 packing/compute loops in several remaining kernels, keeping 12/8/4 fallbacks where applicable.
  • Update optimal tile selection to prefer TILE_N = 16 (AVX512F) and TILE_N = 12 (AVX).
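
The cascaded column loop described in the bullets above can be illustrated with a small scalar sketch. This is not the PR's code: `split_n_blocks` and its `has_avx512` parameter are hypothetical names, with `has_avx512` standing in for a compile-time `__AVX512F__` guard, and the assumption that a scalar tail follows the 4-wide block is ours. It only shows how a `jj += 16` loop placed ahead of the 12/8/4 fallbacks changes how the N dimension is split.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of ncnn): returns the widths of the column
// blocks that a cascaded jj loop would process for max_jj output columns.
// has_avx512 is an assumed runtime stand-in for an #if __AVX512F__ guard.
std::vector<int> split_n_blocks(int max_jj, bool has_avx512)
{
    assert(max_jj >= 0);

    std::vector<int> widths;
    int jj = 0;

    if (has_avx512)
    {
        // 16-wide blocks: one 512-bit register holds 16 fp32 accumulators
        for (; jj + 15 < max_jj; jj += 16) widths.push_back(16);
    }

    // narrower fallbacks pick up whatever the wider loops left over
    for (; jj + 11 < max_jj; jj += 12) widths.push_back(12);
    for (; jj + 7 < max_jj; jj += 8) widths.push_back(8);
    for (; jj + 3 < max_jj; jj += 4) widths.push_back(4);

    // assumed scalar tail for the last few columns
    for (; jj < max_jj; jj++) widths.push_back(1);

    return widths;
}
```

Under these assumptions, `split_n_blocks(45, true)` yields blocks of 16, 16, 12, 1, while `split_n_blocks(45, false)` yields 12, 12, 12, 8, 1. The real kernels may order or guard the fallbacks differently per codepath.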

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| src/layer/x86/convolution_im2col_gemm.h | Expands AVX512F im2col+GEMM N-tile unroll to 16 and updates tiling heuristics accordingly. |
| src/layer/x86/convolution_3x3_winograd.h | Updates Winograd BT packing and GEMM kernels to use 16-wide AVX512F N-tiles and adjusts related tiling. |

Comment thread src/layer/x86/convolution_3x3_winograd.h
@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@github-actions github-actions Bot added the test label May 9, 2026
@nihui nihui requested a review from Copilot May 9, 2026 11:09
@nihui
Member Author

nihui commented May 9, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Chef's kiss.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 4 changed files in this pull request and generated 1 comment.

Comment thread src/layer/x86/convolution_im2col_gemm.h Outdated

4 participants