meta-pytorch / MSLK Public

Fork 46
Star 107

Pull requests: meta-pytorch/MSLK

Labels 15 Milestones 0

77 Open 276 Closed

Pull requests list

Enable per-token scaled FP4 grouped gemm on B200 cla signed fb-exported meta-exported

#356 opened May 22, 2026 by jwfromm Contributor

Deprecate usage of hypothesis cla signed fb-exported meta-exported

#353 opened May 21, 2026 by cthi Contributor

Apply fixup patch to fbsource cla signed fb-exported meta-exported

#349 opened May 19, 2026 by jwfromm Contributor

Add pyre-strict to mslk/mslk/conv/_meta.py cla signed fb-exported meta-exported

#346 opened May 18, 2026 by jwfromm Contributor

Switch to upstream Cutlass dependency cla signed fb-exported meta-exported

#342 opened May 12, 2026 by jwfromm Contributor

Remove FBGEMM as a conda dependency cla signed fb-exported meta-exported

#334 opened Apr 26, 2026 by jwfromm Contributor

Update GB200 RL Conda package with MSLK Fixes. cla signed fb-exported meta-exported

#333 opened Apr 25, 2026 by jwfromm Contributor

[CUDA] [PERFORMANCE] Increase speed of bf16bf16bf16_grouped_wgrad via indicating that ElementC is void / nullptr cla signed

#329 opened Apr 19, 2026 by benediktjohannes

[WIP] Migrate MOE kernels to libtorch ABI stable cla signed

#325 opened Apr 14, 2026 by janeyx99 Contributor • Draft

Add native MX8×MX4 mixed-precision GEMM kernel (f8f4bf16) cla signed fb-exported meta-exported

#313 opened Apr 6, 2026 by isratnisa

CUDA graph support — 5x speedup at small N cla signed fb-exported meta-exported

#309 opened Apr 2, 2026 by jduprat Contributor

Block-sparse compressed attention (sub-quadratic compressed branch) (#308) cla signed fb-exported meta-exported

#308 opened Apr 2, 2026 by jduprat Contributor

NSA backward — benchmarks and performance documentation (#307) cla signed fb-exported meta-exported

#307 opened Apr 2, 2026 by jduprat Contributor

NSA backward — autograd function (fixed-length + varlen) (#306) cla signed fb-exported meta-exported

#306 opened Apr 2, 2026 by jduprat Contributor

NSA backward — FA4 backward wrapper, block-sparse index transpose (#305) cla signed fb-exported meta-exported

#305 opened Apr 2, 2026 by jduprat Contributor

NSA backward — compression and gating backward kernels (#304) cla signed fb-exported meta-exported

#304 opened Apr 2, 2026 by jduprat Contributor

Fix int32 overflow in CuteDSL kernels for N >= 2M cla signed fb-exported meta-exported

#303 opened Apr 2, 2026 by jduprat Contributor

Fused CuteDSL kernel for KV compression (#302) cla signed fb-exported meta-exported

#302 opened Apr 2, 2026 by jduprat Contributor

Fused CuteDSL kernel for block selection scoring (#301) cla signed fb-exported meta-exported

#301 opened Apr 2, 2026 by jduprat Contributor

Fused CuteDSL gating kernel (#300) cla signed fb-exported meta-exported

#300 opened Apr 2, 2026 by jduprat Contributor

NSA forward — foundation, reference implementations, compact metadata cla signed fb-exported meta-exported

#299 opened Apr 2, 2026 by jduprat Contributor

Update benchmarks to 1M tokens, add memory diagnostics cla signed fb-exported meta-exported

#297 opened Mar 30, 2026 by jduprat Contributor

Update FINDINGS.md with optimization round results cla signed fb-exported meta-exported

#296 opened Mar 30, 2026 by jduprat Contributor

Replace CuteDSL compress + gating kernels with pure PyTorch cla signed fb-exported meta-exported

#295 opened Mar 30, 2026 by jduprat Contributor

Complete compress_factor: backward path + remove mask_mod from NSA cla signed fb-exported meta-exported

#294 opened Mar 30, 2026 by jduprat Contributor
