sam3.cpp : Segment Anything 3/2/2.1/EdgeTAM using ggml #21426
PABannier
started this conversation in
Show and tell
Replies: 1 comment
You mention here CUDA acceleration, but your repo seems to support just Apple Metal. How is it with CUDA right now?
-
Hi everyone,
sam3.cpp is a C++ port of Meta's Segment Anything models (SAM 2, SAM 2.1, SAM 3, EdgeTAM) using ggml. It runs real-time video segmentation at 30 FPS on CUDA, with no Python, no PyTorch, and no heavy dependencies.
What it does:
The full SAM 3 model includes a ViT backbone, text encoder, and DETR decoder for open-vocabulary detection. The lighter models (EdgeTAM, SAM 2 Tiny) are small enough to run on a phone.
I went from PyTorch's ~2 FPS to 30 FPS by rewriting the entire pipeline in C++ with ggml, using fused Metal/CUDA kernels, aggressive quantization, and careful memory management to avoid unnecessary copies between pipeline stages.
52 pretrained models are available on Hugging Face in multiple precisions (f32, f16, q8_0, q4_1, q4_0).
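For anyone unfamiliar with the precision names, here is a minimal sketch of what a q8_0-style block quantization does: weights are split into blocks of 32, and each block stores one scale plus 32 signed 8-bit values. The names `BlockQ8`, `quantize_block`, and `dequantize_block` are hypothetical and written for illustration; they are not the ggml or sam3.cpp code. ggml stores the scale as a 16-bit float, while this sketch keeps it as a `float` for simplicity.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

constexpr int kBlock = 32;  // values per quantization block

// Hypothetical block layout: one scale plus 32 int8 quants.
struct BlockQ8 {
    float d;                         // per-block scale (assumption: float, not fp16)
    std::array<int8_t, kBlock> qs;   // quantized values in [-127, 127]
};

// Quantize 32 floats: pick the scale so the largest magnitude maps to 127.
BlockQ8 quantize_block(const float* x) {
    float amax = 0.0f;
    for (int i = 0; i < kBlock; ++i) {
        amax = std::max(amax, std::fabs(x[i]));
    }
    BlockQ8 b{};
    b.d = amax / 127.0f;
    const float inv = (b.d != 0.0f) ? 1.0f / b.d : 0.0f;  // all-zero block stays zero
    for (int i = 0; i < kBlock; ++i) {
        b.qs[i] = static_cast<int8_t>(std::lround(x[i] * inv));
    }
    return b;
}

// Reconstruct approximate floats from a block.
void dequantize_block(const BlockQ8& b, float* y) {
    for (int i = 0; i < kBlock; ++i) {
        y[i] = b.d * static_cast<float>(b.qs[i]);
    }
}
```

With a 16-bit scale this works out to 34 bytes per 32 weights, or about 8.5 bits per weight. The round-trip error per value is bounded by roughly half the block scale, which is why a block with one large outlier loses precision on its small values.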
Code: https://github.com/PABannier/sam3.cpp
Models: https://huggingface.co/PABannier/sam3.cpp
--
Along the way I needed several Metal operations that upstream only has on CPU/Vulkan/CUDA. Happy to upstream these if there's interest.
What I'd contribute:
- CONV_2D_DW: first Metal impl, needed for depthwise separable convolutions
- WIN_PART/WIN_UNPART: first Metal impl, needed for window-based vision transformers
- head_dim=56: new template instantiations for Hiera models
- head_dim=16 whitelist fix: templates existed, supports_op didn't list them
- conv_transpose_2d: rewritten gather algorithm, F16 vectorization
Would any of these be welcome as PRs? Happy to split into separate PRs per operation.
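To make the window ops concrete, here is a plain CPU reference sketch of what window partition and unpartition compute in a window-based vision transformer. It assumes a row-major [H, W, C] feature map whose height and width are exact multiples of the window size, so it skips the padding a real implementation would need. The functions `window_partition` and `window_unpartition` are hypothetical names for illustration, not the ggml operators or the Metal kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (y, x, c) in a row-major [H, W, C] feature map.
static size_t hwc_index(int y, int x, int c, int W, int C) {
    return (static_cast<size_t>(y) * W + x) * C + c;
}

// Split an [H, W, C] map into non-overlapping w x w windows.
// Output layout: [num_windows, w, w, C], windows ordered row by row.
std::vector<float> window_partition(const std::vector<float>& x,
                                    int H, int W, int C, int w) {
    assert(H % w == 0 && W % w == 0);  // assumption: no padding needed
    assert(x.size() == static_cast<size_t>(H) * W * C);
    std::vector<float> out(x.size());
    size_t o = 0;
    for (int wy = 0; wy < H / w; ++wy)
        for (int wx = 0; wx < W / w; ++wx)
            for (int y = 0; y < w; ++y)
                for (int xx = 0; xx < w; ++xx)
                    for (int c = 0; c < C; ++c)
                        out[o++] = x[hwc_index(wy * w + y, wx * w + xx, c, W, C)];
    return out;
}

// Inverse of window_partition: scatter windows back into an [H, W, C] map.
std::vector<float> window_unpartition(const std::vector<float>& win,
                                      int H, int W, int C, int w) {
    assert(H % w == 0 && W % w == 0);
    assert(win.size() == static_cast<size_t>(H) * W * C);
    std::vector<float> out(win.size());
    size_t i = 0;
    for (int wy = 0; wy < H / w; ++wy)
        for (int wx = 0; wx < W / w; ++wx)
            for (int y = 0; y < w; ++y)
                for (int xx = 0; xx < w; ++xx)
                    for (int c = 0; c < C; ++c)
                        out[hwc_index(wy * w + y, wx * w + xx, c, W, C)] = win[i++];
    return out;
}
```

Attention then runs independently inside each window, which keeps the cost linear in the number of windows. Both functions are pure index remapping, so a GPU version is mostly a question of mapping threads to output elements.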