CUDA backend branch on my fork #11

@peters

I have a full CUDA backend implementation on my fork:

It includes CUDA backend support, demos, benchmark/test tooling, and the ggml CUDA fix needed for PCS/text-prompt mode.

Perf notes:
https://github.com/peters/sam3.cpp/blob/bf831442965c4918b566086b9e3aaa8ad1ab40ff/docs/cuda_performance.md#cuda-performance-optimization-log

RTX 4080:

  • SAM3 f16 ~1100 ms/frame
  • SAM3 q4_0 1030 ms/frame
  • SAM2.1 tiny f16 118 ms/frame
  • SAM2.1 tiny q8_0 111 ms/frame
  • SAM2.1 tiny q4_0 125 ms/frame
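
For easier comparison, the per-frame latencies above can be turned into throughput with `fps = 1000 / ms_per_frame`. This is only a minimal sketch of that arithmetic: the dictionary and the helper name `frames_per_second` are made up for illustration and are not part of the fork's tooling, and the SAM3 f16 entry uses the approximate "~1100" figure as-is.

```python
# Latencies copied from the RTX 4080 list above, in milliseconds per frame.
# The SAM3 f16 value is approximate ("~1100") in the original numbers.
latency_ms = {
    "SAM3 f16": 1100.0,
    "SAM3 q4_0": 1030.0,
    "SAM2.1 tiny f16": 118.0,
    "SAM2.1 tiny q8_0": 111.0,
    "SAM2.1 tiny q4_0": 125.0,
}


def frames_per_second(ms_per_frame: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# Hypothetical helper output: throughput for each model/quantization pair.
fps = {name: frames_per_second(ms) for name, ms in latency_ms.items()}

for name, value in fps.items():
    print(f"{name}: {value:.2f} frames/s")
```

Read this way, the SAM2.1 tiny variants land at roughly 8 to 9 frames/s, while both SAM3 variants stay just under 1 frame/s.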
