🎯 Focusing
Hongik University Undergraduate
Pinned
sdpa-attention-benchmark (Public)
Benchmarks PyTorch SDPA backends (math vs. flash) on an RTX 4060 Ti, with Nsight Systems profiling.
Python · 2 stars
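
As a taste of what such a benchmark compares, here is a minimal sketch of forcing a specific SDPA backend and timing it with CUDA events. The tensor shapes, warmup count, and timing harness are illustrative assumptions, not code from the repository.

```python
# Minimal sketch: time math vs. flash SDPA backends. Shapes and iteration
# counts are illustrative assumptions, not the repo's actual harness.
import torch
import torch.nn.functional as F
from torch.nn.attention import sdpa_kernel, SDPBackend

def bench(backend, q, k, v, iters=100):
    # Time one SDPA backend using CUDA events on the current stream.
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    with sdpa_kernel(backend):
        for _ in range(10):                          # warmup
            F.scaled_dot_product_attention(q, k, v)
        torch.cuda.synchronize()
        start.record()
        for _ in range(iters):
            F.scaled_dot_product_attention(q, k, v)
        end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters           # ms per call

# Assumed shapes: batch 8, 16 heads, 2048 tokens, head_dim 64.
# fp16, since the flash backend requires fp16/bf16 on CUDA.
q, k, v = (torch.randn(8, 16, 2048, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))
for backend in (SDPBackend.MATH, SDPBackend.FLASH_ATTENTION):
    print(backend, f"{bench(backend, q, k, v):.3f} ms")
```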
flashattn-cuda-metal (Public)
FlashAttention CUDA kernel implementation and a Metal port (RTX 4060 Ti, Apple M4 Pro).
Cuda · 3 stars
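
The CUDA and Metal kernels themselves are not reproduced here; the sketch below is a plain PyTorch reference of the tiled online-softmax recurrence that FlashAttention implements, the kind of reference useful for validating a kernel's output. Single head, no masking, and the block size is an arbitrary assumption.

```python
# PyTorch reference of FlashAttention's tiled online-softmax recurrence.
# Single head, no masking; the block size is an illustrative assumption.
import math
import torch

def flash_attention_reference(q, k, v, block=128):
    # q, k, v: (seq_len, head_dim). Walk K/V in tiles, maintaining a running
    # row max (m), running softmax denominator (l), and a rescaled accumulator.
    n, d = q.shape
    scale = 1.0 / math.sqrt(d)
    m = torch.full((n, 1), float("-inf"))
    l = torch.zeros(n, 1)
    acc = torch.zeros(n, d)
    for s in range(0, n, block):
        kj, vj = k[s:s + block], v[s:s + block]
        scores = (q @ kj.T) * scale                        # (n, tile) logits
        m_new = torch.maximum(m, scores.max(dim=-1, keepdim=True).values)
        p = torch.exp(scores - m_new)                      # tile-local numerator
        correction = torch.exp(m - m_new)                  # rescale old stats
        l = l * correction + p.sum(dim=-1, keepdim=True)
        acc = acc * correction + p @ vj
        m = m_new
    return acc / l

# Check the tiled recurrence against the naive softmax formulation.
q, k, v = (torch.randn(512, 64) for _ in range(3))
ref = torch.softmax((q @ k.T) / math.sqrt(64), dim=-1) @ v
assert torch.allclose(flash_attention_reference(q, k, v), ref, atol=1e-4)
```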
fused-qkv-int8-attention (Public)
Fused INT8 KV-cache dequantization + FlashAttention-style tiled decode attention CUDA kernel, benchmarked on an A100.
Python
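
A rough illustration of the operation such a kernel fuses, written here as separate PyTorch steps: dequantize an INT8 KV cache with per-head scales, then run one decode step of attention for a single new query token. The cache layout and symmetric per-head quantization scheme are assumptions, not the repo's actual design, and fp32 is used so the sketch runs anywhere.

```python
# Unfused baseline for what the kernel does in one pass. The cache layout and
# per-head symmetric quantization are assumptions for illustration only.
import torch
import torch.nn.functional as F

heads, seq, dim = 16, 4096, 128

# Hypothetical cache layout: int8 values plus one scale per head.
k_int8 = torch.randint(-128, 128, (heads, seq, dim), dtype=torch.int8)
v_int8 = torch.randint(-128, 128, (heads, seq, dim), dtype=torch.int8)
k_scale = torch.rand(heads, 1, 1) * 0.02
v_scale = torch.rand(heads, 1, 1) * 0.02

q = torch.randn(heads, 1, dim)        # one new query token at decode time

# Unfused version: materialize full-precision K/V, then attend. A fused kernel
# folds the dequant multiply into the attention tile loads instead, avoiding
# this extra read/write of the entire cache.
k = k_int8.float() * k_scale
v = v_int8.float() * v_scale
out = F.scaled_dot_product_attention(q, k, v)   # (heads, 1, dim)
print(out.shape)
```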


