
Releases: MapleSilicon/SparseFlow

SparseFlow v0.3-alpha — GPU Lowering Pipeline (Kernel Stub)

17 Dec 13:36


🚀 SparseFlow v0.3-alpha

This architectural alpha release completes SparseFlow's end-to-end GPU lowering pipeline, validating the compiler flow from structured sparsity analysis to GPU IR generation using MLIR.


✅ What’s Included

Compiler Passes

  • sparseflow-spa

    • Static N:M sparsity propagation at compile time
  • sparseflow-rewrite-matmul

    • Rewrites sparse linalg.matmul into backend calls
  • sparseflow-gpu-rewrite

    • Lowers sparse calls into:
      • gpu.module
      • gpu.func (kernel stub)
      • gpu.launch

Verified Pipeline

linalg.matmul
→ sparseflow-spa
→ func.call @sparse_matmul_N_M
→ gpu.launch + gpu.func

The full pipeline executes successfully with verifier-clean MLIR.
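The static N:M constraint that `sparseflow-spa` propagates at compile time can be illustrated with a minimal Python sketch (a hypothetical illustration of the invariant, not the actual pass, which operates on MLIR):

```python
# Sketch of the N:M invariant a static sparsity propagation pass checks:
# every group of M consecutive weights along the reduction dimension
# has at most N nonzero values.

def satisfies_nm(row, n, m):
    """True if every m-wide group in `row` has at most n nonzeros."""
    for i in range(0, len(row), m):
        group = row[i:i + m]
        if sum(1 for x in group if x != 0) > n:
            return False
    return True

weights = [1.0, 0.0, 0.0, 2.0,   # 2 nonzeros in this group of 4 -> OK
           0.0, 3.0, 0.0, 0.0]   # 1 nonzero  in this group of 4 -> OK
print(satisfies_nm(weights, 2, 4))  # True: the row is 2:4 sparse
```

A matmul whose weight operand passes this check is what the pipeline rewrites into a `func.call @sparse_matmul_N_M` and, from v0.3-alpha, lowers into a `gpu.launch` of the kernel stub.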


⚠️ Limitations (By Design)

  • GPU kernel contains no computation (gpu.return only)
  • No bufferization or memory lowering
  • No correctness or performance claims on GPU
  • Output tensor is currently a placeholder

This release is about compiler correctness and architecture, not performance.


🧱 Why This Matters

SparseFlow now has:

  • Stable pass registration
  • Stable plugin loading
  • Verified CPU → GPU lowering boundary
  • A clean foundation for real GPU kernel development

This is the last structural milestone before implementing GPU kernels.


🔜 What’s Next

v0.3-beta

  • Define GPU kernel ABI
  • Memory layout and bufferization
  • Replace stub kernel with real N:M computation

📄 License

MIT

Full Changelog: v0.7.1...v0.3-alpha

SparseFlow v0.7.1 – SPA + CPU Runtime + Python CLI (Dev Preview)

04 Dec 15:27



This release packages the current SparseFlow stack into a usable, reproducible demo
for researchers, compiler engineers, and hardware partners.

Highlights

  • SPA v0.6 (2D Sparsity)

    • Propagates row + column sparsity through a small MLIR pipeline
    • Exports JSON metadata with row/col masks and sparsity stats
    • Proven on structured sparse matmuls with 50% row + 50% col sparsity (≈75% FLOP reduction)
  • C++ CPU Runtime (OpenMP)

    • Consumes SPA JSON masks and skips zero rows/cols
    • Blocked matmul kernel with OpenMP parallelism
    • Achieves ≈3–5× speedup on larger matmuls (512–1024) at ~75% sparsity
    • Honest behavior: smaller sizes (<512) may see only 1–3× because OpenMP and cache overheads dominate
  • End-to-End Demo Script

    • ./spa-runner.sh:
      • Rebuilds passes/runtime if needed
      • Runs SPA on test MLIR
      • Exports spa_sparsity.json
      • Benchmarks dense vs sparse CPU matmuls
  • Python CLI (Developer Preview)
    Install:

    cd sparseflow_package
    pip install -e .
    

    Tools:

    sparseflow-demo
    sparseflow-analyze tests/test_spa_v6_full_2d.mlir
    sparseflow-benchmark
    
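The row/col skipping the C++ runtime performs can be sketched in a few lines of Python (the JSON field names `row_mask` and `col_mask` below are illustrative assumptions, not the actual `spa_sparsity.json` schema):

```python
import json

# Illustrative sketch of mask-driven skipping: rows/cols whose mask
# entry is 0 are never touched, so their inner products are skipped.
meta = json.loads('{"row_mask": [1, 0], "col_mask": [0, 1]}')

def sparse_matmul(a, b, row_mask, col_mask):
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        if not row_mask[i]:
            continue                 # entire output row stays zero
        for j in range(n):
            if not col_mask[j]:
                continue             # entire output column stays zero
            c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(sparse_matmul(a, b, meta["row_mask"], meta["col_mask"]))
# -> [[0.0, 22.0], [0.0, 0.0]]
```

With 50% of rows and 50% of columns masked, only a quarter of the inner products execute, which matches the ≈75% FLOP reduction quoted above; the real runtime adds blocking and OpenMP parallelism on top of this skipping.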

Quick Start

git clone https://github.com/MapleSilicon/SparseFlow.git
cd SparseFlow
./spa-runner.sh

SparseFlow v0.1.1 — JSON-Driven N:M Runtime + Dimension Support

29 Nov 06:41


This release includes:

  • Full MLIR pass pipeline (AnnotateNmPass, ExportMetadataPass, etc.)
  • JSON-driven compiler → runtime integration
  • Dimension-aware runtime (M, N, K correctly read from metadata)
  • Sparse matmul simulator with N:M skipping logic
  • End-to-end validation scripts
  • Matrix + sparsity sweep tool
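The N:M skipping logic in the simulator can be sketched as follows (a hypothetical Python illustration, not the actual simulator code): each M-wide group of weights keeps only its nonzero values plus their indices, so a dot product for 2:4 sparsity touches 2 of every 4 elements.

```python
# Hedged sketch of N:M skipping in a matmul inner loop.

def compress_nm(row, m):
    """Per m-wide group, keep (index, value) pairs for the nonzeros."""
    groups = []
    for i in range(0, len(row), m):
        groups.append([(i + j, v)
                       for j, v in enumerate(row[i:i + m]) if v != 0])
    return groups

def nm_dot(compressed, x):
    """Dot product that only visits the stored nonzero entries."""
    return sum(v * x[idx] for group in compressed for idx, v in group)

w = [1.0, 0.0, 0.0, 2.0, 0.0, 3.0, 0.0, 0.0]   # 2:4-sparse weights
x = [1.0] * 8
print(nm_dot(compress_nm(w, 4), x))  # 6.0, same as the dense dot product
```

Sweeping matrix sizes and N:M ratios over a kernel like this is the kind of experiment the sweep tool above automates.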