- 9 visualization modes: spectrogram, mel, chroma, hpss, selfsim, loudness, tempogram, mfcc, flux
- 6 color palettes: classic, magma, inferno, viridis, gray, claw
- Auto-contrast: per-panel percentile normalization for readable heatmaps
- Combine modes: stack multiple visualizations in one grid image
- Universal input: WAV, MP3, or anything ffmpeg can handle
- Fast: native Go, no Python dependencies
- Flexible output: PNG or JPEG, customizable dimensions
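
The auto-contrast bullet above mentions per-panel percentile normalization. The README does not say which percentiles or which interpolation songsee uses, so the following is only a minimal sketch of the general technique. The function names `percentile` and `normalizePanel` and the nearest-rank rule are assumptions for illustration, not songsee's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the p-th percentile (0..100) of vals using a
// nearest-rank lookup on a sorted copy. vals must be non-empty.
// Hypothetical helper, not taken from songsee.
func percentile(vals []float64, p float64) float64 {
	sorted := append([]float64(nil), vals...)
	sort.Float64s(sorted)
	idx := int(p/100*float64(len(sorted)-1) + 0.5)
	return sorted[idx]
}

// normalizePanel rescales one panel so the loPct percentile maps to 0
// and the hiPct percentile maps to 1, clamping anything outside.
// A few extreme values therefore cannot wash out the whole heatmap.
func normalizePanel(vals []float64, loPct, hiPct float64) []float64 {
	out := make([]float64, len(vals))
	if len(vals) == 0 {
		return out
	}
	lo := percentile(vals, loPct)
	hi := percentile(vals, hiPct)
	if hi <= lo {
		return out // flat panel: leave everything at 0
	}
	for i, v := range vals {
		t := (v - lo) / (hi - lo)
		if t < 0 {
			t = 0
		} else if t > 1 {
			t = 1
		}
		out[i] = t
	}
	return out
}

func main() {
	// One outlier (1000) would dominate a plain min/max scaling.
	panel := []float64{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1000}
	fmt.Println(normalizePanel(panel, 0, 90))
}
```

Normalizing each panel separately means a quiet loudness curve and a dense spectrogram can share one grid image without either looking flat.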
brew install steipete/tap/songsee
go install github.com/steipete/songsee/cmd/songsee@latest

Docker:
docker build -t songsee .
docker run --rm -v "$PWD:/input:ro" -v "$PWD/out:/output" songsee /input/track.mp3 --output /output/track.png

The image includes ffmpeg, so batch and server runs do not depend on host audio tooling.
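
For batch runs, one option is a small wrapper that calls the `songsee` binary once per file. The sketch below assumes `songsee` is on `PATH` and uses only flags listed in this README (`--viz`, `--style`, `--format`, `--output`). The helper name `songseeArgs`, the `*.mp3` glob, and the default directories are illustrative assumptions.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// songseeArgs builds the argument list for one input file.
// Hypothetical helper; it only assembles flags documented in this README.
func songseeArgs(input, outDir string, viz []string, style string) []string {
	base := strings.TrimSuffix(filepath.Base(input), filepath.Ext(input))
	return []string{
		input,
		"--viz", strings.Join(viz, ","),
		"--style", style,
		"--format", "png",
		"--output", filepath.Join(outDir, base+".png"),
	}
}

func main() {
	// Defaults are assumptions; pass <input-dir> <output-dir> to override.
	inDir, outDir := ".", "out"
	if len(os.Args) >= 3 {
		inDir, outDir = os.Args[1], os.Args[2]
	}
	inputs, err := filepath.Glob(filepath.Join(inDir, "*.mp3"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := os.MkdirAll(outDir, 0o755); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, in := range inputs {
		cmd := exec.Command("songsee", songseeArgs(in, outDir, []string{"mel", "chroma"}, "magma")...)
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			// Keep going so one bad file does not stop the batch.
			fmt.Fprintf(os.Stderr, "%s: %v\n", in, err)
		}
	}
}
```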
# Basic spectrogram
songsee track.mp3
# Mel spectrogram with magma palette
songsee track.mp3 --viz mel --style magma
# All 9 modes combined
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux
# Custom output
songsee track.mp3 --viz hpss,chroma --style inferno -o viz.png --width 2560 --height 1440

| Mode | Description |
|---|---|
| spectrogram | Time × frequency magnitude |
| mel | Perceptual frequency scale |
| chroma | 12-bin pitch class |
| hpss | Harmonic vs percussive separation |
| selfsim | Self-similarity matrix |
| loudness | Volume over time |
| tempogram | Tempo variation |
| mfcc | Timbre fingerprint |
| flux | Spectral change detection |
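
To make "perceptual frequency scale" concrete for the mel mode, here is a short sketch of a Hz-to-mel mapping. It uses the common HTK-style formula as an assumption, since the README does not state which mel variant songsee implements. The function names are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// hzToMel converts a frequency in Hz to mels (HTK-style formula,
// assumed here; songsee may use a different variant).
func hzToMel(hz float64) float64 {
	return 2595 * math.Log10(1+hz/700)
}

// melToHz is the inverse of hzToMel.
func melToHz(mel float64) float64 {
	return 700 * (math.Pow(10, mel/2595) - 1)
}

// melBandEdges returns n+2 band edges in Hz that are evenly spaced on
// the mel scale between minHz and maxHz (n bands share adjacent edges).
func melBandEdges(minHz, maxHz float64, n int) []float64 {
	lo, hi := hzToMel(minHz), hzToMel(maxHz)
	edges := make([]float64, n+2)
	for i := range edges {
		edges[i] = melToHz(lo + (hi-lo)*float64(i)/float64(n+1))
	}
	return edges
}

func main() {
	// Edges bunch together at low frequencies and spread out higher up.
	for _, e := range melBandEdges(0, 8000, 8) {
		fmt.Printf("%.0f Hz\n", e)
	}
}
```

Equal steps in mels give narrow bands at low frequencies and wide bands at high frequencies, which is why a mel panel devotes more vertical space to bass and midrange than a linear spectrogram does.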
classic · magma · inferno · viridis · gray · claw
--output Output path (default: input name + extension)
--format jpg or png (default: jpg)
--width Output width (default: 1920)
--height Output height (default: 1080)
--window FFT window size (default: 2048)
--hop Hop size (default: 512)
--min-freq Minimum frequency in Hz
--max-freq Maximum frequency in Hz
--start Start time in seconds
--duration Duration in seconds
--style Palette name
--viz Visualization list (repeatable or comma-separated)
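
The `--window` and `--hop` defaults above set the time and frequency resolution of the analysis. The sketch below works through the standard short-time Fourier transform arithmetic for those defaults. The 44100 Hz sample rate and the no-padding frame count are assumptions (the README specifies neither), and `stftLayout` and `layout` are hypothetical names.

```go
package main

import "fmt"

// stftLayout summarizes the grid an STFT produces for given settings.
type stftLayout struct {
	Frames int     // number of analysis frames (time axis)
	Bins   int     // number of frequency bins (frequency axis)
	BinHz  float64 // width of one frequency bin in Hz
	HopSec float64 // time step between frames in seconds
}

// layout computes the grid for a signal of the given length, assuming
// no padding: only frames that fit entirely inside the signal count.
func layout(samples, sampleRate, window, hop int) stftLayout {
	frames := 0
	if samples >= window {
		frames = 1 + (samples-window)/hop
	}
	return stftLayout{
		Frames: frames,
		Bins:   window/2 + 1,
		BinHz:  float64(sampleRate) / float64(window),
		HopSec: float64(hop) / float64(sampleRate),
	}
}

func main() {
	// 10 seconds of audio at an assumed 44100 Hz, README defaults.
	l := layout(10*44100, 44100, 2048, 512)
	fmt.Printf("frames=%d bins=%d bin=%.2f Hz hop=%.2f ms\n",
		l.Frames, l.Bins, l.BinHz, l.HopSec*1000)
}
```

A larger window narrows each frequency bin but blurs timing, while a smaller hop adds more frames per second at the cost of more computation.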
Built by @steipete
