Releases · nothingmn/echonotes
v1.2.0
EchoNotes v1.2.0
This release turns EchoNotes into a more capable long-running transcription worker and upgrades the Obsidian output into a structured note format.
Highlights
- Migrated the audio pipeline from `openai-whisper` to WhisperX and changed the runtime so transcription models load once and stay hot inside the worker process.
- Refactored the app into a queue-backed worker architecture with a worker pool, so the watcher stays responsive while heavy transcription and summarization run in the background.
- Hardened ingestion:
- waits for files to become stable before processing
- ignores temporary download files
- gracefully skips encrypted PDFs and missing-OCR-dependency cases
- periodically rescans `/app/incoming` so Syncthing and Windows-backed mounts still get picked up even when filesystem events are missed
- Expanded ingest support by normalizing many common audio formats through FFmpeg before transcription.
- Added provider-based LLM support with YAML configuration (see the sketch after this list) for:
- Open WebUI
- Ollama
- OpenAI
- Anthropic / Claude
- OpenRouter
- Added chunked transcript formatting and summarization for long meetings so large inputs do not overflow model context.
- Added Obsidian vault export:
- audio/video notes now produce vault-ready markdown, transcript, summary, and final MP3 artifacts
- timestamped transcript links are generated in Obsidian-friendly format
- Added optional WhisperX diarization so transcripts can include `Speaker 1`, `Speaker 2`, and so on.
- Added GPU runtime hardening:
- WhisperX/PyTorch compatibility fallback for newer `torch.load` behavior
- markdown fence cleanup for LLM outputs
- adaptive CUDA OOM retry with smaller batch sizes and optional CPU fallback per file
- Refactored Docker delivery:
- separate CPU and CUDA images
- mounted runtime layout for `incoming`, `vault`, `config`, and optional `model-cache`
- warm-only model-cache workflow in `build.sh`
- Upgraded Obsidian note generation:
- structured JSON extraction prompt
- deterministic Python rendering of final notes
- richer front matter and linked entity sections
- shipped default `obsidian-template.md` and `obsidian-extract.md`
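
As a rough illustration of the provider-based YAML configuration mentioned above, a minimal sketch might look like the following. The key names (`llm_provider`, `model`, `base_url`, `api_key`) and values are assumptions for illustration, not the shipped schema; check the bundled config for the actual field names.

```yaml
# Hypothetical provider configuration sketch -- key names are illustrative,
# not the project's documented schema.
llm_provider: ollama                 # open-webui | ollama | openai | anthropic | openrouter
model: llama3.1                      # example model name for the chosen provider
base_url: "http://localhost:11434"   # local endpoint, e.g. for Ollama or Open WebUI
api_key: ""                          # required for hosted providers such as OpenAI or OpenRouter
```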
Docker Images
This release publishes:
- `robchartier/echonotes:1.2.0`
- `robchartier/echonotes:1.2.0-cuda12.8`
- `robchartier/echonotes:latest`
- `robchartier/echonotes:latest-cuda12.8`
- `robchartier/echonotes:gpu`
Operational Notes
- For GPU systems, keep `worker_count: 1` unless you have explicitly verified VRAM headroom for more.
- If your files arrive through Syncthing or Windows-backed mounts, EchoNotes now has a periodic fallback scan so missed filesystem events do not strand files.
- If you mount `/app/model-cache`, WhisperX, alignment, and diarization assets can be reused across container restarts (see the compose sketch below).
- Timestamp links inside Obsidian work best with a media plugin such as Media Extended.
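
A minimal docker-compose sketch for the CUDA image, assuming the mount points described in this release. `/app/incoming` and `/app/model-cache` appear in these notes; the `/app/vault` and `/app/config` container paths and the GPU reservation block are illustrative assumptions, not the project's official compose file.

```yaml
# Illustrative compose sketch -- adjust paths to your host layout.
services:
  echonotes:
    image: robchartier/echonotes:1.2.0-cuda12.8
    volumes:
      - ./incoming:/app/incoming          # watched ingest folder
      - ./vault:/app/vault                # assumed vault mount point
      - ./config:/app/config              # assumed config mount point
      - ./model-cache:/app/model-cache    # optional: reuse WhisperX/alignment/diarization assets
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```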
v1.0.2
EchoNotes v1.0.2
This release focuses on ingestion reliability and long-running GPU stability.
Highlights
- Added a periodic fallback rescan for `/app/incoming` so files are still queued when filesystem events are missed on Syncthing, Windows-backed bind mounts, or other unreliable mounted paths.
- Added a pending-job registry so watcher events and periodic rescans do not enqueue the same file multiple times.
- Added adaptive WhisperX GPU OOM handling:
- retries with progressively smaller batch sizes on CUDA out-of-memory
- clears CUDA memory between retries
- optionally falls back to CPU for that file instead of wedging the queue
- Added new runtime knobs for transcription memory behavior (see the config sketch after this list):
- `whisper_batch_size`
- `whisper_min_batch_size`
- `gpu_oom_fallback`
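
A sketch of how these knobs might sit in the YAML config; the knob names come from this release, but the values and types shown are illustrative guesses, not documented defaults.

```yaml
# Illustrative values -- tune to your GPU's VRAM.
worker_count: 1            # keep at 1 on GPU unless VRAM headroom is verified
whisper_batch_size: 16     # starting WhisperX batch size
whisper_min_batch_size: 2  # smallest batch size tried on repeated CUDA OOM
gpu_oom_fallback: true     # fall back to CPU for the affected file instead of wedging the queue
```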
Docker Images
This release publishes:
- `robchartier/echonotes:1.0.2`
- `robchartier/echonotes:1.0.2-cuda12.8`
- `robchartier/echonotes:latest`
- `robchartier/echonotes:latest-cuda12.8`
- `robchartier/echonotes:gpu`
Operational Notes
- For GPU systems, keep `worker_count: 1` unless you have explicitly verified VRAM headroom for multiple concurrent WhisperX workers.
- If files are arriving through Syncthing or Windows-backed mounts, the new rescan loop should pick them up even when Docker does not surface create/move events into the container.
- If very large audio still pressures VRAM, EchoNotes now degrades per file instead of failing the whole queue. You can tune this behavior with `whisper_batch_size`, `whisper_min_batch_size`, and `gpu_oom_fallback`.
v1.0.1
EchoNotes v1.0.1
This release packages the work from the last development stretch into the first stable Dockerized worker release of EchoNotes.
Highlights
- Switched audio transcription from Whisper to WhisperX.
- Added persistent model loading so ASR models are loaded once and reused.
- Refactored the app into a queue-backed worker pool instead of processing inline in the watcher.
- Added transcript formatting, chunked summarization, and provider-based LLM support for Open WebUI, Ollama, OpenAI, Anthropic/Claude, and OpenRouter.
- Added Obsidian vault export with linked MP3 timestamp references and speaker-aware transcript output.
- Added WhisperX diarization support for `Speaker 1`, `Speaker 2`, and so on.
- Expanded audio ingest to common FFmpeg-readable formats and normalized them to MP3 before transcription.
- Hardened file ingestion against partial uploads, temporary files, encrypted PDFs, and OCR dependency failures.
- Added CPU and CUDA Docker image variants with mounted config, incoming, vault, and model-cache directories.
- Added warm-cache tooling for WhisperX models and documented the Docker build/runtime flow.
Docker Images
This release publishes:
- `robchartier/echonotes:1.0.1`
- `robchartier/echonotes:1.0.1-cuda12.8`
- `robchartier/echonotes:latest`
- `robchartier/echonotes:latest-cuda12.8`
- `robchartier/echonotes:gpu`
Operational Notes
- GPU deployments should generally use `worker_count: 1` unless you have verified VRAM headroom for multiple concurrent WhisperX workers.
- WhisperX diarization depends on a Hugging Face token and will fall back cleanly if diarization is not configured (see the config sketch below).
- Files copied into Windows-backed bind mounts may not always emit reliable filesystem events into Docker; EchoNotes now queues files already present at startup, but Linux-side writes remain the most reliable path.
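
Since diarization depends on a Hugging Face token, the relevant configuration might look roughly like this; only the token requirement is stated in these notes, and the key names (`diarization_enabled`, `hf_token`) are assumptions for illustration.

```yaml
# Hypothetical diarization settings -- key names are illustrative.
diarization_enabled: true
hf_token: "hf_..."   # Hugging Face access token used by the diarization pipeline
```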
QA Notes
Validated against the current Testing/ corpus on the CUDA image:
- Standard PDF, DOCX, TXT, FLAC, and WAV samples completed successfully.
- Encrypted PDFs were skipped gracefully without crashing the worker.
- Short audio files produced transcript, summary, Obsidian note, and vault copies correctly.
Known issues from QA:
- Very large audio can still hit GPU memory limits depending on model choice and VRAM.
- Very large text inputs can take long enough to block a single-worker queue under slow local LLMs.