The modern, multilingual audio-to-text transcription tool powered by Vosk.
Cross-platform for Windows and Linux, with a polished UI and optional AI-powered text enhancement (Gemini API key required).
Features • Download and Install • How It Works • Languages & Models • AI Enhancement • License
- Multi-format Support: MP3, WAV, M4A, OGG, FLAC
- Automatic Conversion: Converts audio to optimal format for speech recognition
- Real-time Progress: Live progress bar and status updates
- 8 Languages: Italian, English, French, German, Spanish, Portuguese, Russian, Chinese
- Auto-language Detection: UI automatically matches your system language
- Model Auto-download: Downloads speech recognition models on first use
- Google Gemini Integration: Improves punctuation and text flow
- Smart Punctuation: Automatic sentence capitalization and punctuation
- Text Refinement: Enhances readability while preserving meaning
- Dark Theme: Modern UI that is easy on the eyes
- CustomTkinter: Beautiful, customizable interface
- Intuitive Workflow: Simple and user-friendly
- Offline Capable: Works without internet (except AI features)
- Cross-Platform: Windows and Linux support
- Lightweight: Fast and efficient processing
Requirements:
- Windows 10 or later
- Important: FFmpeg is required (winget install Gyan.FFmpeg)
Installation:
- Install the latest version of Transcribeer: 👉 Latest Windows Release
- Install FFmpeg by opening Command Prompt or PowerShell and running: winget install Gyan.FFmpeg
- When winget prompts you, press Y to confirm and wait for the installation to complete.
- Run Transcribeer as administrator.
- In Transcribeer, select a language model and download it.
Requirements:
- Ubuntu/Debian: sudo apt install ffmpeg
- Fedora: sudo dnf install ffmpeg
- Arch: sudo pacman -S ffmpeg
- Transcribeer: 👉 [Latest Linux Release] - No releases for now. Coming soon.
git clone https://github.com/il-mangia/
cd Transcribeer
pip install -r requirements.txt
python main.py
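Since both platforms depend on FFmpeg being installed, a quick check that it is reachable on your PATH can save a failed first run. The snippet below is a minimal sketch using only the Python standard library; the helper name `find_ffmpeg` is hypothetical and is not part of Transcribeer.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("FFmpeg not found. Install it with the command for your platform.")
    else:
        print(f"FFmpeg found at: {path}")
```

If the check reports that FFmpeg is missing right after installing it, opening a new terminal usually picks up the updated PATH.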
- FFmpeg converts your audio to WAV 16 kHz mono
- Vosk (https://alphacephei.com/vosk/) transcribes and auto-detects the language
- The transcript is translated into Italian only
- Both original and translated text are shown
- You can save everything into a .txt file
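The first two steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Transcribeer's actual code: the function names are hypothetical, the FFmpeg flags are one common way to get 16 kHz mono 16-bit PCM, and the Vosk model is assumed to be already downloaded and unpacked into `model_dir`.

```python
import json
import subprocess
import wave


def build_ffmpeg_command(src, dst):
    """Build an FFmpeg command that writes a 16 kHz, mono, 16-bit PCM WAV file."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,             # input file in any format FFmpeg can decode
        "-ar", "16000",        # sample rate: 16 kHz
        "-ac", "1",            # channels: mono
        "-c:a", "pcm_s16le",   # codec: 16-bit little-endian PCM
        dst,
    ]


def convert_to_wav(src, dst):
    """Run FFmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)


def transcribe(wav_path, model_dir):
    """Transcribe a 16 kHz mono WAV file with the Vosk model stored in model_dir."""
    # Imported lazily so the rest of the module works without Vosk installed.
    from vosk import KaldiRecognizer, Model

    pieces = []
    with wave.open(wav_path, "rb") as wf:
        recognizer = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns true when an utterance is complete.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
        pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

A typical call would be `convert_to_wav("talk.mp3", "talk.wav")` followed by `transcribe("talk.wav", "path/to/model")`, where both paths are placeholders.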
Supported languages:
- 🇮🇹 Italian (it) - Vosk Small IT 0.22
- 🇺🇸 English (en) - Vosk Small EN-US 0.15
- 🇫🇷 French (fr) - Vosk Small FR 0.22
- 🇩🇪 German (de) - Vosk Small DE-Zamia 0.3
- 🇪🇸 Spanish (es) - Vosk Small ES 0.42
- 🇵🇹 Portuguese (pt) - Vosk Small PT 0.3
- 🇷🇺 Russian (ru) - Vosk Small RU 0.22
- 🇨🇳 Chinese (cn) - Vosk Small CN 0.22
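For scripting or troubleshooting a manual download, the table above can be expressed as a lookup from language code to model archive. This is a sketch based on assumptions: the archive names are inferred from the display names above following the naming used on the Vosk models page, and the base URL is assumed to be the public Vosk download location. Transcribeer's own download logic may differ.

```python
# Language code -> assumed Vosk archive name (inferred from the list above).
VOSK_MODELS = {
    "it": "vosk-model-small-it-0.22",
    "en": "vosk-model-small-en-us-0.15",
    "fr": "vosk-model-small-fr-0.22",
    "de": "vosk-model-small-de-zamia-0.3",
    "es": "vosk-model-small-es-0.42",
    "pt": "vosk-model-small-pt-0.3",
    "ru": "vosk-model-small-ru-0.22",
    "cn": "vosk-model-small-cn-0.22",
}

# Assumed download location; verify against the Vosk models page.
MODEL_BASE_URL = "https://alphacephei.com/vosk/models/"


def model_url(lang_code):
    """Return the assumed download URL for the model of the given language code."""
    try:
        name = VOSK_MODELS[lang_code]
    except KeyError:
        raise ValueError(f"Unsupported language code: {lang_code!r}") from None
    return f"{MODEL_BASE_URL}{name}.zip"
```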
Transcribeer can enhance your transcripts using Google's Gemini AI. To enable this feature, follow these steps:
- Visit the Google AI Studio.
- Log in to your Google account and generate an API key.
- Click the "AI" button in the main interface.
- Enter your API key when prompted.
- The key is saved locally for future use.
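Saving the key locally could look something like the sketch below. The file layout and the field name `gemini_api_key` are assumptions made for illustration, and the helper names are hypothetical. Where and how Transcribeer actually stores the key is not specified here.

```python
import json
from pathlib import Path


def save_api_key(key, path):
    """Write the API key to a small JSON file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"gemini_api_key": key}), encoding="utf-8")


def load_api_key(path):
    """Return the stored API key, or None if no key has been saved yet."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8")).get("gemini_api_key")
```

A key stored this way is plain text, so treat the file like a password and keep it out of version control.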
- Punctuation: Adds proper periods, commas, question marks
- Capitalization: Correct sentence case
- Readability: Improves text flow and naturalness
- Preservation: Maintains original meaning and content
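The four goals above map naturally onto a single instruction sent to Gemini. The following is a hedged sketch, not the prompt Transcribeer uses: the prompt wording, the function names, and the default model name `gemini-1.5-flash` are all assumptions, and the call relies on the `google-generativeai` Python package being installed.

```python
def build_prompt(transcript):
    """Build an instruction covering punctuation, capitalization, flow, and fidelity."""
    return (
        "Add punctuation and sentence capitalization to the transcript below. "
        "Improve readability, but do not add, remove, or change its meaning. "
        "Return only the corrected text.\n\n"
        f"Transcript:\n{transcript}"
    )


def enhance(transcript, api_key, model_name="gemini-1.5-flash"):
    """Send the transcript to Gemini and return the enhanced text (needs internet)."""
    # Imported lazily so build_prompt works without the package installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_prompt(transcript))
    return response.text.strip()
```

Keeping the prompt in its own function makes it easy to inspect or adjust the wording without touching the network call.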
Install dependencies:
pip install -r requirements.txt
Run:
python main.py
- MP3
- WAV
- AAC
- M4A
(All are converted to WAV automatically.)
- Python 3
- Vosk
- CustomTkinter
- FFmpeg
- Google Gemini API
- The AI features require an internet connection.
- GPU acceleration planned for future versions.
- Local translation model planned.
OPEN SOURCE!!!!!!!!!
Built with ❤️ by Il-Mangia. Powered by Vosk.
