il-mangia/Transcribeer


Transcribeer

The modern, multilingual audio-to-text transcription tool powered by Vosk.
Cross-platform for Windows and Linux, with a modern UI and optional AI-powered text enhancement (requires a Gemini API key).

Features · Download and Install · How It Works · Languages & Models · AI Enhancement · License


🚀 Features

🎧 Audio Transcription

  • Multi-format Support: MP3, WAV, M4A, OGG, FLAC
  • Automatic Conversion: Converts audio to optimal format for speech recognition
  • Real-time Progress: Live progress bar and status updates
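As a rough illustration of the automatic conversion step, the sketch below builds an FFmpeg command that resamples an input file to 16 kHz mono WAV. The function names `build_ffmpeg_command` and `convert_to_wav`, and the `_16k.wav` output naming, are hypothetical and not taken from Transcribeer's source.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Return an FFmpeg argument list that writes 16-bit mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input file in any format FFmpeg can decode
        "-ac", "1",               # downmix to a single channel (mono)
        "-ar", str(sample_rate),  # resample to the target rate
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM samples
        dst,
    ]


def convert_to_wav(src: str) -> Path:
    """Convert src into <name>_16k.wav next to it (requires FFmpeg on PATH)."""
    source = Path(src)
    dst = source.with_name(source.stem + "_16k.wav")
    subprocess.run(build_ffmpeg_command(src, str(dst)), check=True, capture_output=True)
    return dst
```

Keeping the command builder separate from the `subprocess.run` call makes the arguments easy to inspect or log without invoking FFmpeg.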

🌍 Multilingual Support

  • 8 Languages: Italian, English, French, German, Spanish, Portuguese, Russian, Chinese
  • Auto-language Detection: UI automatically matches your system language
  • Model Auto-download: Downloads speech recognition models on first use
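One plausible way to match the UI to the system language is to read the OS locale and fall back to English when it is not supported. This is only a sketch: `pick_ui_language` and `detect_ui_language` are hypothetical names, and the language codes are the eight listed in this README.

```python
import locale

# UI language codes as listed in this README (Chinese appears as "cn").
SUPPORTED_UI_LANGUAGES = ("it", "en", "fr", "de", "es", "pt", "ru", "cn")


def pick_ui_language(locale_name, default="en"):
    """Map a locale name such as 'it_IT' or 'zh-CN' to a supported UI code."""
    if not locale_name:
        return default
    code = locale_name.replace("-", "_").split("_")[0].lower()
    if code == "zh":  # POSIX locales use "zh" for Chinese
        code = "cn"
    return code if code in SUPPORTED_UI_LANGUAGES else default


def detect_ui_language():
    """Read the system locale; unknown or unparsable names fall back to English.

    Note: some Windows setups report names like 'Italian_Italy', which this
    simple parser does not recognize and therefore maps to the default.
    """
    return pick_ui_language(locale.getlocale()[0])
```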

🤖 AI-Powered Enhancement

  • Google Gemini Integration: Improves punctuation and text flow
  • Smart Punctuation: Automatic sentence capitalization and punctuation
  • Text Refinement: Enhances readability while preserving meaning

🎨 Modern Interface

  • Dark Theme: Modern UI that is easy on the eyes
  • CustomTkinter: Beautiful, customizable interface
  • Intuitive Workflow: Simple and user-friendly

⚡ Technical Features

  • Offline Capable: Works without internet (except AI features)
  • Cross-Platform: Windows and Linux support
  • Lightweight: Fast and efficient processing

📦 Download & Installation

Windows

Requirements:

  • Windows 10 or later
  • FFmpeg (required): winget install Gyan.FFmpeg

Installation:

  1. Download the latest version of Transcribeer: 👉 Latest Windows Release
  2. Install FFmpeg by opening Command Prompt or PowerShell and running: winget install Gyan.FFmpeg
  3. When prompted during the FFmpeg installation, press Y to confirm and wait for it to complete.
  4. Run Transcribeer as administrator.
  5. In Transcribeer, select a language model and download it.

Linux

Requirements:

  • Ubuntu/Debian: sudo apt install ffmpeg
  • Fedora: sudo dnf install ffmpeg
  • Arch: sudo pacman -S ffmpeg

Transcribeer: 👉 [Latest Linux Release] - No releases for now. Coming soon.
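After installing FFmpeg on either platform, a quick check that it is reachable on PATH can save some debugging. The helper names below are hypothetical; the sketch only uses the Python standard library.

```python
import shutil
import subprocess


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


def ffmpeg_version_line():
    """Return the first line of `ffmpeg -version`, or None when FFmpeg is missing."""
    exe = find_ffmpeg()
    if exe is None:
        return None
    out = subprocess.run([exe, "-version"], capture_output=True, text=True, check=True)
    lines = out.stdout.splitlines()
    return lines[0] if lines else None
```

Calling `ffmpeg_version_line()` from a Python shell returns the version banner when the install succeeded and `None` otherwise.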

From source code folder (dev only)


Extra information

Clone repository

git clone https://github.com/il-mangia/
cd Transcribeer

Install dependencies

pip install -r requirements.txt

Run application

python main.py

🧠 How It Works

  1. FFmpeg converts your audio to WAV 16 kHz mono
  2. Vosk (https://alphacephei.com/vosk/) transcribes and auto-detects the language
  3. The transcript is translated into Italian only
  4. Both original and translated text are shown
  5. You can save everything into a .txt file
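The transcription and save steps above can be sketched with the Vosk Python bindings as follows. This is a minimal illustration, not Transcribeer's actual code: `transcribe_wav` and `save_transcript` are hypothetical names, the input is assumed to be a WAV file already converted to 16-bit mono, and `model_dir` is assumed to point at an unpacked Vosk model folder. The translation step is not covered.

```python
import json
import wave


def transcribe_wav(wav_path, model_dir):
    """Transcribe a 16-bit mono WAV file with a local Vosk model directory."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM; convert with FFmpeg first")

        # Third-party dependency (pip install vosk), imported lazily so the
        # rest of this module works without it.
        from vosk import KaldiRecognizer, Model

        recognizer = KaldiRecognizer(Model(model_dir), wf.getframerate())
        pieces = []
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once a complete utterance is decoded.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result()).get("text", ""))
        # Flush whatever audio is still buffered in the recognizer.
        pieces.append(json.loads(recognizer.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)


def save_transcript(text, txt_path):
    """Write the transcript to a UTF-8 .txt file."""
    with open(txt_path, "w", encoding="utf-8") as fh:
        fh.write(text + "\n")
```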

🌍 Languages & UI

Supported languages:

  • 🇮🇹 Italian (it) - Vosk Small IT 0.22
  • 🇺🇸 English (en) - Vosk Small EN-US 0.15
  • 🇫🇷 French (fr) - Vosk Small FR 0.22
  • 🇩🇪 German (de) - Vosk Small DE-Zamia 0.3
  • 🇪🇸 Spanish (es) - Vosk Small ES 0.42
  • 🇵🇹 Portuguese (pt) - Vosk Small PT 0.3
  • 🇷🇺 Russian (ru) - Vosk Small RU 0.22
  • 🇨🇳 Chinese (cn) - Vosk Small CN 0.22
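The language list above maps naturally onto a lookup table used for model auto-download. The sketch below is an assumption-heavy illustration: the archive names are derived from the model labels in this list, the base URL is assumed to be the public Vosk model directory, each zip is assumed to unpack into a folder named after the model, and `model_url` and `ensure_model` are hypothetical helpers.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumed archive names, derived from the model labels in the list above.
VOSK_MODELS = {
    "it": "vosk-model-small-it-0.22",
    "en": "vosk-model-small-en-us-0.15",
    "fr": "vosk-model-small-fr-0.22",
    "de": "vosk-model-small-de-zamia-0.3",
    "es": "vosk-model-small-es-0.42",
    "pt": "vosk-model-small-pt-0.3",
    "ru": "vosk-model-small-ru-0.22",
    "cn": "vosk-model-small-cn-0.22",
}

# Assumption: models are published as zip archives under this base URL.
MODEL_BASE_URL = "https://alphacephei.com/vosk/models/"


def model_url(lang_code):
    """Return the assumed download URL for a language code such as 'it'."""
    if lang_code not in VOSK_MODELS:
        raise ValueError(f"unsupported language code: {lang_code!r}")
    return f"{MODEL_BASE_URL}{VOSK_MODELS[lang_code]}.zip"


def ensure_model(lang_code, models_dir):
    """Return the local model folder, downloading and unpacking it on first use."""
    url = model_url(lang_code)  # also validates lang_code
    target = Path(models_dir) / VOSK_MODELS[lang_code]
    if target.is_dir():
        return target  # already downloaded
    Path(models_dir).mkdir(parents=True, exist_ok=True)
    archive, _ = urllib.request.urlretrieve(url)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(models_dir)
    return target
```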

🤖 AI Enhancement

Transcribeer can enhance your transcripts using Google's Gemini AI. To enable this feature, follow these steps:

Get API Key:

  1. Visit the Google AI Studio.
  2. Log in to your Google Account and generate an API key.

Configuration in App:

  1. Click the "AI" button in the main interface
  2. Enter your API key when prompted
  3. The key will be saved locally for future use
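Saving the key locally could look roughly like the sketch below. The config location (`~/.transcribeer/config.json`), the `gemini_api_key` field, and the function names are all hypothetical; the README does not say where or how the app stores the key.

```python
import json
from pathlib import Path

# Hypothetical location; the real application may store the key elsewhere.
DEFAULT_CONFIG = Path.home() / ".transcribeer" / "config.json"


def save_api_key(key, path=DEFAULT_CONFIG):
    """Persist the API key as JSON, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"gemini_api_key": key.strip()}), encoding="utf-8")


def load_api_key(path=DEFAULT_CONFIG):
    """Return the stored key, or None if the file is missing, invalid, or empty."""
    path = Path(path)
    if not path.is_file():
        return None
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    return data.get("gemini_api_key") or None
```

A plain JSON file is readable by anyone with access to the user account, so an OS keyring would be a safer choice for a real deployment.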

What AI Improves:

  1. Punctuation: Adds proper periods, commas, question marks
  2. Capitalization: Correct sentence case
  3. Readability: Improves text flow and naturalness
  4. Preservation: Maintains original meaning and content
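To make the enhancement step concrete, here is a sketch of a request to the Gemini REST `generateContent` endpoint using only the standard library. The model name, the prompt wording, and the helper names are assumptions for illustration; Transcribeer's actual prompt and client library are not documented here, and the model name should be replaced with one currently available to your key.

```python
import json
import urllib.request

# Assumptions: illustrative model name and prompt, not the app's real ones.
GEMINI_MODEL = "gemini-1.5-flash"
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_prompt(transcript):
    """Ask for punctuation and capitalization fixes without changing the meaning."""
    return (
        "Add punctuation and sentence capitalization to the following transcript. "
        "Improve readability but do not add, remove, or change its meaning.\n\n"
        + transcript
    )


def build_request(transcript, api_key, model=GEMINI_MODEL):
    """Build (but do not send) the HTTP POST request for generateContent."""
    body = {"contents": [{"parts": [{"text": build_prompt(transcript)}]}]}
    return urllib.request.Request(
        ENDPOINT.format(model=model),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def enhance(transcript, api_key):
    """Send the request and return the first candidate's text (needs Internet)."""
    with urllib.request.urlopen(build_request(transcript, api_key), timeout=60) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"].strip()
```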

🔧 Development Setup (Source code folder)

Install dependencies:

pip install -r requirements.txt  

Run:

python main.py  

🧪 Supported Media Formats

  • MP3
  • WAV
  • AAC
  • M4A

(All are converted to WAV automatically.)
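A file picker could filter on these formats with a simple extension check. The set below mirrors the list in this section, and `is_supported` is a hypothetical helper.

```python
from pathlib import Path

# Extensions taken from the list in this section.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".aac", ".m4a"}


def is_supported(path):
    """Return True when the file extension (case-insensitive) is in the list."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```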

🧰 Tech Stack

  • Python 3
  • Vosk
  • CustomTkinter
  • FFmpeg
  • Google Gemini API

📝 Known Limitations (Important)

  • AI function requires Internet.
  • GPU acceleration planned for future versions.
  • Local translation model planned.

❤️ License

Open source.


Built with ❤️ by Il-Mangia. Powered by Vosk.

About

Transcribeer is a fast, multilingual audio-to-text tool powered by Vosk. It converts MP3, WAV, M4A and AAC files into accurate transcripts and instantly translates them into 10+ languages. Clean UI, automatic system language detection, and professional-grade results.
