Transcribeer

A modern, multilingual audio-to-text transcription tool powered by Vosk.
Cross-platform for Windows and Linux, with a modern UI and AI-powered text enhancement (requires a Google Gemini API key).

Features • Download and Install • How It Works • Languages & Models • AI Enhancement • License

🚀 Features

🎧 Audio Transcription

Multi-format Support: MP3, WAV, M4A, OGG, FLAC
Automatic Conversion: Converts audio to optimal format for speech recognition
Real-time Progress: Live progress bar and status updates

🌍 Multilingual Support

8 Languages: Italian, English, French, German, Spanish, Portuguese, Russian, Chinese
Auto-language Detection: UI automatically matches your system language
Model Auto-download: Downloads speech recognition models on first use

🤖 AI-Powered Enhancement

Google Gemini Integration: Improves punctuation and text flow
Smart Punctuation: Automatic sentence capitalization and punctuation
Text Refinement: Enhances readability while preserving meaning

🎨 Modern Interface

Dark Theme: Modern UI that is easy on the eyes
CustomTkinter: Beautiful, customizable interface
Intuitive Workflow: Simple and user-friendly

⚡ Technical Features

Offline Capable: Works without internet (except AI features)
Cross-Platform: Windows and Linux support
Lightweight: Fast and efficient processing

📦 Download & Installation

Windows

Requirements:

Windows 10 or later
FFmpeg, which must be installed separately (winget install Gyan.FFmpeg)

Installation:

Download the latest version of Transcribeer: 👉 Latest Windows Release
Install FFmpeg by opening Command Prompt or PowerShell and running: winget install Gyan.FFmpeg
When winget asks for confirmation, press Y and wait for the installation to finish.
Run Transcribeer as administrator.
In Transcribeer, select a language model and download it.

Linux

Requirements:

Ubuntu/Debian: sudo apt install ffmpeg
Fedora: sudo dnf install ffmpeg
Arch: sudo pacman -S ffmpeg
Transcribeer: 👉 [Latest Linux Release] (no releases yet; coming soon)
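Since FFmpeg is a separate requirement on both platforms, it can be handy to confirm that it is reachable before starting a transcription. The snippet below is a minimal sketch of such a check using only the Python standard library; the function names are hypothetical and are not taken from the Transcribeer source.

```python
import shutil
import subprocess


def find_ffmpeg(which=shutil.which):
    """Return the full path of the ffmpeg executable, or None if it is not on PATH.

    The `which` parameter exists only so the lookup can be replaced in tests.
    """
    return which("ffmpeg")


def ffmpeg_version(path):
    """Return the first line printed by `ffmpeg -version` for the given executable."""
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[0]
```

If `find_ffmpeg()` returns `None`, FFmpeg is either not installed or not on `PATH`; on Windows, opening a new terminal after the winget installation is usually needed before the updated `PATH` is visible.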

From source code folder (dev only)

Extra information

Clone repository

git clone https://github.com/il-mangia/
cd Transcribeer

Install dependencies

pip install -r requirements.txt

Run application

python main.py

🧠 How It Works

FFmpeg converts your audio to WAV 16 kHz mono
Vosk (https://alphacephei.com/vosk/) transcribes and auto-detects the language
The transcript is translated into Italian (the only target language)
Both original and translated text are shown
You can save everything into a .txt file
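The conversion, transcription, and save steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the application's actual code: the function names, the default file names, and the chunk size are hypothetical, and the language selection and translation steps are left out because this README does not say how they are implemented. It assumes `ffmpeg` is on `PATH` and that the `vosk` package and an unpacked model folder are available.

```python
import json
import subprocess
import wave


def build_ffmpeg_cmd(src, dst):
    """Build an ffmpeg command that converts src to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite dst if it already exists
        "-i", src,             # input file in any format ffmpeg can decode
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to mono
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM samples
        dst,
    ]


def transcribe(wav_path, model_dir):
    """Transcribe a mono PCM WAV file with the Vosk model stored in model_dir."""
    from vosk import Model, KaldiRecognizer  # third-party: pip install vosk

    pieces = []
    with wave.open(wav_path, "rb") as wf:
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)  # chunk size is an arbitrary choice
            if not data:
                break
            # AcceptWaveform returns True when a full utterance was recognized
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result()).get("text", ""))
        pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)


def transcribe_file(src, model_dir, wav_path="converted.wav", out_path="transcript.txt"):
    """Convert src to WAV, transcribe it, and save the text to out_path."""
    subprocess.run(build_ffmpeg_cmd(src, wav_path), check=True)
    text = transcribe(wav_path, model_dir)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(text)
    return text
```

Keeping the command construction in its own function makes the FFmpeg arguments easy to inspect without running a conversion.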

🌍 Languages & UI

Supported languages:

🇮🇹 Italian (it) - Vosk Small IT 0.22
🇺🇸 English (en) - Vosk Small EN-US 0.15
🇫🇷 French (fr) - Vosk Small FR 0.22
🇩🇪 German (de) - Vosk Small DE-Zamia 0.3
🇪🇸 Spanish (es) - Vosk Small ES 0.42
🇵🇹 Portuguese (pt) - Vosk Small PT 0.3
🇷🇺 Russian (ru) - Vosk Small RU 0.22
🇨🇳 Chinese (cn) - Vosk Small CN 0.22
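For readers working from source, the table above could be represented as a simple lookup from language code to model name, with a helper that downloads a model on first use. This is a sketch under assumptions: the archive names and the base URL follow the naming used on the Vosk models page and are not confirmed by this README, and `ensure_model` and the `models` folder are hypothetical names.

```python
import os
import urllib.request
import zipfile

# Assumed archive names, derived from the model versions listed above.
MODELS = {
    "it": "vosk-model-small-it-0.22",
    "en": "vosk-model-small-en-us-0.15",
    "fr": "vosk-model-small-fr-0.22",
    "de": "vosk-model-small-de-zamia-0.3",
    "es": "vosk-model-small-es-0.42",
    "pt": "vosk-model-small-pt-0.3",
    "ru": "vosk-model-small-ru-0.22",
    "cn": "vosk-model-small-cn-0.22",
}

# Assumed download location; check the Vosk models page before relying on it.
BASE_URL = "https://alphacephei.com/vosk/models/"


def model_url(lang):
    """Return the assumed download URL for the model of the given language code."""
    return f"{BASE_URL}{MODELS[lang]}.zip"


def ensure_model(lang, models_dir="models"):
    """Return the local model folder, downloading and unpacking it if missing."""
    target = os.path.join(models_dir, MODELS[lang])
    if os.path.isdir(target):
        return target  # already downloaded
    os.makedirs(models_dir, exist_ok=True)
    archive = target + ".zip"
    urllib.request.urlretrieve(model_url(lang), archive)
    # Assumes the archive contains one top-level folder named after the model.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(models_dir)
    os.remove(archive)
    return target
```

An unknown language code raises `KeyError`, which keeps typos from silently falling back to the wrong model.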

🤖 AI Enhancement

Transcribeer can enhance your transcripts using Google's Gemini AI. To enable this feature, follow these steps:

Get API Key:

Visit the Google AI Studio.
Log in with your Google Account and generate an API key.

Configuration in App:

Click the "AI" button in the main interface
Enter your API key when prompted
The key is saved locally for future use
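How the key is stored is not described here. One simple way to persist it locally is a small JSON file, sketched below; the file layout, the `gemini_api_key` field name, and both function names are hypothetical and may differ from what the application does.

```python
import json
from pathlib import Path


def save_api_key(key, path):
    """Write the API key to a JSON file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"gemini_api_key": key}), encoding="utf-8")


def load_api_key(path):
    """Return the stored API key, or None if no key has been saved yet."""
    path = Path(path)
    if not path.exists():
        return None
    data = json.loads(path.read_text(encoding="utf-8"))
    return data.get("gemini_api_key")
```

A key stored this way is plain text, so the file should live in a user-only location and stay out of version control.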

What AI Improves:

Punctuation: Adds proper periods, commas, question marks
Capitalization: Correct sentence case
Readability: Improves text flow and naturalness
Preservation: Maintains original meaning and content
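The goals listed above map naturally onto a prompt sent to Gemini. The sketch below shows one possible shape of such a call; it assumes the `google-generativeai` Python package, and the prompt wording, the function names, and the caller-supplied `model_name` are illustrative rather than taken from the Transcribeer source.

```python
def build_prompt(transcript):
    """Build an instruction prompt asking for punctuation and flow fixes only."""
    return (
        "Add punctuation and sentence capitalization to the transcript below "
        "and improve its flow. Do not add, remove, or change the meaning of "
        "any content. Return only the corrected text.\n\n"
        f"Transcript:\n{transcript}"
    )


def enhance(transcript, api_key, model_name):
    """Send the transcript to a Gemini model and return the refined text.

    model_name is supplied by the caller; this sketch does not assume which
    Gemini model the application uses.
    """
    import google.generativeai as genai  # third-party: pip install google-generativeai

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel(model_name)
    response = model.generate_content(build_prompt(transcript))
    return response.text.strip()
```

Because this step needs a network connection and a valid key, the raw Vosk transcript remains the fallback when the call fails.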

🔧 Development Setup (Source code folder)

Install dependencies:

pip install -r requirements.txt

Run:

python main.py

🧪 Supported Media Formats

MP3
WAV
AAC
M4A

(All are converted to WAV automatically.)
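A file picker or command-line wrapper might validate the extension before handing a file to FFmpeg. The following is a minimal sketch using the four formats listed in this section (the Features section also mentions OGG and FLAC); the constant and function names are hypothetical.

```python
from pathlib import Path

# Extensions taken from the list in this section.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".aac", ".m4a"}


def is_supported(path):
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check is case-insensitive, so `Talk.MP3` and `talk.mp3` are treated the same way.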

🧰 Tech Stack

Python 3
Vosk
CustomTkinter
FFmpeg
Google Gemini API

📝 Known Limitations (Important)

The AI enhancement feature requires an Internet connection.
GPU acceleration is planned for future versions.
A local translation model is planned.

❤️ License

Open source.

Built with ❤️ by Il-Mangia. Powered by Vosk.
