The Whisper base model produces empty or inconsistent transcriptions on Hailo-8L hardware, while the system appears to be correctly configured and operational.
Testing Performed
1. Installation & Setup
✅ Ran python3 setup.py successfully
✅ Downloaded all required HEF files for Hailo-8L using download_resources.py
✅ Re-downloaded fresh HEF files to rule out corruption
✅ All dependencies installed correctly in virtual environment
Example successful transcription (1 out of 10+ attempts): "testing 123"
Example partial transcription: "is a 5 2" (from "This is a 5 second recording")
Majority of attempts: Empty string '' returned from decoder
Sample Output:
Audio loaded: 78674 samples, max level: 0.9793
After preprocessing: start_time=1.2, audio length: 78674 samples
Chunk offset: 1.00s
Raw transcription: ' is a 5 2'
Cleaned transcription: 'is a 5 2.'
Then subsequent recordings:
Audio loaded: 78535 samples, max level: 0.3499
After preprocessing: start_time=1.0, audio length: 78535 samples
Chunk offset: 0.50s
Raw transcription: ''
Cleaned transcription: '.'
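To put a number on "majority of attempts", the captured console output from several runs could be tallied with something like the sketch below. It assumes the app keeps printing lines in exactly the `Raw transcription: '...'` form shown above; the function name `tally_transcriptions` is made up for illustration and is not part of the Hailo example code.

```python
import re

# Matches lines such as: Raw transcription: ' is a 5 2'
# (format copied from the sample output; assumed to be stable across runs)
RAW_LINE = re.compile(r"^Raw transcription: '(.*)'\s*$")


def tally_transcriptions(log_text):
    """Count empty raw transcriptions and collect the non-empty ones."""
    empty = 0
    non_empty = []
    for line in log_text.splitlines():
        match = RAW_LINE.match(line.strip())
        if match is None:
            continue  # not a raw-transcription line
        text = match.group(1).strip()
        if text:
            non_empty.append(text)
        else:
            empty += 1
    return empty, non_empty
```

Feeding it the saved output of ten or more runs would give an empty-versus-non-empty count that is easier to compare across HailoRT versions or model variants.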
System Configuration
Hardware
Software Versions
Installed Packages
Python Dependencies (in virtual environment)
Problem Description
The Whisper base model produces empty or inconsistent transcriptions on Hailo-8L hardware, while the system appears to be correctly configured and operational.
Testing Performed
1. Installation & Setup
python3 setup.py ran successfully; HEF files downloaded with download_resources.py
2. Hardware Verification
hailortcli scan shows device 0001:01:00.0
3. Audio Recording Tests
4. Model Testing
Base Model (5-second encoder)
Command:
python3 -m app.app_hailo_whisper --hw-arch hailo8l --variant base --duration 5
Results:
"testing 123"
"is a 5 2" (from "This is a 5 second recording")
'' returned from decoder
Sample Output:
Then subsequent recordings:
Tiny Model (10-second encoder)
Command:
python3 -m app.app_hailo_whisper --hw-arch hailo8l --variant tiny --duration 10
Results:
'%。...............,......'
'..... other alert hurt�... other........�..'
'%,,, ", to [,,,, " w,," [ st, -- "告诉 ',,, a, w'
5. Configuration Variations Tested
--no-vad flag
--reuse-audio
Observed Behavior
What Works
What Fails
Evidence
Successful Transcription (happened once)
Typical Failed Output (common)
Garbled Output (tiny model)
Reproduction Steps
python3 setup.py in speech_recognition directory
python3 -m app.app_hailo_whisper --hw-arch hailo8l --variant base --duration 5
Any ideas?
Should I try HailoRT 4.21?
Tested with 2 mics and different volumes.
The generated .wav file is good and the voice is clear.
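For anyone who wants to double-check the recording itself, here is a minimal sketch that reads a WAV file and reports sample rate, channel count, duration and peak level. It assumes the file is 16-bit PCM; `wav_stats` is a hypothetical helper, not something from the Hailo repo. Whisper expects 16 kHz audio, and at that rate the 78674 samples in the log above would be about 4.9 s, which matches --duration 5.

```python
import array
import sys
import wave


def wav_stats(path):
    """Return (sample_rate, channels, duration_s, peak) for a 16-bit PCM WAV."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        n_frames = wf.getnframes()
        samples = array.array("h", wf.readframes(n_frames))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Peak as a fraction of full scale, comparable to the "max level" in the log
    peak = max((abs(s) for s in samples), default=0) / 32768.0
    duration = n_frames / rate
    return rate, channels, duration, peak
```

If this reports something other than 16000 Hz or mono, the mismatch would be worth ruling out before looking at the HEF files or the runtime version.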
Thanks in advance.