| title | Google ASR and TTS |
|---|---|
| description | Google Cloud Speech-to-Text (ASR) and Text-to-Speech (TTS) Configuration |
| published | true |
| date | 2025-08-26 13:04:28 UTC |
| tags | asr, tts, google |
| editor | markdown |
| dateCreated | 2025-08-11 10:53:39 UTC |
Google Cloud provides high-quality Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) services with broad language support and advanced customization options.
The ASR service (Google Cloud Speech-to-Text) supports real-time and batch transcription, optimized for various use cases including telephony.
The TTS service (Google Cloud Text-to-Speech) offers a wide range of voices, languages, and control over speech parameters like speed and pitch.
- High Accuracy: Advanced machine learning models optimized for telephony, video, and general-purpose speech.
- Extensive Language Support: Dozens of languages and variants supported.
- Customizable Voices: Natural-sounding neural voices with adjustable speaking rate and pitch.
- Secure: Fully managed on Google Cloud with enterprise-grade security.
See the official Google Cloud Speech-to-Text and Text-to-Speech documentation for details.
- Sign in to Google Cloud Console.
- Create a new project or select an existing one.
- Enable the Cloud Speech-to-Text API and the Cloud Text-to-Speech API for the project.
- Go to APIs & Services → Credentials.
- Click Create credentials → Service account.
- Fill in the details and click Done.
- Select your new service account → Keys → Add Key → Create New Key.
- Choose JSON and click Create — the file will download automatically.
- Save this file securely and note its path. You will use it in the GOOGLE_APPLICATION_CREDENTIALS variable.
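
Before mounting the downloaded key into a container, it can help to confirm that the file really is a service account key. The following is a minimal sketch using only the Python standard library. The function names `validate_key_data` and `check_service_account_key` are hypothetical helpers, not part of AVR or any Google library, and the list of required fields is an assumption based on the usual layout of a service account JSON key.

```python
import json
import os

# Assumption: fields a Google service account JSON key normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def validate_key_data(data):
    """Return a list of problems found in parsed key data (empty list means none)."""
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    # A key of another credential type will not work as a service account key.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


def check_service_account_key(path):
    """Load the key file at `path` and return a list of problems."""
    if not os.path.isfile(path):
        return [f"file not found: {path}"]
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    return validate_key_data(data)


if __name__ == "__main__":
    # Falls back to the placeholder path used in this page's examples.
    key_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS", "/path/to/google.json")
    issues = check_service_account_key(key_path)
    print("key file looks usable" if not issues else "\n".join(issues))
```

This only checks the shape of the file. It does not contact Google Cloud, so a revoked key or a key without access to the two APIs would still pass.
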
| Variable | Description | Example Value |
|---|---|---|
| PORT | Port where the ASR module listens | 6001 |
| GOOGLE_APPLICATION_CREDENTIALS | Path to your Google service account JSON key file | /path/to/google.json |
| SPEECH_RECOGNITION_LANGUAGE | Language code for recognition | en-US |
| SPEECH_RECOGNITION_MODEL | Recognition model (default, telephony, video, etc.) | telephony |
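
As an illustration of how the ASR variables in the table fit together, the sketch below reads them from an environment mapping and performs basic checks. `AsrSettings` and `load_asr_settings` are hypothetical names, and the fallback values simply mirror the example column above. They are not confirmed defaults of the container image.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AsrSettings:
    port: int
    credentials_path: str
    language: str
    model: str


def load_asr_settings(env):
    """Build AsrSettings from a mapping such as os.environ.

    Fallbacks mirror the example values in the table (an assumption,
    not the container's documented defaults).
    """
    port = int(env.get("PORT", "6001"))
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    credentials_path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    if not credentials_path:
        raise ValueError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    return AsrSettings(
        port=port,
        credentials_path=credentials_path,
        language=env.get("SPEECH_RECOGNITION_LANGUAGE", "en-US"),
        model=env.get("SPEECH_RECOGNITION_MODEL", "telephony"),
    )


if __name__ == "__main__":
    try:
        print(load_asr_settings(os.environ))
    except ValueError as exc:
        print(f"invalid ASR configuration: {exc}")
```

Passing a plain dictionary instead of `os.environ` makes it easy to check a configuration before writing it into a compose file.
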
avr-asr-google-cloud-speech:
  image: agentvoiceresponse/avr-asr-google-cloud-speech
  platform: linux/x86_64
  container_name: avr-asr-google
  restart: always
  environment:
    - PORT=6001
    - GOOGLE_APPLICATION_CREDENTIALS=/path/to/google.json
    - SPEECH_RECOGNITION_LANGUAGE=en-US
    - SPEECH_RECOGNITION_MODEL=telephony
  volumes:
    - ./google.json:/path/to/google.json
  networks:
    - avr

Repository: avr-tts-google-speech-tts
| Variable | Description | Example Value |
|---|---|---|
| PORT | Port where the TTS module listens | 6003 |
| GOOGLE_APPLICATION_CREDENTIALS | Path to your Google service account JSON key file | /path/to/google.json |
| TEXT_TO_SPEECH_LANGUAGE | Language code for TTS | en-AU |
| TEXT_TO_SPEECH_GENDER | Voice gender (MALE, FEMALE, NEUTRAL) | FEMALE |
| TEXT_TO_SPEECH_NAME | Specific voice name from Google TTS voices | en-AU-Neural2-C |
| TEXT_TO_SPEECH_SPEAKING_RATE | Speaking rate multiplier | 1 |
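
The TTS variables in the table depend on each other: the voice name has to belong to the chosen language, and the gender has to be one of the three listed values. The sketch below shows one way such a configuration might be checked before starting the container. `check_tts_settings` is a hypothetical helper. The rule that a voice name starts with its language code (as in `en-AU-Neural2-C`) is an assumption drawn from Google's usual naming pattern and is used here only as a heuristic.

```python
VALID_GENDERS = {"MALE", "FEMALE", "NEUTRAL"}


def check_tts_settings(env):
    """Return a list of problems found in TTS-related variables (empty list means none)."""
    problems = []
    language = env.get("TEXT_TO_SPEECH_LANGUAGE", "")
    gender = env.get("TEXT_TO_SPEECH_GENDER", "")
    name = env.get("TEXT_TO_SPEECH_NAME", "")
    rate_text = env.get("TEXT_TO_SPEECH_SPEAKING_RATE", "1")

    if gender and gender not in VALID_GENDERS:
        problems.append(f"unknown gender: {gender}")

    # Heuristic (assumption): voice names are prefixed with their language code.
    if language and name and not name.lower().startswith(language.lower() + "-"):
        problems.append(f"voice {name} does not match language {language}")

    try:
        rate = float(rate_text)
    except ValueError:
        problems.append(f"speaking rate is not a number: {rate_text}")
    else:
        if not rate > 0:
            problems.append(f"speaking rate must be positive: {rate_text}")

    return problems


if __name__ == "__main__":
    example = {
        "TEXT_TO_SPEECH_LANGUAGE": "en-AU",
        "TEXT_TO_SPEECH_GENDER": "FEMALE",
        "TEXT_TO_SPEECH_NAME": "en-AU-Neural2-C",
        "TEXT_TO_SPEECH_SPEAKING_RATE": "1",
    }
    print(check_tts_settings(example) or "TTS settings look consistent")
```

The check does not enforce an upper or lower bound on the speaking rate, since the accepted range is defined by Google Cloud Text-to-Speech and is not stated on this page.
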
avr-tts-google-speech-tts:
  image: agentvoiceresponse/avr-tts-google-speech-tts
  platform: linux/x86_64
  container_name: avr-tts-google
  restart: always
  environment:
    - PORT=6003
    - GOOGLE_APPLICATION_CREDENTIALS=/path/to/google.json
    - TEXT_TO_SPEECH_LANGUAGE=en-AU
    - TEXT_TO_SPEECH_GENDER=FEMALE
    - TEXT_TO_SPEECH_NAME=en-AU-Neural2-C
    - TEXT_TO_SPEECH_SPEAKING_RATE=1
  volumes:
    - ./google.json:/path/to/google.json
  networks:
    - avr