Where voices come alive - AI-powered text-to-speech and instant voice cloning, built with Next.js 16, React 19, and cutting-edge TTS technology.
Sentience isn't just another text-to-speech app - it's a voice ecosystem that gives your content a soul. Whether you're a content creator, developer, or enterprise, Sentience transforms written words into living, breathing audio that resonates with your audience.
- 🎙️ **Zero-Shot Voice Cloning** - Upload or record just 10 seconds of any voice, and watch Sentience replicate it instantly. No training, no waiting, no technical expertise required.
- 🗣️ **20+ Premium Built-in Voices** - A diverse cast of AI voices across 12 categories and 5 locales, ready to bring any project to life.
- 🎚️ **Fine-tune Your Sound** - Adjust creativity, variety, expression, and flow parameters to make each generation uniquely yours.
- 👥 **Team Collaboration** - Multi-tenant architecture with Clerk Organizations ensures complete data isolation for teams of any size.
- 💳 **Smart Usage-Based Billing** - Pay only for what you use with Polar's metered pricing. Start at $0/month and scale with your success.
- 📊 **Generation History** - Every voice you create, every word you generate - preserved with full metadata for easy recall and re-use.
- 📱 **Responsive Everywhere** - From desktop studios to mobile workflows, Sentience adapts seamlessly to any screen.
- 🎨 **Waveform Visualization** - Beautiful WaveSurfer.js audio player with seek, play/pause, and download capabilities.
- Node.js 20.9 or later
- Prisma Postgres database
- Clerk account (with Organizations enabled)
- Cloudflare R2 bucket
- Modal account (for GPU-hosted TTS)
- Polar account (for billing)
```bash
git clone https://github.com/AlexGMAY/Sentience.git
cd Sentience
npm install
cp .env.example .env
```

Fill in the blank values in `.env`. Sensible defaults (Clerk routes, Polar meter names, `APP_URL`, etc.) are pre-filled.
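Before running anything else, it can help to fail fast when a required variable is still blank. A minimal sketch of such a startup check — the exact set of required keys is this repo's `lib/env` module's concern, and `DATABASE_URL` is assumed here as the conventional Prisma variable name:

```typescript
// Hypothetical startup check: report required env vars that are missing
// or left blank after copying .env.example. The key list is illustrative.
const REQUIRED_ENV_KEYS = [
  "DATABASE_URL",       // Prisma Postgres connection string (assumed name)
  "CHATTERBOX_API_URL", // Modal endpoint URL from `modal deploy`
  "POLAR_PRODUCT_ID",   // Polar product created in the billing setup
] as const;

export function missingEnvKeys(
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => !env[key] || env[key]!.trim() === "");
}

// Example: only CHATTERBOX_API_URL is set, so the other two are reported.
const unset = missingEnvKeys({ CHATTERBOX_API_URL: "https://example.modal.run" });
```

Running this once at boot (e.g. from the env module) surfaces configuration mistakes before they show up as opaque runtime errors.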
In your Polar dashboard, create two meters under Meters:
- **Voice Creation** meter
  - Filter: `Name` equals `voice_creation`
  - Aggregation: **Count**
- **Text-to-Speech Characters** meter
  - Filter: `Name` equals `tts_generation`
  - Aggregation: **Sum** over `characters`
Then create a new product with Recurring subscription pricing. Under Price Type, add two metered prices:
- Click **Add metered price** and select the **Text-to-Speech Characters** meter
  - Set the **Amount per unit** (price per character, e.g. `$0.003`)
  - Optionally set a **Cap amount** (e.g. `$100`)
- Click **Add metered price** again and select the **Voice Creation** meter
  - Set the **Amount per unit** (price per voice generation, e.g. `$0.25`)
  - Optionally set a **Cap amount** (e.g. `$100`)
With only metered prices, the subscription starts at $0/month and scales with usage. If you want a baseline subscription fee (e.g. $20/month), add a third price to the same product — select a fixed price instead of a metered price.
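Under the example rates above, a back-of-envelope monthly cost is easy to compute. A sketch using only the example figures from this guide ($0.003 per character and $0.25 per voice, each capped at $100 — your actual prices may differ):

```typescript
// Rough monthly cost under the example metered prices from this guide:
// $0.003 per TTS character and $0.25 per voice creation, each with an
// optional $100 cap and no fixed base fee.
export function estimateMonthlyCost(
  ttsCharacters: number,
  voicesCreated: number
): number {
  const ttsCost = Math.min(ttsCharacters * 0.003, 100);   // capped at $100
  const voiceCost = Math.min(voicesCreated * 0.25, 100);  // capped at $100
  return ttsCost + voiceCost;
}

// e.g. 10,000 characters and 4 cloned voices → roughly $30 + $1 = $31
```

The caps mean a runaway month can never exceed $200 in metered charges under this example configuration.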
Ensure Allow multiple subscriptions is turned off under Settings > Billing (this is the Polar default).
Copy the product ID into POLAR_PRODUCT_ID. The meter filter names and aggregation property must match the POLAR_METER_* env variables.
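When the app records usage, the event `name` and metadata property must line up exactly with the meter filters above (`tts_generation`, `voice_creation`, `characters`), or the meters count nothing. A hypothetical helper showing the shape such an event might take — the exact ingestion call and field names depend on the Polar SDK version you use:

```typescript
// Build a usage event matching the "Text-to-Speech Characters" meter:
// filter on name == "tts_generation", aggregation Sum over "characters".
// The payload shape is illustrative; adapt it to your Polar SDK's ingest API.
export function buildTtsUsageEvent(customerId: string, text: string) {
  return {
    name: "tts_generation",          // must match the meter's Name filter
    externalCustomerId: customerId,
    metadata: {
      characters: text.length,       // the property the meter sums over
    },
  };
}

const event = buildTtsUsageEvent("cust_123", "Hello, world!");
```

A mismatch between these strings and the `POLAR_METER_*` env variables is the most common reason metered charges silently stay at zero.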
```bash
npx prisma migrate deploy
```

The included `chatterbox_tts.py` is adapted from Modal's official Chatterbox TTS example, modified to read voice reference audio directly from your R2 bucket instead of a Modal Volume.
Before deploying, update chatterbox_tts.py with your R2 credentials:
```python
R2_BUCKET_NAME = "<your-r2-bucket-name-here>"
R2_ACCOUNT_ID = "<your-r2-account-id-here>"
```

Then create the required secrets in your Modal dashboard:
| Secret Name | Keys | Description |
|---|---|---|
| `cloudflare-r2` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` | R2 API credentials (used for the bucket mount) |
| `chatterbox-api-key` | `CHATTERBOX_API_KEY` | API key to protect the endpoint (use any strong random string) |
| `hf-token` | `HF_TOKEN` | Hugging Face token (for downloading the Chatterbox model weights) |
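For `CHATTERBOX_API_KEY`, any strong random string works. One way to generate one with Node's built-in `crypto` module:

```typescript
import { randomBytes } from "node:crypto";

// 32 random bytes → 64 hex characters. Paste the result into both the
// Modal secret and CHATTERBOX_API_KEY in your env file.
const apiKey = randomBytes(32).toString("hex");
console.log(apiKey);
```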
Deploy to Modal:

```bash
modal deploy chatterbox_tts.py
```

This deploys Chatterbox TTS to a serverless NVIDIA A10G GPU on Modal. The container mounts your R2 bucket read-only for direct access to voice reference audio. Use the resulting Modal URL as `CHATTERBOX_API_URL` in your `.env.local`.
Note: The first request after a period of inactivity may take longer due to cold starts as Modal provisions the GPU container.
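The app then calls this endpoint with the API key. A hedged sketch of what building such a request might look like — the `/tts` path, the payload field names, and the bearer-auth scheme are all assumptions here; the generated client from `npm run sync-api` defines the real contract:

```typescript
// Hypothetical request builder for the Modal-hosted TTS endpoint.
// Path, field names (text, voice_key), and auth scheme are assumptions;
// consult the generated OpenAPI client for the actual API shape.
export function buildTtsRequest(
  baseUrl: string,
  apiKey: string,
  text: string,
  voiceKey: string
) {
  return {
    url: `${baseUrl.replace(/\/$/, "")}/tts`, // strip trailing slash
    init: {
      method: "POST" as const,
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // CHATTERBOX_API_KEY
      },
      body: JSON.stringify({ text, voice_key: voiceKey }),
    },
  };
}

const req = buildTtsRequest(
  "https://example.modal.run/",
  "secret",
  "Hello",
  "voices/narrator.wav"
);
```

Passing `req.url` and `req.init` to `fetch` would issue the call; keeping the builder separate from the transport makes it easy to unit-test without hitting the GPU endpoint.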
Once deployed, generate the type-safe Chatterbox client from the OpenAPI spec:

```bash
npm run sync-api
```

Then seed the database:

```bash
npx prisma db seed
```

This seeds 20 built-in voices to the database and R2. The system voice WAV files are included in the repository and originate from Modal's voice sample pack.
```bash
npm run dev
```

Open http://localhost:3000.
Sentience is designed to be self-hosted. You'll need:
- A PostgreSQL database - Prisma Postgres (recommended), or any managed Postgres
- Cloudflare R2 - For audio storage (S3-compatible, generous free tier)
- Modal - For serverless GPU inference (pay-per-second billing)
- Clerk - For authentication and multi-tenancy
- Polar - For metered billing (use sandbox mode with card `4242 4242 4242 4242` for testing)
Deploy the Next.js app to any Node.js host (Vercel, Railway, Docker, etc.).
src/
├── app/ # Next.js App Router
│ ├── (dashboard)/ # Protected routes (home, TTS, voices)
│ ├── api/ # Audio proxy routes + tRPC handler
│ ├── sign-in/ # Clerk auth pages
│ └── sign-up/
├── components/ # Shared UI components (shadcn/ui + custom)
├── features/
│ ├── dashboard/ # Home page, quick actions
│ ├── text-to-speech/ # TTS form, audio player, settings, history
│ ├── voices/ # Voice library, creation, recording
│ └── billing/ # Usage display, checkout
├── hooks/ # App-wide hooks
├── lib/ # Core: db, r2, polar, env, chatterbox client
├── trpc/ # tRPC routers, client, server helpers
├── generated/ # Prisma client
└── types/ # Generated API types
| Command | Description |
|---|---|
| `npm run dev` | Start dev server |
| `npm run build` | Production build |
| `npm run start` | Start production server |
| `npm run lint` | Lint with ESLint |
| `npm run sync-api` | Regenerate Chatterbox API types from OpenAPI spec |
- Chatterbox TTS by Resemble AI - the open-source zero-shot voice cloning model powering speech generation
- Modal - serverless GPU deployment and voice sample pack
- shadcn/ui - beautiful, accessible components
- WaveSurfer.js - audio visualization
MIT © AlexGMAY
Built with ❤️ for voices that matter
© 2026 Sentience. All rights reserved.