This is an Express-based service that fetches YouTube video information and transcripts (subtitles) in any available language. It uses the ytdl-core library to extract video metadata and the youtube-captions-scraper library to retrieve subtitles from YouTube videos.
- Fetch basic video info like title, author, description, genre, and more.
- Extract subtitles in any available language for YouTube videos.
- Return the start time, end time, and text of each subtitle.
- Provide simplified and smart endpoints to avoid duplicate processing.
- Save transcripts and summaries to Firebase Firestore for caching and reuse.
- Node.js (version 12.x or higher)
- npm or yarn
- yt-dlp installed on the server
- deno for yt-dlp JS challenges
- A valid all_cookies.txt (Netscape format) for YouTube requests
- Clone this repository:
  git clone https://github.com/andresz74/youtube-transcript-generator.git
- Navigate to the project directory:
  cd youtube-transcript-generator
- Install the dependencies:
  npm install
- Create a .env file with your configuration:
  PORT=3004
  CHATGPT_VERCEL_URL=https://xxxxxxxxxx.vercel.app/api/openai-chat
  API_ACCESS_KEY=your-shared-api-access-key
  SMART_SUMMARY_MAX_TRANSCRIPT_CHARS=300000
  SMART_SUMMARY_DIRECT_MAX_CHARS=24000
  SMART_SUMMARY_CHUNK_TARGET_CHARS=8000
  SMART_SUMMARY_CHUNK_OVERLAP_CHARS=200
  SMART_SUMMARY_MAX_CHUNKS=32
  SMART_SUMMARY_STAGE_TIMEOUT_MS=25000
  SMART_SUMMARY_STAGE_RETRIES=1
  TRANSCRIPT_DEBUG=false
  SUMMARY_DEBUG=false
- Create a Firebase service account key file as firebaseServiceAccount.json (not committed to Git). Make sure you’ve set up Firestore.
- Add all_cookies.txt in the project root (do not commit it).
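The numeric settings in the sample .env arrive as strings at runtime. The following is a minimal sketch of how they could be parsed with fallbacks. The helper names `readInt` and `loadSummaryConfig` are hypothetical, and the defaults simply mirror the sample values above; the service's actual loader and defaults may differ.

```javascript
// Hypothetical helper: read an integer from an env-like object,
// falling back when the variable is missing or not a number.
function readInt(env, name, fallback) {
  const parsed = Number.parseInt(env[name], 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Hypothetical loader; the defaults copy the sample .env values (an assumption).
function loadSummaryConfig(env = process.env) {
  return {
    port: readInt(env, 'PORT', 3004),
    maxTranscriptChars: readInt(env, 'SMART_SUMMARY_MAX_TRANSCRIPT_CHARS', 300000),
    directMaxChars: readInt(env, 'SMART_SUMMARY_DIRECT_MAX_CHARS', 24000),
    chunkTargetChars: readInt(env, 'SMART_SUMMARY_CHUNK_TARGET_CHARS', 8000),
    chunkOverlapChars: readInt(env, 'SMART_SUMMARY_CHUNK_OVERLAP_CHARS', 200),
    maxChunks: readInt(env, 'SMART_SUMMARY_MAX_CHUNKS', 32),
    stageTimeoutMs: readInt(env, 'SMART_SUMMARY_STAGE_TIMEOUT_MS', 25000),
    stageRetries: readInt(env, 'SMART_SUMMARY_STAGE_RETRIES', 1),
    // Debug flags are enabled only by the literal string "true".
    transcriptDebug: env.TRANSCRIPT_DEBUG === 'true',
    summaryDebug: env.SUMMARY_DEBUG === 'true',
  };
}
```

Passing a plain object instead of `process.env` makes the loader easy to exercise in isolation.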
Start the server:
npm start
Or, using PM2:
pm2 start ecosystem.config.js
Fetches captions for a YouTube video using the internal transcript fetcher (cookie + JS runtime aware).
Request:
/api/transcript?videoId=VIDEO_ID&lang=en
Response:
{
"videoId": "VIDEO_ID",
"lang": "en",
"captions": [
{ "start": 0, "duration": 3.2, "text": "Hello world" }
]
}
Fetches full video info + timestamped transcript.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID"
}
Response: Full video info and subtitles with timestamps in all available languages.
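Clients that need explicit end times or a single block of text can derive both from timestamped captions. The sketch below assumes the `{ start, duration, text }` caption shape shown in the /api/transcript example, with both numbers in seconds. The function names are hypothetical and are not part of the service.

```javascript
// Add an `end` field to each caption (assumes `start` and `duration` are seconds).
function withEndTimes(captions) {
  return captions.map((caption) => ({
    ...caption,
    end: caption.start + caption.duration,
  }));
}

// Join caption text into one string, trimming each piece and skipping empty ones.
function joinCaptions(captions) {
  return captions
    .map((caption) => caption.text.trim())
    .filter((text) => text.length > 0)
    .join(' ');
}
```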
Returns only the video title and concatenated transcript in the first available language.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID"
}
Response:
{
"duration": 14,
"title": "Video Title",
"transcript": "This is the transcript..."
}
Returns the video title, duration, and concatenated transcript in a user-specified language (or falls back to English/first available language). Also includes a list of available languages if multiple exist.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"lang": "es" // Optional: language code (e.g., "es" for Spanish)
}
Response:
{
"duration": 14,
"title": "Video Title",
"transcript": "Transcript text in the requested language...",
"languages": [ // Only included if multiple languages available
{ "name": "English", "code": "en" },
{ "name": "Spanish", "code": "es" }
]
}
Behavior:
- If lang is specified, returns the transcript in that language (or errors if unavailable).
- If no lang is provided, prioritizes English (non-auto-generated) or falls back to the first available language.
- Includes a languages array in the response when multiple subtitle tracks exist.
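The selection rules above can be sketched as a small pure function. The track shape `{ code, name, autoGenerated }` and the function names are assumptions made for illustration; the service's internal representation is not documented here.

```javascript
// Pick a subtitle track following the documented behavior:
// explicit `lang` wins (or errors), otherwise manual English, otherwise the first track.
function pickTrack(tracks, lang) {
  if (tracks.length === 0) {
    throw new Error('No subtitle tracks available');
  }
  if (lang) {
    const match = tracks.find((track) => track.code === lang);
    if (!match) {
      throw new Error(`Language not available: ${lang}`);
    }
    return match;
  }
  const manualEnglish = tracks.find(
    (track) => track.code === 'en' && !track.autoGenerated
  );
  return manualEnglish || tracks[0];
}

// Build the `languages` array, returned only when more than one track exists.
function listLanguages(tracks) {
  if (tracks.length <= 1) return undefined;
  return tracks.map(({ name, code }) => ({ name, code }));
}
```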
The new /simple-transcript-v2 endpoint improves upon /simple-transcript by:
- Supporting explicit language selection via the lang parameter.
- Providing transparency about available languages.
- Maintaining backward compatibility with the original response format when no language is specified.
To avoid fetching and reprocessing the same YouTube video over and over, this service provides smart endpoints that store and reuse results using Firebase Firestore. These endpoints check whether a transcript or summary already exists in the database before doing any expensive computation or API calls.
This is ideal when you're using the service from a frontend (e.g., a Chrome Extension) where caching can significantly speed things up and reduce costs (e.g., OpenAI API requests).
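The check-then-compute flow described above is a cache-aside lookup. The sketch below illustrates it with an in-memory `Map` standing in for Firestore, keyed by video ID. `extractVideoId` and `getOrCreate` are hypothetical names, and the real service's storage layout may differ.

```javascript
// Extract the video ID from a watch URL or a youtu.be short link.
function extractVideoId(url) {
  const parsed = new URL(url);
  if (parsed.hostname === 'youtu.be') {
    return parsed.pathname.slice(1) || null;
  }
  return parsed.searchParams.get('v');
}

// Cache-aside: return the stored value when present; otherwise compute it,
// store it, and return it. `store` only needs has/get/set (a Map here).
async function getOrCreate(store, key, compute) {
  if (store.has(key)) {
    return { value: store.get(key), fromCache: true };
  }
  const value = await compute();
  store.set(key, value);
  return { value, fromCache: false };
}
```

The second call for the same key skips `compute` entirely, which is where the savings on transcript fetches and model requests come from.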
You need to set up your own Firebase project and configure Firestore access for the service. Here's what you need to do:
- Go to https://console.firebase.google.com
- Click Add project → name it → continue
- In the left panel, go to Firestore Database
- Click Create database, start in production or test mode
- Go to your project settings (⚙️ > Project settings)
- Click Service accounts
- Click Generate new private key under the Firebase Admin SDK
- Save the JSON file and rename it to firebaseServiceAccount.json
- Place this file in the root of the project (where your index.js lives)
⚠️ DO NOT COMMIT this file to Git or push it to any public repo.
Sometimes Firestore is not enabled by default in your Google Cloud project. You can enable it at:
https://console.cloud.google.com/apis/library/firestore.googleapis.com
This endpoint checks whether the transcript is already stored in Firestore. If found, it returns the saved version. If not, it fetches it from YouTube, saves it, and returns it.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID"
}
Response:
{
"videoId": "VIDEO_ID",
"title": "Video Title",
"duration": 14,
"transcript": "Full transcript text..."
}
Fetches the transcript and metadata for a YouTube video and stores them in Firestore if not already present.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID"
}
Response:
{
"videoId": "VIDEO_ID",
"title": "Video Title",
"duration": 14,
"transcript": "Transcript text...",
"description": "First line of the video description",
"date": "2025-01-26",
"image": "https://i.ytimg.com/vi/VIDEO_ID/maxresdefault.jpg",
"tags": ["ios", "automation", "shortcuts"],
"canonical_url": "https://blog.andreszenteno.com/notes/video-title"
}
This endpoint checks Firestore for a summary of the video. If one exists, it's returned. If not, it uses the ChatGPT API to generate the summary (using the transcript), stores it in Firestore, and returns it.
You should send the transcript from the frontend if you already have it, to avoid duplicating the work of fetching it again.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"transcript": "Optional: transcript string"
}
Response:
{
"summary": "This is the summary of the transcript.",
"fromCache": true
}
- fromCache: true → the summary was loaded from Firestore
- fromCache: false → it was freshly generated using ChatGPT
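A client can include the transcript only when it already has one and then branch on fromCache. The sketch below assumes the endpoint is reachable as POST `/smart-summary` under a `baseUrl` you supply. The path, the method, and the helper names are assumptions for illustration, and the global `fetch` default requires Node.js 18 or later.

```javascript
// Build the JSON body; `transcript` is included only when the caller has one.
function buildSummaryBody(videoUrl, transcript) {
  const body = { url: videoUrl };
  if (typeof transcript === 'string' && transcript.trim() !== '') {
    body.transcript = transcript;
  }
  return body;
}

// Send the request and report where the summary came from.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function requestSummary(baseUrl, videoUrl, transcript, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/smart-summary`, {
    method: 'POST', // assumed HTTP method
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildSummaryBody(videoUrl, transcript)),
  });
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const { summary, fromCache } = await response.json();
  return { summary, source: fromCache ? 'firestore-cache' : 'freshly-generated' };
}
```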
This endpoint provides similar functionality to /smart-summary but offloads the summary creation and caching to Firestore itself.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"model": "chatgpt" // Specify which model to use (chatgpt, deepseek, anthropic)
}
Response:
{
"summary": "This is the summary of the transcript.",
"fromCache": true
}
Generates a markdown-formatted AI summary with frontmatter and tags. Caches both in Firestore.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"model": "openai"
}
Response:
{
"summary": "---\ntitle: \"...\"\ndate: ...\ntags: [...]\n---\nSummary content...",
"fromCache": false
}
- Uses OpenAI to generate both the summary and tags.
- Saves results to both the summaries and transcripts collections in Firestore.
- Includes the YouTube link and properly formatted frontmatter for markdown usage.
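To make the response format concrete, here is a minimal sketch of assembling a frontmatter block followed by the summary body. The field set follows the example response above (title, date, tags). The function name, the exact tag formatting, and the placement of the YouTube link are assumptions, not the service's actual output.

```javascript
// Assemble a Markdown document with a YAML frontmatter header.
// JSON.stringify quotes the title and escapes embedded quotes.
function buildMarkdownSummary({ title, date, tags, videoUrl, summary }) {
  const frontmatter = [
    '---',
    `title: ${JSON.stringify(title)}`,
    `date: ${date}`,
    `tags: [${tags.join(', ')}]`,
    '---',
  ].join('\n');
  // Link placement at the end of the document is an assumption.
  return `${frontmatter}\n${summary}\n\n[Watch on YouTube](${videoUrl})\n`;
}
```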
This endpoint improves upon v2 by retrieving extended video metadata from Firestore (such as category, video author, publish date) and formatting the summary as a Markdown document with a full YAML frontmatter block. The summary is cached in Firestore to avoid redundant AI calls. If no tags exist on the transcript, tags are generated from the title/description/summary.
Request:
{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"model": "chatgpt" // or "anthropic", "deepseek", etc.
}
Response:
{
"summary": "---\ntitle: \"...\",\ndate: ...\ndescription: ...\n...\n---\nSummary content...",
"fromCache": false
}
Queues summary generation and returns immediately. Use this from UIs to avoid client timeouts on long videos.
Response (202):
{
"requestId": "req-123",
"status": "queued",
"statusUrl": "/summary-status/req-123"
}
Polls async summary status and returns queued, processing, succeeded (with result.summary), or failed (with structured error).
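A client-side polling loop for the async flow could look like the sketch below. It assumes the status payload has a `status` field with the four values listed above, `result.summary` on success, and an `error` field on failure. The function name, the interval, the attempt limit, and the injected `fetchStatus` callback are hypothetical.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll until the job succeeds or fails, or until maxAttempts is reached.
// `fetchStatus(statusUrl)` must resolve to the parsed status payload.
async function pollSummary(statusUrl, fetchStatus, options = {}) {
  const { intervalMs = 2000, maxAttempts = 30, sleep = defaultSleep } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt += 1) {
    const payload = await fetchStatus(statusUrl);
    if (payload.status === 'succeeded') {
      return payload.result.summary;
    }
    if (payload.status === 'failed') {
      // The `error` field name is an assumption about the structured error.
      throw new Error(`Summary failed: ${JSON.stringify(payload.error)}`);
    }
    // 'queued' or 'processing': wait, then ask again.
    await sleep(intervalMs);
  }
  throw new Error('Timed out waiting for summary');
}
```

In a browser or Node.js 18+, `fetchStatus` could be as small as `(url) => fetch(url).then((res) => res.json())`, with `statusUrl` taken from the 202 response.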
To start the server with PM2:
pm2 start ecosystem.config.js
Your ecosystem.config.js may look like this:
module.exports = {
  apps: [
    {
      name: 'youtube-transcript-generator',
      script: './index.js',
      watch: false,
      env: {
        PORT: 3004,
        CHATGPT_VERCEL_URL: 'https://your-vercel-url/api/openai-chat',
        API_ACCESS_KEY: 'your-shared-api-access-key'
      }
    }
  ]
};
To monitor logs:
pm2 logs youtube-transcript-generator
- Empty transcripts: Ensure all_cookies.txt is fresh, yt-dlp is up to date, and deno is installed. Re-run a local check:
  yt-dlp --cookies all_cookies.txt --js-runtimes deno --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"
- yt-dlp missing: Install or update from the official release and ensure /usr/local/bin is in PATH.
- Verbose logs: Set TRANSCRIPT_DEBUG=true or SUMMARY_DEBUG=true in .env.
- 401 from model endpoints: Ensure API_ACCESS_KEY matches one of the keys configured in ai-access (API_ACCESS_KEYS).
- 413 on /smart-summary-firebase-v3: Transcript exceeded SMART_SUMMARY_MAX_TRANSCRIPT_CHARS.
- Large transcript reliability: /smart-summary-firebase-v3 now uses direct mode for small transcripts and chunked hierarchical mode for large-but-valid transcripts.
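The last two items describe a size-based decision followed by chunking. The sketch below shows one way the SMART_SUMMARY_* limits could drive that logic, using fixed-size character windows with overlap. The function names are hypothetical, and whether the service splits on raw character offsets or on sentence boundaries is not stated here, so treat the chunker as an assumption.

```javascript
// Decide how to process a transcript of the given length.
// 'reject' corresponds to the 413 case; `limits` mirrors the .env settings.
function chooseMode(length, limits) {
  if (length > limits.maxTranscriptChars) return 'reject';
  return length <= limits.directMaxChars ? 'direct' : 'chunked';
}

// Split text into windows of `targetChars`, each sharing `overlapChars`
// characters with the previous window.
function chunkTranscript(text, targetChars, overlapChars) {
  if (overlapChars >= targetChars) {
    throw new RangeError('overlapChars must be smaller than targetChars');
  }
  const chunks = [];
  const step = targetChars - overlapChars;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + targetChars));
    // Stop once this window has reached the end of the text.
    if (start + targetChars >= text.length) break;
  }
  return chunks;
}
```

The overlap gives each chunk some context from its predecessor, which helps when a sentence straddles a boundary.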
- Refresh all_cookies.txt on the same server where PM2 runs.
- Verify yt-dlp --version and yt-dlp --list-subs work from the server.
- Confirm deno is installed and usable by yt-dlp --js-runtimes deno.
- Restart PM2 after any cookie or dependency changes.
- Never commit all_cookies.txt or firebaseServiceAccount.json.
- Rotate YouTube cookies if they are ever exposed in logs or chat.
- Treat .env and any model endpoint URLs as secrets.
This project is licensed under the MIT License.
