YouTube Transcript and Video Info Service

This is an Express-based service that fetches YouTube video information and transcripts (subtitles) in any available language. It uses the ytdl-core library to extract video metadata and the youtube-captions-scraper library to retrieve subtitles from YouTube videos.

Features

Fetches basic video info such as title, author, description, and genre.
Extracts subtitles in any available language for YouTube videos.
Returns the start time, end time, and text of each subtitle.
Offers simplified endpoints and cached "smart" endpoints that avoid duplicate processing.
Saves transcripts and summaries to Firebase Firestore for caching and reuse.

Prerequisites

Node.js (version 12.x or higher)
npm or yarn
yt-dlp installed on the server
deno for yt-dlp JS challenges
A valid all_cookies.txt (Netscape format) for YouTube requests

Installation

Clone this repository:

```
git clone https://github.com/andresz74/youtube-transcript-generator.git
```

Navigate to the project directory:
```
cd youtube-transcript-generator
```
Install the dependencies:
```
npm install
```

Create a .env file with your configuration:

PORT=3004
CHATGPT_VERCEL_URL=https://xxxxxxxxxx.vercel.app/api/openai-chat
API_ACCESS_KEY=your-shared-api-access-key
SMART_SUMMARY_MAX_TRANSCRIPT_CHARS=300000
SMART_SUMMARY_DIRECT_MAX_CHARS=24000
SMART_SUMMARY_CHUNK_TARGET_CHARS=8000
SMART_SUMMARY_CHUNK_OVERLAP_CHARS=200
SMART_SUMMARY_MAX_CHUNKS=32
SMART_SUMMARY_STAGE_TIMEOUT_MS=25000
SMART_SUMMARY_STAGE_RETRIES=1
TRANSCRIPT_DEBUG=false
SUMMARY_DEBUG=false
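
As a rough illustration of how these values might be consumed, the sketch below reads the settings from an environment object and falls back to the sample values above when a variable is missing or not a number. The `loadSummaryConfig` helper and the choice of fallbacks are assumptions made for this example, not the service's actual code.

```javascript
// Hypothetical helper: parse the SMART_SUMMARY_* settings from an env object.
// Fallbacks mirror the sample .env above; the real service may use other defaults.
const FALLBACKS = {
  SMART_SUMMARY_MAX_TRANSCRIPT_CHARS: 300000,
  SMART_SUMMARY_DIRECT_MAX_CHARS: 24000,
  SMART_SUMMARY_CHUNK_TARGET_CHARS: 8000,
  SMART_SUMMARY_CHUNK_OVERLAP_CHARS: 200,
  SMART_SUMMARY_MAX_CHUNKS: 32,
  SMART_SUMMARY_STAGE_TIMEOUT_MS: 25000,
  SMART_SUMMARY_STAGE_RETRIES: 1,
};

function loadSummaryConfig(env) {
  const config = {};
  for (const [name, fallback] of Object.entries(FALLBACKS)) {
    const parsed = Number.parseInt(env[name], 10);
    config[name] = Number.isNaN(parsed) ? fallback : parsed;
  }
  // Debug flags are plain strings in .env, so compare against "true".
  config.TRANSCRIPT_DEBUG = env.TRANSCRIPT_DEBUG === 'true';
  config.SUMMARY_DEBUG = env.SUMMARY_DEBUG === 'true';
  return config;
}
```

A caller would typically pass `process.env`, for example `const config = loadSummaryConfig(process.env);`.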

Create a Firebase service account key file as firebaseServiceAccount.json (not committed to Git). Make sure you’ve set up Firestore.
Add all_cookies.txt in the project root (do not commit it).

Usage

Start the server:

npm start

Or, using PM2:

pm2 start ecosystem.config.js

API Endpoints

✅ GET `/api/transcript`

Fetches captions for a YouTube video using the internal transcript fetcher (cookie + JS runtime aware).

Request:

/api/transcript?videoId=VIDEO_ID&lang=en

Response:

{
  "videoId": "VIDEO_ID",
  "lang": "en",
  "captions": [
    { "start": 0, "duration": 3.2, "text": "Hello world" }
  ]
}
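
The following is a minimal client-side sketch for this endpoint: it builds the request URL and post-processes the `captions` array from the response shown above. The helper names are hypothetical, and the base URL assumes the service runs locally on the `PORT` from the sample .env.

```javascript
// Hypothetical client helpers for GET /api/transcript.
// Assumes the service runs locally on the PORT from the sample .env.
const BASE_URL = 'http://localhost:3004';

// Build the request URL with properly encoded query parameters.
function buildTranscriptUrl(videoId, lang = 'en', baseUrl = BASE_URL) {
  const url = new URL('/api/transcript', baseUrl);
  url.searchParams.set('videoId', videoId);
  url.searchParams.set('lang', lang);
  return url.toString();
}

// Add an `end` field to each caption, computed as start + duration.
function withEndTimes(captions) {
  return captions.map((c) => ({ ...c, end: c.start + c.duration }));
}

// Join caption texts into a single plain-text transcript.
function captionsToText(captions) {
  return captions.map((c) => c.text.trim()).join(' ');
}
```

On Node 18 or later, the result of `buildTranscriptUrl('VIDEO_ID')` can be passed to the built-in `fetch`, and the parsed `captions` array handed to the two helpers.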

✅ POST `/transcript`

Fetches full video info + timestamped transcript.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID"
}

Response: Full video info, subtitles with timestamps in all available languages.
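
Because the POST endpoints take a full YouTube URL rather than a bare ID, a client may want to validate the URL before sending it. The sketch below shows one way to extract the video ID from the two most common URL shapes; `extractVideoId` is a hypothetical helper and is not the parser the service itself uses.

```javascript
// Hypothetical helper: pull the video ID out of common YouTube URL shapes.
// Returns null when no ID can be found. Not the service's actual parser.
function extractVideoId(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return null; // not a valid absolute URL
  }
  const host = url.hostname.replace(/^www\./, '');
  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    return url.pathname.split('/')[1] || null;
  }
  if (host === 'youtube.com' || host === 'm.youtube.com') {
    // Standard watch links carry the ID in the `v` query parameter.
    return url.searchParams.get('v');
  }
  return null;
}
```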

✅ POST `/simple-transcript`

Returns only the video title, duration, and concatenated transcript in the first available language.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID"
}

Response:

{
  "duration": 14,
  "title": "Video Title",
  "transcript": "This is the transcript..."
}

✅ POST `/simple-transcript-v2`

Returns the video title, duration, and concatenated transcript in a user-specified language (or falls back to English/first available language). Also includes a list of available languages if multiple exist.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "lang": "es"
}

The lang field is optional and takes a language code (e.g., "es" for Spanish).

Response:

{
  "duration": 14,
  "title": "Video Title",
  "transcript": "Transcript text in the requested language...",
  "languages": [
    { "name": "English", "code": "en" },
    { "name": "Spanish", "code": "es" }
  ]
}

Behavior:

If lang is specified, returns the transcript in that language (or errors if unavailable).
If no lang is provided, prioritizes English (non-auto-generated) or falls back to the first available language.
Includes languages array in the response when multiple subtitle tracks exist.
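
The selection rules above can be sketched as a small pure function. The track shape (`code`, `name`, `autoGenerated`) and both function names are assumptions made for illustration; the service's internal representation may differ.

```javascript
// Hypothetical track shape: { code: 'en', name: 'English', autoGenerated: false }.
// This sketches the selection rules described above; it is not the service's code.
function pickTrack(tracks, lang) {
  if (tracks.length === 0) {
    throw new Error('No subtitle tracks available');
  }
  if (lang) {
    const requested = tracks.find((t) => t.code === lang);
    if (!requested) {
      throw new Error(`No subtitles available for language "${lang}"`);
    }
    return requested;
  }
  // Prefer manually created English subtitles, otherwise take the first track.
  return tracks.find((t) => t.code === 'en' && !t.autoGenerated) || tracks[0];
}

// The `languages` array is only attached when more than one track exists.
function listLanguages(tracks) {
  return tracks.length > 1
    ? tracks.map(({ name, code }) => ({ name, code }))
    : undefined;
}
```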

Why Update?

The new /simple-transcript-v2 endpoint improves upon /simple-transcript by:

Supporting explicit language selection via lang parameter.
Providing transparency about available languages.
Maintaining backward compatibility with the original response format when no language is specified.

💡 Smart Caching with Firebase (for Transcripts and Summaries)

To avoid fetching and reprocessing the same YouTube video over and over, this service provides smart endpoints that store and reuse results using Firebase Firestore. These endpoints check whether a transcript or summary already exists in the database before doing any expensive computation or API calls.

This is ideal when you're using the service from a frontend (e.g., a Chrome Extension) where caching can significantly speed things up and reduce costs (e.g., OpenAI API requests).

🔐 Requirements for Using Smart Endpoints

You need to set up your own Firebase project and configure Firestore access for the service. Here's what you need to do:

1. Create a Firebase Project

Go to https://console.firebase.google.com
Click Add project → name it → continue
In the left panel, go to Firestore Database
Click Create database, start in production or test mode

2. Generate a Firebase Admin SDK Service Account

Go to your project settings (⚙️ > Project settings)
Click Service accounts
Click Generate new private key under the Firebase Admin SDK
Save the JSON file and rename it to:

firebaseServiceAccount.json

Place this file in the root of the project (where your index.js lives)

⚠️ DO NOT COMMIT this file to Git or push it to any public repo.

3. Enable Firestore API (if needed)

Sometimes Firestore is not enabled by default in your Google Cloud project. You can enable it at:

https://console.cloud.google.com/apis/library/firestore.googleapis.com

🧠 POST `/smart-transcript`

This endpoint checks if the transcript is already stored in Firestore. If found, it returns the saved version. If not, it fetches it from YouTube, saves it, and returns it.
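
The check-then-fetch flow described here is the usual cache-aside pattern. The sketch below illustrates it with a `Map` standing in for Firestore and a caller-supplied `fetchTranscript` function; both are assumptions for the example, as is the `fromCache` flag, which is borrowed from the summary endpoints to make the two paths visible.

```javascript
// Cache-aside sketch. A Map stands in for Firestore here; the real service
// would read and write documents in a Firestore collection instead.
async function getOrFetchTranscript(videoId, cache, fetchTranscript) {
  if (cache.has(videoId)) {
    // Cache hit: return the stored copy without touching YouTube.
    return { data: cache.get(videoId), fromCache: true };
  }
  // Cache miss: do the expensive fetch once, then store the result.
  const fresh = await fetchTranscript(videoId);
  cache.set(videoId, fresh);
  return { data: fresh, fromCache: false };
}
```

Calling the function twice with the same `videoId` and the same `Map` runs `fetchTranscript` only once; the second call is served from the cache.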

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID"
}

Response:

{
  "videoId": "VIDEO_ID",
  "title": "Video Title",
  "duration": 14,
  "transcript": "Full transcript text..."
}

🧠 POST `/smart-transcript-v2`

Fetches the transcript and metadata for a YouTube video and stores them in Firestore if they are not already present.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID"
}

Response:

{
  "videoId": "VIDEO_ID",
  "title": "Video Title",
  "duration": 14,
  "transcript": "Transcript text...",
  "description": "First line of the video description",
  "date": "2025-01-26",
  "image": "https://i.ytimg.com/vi/VIDEO_ID/maxresdefault.jpg",
  "tags": ["ios", "automation", "shortcuts"],
  "canonical_url": "https://blog.andreszenteno.com/notes/video-title"
}

🧠 POST `/smart-summary`

This endpoint checks Firestore for a summary of the video. If one exists, it's returned. If not, it uses the ChatGPT API to generate the summary (using the transcript), stores it in Firestore, and returns it.

You should send the transcript from the frontend if you already have it, to avoid duplicating the work of fetching it again.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "transcript": "Optional: transcript string"
}

Response:

{
  "summary": "This is the summary of the transcript.",
  "fromCache": true
}

fromCache: true → the summary was loaded from Firestore
fromCache: false → it was freshly generated using ChatGPT

🧠 POST `/smart-summary-firebase`

This endpoint provides similar functionality to /smart-summary but offloads the summary creation and caching to Firestore itself.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "model": "chatgpt"
}

The model field selects which model to use: chatgpt, deepseek, or anthropic.

Response:

{
  "summary": "This is the summary of the transcript.",
  "fromCache": true
}

✅ POST `/smart-summary-firebase-v2`

Generates a markdown-formatted AI summary with frontmatter and tags. Caches both in Firestore.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "model": "openai"
}

Response:

{
  "summary": "---\ntitle: \"...\"\ndate: ...\ntags: [...]\n---\nSummary content...",
  "fromCache": false
}

Uses OpenAI to generate both the summary and tags.
Saves results to both summaries and transcripts collections in Firestore.
Includes YouTube link and properly formatted frontmatter for markdown usage.

🧠 POST `/smart-summary-firebase-v3`

This endpoint improves upon v2 by retrieving extended video metadata from Firestore (such as category, video author, publish date) and formatting the summary as a Markdown document with a full YAML frontmatter block. The summary is cached in Firestore to avoid redundant AI calls. If no tags exist on the transcript, tags are generated from the title/description/summary.

Request:

{
  "url": "https://www.youtube.com/watch?v=VIDEO_ID",
  "model": "chatgpt"
}

The model field also accepts other values such as "anthropic" or "deepseek".

Response:

{
  "summary": "---\ntitle: \"...\"\ndate: ...\ndescription: ...\n...\n---\nSummary content...",
  "fromCache": false
}
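
To make the shape of the `summary` string concrete, here is a minimal sketch of how a Markdown document with a YAML frontmatter block could be assembled. The `buildMarkdownSummary` helper is hypothetical, and the field set is limited to the names visible in the response example; the service's real frontmatter likely contains more fields.

```javascript
// Hypothetical helper: wrap a summary in a YAML frontmatter block.
// Field names follow the response example above; the real set may differ.
function buildMarkdownSummary(meta, summary) {
  const lines = [
    '---',
    // JSON.stringify yields a double-quoted, escaped string, which YAML
    // accepts for typical text.
    `title: ${JSON.stringify(meta.title)}`,
    `date: ${meta.date}`,
    `description: ${JSON.stringify(meta.description)}`,
    `tags: [${meta.tags.map((t) => JSON.stringify(t)).join(', ')}]`,
    '---',
  ];
  return `${lines.join('\n')}\n${summary}`;
}
```

Quoting the title and description keeps characters such as colons or quotes from breaking the frontmatter.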

✅ POST `/smart-summary-firebase-v3/async`

Queues summary generation and returns immediately. Use this from UIs to avoid client timeouts on long videos.

Response (202):

{
  "requestId": "req-123",
  "status": "queued",
  "statusUrl": "/summary-status/req-123"
}

✅ GET `/summary-status/:requestId`

Polls async summary status and returns `queued`, `processing`, `succeeded` (with `result.summary`), or `failed` (with structured error).
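
A client would typically submit the async request and then poll the returned `statusUrl` until the job finishes. The loop below is a sketch under a few assumptions: `fetchStatus` is a caller-supplied function that requests the status URL and resolves to the parsed JSON body, and the interval and attempt limit are arbitrary example values.

```javascript
// Polling sketch. `fetchStatus` is a caller-supplied async function that
// requests the statusUrl and resolves to the parsed JSON body. The exact
// shape of the failure payload is an assumption, so it is passed through as-is.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForSummary(fetchStatus, { intervalMs = 2000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const body = await fetchStatus();
    if (body.status === 'succeeded') {
      return body.result.summary;
    }
    if (body.status === 'failed') {
      const err = new Error('Summary generation failed');
      err.details = body; // keep the structured error for the caller
      throw err;
    }
    // Still queued or processing: wait before asking again.
    await sleep(intervalMs);
  }
  throw new Error(`Gave up after ${maxAttempts} status checks`);
}
```

On Node 18 or later, `fetchStatus` could be as simple as `() => fetch(baseUrl + statusUrl).then((r) => r.json())`, where `baseUrl` is wherever the service is deployed.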

PM2 Notes

To start the server with PM2:

pm2 start ecosystem.config.js

Your ecosystem.config.js may look like this:

module.exports = {
  apps: [
    {
      name: 'youtube-transcript-generator',
      script: './index.js',
      watch: false,
      env: {
        PORT: 3004,
        CHATGPT_VERCEL_URL: 'https://your-vercel-url/api/openai-chat',
        API_ACCESS_KEY: 'your-shared-api-access-key'
      }
    }
  ]
};

To monitor logs:

pm2 logs youtube-transcript-generator

Troubleshooting

Empty transcripts: Ensure all_cookies.txt is fresh, yt-dlp is up to date, and deno is installed. Re-run a local check:
```
yt-dlp --cookies all_cookies.txt --js-runtimes deno --list-subs "https://www.youtube.com/watch?v=VIDEO_ID"
```
yt-dlp missing: Install or update from the official release and ensure /usr/local/bin is in PATH.
Verbose logs: Set TRANSCRIPT_DEBUG=true or SUMMARY_DEBUG=true in .env.
401 from model endpoints: Ensure API_ACCESS_KEY matches one of the keys configured in ai-access (API_ACCESS_KEYS).
413 on /smart-summary-firebase-v3: Transcript exceeded SMART_SUMMARY_MAX_TRANSCRIPT_CHARS.
Large transcript reliability: /smart-summary-firebase-v3 now uses direct mode for small transcripts and chunked hierarchical mode for large-but-valid transcripts.
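
The relationship between the size limits and the two modes can be sketched as follows. This is an illustration only: the default limits mirror the sample .env, the exact boundary comparisons are assumptions, and the fixed-width character splitter is a stand-in for whatever splitting the service really performs. In a hierarchical scheme, each chunk would presumably be summarized on its own and the partial summaries combined in a final pass.

```javascript
// Sketch of mode selection and overlapping character chunks.
// Limits mirror the sample .env; the service's real splitter may differ
// (for example, it might split on sentence boundaries).
function chooseMode(length, { maxChars = 300000, directMaxChars = 24000 } = {}) {
  if (length > maxChars) return 'too-large'; // the service answers 413 here
  return length <= directMaxChars ? 'direct' : 'chunked';
}

function chunkTranscript(text, targetChars = 8000, overlapChars = 200) {
  if (overlapChars >= targetChars) {
    throw new RangeError('overlapChars must be smaller than targetChars');
  }
  const chunks = [];
  const step = targetChars - overlapChars;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + targetChars));
    if (start + targetChars >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap repeats the tail of each chunk at the head of the next, so a sentence cut at a boundary still appears whole in one of the two chunks.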

Operational Checklist

Refresh all_cookies.txt on the same server where PM2 runs.
Verify yt-dlp --version and yt-dlp --list-subs work from the server.
Confirm deno is installed and usable by yt-dlp --js-runtimes deno.
Restart PM2 after any cookie or dependency changes.

Security Notes

Never commit all_cookies.txt or firebaseServiceAccount.json.
Rotate YouTube cookies if they are ever exposed in logs or chat.
Treat .env and any model endpoint URLs as secrets.

License

This project is licensed under the MIT License.
