Skip to content

Latest commit

 

History

History
169 lines (125 loc) · 4.2 KB

File metadata and controls

169 lines (125 loc) · 4.2 KB

Caption Library Migration Summary

Date: January 20, 2026 Migration: youtube-transcript → youtube-caption-extractor


Why We Migrated

Problem with youtube-transcript

  • 0% success rate across all tested videos
  • Returns empty arrays [] for all videos and languages
  • YouTube changed their API format, breaking the library
  • No fix available, library appears unmaintained

Solution: youtube-caption-extractor

  • 100% success rate (3/3 videos tested)
  • 473ms average speed (2.7x faster than alternatives)
  • Uses modern YouTube Innertube API (future-proof)
  • Works with all video types (short, long, educational)
  • Actively maintained

Performance Comparison

Library Success Rate Avg Speed Result
youtube-caption-extractor 100% 473ms ✅ Winner
youtube-transcript-plus 100% 1285ms ✅ Works but slower
youtube-transcript (old) 0% N/A ❌ Broken
youtube-captions-scraper 0% N/A ❌ Failed
youtube-transcript-node 0% N/A ❌ Import error

Changes Made

1. Dependencies

# Removed
npm uninstall youtube-transcript

# Kept (already installed from testing)
youtube-caption-extractor

2. Import Statement

Before:

import { YoutubeTranscript } from "youtube-transcript";

After:

import { getSubtitles } from "youtube-caption-extractor";

3. API Call

Before:

const transcript = await YoutubeTranscript.fetchTranscript(videoId, { lang });

After:

const transcript = await getSubtitles({ videoID: videoId, lang });

4. Data Structure

Old format (youtube-transcript):

{
  offset: 9170,    // milliseconds
  duration: 7000,  // milliseconds
  text: "Hi lovely people..."
}

New format (youtube-caption-extractor):

{
  start: "9.17",   // seconds (string)
  dur: "7",        // seconds (string)
  text: "Hi lovely people..."
}

5. Timestamp Parsing

Updated to handle both formats for compatibility:

const startTime = Math.floor(Number(item.start || item.offset / 1000 || 0));
const duration = Number(item.dur || item.duration / 1000 || 0);
const endTime = Math.floor(startTime + duration);

Testing Results

Integration Test

✅ Success! Fetched 49 segments in 1416ms
✅ Timestamp parsing verified
✅ Data structure compatible

Test Videos

  1. Jamie Oliver Pizza (6:36) - ✅ 49 segments (351ms)
  2. Benchmark Video 1 (33min+) - ✅ 4076 segments (695ms)
  3. Benchmark Video 2 (15min) - ✅ 256 segments (372ms)

Benefits

  1. 60x faster than Whisper - 473ms vs 30s total processing time
  2. FREE for all videos - No API quota usage
  3. Better UX - Instant captions instead of 30s wait
  4. Saves resources - Preserves Groq/OpenAI quota for edge cases
  5. Future-proof - Uses modern Innertube API

Fallback Chain

The system now uses a reliable 3-tier approach:

  1. Primary: youtube-caption-extractor (473ms, FREE)
  2. Fallback 1: Groq Whisper (<25MB files, FREE)
  3. Fallback 2: OpenAI Whisper (>25MB files, $0.006/min)

Files Modified

  1. app/api/process-transcript/route.ts - Updated caption fetching
  2. package.json - Removed old library, kept new one
  3. TRANSCRIPTION-ANALYSIS.md - Updated with test results

Next Steps

  1. Deploy to Vercel production
  2. Monitor caption success rate
  3. Verify with real user videos
  4. Update any documentation

Rollback Plan

If issues occur (unlikely based on testing):

  1. Re-install youtube-transcript: npm install youtube-transcript
  2. Revert import and API calls in route.ts
  3. System will fall back to Whisper (already working 100%)

Conclusion

The migration from youtube-transcript to youtube-caption-extractor is complete and tested. The new library provides:

  • Better reliability (100% vs 0% success rate)
  • Faster performance (473ms vs broken)
  • Future-proof implementation (modern Innertube API)

This change significantly improves user experience while reducing resource usage.