A web application for merging timestamped VTT files with corrected TXT/PDF transcripts to produce accurate, timestamped transcriptions.
When transcribing oral history interviews:
- Automated transcription (e.g., MacWhisper) produces VTT files with timestamps but uncorrected text
- Manual corrections are made in TXT or PDF format without timestamps
- The corrected text then has to be recombined with the original timestamps
This tool aligns the corrected transcript with the timestamped version and generates a new VTT file with:
- Corrected text from the TXT/PDF
- Accurate timestamps from the original VTT
- Timestamps only at speaker changes
- Automatic splitting of segments longer than 2 minutes
- Client-side processing: All processing happens in your browser - no server needed, completely private
- Smart text alignment: Matches the corrected text to the timestamped version using similarity scoring and sequence matching
- Speaker detection: Automatically identifies and tracks speaker changes using labels (e.g., "John:", "Interviewer:")
- Smart segmentation: Timestamps only at speaker changes, with automatic splitting of segments longer than 2 minutes
- Drag-and-drop interface: Easy-to-use web interface
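
To illustrate the speaker detection feature listed above, here is a minimal sketch that splits a plain-text transcript into speaker segments by looking for labels such as "John:" or "INTERVIEWER:" at the start of a line. The name `splitBySpeaker` and the label pattern are assumptions made for this example, not the code in transcript-parser.js; a line such as "Note: ..." would be misread as a speaker by this simple pattern.

```javascript
// Hypothetical helper, not taken from transcript-parser.js.
// Assumption: a label is 1-4 words, each starting with a capital letter,
// followed by a colon at the start of a line.
const SPEAKER_LABEL = /^([A-Z][\w.'-]*(?: [A-Z][\w.'-]*){0,3}):\s*(.*)$/;

function splitBySpeaker(text) {
  const segments = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;
    const match = line.match(SPEAKER_LABEL);
    if (match) {
      // A new label starts a new segment.
      current = { speaker: match[1], text: match[2] };
      segments.push(current);
    } else if (current) {
      // Unlabeled lines continue the current speaker's turn.
      current.text = (current.text + " " + line).trim();
    }
    // Lines before the first label are ignored in this sketch.
  }
  return segments;
}
```

Each returned object has a `speaker` and a `text` property, which is the shape the later alignment steps would need.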
- Open the app: Simply open index.html in a modern web browser (Chrome, Firefox, Safari, Edge)
- Upload your files:
  - Corrected Transcript (TXT or PDF): Your manually corrected transcript with speaker labels (TXT recommended for best results)
  - Timestamped VTT: The original VTT file from MacWhisper (or similar tool) with timestamps
- Process: Click the "Process Files" button
- Download: Download your new, corrected VTT file with accurate timestamps
- Modern web browser with JavaScript enabled
- TXT or PDF with speaker labels in format: "Speaker Name:" or "INTERVIEWER:"
- VTT file in standard WebVTT format
- Recommendation: Export corrected transcripts as TXT from Adobe Acrobat Pro (File → Export To → Text) for cleanest results
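
For reference, a standard WebVTT file starts with a `WEBVTT` header followed by cues, each made of a timing line (`start --> end`) and one or more text lines. The sketch below parses that shape into plain objects. The names `parseVtt` and `toSeconds` are hypothetical, and the parser deliberately ignores NOTE blocks, cue settings, and styling; it is not the code in vtt-parser.js.

```javascript
// Matches "HH:MM:SS.mmm --> HH:MM:SS.mmm" (the hours part is optional in WebVTT).
const TIMING =
  /^((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d{2,}:)?\d{2}:\d{2}\.\d{3})/;

// Convert "HH:MM:SS.mmm" or "MM:SS.mmm" to seconds.
function toSeconds(stamp) {
  return stamp
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

// Hypothetical parser: returns an array of { start, end, text } cues.
function parseVtt(vttText) {
  const cues = [];
  // Cues are separated by one or more blank lines.
  const blocks = vttText.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n").map((l) => l.trim()).filter(Boolean);
    const timingIndex = lines.findIndex((l) => TIMING.test(l));
    if (timingIndex === -1) continue; // header, NOTE block, or empty block
    const [, start, end] = lines[timingIndex].match(TIMING);
    cues.push({
      start: toSeconds(start),
      end: toSeconds(end),
      text: lines.slice(timingIndex + 1).join(" "),
    });
  }
  return cues;
}
```

Times are kept as seconds so that later steps can compare and subtract them directly.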
- Frontend: Pure HTML/CSS/JavaScript (no build process required)
- PDF Processing: PDF.js library for text extraction
- Text Alignment: Custom algorithm using similarity scoring and sequence matching
- Client-side only: No data leaves your computer
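
To give a feel for what similarity scoring can look like, the sketch below compares two passages with a Dice coefficient over their word sets. This is a stand-in chosen for illustration; the scoring actually used in text-aligner.js may differ, and the names `normalizeWords` and `similarity` are hypothetical.

```javascript
// Lower-case the text, drop punctuation, and split into words.
function normalizeWords(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s']/gu, " ")
    .split(/\s+/)
    .filter(Boolean);
}

// Dice coefficient over word sets: 0 means no shared words, 1 means identical sets.
function similarity(a, b) {
  const setA = new Set(normalizeWords(a));
  const setB = new Set(normalizeWords(b));
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  for (const word of setA) {
    if (setB.has(word)) shared++;
  }
  return (2 * shared) / (setA.size + setB.size);
}
```

A score like this tolerates the small differences expected between an automated transcript and its corrected version (casing, punctuation, a few changed words) while still separating unrelated passages.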
- Extracts text from the corrected TXT or PDF, identifying speaker segments
- Parses the timestamped VTT file into structured data
- Aligns the corrected text with the uncorrected text using similarity matching
- Transfers timestamps from the original VTT to the corrected text
- Merges consecutive segments from the same speaker
- Splits segments longer than 2 minutes at natural boundaries
- Generates a new VTT file ready for use
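
The last three steps can be sketched as follows, assuming each aligned segment is an object with `speaker`, `start` and `end` (in seconds), and `text`. All function names here are hypothetical. The split is simplified: it breaks at sentence boundaries and estimates intermediate timestamps in proportion to character count instead of looking them up in the original cues, and writing the speaker as a "Name:" prefix in the cue text is also an assumption.

```javascript
const MAX_SECONDS = 120; // the 2-minute limit

// Merge consecutive segments that share a speaker.
function mergeSameSpeaker(segments) {
  const merged = [];
  for (const seg of segments) {
    const last = merged[merged.length - 1];
    if (last && last.speaker === seg.speaker) {
      last.end = seg.end;
      last.text += " " + seg.text;
    } else {
      merged.push({ ...seg });
    }
  }
  return merged;
}

// Split an over-long segment at sentence boundaries (estimated timing).
function splitLong(seg) {
  const duration = seg.end - seg.start;
  if (duration <= MAX_SECONDS) return [seg];
  const sentences = seg.text.match(/[^.!?]+[.!?]*\s*/g) || [seg.text];
  const secondsPerChar = duration / seg.text.length;
  const parts = [];
  let chunk = "";
  let chunkStart = seg.start;
  for (const sentence of sentences) {
    const wouldBe = (chunk.length + sentence.length) * secondsPerChar;
    if (chunk && wouldBe > MAX_SECONDS) {
      // Close the current chunk before it exceeds the limit.
      const chunkEnd = chunkStart + chunk.length * secondsPerChar;
      parts.push({ speaker: seg.speaker, start: chunkStart, end: chunkEnd, text: chunk.trim() });
      chunkStart = chunkEnd;
      chunk = "";
    }
    chunk += sentence;
  }
  parts.push({ speaker: seg.speaker, start: chunkStart, end: seg.end, text: chunk.trim() });
  return parts;
}

// Format seconds as a WebVTT timestamp, HH:MM:SS.mmm.
function formatTime(seconds) {
  const ms = Math.round(seconds * 1000);
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3600000);
  const m = Math.floor((ms % 3600000) / 60000);
  const s = Math.floor((ms % 60000) / 1000);
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms % 1000, 3)}`;
}

// Serialize segments as WebVTT text.
function toVtt(segments) {
  const cues = segments.map(
    (seg) => `${formatTime(seg.start)} --> ${formatTime(seg.end)}\n${seg.speaker}: ${seg.text}`
  );
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}
```

Chaining the helpers as `toVtt(mergeSameSpeaker(segments).flatMap(splitLong))` yields the final file contents. A single sentence that by itself runs past the limit is left unsplit in this sketch.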
To modify or extend the application:
# Clone the repository
git clone https://github.com/jeffpooley/transcript-synchronizer.git
cd transcript-synchronizer
# Open index.html in your browser
open index.html

- index.html - Main application interface
- styles.css - Application styling
- app.js - Main application logic and UI handling
- transcript-parser.js - TXT/PDF text extraction and speaker detection
- vtt-parser.js - VTT file parsing and generation
- text-aligner.js - Text alignment and timestamp transfer algorithms
MIT License - feel free to use and modify for your projects.