Add YouTube design concept extractor tool (#432)

* feat: add YouTube design concept extractor tool

Extracts transcript, metadata, and keyframes from YouTube videos
into a structured markdown reference document for agent consumption.

Supports interval-based frame capture, scene-change detection, and
chapter-aware transcript grouping.
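
The two capture modes can be illustrated with a minimal sketch that only builds ffmpeg argument lists. It assumes the tool shells out to ffmpeg with the standard `fps` and `select` filters. The function names, output file patterns, and default values are hypothetical and are not taken from the tool itself.

```python
from pathlib import Path


def interval_capture_cmd(video: Path, out_dir: Path, every_s: int = 30) -> list[str]:
    """Build an ffmpeg command that saves one frame every `every_s` seconds."""
    if every_s <= 0:
        raise ValueError("every_s must be positive")
    return [
        "ffmpeg", "-hide_banner", "-i", str(video),
        # fps=1/N emits one output frame per N seconds of input.
        "-vf", f"fps=1/{every_s}",
        "-q:v", "2",
        str(out_dir / "frame_%04d.jpg"),
    ]


def scene_capture_cmd(video: Path, out_dir: Path, threshold: float = 0.3) -> list[str]:
    """Build an ffmpeg command that saves a frame whenever the scene score exceeds `threshold`."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [
        "ffmpeg", "-hide_banner", "-i", str(video),
        # select keeps only frames whose scene-change score is above the threshold.
        "-vf", f"select='gt(scene,{threshold})'",
        # Variable frame rate output so dropped frames are not duplicated;
        # newer ffmpeg builds prefer the equivalent `-fps_mode vfr`.
        "-vsync", "vfr",
        str(out_dir / "scene_%04d.jpg"),
    ]
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` without shell quoting.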

https://claude.ai/code/session_01KZxeSK9A2F2oZUoHgxUUBV

* feat: add OCR and color palette extraction to yt-design-extractor

- Add --ocr flag with Tesseract (fast) or EasyOCR (stylized text) engines
- Add --colors flag for dominant color palette extraction via ColorThief
- Add --full convenience flag to enable all extraction features
- Include OCR text alongside each frame in markdown output
- Add Visual Text Index section for searchable on-screen text
- Export ocr-results.json and color-palette.json for reuse
- Run OCR in parallel with ThreadPoolExecutor for performance
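
As a rough illustration of the parallel OCR step, the sketch below maps an OCR callable over frames with `ThreadPoolExecutor`. The names `tesseract_ocr` and `ocr_frames`, the worker count, and the returned dictionary shape are assumptions for this example and may differ from the actual tool.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def tesseract_ocr(frame: Path) -> str:
    """OCR a single frame with pytesseract (imported lazily so the module loads without it)."""
    import pytesseract
    from PIL import Image

    with Image.open(frame) as img:
        return pytesseract.image_to_string(img).strip()


def ocr_frames(
    frames: list[Path],
    ocr: Callable[[Path], str] = tesseract_ocr,
    workers: int = 4,
) -> dict[str, str]:
    """Run `ocr` over all frames in a thread pool and map frame name to recognized text."""
    if not frames:
        return {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, so zip pairs each frame with its text.
        texts = list(pool.map(ocr, frames))
    return {frame.name: text for frame, text in zip(frames, texts)}
```

Threads are a reasonable fit here because pytesseract spends most of its time waiting on the external `tesseract` process rather than holding the interpreter lock.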

* feat: add requirements.txt and Makefile for yt-design-extractor

- requirements.txt with core and optional dependencies
- Makefile with install, deps check, and run targets
- Support for make run-full, run-ocr, run-transcript variants
- Cross-platform install-ocr target (apt/brew/dnf)

* chore: move Makefile to project root for easier access

Now `make install-full` works from anywhere in the project.

* fix: make easyocr truly optional, fix install targets

- Remove easyocr from install-full (requires PyTorch, causes conflicts)
- Add separate install-easyocr target with CPU PyTorch from official index
- Update requirements.txt with clear instructions for optional easyocr
- Improve make deps output with clearer status messages
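
One way to treat easyocr as a truly optional dependency is to probe for it at runtime and fail loudly only when it is explicitly requested. The sketch below assumes that approach. The helper names `module_available` and `pick_ocr_engine`, and the error wording, are hypothetical.

```python
import importlib.util
from typing import Callable


def module_available(name: str) -> bool:
    """Return True if a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_ocr_engine(
    requested: str = "tesseract",
    is_available: Callable[[str], bool] = module_available,
) -> str:
    """Validate the requested OCR engine; never fall back silently to another one."""
    if requested not in {"tesseract", "easyocr"}:
        raise ValueError(f"unknown OCR engine: {requested!r}")
    if requested == "easyocr" and not is_available("easyocr"):
        # Explicit request for a missing optional dependency is an error.
        raise RuntimeError(
            "easyocr is not installed; run `make install-easyocr` or use tesseract"
        )
    return requested
```

Using `find_spec` avoids importing PyTorch just to check whether the engine exists.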

* fix: harden error handling and fix silent failures in yt-design-extractor

- Check ffmpeg return codes instead of silently producing 0 frames
- Add upfront shutil.which() checks for yt-dlp and ffmpeg
- Narrow broad except Exception catches (transcript, OCR, color)
- Log OCR errors instead of embedding error strings in output data
- Handle subprocess.TimeoutExpired on all subprocess calls
- Wrap video processing in try/finally for reliable cleanup
- Error on missing easyocr when explicitly requested (no silent fallback)
- Fix docstrings: 720p fallback, parallel OCR, chunk duration, deps
- Split pytesseract/Pillow imports for clearer missing-dep messages
- Add run-transcript to Makefile .PHONY and help target
- Fix variable shadowing in round_color (step -> bucket_size)
- Handle json.JSONDecodeError from yt-dlp metadata
- Format with ruff
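
Several of the fixes above follow a common pattern for calling external tools. The following is a minimal sketch of that pattern, not the tool's actual code: `ToolError`, `require_tools`, `run_checked`, and `parse_metadata` are hypothetical names, and the timeout default is an assumption.

```python
import json
import shutil
import subprocess


class ToolError(RuntimeError):
    """Raised when an external tool is missing, fails, times out, or returns bad output."""


def require_tools(*names: str) -> None:
    """Fail early if any required executable is not on PATH."""
    missing = [name for name in names if shutil.which(name) is None]
    if missing:
        raise ToolError(f"missing required tools: {', '.join(missing)}")


def run_checked(cmd: list[str], timeout_s: float = 600) -> str:
    """Run a command and return stdout, turning timeouts and non-zero exits into ToolError."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        raise ToolError(f"{cmd[0]} timed out after {timeout_s}s") from exc
    except FileNotFoundError as exc:
        raise ToolError(f"{cmd[0]} not found") from exc
    if proc.returncode != 0:
        # Surface the tail of stderr instead of silently continuing with no output.
        raise ToolError(f"{cmd[0]} exited with {proc.returncode}: {proc.stderr.strip()[-500:]}")
    return proc.stdout


def parse_metadata(raw: str) -> dict:
    """Parse JSON metadata text, reporting malformed output as ToolError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ToolError("metadata is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ToolError("metadata JSON is not an object")
    return data
```

A caller would typically invoke `require_tools("yt-dlp", "ffmpeg")` once at startup, then route every subprocess call through `run_checked` inside a `try/finally` that cleans up temporary files.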

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Seth Hobson <wshobson@gmail.com>
bentheautomator
2026-02-07 09:06:56 +08:00
committed by GitHub
parent 089740f185
commit 5d65aa1063
3 changed files with 950 additions and 0 deletions

tools/requirements.txt Normal file

@@ -0,0 +1,21 @@
# Core dependencies
yt-dlp>=2024.0.0
youtube-transcript-api>=0.6.0
Pillow>=10.0.0
# OCR (Tesseract) - also requires: apt install tesseract-ocr
pytesseract>=0.3.10
# Color palette extraction
colorthief>=0.2.1
# ---------------------------------------------------------
# OPTIONAL: EasyOCR (better for stylized text)
# ---------------------------------------------------------
# EasyOCR requires PyTorch (~2GB). Install separately:
#
# pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
# pip install easyocr
#
# Or just use tesseract (default) - it works great for most videos.