Audio pipeline¶
Three modules sit behind AutoRAG.transcribe:
autorag.whisper_runner— whisperX (faster-whisper + wav2vec2 forced-alignment) transcription with frame-accurate word timestamps and a CUDA→CPU fallback.autorag.diarize— pyannote 3.1 speaker diarization. Adds thespeakerfield on everyWordSpan.autorag.audio_source— YouTube URL detection and a context manager that downloads remote audio to a temp file while exposing yt-dlp metadata.