Audio pipeline ============== Three modules sit behind :meth:`AutoRAG.transcribe `: * :mod:`autorag.whisper_runner` — whisperX (faster-whisper + wav2vec2 forced-alignment) transcription with frame-accurate word timestamps and a CUDA→CPU fallback. * :mod:`autorag.diarize` — pyannote 3.1 speaker diarization. Adds the ``speaker`` field on every :data:`~autorag.types.WordSpan`. * :mod:`autorag.audio_source` — YouTube URL detection and a context manager that downloads remote audio to a temp file while exposing yt-dlp metadata. .. toctree:: :maxdepth: 1 audio-pipeline/whisper_runner audio-pipeline/diarize audio-pipeline/audio_source