autorag.audio_source

class autorag.audio_source.AudioSource(path, source_url, video_id, title=None, upload_date=None, duration_s=None, uploader=None)

Bases: object

Resolved audio input plus its original-source identity and metadata.

path is the local file that the rest of the pipeline reads. source_url and video_id are populated only when the input was a YouTube URL; for a local file both are None. The remaining fields (title, upload_date, duration_s, uploader) surface values from yt-dlp’s info dict, so downstream persistence can record human-readable metadata instead of falling back to the temporary filename or the file's mtime.

Parameters:
path: Path
source_url: str | None
video_id: str | None
title: str | None = None
upload_date: str | None = None
duration_s: float | None = None
uploader: str | None = None
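
As an illustration of the field layout above, here is a minimal stand-in record. It assumes the class behaves like a plain dataclass, which the signature suggests but this page does not confirm; the name AudioSourceSketch and all field values are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class AudioSourceSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    path: Path
    source_url: str | None
    video_id: str | None
    title: str | None = None
    upload_date: str | None = None
    duration_s: float | None = None
    uploader: str | None = None


# Local file: only `path` carries information, the identity fields are None.
local = AudioSourceSketch(path=Path("talk.wav"), source_url=None, video_id=None)

# YouTube input: identity and metadata fields are filled in.
# All values below are made up for illustration.
remote = AudioSourceSketch(
    path=Path("/tmp/example/abc123.m4a"),
    source_url="https://www.youtube.com/watch?v=abc123",
    video_id="abc123",
    title="Example talk",
    duration_s=312.5,
    uploader="Example channel",
)
```

Code that persists a clip can then test `source.video_id is None` to tell local inputs from downloaded ones.
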

autorag.audio_source.is_youtube_url(value)

Return True iff value parses as an http(s) URL on a YouTube host.

Parameters:

value (str)

Return type:

bool
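
A check of this kind can be sketched with the standard library alone. The code below is not the library's implementation: the function name looks_like_youtube_url and the host list are assumptions made for the example, and the real function may accept a different set of hosts.

```python
from urllib.parse import urlparse

# Assumed host list for illustration; the real function may differ.
_YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def looks_like_youtube_url(value: str) -> bool:
    """Return True if `value` is an http(s) URL whose host is in the list."""
    try:
        parsed = urlparse(value)
        host = (parsed.hostname or "").lower()
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    return host in _YOUTUBE_HOSTS
```

Matching on the parsed host rather than searching the raw string keeps inputs such as `https://example.com/youtube.com` or a local file named `youtube.com.wav` from being treated as YouTube URLs.
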

autorag.audio_source.default_title_from(source)

Derive a clip title from a local path or YouTube URL.

YouTube URLs resolve to the video id; local paths resolve to the file stem. Used as a fallback when neither a caller-supplied title nor a yt-dlp-provided title is available.

Parameters:

source (str)

Return type:

str
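
The fallback rule described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the library's code: fallback_title is a hypothetical name, and only the `watch?v=` and `youtu.be/` URL shapes are handled.

```python
from pathlib import Path
from urllib.parse import parse_qs, urlparse


def fallback_title(source: str) -> str:
    """Video id for the YouTube URL shapes handled here, else the file stem."""
    parsed = urlparse(source)
    host = (parsed.hostname or "").lower()
    if parsed.scheme in ("http", "https"):
        if host == "youtu.be":
            # Short links carry the id as the first path segment.
            video_id = parsed.path.strip("/").split("/")[0]
            if video_id:
                return video_id
        elif host == "youtube.com" or host.endswith(".youtube.com"):
            # Watch links carry the id in the `v` query parameter.
            ids = parse_qs(parsed.query).get("v")
            if ids:
                return ids[0]
    # Anything else is treated as a local path.
    return Path(source).stem
```

With this sketch, `fallback_title("recordings/talk.final.wav")` gives `"talk.final"`, because `Path.stem` drops only the last suffix.
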

autorag.audio_source.resolve_audio_input(source)

Yield an AudioSource for source.

If source is a YouTube URL, download the best audio stream into a temporary directory and yield a populated AudioSource whose path points into that tempdir. The tempdir is removed on exit. Otherwise treat source as a local path and yield a path-only AudioSource after verifying the file exists.

Parameters:

source (Path | str)

Return type:

Iterator[AudioSource]
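
The yield-then-clean-up behaviour described above matches the usual generator-based context manager pattern. The sketch below assumes that pattern; this page does not confirm how the function is decorated. To stay self-contained it yields a bare Path rather than an AudioSource and takes two hypothetical callables, is_remote and fetch, in place of the YouTube check and the yt-dlp download.

```python
from __future__ import annotations

import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator


@contextmanager
def resolve_sketch(
    source: str,
    is_remote: Callable[[str], bool],
    fetch: Callable[[str, Path], Path],
) -> Iterator[Path]:
    """Yield a readable local path; remove any temp directory on exit."""
    if is_remote(source):
        with tempfile.TemporaryDirectory() as tmp:
            # The downloaded file lives only as long as the caller's block.
            yield fetch(source, Path(tmp))
        return
    path = Path(source)
    if not path.is_file():
        raise FileNotFoundError(path)
    yield path


def fake_fetch(url: str, dest: Path) -> Path:
    """Stand-in for a real download: writes one placeholder byte."""
    out = dest / "audio.m4a"
    out.write_bytes(b"\x00")
    return out


with resolve_sketch(
    "https://example.invalid/clip",
    lambda s: s.startswith("https://"),
    fake_fetch,
) as audio_path:
    existed_inside = audio_path.exists()

# The temp directory, and the file in it, are gone once the block exits.
existed_after = audio_path.exists()
```

The practical consequence is that any work that needs the downloaded file has to happen inside the `with` block; a path kept past that point refers to a deleted file.
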