CLI (autorag.cli)
The autorag console script is a thin Typer wrapper over
AutoRAG. It exposes the commands
transcribe, generate-topics, blocks, ingest,
query, and serve, plus the optional jobs submit /
jobs status subcommands (async pipeline; requires the [broker]
extra). The audio commands manage the lifetime of any temporary files
created for YouTube URLs and forward optional metadata (title, upload
date, source URL) to AutoRAG.persist_transcription.
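The temp-file ownership described above can be pictured with a small context manager. This is only a sketch under stated assumptions, not the code in autorag.cli: the names is_youtube_url and resolved_audio, and the download callable, are hypothetical and stand in for whatever the CLI actually uses.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator
from urllib.parse import urlparse


def is_youtube_url(source: str) -> bool:
    """Return True when source looks like a YouTube URL rather than a local path."""
    host = (urlparse(source).hostname or "").lower()
    return host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com")


@contextmanager
def resolved_audio(
    source: str, download: Callable[[str, Path], Path]
) -> Iterator[Path]:
    """Yield a local audio path for source.

    Local files are passed through untouched. For YouTube URLs, the
    (hypothetical) download callable writes into a temporary directory
    that is deleted as soon as the with-block exits.
    """
    if not is_youtube_url(source):
        yield Path(source)
        return
    with tempfile.TemporaryDirectory(prefix="autorag-") as tmp:
        yield download(source, Path(tmp))
```

A command body would do its transcription inside the with-block, so the downloaded audio never outlives the command, while metadata such as the title or source URL can still be passed on afterwards.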
- autorag.cli.ingest(paths=<typer.models.ArgumentInfo object>)
Ingest one or more files/directories into the vector store.
- autorag.cli.query(question=<typer.models.ArgumentInfo object>, top_k=<typer.models.OptionInfo object>)
Ask a question against the ingested corpus.
- autorag.cli.transcribe(source=<typer.models.ArgumentInfo object>, title=<typer.models.OptionInfo object>, whisper_model=<typer.models.OptionInfo object>, language=<typer.models.OptionInfo object>, persist=<typer.models.OptionInfo object>, db_override=<typer.models.OptionInfo object>)
Transcribe an audio file or YouTube URL and output word spans as JSON.
- autorag.cli.generate_topics(source=<typer.models.ArgumentInfo object>, title=<typer.models.OptionInfo object>, whisper_model=<typer.models.OptionInfo object>, provider=<typer.models.OptionInfo object>, llm_model=<typer.models.OptionInfo object>, num_ctx_l1=<typer.models.OptionInfo object>, num_ctx_fanout=<typer.models.OptionInfo object>, max_concurrency=<typer.models.OptionInfo object>, min_subdivide_duration_s=<typer.models.OptionInfo object>, reasoning=<typer.models.OptionInfo object>, boundary_block_seconds=<typer.models.OptionInfo object>, language=<typer.models.OptionInfo object>, transcription_json=<typer.models.OptionInfo object>, persist=<typer.models.OptionInfo object>, db_override=<typer.models.OptionInfo object>)
Generate topics for an audio file or YouTube URL, transcribing first if not cached.
- autorag.cli.blocks(source=<typer.models.ArgumentInfo object>, seconds=<typer.models.OptionInfo object>, force_retranscribe=<typer.models.OptionInfo object>, title=<typer.models.OptionInfo object>, db_override=<typer.models.OptionInfo object>, whisper_model=<typer.models.OptionInfo object>, language=<typer.models.OptionInfo object>)
Print the transcription as N-second time blocks, one line per speaker turn.
Reads from the cached SQLite row when present; otherwise runs Whisper transcription and persists the words first. Topic generation is not performed here; use the
generate-topics command for that.
- autorag.cli.jobs_submit(source=<typer.models.ArgumentInfo object>, title=<typer.models.OptionInfo object>, whisper_model=<typer.models.OptionInfo object>, llm_model=<typer.models.OptionInfo object>, language=<typer.models.OptionInfo object>)
Enqueue an audio→topics job on the broker; prints the job id.
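The cache-then-print behaviour described for blocks above can be sketched as follows. This is an illustration under stated assumptions, not AutoRAG's implementation: the words table, the (start, speaker, text) word shape, the output line format, and the transcribe callable are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable

# Hypothetical word shape: (start_seconds, speaker, text).
Word = tuple[float, str, str]


def load_or_transcribe(
    conn: sqlite3.Connection,
    source: str,
    transcribe: Callable[[str], Iterable[Word]],
) -> list[Word]:
    """Return cached words for source; on a miss, transcribe and persist first."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words "
        "(source TEXT, start REAL, speaker TEXT, text TEXT)"
    )
    rows = conn.execute(
        "SELECT start, speaker, text FROM words WHERE source = ? ORDER BY start",
        (source,),
    ).fetchall()
    if rows:
        return [(float(s), sp, t) for s, sp, t in rows]
    # Cache miss: run the (stand-in) transcriber, then persist its words.
    words = list(transcribe(source))
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [(source, s, sp, t) for s, sp, t in words],
    )
    conn.commit()
    return words


def format_blocks(words: Iterable[Word], seconds: float) -> list[str]:
    """Group words into fixed-length time blocks, one line per speaker turn."""
    lines: list[str] = []
    current_key = None
    for start, speaker, text in sorted(words):
        key = (int(start // seconds), speaker)
        if key != current_key:
            # New block or new speaker: start a fresh output line.
            lines.append(f"[{key[0] * seconds:.0f}s] {speaker}: {text}")
            current_key = key
        else:
            lines[-1] += f" {text}"
    return lines
```

In this sketch, a second call for the same source is served from SQLite and never reaches the transcriber, which mirrors the "cached row when present" behaviour in the docstring.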