CLI reference
=============

``autorag`` is a thin Typer wrapper over
:class:`~autorag.core.AutoRAG`. Every subcommand maps to one (or two)
SDK methods.

``autorag transcribe``
----------------------

Whisper + diarization → ``WordSpan`` list. With ``--persist`` (the
default), the words are written to SQLite.

.. code-block:: text

   autorag transcribe SOURCE [OPTIONS]

     SOURCE                      Audio file path or YouTube URL.
     --title           -t TEXT   Clip title (defaults to YouTube video title
                                 for URLs, else filename stem / video id).
     --whisper-model   -w TEXT   tiny / base / small / medium / large
                                 [default: base]
     --language        -l TEXT   Whisper language code [default: en];
                                 pass '' to auto-detect.
     --persist/--no-persist      Write word spans to SQLite (default: true).
     --db PATH                   Override database path.

``autorag generate-topics``
---------------------------

Full audio→topics pipeline: transcribe (or read from cache, or accept a
pre-computed ``--transcription`` JSON), run the LLM topic extraction,
persist everything.

.. code-block:: text

   autorag generate-topics SOURCE [OPTIONS]

     --provider        -p TEXT   LLM provider [default: ollama]
     --llm-model       -m TEXT   LLM model [default: gemma4:latest]
     --language        -l TEXT   Whisper language code [default: en];
                                 pass '' to auto-detect.
     --num-ctx-l1      INT       LLM context for the Stage 2 L1-boundary
                                 call [default: 8192]; raise to ~16384 for
                                 1hr+ audio (costs one model reload).
     --num-ctx-fanout  INT       LLM context for the batched fan-out
                                 stages 3a/3b/4/5 [default: 8192].
     --max-concurrency INT       Max parallel LLM calls in batched stages
                                 [default: 4]; match OLLAMA_NUM_PARALLEL.
     --min-subdivide-duration-s  Minimum L1 span length in seconds
                       FLOAT     before the L2 subdivide decision runs
                                 [default: 120.0].
     --reasoning/--no-reasoning  Enable chain-of-thought on
                                 thinking-capable models (slower;
                                 default: --no-reasoning).
     --boundary-block-seconds    Time-bucket window (s) for the L1/L2
                       INT       boundary-prompt transcript [default: 30];
                                 smaller = finer MM:SS anchors but more
                                 prompt tokens.
     --transcription   -T TEXT   Pre-computed WordSpan JSON (skip Whisper)
     --persist/--no-persist      Write transcription + topics to
                                 SQLite/Chroma (default: true).

Outputs the persisted topic JSON to stdout; a timing breakdown
(whisper / agent / cli_store_words / cli_finalize / cli_embed) goes to
stderr.

``autorag blocks``
------------------

Cached, dependency-friendly view of a previously transcribed clip:
``MM:SS-MM:SS Speaker K: …`` lines bucketed into N-second blocks.

.. code-block:: text

   autorag blocks SOURCE [OPTIONS]

     --seconds         -n INT    Block length [default: 10]
     --force-retranscribe        Re-run transcription even if cached.

Reads straight from SQLite when the clip is already there; only the
``[rag]`` extra is needed for the cache hit. On a miss the ``[audio]`` /
``[diarize]`` / ``[youtube]`` extras are imported lazily to run the full
pipeline first, then format.

Equivalent SDK call: :meth:`AutoRAG.transcribe_blocks`.

``autorag ingest``
------------------

.. code-block:: text

   autorag ingest PATH [PATH ...]

Ingest one or more files or directories into the vector store.

``autorag query``
-----------------

.. code-block:: text

   autorag query QUESTION [--top-k K]

Ask a question against the ingested corpus and print the generated
answer.

``autorag serve``
-----------------

.. code-block:: text

   autorag serve [--host HOST] [--port PORT] [--reload]

Run the HTTP API server (default ``http://127.0.0.1:8000``). See
:doc:`server`.

``autorag jobs``
----------------

Optional async pipeline. Needs the ``[broker]`` + ``[rag]`` extras and a
running RabbitMQ; it is fully decoupled from the synchronous commands
above (installing or running it changes nothing about ``transcribe`` /
``generate-topics`` / ``serve``).

.. code-block:: text

   autorag jobs submit SOURCE [OPTIONS]

     SOURCE                      Audio file path or YouTube URL.
     --title           -t TEXT   Clip title.
     --whisper-model   -w TEXT   tiny / base / small / medium / large
                                 [default: base]
     --llm-model       -m TEXT   LLM model [default: gemma4:latest]
     --language        -l TEXT   Whisper language code [default: en]

   autorag jobs status JOB_ID

``submit`` enqueues the job on the broker and prints
``{"job_id": …, "session_id": …}``; ``status`` prints the job's status +
per-stage state as JSON. A finished async job writes the **same**
SQLite / Chroma rows a CLI run would, so :doc:`/viz` and every other
reader work unchanged. Without the extras the commands exit with an
install hint. See the "Async pipeline" section of :doc:`server` and
``CLAUDE.md`` for the architecture.
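
Because ``submit`` and ``status`` both print JSON, they are easy to drive
from a script. The following is a minimal sketch, not part of the
``autorag`` SDK: the helper names (``build_submit_argv``,
``parse_submit_output``, ``submit``, ``status``) are hypothetical, and it
assumes that stdout contains only the JSON object and that ``job_id`` is
printed as a string. Only the flags documented above are used.

.. code-block:: python

   import json
   import subprocess


   def build_submit_argv(source, title=None, whisper_model="base", language="en"):
       """Build the argv for ``autorag jobs submit`` from the documented flags."""
       argv = [
           "autorag", "jobs", "submit", source,
           "--whisper-model", whisper_model,
           "--language", language,
       ]
       if title is not None:
           argv += ["--title", title]
       return argv


   def parse_submit_output(stdout):
       """Return (job_id, session_id) from the JSON that ``submit`` prints.

       Assumption: stdout holds nothing but the JSON object.
       """
       payload = json.loads(stdout)
       return payload["job_id"], payload["session_id"]


   def submit(source, **kwargs):
       """Run ``autorag jobs submit`` and return (job_id, session_id)."""
       result = subprocess.run(
           build_submit_argv(source, **kwargs),
           capture_output=True, text=True, check=True,
       )
       return parse_submit_output(result.stdout)


   def status(job_id):
       """Run ``autorag jobs status`` and return the decoded JSON as-is.

       Assumption: job_id is a string. The shape of the status document
       is not specified here, so no keys are read from it.
       """
       result = subprocess.run(
           ["autorag", "jobs", "status", job_id],
           capture_output=True, text=True, check=True,
       )
       return json.loads(result.stdout)

With ``check=True``, a missing extra (where the command exits with an
install hint) should surface as ``subprocess.CalledProcessError``, provided
the exit code is non-zero, rather than as a JSON decoding error.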