Installation
AutoRAG is distributed from GitHub, not PyPI. The base install carries
only typer, pydantic, langchain-core, and
langchain-ollama — anything heavier (Whisper, pyannote, Chroma,
UMAP, yt-dlp, FastAPI) is gated behind an install extra, so that
`from autorag import AutoRAG` stays fast.
Choosing your extras
Pick the smallest set that unlocks the methods you intend to call:
| Extra | Adds | Use when you want… |
|---|---|---|
| [audio] | whisperx, torch, imageio-ffmpeg | …to call |
| [diarize] | pyannote.audio, huggingface-hub | …speaker labels on every word (combine with |
| [youtube] | yt-dlp | …to pass a YouTube URL as |
| [rag] | chromadb, umap-learn, scikit-learn, pydantic_sqlite, numpy | … |
| [server] | fastapi, uvicorn[standard] | … |
| [broker] | pika | …the async, RabbitMQ-driven pipeline ( |
| [all] | everything above | …the full local-dev stack. |
[diarize] rides on top of [audio] — pyannote needs the same
torch + ffmpeg stack. Install them together.
Installing from a tagged release
# Audio → topics agent only
pip install "autorag[audio,diarize] @ git+https://github.com/AutoLogger/AutoRAG@v0.7.0"
# Audio + YouTube URL support
pip install "autorag[audio,diarize,youtube] @ git+https://github.com/AutoLogger/AutoRAG@v0.7.0"
# Full stack (audio, diarize, rag, server, youtube)
pip install "autorag[all] @ git+https://github.com/AutoLogger/AutoRAG@v0.7.0"
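The commands above all share one shape: a package name, a bracketed list of extras, and a Git URL pinned to a tag. The sketch below assembles that string in Python, which can be handy in provisioning scripts. The helper name `pip_requirement` is hypothetical and not part of AutoRAG; the only project-specific facts it encodes are the repository URL and the rule that [diarize] should be installed together with [audio].

```python
# Hypothetical helper (not part of AutoRAG): builds the requirement
# string that `pip install` expects for a tagged GitHub release.
REPO_URL = "git+https://github.com/AutoLogger/AutoRAG"


def pip_requirement(extras, tag):
    """Return e.g. 'autorag[audio,diarize] @ git+https://...@v0.7.0'."""
    chosen = list(dict.fromkeys(extras))  # de-duplicate, keep order
    if "diarize" in chosen and "audio" not in chosen and "all" not in chosen:
        # [diarize] rides on top of [audio], so add it when it is missing.
        chosen.insert(0, "audio")
    extras_part = f"[{','.join(chosen)}]" if chosen else ""
    return f"autorag{extras_part} @ {REPO_URL}@{tag}"


if __name__ == "__main__":
    # Pass the printed string, quoted, to `pip install`.
    print(pip_requirement(["diarize", "youtube"], "v0.7.0"))
```

An empty extras list yields the base install, which carries only the lightweight dependencies.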
Calling a method whose extra is missing raises
MissingExtraError with a hint naming the
pip install command that fixes it.
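To illustrate how this kind of gating can work in general, here is a minimal sketch of a lazy import guard. It is not AutoRAG's implementation: the `MissingExtraError` class below is a stand-in with an assumed constructor and message, and `require_extra` is a hypothetical helper. The real class, its import path, and the exact hint text may differ.

```python
import importlib
import importlib.util


class MissingExtraError(ImportError):
    """Stand-in for illustration; AutoRAG's real class may differ."""

    def __init__(self, module, extra):
        self.module = module
        self.extra = extra
        # Assumed hint format: name the extra that provides the module.
        super().__init__(
            f"'{module}' is not installed. Fix with: "
            f'pip install "autorag[{extra}] @ '
            f'git+https://github.com/AutoLogger/AutoRAG"'
        )


def require_extra(module, extra):
    """Import a top-level module on first use, or raise MissingExtraError."""
    if importlib.util.find_spec(module) is None:
        raise MissingExtraError(module, extra)
    return importlib.import_module(module)


if __name__ == "__main__":
    try:
        # A deliberately nonexistent module name, to trigger the error path.
        require_extra("module_that_does_not_exist_xyz", "audio")
    except MissingExtraError as err:
        print(err)
```

Deferring the import until a method is called is what keeps the top-level package import cheap: heavy dependencies are only loaded when a feature that needs them is actually used.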
Local development
Inside a checkout of the repository, AutoRAG uses uv (not pip):
uv sync --all-extras # install everything
uv sync --group docs # add the docs build deps
uv run pytest # run the test suite
uv run autorag --help # invoke the CLI
See Packaging and distribution for the release flow.
Required external services
AutoRAG calls Ollama for LLM chat and embeddings; you need a local (or
remote) Ollama running before invoking generate_topics, query,
or the /viz page. Diarization needs an HF token; the optional async
pipeline needs a broker:
- Ollama: `AUTORAG_OLLAMA_BASE_URL` (default `http://localhost:11434`) and `AUTORAG_EMBED_MODEL` (default `nomic-embed-text`). LLM chat uses `gemma4:latest` by default (a thinking-capable model; the agent disables thinking by default; see the audio-pipeline-design internals page).
- Hugging Face: `HF_TOKEN` is required for the gated `pyannote/speaker-diarization-3.1` model. Without it, every word is labelled `"0"`.
- RabbitMQ: only for the async [broker] path; `AUTORAG_BROKER_URL` (default `amqp://localhost:5672`). The synchronous SDK / CLI / API never need it. The whole async stack (RabbitMQ + Ollama + workers + `docker-socket-proxy`) is brought up on the host with one command, `./scripts/stack.sh up` (`.env` is optional; it is only needed to set `HF_TOKEN` for diarization), or point `AUTORAG_BROKER_URL` at an external broker. See Running the HTTP server for the full deployment.
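The environment variables and defaults listed above can be checked before a first run with a short preflight script. The sketch below is an assumption-laden illustration, not AutoRAG code: `resolve_settings` and `ollama_reachable` are hypothetical helpers, and the reachability probe relies on Ollama's `/api/tags` endpoint (its list-models route) answering with HTTP 200.

```python
import os
import urllib.error
import urllib.request

# Defaults as documented above.
DEFAULTS = {
    "AUTORAG_OLLAMA_BASE_URL": "http://localhost:11434",
    "AUTORAG_EMBED_MODEL": "nomic-embed-text",
    "AUTORAG_BROKER_URL": "amqp://localhost:5672",
}


def resolve_settings(env=None):
    """Hypothetical helper: merge environment values over the defaults."""
    env = os.environ if env is None else env
    settings = {name: env.get(name) or default for name, default in DEFAULTS.items()}
    # HF_TOKEN has no default; None means diarization cannot load its model.
    settings["HF_TOKEN"] = env.get("HF_TOKEN") or None
    return settings


def ollama_reachable(base_url, timeout=2.0):
    """Hypothetical probe: True if GET <base_url>/api/tags returns 200."""
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        return False


if __name__ == "__main__":
    settings = resolve_settings()
    for name, value in settings.items():
        # Avoid echoing the token itself.
        shown = "set" if name == "HF_TOKEN" and value else value
        print(f"{name} = {shown}")
```

Calling `ollama_reachable(settings["AUTORAG_OLLAMA_BASE_URL"])` before `generate_topics` or `query` gives an early, readable failure instead of a connection error deep inside a pipeline run.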