Document RAG¶

In addition to the audio→topics pipeline, AutoRAG exposes a generic document retrieval-augmented generation flow:

from autorag import AutoRAG

rag = AutoRAG()
rag.ingest(["./notes", "./design-docs"])
answer = rag.query("What did we decide about retries?", top_k=5)
print(answer)

The flow:

autorag.ingest.load_documents() reads each path into a Document.
autorag.ingest.chunk_document() splits each document into overlapping Chunk records sized by Settings.chunk_size / Settings.chunk_overlap.
autorag.embed.Embedder calls Ollama (nomic-embed-text by default) and writes the resulting vectors into each chunk.
A autorag.store.VectorStore persists them.
At query time, autorag.retrieve.Retriever embeds the question and pulls the top_k nearest chunks.
autorag.generate.Generator assembles a context-grounded prompt and returns an answer string.

Configuration¶

The defaults live on autorag.config.Settings and can be overridden via environment variables (all prefixed with AUTORAG_):

Setting	Default	Env var
`chunk_size`	1000	`AUTORAG_CHUNK_SIZE`
`chunk_overlap`	200	`AUTORAG_CHUNK_OVERLAP`
`top_k`	5	`AUTORAG_TOP_K`
`db_path`	`~/.autorag/autorag.db`	`AUTORAG_DB_PATH`

Status¶

The RAG pipeline modules (ingest, store, generate, retrieve) ship as stub interfaces. The audio→topics pipeline is fully implemented; the document-RAG side is the natural place to plug in your own loaders and vector store.

The CLI commands autorag ingest and autorag query are wired up to the SDK so a backend swap surfaces through the CLI and HTTP API without further changes.