Document RAG
In addition to the audio→topics pipeline, AutoRAG exposes a generic document retrieval-augmented generation flow:
from autorag import AutoRAG
rag = AutoRAG()
rag.ingest(["./notes", "./design-docs"])
answer = rag.query("What did we decide about retries?", top_k=5)
print(answer)
The flow:
1. autorag.ingest.load_documents() reads each path into a Document.
2. autorag.ingest.chunk_document() splits each document into overlapping Chunk records sized by Settings.chunk_size / Settings.chunk_overlap.
3. autorag.embed.Embedder calls Ollama (nomic-embed-text by default) and writes the resulting vectors into each chunk.
4. An autorag.store.VectorStore persists them.
5. At query time, autorag.retrieve.Retriever embeds the question and pulls the top_k nearest chunks.
6. autorag.generate.Generator assembles a context-grounded prompt and returns an answer string.
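
To make the chunking step concrete, here is a minimal sketch of overlapping chunking. It is not AutoRAG's implementation: the `Chunk` dataclass and `chunk_text` function are hypothetical stand-ins, and it assumes sizes are measured in characters, which the text above does not state.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Illustrative stand-in; not AutoRAG's actual Chunk type."""
    text: str
    start: int  # character offset of this chunk in the source text


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[Chunk]:
    """Split text into fixed-size character windows that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks: list[Chunk] = []
    for start in range(0, len(text), step):
        chunks.append(Chunk(text=text[start:start + chunk_size], start=start))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With a size of 1000 and an overlap of 200, each window advances 800 characters, so the last 200 characters of one chunk reappear at the start of the next. That keeps a sentence that straddles a boundary retrievable from at least one chunk.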
Configuration
The defaults live on autorag.config.Settings and can be
overridden via environment variables (all prefixed with
AUTORAG_):
| Setting | Default | Env var |
|---|---|---|
| | 1000 | |
| | 200 | |
| | 5 | |
| | | |
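
The prefix-based override can be sketched as follows. This is an assumption-laden illustration, not the real autorag.config.Settings: the `Settings` dataclass and `load_settings` helper are hypothetical, the variable names are guessed as `AUTORAG_` plus the upper-cased field name, and the defaults 1000, 200 and 5 are assumed to belong to chunk_size, chunk_overlap and top_k respectively.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class Settings:
    """Illustrative stand-in for autorag.config.Settings."""
    chunk_size: int = 1000     # assumed pairing of name and default
    chunk_overlap: int = 200   # assumed pairing of name and default
    top_k: int = 5             # field name is an assumption


def load_settings(env=None, prefix: str = "AUTORAG_") -> Settings:
    """Build Settings, letting prefix + FIELD_NAME override each default."""
    env = os.environ if env is None else env
    overrides = {}
    for field in fields(Settings):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            # Coerce the string from the environment to the default's type.
            overrides[field.name] = type(field.default)(raw)
    return Settings(**overrides)
```

Under these assumptions, running with `AUTORAG_CHUNK_SIZE=512` in the environment would yield a `Settings` whose `chunk_size` is 512 while the other fields keep their defaults.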
Status
The RAG pipeline modules (ingest, store, generate,
retrieve) ship as stub interfaces. The audio→topics pipeline is
fully implemented; the document-RAG side is the natural place to plug
in your own loaders and vector store.
The CLI commands autorag ingest and autorag query are wired up
to the SDK so a backend swap surfaces through the CLI and HTTP API
without further changes.
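
As an illustration of the kind of backend that could be plugged in, here is a minimal in-memory vector store with cosine-similarity search. The class name and its `add` / `search` methods are hypothetical; the stub interface that autorag.store.VectorStore actually defines is not shown here, so treat this as a sketch to adapt rather than a drop-in implementation.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical backend; method names are not AutoRAG's real interface."""

    def __init__(self):
        self._items = []  # list of (vector, text) pairs

    def add(self, vector, text: str) -> None:
        """Store one embedded chunk."""
        self._items.append((list(vector), text))

    def search(self, query_vector, top_k: int = 5):
        """Return up to top_k (score, text) pairs, most similar first."""
        scored = [(cosine(query_vector, vec), text) for vec, text in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

A linear scan like this is fine for a few thousand chunks; a larger corpus would normally call for an approximate nearest-neighbour index behind the same two methods.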