Visualization (/viz)¶
GET /viz serves an interactive 3-D scatter plot of every topic in
the SQLite database. The page is a Vite-built React + r3f bundle (see
Frontend (/viz)); the data behind it comes from two
JSON endpoints.
Requirements: the [rag] extra (Chroma + UMAP + sklearn) and the
[server] extra (FastAPI). Without [rag], the FastAPI app boots
fine but the /viz route and /viz-assets mount are silently
skipped.
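Because the skip is silent, it can help to confirm that the optional dependencies are importable before starting the server. The following is a minimal diagnostic sketch, not the check autorag itself performs. It assumes the extras install modules importable as chromadb, umap, sklearn, and fastapi; the helper names are hypothetical.

```python
from importlib.util import find_spec

# Assumed import names for each extra; the real package metadata may list more.
EXTRA_MODULES = {
    "rag": ["chromadb", "umap", "sklearn"],
    "server": ["fastapi"],
}


def missing_modules(names):
    """Return the module names that cannot be located on the import path."""
    return [name for name in names if find_spec(name) is None]


def report_extras(extras=EXTRA_MODULES):
    """Map each extra to the list of its modules that are not installed."""
    return {extra: missing_modules(mods) for extra, mods in extras.items()}


if __name__ == "__main__":
    for extra, missing in report_extras().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"[{extra}] {status}")
```

If the [rag] line reports anything missing, the /viz route is likely to be one of the skipped pieces.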
Running it¶
autorag serve --host 127.0.0.1 --port 8000
Then open http://127.0.0.1:8000/viz.
The pipeline behind the page¶
1. Embed: every persisted topic’s "<title>. <summary>" is embedded with the Ollama embedding model (default nomic-embed-text). Cached vectors are read from Chroma; missing ones are computed on demand by autorag.embed.Embedder and written back.
2. Project: autorag.viz.umap_3d() projects every vector to 3 dimensions with metric="cosine" and n_neighbors=15.
3. Cluster: autorag.topic_cluster.cluster_embeddings() groups topics via agglomerative clustering (linkage="average"). The cut is controlled by the distance_threshold query param (default 0.35).
4. Edges: autorag.topic_cluster.build_edges() wires each topic’s top-5 cosine neighbours above 0.60 similarity as undirected edges in the scatter.
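The clustering and edge steps can be illustrated with a small pure-Python sketch. This is not the autorag implementation: the function names sketch_cluster and sketch_build_edges are hypothetical, the clustering is a naive average-linkage loop over cosine distance, and treating "above 0.60" as a strict inequality is an assumption. The UMAP projection step is left out because it needs the umap-learn package.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sketch_cluster(vectors, distance_threshold=0.35):
    """Naive average-linkage agglomerative clustering on cosine distance.

    Clusters keep merging until the closest pair is at or above
    distance_threshold. Returns one integer label per input vector.
    """
    clusters = [[i] for i in range(len(vectors))]

    def average_distance(left, right):
        total = sum(
            1.0 - cosine_similarity(vectors[i], vectors[j])
            for i in left
            for j in right
        )
        return total / (len(left) * len(right))

    while len(clusters) > 1:
        distance, x, y = min(
            (average_distance(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if distance >= distance_threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x is unaffected

    labels = [0] * len(vectors)
    for label, members in enumerate(clusters):
        for index in members:
            labels[index] = label
    return labels


def sketch_build_edges(vectors, top_k=5, min_similarity=0.60):
    """Link each vector to its top_k most similar neighbours above the cutoff.

    Edges are undirected, so (i, j) and (j, i) collapse into one sorted pair.
    """
    edges = set()
    for i, vector in enumerate(vectors):
        scored = sorted(
            ((cosine_similarity(vector, other), j)
             for j, other in enumerate(vectors) if j != i),
            reverse=True,
        )
        for similarity, j in scored[:top_k]:
            if similarity > min_similarity:  # strict ">" is an assumption
                edges.add((min(i, j), max(i, j)))
    return sorted(edges)
```

Raising distance_threshold merges more topics into fewer, larger clusters; lowering it splits them apart. The edge cutoff is independent of the clustering, so two topics in different clusters can still be linked if they are similar enough.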
Endpoints¶
- GET /viz returns the HTML for the React app.
- GET /viz/data?distance_threshold=0.35 returns UMAP coordinates, cluster labels, and edges as autorag.viz.VizData.
- GET /viz/search?q=<query>&top_k=10 returns semantic search hits as a list of autorag.viz.SearchResult.
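The two JSON endpoints can be called from any HTTP client. The sketch below uses only the Python standard library and assumes the server is running at the address from "Running it"; the helper names viz_data_url, viz_search_url, and fetch_json are hypothetical and not part of autorag.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Address used in the "Running it" section; adjust to match your --host/--port.
BASE_URL = "http://127.0.0.1:8000"


def viz_data_url(distance_threshold=0.35, base_url=BASE_URL):
    """Build the /viz/data URL for a given clustering cut."""
    query = urlencode({"distance_threshold": distance_threshold})
    return f"{base_url}/viz/data?{query}"


def viz_search_url(query_text, top_k=10, base_url=BASE_URL):
    """Build the /viz/search URL; urlencode escapes spaces and symbols."""
    query = urlencode({"q": query_text, "top_k": top_k})
    return f"{base_url}/viz/search?{query}"


def fetch_json(url, timeout=10.0):
    """GET a URL and decode the JSON body (requires a running server)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

With the server up, fetch_json(viz_search_url("gradient descent", top_k=5)) would issue the search request, and fetch_json(viz_data_url(0.5)) would request the plot data with a looser clustering cut.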
Example /viz/search query:
GET /viz/search?q=gradient+descent&top_k=5
[
  {
    "point_index": 12,
    "topic_title": "Backpropagation deep-dive",
    "clip_title": "ML Lecture 3",
    "clip_id": "...",
    "similarity": 0.91,
    "summary": "..."
  }
]
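A response shaped like the example above can be loaded into typed records on the client side. This is a sketch only: the SearchHit dataclass mirrors the fields shown in the example and is not the actual definition of autorag.viz.SearchResult, and the helper names are hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class SearchHit:
    """Client-side mirror of one search hit, based on the example response."""
    point_index: int
    topic_title: str
    clip_title: str
    clip_id: str
    similarity: float
    summary: str


def parse_hits(payload):
    """Decode a JSON array of hits, ignoring any keys not listed above."""
    known = {f.name for f in fields(SearchHit)}
    return [
        SearchHit(**{key: value for key, value in item.items() if key in known})
        for item in json.loads(payload)
    ]


def hits_above(hits, min_similarity):
    """Keep hits at or above a similarity floor, best match first."""
    kept = [hit for hit in hits if hit.similarity >= min_similarity]
    return sorted(kept, key=lambda hit: hit.similarity, reverse=True)
```

Filtering unknown keys keeps the parser tolerant if the server adds fields later; a hit that lacks one of the listed fields would still raise a TypeError.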