autorag.topic_cluster
Semantic clustering and similarity-edge construction for topic embeddings.
- autorag.topic_cluster.cluster_embeddings(embeddings, distance_threshold=0.35)
Assign cluster labels to topic embeddings using agglomerative clustering.
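distance_threshold is a cosine distance (1 - cosine similarity, range 0 to 2); the default of 0.35 corresponds roughly to a cosine similarity of >= 0.65. Returns an int array of shape (N,) with labels 0..K-1, where N is the number of embeddings and K is the number of clusters found.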
- Parameters:
embeddings (ndarray)
distance_threshold (float)
- Return type:
ndarray
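
To make the threshold semantics concrete, here is a minimal standard-library sketch of threshold-based agglomerative clustering over cosine distances. It is not the library's implementation. The names `cosine_distance` and `cluster_sketch` are hypothetical, and the documentation above does not state the linkage rule, so average linkage is an assumption, as is the choice to stop merging once the closest pair of clusters is at or beyond the threshold. The sketch also takes and returns plain lists rather than ndarrays.

```python
import math


def cosine_distance(u, v):
    """Cosine distance = 1 - cosine similarity; ranges from 0 to 2."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cluster_sketch(embeddings, distance_threshold=0.35):
    """Naive agglomerative clustering (hypothetical helper, illustration only).

    Assumes average linkage; the real function's linkage is not documented here.
    Returns a list of N labels in 0..K-1.
    """
    n = len(embeddings)
    # Pairwise cosine distances, computed once up front.
    dist = [[cosine_distance(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]
    # Start with every point in its own cluster.
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average pairwise distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        # Assumption: stop once the closest pair is at or beyond the threshold.
        if best[0] >= distance_threshold:
            break
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    # Convert cluster membership into per-point labels 0..K-1.
    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Two tight pairs of 2-D vectors pointing in nearly orthogonal directions.
labels = cluster_sketch([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(labels)  # [0, 0, 1, 1]
```

With the default threshold of 0.35, each pair merges (cosine distance about 0.006), while the two resulting clusters stay separate because their average distance is about 0.89. Raising distance_threshold above that value would collapse everything into a single cluster.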