How to get cluster center text from scikit-learn KMeans?

I have a list of strings that I use to fit sklearn.cluster.KMeans:

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
X = TfidfVectorizer().fit_transform(docs)
km = KMeans().fit(X)

Now I would like to get the cluster centers in their original string representation. I know about km.cluster_centers_, but could not figure out how to get the corresponding indices into docs.

1 answer

There is no "original representation" of the cluster centers in k-means: they are not actual points (vectorized documents) from the input data set, but means of several points. Such means cannot be converted back into documents, because the bag-of-words representation destroys the order of the terms.

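One possible approximation is to pass a centroid vector to TfidfVectorizer.inverse_transform, which returns the terms that have a non-zero tf-idf value in it.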
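A minimal sketch of that idea, assuming you keep a reference to the fitted vectorizer (the question's snippet discards it). The sample docs, n_clusters=2 and random_state=0 are made up for illustration and are not from the question.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents, for illustration only.
docs = [
    "the cat sat on the mat",
    "cats and kittens purr",
    "a cat chased a kitten",
    "stock markets fell sharply",
    "the stock market rallied",
    "investors sold stock",
]

# Keep the vectorizer around so that inverse_transform can be called later.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# One array of terms per centroid: the terms whose tf-idf weight is non-zero.
terms_per_center = vectorizer.inverse_transform(km.cluster_centers_)

for i, terms in enumerate(terms_per_center):
    print(i, sorted(terms))
```

Note that this gives an unordered set of terms for each centroid, not a document, and on a large corpus a centroid can have a great many non-zero terms.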

What you are describing is k-medoids, which is not implemented in scikit-learn.
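If all you need is an index into docs for each cluster, one workaround is to pick the input document that lies closest to each k-means centroid. To be clear, this is not k-medoids, only a rough stand-in; the sketch below assumes Euclidean distance (which is what KMeans.transform returns), and the sample docs and parameters are invented for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample documents, for illustration only.
docs = [
    "the cat sat on the mat",
    "cats and kittens purr",
    "a cat chased a kitten",
    "stock markets fell sharply",
    "the stock market rallied",
    "investors sold stock",
]

X = TfidfVectorizer().fit_transform(docs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# transform() gives the distance from every document to every centroid,
# as an array of shape (n_docs, n_clusters).
distances = km.transform(X)

# For each centroid (column), the row index of the nearest document.
closest_idx = distances.argmin(axis=0)

for cluster, idx in enumerate(closest_idx):
    print(cluster, idx, docs[idx])
```

The selected documents are real members of docs, so they can be shown as text, but they only approximate the centroids.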
