Saving data in sklearn

I use scikit-learn to cluster text documents, with the CountVectorizer, TfidfTransformer, and MiniBatchKMeans classes. New text documents are added to the system all the time, which means I need to use these classes to transform the new text and predict its cluster. My question is: how should I store the data on disk? Should I simply pickle the vectorizer, transformer, and kmeans objects? Or should I just save the data? If so, how do I load it back into the vectorizer, transformer, and kmeans objects?

Any help would be greatly appreciated.

+5
2 answers

See also: Persist Tf-Idf data

+6

In general, sk-learn objects can be saved with pickle.
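
A minimal sketch of the pickle approach for the three objects named in the question. The toy corpus, the file name text_clusterer.pkl, and the parameters n_clusters=2, n_init=3 and random_state=0 are assumptions made up for illustration, not anything from the original setup.

```python
import os
import pickle
import tempfile

from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Toy corpus (hypothetical data, for illustration only).
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell today",
    "investors sold shares as markets dropped",
]

# Fit the three objects once on the documents already in the system.
vectorizer = CountVectorizer()
transformer = TfidfTransformer()
kmeans = MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0)

counts = vectorizer.fit_transform(docs)
tfidf = transformer.fit_transform(counts)
kmeans.fit(tfidf)

# Save the fitted objects together, so the vocabulary, the IDF weights
# and the cluster centres always stay in sync with each other.
model_path = os.path.join(tempfile.mkdtemp(), "text_clusterer.pkl")
with open(model_path, "wb") as f:
    pickle.dump((vectorizer, transformer, kmeans), f)

# Later (e.g. in another process): load them back and handle new documents.
with open(model_path, "rb") as f:
    loaded_vectorizer, loaded_transformer, loaded_kmeans = pickle.load(f)

new_docs = ["markets rallied", "my cat sleeps"]
new_counts = loaded_vectorizer.transform(new_docs)   # transform only, no refit
new_tfidf = loaded_transformer.transform(new_counts)
new_labels = loaded_kmeans.predict(new_tfidf)
print(new_labels)
```

Note that the loaded objects are only used with transform and predict, so the vocabulary and IDF weights stay exactly as they were fitted. Only unpickle files you trust, and load them with the same scikit-learn version that wrote them.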

+4
