Saving data in sklearn

Question

I use scikit-learn to cluster text documents, with the CountVectorizer, TfidfTransformer, and MiniBatchKMeans classes. New text documents are added to the system all the time, so I need to use these classes to transform the new text and predict its cluster. My question is: how should I store the data on disk? Should I just pickle the vectorizer, transformer, and kmeans objects? Or should I just save the data? If so, how can I load it back into the vectorizer, transformer, and kmeans objects?

Any help would be greatly appreciated.

+5

python scikit-learn machine-learning data-mining

pnsilva Jun 21 '12 at 15:41

2 answers

, , sk-learn pickle .

, , , . , , ?

+4

Rob Neuhaus 21 . '12 19:21

ogrisel · Accepted Answer · 2012-06-22T14:31:16+0000

, .

, , , ( ) .

, , + , () , .

, (., , ), , ( "" ).

Related question: Persist Tf-Idf data
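For reference, the pickling option raised in the question could look roughly like the sketch below. It is only an illustration under stated assumptions: the sample documents, the cluster count, and the file name `text_clusterer.pkl` are made up, and wrapping the three objects in a `Pipeline` is one convenient choice, not something the question prescribes.

```python
import os
import pickle
import tempfile

from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import make_pipeline

# Made-up training documents, purely for illustration.
train_docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "my cat chased the dog",
    "stocks fell on the market today",
    "the market rallied as stocks rose",
    "investors sold stocks and bonds",
]

# Chain the three objects from the question so they can be saved as one unit.
pipeline = make_pipeline(
    CountVectorizer(),
    TfidfTransformer(),
    MiniBatchKMeans(n_clusters=2, n_init=3, random_state=0),
)
pipeline.fit(train_docs)

# Documents that arrive after training.
new_docs = ["a dog and a cat", "bonds and stocks on the market"]
original_labels = pipeline.predict(new_docs)

# Write the fitted pipeline to disk, then load it back.
# A temporary directory is used here; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "text_clusterer.pkl")
    with open(path, "wb") as f:
        pickle.dump(pipeline, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored pipeline transforms new text and predicts clusters
# without being refit.
restored_labels = restored.predict(new_docs)
print(list(restored_labels))
```

Two caveats apply: a pickle is tied to the scikit-learn version that produced it, so loading it under a different version may fail or behave differently, and pickle files should only be loaded from trusted sources. Words in new documents that were not in the training vocabulary are ignored by the restored vectorizer.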
