I have the following methods, which are called sequentially:
- private StringBuilder ReadPDF ();
- private StringBuilder CleanText (StringBuilder sb);
- private void ParseText ();
ParseText calls ReadPDF, which calls CleanText.
The PDF I am parsing contains 15 MB of text, and it takes 10 minutes to get all the data out of the file on an ordinary Core 2 Duo.
How can I parallelize these tasks?
Edit: just to clarify, reading the PDF takes very little time. The problem is parsing the extracted text, and more precisely the CleanText phase. The reason I need to parallelize is that cleaning one page is instant, but cleaning 2k+ pages takes a long time.
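Since each page cleans quickly on its own, the direction this suggests is cleaning pages independently and joining the results in page order. Below is a minimal sketch of that idea using PLINQ, not the actual code. It assumes the extracted text is already available as one string per page, and that cleaning a page does not depend on any other page. `PageCleaner`, `CleanPage`, and `CleanTextParallel` are hypothetical names, and the whitespace collapsing only stands in for the real cleaning logic.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

static class PageCleaner
{
    // Hypothetical stand-in for the per-page work done by CleanText:
    // collapses runs of whitespace into single spaces and trims the ends.
    static string CleanPage(string page)
    {
        return Regex.Replace(page, @"\s+", " ").Trim();
    }

    // Cleans every page in parallel and appends the results in the
    // original page order, one page per line.
    public static StringBuilder CleanTextParallel(string[] pages)
    {
        string[] cleaned = pages
            .AsParallel()
            .AsOrdered()          // keep output in the same order as the input pages
            .Select(CleanPage)
            .ToArray();

        // StringBuilder is not thread-safe, so it is only touched
        // here, after the parallel part has finished.
        var sb = new StringBuilder();
        foreach (string page in cleaned)
        {
            sb.Append(page).Append('\n');
        }
        return sb;
    }
}
```

The parallel part only produces strings, and the shared StringBuilder is filled afterwards on a single thread. Whether this helps depends on CleanText really being page-independent.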