I'm thinking about adding stop-word removal to my similarity program and then stemming (Porter 1 or 2, depending on which is easiest to implement).
I read my text from files as whole lines and save them as one long string, so say I have two example strings:
String one = "I decided buy something from the shop.";
String two = "Nevertheless I decidedly bought something from a shop.";
Now that I have these strings:
Stemming: Can I just run the stemming algorithm directly on the string, save the result as a String, and then keep working with it the same way I did before adding stemming to the program, for example with something like one.stem();?
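To make the stemming question concrete, here is a minimal sketch of the shape such a step could take. `String` has no `stem()` method and strings are immutable, so this sketch assumes a separate helper that tokenises the text, stems each word, and returns a new string. The names `StemPipeline`, `toyStem`, and `stemAll` are made up for illustration, and `toyStem` is only a placeholder suffix stripper, not the Porter algorithm; a real Porter implementation would be plugged in at the same place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

class StemPipeline {
    // Placeholder stemmer, NOT Porter: strips a trailing "ly", then a trailing "ed",
    // but only from words longer than four characters.
    static String toyStem(String word) {
        String w = word;
        if (w.length() > 4 && w.endsWith("ly")) {
            w = w.substring(0, w.length() - 2);
        }
        if (w.length() > 4 && w.endsWith("ed")) {
            w = w.substring(0, w.length() - 2);
        }
        return w;
    }

    // Lower-cases the text, splits it into words, applies the given stemmer
    // to each word, and joins the stems back into one space-separated string.
    static String stemAll(String text, UnaryOperator<String> stemmer) {
        List<String> stems = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                stems.add(stemmer.apply(token));
            }
        }
        return String.join(" ", stems);
    }

    public static void main(String[] args) {
        String one = "I decided buy something from the shop.";
        // The result is a new String; the original is left unchanged.
        String stemmedOne = stemAll(one, StemPipeline::toyStem);
        System.out.println(stemmedOne);
    }
}
```

Because the result is an ordinary `String`, the rest of the program can keep working on it as before. Note that this sketch also drops punctuation and case, which may or may not suit the similarity measures used afterwards.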
Stop words: How does this work? Do I just use one.replaceAll("I", ""); or is there a specific way to do it? I want to keep working with the string and get the cleaned-up string back before running the similarity algorithms on it. Wikipedia doesn't say much.
I hope you can help me! Thank you.
Edit: This is for a school project where I am writing an article on the similarities between different algorithms, so I don't think I am allowed to use Lucene or other libraries that do this work for me. Plus, I would like to try to understand how this works before I start using libraries like Lucene and co. Hope this doesn't bother you too much. ^^
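For illustration, a minimal sketch of token-based stop-word removal. Calling `replaceAll("I", "")` removes every capital "I" character, including ones inside other words (for example "In" would become "n"), so this sketch instead splits the text into words and drops whole words found in a set. The class name `StopWordFilter`, the method `removeStopWords`, and the four-word stop list are assumptions made up for this example; a real stop list would be much longer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordFilter {
    // Hypothetical, deliberately tiny stop-word list (all lower case).
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("i", "a", "the", "from"));

    // Lower-cases the text, splits it into words, drops the stop words,
    // and joins the remaining words with single spaces.
    static String removeStopWords(String text) {
        List<String> kept = new ArrayList<>();
        // Split on any run of characters that are not letters or digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        String one = "I decided buy something from the shop.";
        String two = "Nevertheless I decidedly bought something from a shop.";
        System.out.println(removeStopWords(one));
        System.out.println(removeStopWords(two));
    }
}
```

The method returns a plain `String`, so it can be passed straight to a stemming step or to the similarity algorithms. Matching whole tokens against a set keeps words such as "In" or "Iowa" intact.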