An efficient way to find "hook" words in a word list?

A hook word is a word to which you can add one letter at the beginning or end to make a new word.

I have a rather large list of words (about 170 thousand), and I would like to pick 5 random hook words. The problem is that the method I'm using is very slow. See below:

Random rnd = new Random();
var hookBases = (from aw in allWords  //allWords is a List<string>
                from aw2 in allWords
                where aw2.Contains(aw) 
                      && aw2.Length == aw.Length + 1 
                      && aw[0] == 'c'
                select aw).OrderBy(t => rnd.Next()).Take(5);

When I try to access any element of hookBases, it spins for several minutes before I give up and kill it.

Can anyone see any obvious errors in the way I'm trying to do this? Any suggestions for a better approach?

+3
2 answers

First, store the words in a HashSet<string> rather than a List<string>, so that Contains is a hash lookup instead of a linear scan.

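Second, work the other way round: for each word, remove its first or last letter and check whether what remains is also in the word list.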

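// allWords is assumed to be a HashSet<string> here, so Contains is a hash lookup.
HashSet<string> result = new HashSet<string>();
foreach (string word in allWords) {
    if (word.Length < 2) continue;  // removing a letter would leave nothing
    // Drop the last letter; if the remainder is a word, it is a hook base.
    string candidate = word.Substring(0, word.Length - 1);
    if (allWords.Contains(candidate)) { result.Add(candidate); }
    // Drop the first letter and check again.
    candidate = word.Substring(1, word.Length - 1);
    if (allWords.Contains(candidate)) { result.Add(candidate); }
}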

The same idea with LINQ:

List<string> hookWords = allWords
    .Select(word => word.Substring(0, word.Length - 1))
    .Concat(allWords.Select(word => word.Substring(1, word.Length - 1)))
    .Distinct()
    .Where(candidate => allWords.Contains(candidate))
    .ToList();

Working demo: ideone
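
To connect this back to the question (5 random hook bases starting with 'c'), here is a minimal, self-contained sketch. The names HookFinder, FindHookBases and PickRandom are made up for illustration, and the random ordering simply reuses the OrderBy(rnd.Next()) approach from the question; treat it as one possible arrangement rather than a definitive implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, not part of the original answer.
public static class HookFinder
{
    // Returns every word that is still a word after one letter is added
    // to its beginning or end (a "hook base").
    public static HashSet<string> FindHookBases(IEnumerable<string> words)
    {
        // Build the set once so that each Contains call is a hash lookup.
        var wordSet = new HashSet<string>(words);
        var result = new HashSet<string>();

        foreach (string word in wordSet)
        {
            if (word.Length < 2) continue; // removing a letter would leave nothing

            string withoutLast = word.Substring(0, word.Length - 1);
            if (wordSet.Contains(withoutLast)) result.Add(withoutLast);

            string withoutFirst = word.Substring(1);
            if (wordSet.Contains(withoutFirst)) result.Add(withoutFirst);
        }
        return result;
    }

    // Picks up to `count` hook bases starting with `firstLetter`, in random order.
    public static List<string> PickRandom(
        IEnumerable<string> hookBases, char firstLetter, int count, Random rnd)
    {
        return hookBases
            .Where(w => w.Length > 0 && w[0] == firstLetter)
            .OrderBy(_ => rnd.Next())
            .Take(count)
            .ToList();
    }
}
```

Usage, assuming allWords is the word list from the question: HookFinder.PickRandom(HookFinder.FindHookBases(allWords), 'c', 5, new Random()). Each word costs two substring operations and two hash lookups, so the work grows roughly linearly with the size of the list instead of comparing every pair of words.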

+6

- . linq, .net ddbb . , . Microsoft .

-1
