Extract a word plus 20 surrounding words from a passage (Python)

Yep, still playing with Python.

I decided to try Gensim, a tool for finding topics for a chosen word and its context.

So I wondered how to find a word in a section of text and extract 20 words around it (10 words before that specific word and 10 words after it), then save the result together with other such extracts so that Gensim can be run on them.

The hard part for me is extracting the 10 words before and after once the chosen word has been found. I have played with nltk before, and just tokenizing the text into words or sentences made it easy to get hold of the sentences. Getting the words or sentences before and after a particular match is what I can't figure out.

For those who are confused (I may well be confusing), here is an example:

As soon as it had finished, all her blood rushed to her heart, for she was so angry to hear that Snow-White was yet living. "But now," thought she to herself, "will I make something which shall destroy her completely." Thus saying, she made a poisoned comb by arts which she understood, and then, disguising herself, she took the form of an old widow. She went over the seven hills to the house of the seven Dwarfs, and[15] knocking at the door, called out, "Good wares to sell to-day!"

If the word is "Snow-White", I would like to extract this part:

her heart, for she was so angry to hear that Snow-White was yet living. "But now," thought she to herself, "will

That is, the 10 words before and the 10 words after Snow-White.


strs="""
As soon as it had finished, all her blood rushed to her heart, for she was so angry to hear that Snow-White was yet living. "But now," thought she to herself, "will I make something which shall destroy her completely." Thus saying, she made a poisoned comb by arts which she understood, and then, disguising herself, she took the form of an old widow. She went over the seven hills to the house of the seven Dwarfs, and[15] knocking at the door, called out, "Good wares to sell to-day!"
"""
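spl = strs.split()

def ans(word, width=10):
    # Find the first token that equals `word` once punctuation and quotes are stripped
    for ind, x in enumerate(spl):
        if x.strip(",.!?;:'\"") == word:
            break
    else:
        print("%r not found" % word)
        return
    # max() keeps the slice start from going negative when the match is near the beginning
    print(" ".join(spl[max(0, ind - width):ind + width + 1]))

ans('Snow-White')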

her heart, for she was so angry to hear that Snow-White was yet living. "But now," thought she to herself, "will
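
Since the goal is to save several such extracts, here is a minimal sketch of a variant that collects every occurrence instead of stopping at the first one. The function name `contexts`, the `width` parameter, and the sample sentence are made up for illustration; like the code above, it only matches single whitespace-separated tokens.

```python
def contexts(text, word, width=10):
    """Return one context string per occurrence of `word` in `text`."""
    tokens = text.split()
    found = []
    for i, tok in enumerate(tokens):
        # Strip surrounding punctuation and quotes before comparing.
        if tok.strip('.,;:!?"\'()[]') == word:
            start = max(0, i - width)  # avoid negative slice indices
            found.append(" ".join(tokens[start:i + width + 1]))
    return found

# Hypothetical sample text, not from the original passage.
sample = ('She heard that Snow-White was yet living. '
          '"But now," thought she, "Snow-White must die."')
for line in contexts(sample, "Snow-White", width=3):
    print(line)
```

The returned list can then be written to a file or passed on to whatever processing comes next.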

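This is known as keyword in context (KWIC).

The first step is to split the input into words. The re module offers several ways to do that, for example re.split or re.findall.

A deque with maxlen is also useful here, as a sliding window over the words.

Here is one way to do it with itertools: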

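from re import finditer
from itertools import tee, islice, chain, repeat

def kwic(text, tgtword, width=10):
    'Find all occurrences of tgtword and show the surrounding context'
    # Character spans (start, end) of every word-like token in the text
    matches = (mo.span() for mo in finditer(r"[A-Za-z\'\-]+", text))
    # Pad both ends so words near the edges still get a window;
    # the end padding uses len(text) so the last character is not cut off
    end = len(text)
    padded = chain(repeat((0, 0), width), matches, repeat((end, end), width))
    # Three views of the same stream: window start, centre word, window end
    t1, t2, t3 = tee(padded, 3)
    t2 = islice(t2, width, None)
    t3 = islice(t3, 2*width, None)
    for (start, _), (i, j), (_, stop) in zip(t1, t2, t3):  # zip replaces Python 2's izip
        if text[i:j] == tgtword:
            yield text[start:stop]

# `text` is the passage to search, e.g. the `strs` string from the previous answer
print(list(kwic(text, 'Snow-White')))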
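
The deque-with-maxlen idea mentioned above can be sketched as follows. This is only an illustration under a few assumptions: the name `kwic_deque` is made up, the tokenizer reuses the same regular expression as `kwic`, and the result contains the words only, so punctuation between them is dropped.

```python
import re
from collections import deque

def kwic_deque(text, target, width=10):
    """Yield the words around each occurrence of `target` (words only)."""
    words = re.findall(r"[A-Za-z'\-]+", text)
    pad = [None] * width  # padding so edge words can sit in the centre slot
    window = deque(maxlen=2 * width + 1)
    for w in pad + words + pad:
        window.append(w)  # the oldest item falls off once the deque is full
        if len(window) == window.maxlen and window[width] == target:
            yield " ".join(x for x in window if x is not None)

# Hypothetical sample text for illustration.
sample = "so angry to hear that Snow-White was yet living"
print(list(kwic_deque(sample, "Snow-White", width=2)))
```

Compared with the itertools version, this keeps only the current window of words in the deque and returns joined words rather than a slice of the original text.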