Replace words using Soundex, python

I have a list of sentences, and my goal is to replace every abbreviated preposition such as "opp, nr, twrds, abv, behnd" with its correct spelling ("opposite, near, towards, above, behind", and so on). The Soundex code of an abbreviation is the same as that of the full word, so I need to iterate over this list word by word and, whenever the Soundex codes match, replace the word with the correct spelling.

Example - ["Jack was standing nr the tree",
   "they were abv everything he planned",
   "Just stand opp the counter",
   "Go twrds the gas station"]

So I need to replace the words nr, abv, opp and twrds with their correct full forms. The Soundex codes of towards and twrds are the same, so twrds needs to be replaced.
I need to iterate over this list.
Here is the Soundex algorithm:

import string

# Translation table: each letter maps to its Soundex digit;
# "9" marks letters (vowels, h, w, y) that are dropped later.
allChar = string.ascii_uppercase + string.ascii_lowercase
charToSoundex = str.maketrans(allChar, "91239129922455912623919292" * 2)

def soundex(source):
    "convert string to Soundex equivalent"

    # Soundex requirements:
    # source string must be at least 1 character
    # and must consist entirely of letters
    if (not source) or (not source.isalpha()):
        return "0000"

    # Soundex algorithm:
    # 1. make first character uppercase
    # 2. translate all other characters to Soundex digits
    digits = source[0].upper() + source[1:].translate(charToSoundex)

    # 3. remove consecutive duplicates
    digits2 = digits[0]
    for d in digits[1:]:
        if digits2[-1] != d:
            digits2 += d

    # 4. remove all "9"s
    # 5. pad end with "0"s to 4 characters
    return (digits2.replace('9', '') + '000')[:4]

if __name__ == '__main__':
    import sys
    if sys.argv[1:]:
        print(soundex(sys.argv[1]))
    else:
        # no argument given: time the function on a few sample names
        from timeit import Timer
        names = ('Woo', 'Pilgrim', 'Flingjingwaller')
        for name in names:
            statement = "soundex('%s')" % name
            t = Timer(statement, "from __main__ import soundex")
            print(name.ljust(15), soundex(name), min(t.repeat()))

I am a newbie, so if there is another approach you could suggest, that would be helpful. Thanks.

1 answer

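I would use the enchant module: keep every word the dictionary accepts, and for each misspelled word take the suggestion whose Soundex code matches.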

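import enchant
# soundex() is the function defined in the question

d = enchant.Dict("en_US")

phrase = ['Jack was standing nr the tree',
          'they were abv everything he planned',
          'Just stand opp the counter',
          'Go twrds the gas station']

output = []
for section in phrase:
    words = []
    for word in section.split():
        if d.check(word):
            # correctly spelled: keep the word as it is
            words.append(word)
        else:
            # take the first suggestion whose Soundex code matches
            for correct_word in d.suggest(word):
                if soundex(correct_word) == soundex(word):
                    words.append(correct_word)
                    break
            else:
                # no suggestion matched: keep the original word
                words.append(word)
    output.append(' '.join(words))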
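
If installing a spell checker is not an option, the same idea can be sketched with a fixed list of full forms indexed by Soundex code. This is only an illustration under stated assumptions: FULL_FORMS, OVERRIDES and expand() are hypothetical names, and soundex() below is a Python 3 port of the function from the question. Note that this particular function gives A100 for "abv" but A110 for "above", and O100 for "opp" but O123 for "opposite", so those two abbreviations are handled by an explicit mapping instead of by Soundex.

```python
import string

# Same digit table as the question's soundex(); "9" marks dropped letters.
_LETTERS = string.ascii_uppercase + string.ascii_lowercase
_TO_DIGITS = str.maketrans(_LETTERS, "91239129922455912623919292" * 2)


def soundex(source):
    """Soundex code as computed by the question's function (Python 3 port)."""
    if not source or not source.isalpha():
        return "0000"
    digits = source[0].upper() + source[1:].translate(_TO_DIGITS)
    collapsed = digits[0]
    for d in digits[1:]:
        if collapsed[-1] != d:
            collapsed += d
    return (collapsed.replace("9", "") + "000")[:4]


# Hypothetical data: full forms that may be substituted by Soundex match,
# plus explicit overrides for abbreviations whose code does not match.
FULL_FORMS = ["near", "towards", "behind"]
OVERRIDES = {"abv": "above", "opp": "opposite"}

# Index the full forms by their Soundex code for direct lookup.
BY_CODE = {soundex(w): w for w in FULL_FORMS}


def expand(sentence):
    """Replace known abbreviations in a sentence, leaving other words alone."""
    out = []
    for word in sentence.split():
        key = word.lower()
        if key in OVERRIDES:
            out.append(OVERRIDES[key])
        else:
            # Fall back to the word itself when no full form shares its code.
            out.append(BY_CODE.get(soundex(word), word))
    return " ".join(out)


phrase = ["Jack was standing nr the tree",
          "they were abv everything he planned",
          "Just stand opp the counter",
          "Go twrds the gas station"]
output = [expand(s) for s in phrase]
```

One caveat of this sketch: any real word that happens to share a code with a full form (for example "nor" and "near" both give N600) would be replaced as well, which is what the dictionary check in the enchant version guards against.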