Regular expression does not match

I am writing a small Python script to collect some data from a database. The only problem is that when I export the data as XML from MySQL, it includes a \b character in the XML file. I wrote code to remove it, but then realized I did not need to do this processing every time, so I moved it into a method that I call only when I find \b in the XML file. Only now the regular expression does not match, even though I know the \b is there.

This is what I'm doing:

The main program:

'''Program should start here'''
#test the file to see if processing is needed before parsing
for line in xml_file:
    p = re.compile("\b")
    if(p.match(line)):
        print p.match(line)
        processing = True
        break #only one match needed

if(processing):
    print "preprocess"
    preprocess(xml_file)

The preprocessing method:

def preprocess(file):
    #exporting from MySQL query browser adds a weird
    #character to the result set, remove it
    #so the XML parser can read the data
    print "in preprocess"
    lines = []
    for line in xml_file:
        lines.append(re.sub("\b", "", line))

    #go to the beginning of the file
    xml_file.seek(0);
    #overwrite with correct data
    for line in lines:
        xml_file.write(line);
    xml_file.truncate()

Any help would be great, Thanks

+3
3 answers

\b is a special sequence for the regex engine:

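Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore character. Note that formally, \b is defined as the boundary between a \w and a \W character (or vice versa), or between \w and the beginning/end of the string, so the precise set of characters deemed to be alphanumeric depends on the values of the UNICODE and LOCALE flags. Inside a character range, \b represents the backspace character, for compatibility with Python's string literals.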

So you need to escape it.
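
A minimal sketch of the two behaviours at play, written so it also runs under Python 3 (the sample string `line` is made up for illustration). It shows that "\b" in an ordinary string literal is the backspace character, that r"\b" is the word-boundary sequence, and that `re.match` only tries the start of the string while `re.search` scans all of it. The latter may be why the check in the question fails whenever the backspace is not the first character of a line.

```python
import re

# Hypothetical sample line: a backspace (0x08) sits in the middle of the text.
line = "<name>foo\x08bar</name>\n"

# In an ordinary string literal, "\b" is the single backspace character.
assert "\b" == "\x08" and len("\b") == 1

# In a raw string, r"\b" is two characters that the regex engine
# reads as a zero-width word boundary.
assert len(r"\b") == 2

backspace = re.compile("\b")     # pattern is a literal backspace
boundary = re.compile(r"\b")     # pattern is a word boundary

match_at_start = backspace.match(line)    # only tries position 0
found_anywhere = backspace.search(line)   # scans the whole line
first_boundary = boundary.search(line)    # matches an empty string

print(match_at_start)                 # None
print(found_anywhere.start())         # 9
print(repr(first_boundary.group()))   # ''
```

Under these assumptions, swapping `p.match(line)` for `p.search(line)` in the test loop would be the smallest change to try.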

+7

The backslash has to be escaped. In Python (in a normal, non-raw string literal), that takes 3 backslashes:

p = re.compile("\\\b")

That way the regex engine receives an escaped \b and matches it literally.
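
To see what that pattern actually contains, here is a small check (a sketch; the variable names and the sample input are illustrative, not from the original post). The string literal "\\\b" is two characters, a backslash followed by a backspace, and the regex engine treats an escaped backspace as a literal backspace. The other spellings in the list should behave the same way.

```python
import re

# After Python's own escaping: one backslash followed by one backspace.
pattern_text = "\\\b"
assert pattern_text == "\\" + "\x08"

sample = "abc\x08def"   # made-up input containing one backspace

# Four ways to write a pattern that matches a literal backspace.
spellings = ["\\\b", "\b", r"\x08", r"[\b]"]
cleaned = [re.sub(p, "", sample) for p in spellings]

print(cleaned)
```

Note that r"\b" on its own is not in the list: outside a character class it means a word boundary, not a backspace.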

+1

Correct me if I am wrong, but there is no need for a regex to remove '\b'; you can just use the string's replace method for this purpose:

def preprocess(xml_file):
    #exporting from MySQL query browser adds a weird
    #character to the result set, remove it
    #so the XML parser can read the data
    print "in preprocess"
    #read every line up front, dropping the backspace characters
    lines = [line.replace("\b", "") for line in xml_file]
    #go to the beginning of the file
    xml_file.seek(0)
    #overwrite with correct data
    for line in lines:
        xml_file.write(line)
    # OR: xml_file.writelines(lines)
    xml_file.truncate()

Note that in Python there is no need for ';' at the end of a line.
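
As a self-contained variation on the same idea, the sketch below opens the file itself, skips the rewrite when no backspace is present, and is exercised on a throwaway temp file. The helper name `strip_backspaces` and the sample XML are hypothetical, and the code assumes the file is small enough to read into memory at once.

```python
import os
import tempfile

def strip_backspaces(path):
    """Remove every backspace (0x08) from the file at `path`, in place.

    Returns True if the file was changed. Hypothetical helper for illustration.
    """
    with open(path, "r+") as xml_file:
        original = xml_file.read()
        cleaned = original.replace("\b", "")
        if cleaned == original:
            return False          # nothing to do, leave the file untouched
        xml_file.seek(0)          # go back to the beginning
        xml_file.write(cleaned)   # overwrite with the cleaned data
        xml_file.truncate()       # drop leftover bytes past the new end
        return True

# Usage on a temporary file with made-up content.
fd, demo_path = tempfile.mkstemp(suffix=".xml")
with os.fdopen(fd, "w") as handle:
    handle.write("<row>\n<name>foo\bbar</name>\n</row>\n")

changed = strip_backspaces(demo_path)
with open(demo_path) as handle:
    result = handle.read()
os.remove(demo_path)

print(changed)
```

Because the check and the cleanup happen in one pass, there is no separate scanning loop that could leave the file position partway through before the rewrite starts.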

0