Too much indentation when checking a string against many regular expressions sequentially in Python

I end up with deeply nested indentation when I write code like this:

match = re.search(some_regex_1, s)
if match:
    # do something with match data
else:
    match = re.search(some_regex_2, s)
    if match:
        # do something with match data
    else:
        match = re.search(some_regex_3, s)
        if match:
            # do something with match data
        else:
            # ...
            # and so on

I tried to rewrite as:

if match = re.search(some_regex_1, s):
    # ...
elif match = re.search(some_regex_2, s):
    # ...
elif ....
    # ...
...

but Python does not allow an assignment inside a condition like this. How can I avoid the deep nesting in this case?

+3
4 answers
regexes = (regex1, regex2, regex3)
for regex in regexes:
    match = re.search(regex, s)
    if match:
        #do stuff
        break

Alternative (more advanced):

def process1(match_obj):
    pass  # handle match 1

def process2(match_obj):
    pass  # handle match 2

def process3(match_obj):
    pass  # handle match 3

# ... one handler per regex ...

handler_map = ((regex1, process1), (regex2, process2), (regex3, process3))
for regex, handler in handler_map:
    match = re.search(regex, s)
    if match:
        result = handler(match)
        break
else:
    # runs only if the loop finished without break, i.e. no regex matched
    pass
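
To make the handler-map idea concrete, here is a small runnable sketch. The patterns, the handler names and the dispatch() helper are all made up for illustration and are not part of the answer above; wrapping the loop in a function also lets a plain return stand in for the break/else pair.

```python
import re

# Hypothetical handlers, one per pattern (names chosen for illustration only).
def handle_date(match):
    return ("date", match.group(1), match.group(2), match.group(3))

def handle_number(match):
    return ("number", int(match.group()))

def handle_word(match):
    return ("word", match.group().lower())

# Order matters: the first pattern that matches wins.
HANDLERS = (
    (re.compile(r"(\d{4})-(\d{2})-(\d{2})"), handle_date),
    (re.compile(r"\d+"), handle_number),
    (re.compile(r"[A-Za-z]+"), handle_word),
)

def dispatch(s):
    """Return the result of the first handler whose pattern matches, else None."""
    for pattern, handler in HANDLERS:
        match = pattern.search(s)
        if match:
            return handler(match)
    return None  # no pattern matched
```

Adding another case then means adding one more (pattern, handler) pair to the tuple, and the indentation never gets deeper.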
+6

If you can use finditer() instead of search() (most of the time you can), you can combine all your regular expressions into one and use named groups to tell which alternative matched. Here is an example:

import re

regex = r"""
   (?P<number> \d+ ) |
   (?P<word> \w+ ) |
   (?P<punctuation> \. | \! | \? | \, | \; | \: ) |
   (?P<whitespace> \s+ ) |
   (?P<eof> $ ) |
   (?P<error> \S )
"""

scan = re.compile(pattern=regex, flags=re.VERBOSE).finditer

for match in scan('Hi, my name is Joe. I am 1 programmer.'):
    # lastgroup is the name of the named group that matched
    token_type = match.lastgroup
    if token_type == 'number':
        print('found number "%s"' % match.group())
    elif token_type == 'word':
        print('found word "%s"' % match.group())
    elif token_type == 'punctuation':
        print('found punctuation character "%s"' % match.group())
    elif token_type == 'whitespace':
        print('found whitespace')
    elif token_type == 'eof':
        print('done parsing')
        break
    else:
        raise ValueError('String kaputt!')
+2
if re.search(some_regex_1, s) is not None:
    # ...
elif re.search(some_regex_2, s) is not None:
    # ...
elif ....
    # ...
...

search() returns None if there is no match.
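
Note that the if/elif chain above only tests each pattern; the match object itself is thrown away. If you need the match data and can rely on Python 3.8 or newer, an assignment expression (the := operator) keeps the flat if/elif layout and still binds the match. A minimal sketch, with made-up patterns and a hypothetical classify() function:

```python
import re

def classify(s):
    # Requires Python 3.8+ because of the := operator.
    # The patterns are placeholders standing in for some_regex_1, some_regex_2, ...
    if (m := re.search(r"\d+", s)) is not None:
        return "number: " + m.group()
    elif (m := re.search(r"[A-Za-z]+", s)) is not None:
        return "word: " + m.group()
    else:
        return "no match"
```

The parentheses around each assignment are required here, otherwise the comparison would be assigned to m instead of the match.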

0

I found a relevant answer in another thread that uses a class to store the match, emulating the assignment-in-condition idiom from C:

fooobar.com/questions/126062 / ...
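
The linked answer is not reproduced here, so the following is only a guess at what such a helper might look like. The Matcher class and the describe() function are hypothetical names, and the patterns are placeholders:

```python
import re

class Matcher(object):
    """Remembers the most recent match so it can be tested and used in one step."""

    def __init__(self, text):
        self.text = text
        self.match = None

    def search(self, pattern):
        # Store the match object (or None) and report whether it matched.
        self.match = re.search(pattern, self.text)
        return self.match is not None

def describe(s):
    m = Matcher(s)
    if m.search(r"\d+"):
        return "number: " + m.match.group()
    elif m.search(r"[A-Za-z]+"):
        return "word: " + m.match.group()
    else:
        return "no match"
```

This keeps the flat if/elif shape from the question and works on older Python versions too, at the cost of one small helper class.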

0