Use this regex with re.findall:
(?:\w+(?:\(\w+\))?)|(?:\([\w\ \.]+\))
First alternative: (?:\w+(?:\(\w+\))?)
\w+ - a word character, one or more times
(?:\(\w+\))? - (optional) opening parenthesis, a word character one or more times, closing parenthesis
Second alternative: (?:\([\w\ \.]+\))
\([\w\ \.]+\) - opening parenthesis, (a word character, space or period) one or more times, closing parenthesis
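As a quick check of the breakdown above, here is a minimal sketch that runs the pattern through re.findall. The sample string is made up for illustration (it is not the input from the question), so treat it as an assumption about what the data looks like.

```python
import re

# The pattern described above: a word with an optional "(word)" suffix,
# or a parenthesised run of word characters, spaces and periods.
PATTERN = re.compile(r"(?:\w+(?:\(\w+\))?)|(?:\([\w\ \.]+\))")

# Hypothetical input, chosen to exercise both alternatives.
sample = "alpha beta(gamma) (delta epsilon.)"

tokens = PATTERN.findall(sample)
print(tokens)  # ['alpha', 'beta(gamma)', '(delta epsilon.)']
```

The first two tokens come from the first alternative, the last one from the second.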
?: at the beginning of each group simply makes the group non-capturing, so re.findall returns the whole matches you need rather than the contents of the groups.
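To see why the ?: matters, here is a small sketch comparing a non-capturing and a capturing version of the same sub-pattern. The input string is hypothetical.

```python
import re

sample = "beta(gamma)"  # hypothetical input

# Non-capturing group: findall returns the whole match.
whole = re.findall(r"\w+(?:\(\w+\))?", sample)
print(whole)  # ['beta(gamma)']

# Capturing group: findall returns only what the group matched.
group_only = re.findall(r"\w+(\(\w+\))?", sample)
print(group_only)  # ['(gamma)']
```

With a single capturing group in the pattern, re.findall returns that group's text instead of the full match, which is usually not what you want here.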
This is only really guaranteed to work on the example you provided, or something very similar. It will need some more thought if the input varies much more than that, but it's a start.