How can I split a string while keeping parentheses?

I am splitting strings to generate keys for a dictionary, but I am having problems with parentheses.

I would like to take a string such as contemporary building(s) (2000 c.e. to present) and split it into three keys: contemporary, building(s), and (2000 c.e. to present).

So far I have used re.findall('\w+', key), but it drops the parentheses.
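
To make the problem concrete, here is a minimal sketch of what that call returns. The sample value of key is an assumption, reconstructed from the three keys listed above.

```python
import re

# Assumed sample input, reconstructed from the three desired keys.
key = "contemporary building(s) (2000 c.e. to present)"

# \w+ matches only runs of letters, digits and underscores, so every
# parenthesis, period and space acts as a separator and is discarded.
tokens = re.findall(r'\w+', key)
print(tokens)
# ['contemporary', 'building', 's', '2000', 'c', 'e', 'to', 'present']
```

The pattern yields eight fragments instead of the three keys wanted.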

Any help is greatly appreciated.

3 answers

You can also use re.findall('[(][^)]*[)]|\S+', key) if there are no nested parentheses.
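
As a quick check, the sketch below runs that pattern on an assumed sample input (the string is reconstructed from the keys in the question) and on a hypothetical nested case to show the limitation.

```python
import re

# Assumed sample input, reconstructed from the keys in the question.
key = "contemporary building(s) (2000 c.e. to present)"

# First alternative: "(" followed by anything except ")" up to the next ")".
# Second alternative: any other run of non-whitespace characters.
pattern = re.compile(r'[(][^)]*[)]|\S+')

keys = pattern.findall(key)
print(keys)
# ['contemporary', 'building(s)', '(2000 c.e. to present)']

# Hypothetical nested input: the first ")" closes the match too early.
nested = pattern.findall("(a (b) c)")
print(nested)
# ['(a (b)', 'c)']
```

The alternation order matters: the parenthesized alternative is tried first at each position, so a group that starts a token keeps its inner spaces, while building(s) is picked up whole by \S+.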


Why not extend your re.findall('\w+', key) so that the character class also accepts parentheses and periods?

import re
parts = re.findall(r'[\w().]+', key)
# The first two tokens are keys of their own; the rest form the parenthesized key.
[parts[0], parts[1], " ".join(parts[2:])]


Use the following regex with re.findall:

(?:\w+(?:\(\w+\))?)|(?:\([\w\ \.]+\))

The first alternative, (?:\w+(?:\(\w+\))?), matches a word optionally followed by a parenthesized word.

\w+ - word character one or more times
(?:\(\w+\))? - (optional) opening parenthesis, word character one or more times,
               closing parenthesis

The second alternative, (?:\([\w\ \.]+\)), matches a parenthesized group that may contain words, spaces, and periods.

\([\w\ \.]+\) - opening parenthesis, (either a word character,
                space or period one or more times), closing parenthesis

The ?: at the beginning of each group makes it non-capturing, so re.findall returns the full matches rather than the contents of the groups.

This is really only guaranteed to work on the example you provided, or something very similar. It will need further thought if the input varies much more, but it is a start.
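
A minimal sketch of this approach follows. The sample string is an assumption reconstructed from the question, the pattern is written with balanced parentheses and without redundant escapes, and the second input is a hypothetical variation that shows where it stops working.

```python
import re

# Assumed sample input, reconstructed from the keys in the question.
key = "contemporary building(s) (2000 c.e. to present)"

# Either a word with an optional "(word)" suffix,
# or a "(...)" group made of word characters, spaces and periods.
pattern = re.compile(r'(?:\w+(?:\(\w+\))?)|(?:\([\w .]+\))')

keys = pattern.findall(key)
print(keys)
# ['contemporary', 'building(s)', '(2000 c.e. to present)']

# Hypothetical variation: a hyphen is not in the character class, so the
# parenthesized group no longer matches as a whole.
varied = pattern.findall("old (1990-2000)")
print(varied)
# ['old', '1990', '2000']
```

Any other punctuation expected inside the parentheses would have to be added to the character class.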
