Parsing wikiText with regex in Java

Given a wikiText snippet such as:

{{ValueDescription
    |key=highway
    |value=secondary
    |image=Image:Meyenburg-L134.jpg
    |description=A highway linking large towns.
    |onNode=no
    |onWay=yes
    |onArea=no
    |combination=
    * {{Tag|name}}
    * {{Tag|ref}}
    |implies=
    * {{Tag|motorcar||yes}}
    }}

I would like to parse the ValueDescription and Tag templates in Java / Groovy. I tried the regex /\{\{\s*Tag(.+)\}\}/ and it works well (it returns |name, |ref and |motorcar||yes), but /\{\{\s*ValueDescription(.+)\}\}/ does not work (it should return all the text above).

Expected output

Is there a way to skip nested patterns in regex?

Ideally, I would prefer to use a simple wikiText-to-XML tool, but I could not find one.

Thanks! Mulone

2 answers

Compile your regular expression with the Pattern.DOTALL flag, so that . also matches line terminators, as follows:

Pattern p = Pattern.compile("\\{\\{\\s*ValueDescription(.+)\\}\\}", Pattern.DOTALL);

Code example:

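import java.util.regex.Matcher;
import java.util.regex.Pattern;

// DOTALL makes '.' match line terminators too, so (.+) can span several lines
Pattern p = Pattern.compile("\\{\\{\\s*ValueDescription(.+)\\}\\}", Pattern.DOTALL);
Matcher m = p.matcher(str); // str holds the wikiText shown in the question
while (m.find())
    System.out.println("Matched: [" + m.group(1) + ']');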

OUTPUT

Matched: [
|key=highway
|value=secondary
|image=Image:Meyenburg-L134.jpg
|description=A highway linking large towns.
|onNode=no
|onWay=yes
|onArea=no
|combination=
* {{Tag|name}}
* {{Tag|ref}}
|implies=
* {{Tag|motorcar||yes}}
]
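To see why the flag matters, here is a minimal sketch that runs the same regex with and without Pattern.DOTALL. The class name DotallDemo and the shortened two-line sample are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.regex.Pattern;

class DotallDemo {
    // Hypothetical sample, shortened from the template in the question.
    static final String SAMPLE = "{{ValueDescription\n|key=highway\n}}";
    static final String REGEX = "\\{\\{\\s*ValueDescription(.+)\\}\\}";

    // Returns true if REGEX finds a match in SAMPLE under the given flags.
    static boolean matches(int flags) {
        return Pattern.compile(REGEX, flags).matcher(SAMPLE).find();
    }

    public static void main(String[] args) {
        // Without DOTALL, '.' refuses the newline right after "ValueDescription".
        System.out.println("default: " + matches(0));              // false
        // With DOTALL, (.+) runs across the line breaks up to the final "}}".
        System.out.println("DOTALL:  " + matches(Pattern.DOTALL)); // true
    }
}
```

This also explains why the Tag regex from the question worked unchanged: every {{Tag|...}} sits on a single line, so . never has to cross a line break.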

Update

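Assuming the closing }} of each {{ValueDescription block is at the start of a line of its own, the following pattern captures multiple ValueDescription blocks: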

Pattern p = Pattern.compile("\\{\\{\\s*ValueDescription(.+?)\n\\}\\}", Pattern.DOTALL);
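As a rough illustration, the reluctant pattern can be applied to text that contains more than one template. The class name MultiBlockDemo, the helper blocks and the two-template sample are hypothetical, and the sketch assumes each closing }} starts its line with no indentation, since the pattern requires a newline directly before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultiBlockDemo {
    // Reluctant (.+?) stops at the first "}}" that directly follows a newline.
    static final Pattern BLOCK =
        Pattern.compile("\\{\\{\\s*ValueDescription(.+?)\n\\}\\}", Pattern.DOTALL);

    // Hypothetical input: two templates, closing braces not indented.
    static final String SAMPLE =
        "{{ValueDescription\n|value=secondary\n|combination=\n* {{Tag|ref}}\n}}\n"
      + "{{ValueDescription\n|value=primary\n}}\n";

    // Returns the body of every ValueDescription block found in the text.
    static List<String> blocks(String wikiText) {
        List<String> result = new ArrayList<>();
        Matcher m = BLOCK.matcher(wikiText);
        while (m.find()) {
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String body : blocks(SAMPLE)) {
            System.out.println("Matched: [" + body + "]");
        }
    }
}
```

The inner {{Tag|ref}} does not end the match early because its }} follows a letter, not a newline. If the closing braces are indented, as in the question's sample, the pattern would need something like \n\s* before them.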

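Nested templates are more than a regular expression can handle reliably; use a proper parser instead. ANTLR is one option.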
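For the nesting problem raised in the question, a small hand-written scanner that counts brace depth may be enough before reaching for a parser generator. The sketch below is not ANTLR and not a full wikiText parser; the class TemplateScanner and its method are hypothetical, and it ignores other wikiText syntax such as triple-brace parameters or nowiki sections.

```java
import java.util.ArrayList;
import java.util.List;

class TemplateScanner {
    // Returns every top-level {{...}} template; nested templates stay inside their parent.
    static List<String> topLevelTemplates(String text) {
        List<String> result = new ArrayList<>();
        int depth = 0;   // how many "{{" are currently open
        int start = -1;  // index where the current top-level template began
        int i = 0;
        while (i < text.length() - 1) {
            if (text.startsWith("{{", i)) {
                if (depth == 0) {
                    start = i;
                }
                depth++;
                i += 2;
            } else if (depth > 0 && text.startsWith("}}", i)) {
                depth--;
                i += 2;
                if (depth == 0) {
                    result.add(text.substring(start, i));
                }
            } else {
                i++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "a {{Outer|x={{Tag|name}}|y=1}} b {{Tag|ref}}";
        for (String template : topLevelTemplates(text)) {
            System.out.println(template);
        }
    }
}
```

Calling the same method again on the inside of a returned template yields its nested templates, so the structure can be walked level by level.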
