Regular expression - how to find words and phrases

I want to take a line like the following:

Guiness Harp "Holy Moses"

Then, in C# or VB, I want to get a set of matches:

Guiness
Harp
Holy Moses

Essentially, the line should be split on spaces, except that words enclosed in quotation marks, including the spaces between them, are treated as one phrase.

Thanks Kevin

+1
4 answers

If you do not have any (escaped or doubled) quotes inside your quoted strings, you can search for

 "[^"]*"|\S+

However, the quotation marks will be part of the match. The regular expression can be extended to handle quotes inside quoted strings, if necessary.
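As a rough illustration of the pattern above and of one way it might be extended, here is a minimal Python sketch. The names BASIC, ESCAPED and tokens are made up for this example, and the extension assumes that quotes inside a quoted string are escaped with a backslash; doubled quotes would need a different pattern.

```python
import re

# Pattern from the answer: a double-quoted run, or a run of non-whitespace.
BASIC = re.compile(r'"[^"]*"|\S+')

# Assumed extension: inside quotes, accept either a plain character
# or a backslash followed by any character (so \" does not end the string).
ESCAPED = re.compile(r'"(?:[^"\\]|\\.)*"|\S+')

def tokens(pattern, text):
    """Return every match of pattern in text; quotes stay in the match."""
    return pattern.findall(text)

print(tokens(BASIC, 'Guiness Harp "Holy Moses"'))
print(tokens(ESCAPED, r'Guiness "Holy \"Moly\" Moses"'))
```

The first call keeps "Holy Moses" together as a single token, with its quotes still attached, which is the behaviour described above.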

Another (and in this case preferable) possibility would be to use a CSV parser.

For example (Python):

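import csv

# newline='' lets the csv module handle line endings itself
with open('test.txt', newline='') as f:
    reader = csv.reader(f, delimiter=' ', quotechar='"')
    for row in reader:
        print(row)  # each row is a list of fields, quotes removed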
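
To try the CSV approach without creating a file, the same reader can be pointed at an in-memory string. This is only a sketch; split_line is a hypothetical helper name, not something from the answer.

```python
import csv
import io

def split_line(line):
    """Split one line on spaces, keeping double-quoted phrases together."""
    # csv.reader accepts any iterable of lines; StringIO yields the one line.
    reader = csv.reader(io.StringIO(line), delimiter=' ', quotechar='"')
    return next(reader)

print(split_line('Guiness Harp "Holy Moses"'))
# prints ['Guiness', 'Harp', 'Holy Moses']
```

Unlike the regular expression, the parser removes the surrounding quotes itself.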
+5

Here's a different approach:

string s0 = @"Guiness Harp ""Holy Moses""";
Regex r = new Regex(@"""(?<FIELD>[^""]*)""|(?<FIELD>\S+)");
foreach (Match m in r.Matches(s0))
{
  Console.WriteLine(m.Groups["FIELD"].Value);
}

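This relies on using the same group name (FIELD) in both alternatives, which .NET allows. Not every regex flavor supports reusing a group name this way.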
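
For comparison, Python's re module rejects a pattern that defines the same group name twice, so a rough equivalent has to use two separate groups and pick whichever one matched. The sketch below is an assumption about how one might port the idea, not part of the answer; PATTERN and fields are hypothetical names.

```python
import re

# Group 1 captures the text inside quotes; group 2 captures a bare word.
# Two numbered groups stand in for the single reused FIELD name.
PATTERN = re.compile(r'"([^"]*)"|(\S+)')

def fields(text):
    """Return each token with surrounding quotes already removed."""
    result = []
    for m in PATTERN.finditer(text):
        quoted, bare = m.group(1), m.group(2)
        # Exactly one of the two groups takes part in any given match.
        result.append(quoted if quoted is not None else bare)
    return result

print(fields('Guiness Harp "Holy Moses"'))
# prints ['Guiness', 'Harp', 'Holy Moses']
```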

+3


Another option is to match with the same kind of pattern and then trim the quotes from each match:

string text = "Guiness Harp \"Holy Moses\"";
string pattern = @"""[^""]*""|\S+";

MatchCollection matches = Regex.Matches( text, pattern );
foreach( Match match in matches )
{
    string value = match.Value.Trim( '"' );
    Console.Out.WriteLine( value );
}


0
