Matching the whole word (Visual Studio style)

I am trying to add a Match Whole Word option to my small application, and I want it to behave the way Visual Studio does. So, for example, the code below should run without throwing:

    public partial class MainWindow : Window
    {
        public MainWindow()
        {
            InitializeComponent();

            String input = "[  abc() *abc  ]";

            Match(input, "abc", 2);
            Match(input, "abc()", 1);
            Match(input, "*abc", 1);
            Match(input, "*abc ", 1);            
        }

        private void Match(String input, String pattern, int expected)
        {
            String escapedPattern = Regex.Escape(pattern);
            MatchCollection mc = Regex.Matches(input, @"\b" + escapedPattern + @"\b", RegexOptions.IgnoreCase);
            if (mc.Count != expected)
            {
                throw new Exception("match whole word isn't working");
            }
        }
    }   

A search for "abc" works fine, but the other patterns return 0 results. I think \b is inadequate, but I'm not sure what to use instead.

Any help would be greatly appreciated. Thanks

3 answers

The \b metacharacter matches at a word boundary between an alphanumeric and a non-alphanumeric character. Search strings that start or end with a non-alphanumeric character fail to match precisely because \b is working as designed.

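To perform a proper whole-word match that handles both kinds of search term, you need to:

  • use \b before or after an alphanumeric character
  • use \B (capital B) before or after a non-alphanumeric character
  • leave out \B when the first or last character of the pattern is deliberately non-alphanumeric, such as the trailing space in your last example

Based on these points, you need extra logic that inspects the incoming search term and shapes it into the appropriate pattern. \B is the opposite of \b. If you leave \B out, you can end up with partial matches. For example, foo*abc would be matched incorrectly by the pattern @"\*abc\b".

To demonstrate: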

string input = "[  abc() *abc foo*abc ]";
string[] patterns =
{
    @"\babc\b",     // 3
    @"\babc\(\)\B", // 1
    @"\B\*abc\b",   // 1, \B prefix ensures whole word match, "foo*abc" not matched
    @"\*abc\b",     // 2, no \B prefix so it matches "foo*abc"
    @"\B\*abc "     // 1
};

foreach (var pattern in patterns)
{
    Console.WriteLine("Pattern: " + pattern);
    var matches = Regex.Matches(input, pattern);
    Console.WriteLine("Matches found: " + matches.Count);
    foreach (Match match in matches)
    {
        Console.WriteLine("  " + match.Value);
    }
    Console.WriteLine();
}
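
One way to build such patterns from a search term automatically, rather than writing them by hand as above, could look like the sketch below. The names WholeWordPattern, Build, Count, Anchor and IsWordChar are made up for illustration. Treating whitespace at either end as "add no anchor" is my assumption, modelled on the last pattern in the list above.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeWordPattern
{
    // Ask the regex engine itself whether a character counts as a word character.
    static bool IsWordChar(char c) => Regex.IsMatch(c.ToString(), @"\w");

    // Picks the anchor for one end of the search term:
    // \b next to a word character, nothing next to whitespace (assumption), \B otherwise.
    static string Anchor(char edge)
    {
        if (IsWordChar(edge)) return @"\b";
        if (char.IsWhiteSpace(edge)) return "";
        return @"\B";
    }

    // Escapes the term and wraps it in the anchors chosen for its first and last character.
    public static string Build(string term)
    {
        if (string.IsNullOrEmpty(term))
            throw new ArgumentException("term must not be empty", nameof(term));
        return Anchor(term[0]) + Regex.Escape(term) + Anchor(term[term.Length - 1]);
    }

    // Counts case-insensitive whole-word matches of term in input.
    public static int Count(string input, string term) =>
        Regex.Matches(input, Build(term), RegexOptions.IgnoreCase).Count;
}
```

With the input string from the demonstration, WholeWordPattern.Count(input, "*abc") returns 1, because Build produces \B\*abc\b and the \B prefix rejects the occurrence inside foo*abc.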

I think this is what you are looking for:

@"(?<!\w)" + escapedPattern + @"(?!\w)"

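\b is defined in terms of "word" characters on both sides of the current position. Here you only care about what comes before the first character of the pattern and what comes after the last one, and that is exactly what the two lookarounds check.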
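
To check that expression against the input from the question, here is a minimal sketch. The class name LookaroundWholeWord and the method Count are hypothetical. Only Regex.Escape, Regex.Matches and RegexOptions.IgnoreCase are real .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaroundWholeWord
{
    // Counts occurrences of term that are not preceded or followed by a word character.
    public static int Count(string input, string term)
    {
        string pattern = @"(?<!\w)" + Regex.Escape(term) + @"(?!\w)";
        return Regex.Matches(input, pattern, RegexOptions.IgnoreCase).Count;
    }
}
```

For the string "[  abc() *abc  ]" this gives 2 for "abc" and 1 each for "abc()", "*abc" and "*abc ", which are the counts the question expects. The same two lookarounds are used whatever the term starts or ends with, so no inspection of the search term is needed.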


\b is a zero-width assertion that matches at the position between a word character and a non-word character.

Letters, digits, and underscores are word characters. *, space, and parentheses are non-word characters. Therefore, when you use \b\*abc\b as your pattern, it does not match your input: * is a non-word character and so is the space before it, so there is no word boundary at that position. The same goes for your pattern with the parentheses.

To solve this, you need to drop the \b in cases where your (unescaped) input pattern begins or ends with a non-word character.


    public void Run()
    {
        String input = "[  abc() *abc  ]";

        Match(input, @"\babc\b", 2);
        Match(input, @"\babc\(\)", 1);
        Match(input, @"\*abc\b", 1);
        Match(input, @"\*abc\b ", 1);
    }

    private void Match(String input, String pattern, int expected)
    {
        MatchCollection mc = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);
        Console.WriteLine((mc.Count == expected)? "PASS ({0}=={1})" : "FAIL ({0}!={1})",
                          mc.Count, expected);
    }
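
The patterns above are written out by hand. A minimal sketch of deriving them from the raw search term, adding \b only at an end where the term has a word character, might look like this. The class name BoundaryWhenNeeded and the method Build are hypothetical, and the sketch assumes a non-empty term.

```csharp
using System;
using System.Text.RegularExpressions;

static class BoundaryWhenNeeded
{
    // Escapes the term and adds \b only where the unescaped term
    // starts or ends with a word character.
    public static string Build(string term)
    {
        string prefix = Regex.IsMatch(term, @"\A\w") ? @"\b" : "";
        string suffix = Regex.IsMatch(term, @"\w\z") ? @"\b" : "";
        return prefix + Regex.Escape(term) + suffix;
    }
}
```

Note that with no anchor at a non-word end, a term such as "*abc" will also be found inside foo*abc. If that matters, that end needs a stricter check such as \B or a lookaround.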