Matching two or three words after different Arabic expression patterns in Java

Greetings all,

I am starting to use regex. What I want to do is extract 2 or 3 Arabic words after a specific pattern.

For example, if I have the Arabic string:

inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية "

I need to extract the names after

الدكتور

and

والدكتورة

Therefore, the output should be:

احمد زويل
سميرة موسي

What I have done so far:

// needs: import java.util.regex.Matcher; import java.util.regex.Pattern;
String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
// Lookbehind for the title, then everything up to the end of the line
Pattern pattern = Pattern.compile("(?<=الدكتور).*");
Matcher matcher = pattern.matcher(inputtext);
boolean found = false;
while (matcher.find()) {
    // Get the matching string
    String match = matcher.group();
    System.out.println("the match is: " + match);
    found = true;
}
if (!found) {
    System.out.println("I didn't find the text");
}

But it returns:

احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية

I don't know how to add a second pattern or how to stop the match after two words.

Could you help me with any ideas?

1 answer

To match only the next two words, try this:

(?<=الدكتور)\s[^\s]+\s[^\s]+

.* matches everything up to the end of the line, which is not what you want.

\s is a whitespace character.

[^\s] is a negated character class that matches any character except whitespace.

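So the pattern matches a whitespace character, one or more non-whitespace characters (the first word), another whitespace character, and again one or more non-whitespace characters (the second word).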
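A minimal sketch of how this two-word pattern could be used from Java, assuming the source file is saved and compiled as UTF-8. The class name TwoWordsAfterTitle and the method extract are made up for illustration. Because the match starts with the whitespace that follows the title, the sketch trims each result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
public class TwoWordsAfterTitle {
    // Lookbehind for the title, then whitespace + word + whitespace + word.
    // Backslashes are doubled because this is a Java string literal.
    private static final Pattern TWO_WORDS =
            Pattern.compile("(?<=الدكتور)\\s[^\\s]+\\s[^\\s]+");

    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = TWO_WORDS.matcher(text);
        while (matcher.find()) {
            // The match begins with the whitespace after the title, so trim it.
            names.add(matcher.group().trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
        System.out.println(extract(inputtext)); // prints [احمد زويل]
    }
}
```

Only the first name is found here: inside والدكتورة the text الدكتور is followed by ة rather than whitespace, so the pattern does not match at that position.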

To match after the second pattern as well, add an alternation with a second lookbehind:

(?<=الدكتور)\s[^\s]+\s[^\s]+|(?<=والدكتورة)\s[^\s]+\s[^\s]+
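As a rough sketch, the combined pattern can be dropped into the same kind of find() loop as in the question. This assumes a UTF-8 source file; the class name NamesAfterTitles and the method extract are hypothetical. Each match is trimmed because it begins with the whitespace after the title.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, shown only to illustrate the alternation.
public class NamesAfterTitles {
    // Two alternatives, one lookbehind per title, each followed by two words.
    private static final Pattern NAMES = Pattern.compile(
            "(?<=الدكتور)\\s[^\\s]+\\s[^\\s]+|(?<=والدكتورة)\\s[^\\s]+\\s[^\\s]+");

    static List<String> extract(String text) {
        List<String> names = new ArrayList<>();
        Matcher matcher = NAMES.matcher(text);
        while (matcher.find()) {
            // Drop the leading whitespace that is part of the match.
            names.add(matcher.group().trim());
        }
        return names;
    }

    public static void main(String[] args) {
        String inputtext = "تكريم الدكتور احمد زويل والدكتورة سميرة موسي عن ابحاثهم العلمية ";
        for (String name : extract(inputtext)) {
            System.out.println("the match is: " + name);
        }
    }
}
```

With the input from the question, this should give the two names احمد زويل and سميرة موسي, one per line.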