How does the regex engine parse regex with recursive subpatterns?

This regular expression matches palindromes: ^((.)(?1)\2|.?)$

I can't wrap my head around how it works. When does the recursion end, and when does the regex engine leave the recursive subpattern and move on to the "|.?" part?

Thanks.

Edit: sorry, I didn't explain \2 and (?1).

(?1) - refers to the first subpattern (that is, to itself)

\2 - a back-reference to the match of the second subpattern, which is (.)

The example above is written in PHP. It matches both "abba" (a palindrome with no middle character) and "abcba" (which has a middle character that is not mirrored).

3 answers

^((.)(?1)\2|.?)$

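^ and $ assert the beginning and the end of the string, respectively. The part in between is the interesting one: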

((.)(?1)\2|.?)
1------------1 // Capturing group 1
 2-2           // Capturing group 2

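In the first part, (.)(?1)\2, the engine matches any character, and later that same character again at the end (the back-reference \2, which refers to the character matched by (.)). In between, it recursively matches the whole of capturing group 1. Note the implicit requirement (since (.) matches one character at the start and \2 matches the same character at the end) that the string be at least 2 characters long. The purpose of the first part is to chop identical ends off the string, recursively.

The second part, .?, matches 1 or 0 characters. It only succeeds when the string has length 0 or 1 to begin with, or when what is left over after the recursive matching is 0 or 1 characters long. The purpose of the second part is to match the empty string or the single lone character that remains after the string has been chopped from both ends.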

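How the recursive matching works:

  • The whole string must be a palindrome to pass, as asserted by ^ and $. Matching cannot start at an arbitrary position.
  • If the string is <= 1 character long, it passes.
  • If the string is >= 2 characters long, whether it is accepted is decided by the first part alone. If it matches, it is chopped at both ends.
  • The leftover, if it matches, can only be chopped at both ends again, or it passes if its length is <= 1.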
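To make the "when does the recursion end" part concrete, here is a minimal sketch in Python that imitates how a backtracking engine could work through group 1. It is not how PCRE or Perl are implemented. The names `group1` and `match` are made up for this illustration, and it ignores the special treatment of newlines by `.` and `$`. Python's built-in `re` module has no `(?1)`, which is why the two alternatives are written out by hand.

```python
def group1(s, pos, depth=0):
    """Yield (end, base_depth) for each way group 1 can match from pos,
    in the order a backtracking engine would try them."""
    if pos < len(s):
        ch = s[pos]  # (.) captures one character at this recursion level
        for mid, base in group1(s, pos + 1, depth + 1):  # (?1) recurses
            # \2 must repeat the character captured at *this* level
            if mid < len(s) and s[mid] == ch:
                yield mid + 1, base
        # First alternative exhausted: try .? consuming one character
        yield pos + 1, depth
    # Last resort: .? consuming nothing
    yield pos, depth


def match(s):
    """Return the depth at which `.?` ended the recursion, or None if no match."""
    for end, base in group1(s, 0):
        if end == len(s):  # the $ anchor
            return base
    return None


if __name__ == "__main__":
    for text in ["12321", "abba", "abc", "a", ""]:
        print(repr(text), match(text))
```

For "12321" the function returns 2: two levels take the first alternative (capturing "1" and "2"), and the third level falls back to `.?`, which consumes the middle "3". After that, each pending `\2` is checked on the way back out.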

The regex is essentially equivalent to the following recursive function:

def palin(s):
    if len(s) >= 2:
        first = s[0]
        last = s[-1]
        # the outer characters must agree, and the inside must be a palindrome too
        return first == last and palin(s[1:-1])
    # empty and single-character strings are trivially palindromes
    return True

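I did not find a good debugging utility for PCRE. The most I could find, at least, is a way to dump the byte-code: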

$ pcretest -b
PCRE version 7.6 2008-01-28

  re> /^((.)(?1)\2|.?)$/x
------------------------------------------------------------------
  0  39 Bra
  3     ^
  4  26 CBra 1
  9   6 CBra 2
 14     Any
 15   6 Ket
 18   6 Once
 21   4 Recurse
 24   6 Ket
 27     \2
 30   5 Alt
 33     Any?
 35  31 Ket
 38     $
 39  39 Ket
 42     End
------------------------------------------------------------------

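Perl has better debugging tools than PCRE, for example echo 123454321 | perl -Mre=debug -ne '/^((.)(?1)\2|.?)$/x'. This prints not only the byte-code, which is similar to PCRE's, but also every step of the match, with the consumed and remaining parts of the input at each step: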

Compiling REx "^((.)(?1)\2|.?)$"
Final program:
   1: BOL (2)
   2: OPEN1 (4)
   4:   BRANCH (15)
   5:     OPEN2 (7)
   7:       REG_ANY (8)
   8:     CLOSE2 (10)
  10:     GOSUB1[-8] (13)
  13:     REF2 (19)
  15:   BRANCH (FAIL)
  16:     CURLY {0,1} (19)
  18:       REG_ANY (0)
  19: CLOSE1 (21)
  21: EOL (22)
  22: END (0)
floating ""$ at 0..2147483647 (checking floating) anchored(BOL) minlen 0 
Guessing start of match in sv for REx "^((.)(?1)\2|.?)$" against "12321"
Found floating substr ""$ at offset 5...
Guessed: match at offset 0
Matching REx "^((.)(?1)\2|.?)$" against "12321"
   0 <> <12321>              |  1:BOL(2)
   0 <> <12321>              |  2:OPEN1(4)
   0 <> <12321>              |  4:BRANCH(15)
   0 <> <12321>              |  5:  OPEN2(7)
   0 <> <12321>              |  7:  REG_ANY(8)
   1 <1> <2321>              |  8:  CLOSE2(10)
   1 <1> <2321>              | 10:  GOSUB1[-8](13)
   1 <1> <2321>              |  2:    OPEN1(4)
   1 <1> <2321>              |  4:    BRANCH(15)
   1 <1> <2321>              |  5:      OPEN2(7)
   1 <1> <2321>              |  7:      REG_ANY(8)
   2 <12> <321>              |  8:      CLOSE2(10)
   2 <12> <321>              | 10:      GOSUB1[-8](13)
   2 <12> <321>              |  2:        OPEN1(4)
   2 <12> <321>              |  4:        BRANCH(15)
   2 <12> <321>              |  5:          OPEN2(7)
   2 <12> <321>              |  7:          REG_ANY(8)
   3 <123> <21>              |  8:          CLOSE2(10)
   3 <123> <21>              | 10:          GOSUB1[-8](13)
   3 <123> <21>              |  2:            OPEN1(4)
   3 <123> <21>              |  4:            BRANCH(15)
   3 <123> <21>              |  5:              OPEN2(7)
   3 <123> <21>              |  7:              REG_ANY(8)
   4 <1232> <1>              |  8:              CLOSE2(10)
   4 <1232> <1>              | 10:              GOSUB1[-8](13)
   4 <1232> <1>              |  2:                OPEN1(4)
   4 <1232> <1>              |  4:                BRANCH(15)
   4 <1232> <1>              |  5:                  OPEN2(7)
   4 <1232> <1>              |  7:                  REG_ANY(8)
   5 <12321> <>              |  8:                  CLOSE2(10)
   5 <12321> <>              | 10:                  GOSUB1[-8](13)
   5 <12321> <>              |  2:                    OPEN1(4)
   5 <12321> <>              |  4:                    BRANCH(15)
   5 <12321> <>              |  5:                      OPEN2(7)
   5 <12321> <>              |  7:                      REG_ANY(8)
                                                        failed...
   5 <12321> <>              | 15:                    BRANCH(19)
   5 <12321> <>              | 16:                      CURLY {0,1}(19)
                                                        REG_ANY can match 0 times out of 1...
   5 <12321> <>              | 19:                        CLOSE1(21)
                                                          EVAL trying tail ... 9d86dd8
   5 <12321> <>              | 13:                          REF2(19)
                                                            failed...
                                                        failed...
                                                      BRANCH failed...
   4 <1232> <1>              | 15:                BRANCH(19)
   4 <1232> <1>              | 16:                  CURLY {0,1}(19)
                                                    REG_ANY can match 1 times out of 1...
   5 <12321> <>              | 19:                    CLOSE1(21)
                                                      EVAL trying tail ... 9d86d70
   5 <12321> <>              | 13:                      REF2(19)
                                                        failed...
   4 <1232> <1>              | 19:                    CLOSE1(21)
                                                      EVAL trying tail ... 9d86d70
   4 <1232> <1>              | 13:                      REF2(19)
                                                        failed...
                                                    failed...
                                                  BRANCH failed...
   3 <123> <21>              | 15:            BRANCH(19)
   3 <123> <21>              | 16:              CURLY {0,1}(19)
                                                REG_ANY can match 1 times out of 1...
   4 <1232> <1>              | 19:                CLOSE1(21)
                                                  EVAL trying tail ... 9d86d08
   4 <1232> <1>              | 13:                  REF2(19)
                                                    failed...
   3 <123> <21>              | 19:                CLOSE1(21)
                                                  EVAL trying tail ... 9d86d08
   3 <123> <21>              | 13:                  REF2(19)
                                                    failed...
                                                failed...
                                              BRANCH failed...
   2 <12> <321>              | 15:        BRANCH(19)
   2 <12> <321>              | 16:          CURLY {0,1}(19)
                                            REG_ANY can match 1 times out of 1...
   3 <123> <21>              | 19:            CLOSE1(21)
                                              EVAL trying tail ... 9d86ca0
   3 <123> <21>              | 13:              REF2(19)
   4 <1232> <1>              | 19:              CLOSE1(21)
                                                EVAL trying tail ... 0
   4 <1232> <1>              | 13:                REF2(19)
   5 <12321> <>              | 19:                CLOSE1(21)
   5 <12321> <>              | 21:                EOL(22)
   5 <12321> <>              | 22:                END(0)
Match successful!
Freeing REx: "^((.)(?1)\2|.?)$"

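As you can see, Perl first consumes all of the input, recursing until (.) fails. It then starts backtracking, trying the second branch of the alternation, .?, followed by the remainder of the first branch, \2. Whenever that fails, it backtracks one more level, until it finally succeeds.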
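The same order of events can be reproduced on a small input with a hand-written simulation. The sketch below is only an approximation of the behaviour visible in the trace, not Perl's actual algorithm. `trace_match` and its log format are invented for this example, and newline handling is ignored.

```python
def trace_match(s):
    """Simulate ^((.)(?1)\\2|.?)$ on s and return (matched, log of attempts)."""
    log = []

    def group1(pos, depth):
        pad = "  " * depth  # indent log lines by recursion depth
        if pos < len(s):
            ch = s[pos]
            log.append(f"{pad}pos {pos}: (.) takes {ch!r}, recurse")
            for mid in group1(pos + 1, depth + 1):
                if mid < len(s) and s[mid] == ch:
                    log.append(f"{pad}pos {mid}: \\2 matches {ch!r}")
                    yield mid + 1
                else:
                    log.append(f"{pad}pos {mid}: \\2 fails, backtrack")
            # recursion has no more options: fall back to the second branch
            log.append(f"{pad}pos {pos}: .? takes {ch!r}")
            yield pos + 1
        else:
            log.append(f"{pad}pos {pos}: (.) fails at end of input")
        log.append(f"{pad}pos {pos}: .? takes nothing")
        yield pos

    # the $ anchor: accept the first attempt that ends at the end of the input
    matched = any(end == len(s) for end in group1(0, 0))
    return matched, log


if __name__ == "__main__":
    ok, steps = trace_match("aba")
    print("\n".join(steps))
    print("matched:", ok)
```

Running it on "aba" shows the engine descending to the end of the input first, then unwinding one level at a time until `.?` takes the middle "b" and the outer `\2` matches the final "a".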
