Regular expression goes into infinite loop

I am parsing names of the form:

Parus ater
H. sapiens
T. rex
Tyr. rex

These usually have two parts (binomial), but sometimes three or more:

Troglodytes troglodytes troglodytes 
E. rubecula sensu stricto

I wrote

[A-Z][a-z]*\.?\s+[a-z][a-z]+(\s*[a-z]+)*

which worked most of the time but occasionally went into what looked like an endless loop. It took a while to trace the hang to the regex match, and then I realized I had made a typo. I should have written

[A-Z][a-z]*\.?\s+[a-z][a-z]+(\s+[a-z]+)*

which works correctly.
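As a quick sanity check, here is a minimal sketch that runs the corrected pattern against the sample names above. The class name NameCheck and the helper isName are made up for illustration, and the sketch assumes whole-string validation with Matcher.matches().

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class NameCheck {
    // Corrected pattern from the question: \s+ (not \s*) inside the repeated group.
    static final Pattern NAME =
        Pattern.compile("[A-Z][a-z]*\\.?\\s+[a-z][a-z]+(\\s+[a-z]+)*");

    // True only when the whole string has the expected form.
    static boolean isName(String s) {
        return NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
            "H. sapiens", "T. rex", "Tyr. rex",
            "Troglodytes troglodytes troglodytes",
            "E. rubecula sensu stricto");
        for (String s : samples) {
            System.out.println(s + " -> " + isName(s));
        }
    }
}
```

Each sample is trimmed before matching here. A trailing space would make matches() return false, because nothing in the pattern can consume it.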

My questions:

  • Why does this loop happen?
  • Is there a way to check for similar regular expression errors before running the program? Otherwise, it can be difficult to catch them before the program is distributed and causes problems.

[. . line + - .

. - , , 2 3/4 ( ), (, "Homo sapiens lives in big cities like London"), "L".]

. , , (, ). , . !

+5
2

, . , , , .

, , : (\s*[a-z]+)* , , . Qtax, .

, , . Halting. , , , , , .

, ? . , , , :

  • , .
  • anchors, , . ^ . : Word
  • , . ( ), , .
  • , . , , , , , .
  • , : , . - , . , . , , .

, . .

+6

Consider the original regex:

[A-Z][a-z]*\.?\s+[a-z][a-z]+(\s*[a-z]+)*

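The culprit is the group (\s*[a-z]+)*. The blow-up appears when the whole input has to match, as with String.matches(), and the input contains an invalid character; a find loop (with Matcher) can return the match found so far instead of backtracking.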
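To make the difference between String.matches() and a Matcher search concrete, here is a small sketch. It uses the corrected pattern so it is safe to run; the class MatchesVsFind and its two helper methods are hypothetical names introduced for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class MatchesVsFind {
    static final Pattern NAME =
        Pattern.compile("[A-Z][a-z]*\\.?\\s+[a-z][a-z]+(\\s+[a-z]+)*");

    // Whole-input validation: every character must be consumed by the pattern.
    static boolean wholeMatch(String s) {
        return NAME.matcher(s).matches();
    }

    // Search: returns the first matching substring, or null if there is none.
    static String firstMatch(String s) {
        Matcher m = NAME.matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "H. sapiens;";
        System.out.println(wholeMatch(input)); // the trailing ';' cannot be consumed
        System.out.println(firstMatch(input)); // the search stops before the ';'
    }
}
```

With matches(), an invalid character at the end forces the engine to try every other way of matching before giving up. With find(), it can stop as soon as it has a match.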

Here is how an invalid input is matched against (\s*[a-z]+)*:

inputstringinputstring;

(Repetition 1)
\s*=(empty)
[a-z]+=inputstringinputstring
FAILED

Backtrack [a-z]+=inputstringinputstrin
(Repetition 2)
\s*=(empty)
[a-z]+=g
FAILED

(End repetition 2 since all choices are exhausted)
Backtrack [a-z]+=inputstringinputstri
(Repetition 2)
\s*=(empty)
[a-z]+=ng
FAILED

Backtrack [a-z]+=n
(Repetition 3)
\s*=(empty)
[a-z]+=g
FAILED

(End repetition 3 since all choices are exhausted)
(End repetition 2 since all choices are exhausted)
Backtrack [a-z]+=inputstringinputstr

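As a rough estimate of the complexity, let T(n) be the amount of work for an input of length n. Then T(n) = Sum [i = 0..(n-1)] T(i), so T(n + 1) = 2T(n), which means the work grows exponentially!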
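The recurrence can be checked with a simplified model of the trace above. This is only a sketch, not how the regex engine is implemented: the class BacktrackCount and the method attempts are hypothetical, and the model assumes \s* always matches empty and counts only the attempts that end at the invalid character.

```java
// Hypothetical model of the backtracking trace, for illustration only.
class BacktrackCount {
    // Number of failed attempts for a run of n letters followed by an
    // invalid character. Each repetition of the group takes between n and 1
    // letters (longest first), and the remaining letters are handled by the
    // next repetition, which gives T(n) = Sum [i = 0..(n-1)] T(i).
    static long attempts(int n) {
        if (n == 0) {
            return 1; // reached the invalid character: one failed attempt
        }
        long total = 0;
        for (int taken = n; taken >= 1; taken--) {
            total += attempts(n - taken);
        }
        return total;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 20; n++) {
            System.out.println(n + " letters -> " + attempts(n) + " failed attempts");
        }
    }
}
```

In this model every extra letter doubles the count, which agrees with T(n + 1) = 2T(n).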

* + , . , .

, (\s+[a-z]+\s*)* , , , . b^d, b - ( 1), d - . , ( English , ), .

+2
