Multiple Perl Regex Matches

I am looking for a regular expression that will behave as follows:

input: "hello world."

output: he, el, ll, lo, wo, or, rl, ld

My idea was something like:

    while($string =~ m/(([a-zA-Z])([a-zA-Z]))/g) {
        print "$1-$2 ";
    }

But it does something a little different.

+5
5 answers

It's complicated. You have to capture each pair, save it, and then force the regex engine to backtrack.

You can do it as follows:

use v5.10;   # first release with backtracking control verbs

my $string = "hello, world!";
my @saved;

my $pat = qr{
    ( \pL {2} )
    (?{ push @saved, $^N })
    (*FAIL)
}x;

@saved = ();
$string =~ $pat;
my $count = @saved;
printf "Found %d matches: %s.\n", $count, join(", " => @saved);

produces the following:

Found 8 matches: he, el, ll, lo, wo, or, rl, ld.
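To see what the (*FAIL) verb contributes, here is a minimal sketch that runs the same pattern with and without it. The variable names (@without, @with, $ok_with) are made up for this illustration, and it assumes a reasonably modern Perl where lexicals inside (?{ }) blocks behave.

```perl
use v5.10;
use strict;
use warnings;

my $string = "hello, world!";

# Without (*FAIL): the match succeeds at the first pair and stops there.
my @without;
$string =~ m{ ( \pL{2} ) (?{ push @without, $^N }) }x;

# With (*FAIL): every candidate is rejected after it has been saved,
# so the engine backtracks and retries from each later start position.
my @with;
my $ok_with = $string =~ m{ ( \pL{2} ) (?{ push @with, $^N }) (*FAIL) }x;

say "without (*FAIL): @without";   # he
say "with (*FAIL):    @with";      # he el ll lo wo or rl ld
say "overall match succeeded? ", ($ok_with ? "yes" : "no");   # no
```

Note that the overall match never succeeds when (*FAIL) is present. The saved list is a side effect of the attempts, so the return value of the match is of no use here.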

If you do not have v5.10, or you have a headache, you can use this:

my $string = "hello, world!";
my @pairs = $string =~ m{
  # we can only match at positions where the
  # following sneak-ahead assertion is true:
    (?=                 # zero-width look ahead
        (               # begin stealth capture
            \pL {2}     #       save off two letters
        )               # end stealth capture
    )
  # succeed after matching nothing, force reset
}xg;

my $count = @pairs;
printf "Found %d matches: %s.\n", $count, join(", " => @pairs);

This produces the same output as before.
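If it is not obvious why the lookahead version finds overlapping pairs, it may help to print where each match happens. This is only a sketch: @hits is a name invented for the illustration, and pos() is used to report the offset of each (empty) match.

```perl
use strict;
use warnings;

my $string = "hello, world!";
my @hits;

# The lookahead consumes no characters, so every successful match is empty.
# After an empty match, m//g retries one character further along,
# which is what allows the captured pairs to overlap.
while ($string =~ /(?=(\pL{2}))/g) {
    # For a zero-width match, pos() is the offset where the match sits.
    push @hits, pos($string) . ":$1";
}

print join(" ", @hits), "\n";
# prints: 0:he 1:el 2:ll 3:lo 7:wo 8:or 9:rl 10:ld
```

Offsets 4 through 6 are skipped because no two consecutive letters start there.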

But you may still have a headache.

+10

No need to "force backtracking"!

push @pairs, "$1$2" while /([a-zA-Z])(?=([a-zA-Z]))/g;

Although you may need to match any letter, not the limited set you specify.

push @pairs, "$1$2" while /(\pL)(?=(\pL))/g;
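For comparison, here is a rough sketch of what goes wrong when both letters are consumed, next to the lookahead form above. The names @consumed and @overlap are invented for the example.

```perl
use strict;
use warnings;

my $s = "hello world.";

# Consuming both letters: the next search starts after the pair,
# so the pairs can never overlap.
my @consumed = $s =~ /([a-zA-Z]{2})/g;

# Consuming one letter and only peeking at the next one:
# the next search starts at the peeked letter.
my @overlap;
push @overlap, "$1$2" while $s =~ /([a-zA-Z])(?=([a-zA-Z]))/g;

print "consumed: @consumed\n";   # he ll wo rl
print "overlap:  @overlap\n";    # he el ll lo wo or rl ld
```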
+5

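Yet another way to do it. No regex magic here, just nested map calls, which could easily be rewritten as for loops if you prefer.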

#!/usr/bin/env perl

use strict;
use warnings;

my $in = "hello world.";
my @words = $in =~ /(\b\pL+\b)/g;

my @out = map {
  my @chars = split '';
  map { $chars[$_] . $chars[$_+1] } ( 0 .. $#chars - 1 );
} @words;

print join ',', @out;
print "\n";

YMMV.
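If nested map blocks are hard to read, the same idea can be sketched with plain for loops and substr. This is an illustrative rewrite, not the code above verbatim; it drops the \b anchors because \pL+ is greedy and already grabs whole runs of letters.

```perl
use strict;
use warnings;

my $in = "hello world.";
my @out;

# Outer loop: each run of letters. Inner loop: each start offset
# that still has a letter after it.
for my $word ($in =~ /(\pL+)/g) {
    for my $i (0 .. length($word) - 2) {
        push @out, substr($word, $i, 2);
    }
}

print join(',', @out), "\n";
# prints: he,el,ll,lo,wo,or,rl,ld
```

A one-letter word gives the empty range 0 .. -1, so it contributes nothing.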

+1

Use a capturing group inside a lookahead.

(?=([a-zA-Z]{2}))
    ------------
         |->group 1 captures two English letters 
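One way this pattern might be used from Perl is in list context with the /g flag, which returns every value of group 1. The helper name letter_pairs is hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every overlapping pair of ASCII letters.
sub letter_pairs {
    my ($text) = @_;
    # In list context, m//g returns the contents of group 1 for each match.
    my @pairs = $text =~ /(?=([a-zA-Z]{2}))/g;
    return @pairs;
}

my @pairs = letter_pairs("hello world.");
print join(", ", @pairs), "\n";
# prints: he, el, ll, lo, wo, or, rl, ld
```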

0

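You can do this by matching one letter at a time, using pos to get the position of the match, \G to refer to that position in another regex, and substr to read characters from the string.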

use v5.10;
use strict;
use warnings;

my $letter_re = qr/[a-zA-Z]/;

my $string = "hello world.";
while( $string =~ m{ ($letter_re) }gx ) {
    # Skip it if the next character isn't a letter
    # \G will match where the last m//g left off.
    # It acts like pos() inside a regex.
    next unless $string =~ /\G $letter_re /x;

    # pos() is still where the last m//g left off.
    # Use substr to print the character before it (the one we matched)
    # and the next one, which we know to be a letter.
    say substr $string, pos($string)-1, 2;
}

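You can tighten this up with a "zero-width positive lookahead assertion", (?=pattern), to check for the next letter. Zero-width means it consumes no characters and does not advance the position of an m//g match. This is more compact, but zero-width assertions can get tricky.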

while( $string =~ m{ ($letter_re) (?=$letter_re) }gx ) {
    # pos() is still where the last m//g left off.
    # Use substr to print the character before it (the one we matched)
    # and the next one, which we know to be a letter.
    say substr $string, pos($string)-1, 2;
}

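Update: I originally tried to do both the match and the lookahead in one capture, as m{ ($letter_re (?=$letter_re)) }gx, but that does not work. The lookahead is zero-width, so it adds nothing to the captured text. As other answers show, if you put a second capture inside the lookahead, it all collapses to just...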

say "$1$2" while $string =~ m{ ($letter_re) (?=($letter_re)) }gx;
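To see the difference between the two forms, here is a small sketch that runs both on the same string. @single and @pairs are names invented for the comparison.

```perl
use v5.10;
use strict;
use warnings;

my $letter_re = qr/[a-zA-Z]/;
my $string    = "hello world.";

# Lookahead wrapped inside the capture: the group only holds the one
# letter that was actually consumed, because the lookahead is zero-width.
my @single = $string =~ m{ ($letter_re (?=$letter_re)) }gx;

# A second capture inside the lookahead records the peeked letter too.
my @pairs;
push @pairs, "$1$2" while $string =~ m{ ($letter_re) (?=($letter_re)) }gx;

say "single: @single";   # h e l l w o r l
say "pairs:  @pairs";    # he el ll lo wo or rl ld
```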

TMTOWTDI: there is more than one way to do it.

0
