Regular expression to capture the word before a specific character in R (Perl)

I need to get the words before and after a unique character (in my case: &) in a string in R.

I need to get "word1" from something like this: "... something is something word1 & word2 is something ..."

I can get the word after it using this Perl regular expression in R: (?<=& )[^ ]*(?= ) (It seems to behave the way I would like. I got it by combining answers I found on this site.)
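For reference, a minimal sketch of how this pattern might be run in base R. The sample string s is hypothetical, modeled on the example above, and perl = TRUE is assumed so that the lookbehind and lookahead are supported:

```r
# Hypothetical sample string, modeled on the example in the question
s <- "...something something word1 & word2 something..."

# perl = TRUE switches to PCRE, which is needed for lookbehind/lookahead
m <- regexpr("(?<=& )[^ ]*(?= )", s, perl = TRUE)

# regmatches() pulls out the matched substring
after <- regmatches(s, m)
print(after)
```

Because of the lookarounds, only the word itself is returned, without the & or the surrounding spaces.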

I need to get the word preceding the character &. The length of the words and the number of other preceding words vary, as do the spaces. A word can consist of letters and numbers, simply bounded by spaces on both sides.

+5

The pattern (\S+)\s*&\s*(\S+) captures the words on both sides of &.

In R, you can use regexec and regmatches to extract the captured groups.

string  <- "...something something word1 & word2 something..."
pattern <- "(\\S+)\\s*&\\s*(\\S+)"
match   <- regexec(pattern, string)
words   <- regmatches(string, match)

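words is a list with one element per input string; each element is a character vector holding the full match followed by the captured groups. So words[[1]][2] is word1, and words[[1]][3] is word2.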
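If several strings need to be processed at once, the same idea could be wrapped in a small helper. This is only a sketch: the function name extract_around_amp and the sample strings are hypothetical and not part of the answer, and returning NA for strings without a match is an assumed design choice.

```r
# Hypothetical helper: returns a matrix with one row per input string,
# column 1 = word before "&", column 2 = word after "&"
extract_around_amp <- function(x) {
  pattern <- "(\\S+)\\s*&\\s*(\\S+)"
  matches <- regmatches(x, regexec(pattern, x))
  # Each element is c(full match, group 1, group 2),
  # or character(0) when the string has no match
  t(vapply(matches, function(m) {
    if (length(m) == 3) m[2:3] else c(NA_character_, NA_character_)
  }, character(2)))
}

# Hypothetical sample input
strings <- c("a b word1 & word2 c", "x foo&bar y", "no ampersand here")
result <- extract_around_amp(strings)
print(result)
```

Since \s* also matches zero spaces, a string such as "foo&bar" is handled as well.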

+15

(<? = & ;) (\ *) (= &)

This pattern relies on a lookbehind.

+3
\b(.*?)\b&

The word will be captured in group 1. This is a reluctant (lazy) match of any text enclosed by two word boundaries, with & following the second boundary.

+2

This can be done with a relatively simple regular expression using strapplyc in the gsubfn package. Assuming that s is your string:

library(gsubfn)
strapplyc(s, "(\\w+) & (\\w+)")
+1