How to add a column to a data.table with values from a list based on regular expressions

I have the following data.table:

    id       fShort
1   432-12   1245
2   3242-12  453543
3   324-32   45543
4   322-34   45343
5   2324-34  13543


DT <- data.table(
        id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"), 
        fShort=c("1245", "453543", "45543", "45343", "13543"))

and the following list:

filenames <- list("3242-124342345.png", "432-124343.png", "135-13434.jpeg")

I would like to create a new column "fComplete" that holds the full file name from the list. To do this, the values in the id column must be matched against the list of file names: if a file name begins with the id string, the full file name should be returned. For a single id I use the following grep call

t <- grep("432-12", "432-124343.png", value = TRUE)

which returns the correct file name.

Here is the final table:

    id       fShort  fComplete
1   432-12   1245    432-124343.png
2   3242-12  453543  3242-124342345.png
3   324-32   45543   NA
4   322-34   45343   NA
5   2324-34  13543   NA


DT2 <- data.table(
         id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"), 
         fShort=c("1245", "453543", "45543", "45343", "13543"), 
         fComplete = c("432-124343.png", "3242-124342345.png", NA, NA, NA))

I tried using the apply and data.table approaches, but always get warnings like

argument 'pattern' has length > 1 and only the first element will be used

What is a simple approach to this?

2 answers

Here is a data.table solution:

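DT[ , fComplete := vapply(id, function(x) {
  # anchor the pattern with "^" so the file name must begin with the id
  m <- grep(paste0("^", x), unlist(filenames), value = TRUE)
  # keep the first match, or NA when nothing matches
  if (!length(m)) NA_character_ else m[1]
}, character(1), USE.NAMES = FALSE)]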


        id fShort          fComplete
1:  432-12   1245     432-124343.png
2: 3242-12 453543 3242-124342345.png
3:  324-32  45543                 NA
4:  322-34  45343                 NA
5: 2324-34  13543                 NA
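
The warning in the question appears because grep() accepts only a single pattern, so passing the whole id column at once uses just its first element. Below is a minimal sketch of the same per-id lookup with startsWith() instead of a regular expression, assuming R 3.3.0 or later; the helper name first_match and the variable files are made up for illustration and are not part of the original answer.

```r
library(data.table)

DT <- data.table(
  id = c("432-12", "3242-12", "324-32", "322-34", "2324-34"),
  fShort = c("1245", "453543", "45543", "45343", "13543"))
filenames <- list("3242-124342345.png", "432-124343.png", "135-13434.jpeg")
files <- unlist(filenames)  # plain character vector instead of a list

# grep(DT$id, files) would warn, because 'pattern' must be a single string.
# Hypothetical helper: first file name that starts with one id, else NA.
first_match <- function(one_id, files) {
  hits <- files[startsWith(files, one_id)]
  if (length(hits) == 0L) NA_character_ else hits[1L]
}

# Call the helper once per id; vapply guarantees a character column.
DT[, fComplete := vapply(id, first_match, character(1),
                         files = files, USE.NAMES = FALSE)]
print(DT)
```

Because vapply() returns a plain character vector, fComplete ends up as an ordinary character column rather than a list column, which is usually easier to work with later.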

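Here is an alternative that uses a plain data.frame instead of a data.table: unlist the list of file names, then apply grep to each id.

If an id matches more than one file name, the [1] keeps only the first match; if nothing matches, the result is NA.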

DT <- data.frame(
  id=c("432-12", "3242-12", "324-32", "322-34", "2324-34"), 
  fShort=c("1245", "453543", "45543", "45343", "13543"))

filenames <- list("3242-124342345.png", "432-124343.png", "135-13434.jpeg")
filenames1 <- unlist(filenames)

# index of the first file name that matches each id (NA if none)
x <- apply(DT[1], 1, function(x) grep(x, filenames1)[1])
DT$fComplete <- filenames1[x]
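
One caveat, since the question asks for file names that begin with the id: a bare grep(id, files) matches the id anywhere in the file name, not only at the start. The sketch below compares three ways of matching; the file names in it are invented to expose the difference and do not come from the question.

```r
# Hypothetical file names chosen to expose the difference.
files <- c("9432-12-old.png", "432-124343.png")
id <- "432-12"

# Unanchored regex: the id may occur anywhere in the name.
anywhere <- grep(id, files, value = TRUE)

# Anchored regex: the name has to start with the id.
anchored <- grep(paste0("^", id), files, value = TRUE)

# Literal prefix test: no regex, so characters such as "." or "+"
# in an id are not treated as metacharacters.
prefix <- files[startsWith(files, id)]

print(anywhere)
print(anchored)
print(prefix)
```

With the data from the question all three variants give the same result, so this only matters if ids can appear inside other file names or contain regex metacharacters.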