How would you translate this into the language of the data.table package in R?

I am trying to learn the data.table package in R. I have a data table named DT1 and a data frame DF1, and I would like to subset some rows according to a logical condition (a disjunction). This is my code:

DF1[DF1$c1==0 | DF1$c2==1,] #the data.frame way with the data.frame DF1
DT1[DT1$c1==0 | DT1$c2==1,] #the data.frame way with the data.table DT1

On page 5 of “Introduction to the data.table package in R”, the author gives a similar example, but with a conjunction (replace | with & in the second line above), and remarks that this is a poor use of the data.table package. Instead, he suggests doing it as follows:

setkey(DT1,c1,c2)
DT1[J(0,1)]

So my question is: how can I write a disjunction condition using data.table syntax? Is my second line, DT1[DT1$c1==0 | DT1$c2==1,], a misuse of the package? Is there an equivalent of J, but for disjunction?

2 answers

This document indicates that you could use:

DT1[c1==0 | c2==1, ]
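
As a rough illustration of that form, here is a minimal sketch on made-up data. The values in DF1 are hypothetical and only the column names c1 and c2 come from the question. It suggests that the expression inside DT1[...] can refer to the columns directly and selects the same rows as the data.frame-style call:

```r
library(data.table)

# Hypothetical toy data; only the column names c1 and c2 come from the question.
DF1 <- data.frame(c1 = c(0, 0, 1, 2), c2 = c(1, 3, 1, 5))
DT1 <- as.data.table(DF1)

# Inside DT1[...], column names are evaluated within the table,
# so the DT1$ prefix is not needed.
res_dt <- DT1[c1 == 0 | c2 == 1]

# The data.frame-style call from the question, for comparison.
res_df <- DF1[DF1$c1 == 0 | DF1$c2 == 1, ]

# Both approaches should return the same rows.
all.equal(res_dt$c1, res_df$c1)
all.equal(res_dt$c2, res_df$c2)
```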

Here is another solution:

grpsize = ceiling(1e7/26^2)
DT <- data.table(
  x=rep(LETTERS,each=26*grpsize),
  y=rep(letters,each=grpsize),
  v=runif(grpsize*26^2))

setkey(DT, x)
system.time(DT1 <- DT[x=="A" | x=="Z"])
   user  system elapsed 
   0.68    0.05    0.74 
system.time(DT2 <- DT[J(c("A", "Z"))])
   user  system elapsed 
   0.08    0.00    0.07 
all.equal(DT1[, v], DT2[, v])
TRUE

Note that I took this example from the data.table documentation. The only difference is that I no longer convert the letters to factors, because character keys are now allowed (see NEWS for version 1.8.0).

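Some explanation: J is just shorthand for data.table. So, if you call J(0, 1), you create a data.table with two columns and a single row, as in your example: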

> J(0,1)
     V1 V2
[1,]  0  1

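Here, however, you want to match two different values in the same column. For that you need a data.table with a single column, so wrap the values in c().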

J(c(0,1))
     V1
[1,]  0
[2,]  1
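
To make the distinction concrete, here is a small sketch on hypothetical data (the table DT and its column v are invented for illustration). It assumes a key on the single column c1 and compares the keyed join with the equivalent logical expression. Note that this covers a disjunction over values of one keyed column; for a condition spanning two columns, such as c1==0 | c2==1, the logical expression is the direct route.

```r
library(data.table)

# Hypothetical toy table; v just labels the original row order.
DT <- data.table(c1 = c(0, 1, 2, 0, 1), v = 1:5)
setkey(DT, c1)  # sorts DT by c1 and marks c1 as the key

# One column with two rows: matches c1 == 0 OR c1 == 1 via the key.
by_join <- DT[J(c(0, 1))]

# The same rows selected with a logical expression (full vector scan).
by_scan <- DT[c1 == 0 | c1 == 1]

# Both should contain the same values in the same order.
all.equal(by_join$v, by_scan$v)
```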