Create a new column using vertical conditions with data.table

I am trying to use data.tablefor an operation that I do not do, if for data.table. Suppose I have the following

set.seed(1)
id<-rep(LETTERS[1:6], each=3)
event<-rep(letters[1:4], each=3, length.out=18)
order<-rep(1:3, 6)
var<-rnorm(18)

dt<-data.table(id, event, order, var)
setkey(dt, id)

    id event order         var
 1:  A     a     1 -0.62645381
 2:  A     a     2  0.18364332
 3:  A     a     3 -0.83562861
 4:  B     b     1  1.59528080
 5:  B     b     2  0.32950777
 6:  B     b     3 -0.82046838
 7:  C     c     1  0.48742905
 8:  C     c     2  0.73832471
 9:  C     c     3  0.57578135
10:  D     d     1 -0.30538839
11:  D     d     2  1.51178117
12:  D     d     3  0.38984324
13:  E     a     1 -0.62124058
14:  E     a     2 -2.21469989
15:  E     a     3  1.12493092
16:  F     b     1 -0.04493361
17:  F     b     2 -0.01619026
18:  F     b     3  0.94383621

and I need to calculate in the new column the Euclidean distance between the values ​​of var, having different idbut identical event, supporting the orderobservational data .

I realized that using the command byin the expression, data.tableI divide the data into pieces that I can use in connection with other fragments from another data.table, but can I do the same thing in the same one data.table?

To be clear, this is what I would like to get

    id event order         var   euclid
 1:  A     a     1 -0.62645381 3.097720
 2:  A     a     2  0.18364332 3.097720
 3:  A     a     3 -0.83562861 3.097720
 4:  B     b     1  1.59528080 2.433635
 5:  B     b     2  0.32950777 2.433635
 6:  B     b     3 -0.82046838 2.433635
 7:  C     c     1  0.48742905       NA
 8:  C     c     2  0.73832471       NA
 9:  C     c     3  0.57578135       NA
10:  D     d     1 -0.30538839       NA
11:  D     d     2  1.51178117       NA
12:  D     d     3  0.38984324       NA
13:  E     a     1 -0.62124058 3.097720
14:  E     a     2 -2.21469989 3.097720
15:  E     a     3  1.12493092 3.097720
16:  F     b     1 -0.04493361 2.433635
17:  F     b     2 -0.01619026 2.433635
18:  F     b     3  0.94383621 2.433635

Many thanks!

+3
source share
1

, :

dt[, euclid:= dist(xtabs(var ~ id + order, data=.SD)), by=event]

#    id event order         var   euclid
# 1:  A     a     1 -0.62645381 3.097720 # snip
# 4:  B     b     1  1.59528080 2.433635 # snip
# 7:  C     c     1  0.48742905       NA # snip
#10:  D     d     1 -0.30538839       NA # snip
#13:  E     a     1 -0.62124058 3.097720 # snip
#16:  F     b     1 -0.04493361 2.433635 # snip

@Arun, 3+ , dist, 0 NA, :

dt[, euclid := sum(dist(xtabs(var ~ id + order, data=.SD))), by=event]
+1

All Articles