How to split a list of data.frames and apply a function to a single column?

I have a little question about using functions. For example, I have:

l <- list(a = data.frame(A1=rep(10,5),B1=c(1,1,1,2,2),C1=c(5,10,20,7,30)),
          b = data.frame(A1=rep(20,5),B1=c(3,3,4,4,4),C1=c(3,5,10,20,30)))

I want to find the minimum of C1 for each B1. The result should be:

$a
  A1 B1 C1
  10  1  5
  10  2  7

$b
  A1 B1 C1
  20  3  3
  20  4  10

I know how to do this with a "for" loop, but there should be a simpler way with "lapply". I could not get it to work.

Please help.

4 answers

Here's another approach that matches your desired result:

lapply(l, function(x) {
  temp <- ave(x[["C1"]], x["B1"], FUN = min)
  x[x[["C1"]] == temp, ]
})
# $a
#   A1 B1 C1
# 1 10  1  5
# 4 10  2  7
# 
# $b
#   A1 B1 C1
# 1 20  3  3
# 3 20  4 10
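
To see why this works, it can help to look at the intermediate vector that ave() produces. The following is only a minimal sketch on the first data frame from the question; the variable name keep is introduced here for illustration and is not part of the answer above.

```r
# First data frame from the question
x <- data.frame(A1 = rep(10, 5), B1 = c(1, 1, 1, 2, 2), C1 = c(5, 10, 20, 7, 30))

# ave() returns a vector as long as C1, holding the group minimum for every row
temp <- ave(x[["C1"]], x["B1"], FUN = min)
temp
# [1] 5 5 5 7 7

# Keep only the rows whose C1 equals the minimum of their B1 group
keep <- x[["C1"]] == temp   # 'keep' is an illustrative name
x[keep, ]
```

Note that if several rows in a group tie for the minimum, this comparison keeps all of them.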

How about combining lapply and tapply:

lapply(l, function(i) tapply(i$C1, i$B1, min))
$a
1 2 
5 7 

$b
3  4 
3 10 

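The trick to thinking about several operations is to break the task down into bits. So:

  • How do you find the minimum of C1 for each B1 in a single data frame?

    i = l[[1]]
    tapply(i$C1, i$B1, min)

  • How do you do that for every element of the list? Use lapply:

    lapply(l, function(i) tapply(i$C1, i$B1, min))

Step 2 simply wraps step 1 in lapply.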
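
The tapply approach returns a named vector per list element rather than the data frames shown in the question. If data frames are wanted, one possible variation is to swap tapply for the base R function aggregate() with its formula interface. This is a different function from the one used above, shown only as a sketch; the name res is illustrative.

```r
l <- list(a = data.frame(A1 = rep(10, 5), B1 = c(1, 1, 1, 2, 2), C1 = c(5, 10, 20, 7, 30)),
          b = data.frame(A1 = rep(20, 5), B1 = c(3, 3, 4, 4, 4), C1 = c(3, 5, 10, 20, 30)))

# For each data frame, take the minimum of C1 within every A1/B1 combination.
# aggregate() returns a data frame with columns A1, B1 and C1.
res <- lapply(l, function(i) aggregate(C1 ~ A1 + B1, data = i, FUN = min))
res
```

The same break-it-down idea applies: the aggregate() call can first be tried on l[[1]] alone and then wrapped in lapply.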


Having recently succumbed to the siren song of the data.table package and its combination of versatility and speed for such operations, I present another solution:

library(data.table)
lapply(l, function(dat) {
    data.table(dat, key="B1,C1")[list(unique(B1)), mult="first"]
})

If maintaining the original column order is important, the data.table() call can be wrapped in setcolorder(..., names(dat)).
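
A sketch of what that wrapping might look like, assuming the data.table package is installed. The names res and first_rows are introduced here for illustration only.

```r
library(data.table)

l <- list(a = data.frame(A1 = rep(10, 5), B1 = c(1, 1, 1, 2, 2), C1 = c(5, 10, 20, 7, 30)),
          b = data.frame(A1 = rep(20, 5), B1 = c(3, 3, 4, 4, 4), C1 = c(3, 5, 10, 20, 30)))

res <- lapply(l, function(dat) {
  # Keying on B1 then C1 sorts the rows, so the first row per B1 has the smallest C1
  first_rows <- data.table(dat, key = "B1,C1")[list(unique(B1)), mult = "first"]
  # Put the columns back in the order of the input data frame (modifies by reference)
  setcolorder(first_rows, names(dat))
})
res
```

Each element of res is a data.table rather than a plain data.frame; as.data.frame() can convert it back if needed.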


You can also try llply + dcast from the plyr / reshape2 toolbox:

library(reshape2)
library(plyr)

l <- list(a = data.frame(A1=rep(10,5),B1=c(1,1,1,2,2),C1=c(5,10,20,7,30)),
          b = data.frame(A1=rep(20,5),B1=c(3,3,4,4,4),C1=c(3,5,10,20,30)))

# One row per A1/B1 combination, aggregated with min over C1
llply(l, function(x) dcast(x, A1 + B1 ~ ., fun.aggregate = min, value.var = "C1"))