Calculating the appearance of numbers in subsets of data.frame

Question


I have a data frame in R that looks like the one below. My real data frame is much bigger, but I have simplified things as much as possible to avoid confusion.

Here is the data frame:

id <-c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)   
a <-c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <-c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <-c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <-c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <-c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)

df <-data.frame(id,a,b,c,d,e)
df

Basically, I would like to count how often each number occurs in each column (a, b, c, d, e) within each group given by the id column (1, 2, 3).

So, for column a and id 1, the code would be something like this:

as.numeric(table(df[1:10,2]))

##The results are:
[1] 3 7

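This means: in column a (for id 1), the number "1" appears 3 times and the number "3" appears 7 times.

Similarly, for column a and id 2 (rows 11 to 20):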

as.numeric(table(df[11:20,2]))

##The results are:
[1] 4 3 3

This means: in column a (for id 2), the number "1" appears 4 times, "2" appears 3 times and "3" appears 3 times.

As a first attempt, I stored each column of the df dataframe in its own object, like this:

for (z in (2:ncol(df))) assign(paste("df",z,sep="."),df[,z])

This gives df.2 equal to df$a, df.3 equal to df$b, df.4 equal to df$c, and so on.

How can I get these counts for every column and every id?


r subset

Laszlo, 17 Mar '11 at 9:13


wkmor1 · Answer 1 · 2011-03-17T12:16:25+0000

Melt the data frame by id with the reshape package, then tabulate the result:

> library(reshape)

> dftab <- table(melt(df,'id'))
> dftab
, , value = 1

   variable
id  a b c d e
  1 3 8 2 2 4
  2 4 6 3 2 4
  3 4 2 1 5 1

, , value = 2

   variable
id  a b c d e
  1 0 1 4 3 3
  2 3 3 3 6 2
  3 1 4 5 3 4

, , value = 3

   variable
id  a b c d e
  1 7 1 4 5 3
  2 3 1 4 2 4
  3 5 4 4 2 5

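To look up a single count, for example how often the value "1" appears in column a for id "3", index the table as [id, variable, value]: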

> dftab[3,'a',1]
[1] 4
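
If the reshape package is not available, much the same three-way table can be sketched in base R. This is only an illustration: the name `long` is made up here and is not part of the answer above. The sketch also indexes by label rather than by position, which makes it harder to mix up the id and value dimensions.

```r
# Example data from the question
id <- rep(c(1, 2, 3), each = 10)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
df <- data.frame(id, a, b, c, d, e)

# Stack columns a..e into long format: one row per (id, variable, value).
# `long` is a hypothetical name used only for this sketch.
long <- data.frame(
  id       = rep(df$id, times = ncol(df) - 1),
  variable = rep(names(df)[-1], each = nrow(df)),
  value    = unlist(df[-1], use.names = FALSE)
)

# Three-way contingency table with dimensions id x variable x value
dftab <- table(long)

# Look up by label: id "3", column "a", value "1"
dftab["3", "a", "1"]
```

Quoting the labels means the lookup still points at the intended cell even if the ids or values are not the consecutive integers 1, 2, 3.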

James · Answer 2 · 2011-03-17T12:44:07+0000

You can combine tapply and apply:

tapply(df$id, df$id, function(x) apply(df[df$id == x[1], -1], 2, table))

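This gives the output below. Note that column a contains no 2s for id 1, so the tables for that id have different lengths and apply returns a list for it rather than a matrix.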

$`1`
$`1`$a

1 3 
3 7 

$`1`$b

1 2 3 
8 1 1 

$`1`$c

1 2 3 
2 4 4 

$`1`$d

1 2 3 
2 3 5 

$`1`$e

1 2 3 
4 3 3 


$`2`
  a b c d e
1 4 6 3 2 4
2 3 3 3 6 2
3 3 1 4 2 4

$`3`
  a b c d e
1 4 2 1 5 1
2 1 4 5 3 4
3 5 4 4 2 5
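
If a uniform result for every id is preferred, one possible variation is to count against a fixed set of levels so that missing values show up as 0. The sketch below assumes the only possible values are 1, 2 and 3; the helper name `count_levels` is made up for this illustration, and split with lapply is used here in place of tapply.

```r
# Example data from the question
id <- rep(c(1, 2, 3), each = 10)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
df <- data.frame(id, a, b, c, d, e)

# Hypothetical helper: tabulate one column against the assumed levels 1:3,
# so a value that never occurs is reported with a count of 0
count_levels <- function(col) table(factor(col, levels = 1:3))

# Split the non-id columns by id, then build one value-by-column matrix per id
counts <- lapply(split(df[-1], df$id), function(g) sapply(g, count_levels))

# Every element is now a 3 x 5 matrix, including the one for id 1
counts[["1"]]
```

Because every table has the same length, sapply can simplify each group to a matrix, and a single count can be read as, for example, counts[["1"]]["2", "a"].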

Noah · Answer 3 · 2011-03-17T10:51:55+0000

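One option is to write a small helper function that tabulates every column, and then apply it to each id with dlply from the plyr package.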

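library(plyr)  # provides dlply

# Build one frequency table per non-id column of a data frame
ColTables <- function(df) {
  counts <- list()
  for(a in names(df)[names(df) != "id"]) {
    counts[[a]] <- table(df[a])
  }
  return(counts)
}

# Apply ColTables to each id group; the result is a list keyed by id
results <- dlply(df, "id", ColTables)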

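The result is a list with one element per id; each element holds the table for every column within that id. For example: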

> results[['2']]['a']
$a

1 2 3 
4 3 3

These are the counts for id = 2, column = a, matching the values in the question.

arinarmo · Answer 4 · 2014-03-31T21:46:09+0000

You can also use aggregate:

> df$freq <- 0
> aggregate(freq~a+id,df,length)
  a id freq
1 1  1    3
2 3  1    7
3 1  2    4
4 2  2    3
5 3  2    3
6 1  3    4
7 2  3    1
8 3  3    5

Of course, you can wrap this in a function, which makes it easier to reuse and means you don't need to add a column to your actual data frame:

> frequency <- function(df,groups) {
+   relevant <- df[,groups]
+   relevant$freq <- 0
+   aggregate(freq~.,relevant,length)
+ }
> frequency(df,c("b","id"))
  b id freq
1 1  1    8
2 2  1    1
3 3  1    1
4 1  2    6
5 2  2    3
6 3  2    1
7 1  3    2
8 2  3    4
9 3  3    4
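
Note that aggregate only returns combinations that actually occur, which is why there is no row for a = 2, id = 1 in the first output above. If zero counts are wanted as well, one possible alternative, sketched here in base R, is to convert a contingency table to a data frame. The column name freq is chosen only to match the output above.

```r
# Example data from the question (only id and a are needed here)
id <- rep(c(1, 2, 3), each = 10)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
df <- data.frame(id, a)

# Cross-tabulate a against id, then flatten to one row per combination.
# Combinations that never occur are kept with a count of 0.
counts <- as.data.frame(table(a = df$a, id = df$id), responseName = "freq")

# The a = 2, id = 1 combination is now present
counts[counts$a == "2" & counts$id == "1", ]
```

In this form the a and id columns are factors, so they are compared against the labels "2" and "1" rather than against numbers.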

John · Answer 5 · 2014-05-26T14:52:01+0000

You did not say what format you need the result in. The by function can give you the result you need:

by(df, df$id, function(x) lapply(x[,-1], table))
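
As a rough illustration of how the result of by might be used, the sketch below stores it and pulls out a single table by id label and column name. The name `res` is made up for this example.

```r
# Example data from the question
id <- rep(c(1, 2, 3), each = 10)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
df <- data.frame(id, a, b, c, d, e)

# One list of per-column tables for each id (`res` is a hypothetical name)
res <- by(df, df$id, function(x) lapply(x[, -1], table))

# Counts for id 2, column a
res[["2"]]$a
```

Since each element is a plain list of tables, a count of a specific value can be read with, for example, res[["2"]]$a[["1"]].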
