Counting the occurrences of numbers in subsets of a data.frame

I have a data frame in R that looks like the following. My real data frame is much bigger, but I have simplified it as much as possible to keep things clear.

Here is the data frame:

id <- c(1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)
a <- c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
b <- c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
c <- c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2)
d <- c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2)
e <- c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)

df <- data.frame(id, a, b, c, d, e)
df

I would like to count how often each number occurs in each column (a, b, c, d, e), separately for each group given by the id column (1, 2, 3).

For column a and id 1 (rows 1 to 10, according to the id column), the code would be something like this:

as.numeric(table(df[1:10,2]))

##The results are:
[1] 3 7

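To explain the result: in column a (looking only at the rows with id 1), the number "1" appears 3 times and the number "3" appears 7 times.

Again, for column a but now for id 2 (see the id column):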

as.numeric(table(df[11:20,2]))

##The results are:
[1] 4 3 3

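To explain the result: in column a (looking only at the rows with id 2), the number "1" appears 4 times, the number "2" 3 times and the number "3" 3 times.

And so on for the other ids and columns. I would like to do this for every column and every id in df without writing out each subset by hand.

So far, I have split the columns of the df data frame into separate objects with the following: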

for (z in (2:ncol(df))) assign(paste("df",z,sep="."),df[,z])

Here, df.2 refers to df$a, df.3 to df$b, df.4 to df$c, and so on. But I'm stuck at this point ...

Is there a better way to do this?


> library(reshape)

> dftab <- table(melt(df,'id'))
> dftab
, , value = 1

   variable
id  a b c d e
  1 3 8 2 2 4
  2 4 6 3 2 4
  3 4 2 1 5 1

, , value = 2

   variable
id  a b c d e
  1 0 1 4 3 3
  2 3 3 3 6 2
  3 1 4 5 3 4

, , value = 3

   variable
id  a b c d e
  1 7 1 4 5 3
  2 3 1 4 2 4
  3 5 4 4 2 5

For example, to look up how many times the value "1" appears in column "a" for id "3":

> dftab[3,'a',1]
[1] 4
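
If the reshape package is not available, a similar three-way table can probably be built with base R alone. The following is only a sketch: it uses stack() in place of melt(), and the names long and tab are made up for this illustration. The data is the same as in the question, rebuilt here so the snippet runs on its own.

```r
# Same data as in the question (id built with rep() for brevity)
df <- data.frame(
  id = rep(1:3, each = 10),
  a = c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3),
  b = c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2),
  c = c(1,3,2,3,2,1,2,3,3,2,2,3,1,2,3,3,3,1,1,2,3,3,1,2,2,3,2,2,3,2),
  d = c(3,3,3,1,3,2,2,1,2,3,2,2,2,1,3,1,2,2,3,2,3,2,3,2,1,1,1,1,1,2),
  e = c(2,3,1,2,1,2,3,3,1,1,2,1,1,3,3,2,1,1,3,3,2,2,3,3,3,2,3,2,1,3)
)

# stack() puts all value columns into long format:
# 'values' holds the numbers, 'ind' holds the source column name.
# The id column is repeated once per stacked column so rows stay aligned.
long <- data.frame(id = rep(df$id, times = ncol(df) - 1), stack(df[-1]))

# Three-way contingency table: id x variable x value
tab <- table(id = long$id, variable = long$ind, value = long$values)

# Index by dimnames instead of positions: id 3, column a, value 1
tab["3", "a", "1"]   # 4
```

Indexing by name rather than by position avoids mistakes if the ids or values do not start at 1.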

You can combine tapply and apply:

tapply(df$id,df$id,function(x) apply(df[id==x,-1],2,table))

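Note that because column a contains no 2s for id 1, the tables for that id have different lengths, so the result for id 1 is a list (rather than a matrix).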

$`1`
$`1`$a

1 3 
3 7 

$`1`$b

1 2 3 
8 1 1 

$`1`$c

1 2 3 
2 4 4 

$`1`$d

1 2 3 
2 3 5 

$`1`$e

1 2 3 
4 3 3 


$`2`
  a b c d e
1 4 6 3 2 4
2 3 3 3 6 2
3 3 1 4 2 4

$`3`
  a b c d e
1 4 2 1 5 1
2 1 4 5 3 4
3 5 4 4 2 5
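
The mixed shape of the output above (a list for id 1, matrices for ids 2 and 3) comes from value 2 never occurring in column a for id 1. One possible way around this, sketched here under the assumption that the only values of interest are 1, 2 and 3, is to convert each column to a factor with fixed levels before tabulating. The name counts is made up for this illustration, and only columns a and b are used to keep the snippet short.

```r
# Reduced copy of the question's data (id, a and b only)
df <- data.frame(
  id = rep(1:3, each = 10),
  a = c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3),
  b = c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
)

# Split the value columns by id, then tabulate every column with the
# same factor levels so that absent values are reported as 0.
counts <- lapply(split(df[-1], df$id), function(g) {
  sapply(g, function(col) table(factor(col, levels = 1:3)))
})

counts[["1"]]
#   a b
# 1 3 8
# 2 0 1
# 3 7 1
```

With fixed levels every id yields a matrix of the same shape, so a single count can be read as counts[["1"]]["2", "a"].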

Here is a solution that uses dlply from the plyr package.

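library(plyr)

# Tabulate every column of df except "id"
ColTables <- function(df) {
  counts <- list()
  for(a in names(df)[names(df) != "id"]) {
    counts[[a]] <- table(df[a])
  }
  return(counts)
}

# Apply ColTables to each subset of df defined by id
results <- dlply(df, "id", ColTables)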

The result is a list with one element per id; each element holds the tables for that id. For example:

> results[['2']]['a']
$a

1 2 3 
4 3 3 

This gives the counts for id = 2, column = a.

You can also use aggregate:

> df$freq <- 0
> aggregate(freq~a+id,df,length)
  a id freq
1 1  1    3
2 3  1    7
3 1  2    4
4 2  2    3
5 3  2    3
6 1  3    4
7 2  3    1
8 3  3    5

Of course, you can wrap this in a function so that it is easy to reuse and you don’t need to add a column to your actual data frame:

> frequency <- function(df,groups) {
+   relevant <- df[,groups]
+   relevant$freq <- 0
+   aggregate(freq~.,relevant,length)
+ }
> frequency(df,c("b","id"))
  b id freq
1 1  1    8
2 2  1    1
3 3  1    1
4 1  2    6
5 2  2    3
6 3  2    1
7 1  3    2
8 2  3    4
9 3  3    4
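
A similar long-format result can probably be obtained without the dummy freq column by converting a two-way table to a data frame. This is a sketch only; the names freq and nonzero are made up for this illustration, and only columns id and a are used.

```r
# Reduced copy of the question's data (id and a only)
df <- data.frame(
  id = rep(1:3, each = 10),
  a = c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3)
)

# Cross-tabulate a by id, then flatten the table to one row per combination.
# responseName sets the name of the count column.
freq <- as.data.frame(table(a = df$a, id = df$id), responseName = "freq")

# Unlike aggregate(), this keeps combinations that never occur (freq 0);
# drop them to get the same rows as the aggregate() output.
nonzero <- subset(freq, freq > 0)
nonzero
```

Note that a and id come back as factors here, so they may need converting if they are used as numbers afterwards.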

You did not say what form you need the result in. The by function can give you the result you need:

by(df, df$id, function(x) lapply(x[,-1], table))
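
As a sketch of how the by result might be used afterwards, individual tables can be pulled out by id and column name. The name res is made up for this illustration, and only columns id, a and b are used.

```r
# Reduced copy of the question's data (id, a and b only)
df <- data.frame(
  id = rep(1:3, each = 10),
  a = c(3,1,3,3,1,3,3,3,3,1,3,2,1,2,1,3,3,2,1,1,1,3,1,3,3,3,2,1,1,3),
  b = c(3,2,1,1,1,1,1,1,1,1,1,2,1,3,2,1,1,1,2,1,3,1,2,2,1,3,3,2,3,2)
)

# One list of tables per id; x[, -1] drops the id column
res <- by(df, df$id, function(x) lapply(x[, -1], table))

# Table of column a for id 2
res[["2"]]$a

# A single count: how often 3 appears in column a for id 1
res[["1"]]$a[["3"]]   # 7
```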