I have a dataframe that looks like this:
df <- data.frame(Site=rep(paste0('site', 1:5), 50),
                 Month=sample(1:12, 250, replace=TRUE),  # one draw per row, no recycling
                 Count=sample(1:1000, 250, replace=TRUE))
I want to remove any sites where the count is always <5% of the maximum monthly count across all sites.
Maximum monthly count for all sites:
library(plyr)
ddply(df, .(Month), summarise, Max.Count=max(Count))
If every count for site5 is set to 1, then its count is always <5% of the maximum monthly count across all sites. Therefore, I would want site5 to be removed.
df$Count[df$Site=='site5'] <- 1
However, after assigning new values to site2, some of its counts are <5% of the maximum monthly count, while others are >5%. Therefore, I would not want site2 to be removed.
df$Count[df$Site=='site2'] <- ceiling(seq(1, 1000, length.out=50))  # site2 has 50 rows
How can I subset the dataframe to remove any sites where the count is always <5% of the maximum monthly count? Let me know if the question is unclear and I will amend it.
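To make the rule concrete, here is a minimal base R sketch of the kind of result I am after. It is only an illustration, not necessarily the best approach. The fixed seed, the 250-row sample, and the names month_max, reaches, keep_sites and df_kept are my own assumptions for the example. A site is kept if at least one of its rows reaches 5% of that month's maximum.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible

# Sample data rebuilt so the block runs on its own
df <- data.frame(Site  = rep(paste0('site', 1:5), 50),
                 Month = sample(1:12, 250, replace = TRUE),
                 Count = sample(1:1000, 250, replace = TRUE))
df$Count[df$Site == 'site5'] <- 1
df$Count[df$Site == 'site2'] <- ceiling(seq(1, 1000, length.out = 50))

# Maximum count per month across all sites, aligned to each row of df
month_max <- ave(df$Count, df$Month, FUN = max)

# TRUE where a row reaches at least 5% of its month's maximum
reaches <- df$Count >= 0.05 * month_max

# Keep every site that has at least one such row;
# sites that are always below 5% drop out
keep_sites <- unique(df$Site[reaches])
df_kept <- df[df$Site %in% keep_sites, ]
```

With this data I would expect site5 to be absent from df_kept and site2 to remain, since site2 has at least one count (1000) that cannot fall below 5% of any monthly maximum.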