Grouping data in ranges in R

Suppose I have a data frame in R that has student names in one column and their marks in another column. The marks range from 20 to 100.

> mydata  
id  name   marks gender  
1   a1    56     female  
2   a2    37      male  

I want to divide the students into groups based on the marks they obtained, with each group covering a range of 10 marks. I tried the table function, which gives the number of students in each range (20-30, 30-40, and so on), but what I want is to select the students whose marks fall in a given range and put all of their information together in a group. Any help is appreciated.

+5
3 answers

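I am not sure what you mean by "combine all their information into a group", but here is one way to split the data frame into a list of data frames, one for each range of 10 marks: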

mydata <- data.frame(
  id = 1:100,
  name = paste0("a",1:100),
  marks = sample(20:100,100,TRUE),
  gender = sample(c("female","male"),100,TRUE))

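# include.lowest = TRUE keeps students with exactly 20 marks in the first group
split(mydata, cut(mydata$marks, seq(20, 100, by = 10), include.lowest = TRUE))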
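
As a small sketch of how the result can be used (the names `breaks` and `groups` are introduced here for illustration and are not part of the answer above): each element of the list returned by split is itself a data frame holding every column for the students in that range, so a group can be pulled out by its interval label.

```r
set.seed(42)  # fixed seed so the sketch is reproducible
mydata <- data.frame(
  id = 1:100,
  name = paste0("a", 1:100),
  marks = sample(20:100, 100, TRUE),
  gender = sample(c("female", "male"), 100, TRUE))

# Hypothetical helper names: `breaks` and `groups`
breaks <- seq(20, 100, by = 10)
groups <- split(mydata, cut(mydata$marks, breaks, include.lowest = TRUE))

names(groups)         # interval labels such as "[20,30]" and "(30,40]"
groups[["(30,40]"]]   # all columns for students with 30 < marks <= 40
sapply(groups, nrow)  # number of students per range
```

Because the labels come from cut, changing `breaks` changes the names used to look up a group.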
+9

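In case @Sacha's answer is not quite what you need, here are some further options. It is not clear what you mean by "group": for example, whether you have several data frames that you want to combine into one "group" (with rbind, for instance).

First, some sample data.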

# Two data.frames (myData1, and myData2)
set.seed(1)
myData1 <- data.frame(id = 1:20, 
                      name = paste("a", 1:20, sep = ""),
                      marks = sample(20:100, 20, replace = TRUE),
                      gender = sample(c("F", "M"), 20, replace = TRUE))
myData2 <- data.frame(id = 1:17,
                      name = paste("b", 1:17, sep = ""),
                      marks = sample(30:100, 17, replace = TRUE),
                      gender = sample(c("F", "M"), 17, replace = TRUE))

Second, some options.

  • 1: Returns (as a list) the rows of myData1 and myData2 whose marks fall in a given range (30 to 50 here). The result is a list of data.frames.

    lapply(list(myData1 = myData1, myData2 = myData2), 
           function(x) x[x$marks >= 30 & x$marks <= 50, ])
    
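  • 2: Returns (as a list) each data set split into two groups, FALSE (marks outside the range) and TRUE (marks inside the range). The result is a list of lists of data.frames.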

    lapply(list(myData1 = myData1, myData2 = myData2), 
           function(x) split(x, x$marks >= 30 & x$marks <= 50))
    
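  • 3: Splits each data set into several ranges using cut, as in @Sacha's answer. The breaks do not have to be evenly spaced; here they are 0, 30, 50, 75 and 100. The result is a list of lists of data.frames.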

    lapply(list(myData1 = myData1, myData2 = myData2),
           function(x) split(x, cut(x$marks, 
                                    breaks = c(0, 30, 50, 75, 100), 
                                    include.lowest = TRUE)))
    
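  • 4: The same as option 1, but applied after combining the data into a single data.frame, so the result is a single data.frame.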

    # Combine the data. Assumes the column names are the same in both sets
    myDataALL <- rbind(myData1, myData2)
    # Extract just the group of scores you're interested in
    myDataALL[myDataALL$marks >= 30 & myDataALL$marks <= 50, ]
    
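  • 5: The same as option 2, using split on the combined data: one group for marks outside the range and one for marks inside it. The result is a list of data.frames.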

    split(myDataALL, myDataALL$marks >= 30 & myDataALL$marks <= 50)
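
If the goal is to keep everything in one data.frame rather than in a list, one possible sketch (an assumption about what is wanted, not something stated in the question) is to store the range as an extra column. The column name `marks_group` is made up for this example; the sample data is repeated so the block runs on its own.

```r
set.seed(1)
myData1 <- data.frame(id = 1:20,
                      name = paste("a", 1:20, sep = ""),
                      marks = sample(20:100, 20, replace = TRUE),
                      gender = sample(c("F", "M"), 20, replace = TRUE))
myData2 <- data.frame(id = 1:17,
                      name = paste("b", 1:17, sep = ""),
                      marks = sample(30:100, 17, replace = TRUE),
                      gender = sample(c("F", "M"), 17, replace = TRUE))
myDataALL <- rbind(myData1, myData2)

# Hypothetical column name: marks_group
myDataALL$marks_group <- cut(myDataALL$marks,
                             breaks = c(0, 30, 50, 75, 100),
                             include.lowest = TRUE)

# Sort so that students in the same range sit next to each other
myDataALL[order(myDataALL$marks_group, myDataALL$marks), ]

# Number of students per range
table(myDataALL$marks_group)
```

Keeping the range as a column means the data can still be passed to split later, for example with split(myDataALL, myDataALL$marks_group).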
    


+4

The grouping can also be done by hand, in three steps:

Step 1: Determine the ranges.
Step 2: Count the elements that fall into each range.
Step 3: Plot the counts.

Sample code is shown below:

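   # `all` is the answerer's data frame; `downlink` is its numeric column
   # Step 1: bin edges every 2000, extended one step past the maximum
   # so that the largest values are counted too
   breaks <- seq(0, max(all$downlink) + 2000, by = 2000)
   # Step 2: count the values in each half-open bin [lower, upper)
   counts <- as.vector(table(cut(all$downlink, breaks, right = FALSE)))
   # Step 3: plot, rounding the y-axis limit up to the next multiple of 1000
   ymax <- ceiling(max(counts) / 1000) * 1000
   barplot(counts, col = rainbow(16), ylim = c(0, ymax))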
+1