Vectorization for a loop over a data frame in R

I have a data frame with three columns: ref, target, distance. Each ref has a measured distance to the same set of targets, and I would like to get a vector of minimum distances for each ref. Right now I'm doing this with a for loop, but it seems like there should be a way to vectorize it.

Here is my code:

refs <- levels(data$ref)

result <- c()
for (ref in refs) {
    # Find the minimum distance for observations with the current ref
    # but be sure to protect against ref == target!
    best_dist <- min(data[data$ref == ref & data$target != ref,]$distance)
    result <- c(result, best_dist)
}

Am I stuck with a loop given how this data frame is set up, or is there a good way to vectorize it? Thanks for the help!

1 answer

Never grow an object inside a loop with c, cbind, or rbind. The object is copied on every iteration. Instead, preallocate it at the correct size (or overestimate a little if the final size is not known in advance).
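A minimal sketch of what preallocation could look like for the loop in the question. The toy data frame below is hypothetical, made up only so the snippet runs on its own; the real `data` will differ.

```r
# Hypothetical toy data: every ref has a distance to every target (itself included).
data <- data.frame(
  ref      = factor(rep(c("a", "b", "c"), each = 3)),
  target   = factor(rep(c("a", "b", "c"), times = 3)),
  distance = c(0, 2, 5,
               2, 0, 1,
               5, 1, 0)
)

refs <- levels(data$ref)

# Allocate the full result vector once instead of growing it with c().
result <- numeric(length(refs))
names(result) <- refs

for (i in seq_along(refs)) {
  # Keep rows for the current ref, excluding the row where target is the ref itself.
  # Comparing as character avoids problems when the two factors have different levels.
  keep <- as.character(data$ref) == refs[i] &
          as.character(data$target) != refs[i]
  result[i] <- min(data$distance[keep])
}

result
```

Assigning into `result[i]` fills a slot in a vector that already has its final length, rather than building a new, longer vector on each pass as `c(result, best_dist)` does.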

That said, no loop is needed here. A simple data.table solution:

 library(data.table)
 DT <- data.table(data)


 DT[ref != target, list(bestdist = min(distance)), by = ref] 

If ref and target are factors with different levels, compare them as character instead:

 DT[as.character(ref) != as.character(target),  list(bestdist = min(distance)), by = ref] 
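
For comparison, the same grouped minimum can be sketched in base R with `tapply`. This is an alternative to the data.table call above, not part of it, and the toy data frame is hypothetical, included only so the snippet is self-contained.

```r
# Hypothetical toy data: every ref has a distance to every target (itself included).
data <- data.frame(
  ref      = factor(rep(c("a", "b", "c"), each = 3)),
  target   = factor(rep(c("a", "b", "c"), times = 3)),
  distance = c(0, 2, 5,
               2, 0, 1,
               5, 1, 0)
)

# Drop rows where ref equals target; compare as character in case
# the two factors do not share the same levels.
keep <- as.character(data$ref) != as.character(data$target)

# tapply splits the distances by ref and applies min to each group,
# returning one minimum per ref, named by the ref level.
best <- tapply(data$distance[keep], data$ref[keep], min)

best
```

The result is named by ref, so `best[["b"]]` looks up the minimum distance for a single ref, and `as.vector(best)` gives the plain vector the question asks for.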