Delete duplicate tuples after sorting a tuple in R

Question


I have a question about removing duplicate tuples in R once the elements of each tuple have been sorted.

Say I have a matrix of values:

df<-cbind(c(1,2,7,8,5,1),c(5,6,3,4,1,8),c(1.2,1,-.5,5,1.2,1))

and two columns taken from it, a and b:

a=df[,1]
b=df[,2]
temp<-cbind(a,b)

What I want is the set of unique rows based on the sorted tuple. For example, I want to keep a = 1,2,7,8,1 and b = 5,6,3,4,8, with the records a[5] and b[5] deleted. This is mainly for determining the interaction between two objects: 1 vs 5, 2 vs 6, etc. But 5 vs 1 is the same as 1 vs 5, so I want to remove it.

The route I started down was as follows. I wrote a function that sorts the elements of each row, and collected the results back into a matrix like so:

sortme<-function(i){sort(temp[i,])}
sorted<-t(sapply(1:nrow(temp),sortme))

and got the following results

     a b
[1,] 1 5
[2,] 2 6
[3,] 3 7
[4,] 4 8
[5,] 1 5
[6,] 1 8

Then I took the unique rows of the sorted result

unique(sorted)

which gives

     a b
[1,] 1 5
[2,] 2 6
[3,] 3 7
[4,] 4 8
[5,] 1 8

I can also use !duplicated to get a TRUE/FALSE vector and use it to subset the original data:

T_F<-!duplicated(sorted)
final_df<-df[T_F,]



Tags: r, duplicates, tuples

doggysaywhat 12 '12 1:45


Instead of sortme with sapply, you could use sort with apply:

sorted <- t(apply(df[, 1:2], 1, sort))
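The apply suggestion can be combined with the !duplicated step from the question into one short script. This is a minimal sketch, not code from the thread: the names sorted_vec, keep and final_df are illustrative, and the pmin()/pmax() variant is an alternative that assumes the tuple has exactly two columns.

```r
# Example data from the question: columns 1 and 2 form the pair,
# column 3 is a value attached to that pair.
df <- cbind(c(1, 2, 7, 8, 5, 1),
            c(5, 6, 3, 4, 1, 8),
            c(1.2, 1, -0.5, 5, 1.2, 1))

# Sort each pair so that (5, 1) and (1, 5) become the same row.
sorted <- t(apply(df[, 1:2], 1, sort))

# For exactly two columns, pmin()/pmax() build the same sorted pairs
# without calling a function once per row (illustrative alternative).
sorted_vec <- cbind(pmin(df[, 1], df[, 2]), pmax(df[, 1], df[, 2]))
stopifnot(all(sorted == sorted_vec))

# Keep the first occurrence of each unordered pair in the original data.
keep <- !duplicated(sorted)
final_df <- df[keep, ]
print(final_df)
```

With the question's data this drops row 5 (the 5 vs 1 pair) and keeps the other five rows in their original, unsorted form.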


Tyler Rinker 12 '12 2:11

BenBarnes · Accepted Answer · 2012-05-12T21:09:32+0000

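Two rows can only be duplicates of each other after sorting if they have the same row sum, so one option is to sort only the rows that share a sum with another row and leave the rest as they are: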

theSums<-.rowSums(temp,m=nrow(temp),n=ncol(temp))

almostSorted <- do.call(rbind, tapply(seq_len(nrow(temp)), theSums,
  function(x) {
    if(length(x) == 1L) {
      return(cbind(x, temp[x, , drop = FALSE]))
    } else {
      return(cbind(x, t(apply(temp[x, ], 1, sort))))
    }
  }
))

(sorted <- almostSorted[order(almostSorted[, 1]), -1])

[1,] 1 5
[2,] 2 6
[3,] 7 3
[4,] 8 4
[5,] 1 5
[6,] 1 8
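The output above still contains the repeated pair in row 5, so the !duplicated step from the question is needed to finish. The sketch below is a simplified variant of the same row-sum idea, using rowSums() and logical indexing instead of tapply; it is not the answer's code, and the names the_sums, shared and partly_sorted are illustrative.

```r
# Pair columns from the question (the third column of df is omitted here).
temp <- cbind(a = c(1, 2, 7, 8, 5, 1),
              b = c(5, 6, 3, 4, 1, 8))

# Rows that are duplicates after sorting must have equal row sums,
# so only rows whose sum occurs more than once need sorting.
the_sums <- rowSums(temp)
shared <- the_sums %in% the_sums[duplicated(the_sums)]

# Sort just those candidate rows; leave all other rows untouched.
partly_sorted <- temp
partly_sorted[shared, ] <- t(apply(temp[shared, , drop = FALSE], 1, sort))

# Flag rows that repeat an earlier unordered pair, then drop them.
keep <- !duplicated(partly_sorted)
print(temp[keep, ])
```

Rows with a unique sum cannot match any other row, which is why they can stay unsorted. Whether skipping them saves time over sorting every row depends on how many rows share a sum, so it is worth timing both on real data.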
