Labeling contiguous chunks of observations without a loop

I have a standard "can-I-avoid-a-loop" problem, but cannot find a solution.

I answered this question by @splaisan, but I had to resort to some ugly contortions in the middle part, with a for loop and multiple if tests. I am simulating a simpler version here, hoping that someone can give a better answer ...

PROBLEM

For a data structure like this:

df <- read.table(text = 'type
a
a
a
b
b
c
c
c
c
d
e', header = TRUE)

I want to identify contiguous chunks of the same type and label them in groups. The first chunk should be labeled 0, the next 1, and so on. There is an indefinite number of chunks, and each chunk may be as short as a single member.

type    label
   a    0
   a    0
   a    0
   b    1
   b    1
   c    2
   c    2
   c    2
   c    2
   d    3
   e    4

MY SOLUTION

I had to resort to a for loop to do this, here is the code:

label <- 0
df$label <- label

# LOOP through the label column and increment the label
# whenever a new type is found
# seq_len(...)[-1] is empty for a one-row data frame, unlike 2:1
for (i in seq_len(nrow(df))[-1]) {
    if (df$type[i-1] != df$type[i]) { label <- label + 1 }
    df$label[i] <- label
}

MY QUESTION

Is there a way to do this without a loop?


You can use rle:

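# as.character works whether type is a factor or a character column
# (read.table returns character by default since R 4.0)
r <- rle(as.character(df$type))
df$label <- rep(seq_along(r$lengths) - 1, times = r$lengths)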
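
To see why this works, it can help to look at what rle returns for the example data. A minimal sketch, assuming the type column is rebuilt as a plain character vector (the names type and label are used here only for illustration):

```r
# Hypothetical standalone copy of the question's type column
type <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "e")

r <- rle(type)
r$lengths  # run lengths: 3 2 4 1 1
r$values   # run values: "a" "b" "c" "d" "e"

# One label per run, counted from 0, repeated once per member of the run
label <- rep(seq_along(r$lengths) - 1, times = r$lengths)
label      # 0 0 0 1 1 2 2 2 2 3 4
```

Because rle only merges adjacent equal values, a type that shows up again later in the vector starts a new run and gets a new label.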

Instead of rle, cumsum on a logical vector also works.

df$label <- c(0,cumsum(df$type[-1] != df$type[-length(df$type)]))

Result:

> df
   type label
1     a     0
2     a     0
3     a     0
4     b     1
5     b     1
6     c     2
7     c     2
8     c     2
9     c     2
10    d     3
11    e     4
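
The cumsum one-liner can be unpacked into its two steps. A sketch on a standalone character vector (the names type, changed and label are illustrative and not part of the answer):

```r
type <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "e")

# Compare each element (from the second on) with its predecessor;
# TRUE marks the first member of a new chunk
changed <- type[-1] != type[-length(type)]

# cumsum counts the TRUEs seen so far; the leading 0 labels the first chunk
label <- c(0, cumsum(changed))
label  # 0 0 0 1 1 2 2 2 2 3 4
```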

If type is a factor (what read.table produced by default before R 4.0), this is enough:

as.numeric(df[, 1])-1

It just occurred to me: you can simply convert to a factor, then to integer, and subtract one:

as.integer(as.factor(df$type))-1

If type is already a factor, you can skip the as.factor step.
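
One caveat, as far as I can tell: the factor conversion labels by type rather than by chunk, so it matches the loop only when every type occurs in a single chunk and the factor levels follow the order of appearance. A small sketch with a made-up vector x in which a recurs (x, by_factor and by_cumsum are hypothetical names):

```r
x <- c("a", "a", "b", "a")

# Factor codes depend only on the value, so the last "a" gets label 0 again
by_factor <- as.integer(as.factor(x)) - 1
by_factor  # 0 0 1 0

# The cumsum version starts a new label at every change of value
by_cumsum <- c(0, cumsum(x[-1] != x[-length(x)]))
by_cumsum  # 0 0 1 2
```

For the data in the question both give the same labels, since each type appears in exactly one chunk and in alphabetical order.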
