Sequence Number Recognition

This question builds on an earlier one:

Identification of sequences of duplicate numbers in R

I used the answers to that question to identify runs in my data without any problem. However, I get stuck when it comes to identifying sequences made up of different numbers, for example a sequence such as 126,126.25 rather than a repeated number.

The code I use is the same as in the above question (rle)

sample data:

   d<-read.table(text='Date.Time Aerial
794  "2012-10-01 08:18:00"      1
795  "2012-10-01 08:34:00"      1
796  "2012-10-01 08:39:00"      1
797  "2012-10-01 08:42:00"      1
798  "2012-10-01 08:48:00"      1
799  "2012-10-01 08:54:00"      1
800  "2012-10-01 08:58:00"      1
801  "2012-10-01 09:04:00"      1
802  "2012-10-01 09:05:00"      1
803  "2012-10-01 09:11:00"      1
1576 "2012-10-01 09:17:00"      2
1577 "2012-10-01 09:18:00"      2
804  "2012-10-01 09:19:00"      1
805  "2012-10-01 09:20:00"      1
1580 "2012-10-01 09:21:00"      2
1581 "2012-10-01 09:23:00"      2
806  "2012-10-01 09:25:00"      1
807  "2012-10-01 09:32:00"      1
808  "2012-10-01 09:37:00"      1
809  "2012-10-01 09:43:00"      1', header=TRUE, stringsAsFactors=FALSE, row.names=1)

The following code flags runs in which the same number is repeated at least 4 times:

tmp <- rle(d$Aerial)
d$newCol <- rep(tmp$lengths>=4, times = tmp$lengths)

However, I do not know how to identify a sequence that contains different numbers, for example the sequence 1,2,2,1 (as in d$Aerial, starting at "2012-10-01 09:11:00").
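To see why rle alone does not cover this case, it helps to look at what it returns for the sample data. This is only a minimal sketch: the vector `aerial` is an illustrative copy of d$Aerial, typed in by hand in the order the rows are listed above.

```r
# Illustrative copy of d$Aerial from the sample data (rows in listed order).
aerial <- c(rep(1, 10), 2, 2, 1, 1, 2, 2, rep(1, 4))

runs <- rle(aerial)
runs$values   # 1 2 1 2 1   (the value of each run)
runs$lengths  # 10 2 2 2 4  (how many times each value repeats)

# Flag rows that belong to a run of at least 4 identical values.
in_long_run <- rep(runs$lengths >= 4, times = runs$lengths)
```

The mixed sequence 1,2,2,1 is spread over three separate runs, so a test on run lengths never sees it as a single unit.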



If the pattern is known in advance, one way is to compare every window of that length against it:

pat <- c(1,2,2,1)
# seq_len(n - k + 1) covers every window, including the last one
x <- sapply(seq_len(nrow(d)-length(pat)+1), function(i) all(d$Aerial[i:(i+length(pat)-1)] == pat))

d[which(x),]  # "which" prevents recycling of the shorter vector "x"
##               Date.Time Aerial
## 803 2012-10-01 09:11:00      1
## 805 2012-10-01 09:20:00      1

Alternatively, using rollapply from the zoo package:

library(zoo)
x <- rollapply(d$Aerial, length(pat), FUN=function(x) all(x == pat))

d[which(x),]
##               Date.Time Aerial
## 803 2012-10-01 09:11:00      1
## 805 2012-10-01 09:20:00      1

To get the rows where the pattern ends (rather than where it starts), shift the index:

d[which(x)+length(pat)-1,]
##               Date.Time Aerial
## 804 2012-10-01 09:19:00      1
## 806 2012-10-01 09:25:00      1
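The same search can be wrapped in a small reusable function. The sketch below is one possible way to do it, using base R's embed() to build all windows at once; the name `find_pattern` and the vector `aerial` (a hand-typed copy of d$Aerial) are made up for illustration and do not come from the answer.

```r
# Hypothetical helper: start positions of every occurrence of `pat` in `x`.
find_pattern <- function(x, pat) {
  k <- length(pat)
  if (k == 0 || length(x) < k) return(integer(0))
  # embed() returns one row per window with the columns in reverse order,
  # so reverse the columns to restore the original order.
  windows <- embed(x, k)[, k:1, drop = FALSE]
  # Compare each window (row) with the pattern.
  which(apply(windows, 1, function(w) all(w == pat)))
}

aerial <- c(rep(1, 10), 2, 2, 1, 1, 2, 2, rep(1, 4))  # same values as d$Aerial
starts <- find_pattern(aerial, c(1, 2, 2, 1))
starts            # 10 14, the positions where each match starts
starts + 4 - 1    # 13 17, the positions where each match ends
```

Because it returns positions rather than a logical vector, the result can be used directly as d[starts, ] without the which() step.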

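If the pattern is not known in advance (only its length), you can collect every window of that length and keep the ones that occur more than once: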

pattern_length <- 4
patterns <- list()
# seq_len(n - k + 1) covers every window, including the last one
for (i in seq_len(nrow(d) - pattern_length + 1)) {
  patterns[[i]] <- d$Aerial[i:(i + pattern_length - 1)]
}
unique(patterns[duplicated(patterns)])

[[1]]
[1] 1 1 1 1

[[2]]
[1] 1 1 2 2

[[3]]
[1] 1 2 2 1

[[4]]
[1] 2 2 1 1
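If it is also useful to know how often each window occurs and where, one option is to turn each window into a string key and tabulate the keys. This is a sketch under the assumption that the values themselves contain no commas; `aerial`, `windows`, `keys` and `counts` are illustrative names, and `aerial` is a hand-typed copy of d$Aerial.

```r
aerial <- c(rep(1, 10), 2, 2, 1, 1, 2, 2, rep(1, 4))  # same values as d$Aerial
pattern_length <- 4

# One row per window of length pattern_length, columns in original order.
windows <- embed(aerial, pattern_length)[, pattern_length:1, drop = FALSE]
keys <- apply(windows, 1, paste, collapse = ",")

counts <- table(keys)
counts[counts > 1]                          # windows seen more than once
split(seq_along(keys), keys)[["1,2,2,1"]]   # start positions: 10 14
```

Note that the windows overlap, so a long run such as ten 1s in a row contributes several "1,1,1,1" windows to the count.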
