I built an outlier detection function and it worked quite well, but given the huge amount of data I am working with, I needed to remove the "for" loop, so here is a vectorized version (or at least what I think is a vectorized version of my code). When calling the function, the user supplies the following parameters; I work with these values:
alpha=3
gamma=0.5
k=5
The series "price" exists in the workspace and is passed in when the function is called. I think I am almost there, but I have one problem.
Here is the code snippet:
[n] = size(price,1);
x = price;
[j1]=find(x); %output is a column vector with size (n,1) of the following form j1=[1:1:n]
matrix_left=zeros(n, k,'double');
matrix_right=zeros(n, k,'double');
toc
matrix_left(j1(k+1:end),:)=x(j1-k:j1-1);
% Here it returns the following error: Subscript indices must either be real positive integers or logicals.
matrix_right(j1(1:end-k),:)=x(j1+1:j1+k);
matrix_group=[matrix_left matrix_right];
trimmed_mean=trimmean(matrix_group,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(matrix_group,0,2); % row-wise standard deviation: weight flag 0, dimension 2
temp = abs(score) > (alpha .* sd + gamma);
outmat = temp*1;
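As far as I understand the colon operator, when its operands are vectors only their first elements are used, so `j1-k:j1-1` behaves like `(j1(1)-k):(j1(1)-1)`. With `j1(1) = 1` that range starts below 1, which would explain the error. A minimal sketch of this, using assumed toy data rather than the real price series:

```matlab
% Toy data standing in for the price series (assumed values).
x  = (101:110)';     % n = 10 observations
k  = 3;
j1 = find(x);        % [1; 2; ...; 10], because no element of x is zero

% Range that results when only the first elements of j1-k and j1-1 are used.
idx = (j1(1)-k):(j1(1)-1);   % evaluates to [-2 -1 0]

% Any index below 1 is not a valid subscript, hence the error.
has_invalid_index = any(idx < 1);   % true
```

Here `idx` and `has_invalid_index` are names made up for the illustration; they are not part of my function.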
For clarity, this is the result I expect:
k = 5
matrix_left (3443,5):
[100.25 103.5 102.25 102.75 103] <---5 left neighbouring observations of the 15th row of **x**
[103.5 102.25 102.75 103 103.5] <---5 left neighbouring observations of the 16th row of **x**
matrix_right (3443,5):
[103.75 104.25 104 104.75 104.25] <---5 right neighbouring observations of the 15th row of **x**
[104.25 104 104.75 104.25 104.5] <---5 right neighbouring observations of the 16th row of **x**
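The neighbour matrices described above might be built without a loop by forming a matrix of indices first and then indexing `x` with it. This is only a sketch under assumptions: the twelve values listed in this post are treated as if they were rows 1 to 12 of `x`, and `rows_left` / `rows_right` are hypothetical helper names.

```matlab
% The twelve values shown in this post, used here as a toy series.
x = [100.25; 103.5; 102.25; 102.75; 103; 103.5; ...
     103.75; 104.25; 104; 104.75; 104.25; 104.5];
k = 5;
n = size(x,1);

matrix_left  = zeros(n, k);
matrix_right = zeros(n, k);

% Rows that have k observations before them, and their index matrix:
% row i of the index matrix is [i-k, ..., i-1].
rows_left = (k+1:n)';
matrix_left(rows_left,:) = x(bsxfun(@plus, rows_left, -k:-1));

% Rows that have k observations after them:
% row i of the index matrix is [i+1, ..., i+k].
rows_right = (1:n-k)';
matrix_right(rows_right,:) = x(bsxfun(@plus, rows_right, 1:k));
```

In this toy series row 6 plays the role of the 15th row of the real data, so `matrix_left(6,:)` and `matrix_right(6,:)` should match the first rows shown above. Rows without enough neighbours on one side are left as zeros, as in my preallocation.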
Input data:
x = price; size of price = (3443, 1)
[...]
100.25 % 10th row of x
103.5
102.25
102.75
103
103.5 % 15th row of x
103.75
104.25
104
104.75
104.25
104.5
[...]
Time (3443,1)
j1 (3443,1)
1
2
[...]
3442
3443
Giorgio