Optimize MATLAB code (nested for loop computing a similarity matrix)

I am calculating a similarity matrix based on Euclidean distance in MATLAB. My code is as follows:

for i=1:N % x is M-by-N; similarity is computed between its N columns
 for j=1:N
  D(i,j) = sqrt(sum((x(:,i)-x(:,j)).^2)); % D is the similarity matrix
 end
end

Can anyone help me optimize this, i.e. get rid of the for loops? My matrix x has dimensions 256x30000.

Thank you so much!

- Aditya

+2
4 answers

The MATLAB function for this is called pdist. Unfortunately, it is very slow and does not take advantage of MATLAB's vectorization capabilities.

Below is the code I wrote for the project. Let me know what speed you get.

   Qx=repmat(dot(x,x,2),1,size(x,1));
   D=sqrt(Qx+Qx'-2*x*x');

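Note that this treats the rows of x as data points and the columns as dimensions. For example, with 256 data points and 100000 dimensions, x = rand(256,100000) on my Mac yields a 256x256 matrix.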
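The vectorized code above relies on the identity ||a-b||^2 = ||a||^2 + ||b||^2 - 2*a'*b. A minimal sketch of the same idea for the layout in the question, where x is M-by-N with one point per column, might look like this. The small random x and the max(..., 0) clamp are assumptions added here for illustration; the clamp guards against slightly negative values caused by rounding and is not part of the answer above.

```matlab
% Sketch: pairwise Euclidean distances between the columns of x (M-by-N).
x  = rand(4, 6);                      % small illustrative data, not the real 256x30000 case
sq = sum(x.^2, 1);                    % 1-by-N row of squared column norms
G  = x' * x;                          % N-by-N matrix of dot products between columns
D2 = bsxfun(@plus, sq', sq) - 2*G;    % squared distances via the identity
D  = sqrt(max(D2, 0));                % clamp rounding noise below zero (assumption)
```

Keep in mind that for N = 30000 the full matrix D in double precision needs about 7.2 GB, so memory may become the limiting factor before speed does.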

+5

Two observations: the matrix is symmetric, i.e. D(i,j)==D(j,i), and each entry can be written as

norm(x(:,i)-x(:,j),2)

+1

Here is a vectorized version without loops:

D=zeros(N);
jIndx=repmat(1:N,N,1); % column index j of every (i,j) pair
iIndx=jIndx';          % row index i of every (i,j) pair
D(:)=sqrt(sum((x(iIndx(:),:)-x(jIndx(:),:)).^2,2));

Here x is NxM, where M is the number of dimensions and N is the number of points.

+1

For starters, you are calculating twice as much as you need to, because D is symmetric. You do not need to compute entry (i,j) and entry (j,i) separately. Change your inner loop to for j=1:i and add D(j,i)=D(i,j); to the body of that loop.
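A minimal sketch of that symmetric loop, assuming the column layout from the question. The preallocation with zeros and the choice to skip the diagonal (which is always zero) are small additions here, and the random x is only illustrative data.

```matlab
% Sketch: compute each pair once and mirror it across the diagonal.
x = rand(4, 6);                      % illustrative data; columns are points
N = size(x, 2);
D = zeros(N);                        % preallocate; the diagonal stays zero
for i = 2:N
    for j = 1:i-1
        d = norm(x(:,i) - x(:,j));   % Euclidean distance between points i and j
        D(i,j) = d;
        D(j,i) = d;                  % symmetric entry
    end
end
```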

After that, there is not much redundancy left in this code, so your only remaining room for improvement is to parallelize it: if you have the Parallel Computing Toolbox, convert your outer loop to parfor and, before running it, call, say, matlabpool(n), where n is the number of threads to use.
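A rough sketch of the parfor variant, again assuming the column layout from the question and a small illustrative x. Each iteration writes a whole row D(i,:), because parfor needs the output to be sliced by the loop variable; this means the symmetric shortcut is not used here. In newer MATLAB releases matlabpool has been replaced by parpool, so the pool setup call depends on your version.

```matlab
% Sketch: parallelize over rows of D (runs in parallel only with the
% Parallel Computing Toolbox and an open worker pool).
x = rand(4, 6);                            % illustrative data; columns are points
N = size(x, 2);
D = zeros(N);
parfor i = 1:N
    diffs  = bsxfun(@minus, x, x(:,i));    % M-by-N: every column minus column i
    D(i,:) = sqrt(sum(diffs.^2, 1));       % row i of the distance matrix
end
```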

0