Maintaining gpuArray data between CUDA kernel launches in MATLAB

I am using parallel.gpu.CUDAKernel to run CUDA kernels in MATLAB R2011a. I designed my code so that the same gpuArray is filled by successive kernel launches in a loop, with each launch writing only to its own unique segment of the gpuArray.

By the end of execution, the entire array should be full. However, when I transfer the memory back to the host using the gather() command, only the memory written by the last kernel launch is populated; everything else is empty. This is also true if I exit the loop somewhere in the middle.

I checked that this is indeed the case by passing a flag that indicates the kernel's iteration: on anything other than the first iteration, the kernel does nothing. However, the locations written by the first kernel are still empty, even though the subsequent kernels do nothing! This is not the case if I exit the loop immediately after launching the first kernel.

Thus, it seems to me that MATLAB reloads the gpuArray between kernel launches. Is there any way to prevent this?

1 answer

This should work provided you capture the output of the feval call. Consider a trivial kernel like this:

__global__ void setOneEl( double * array, double val, int element ) {
    array[element] = val;
}

Then, executing the following code in MATLAB does what I assume you are after:

>> k = parallel.gpu.CUDAKernel('kern.ptx');
>> g = parallel.gpu.GPUArray.zeros(1,10);
>> for ii = 1:2:10, g = k.feval(g, rand, ii); end
>> gather(g)
ans =
         0    0.0975         0    0.2785         0    0.5469         0    0.9575         0    0.9649

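In MATLAB, a gpuArray behaves like any other variable: when a gpuArray is passed to a function, the caller's copy is not modified in place from MATLAB's point of view. CUDAKernel.feval is no exception, so you must capture its output to keep what the kernel wrote.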
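
A minimal sketch of the difference between capturing the value returned by feval in the same variable and storing it somewhere else. It assumes kern.ptx was built from the setOneEl kernel above and is loaded exactly as in the session above. The variable name out and the "pattern A" loop are hypothetical, shown only as one way the symptom in the question could arise; they are not taken from the asker's code. A fixed value of 1 replaces rand so the result is easy to read.

```matlab
% Sketch only: assumes kern.ptx was built from the setOneEl kernel above.
k = parallel.gpu.CUDAKernel('kern.ptx');
g = parallel.gpu.GPUArray.zeros(1,10);

% Pattern A (hypothetical): the result goes to a second variable, so every
% launch starts again from the untouched, all-zero g.
for ii = 1:2:10
    out = k.feval(g, 1, ii);
end

% Pattern B: the result is assigned back to g, so each launch starts
% from the array produced by the previous one.
for ii = 1:2:10
    g = k.feval(g, 1, ii);
end

nnz(gather(out))  % number of nonzero elements kept by pattern A
nnz(gather(g))    % number of nonzero elements kept by pattern B
```

With pattern A one would expect only the element written by the final launch to survive, which matches the behaviour described in the question. With pattern B all five written elements should be present.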
