Persistent threads in OpenCL and CUDA

Question

I have read a few articles about "persistent threads" for GPGPU, but I do not really understand the idea. Can someone give me an example or show me how this programming technique is used?

What I have taken away from reading and searching about "persistent threads":

Persistent threads are nothing more than a while loop that keeps a thread running while it processes many pieces of work.

Is that right? Thanks in advance.

Links:
http://www.idav.ucdavis.edu/publications/print_pub?pub_id=1089
http://developer.download.nvidia.com/GTC/PDF/GTC2012/PresentationPDF/S0157-GTC2012-Persistent-Threads-Computing.pdf

+5

opencl gpu gpgpu cuda

Aminems Feb 11 '13 at 21:18

2 answers

. . , , . , , 1920x1080, 1920 , 1080 .

+2

Hunter Wang 20 . '14 7:45

JackOLantern · Accepted Answer · 2014-06-10T17:06:53+0000

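CUDA follows the Single Instruction Multiple Data (SIMD) programming model. Threads are organized in blocks, and each block is assigned to a Streaming Multiprocessor (SM). An SM executes the threads of a block in groups of 32: each group, called a warp, runs in lock-step, executing the same instruction on different data.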
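The SM count and warp size of a particular device can be read from the runtime API. The following is a minimal sketch, assuming device 0 is the GPU of interest; the helper name `print_sm_info` is made up for this illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints the SM-related properties of one device.
// Returns false if the device cannot be queried.
bool print_sm_info(int device)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                     cudaGetErrorString(err));
        return false;
    }
    std::printf("Device:             %s\n", prop.name);
    std::printf("SM count:           %d\n", prop.multiProcessorCount);
    std::printf("Warp size:          %d\n", prop.warpSize);
    std::printf("Max threads per SM: %d\n", prop.maxThreadsPerMultiProcessor);
    return true;
}

int main()
{
    // Device 0 is an assumption; use another index on multi-GPU systems.
    return print_sm_info(0) ? 0 : 1;
}
```

These numbers bound how many threads can be resident at once, which is the quantity a persistent-threads launch is sized against.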

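Typically, to fill up the GPU, a kernel is launched with many more blocks than the SMs can host at once. Since not all blocks fit on the SMs, a hardware work scheduler assigns a new block to an SM when a resident block finishes. Block scheduling is handled entirely in hardware, and the programmer has no way to influence how blocks are mapped to SMs. This is a limitation for algorithms that do not fit the SIMD model well and whose work is imbalanced. In particular, a block A is not replaced by another block B on the same SM until the last thread of A has finished.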

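Although CUDA does not expose the hardware scheduler to the programmer, the persistent threads style bypasses it by relying on a work queue. When a block finishes a piece of work, it checks the queue for more and keeps doing so until no work is left, at which point the block retires. In this way, the kernel is launched with only as many blocks as the available SMs can host.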

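The technique is illustrated by the following example, taken from a presentation on "GPGPU" computing and the CUDA/OpenCL programming model: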

// Persistent thread: runs until the work is done, processing multiple work items
// per thread rather than just one. Terminates when no more work is available.

// count is the number of data items to be processed

__global__  void persistent(int* ahead, int* bhead, int count, float* a, float* b)
{
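    int local_input_data_index;

    // read_and_increment stands for an atomic fetch-and-add on the queue head
    // (for example atomicAdd(ahead, 1)), so each index is claimed exactly once.
    while ((local_input_data_index = read_and_increment(ahead)) < count)
    {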
        load_locally(a[local_input_data_index]);

        do_work_with_locally_loaded_data();

        int out_index = read_and_increment(bhead);

        write_result(b[out_index]);
    }
}

// Launch exactly enough threads to fill up the machine (to achieve sufficient
// parallelism and latency hiding)
persistent<<<numBlocks, blockSize>>>(ahead_addr, bhead_addr, total_count, A, B);
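The kernel above leaves its helpers (read_and_increment, load_locally and so on) undefined. Below is a self-contained sketch of the same pattern, under several assumptions that are not in the original: the "work" is squaring a float, the queue head is a single int counter advanced with atomicAdd, each result is written back at its input index rather than through a second output counter (so the result is easy to verify), and the grid is sized with cudaOccupancyMaxActiveBlocksPerMultiprocessor. The names `persistent_square`, `run_persistent_demo` and `CHECK` are invented for this illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "CUDA error: %s (line %d)\n",        \
                         cudaGetErrorString(err_), __LINE__);         \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Persistent kernel: every thread keeps claiming indices from a global
// counter until all `count` items are taken. Squaring stands in for real work.
__global__ void persistent_square(int* head, int count, const float* in, float* out)
{
    int i;
    // atomicAdd returns the previous value, so each index is claimed once.
    while ((i = atomicAdd(head, 1)) < count) {
        out[i] = in[i] * in[i];
    }
}

// Runs the kernel on `count` items and returns true if every result matches.
bool run_persistent_demo(int count)
{
    const int blockSize = 128;  // assumed block size, not tuned

    // Size the grid to what device 0 can keep resident at once.
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    int blocksPerSM = 0;
    CHECK(cudaOccupancyMaxActiveBlocksPerMultiprocessor(
        &blocksPerSM, persistent_square, blockSize, 0));
    const int numBlocks = blocksPerSM * prop.multiProcessorCount;

    // Small integer inputs keep the float squares exact.
    std::vector<float> h_in(count), h_out(count, -1.0f);
    for (int i = 0; i < count; ++i) h_in[i] = static_cast<float>(i % 100);

    float* d_in = nullptr;
    float* d_out = nullptr;
    int* d_head = nullptr;
    CHECK(cudaMalloc(&d_in, count * sizeof(float)));
    CHECK(cudaMalloc(&d_out, count * sizeof(float)));
    CHECK(cudaMalloc(&d_head, sizeof(int)));
    CHECK(cudaMemcpy(d_in, h_in.data(), count * sizeof(float), cudaMemcpyHostToDevice));
    CHECK(cudaMemset(d_head, 0, sizeof(int)));  // queue head starts at item 0

    persistent_square<<<numBlocks, blockSize>>>(d_head, count, d_in, d_out);
    CHECK(cudaGetLastError());
    // This blocking copy also waits for the kernel to finish.
    CHECK(cudaMemcpy(h_out.data(), d_out, count * sizeof(float), cudaMemcpyDeviceToHost));

    bool ok = true;
    for (int i = 0; i < count; ++i) {
        if (h_out[i] != h_in[i] * h_in[i]) { ok = false; break; }
    }

    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    CHECK(cudaFree(d_head));
    return ok;
}

int main()
{
    const bool ok = run_persistent_demo(1 << 20);
    std::printf("%s\n", ok ? "all results correct" : "mismatch found");
    return ok ? 0 : 1;
}
```

Note that the grid size here depends only on the device and the kernel's resource usage, not on the number of items; the loop inside the kernel absorbs the difference. Every thread overshoots the counter once when it exits, so the item count plus the total thread count must stay below INT_MAX.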
