Debugging shared memory issue

I am trying to use Nsight to debug the following code:

__device__ void change(int shared[])
{
    if(threadIdx.x<10)
        shared[threadIdx.x]=threadIdx.x;
}
__global__ void MyK()
{
    int shared[10]; 
    change(shared);
    __syncthreads();
}

I call my kernel in the main method as follows:

cudaSetDevice(1);
MyK<<<1,20>>>();

When I set a breakpoint before the call to change(shared), I see that the shared array is created and its values are set to 0.
When the breakpoint is set after __syncthreads(), I get "cannot resolve common name" in the debugger.

Can't I pass my shared array to another device function?

3 answers

, , " , " , , , - ( ), user586831, .

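Note that "__shared__" is not the same as "shared". "int shared" is just a local array that happens to be named shared. To place it in shared memory you need the __shared__ qualifier, e.g. __shared__ int shared[10] (or extern __shared__ int shared[] with the size supplied at kernel launch).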
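
A minimal sketch of what the qualifier changes, assuming one block of 20 threads as in the question. The out parameter and the host helper runMyK are hypothetical additions (not in the original code) so that the result can be read back and checked; each of the first 10 threads deliberately reads a slot written by a different thread, which only works if the array really lives in shared memory.

```cuda
#include <cuda_runtime.h>

// Same helper as in the question: the first 10 threads each fill one slot.
__device__ void change(int shared[])
{
    if (threadIdx.x < 10)
        shared[threadIdx.x] = threadIdx.x;
}

// `out` is a hypothetical global-memory buffer added to make the writes observable.
__global__ void MyK(int *out)
{
    __shared__ int shared[10];   // one array per block, visible to all of its threads
    change(shared);              // a pointer to shared memory can be passed to a __device__ function
    __syncthreads();             // outside any branch, so all 20 threads reach it
    if (threadIdx.x < 10)
        out[threadIdx.x] = shared[9 - threadIdx.x];  // read a slot written by another thread
}

// Hypothetical host helper: launches the kernel and copies the 10 results back.
// Returns true if every CUDA call succeeded.
bool runMyK(int hostOut[10])
{
    int *devOut = nullptr;
    if (cudaMalloc(&devOut, 10 * sizeof(int)) != cudaSuccess)
        return false;
    MyK<<<1, 20>>>(devOut);
    bool ok = (cudaGetLastError() == cudaSuccess);
    ok = ok && (cudaMemcpy(hostOut, devOut, 10 * sizeof(int),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(devOut);
    return ok;
}
```

After a successful runMyK(h), h[i] should hold 9 - i, since thread i reads the slot written by thread 9 - i.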


__shared__ ?

, _ _device _ _ - . ​​ 16 32 , SP, .


Calling __syncthreads() from some but not all threads of a block can cause a deadlock, so never place it inside a branch such as if(threadIdx.x<10). (In the code as posted, the call sits in MyK outside the if, so all 20 threads do reach it.) As mentioned earlier, you are not using shared memory here. The compiler is smart: if you do not use the values afterwards, it may optimize the array away, and the debugger may then be unable to resolve it. Try returning the value from your device function so that it is actually used. It should then work fine, especially if you move or remove the __syncthreads().
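
One way to sketch that suggestion, assuming the array stays a per-thread local as in the question. The names changeAndReturn, MyKObservable, out, and runObservable are hypothetical; the point is only that the written value is returned and stored to global memory, so the compiler cannot treat the write as dead code. The array itself may still end up in registers rather than addressable memory.

```cuda
#include <cuda_runtime.h>

// Hypothetical variant of change(): returns the value it wrote, or -1 for threads 10..19.
__device__ int changeAndReturn(int local[])
{
    if (threadIdx.x < 10) {
        local[threadIdx.x] = threadIdx.x;
        return local[threadIdx.x];
    }
    return -1;
}

__global__ void MyKObservable(int *out)
{
    int local[10];                     // per-thread array, no __shared__ qualifier
    int v = changeAndReturn(local);
    out[threadIdx.x] = v;              // global store keeps the value live
    // No __syncthreads() here: nothing is exchanged between threads.
}

// Hypothetical host helper: launches 1 block of 20 threads and copies 20 ints back.
// Returns true if every CUDA call succeeded.
bool runObservable(int hostOut[20])
{
    int *devOut = nullptr;
    if (cudaMalloc(&devOut, 20 * sizeof(int)) != cudaSuccess)
        return false;
    MyKObservable<<<1, 20>>>(devOut);
    bool ok = (cudaGetLastError() == cudaSuccess);
    ok = ok && (cudaMemcpy(hostOut, devOut, 20 * sizeof(int),
                           cudaMemcpyDeviceToHost) == cudaSuccess);
    cudaFree(devOut);
    return ok;
}
```

With this layout, a breakpoint on the store to out should see v as a live variable, although what the debugger can display still depends on the optimization level used for the build.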
