CUDA: CUtil timer - confusion about elapsed time

While profiling my program, I noticed that at some point about 100 ms goes missing. I timed every operation, but no single operation accounts for that time. Then I noticed that wherever I place the first call to cudaThreadSynchronize, that call takes about 100 ms. I wrote the example below to isolate this. When cudaThreadSynchronize is called on the first line, the elapsed time printed at the end is less than 1 ms. If it is not called, the elapsed time averages 110 ms.

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>
#include <cutil.h>

int main(int argc, char **argv)
{
    cudaThreadSynchronize(); // Comment this out and the elapsed time becomes ~110 ms.

    unsigned int timer;
    cutCreateTimer(&timer);
    cutStartTimer(timer);

    float *data;
    CUDA_SAFE_CALL(cudaMalloc(&data, sizeof(float) * 1024));

    cutStopTimer(timer);
    printf("CUT Elapsed: %.3f ms\n", cutGetTimerValue(timer));

    cutDeleteTimer(timer);
    CUDA_SAFE_CALL(cudaFree(data)); // release the device buffer

    return EXIT_SUCCESS;
}

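If I understand correctly, cudaThreadSynchronize() is simply the first call into CUDA in your example. What happens if you put a different CUDA call first? I would expect cudaThreadSynchronize to be cheap when it is not the first call, and the delay to move to whichever call is.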

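The first CUDA call in a program initializes CUDA, and that takes roughly 70-100 ms. In your code that first call is cudaThreadSynchronize();, which runs before the timer starts, so the initialization is not measured. Every later call is fast. (If you comment out cudaThreadSynchronize();, cudaMalloc becomes the first call and the initialization cost lands inside the timed region.)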
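To see this for yourself, something like the following should work. It is only a minimal sketch: it uses std::chrono instead of the cutil timer, and the helpers check and timeMallocMs are names made up for this example, not part of CUDA. It also assumes cudaFree(0), a commonly used no-op call, is enough to trigger initialization on your setup. Run it with no arguments to warm up first, or with any argument to skip the warm-up and let the first cudaMalloc absorb the startup cost.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime call fails.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: host-side wall time of a single cudaMalloc, in ms.
static double timeMallocMs()
{
    float *data = nullptr;
    auto start = std::chrono::steady_clock::now();
    check(cudaMalloc(&data, sizeof(float) * 1024), "cudaMalloc");
    auto stop = std::chrono::steady_clock::now();
    check(cudaFree(data), "cudaFree"); // freed outside the timed region
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main(int argc, char **argv)
{
    // Any command-line argument skips the warm-up.
    bool warmUp = (argc < 2);
    if (warmUp) {
        // Assumption: this no-op call forces the runtime to initialize here,
        // before anything is timed.
        check(cudaFree(0), "cudaFree(0)");
    }

    printf("first  cudaMalloc: %.3f ms\n", timeMallocMs());
    printf("second cudaMalloc: %.3f ms\n", timeMallocMs());
    return EXIT_SUCCESS;
}
```

With the warm-up skipped, I would expect the first measurement to be much larger than the second. The exact numbers depend on the GPU, driver, and CUDA version, so treat the 70-100 ms figure as a ballpark. Note also that cudaThreadSynchronize is deprecated in current CUDA releases; cudaDeviceSynchronize is its replacement.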
