CUDA memory transfer at runtime

I know that CUDA kernels can be "overlapped" by putting them in separate streams, but I wonder whether memory can be transferred while a kernel is running. CUDA kernels are asynchronous, after all.

2 answers

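You can run kernels, host-to-device transfers, and device-to-host transfers all at the same time, as long as they are issued in different streams. The copies have to be made with cudaMemcpyAsync from page-locked (pinned) host memory, otherwise they will not overlap with kernel execution.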

http://developer.download.nvidia.com/CUDA/training/StreamsAndConcurrencyWebinar.pdf
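To make the idea concrete, here is a minimal sketch of the chunked, two-stream pattern. The kernel `scale`, the helper `runOverlapped`, and the buffer and chunk sizes are made up for illustration and are not taken from the slides. Whether the copies actually overlap with the kernels depends on the hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, for illustration only: doubles each element.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Splits the buffer into chunks and issues copy-in, kernel, copy-out for each
// chunk on alternating streams. Assumes n is divisible by chunks.
// Returns true if every element came back doubled.
bool runOverlapped(int n, int chunks) {
    const int chunk = n / chunks;
    const size_t bytes = (size_t)n * sizeof(float);
    float *h = nullptr, *d = nullptr;

    // Pinned host memory is needed for copies that are truly asynchronous.
    if (cudaMallocHost((void **)&h, bytes) != cudaSuccess) return false;
    if (cudaMalloc((void **)&d, bytes) != cudaSuccess) {
        cudaFreeHost(h);
        return false;
    }
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    cudaStream_t streams[2];
    for (int s = 0; s < 2; ++s) cudaStreamCreate(&streams[s]);

    for (int c = 0; c < chunks; ++c) {
        cudaStream_t st = streams[c % 2];
        size_t off = (size_t)c * chunk;
        size_t chunkBytes = (size_t)chunk * sizeof(float);
        // Work in one stream runs in order; work in different streams may overlap.
        cudaMemcpyAsync(d + off, h + off, chunkBytes, cudaMemcpyHostToDevice, st);
        scale<<<(chunk + 255) / 256, 256, 0, st>>>(d + off, chunk);
        cudaMemcpyAsync(h + off, d + off, chunkBytes, cudaMemcpyDeviceToHost, st);
    }

    // Wait for both streams, then verify the results on the host.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    for (int i = 0; ok && i < n; ++i) ok = (h[i] == 2.0f);

    for (int s = 0; s < 2; ++s) cudaStreamDestroy(streams[s]);
    cudaFree(d);
    cudaFreeHost(h);
    return ok;
}

int main() {
    bool ok = runOverlapped(1 << 20, 4);
    printf("%s\n", ok ? "results correct" : "failed");
    return ok ? 0 : 1;
}
```

Compile with nvcc. A profiler timeline such as the one in Nsight Systems is the usual way to see whether the copies and kernels really ran concurrently.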


Just to clarify, the above holds only if your device supports it. You can check by running deviceQuery and looking at the concurrent copy and execution attribute.
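If you would rather check from your own program than run the deviceQuery sample, something like the following sketch should work. It reads the async engine count through the runtime API. The helper name `copyEngines` is made up for this example.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns the number of asynchronous copy engines on the
// given device, or -1 if the query fails (for example, no such device).
int copyEngines(int device) {
    int engines = 0;
    if (cudaDeviceGetAttribute(&engines, cudaDevAttrAsyncEngineCount, device) != cudaSuccess)
        return -1;
    return engines;
}

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA device found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        int engines = copyEngines(dev);
        int concurrentKernels = 0;
        cudaDeviceGetAttribute(&concurrentKernels, cudaDevAttrConcurrentKernels, dev);
        // 0: copies cannot overlap with kernels.
        // 1: a copy in one direction can overlap with a kernel.
        // 2 or more: copies in both directions can overlap with a kernel.
        printf("device %d: async engines = %d, concurrent kernels = %d\n",
               dev, engines, concurrentKernels);
    }
    return 0;
}
```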

