I need the OpenCL kernel to iteratively update the buffer and return the results. To clarify:
- Send the source buffer to the kernel
- Kernel / worker updates every item in the buffer
- The host code reads the results, hopefully asynchronously, although I'm not sure how to do this without blocking the kernel.
- The kernel starts again, updating each element again, but the new value depends on the previous value.
- Repeat for some fixed number of iterations.
So far, I have been able to fake this by providing an input and output buffer, copying the output back to the input when the kernel finishes execution, and restarting the kernel. This seems like a huge waste of time and abuse of limited memory bandwidth, as the buffer is quite large (~ 1 GB).
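To make the pattern concrete, here is a rough host-side sketch of that workaround. It is plain Python standing in for the real Cloo calls, not my actual code: `update`, `run_kernel`, and `iterate_with_copy_back` are made-up names, and the per-element rule is only a placeholder.

```python
def update(x):
    # Placeholder per-element rule (assumption); the real kernel computes something else.
    return 0.5 * x + 1.0


def run_kernel(src, dst):
    # Stands in for one kernel launch: each work item reads src[i] and writes dst[i].
    for i in range(len(src)):
        dst[i] = update(src[i])


def iterate_with_copy_back(initial, iterations):
    src = list(initial)          # "input" buffer
    dst = [0.0] * len(src)       # "output" buffer
    snapshots = []
    for _ in range(iterations):
        run_kernel(src, dst)
        snapshots.append(list(dst))  # host reads the results of this pass
        src[:] = dst                 # full copy-back: the step I would like to avoid
    return snapshots


if __name__ == "__main__":
    for step, values in enumerate(iterate_with_copy_back([0.0, 2.0], 2)):
        print(step, values)
```

The `src[:] = dst` line is the part that costs a full pass over the ~1 GB buffer on every iteration.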
Any suggestions / examples? I am new to OpenCL, so this may have a very simple answer.
In case it matters, I am using Cloo / OpenCL.NET with an NVIDIA GTX 460 and two GTX 295s.