How can I compile a CUDA program for sm_1X and sm_2X when I have a surface declaration?

I am writing a library that uses a surface (to re-sample a texture and write to it) to improve performance:

...
surface<void,  2> my_surf2D; //allows writing to a texture
...

The target platform's GPU has compute capability 2.0, and I can compile my code with:

nvcc -arch=sm_20 ...

and it works great.

The problem is that I am trying to develop and debug the library on my laptop, which has an NVIDIA ION GPU with compute capability 1.1 (I would also like the library to be backward compatible). I know that this architecture does not support surfaces, so I used nvcc macros in my device code to define alternative code for the older architecture:

#if (__CUDA_ARCH__ < 200)
#warning using kernel for CUDA ARCH < 2.0
...
temp_array[...] =  tex3D(my_tex,X,Y,Z+0.5f);
#else
...
surf2Dwrite( tex3D(my_tex,X,Y,Z+0.5f), my_surf2D, ix*4, iy,cudaBoundaryModeTrap);
#endif

The problem is that when I do:

nvcc -gencode arch=compute_11,code=sm_11

I get this error:

ptxas PTX/myLibrary.ptx, line 1784; fatal  : Parsing error near '.surf': syntax error

Looking at the PTX file, I see what appears to be the surface declaration:

.surf .u32 _ZN16LIB_15my_surf2DE;

If I put a similar macro around the surface declaration in my source code:

#ifdef __CUDACC__
#if __CUDA_ARCH__ < 200
#warning skipping surface declaration for nvcc trajectory
#else
surface ...
#endif
#else
#warning keeping surface declaration by default
surface ...
#endif

I get an error saying that the surface is undefined in the host-side CUDA code. How can I make this compile for both architectures?


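Here is what worked.

__CUDA_ARCH__ is defined only during the device compilation pass (unlike __CUDACC__, which is defined in both the host and device passes), so each guard has to check that it is defined before comparing its value.

To support compute capability < 2.0:

Surface declaration: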

//enable backwards compatibility:
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#warning skipping surface declarations for compute capability < 2.0
#else
surface<void,  2> my_surf2D; //allows writing to a texture
#endif

Binding the surface in the host code:

#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#warning skipping cudaBindSurfaceToArray for compute capability < 2.0
...
#else
errorCode = cudaBindSurfaceToArray(my_surf2D, my_cudaArray2D);
#endif

Kernel:

#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#warning using kernel for compute capability < 2.0
...
temp_array[...] =  tex3D(my_tex,X,Y,Z+0.5f);
#else
...
surf2Dwrite( tex3D(my_tex,X,Y,Z+0.5f), my_surf2D, ix*4, iy,cudaBoundaryModeTrap);
#endif

With these guards in place, the code compiles for both targets (with either -arch=compute_XX or -arch=sm_XX).
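The three guards above repeat the same condition, so one option is to fold it into a single feature macro. The following is only a sketch of that idea: MYLIB_HAS_SURFACES and describe_surface_support are hypothetical names that are not part of the library or of CUDA. It uses nothing CUDA-specific, so it also builds with a plain C++ compiler, where __CUDA_ARCH__ is undefined just as it is in nvcc's host pass.

```cpp
// Sketch: centralise the __CUDA_ARCH__ test in one feature macro.
// MYLIB_HAS_SURFACES and describe_surface_support are hypothetical names.
#include <string>

// __CUDA_ARCH__ is defined only while nvcc compiles device code, so it is
// undefined in the host pass and under an ordinary C++ compiler.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#define MYLIB_HAS_SURFACES 0  // device pass for compute capability 1.x
#else
#define MYLIB_HAS_SURFACES 1  // host pass, or device pass for 2.0 and later
#endif

// Reports which code path the current compilation pass selected.
std::string describe_surface_support() {
#if MYLIB_HAS_SURFACES
    return "surface path enabled";
#else
    return "texture-only fallback";
#endif
}
```

The surface declaration, the cudaBindSurfaceToArray call and the surf2Dwrite branch could then each be wrapped in #if MYLIB_HAS_SURFACES instead of repeating the full condition.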

Thanks to talonmies for pointing me in the right direction; talonmies, you really know your nvcc/CUDA.
