thrust::sort takes a very long time to compile

I am trying to compile a block of sample code using Thrust to help learn some CUDA.

I am using Visual Studio 2010, and I have gotten other examples to compile. However, this example takes more than 10 minutes to compile. I selectively commented out lines and found that it is the thrust::sort line that takes forever (with that one line commented out, the build takes about 5 seconds).

I found a post somewhere saying that sort is slow to compile in Thrust, and that this was a deliberate trade-off by the Thrust development team (it is 3 times faster at runtime, but takes longer to compile). But that post was from the end of 2008.

Any idea why this took so long?

In addition, I am compiling on a machine with the following specifications, so it's not a slow machine:

i7-2600K @ 4.5 GHz
16 GB DDR3 @ 1833 MHz
RAID 0 of 6 Gb/s 1 TB drives

As requested, this is the build command line that Visual Studio appears to call:

C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\include" -G0 --keep-dir "Debug\" -maxrregcount=32 --machine 64 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MTd" -o "Debug\kernel.obj" "C:\Users\Rob\Desktop\VS2010Test\VS2010Test\VS2010Test\kernel.cpp" -clean

Example

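#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>  // thrust::generate
#include <thrust/sort.h>
#include <thrust/copy.h>      // thrust::copy
#include <cstdlib>            // rand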
int main(void)
{
    // generate 16M random numbers on the host
    thrust::host_vector<int> h_vec(1 << 24);
    thrust::generate(h_vec.begin(), h_vec.end(), rand);
    // transfer data to the device
    thrust::device_vector<int> d_vec = h_vec;
    // sort data on the device
    thrust::sort(d_vec.begin(), d_vec.end());
    // transfer data back to host
    thrust::copy(d_vec.begin(), d_vec.end(), h_vec.begin());
    return 0;
}
1 answer

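The compiler in CUDA 3.2 is slow on long, complex programs such as sort when device debugging is enabled (i.e. nvcc -G0). CUDA 4.0 should be much faster in this case. Removing the -G0 option should also reduce the compile time considerably.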
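
As an illustration of the -G0 suggestion, here is a trimmed version of the build line from the question with the device-debug flag removed. This is only a sketch: it assumes nvcc is on the PATH and is run from the project directory, and it drops the include path, --keep-dir, and -clean for brevity. The host debug flag -g and the options passed to the host compiler are left unchanged.

```shell
nvcc -maxrregcount=32 --machine 64 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MTd" -o "Debug\kernel.obj" "kernel.cpp"
```

Without -G0 the device code can no longer be stepped through in a GPU debugger, so this suits builds where only host-side debugging is needed.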
