I am working on a comparative study in which I have to compare the serial and parallel versions of an algorithm (the NSGA-II algorithm, to be exact; download link here). NSGA-II is a heuristic optimization method, so its results depend on the randomly generated initial population. If the initial populations generated on the CPU and on the GPU differ, I cannot carry out an unbiased speedup study.
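To make the requirement concrete, here is a minimal host-side sketch in plain C of what I mean by "the same initial population". The names `lcg_next`, `init_population` and `populations_identical` are hypothetical and not part of the NSGA-II code; the generator is a simple 32-bit LCG that I am only assuming for illustration. It uses integer arithmetic plus one exact scaling by a power of two, so the generated values should not depend on floating-point rounding behaviour.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: one step of a 32-bit linear congruential generator.
   Integer-only, so the sequence is fully determined by the seed. */
static uint32_t lcg_next(uint32_t *state) {
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Fill pop[0..n) with values in [0, 1).
   Only the top 24 bits are used: a 24-bit integer is exactly representable
   in a float, and scaling by 2^-24 is exact as well. */
static void init_population(float *pop, size_t n, uint32_t seed) {
    uint32_t state = seed;
    for (size_t i = 0; i < n; ++i) {
        pop[i] = (float)(lcg_next(&state) >> 8) * (1.0f / 16777216.0f);
    }
}

/* Return 1 if the two populations are bitwise identical, 0 otherwise. */
static int populations_identical(const float *a, const float *b, size_t n) {
    return memcmp(a, b, n * sizeof(float)) == 0;
}
```

The idea would be to fill one buffer for the serial run and one for the CUDA run from the same seed (or copy the host buffer to the device) and check them with `populations_identical` before timing anything. This only covers the starting population; it says nothing about whether later floating-point operations agree between the two versions.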
I have an NVIDIA Tesla C1060 card, which has compute capability 1.3. According to this announcement and this NVIDIA document, sm_13 devices cannot be expected to always produce IEEE-754-compliant single-precision floating-point results. In other words, on my current device I cannot carry out an unbiased speedup study of the CUDA program against its serial counterpart.
My question is: will switching to the Fermi architecture solve this problem?