I am a little confused about how CUDA works. Does every thread execute the same instruction (SIMT), each on separate data that it selects through its own index? Or does that count as "different data", so that this is also SIMD?
Is an SMX the whole GPU chip? Does an SMX consist of several SPs, each of which runs one thread at a time, and is a block of threads assigned to only one SP?
I'm a little confused right now
- . 1-3- . CUDA SMX. 1 SMX. > 10 SMX.
SMX 32 , . SMX 64 16 , . - (, , , ) .
SMX 4 , . 1 2 . . , , - .
warp warp. , , warp ( ), warp warp, .
, , . . , . , , . .
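To make the "same instruction, different index" part of the question concrete, here is a minimal sketch. It is an illustration, not anything taken from the posts above: the kernel name `add_one`, the helper `run_add_one`, the block size of 128 and the use of managed memory are all assumptions chosen for brevity. Every thread runs the same kernel code, and the only thing that differs between threads is the index `i` computed from `blockIdx` and `threadIdx`. The hardware groups the threads of a block into warps of 32, and a whole block is placed on one multiprocessor (SMX), not on a single core.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread executes this same code; only the index i differs per thread.
__global__ void add_one(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique global index
    if (i < n)                                      // guard for the last, partial block
        out[i] = in[i] + 1.0f;
}

// Hypothetical helper: runs the kernel on n elements and checks the result.
bool run_add_one(int n)
{
    float *in = nullptr, *out = nullptr;
    // Managed memory (an assumption here) keeps the sketch free of explicit copies.
    if (cudaMallocManaged(&in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMallocManaged(&out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(in);
        return false;
    }
    for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

    const int threads = 128;                         // 4 warps of 32 threads per block
    const int blocks = (n + threads - 1) / threads;  // enough blocks to cover n elements
    add_one<<<blocks, threads>>>(in, out, n);

    // Wait for the kernel to finish, then verify on the host.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);
    for (int i = 0; ok && i < n; ++i) ok = (out[i] == in[i] + 1.0f);

    cudaFree(in);
    cudaFree(out);
    return ok;
}

int main()
{
    bool ok = run_add_one(1000);
    printf("%s\n", ok ? "all elements correct" : "mismatch or CUDA error");
    return ok ? 0 : 1;
}
```

With `n = 1000` and 128 threads per block, the launch uses 8 blocks. The `if (i < n)` guard matters because the last block has 24 threads with no element to process.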