How to load one 32-bit floating point to all eight positions in the ymm AVX register?

Question

How do I load (broadcast) one 32-bit float into a 256-bit AVX ymm register so that all 8 floats come from the same source value?

With the 128-bit xmm registers I used the following to load one float into all 4 packed floats:

    movss    xmm7,[eax];
    shufps   xmm7,xmm7,0;

    add eax, 0x4;

c++ optimization avx

xzb667 May 19, '12 at 13:24

2 answers

Fanana · Answer 1 · 2012-05-19T14:15:21+0000

This operation is commonly called a "broadcast". AVX provides a family of instructions that do exactly this: vbroadcastf128, vbroadcastsd and vbroadcastss. Since you want to broadcast a single-precision floating-point value, you want the last one:

    vbroadcastss ymm7, [eax]
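
Since the question is tagged c++, the same broadcast can also be sketched with compiler intrinsics instead of hand-written assembly. This is only an illustration: the helper name `broadcast8` and the `AVX_TARGET` macro are made up for this example, and it assumes a GCC- or Clang-style compiler (or a build with AVX enabled) and a CPU that supports AVX.

```cpp
#include <immintrin.h>
#include <cassert>

// Hypothetical helper macro: lets GCC/Clang compile this one function with
// AVX enabled even when the whole file is not built with -mavx.
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: broadcast *src to all eight lanes of a ymm register
// and write the eight copies to out[0..7].
AVX_TARGET void broadcast8(const float* src, float* out) {
    assert(src != nullptr && out != nullptr);
    __m256 v = _mm256_broadcast_ss(src);  // memory-source broadcast (vbroadcastss)
    _mm256_storeu_ps(out, v);             // unaligned store of all 8 floats
}
```

`_mm256_set1_ps(*src)` is an alternative spelling that leaves the choice of instruction sequence to the compiler.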

Daniel Kamil Kozar · Answer 2 · 2012-06-02T00:28:22+0000

If the value is already in a register, here is a way to do it without a memory operand:

shufps      xmm0, xmm0, 0
vinsertf128 ymm0, ymm0, xmm0, 1

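This assumes the value to broadcast is in the lowest dword of xmm0. shufps with an immediate of 0 copies that lowest dword into every dword of the XMM register. vinsertf128 then copies the low xmmword of the YMM register into its high xmmword.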

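Note that in AVX the vbroadcast instructions accept only a memory source operand, so this sequence is useful when the value is already in a register; register-source forms of vbroadcastss were added later, in AVX2.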
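
For comparison, the two-instruction register sequence can be written with intrinsics roughly as follows. This is a sketch under assumptions, not the answer author's code: the function name `broadcast8_from_register` and the `AVX_TARGET` macro are invented for the example, and the compiler may well pick different instructions than the ones noted in the comments.

```cpp
#include <immintrin.h>
#include <cassert>

// Hypothetical helper macro: enables AVX for this function on GCC/Clang
// without requiring -mavx for the whole translation unit.
#if defined(__GNUC__) || defined(__clang__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: broadcast a float that is already in a register
// to eight lanes and write them to out[0..7].
AVX_TARGET void broadcast8_from_register(float value, float* out) {
    assert(out != nullptr);
    __m128 lo = _mm_set_ss(value);          // value in the lowest dword, upper dwords zero
    lo = _mm_shuffle_ps(lo, lo, 0);         // like shufps xmm, xmm, 0: copy dword 0 to all 4 lanes
    __m256 v = _mm256_castps128_ps256(lo);  // reuse the xmm as the low half of a ymm
    v = _mm256_insertf128_ps(v, lo, 1);     // like vinsertf128 ..., 1: copy into the high half
    _mm256_storeu_ps(out, v);               // unaligned store of all 8 floats
}
```

When the whole file is compiled with AVX enabled, compilers generally emit the VEX-encoded form of the shuffle (vshufps), which avoids mixing legacy SSE and AVX encodings in the same code path.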
