ARM NEON Vectorization Error

I would like to enable NEON vectorization on my ARM Cortex-A9, but I get this output when compiling:

"not vectorized: the corresponding stmt is not supported: D.14140_82 = D.14143_77 * D.14141_81"

Here is my loop:

void my_mul(float32_t * __restrict data1, float32_t * __restrict data2, float32_t * __restrict out){    
    for(int i=0; i<SIZE*4; i+=1){
        out[i] = data1[i]*data2[i];
    }
}

And the parameters used during compilation:

-march=armv7-a -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -ftree-vectorize -mvectorize-with-neon-quad -ftree-vectorizer-verbose=2

I am using the arm-linux-gnueabi compiler (v4.6).

It is important to note that the problem only occurs with float32 vectors. If I switch to int32, the loop is vectorized. Perhaps vectorization for float32 is not yet available...

Does anyone have any ideas? Have I forgotten something on the command line or in my implementation?

Thanks in advance for your help.

Guix


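From the GCC documentation on ARM options:

-mfpu=name

...

If the selected floating-point hardware includes the NEON extension (e.g. -mfpu='neon'), note that floating-point operations will not be used by GCC's auto-vectorization pass unless -funsafe-math-optimizations is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular, denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision.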
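To make the precision caveat concrete: the GCC documentation states that NEON treats denormal values as zero. The sketch below is portable C, not NEON code. The helper names is_subnormal and flush_to_zero_model are hypothetical, and the second one is only a simplified model of flush-to-zero behaviour, not a description of what the hardware does in every case.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if x is a denormal (subnormal) float,
 * i.e. non-zero but smaller in magnitude than FLT_MIN. */
static int is_subnormal(float x)
{
    float a = (x < 0.0f) ? -x : x;
    return a > 0.0f && a < FLT_MIN;
}

/* Hypothetical, simplified model of flush-to-zero arithmetic:
 * denormal values are replaced by a zero of the same sign,
 * everything else passes through unchanged. */
static float flush_to_zero_model(float x)
{
    if (is_subnormal(x)) {
        return (x < 0.0f) ? -0.0f : 0.0f;
    }
    return x;
}
```

Under IEEE 754, FLT_MIN / 2.0f is a small non-zero value. Under the model above it becomes 0.0f, which is the kind of difference the vectorizer will not introduce unless you allow it.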

If you specify -funsafe-math-optimizations, it should work, but reread the GCC note quoted above for the caveats.
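As a sketch, here is the loop from the question with the extra flag added to the original command line (shown in the comment). The value of SIZE and the float32_t typedef are assumptions made so that the snippet is self-contained. In the real build, float32_t would come from arm_neon.h and SIZE from the project. Whether the vectorizer actually reports the loop as vectorized should be checked in the -ftree-vectorizer-verbose output.

```c
/* Possible compile line (the question's flags plus -funsafe-math-optimizations):
 *
 *   -march=armv7-a -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp
 *   -ftree-vectorize -mvectorize-with-neon-quad
 *   -funsafe-math-optimizations -ftree-vectorizer-verbose=2
 */

#define SIZE 4            /* assumption: the question does not give SIZE */
typedef float float32_t;  /* assumption: stands in for arm_neon.h's float32_t */

/* Element-wise product of two float arrays of SIZE*4 elements.
 * __restrict promises the compiler that the three buffers do not overlap. */
void my_mul(float32_t * __restrict data1,
            float32_t * __restrict data2,
            float32_t * __restrict out)
{
    for (int i = 0; i < SIZE * 4; i += 1) {
        out[i] = data1[i] * data2[i];
    }
}
```

The source code itself is unchanged. Only the command line differs, which matches the observation that the int32 version vectorized without any extra flag.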
