I am working on an Intel processor with the Nehalem / Westmere microarchitecture, and I want to optimize my code for it. Are there any specialized compilation flags or GCC built-in functions for C that can help improve execution performance?
I already use -O3.
Language of the Code - C
Platform - Linux
GCC Version - 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)
My code contains floating-point comparisons that execute more than a million times.
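For illustration only, a hot loop of this kind might look like the sketch below. The function name `count_above` and its parameters are hypothetical, not the actual code; the build line in the comment is just one option to experiment with, not a measured recommendation.

```c
#include <stddef.h>

/*
 * Hypothetical example of a floating-point comparison in a hot loop.
 * One possible build line to try (an assumption, not a benchmark result):
 *   gcc -O3 -march=native example.c
 * -march=native asks GCC to target the instruction set of the build machine.
 */
size_t count_above(const float *values, size_t n, float threshold)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < n; ++i) {
        /* The comparison result (0 or 1) is added directly, so the loop
         * body needs no explicit branch on the float comparison. */
        count += (values[i] > threshold);
    }
    return count;
}
```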
Assume the code is already optimized.