If you are considering a single variable, then code using a native-size integer will always run at least as fast, because no masking or alignment has to be performed. However, if you are considering arrays (or, conversely, closely packed sets of variables), then smaller variables let more of them fit into a cache line, so they are more readily accessible to the core. Cache misses add significant latency, which swamps any efficiency gained from manipulating native-size words in the core.
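To make the array case concrete, here is a minimal sketch in C. It assumes 64-byte cache lines (common, but not universal), and the names `CACHE_LINE_BYTES`, `lines_touched`, `sum_i8`, and `sum_i32` are hypothetical helpers invented for this illustration. It only counts how many cache lines each element width touches; it does not measure time, and actual results depend on the CPU and compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; check your target CPU. */
#define CACHE_LINE_BYTES 64u

/* Number of cache lines spanned by a contiguous, line-aligned
   array of n elements, each elem_size bytes wide. */
static inline size_t lines_touched(size_t n, size_t elem_size)
{
    return (n * elem_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/* Sum over 1-byte elements: each element is widened before the add,
   but one 64-byte line supplies 64 of them. */
static inline int64_t sum_i8(const int8_t *a, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}

/* Sum over 4-byte elements: one 64-byte line supplies only 16 of them,
   so the same element count pulls in four times as many lines. */
static inline int64_t sum_i32(const int32_t *a, size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; ++i)
        total += a[i];
    return total;
}
```

Under the 64-byte assumption, `lines_touched(1024, sizeof(int8_t))` is 16 while `lines_touched(1024, sizeof(int32_t))` is 64. Both loops do the same number of additions, so once the array no longer fits in cache, the difference in memory traffic is what tends to dominate.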