Speed up a program with multiple processors

Question

I found that sometimes it's faster to split one loop into two or more:

for (i=0; i<AMT; i++) {
    a[i] += c[i];
    b[i] += d[i];
}
     ||
     \/
for (i=0; i<AMT; i++) {
    //a[i] += c[i];
    b[i] += d[i];
}
for (i=0; i<AMT; i++) {
    a[i] += c[i];
    //b[i] += d[i];
}

On my desktop (Win7, AMD Phenom(tm) X6 1055T), the two-loop version takes about one third less time.

But if I am dealing with plain assignment,

for (i=0; i<AMT; i++) {
    b[i] = rand()%100;
    c[i] = rand()%100;
}

splitting the assignments to b and c into two loops is no faster than doing them in one loop.

I think the OS has some rules for deciding whether certain code can be executed by multiple processors.

I want to ask whether my guess is right and, if so, what the rules or cases are in which several processors will be used automatically (without thread programming) to speed up my programs.

+5

c++ performance c parallel-processing

Robert Bean Apr 2 '13 at 6:25


3 answers

, . , SIMD (, Intel SSE), , , - , , , a, b. , .

"" rand() , , . SIMD, , , . , , , .

, ; concurrency. , , , , . , .

, , , "" . , b c "" , , , . , b c , , .

+4

Joni Apr 2 '13 at 6:45

, , , , ( ), ?

, . .

Splitting the first loop into two makes it faster for some other reason. Perhaps your compiler is able to generate more efficient code, or the processor has an easier time prefetching the right data, etc. It is hard to say without analyzing the generated machine code.

0

NPE Apr 2 '13 at 6:40


Bechir · Accepted Answer · 2013-04-02T06:37:45+0000

(http://en.wikipedia.org/wiki/Loop_optimization). GCC, http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html .

, rand(), .
