I am learning how to use OpenMP in C, and as a Hello World exercise I am writing a program that counts primes. I then parallelize it as follows:
int numprimes = 0;
#pragma omp parallel for reduction(+:numprimes)
for (i = 1; i <= n; i++)
{
    /* is_prime returns an int (0 or 1), so test it directly */
    if (is_prime(i))
        numprimes++;
}
I compile this code with gcc -g -Wall -fopenmp -o primes primes.c -lm (-lm for the math.h functions that I use). I then run it on an Intel® Core™2 Duo CPU E8400 @ 3.00GHz × 2 and, as expected, performance is better than for the serial program.
The problem, however, arises when I try to run it on a much more powerful machine. (I also tried manually setting the number of threads with num_threads, but that didn't change anything.) Counting all the primes up to 10 000 000 gives me the following times (using time):
8-core machine:
real 0m8.230s
user 0m50.425s
sys 0m0.004s
Dual-core machine:
real 0m10.846s
user 0m17.233s
sys 0m0.004s
For 50 000 000:
8-core machine:
real 1m29.056s
user 8m11.695s
sys 0m0.017s
Dual-core machine:
real 1m51.119s
user 2m50.519s
sys 0m0.060s
Here is the is_prime function:
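static int is_prime(int n)
{
    /* 0 and 1 are not prime; 2 is the smallest prime. */
    if (n == 0) return 0;
    else if (n == 1) return 0;
    else if (n == 2) return 1;

    /* Trial division by every candidate up to floor(sqrt(n)). */
    int i;
    for (i = 2; i <= (int)(sqrt((double) n)); i++)
        if (n % i == 0) return 0;
    return 1;
}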
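To make the setup easier to reproduce, here is a minimal sketch that wraps the loop and the primality test into one compilable unit. The wrapper name count_primes is hypothetical and not part of my actual program. This sketch also swaps the sqrt() bound for an equivalent i * i <= n test so that it does not need -lm; apart from that it is meant to behave like the code above.

```c
/* Trial division. The bound i * i <= n replaces the sqrt() call from the
   original is_prime (a substitution made only for this sketch). */
static int is_prime(int n)
{
    if (n < 2) return 0;               /* 0 and 1 are not prime */
    for (long i = 2; i * i <= n; i++)  /* long avoids overflow in i * i */
        if (n % i == 0) return 0;
    return 1;
}

/* Hypothetical wrapper: counts the primes in 1..n using the same
   parallel-for reduction as in the question. */
int count_primes(int n)
{
    int numprimes = 0;
    #pragma omp parallel for reduction(+:numprimes)
    for (int i = 1; i <= n; i++) {
        if (is_prime(i))
            numprimes++;
    }
    return numprimes;
}
```

Calling count_primes(10000000) from main and building with gcc -fopenmp should reproduce the measurements above. Without -fopenmp the pragma is ignored and the loop runs serially, which gives a baseline for comparison.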