Hyper-Threading Performance Comparison

I wrote a project that uses some basic features in openssl, such as RAND_bytes and des_ecb_encrypt.

My computer has an i7-2600 (4 cores and 8 logical processors). When I run the project with 4 threads, it takes 10 seconds. When I run it with 8 threads, it also takes 10 seconds.

In other words, hyperthreading gives me no performance improvement. The result of the experiment is the same on Linux.

I found here that hyperthreading gives no improvement in some situations. I also found here some intuitive results.

However, I tried to write some simple tests, and to find some simple examples, that clearly show hyperthreading giving no improvement. Unfortunately, I could not find any.

So my question is: are there simple tests that show hyperthreading giving no performance improvement?

+3
4 answers

You may find that hyperthreading helps more with code that uses large amounts of memory, where the processor regularly stalls while data is fetched from memory.
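A simple test of the kind the question asks for might look like the sketch below. It is not the asker's OpenSSL code: it is written in Python, and it swaps in SHA-256 hashing from `hashlib` as a stand-in for a compute-bound crypto workload (`hashlib` releases the GIL for large buffers, so the threads really do run in parallel). The names `hash_worker` and `time_threads`, the buffer size, and the round counts are all assumptions chosen for illustration. The total work is fixed, so if the time at 8 threads is about the same as at 4 on a 4-core/8-thread machine, hyperthreading is not helping this workload.

```python
import hashlib
import threading
import time


def hash_worker(buf, rounds):
    """Hash the same buffer `rounds` times (compute-bound work)."""
    for _ in range(rounds):
        hashlib.sha256(buf).digest()


def time_threads(n_threads, total_rounds=64, buf_size=1 << 20):
    """Split a fixed amount of hashing across n_threads; return elapsed seconds."""
    buf = bytes(buf_size)                  # zero-filled buffer, shared read-only
    per_thread = total_rounds // n_threads  # same total work for every thread count
    threads = [
        threading.Thread(target=hash_worker, args=(buf, per_thread))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Thread counts are illustrative; 4 vs 8 is the comparison of interest
    # on a 4-core CPU with hyperthreading.
    for n in (1, 2, 4, 8):
        print(f"{n} threads: {time_threads(n):.3f} s")
```

Run it several times and compare the 4-thread and 8-thread lines; timings are noisy, and Turbo Boost can change the clock speed between runs.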

, " ", . , , , . , , , 2x " ". , , 20-30%.

+4

Hyper threading , , , , . , (, , ALU), (, , FPU).


+2

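I wrote a project that uses some basic features in openssl, such as RAND_bytes and des_ecb_encrypt... My computer has an i7-2600 (4 cores and 8 logical processors). When I run the project with 4 threads, it takes 10 seconds. When I run it with 8 threads, it also takes 10 seconds.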

RDRAND ( RAND_bytes ), . 800 /. , - . . Intel rdrand.

AES, DES/3DES. Ivy Bridge AES-NI, 1,3 /, AES - . , AES-NI, EVP_*.



, @selalerer @Mats Petersson . , . Intel 30%.

Intel Out-Of-Order Hyper-threading, . Silvermont.




OpenSSL . . <openssl source>/apps/speed.c.


+2

MP Linux Windows, -. HT, Linux Atom (1 core 2 threads), Windows Core i7 (4 + 4).

http://www.roylongbottom.org.uk/linux%20multithreading%20benchmarks.htm

http://www.roylongbottom.org.uk/quad%20core%208%20thread.htm

Take your pick, depending on whether you want to prove that HT gives better or worse performance. Below are the RandMem results on the i7 (Linux uses this test better). On CPUs like the i7, you also need to consider Turbo Boost, where the clock speed may be lower with multiple threads.

             CPUs          MBytes Per Second Using Threads        Gain At Threads
             /HTs         1       2       4       6       8     2     4     6     8
 Serial RD
 Core i7     4/8 L1   11458   22661   37039   43717   46374   2.0   3.2   3.8   4.0
 930             L2   10380   20832   32853   41711   42839   2.0   3.2   4.0   4.1
 #### MHz        L3    8828   17743   29610   38414   40330   2.0   3.4   4.4   4.6
 Win 764        RAM    4266    8712   17347   24946   25589   2.0   4.1   5.8   6.0

 Serial RW
 Core i7     4/8 L1   15282   13724   16240   16209   18379   0.9   1.1   1.1   1.2
 930             L2   12223   18216   25326   28104   27047   1.5   2.1   2.3   2.2
 #### MHz        L3   10234   19266   21931   24450   26351   1.9   2.1   2.4   2.6
 Win 764        RAM    4533    7656   13876   14543   13390   1.7   3.1   3.2   3.0

 Random RD
 Core i7     4/8 L1   11266   22548   38174   45592   47141   2.0   3.4   4.0   4.2
 930             L2    6233   12463   20059   24986   25667   2.0   3.2   4.0   4.1
 #### MHz        L3    3499    6915    9211   10002    9531   2.0   2.6   2.9   2.7
 Win 764        RAM     459     909    1241    1398    1364   2.0   2.7   3.0   3.0

 Random RW
 Core i7     4/8 L1   14375    3027    2780    2901    3297   0.2   0.2   0.2   0.2
 930             L2    5887    4555    6117    6693    7281   0.8   1.0   1.1   1.2
 #### MHz        L3    3104    4604    4721    5047    4933   1.5   1.5   1.6   1.6
 Win 764        RAM     428     860     899     948    1026   2.0   2.1   2.2   2.4

 #### 2.8 GHz running at up to 3.06 GHz via Turbo Boost, dual channel 1066 MHz DDR3 RAM 
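To read the table, it may help to recompute the "Gain At Threads" columns and then look at the step from 4 to 8 threads, which is where HT comes into play on a 4-core CPU. The sketch below does that for two rows copied from the table; the helper names `gains` and `ht_step` are made up for this illustration, and the rounding to one decimal place is an assumption that happens to reproduce the printed figures.

```python
def gains(throughput):
    """Gain of each thread count relative to the 1-thread figure, to one decimal."""
    base = throughput[1]
    return {n: round(v / base, 1) for n, v in throughput.items() if n != 1}


def ht_step(throughput):
    """Ratio of 8-thread to 4-thread throughput (the part HT could add)."""
    return round(throughput[8] / throughput[4], 2)


# MBytes per second by thread count, copied from the RandMem table.
serial_rd_l1 = {1: 11458, 2: 22661, 4: 37039, 6: 43717, 8: 46374}
serial_rw_ram = {1: 4533, 2: 7656, 4: 13876, 6: 14543, 8: 13390}

print(gains(serial_rd_l1))     # {2: 2.0, 4: 3.2, 6: 3.8, 8: 4.0}
print(ht_step(serial_rd_l1))   # 1.25
print(ht_step(serial_rw_ram))  # 0.96
```

So for serial reads from L1 the 8-thread run is about 25% faster than the 4-thread run, while for serial read/write in RAM it is slightly slower.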

Then the MP Whetstone benchmark, showing real gains

                      MWIPS  MFLOP  MFLOP  MFLOP   COS    EXP   FIXPT   IF    EQUAL
CPU              MHz            1      2      3    MOPS   MOPS   MOPS   MOPS   MOPS

Core i7 1 Thrd  ####   3115   1065    886    738   79.3   39.7   2447   2936   1154

Core i7 Win7    ####  21690   8676   7621   5844    531    291  16643  12027   5034
Quad Core Thread 1            1091   1027    728   66.4   36.5   2050   1501    629
Plus HT   Thread 2            1089   1037    742   66.0   36.5   2090   1507    630
          Thread 3            1090    946    742   66.8   36.5   2069   1534    631
          Thread 4            1092   1037    727   66.6   36.6   2031   1501    630
          Thread 5            1042    959    736   66.4   36.5   1912   1483    630
          Thread 6            1091    874    723   66.6   36.1   2049   1507    629
          Thread 7            1090    867    725   65.6   36.3   2094   1516    631
          Thread 8            1091    874    722   66.3   36.3   2350   1476    624

Gain %                  696    815    860    792    670    733    680    410    436
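The "Gain %" row appears to be the 8-thread total divided by the 1-thread figure, expressed as a percentage. The sketch below checks that for three columns and compares each against 400%, which is roughly what four physical cores could reach without HT. That 400% ceiling is my own simplifying assumption (it ignores Turbo Boost and any other clock differences), and the name `gain_percent` is hypothetical.

```python
def gain_percent(single_thread, all_threads):
    """Total multi-thread result as a percentage of the single-thread result."""
    return round(100 * all_threads / single_thread)


# (1-thread result, 8-thread total) copied from the Whetstone table.
results = {
    "MWIPS": (3115, 21690),
    "MFLOP 1": (1065, 8676),
    "IF MOPS": (2936, 12027),
}

PHYSICAL_CORES = 4  # Core i7 in the table: 4 cores, 8 threads

for name, (single, total) in results.items():
    pct = gain_percent(single, total)
    beyond_cores = pct > 100 * PHYSICAL_CORES
    print(f"{name}: {pct}% (above {100 * PHYSICAL_CORES}%: {beyond_cores})")
```

All three come out above 400% (696%, 815%, 410%), which is consistent with HT contributing on this benchmark.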
+2
