Performance of a 100M-Row Table (Oracle 11g)

We are designing a table for ad-hoc analysis that will capture many value fields over time for claims received. The table structure is essentially (pseudo-code):

   table_huge (
     claim_key int not null,
     valuation_date_key int not null,
     value_1 some_number_type,
     value_2 some_number_type,
     [etc...],
     constraint pk_huge primary key (claim_key, valuation_date_key)
   );

The value fields are all numeric. Requirements: the table must capture at least the 12 most recent years (hopefully more) of claims received. Each claim must have a valuation date for each month-end between the start of the claim and the current date. Typical claim volumes range from 50 thousand to 100 thousand per year.

Adding this up, I project a table with a row count on the order of 100 million, which could grow to 500 million over the years depending on business needs. The table will be rebuilt every month. Consumers will only run SELECT queries. Apart from the monthly rebuild, no updates, inserts, or deletes will occur.
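As a rough check on that row count, here is a minimal Python sketch of the arithmetic. It assumes claims arrive evenly through the year and that every claim gets one row per month-end from its start to the current date, including the first month-end. The function name `estimate_rows` is hypothetical and used only for illustration.

```python
def estimate_rows(claims_per_year: int, years: int) -> int:
    """Estimate total rows: one row per claim per month-end since it began.

    Assumptions (not from the original post): claims arrive evenly,
    claims_per_year / 12 each month, and a claim that began m months
    ago contributes exactly m rows.
    """
    months = years * 12
    # Sum of rows per claim over all monthly cohorts: 1 + 2 + ... + months.
    rows_across_cohorts = months * (months + 1) // 2
    # Scale by the number of claims in each monthly cohort.
    return claims_per_year * rows_across_cohorts // 12


if __name__ == "__main__":
    for claims in (50_000, 100_000):
        print(f"{claims:>7} claims/year over 12 years: "
              f"{estimate_rows(claims, 12):,} rows")
```

Under these assumptions, 12 years of history gives roughly 43.5 million rows at 50 thousand claims per year and 87 million at 100 thousand, which agrees with the "order of 100 million" figure above. Because the count grows with the square of the number of years retained, keeping more history pushes it toward the upper figure quickly.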

I am coming at this from the business (consumer) side, but I have an interest in keeping IT costs down while preserving the analytical value of this table. We are not particularly concerned about fast response times, but we will occasionally need to run a couple of dozen queries against it and get all the results within a day or three.

For the sake of argument, assume the technology stack is, say, in the 80th percentile of modern hardware.

I have the following questions:

  • , , ?
  • SO + 100M , ?
  • , - ( ?)?

, , , , .

, , - . !

+3
4

: , " ", - "80% " .

200M + MySQL , .

:

  • , . , parallell . ( 10 ) " , " 70%

  • : ( concurrency= IO, CPU)

  • , 20 64 , 200M : , 32 . 64G RAM -.

  • ,

+5

.

, , ~ 95% - 100M 5M, .

, - , .

"" , ; - ( ) . , . - . . 10.

, , MS SQL, ORACLE. .

()

  • ( )

  • ()

  • , ( - )

  • ,

, .

+3

/ NEVER , ( MANY ).

, .

Claim
    claim_key
    valuation_date

ClaimValue
    claim_key (fk->Claim.claim_key)
    value_key
    value

, , , , .

+1

, , , .

In our company, we have solved a huge number of performance problems with partitioning.

Another design consideration: if we know the table will be very large, try to keep constraints on the table to a minimum and enforce them in the loading logic instead, and avoid giving the table too many columns, so as to avoid problems with row chaining.

0