How to deal with a BIG DATA Data Mart / Fact Table? (240 million rows)

We have a BI client whose sales transactions generate about 40 million rows every month in their sales database tables. They want to build a Sales Data Mart from 5 years of historical data, which by their estimate means a fact table of about 240 million rows (40 x 12 months x 5 years). Strictly, 40 million x 12 x 5 works out to 2,400 million, so one of the two figures is off by a factor of ten.
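As a quick check of the sizing arithmetic, here is a minimal sketch. The constant names are mine; the figures are the ones quoted in the question. It shows what the two quoted numbers imply about each other.

```python
# Rough fact-table sizing from the figures quoted in the question.
ROWS_PER_MONTH = 40_000_000   # "about 40 million rows every month"
MONTHS_PER_YEAR = 12
YEARS = 5

# Total rows if the monthly figure is the correct one.
total_rows = ROWS_PER_MONTH * MONTHS_PER_YEAR * YEARS
print(f"{total_rows:,} rows over {YEARS} years")

# Monthly volume implied if the quoted 240 million total is the correct one.
QUOTED_TOTAL = 240_000_000
implied_per_month = QUOTED_TOTAL // (MONTHS_PER_YEAR * YEARS)
print(f"{implied_per_month:,} rows per month")
```

Either way the table is large, but a tenfold difference matters when estimating storage and scan times.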

This is well structured data.

This is the first time I've come across this amount of data, so I started evaluating column-oriented database tools such as Infobright and others. But even with that kind of software, a simple query would take a very long time to run.

This made me look at Hadoop, but after reading some articles I concluded that Hadoop (even with Hive) is not the best option for building a fact table, since as I understand it, it is meant for working with unstructured data.

So my question is: what would be the best approach to this problem? Am I not looking at the right technology? What is the best query response time I could expect on such a large fact table? Or have I hit a real wall here, where the only option is to build aggregated tables?
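To make "aggregated tables" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (sales_fact, sales_monthly_agg, sale_date, product_id, amount) are hypothetical and the data is a toy sample. The idea is to pre-compute monthly totals once so that reports read the small summary table instead of scanning the full fact table.

```python
import sqlite3

# In-memory database with a tiny, hypothetical sales fact table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (sale_date TEXT, product_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("2012-01-05", 1, 10.0),
        ("2012-01-20", 1, 15.0),
        ("2012-01-22", 2, 7.5),
        ("2012-02-03", 1, 20.0),
    ],
)

# Pre-aggregate to one row per (month, product).
conn.execute(
    """
    CREATE TABLE sales_monthly_agg AS
    SELECT substr(sale_date, 1, 7) AS sale_month,
           product_id,
           COUNT(*)    AS n_sales,
           SUM(amount) AS total_amount
    FROM sales_fact
    GROUP BY substr(sale_date, 1, 7), product_id
    """
)

# Reports now query the much smaller aggregate table.
rows = conn.execute(
    "SELECT sale_month, product_id, n_sales, total_amount "
    "FROM sales_monthly_agg ORDER BY sale_month, product_id"
).fetchall()
for row in rows:
    print(row)
```

The trade-off is that the aggregate has to be refreshed after each load and only answers questions at the grain it was built for.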

+5
6 answers

Have you checked out Google BigQuery (a paid premium service)? It should suit your needs. It is as simple as:

  • CSV ( char ). gzip. .

  • SQL ( sql), .

  • CSV ( )

. https://developers.google.com/bigquery/

100 . , , Google Spreadsheet, , ​​ .. . Google Microsoft Excel/PDF.

Google ( ).

+4

240 2400 .

ssd.analytical-labs.com

FCC 150 , Infobright, , VW .

, , , .

, , , , .

, Marts , , , .. 1 ( ), .

cheep, , , .

, OLAP, , , , , .

, , , , , , , .

. 0 , , , 1 90% , (date dim ) .

2 . , - .

Tom

Edit:

, JVD:

  • ssd : 175.67 /
  • sata : 113.52 /
  • ec2: 75.65 /
  • ec2 ebs raid: 89.36 /

, .

+2

, ,

1). Mondrian, agg - , , , , , .

2) - , , , . , ( Oracle) , MS SQL Server.

, . , ETL- ( 1 ), RDBMS .

+2

NoSQL/Analysis, DataStax Enterprise, Apache Cassandra Hadoop . , Hadoop "" HDFS , NoSQL (, Cassandra HBase) MapReduce.

+1

, , Hadoop + Hive. Map/Reduce jobs Hive . .

, () SQL- . - Hive . , , , .

+1

Hadoop . HBase, , . ... , , . , sets..you Apache "Sqoop", .

0
