Do I need to set up a backup data pipeline for AWS DynamoDB on day one?

I am considering using AWS DynamoDB for an application we are building. I understand that setting up a backup job that exports data from DynamoDB to S3 involves a Data Pipeline with EMR. My question is: do I need to have the backup job configured on day one? What are the chances of losing data?

3 answers

This is really subjective. IMO, you should not worry about it right now. You can also use simpler solutions than Data Pipeline. Perhaps this will be a good start.
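
As one illustration of a "simpler solution", here is a minimal sketch that dumps a table to JSON lines with a paginated scan. It is not the method the answer links to. The function name `dump_table` is hypothetical, and the `table` argument is assumed to behave like a boto3 DynamoDB `Table` resource, whose `scan` returns `Items` and, when more pages remain, `LastEvaluatedKey`.

```python
import json


def dump_table(table, out):
    """Write every item of `table` to `out` as JSON lines; return the item count.

    `table` is assumed to expose scan(**kwargs) like a boto3 Table resource.
    """
    count = 0
    kwargs = {}
    while True:
        page = table.scan(**kwargs)
        for item in page.get("Items", []):
            # default=str covers values json cannot encode natively,
            # such as the Decimal numbers boto3 returns.
            out.write(json.dumps(item, default=str) + "\n")
            count += 1
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return count
        # Continue the scan from where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

With boto3 installed and credentials configured, this could be called as `dump_table(boto3.resource("dynamodb").Table("my-table"), open("backup.jsonl", "w"))`, where `my-table` is a placeholder name. A full scan consumes read capacity, so on a busy table it would need throttling.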

DynamoDB , . . , , SDK .

+1

DynamoDB :

(1) S3 , , , ( ?)

(2) S3, -. , S3, , , RDBMS (RDS ) S3 . EMR Redshift (ETL) BI. Redshift, ELT- - Redshift

(3) ( ) ( , ) - . - , , . , , DynamoDB, .

(4) S3. , - DynamoDB - concurrency .

AWS Data Pipeline ( EMR ).

, , , , .

+1

S3. .

Dynamo DB , ( ). - .

You can tell Data Pipeline to consume only, say, 25% of the table's read capacity during a backup so that your real users do not notice any delay. Each backup is full (not incremental), so you can periodically delete older backups if you are concerned about storage.
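
The arithmetic behind those two points can be sketched as follows. Both function names are hypothetical, the 25% default comes from the example above, and the retention count of 7 is purely an assumption for illustration.

```python
from datetime import date


def backup_read_budget(provisioned_rcu, ratio=0.25):
    """Read capacity units per second the backup job may consume."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    return provisioned_rcu * ratio


def backups_to_delete(backup_dates, keep=7):
    """Return the dates of full backups that can be removed, oldest first.

    Because every backup is full, any older one can be dropped
    without affecting the newer ones.
    """
    ordered = sorted(backup_dates)
    # Guard keep == 0, since ordered[:-0] would be empty rather than everything.
    return ordered[:-keep] if keep > 0 else ordered
```

For a table provisioned with 400 read capacity units, `backup_read_budget(400)` leaves 300 units per second for regular traffic while the backup runs.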
