How to handle high-volume storage in the cloud (or elsewhere?)

I wrote an application that encodes video. Encoding is a pipeline process: first you extract the video, then encode it with ffmpeg, then split the video into several parts, and so on.

Along the way, a 1 GB video produces several GB of intermediate data. The service is written so that each part of the pipeline can be handled by a different program (via RabbitMQ). Of course, it does not have to run that way, which brings me to my question.

I am looking at storage requirements before taking the application live. With cloud providers, you pay per GB of storage and per GB of transfer. So far so good.

When I transfer this 1 GB video from one virtual machine instance to another, or from a virtual machine to a shared storage service, does that count against my bandwidth? (I understand that the answer will vary depending on the host's terms of service.)

Would it make more sense to have one VM do the whole process and then deploy several instances of it, as opposed to each VM performing only one task in the pipeline? I am asking from a cost-optimization point of view (lowest storage cost, lowest cost of spinning up virtual machines). Since encoding will happen in batch mode, I am less worried about responding to requests quickly.
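To make the comparison concrete, here is a minimal Python sketch of how the two layouts could be compared on transfer volume alone. It assumes that a single VM keeps intermediate data on local disk, while a one-stage-per-VM layout moves each intermediate result over the network. The function names, the hand-off sizes, and the price per GB are all hypothetical placeholders, not figures from any provider, and whether internal traffic is billed at all depends on the provider.

```python
def network_gb_per_video(source_gb, handoff_gb, single_vm):
    """Estimate the GB that cross the network for one video.

    source_gb  -- size of the uploaded source video
    handoff_gb -- sizes of the intermediate results passed between stages
    single_vm  -- True if one VM runs every stage (intermediates stay local)
    """
    if single_vm:
        # Assumption: only the source video is fetched over the network.
        return source_gb
    # Assumption: every hand-off between stages goes over the network.
    return source_gb + sum(handoff_gb)


def transfer_cost(gb, price_per_gb):
    """Cost of moving `gb` at a flat, hypothetical price per GB."""
    return gb * price_per_gb


if __name__ == "__main__":
    handoffs = [2.0, 1.5]   # hypothetical intermediate sizes in GB
    price = 0.02            # hypothetical price per GB, not a real quote
    for single_vm in (True, False):
        gb = network_gb_per_video(1.0, handoffs, single_vm)
        layout = "single VM" if single_vm else "one stage per VM"
        print(f"{layout}: {gb} GB moved, cost {transfer_cost(gb, price):.4f}")
```

If the provider does not charge for traffic between its own instances, the price for the hand-offs would simply be zero and the two layouts would differ only in VM and storage cost.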

This scenario is a little unusual in that I have a huge amount of binary data that cannot be stored efficiently in, say, a database. That raises a related question: for those who have experience, when your database VM sends its results back to your web application, do you pay for that intermediate transfer?

Am I even asking the right questions? Is there a guide I should read, short of calling the hosting providers and asking them how they bill for this?

1 answer

The uniqueness of your scenario makes it quite interesting, I would say!

, . Amazon, , EC2, - . , / " ".

, ? , , . , = , , , , . VM , , . , , , .

, , . (CPU/RAM), "" , .
