Incrementally process a large XML file over HTTPS?

I need to download, process and save an 8 GB XML file from a secure web server. I can download the file using the WebRequest class, but that will take a VERY long time. In addition, I know the file is structured in a way that makes it suitable for processing in discrete pieces.

How can I stream this file so that I receive only pieces of a size I can work on, without having to fetch the whole stream at once?

Edit

I forgot to mention: we are hosted on Azure. The idea that comes to mind is to use a worker role that simply downloads large files and can take as long as it needs. Is that feasible?

+3
4 answers

8 GB is a lot of work. To protect myself from having to re-download and to scale efficiently, I would separate downloading the XML file from processing it.

While downloading the stream, I would write some kind of stream identifier to persistent storage and enqueue each atomic unit of work by putting a message with the corresponding data on a queue. This lets you resume the download if it goes south for some reason, or if part of the work fails and/or gets in the way of the download.
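
A minimal sketch of that checkpoint-and-enqueue idea, under some assumptions of mine: the "stream identifier" is taken to be a byte offset, and `WorkUnit`, `EnqueueUnits`, `enqueue` and `saveCheckpoint` are hypothetical names standing in for your own splitting logic, an Azure queue and persistent storage.

```csharp
using System;
using System.Collections.Generic;

public static class CheckpointedDownload
{
    // Hypothetical stand-in for one atomic unit of work cut from the XML stream.
    public sealed class WorkUnit
    {
        public long EndOffset;  // byte position in the file just past this unit
        public string Payload;  // the XML fragment to process
    }

    // Enqueues every unit that ends after the saved checkpoint and advances
    // the checkpoint after each one. Returns the final checkpoint.
    public static long EnqueueUnits(
        IEnumerable<WorkUnit> units,
        long checkpoint,
        Action<string> enqueue,        // assumption: wraps your queue client
        Action<long> saveCheckpoint)   // assumption: writes to persistent storage
    {
        foreach (var unit in units)
        {
            if (unit.EndOffset <= checkpoint)
                continue; // already queued before the restart

            enqueue(unit.Payload);
            checkpoint = unit.EndOffset;
            saveCheckpoint(checkpoint);
        }
        return checkpoint;
    }
}
```

Because the message is queued before the checkpoint is saved, a crash between the two steps can queue the same unit twice after a restart, so the processing side should tolerate duplicates.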

+3

I use HttpWebRequest: call BeginGetResponse, then GetResponseStream.

You can then read the stream in pieces as it trickles down, using stream.BeginRead.

Here's an overly complicated example: http://stuff.seans.com/2009/01/05/using-httpwebrequest-for-asynchronous-downloads/
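
For comparison, here is a shorter sketch of the same chunked read. It swaps the BeginGetResponse/BeginRead callbacks for async/await (GetResponseAsync and ReadAsync), which is my substitution and not what the linked article does. `ChunkedDownload`, `ReadInChunksAsync`, `DownloadAsync` and the 64 KB buffer size are illustrative choices, not part of any library.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

public static class ChunkedDownload
{
    // Reads the stream piece by piece and hands each piece to onChunk.
    // Only `count` bytes of the buffer are valid on each call.
    // Returns the total number of bytes read.
    public static async Task<long> ReadInChunksAsync(
        Stream source, int bufferSize, Action<byte[], int> onChunk)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int count;
        while ((count = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            onChunk(buffer, count);
            total += count;
        }
        return total;
    }

    // Opens the response stream and reads it without buffering the whole body.
    public static async Task<long> DownloadAsync(string url, Action<byte[], int> onChunk)
    {
        // Note: WebRequest.Create is flagged obsolete on .NET 6 and later.
        var request = (HttpWebRequest)WebRequest.Create(url);
        using (var response = await request.GetResponseAsync())
        using (var stream = response.GetResponseStream())
        {
            return await ReadInChunksAsync(stream, 64 * 1024, onChunk);
        }
    }
}
```

The buffer is reused between calls, so anything onChunk wants to keep has to be copied out before it returns.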

+2

, XMLReader .

(.. ), ( RANGE ) , .

, 8 - , , .
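
Only the words XMLReader and RANGE survive in the answer above. Assuming they refer to .NET's XmlReader and the HTTP Range header, a rough sketch of both might look like this. `XmlStreaming`, `ReadElements`, `CreateRangeRequest` and the element name are hypothetical, and the server has to support range requests for the second method to help.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Xml;

public static class XmlStreaming
{
    // Yields the outer XML of each element named elementName, one at a time,
    // so only the current element is held in memory.
    public static IEnumerable<string> ReadElements(Stream source, string elementName)
    {
        using (var reader = XmlReader.Create(source))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                    yield return reader.ReadOuterXml(); // also advances past the element
                else
                    reader.Read();
            }
        }
    }

    // Builds a request for the bytes from fromByte to the end of the file.
    public static HttpWebRequest CreateRangeRequest(string url, long fromByte)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AddRange(fromByte); // adds an HTTP Range header starting at fromByte
        return request;
    }
}
```

The stream passed to ReadElements can be the response stream itself, so parsing starts while the download is still in progress. A ranged response that starts mid-file is not a well-formed document on its own, so resuming this way needs extra care around element boundaries.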

+1
