"Subparameres" in attopar tubes

Question

I am trying to parse binary data using pipes-attoparsec in Haskell. Pipes (proxies) are involved so that reading can be interleaved with parsing, which avoids high memory use for large files. Many binary formats are based on blocks (or chunks), and their sizes are often described by a field in the file. I'm not sure what a parser for such a block is called, but that is what I mean by "sub-parser" in the title. The problem is to implement them concisely without a potentially large memory footprint. I came up with two alternatives, each of which falls short in some respect.

Alternative 1 is to read the block into a separate ByteString and run a separate parser on it. This is concise, but a large block will cause high memory use.
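A minimal sketch of alternative 1 against plain attoparsec, outside of any pipes machinery and with the Either handled rather than ignored. The names `word32le`, `isolate`, `chunk` and the two-byte `blockParser` are hypothetical and chosen only for illustration; the length field is decoded by hand so that the sketch needs only the attoparsec and bytestring packages, and it imports `Data.Attoparsec.ByteString` rather than the older `Data.Attoparsec` module name.

```haskell
import           Control.Monad (replicateM)
import qualified Data.Attoparsec.ByteString as P
import           Data.Attoparsec.ByteString (Parser)
import           Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import           Data.Word (Word8)

-- Little-endian 32-bit size field, assembled from four single bytes.
word32le :: Parser Int
word32le = do
    bytes <- replicateM 4 P.anyWord8
    return $ foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 bytes

-- Run a sub-parser on exactly the next n bytes (hypothetical helper).
-- P.take materialises the whole block as one ByteString, which is
-- exactly the memory cost of alternative 1.
isolate :: Int -> Parser a -> Parser a
isolate n sub = do
    block <- P.take n
    case P.parseOnly (sub <* P.endOfInput) block of
        Left err -> fail ("block parser: " ++ err)
        Right r  -> return r

-- Stand-in block format for illustration: a block is two bytes.
blockParser :: Parser (Word8, Word8)
blockParser = (,) <$> P.anyWord8 <*> P.anyWord8

-- A size field followed by a block of that size.
chunk :: Parser (Word8, Word8)
chunk = word32le >>= \n -> isolate n blockParser
```

Requiring `P.endOfInput` inside `isolate` makes a block that is longer than what the sub-parser consumes fail, instead of silently dropping the leftover bytes.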

Alternative 2 is to keep parsing in the same context and track the number of bytes consumed. This tracking is error-prone and seems to infest every parser that is composed into the final blockParser. For a malformed input file, it can also waste time parsing further than the size field indicates before the tracked size can be compared.
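A minimal sketch of the size check in alternative 2, assuming a reasonably recent attoparsec. Instead of threading a byte counter through every sub-parser, it swaps in attoparsec's `match` combinator, which returns the consumed input alongside the result; this is a substitution for illustration, not the approach in the listing that follows. The names `word32le`, `trackedChunk` and the two-byte `blockParser` are hypothetical. The check still only runs after `blockParser` has finished, so the concern about parsing past the size field on malformed input remains.

```haskell
import           Control.Monad (replicateM, when)
import qualified Data.Attoparsec.ByteString as P
import           Data.Attoparsec.ByteString (Parser)
import           Data.Bits (shiftL, (.|.))
import qualified Data.ByteString as BS
import           Data.Word (Word8)

-- Little-endian 32-bit size field, assembled from four single bytes.
word32le :: Parser Int
word32le = do
    bytes <- replicateM 4 P.anyWord8
    return $ foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0 bytes

-- Stand-in block format for illustration: a block is two bytes.
blockParser :: Parser (Word8, Word8)
blockParser = (,) <$> P.anyWord8 <*> P.anyWord8

-- Parse in the same context, then compare the number of bytes the
-- block parser actually consumed against the declared size.
trackedChunk :: Parser (Word8, Word8)
trackedChunk = do
    size <- word32le
    (consumed, result) <- P.match blockParser
    when (BS.length consumed /= size) $ fail "size mismatch"
    return result
```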

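import Control.Applicative ((<$>), (*>))
import Control.Monad (unless, when)
import Control.Proxy
import Control.Proxy.Attoparsec
import Control.Proxy.Trans.Either
import Data.Attoparsec as P
import Data.Attoparsec.Binary
import qualified Data.ByteString as BS
import System.IO (IOMode (ReadMode), withBinaryFile)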

parser = do
    size <- fromIntegral <$> anyWord32le

    -- alternative 1 (ignore the Either for simplicity):
    Right result <- parseOnly blockParser <$> P.take size
    return result

    -- alternative 2
    (result, trackedSize) <- blockParser
    when (size /= trackedSize) $ fail "size mismatch"
    return result

blockParser = undefined

main = withBinaryFile "bin" ReadMode go where
    go h = fmap print . runProxy . runEitherK $ session h
    session h = printD <-< parserD parser <-< throwParsingErrors <-< parserInputD <-< readChunk h 128
    readChunk h n () = runIdentityP go where
        go = do
            c <- lift $ BS.hGet h n
            unless (BS.null c) $ respond c *> go

+5

parsing haskell attoparsec haskell-pipes

absence Mar 15 '13 at 18:15

2 answers

, , , , , pipes-parse. pipes-parse , , "".

() (.. ), ByteString.

, :

StateP ( pipes-3.3.0)
StateP , leftovers

pipes-attoparsec , pipes-parse, .

+2

Gabriel Gonzalez 03 . '13 19:56

Gabriel Gonzalez · Accepted Answer · 2013-03-16T00:37:21+0000

I like to call this a "fixed-input" parser.

, pipes-parse. , pipes-parse parseN parseWhile . , , , String , .

, , , , ( , ), .

, , , . :

, . , , , , , - , , , .

pipes-attoparsec, , attoparsec . , , , attoparsec.