Scanning a large binary with Erlang

I would like to scan large (> 500 MB) binaries for structures / patterns. I am new to the language and hope someone can get me started. The files are actually a database containing segments. Each segment begins with a fixed-size header, followed by an optional fixed-size part, followed by a variable-length payload / data portion. As a first test, I just want to count the number of segments in the file. I have already searched Google but found nothing that helped. I need a hint or a tutorial that is not too far from my use case to get started.

Regards, Stefan

3 answers

You need to learn about the Bit Syntax and Binary Comprehensions. More useful links: http://www.erlang.org/documentation/doc-5.6/doc/programming_examples/bit_syntax.html and http://goto0.cubelogic.org/a/90 . You also need to learn how to process files: reading them (sequentially, chunk by chunk, at given positions in a file, etc.) and writing them in several ways. The file processing functions are documented in the file module. You can also look at the source code of the large-file processing libraries that ship with Erlang, such as Disk Log, Dets and Mnesia.
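As a minimal sketch of how the bit syntax and positioned reads could combine for the segment count asked about in the question: the question does not give the real layout, so the header format below (one flags byte, a 32-bit big-endian payload length, an 8-byte optional part signalled by bit 0 of the flags) and the module name segcount are assumptions made only for illustration.

```erlang
-module(segcount).
-export([count_binary/1, count_file/1]).

%% Hypothetical layout, assumed for illustration only:
%%   header   = Flags:8, PayloadLen:32 big-endian   (5 bytes)
%%   optional = 8 bytes, present when bit 0 of Flags is set
%%   payload  = PayloadLen bytes
-define(HEADER_SIZE, 5).
-define(OPT_SIZE, 8).

%% Count the segments in a binary that is already in memory.
count_binary(Bin) ->
    count_binary(Bin, 0).

count_binary(<<>>, N) ->
    N;
count_binary(<<Flags:8, Len:32, Rest/binary>>, N) ->
    %% Skip the optional part and the payload, then continue with the tail.
    Skip = opt_size(Flags) + Len,
    <<_Body:Skip/binary, Tail/binary>> = Rest,
    count_binary(Tail, N + 1).

%% Count the segments in a file by reading only the headers.
count_file(Path) ->
    {ok, Fd} = file:open(Path, [read, binary, raw]),
    try
        count_file(Fd, 0, 0)
    after
        file:close(Fd)
    end.

count_file(Fd, Pos, N) ->
    case file:pread(Fd, Pos, ?HEADER_SIZE) of
        {ok, <<Flags:8, Len:32>>} ->
            %% Jump straight to the next header without reading the payload.
            Next = Pos + ?HEADER_SIZE + opt_size(Flags) + Len,
            count_file(Fd, Next, N + 1);
        eof ->
            N
    end.

opt_size(Flags) when Flags band 1 =:= 1 -> ?OPT_SIZE;
opt_size(_Flags) -> 0.
```

Because count_file/1 reads only five bytes per segment, memory use stays constant regardless of file size. A truncated header makes it crash with a case_clause error, which is probably acceptable for a first test.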



Here is an example: given a test file (test.txt), find every occurrence of a binary pattern such as <<$a, $b, $c>>.

I arbitrarily chose the string "abc" as the target for my test: I want to find every occurrence of abc in my test file.

The module (lab.erl):

-module(lab).
-export([find/2]).

%% Print the offset of every occurrence of BinPattern in InputFile.
find(BinPattern, InputFile) ->
    BinPatternLength = byte_size(BinPattern),
    {ok, S} = file:open(InputFile, [read, binary, raw]),
    loop(S, BinPattern, 0, BinPatternLength, 0),
    file:close(S),
    io:format("Done!~n", []).

%% Read Length bytes at every offset and compare them with the pattern.
%% One pread per byte offset: simple, but slow on large files.
loop(S, BinPattern, StartPos, Length, Acc) ->
    case file:pread(S, StartPos, Length) of
        {ok, BinPattern} ->
            io:format("Found one at position: ~p.~n", [StartPos]),
            loop(S, BinPattern, StartPos + 1, Length, Acc + 1);
        {ok, _Other} ->
            loop(S, BinPattern, StartPos + 1, Length, Acc);
        eof ->
            io:format("I've proudly found ~p matches:)~n", [Acc])
    end.

Running it:

1> c(lab).
{ok,lab}
2> lab:find(<<"abc">>, "./test.txt").     
Found one at position: 43.
Found one at position: 103.
I've proudly found 2 matches:)
Done!
ok
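The loop above issues one file:pread/3 call per byte offset, which gets expensive on a 500 MB file. One possible alternative, sketched here under assumptions rather than taken from the answer, is to read larger chunks and search each with binary:matches/2. The module name chunkscan and the ChunkSize parameter are hypothetical. Note that binary:matches/2 reports non-overlapping matches, so for self-overlapping patterns it can return fewer hits than the byte-by-byte loop.

```erlang
-module(chunkscan).
-export([positions/3]).

%% Return the file offsets at which Pattern occurs in the file at Path,
%% reading roughly ChunkSize bytes at a time.
positions(Pattern, Path, ChunkSize)
  when byte_size(Pattern) > 0, ChunkSize > 0 ->
    {ok, Fd} = file:open(Path, [read, binary, raw]),
    Compiled = binary:compile_pattern(Pattern),
    try
        scan(Fd, Compiled, byte_size(Pattern), ChunkSize, 0, [])
    after
        file:close(Fd)
    end.

scan(Fd, Compiled, PatLen, ChunkSize, Pos, Acc) ->
    %% Read PatLen - 1 extra bytes so that a match straddling the chunk
    %% boundary is still seen whole. A match starting at or beyond
    %% ChunkSize cannot fit in this read, so nothing is reported twice.
    case file:pread(Fd, Pos, ChunkSize + PatLen - 1) of
        {ok, Bin} ->
            Found = [Pos + Off || {Off, _Len} <- binary:matches(Bin, Compiled)],
            scan(Fd, Compiled, PatLen, ChunkSize, Pos + ChunkSize, [Found | Acc]);
        eof ->
            lists:append(lists:reverse(Acc))
    end.
```

A call such as chunkscan:positions(<<"abc">>, "./test.txt", 1048576) would scan the file in 1 MB steps. The right chunk size depends on the machine and is worth measuring.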


Things to look at: file:read_file/1, opening the file in raw mode, the bit_syntax, and, for speed, native compilation with HiPE.
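For reference, file:read_file/1 loads a whole file into a single binary, so it only suits files that fit comfortably in memory. A minimal sketch of counting a pattern that way follows. The module name memscan is hypothetical, and binary:matches/2 counts non-overlapping occurrences only.

```erlang
-module(memscan).
-export([count/2]).

%% Count the non-overlapping occurrences of Pattern in the file at Path.
%% The whole file is held in memory as one binary.
count(Pattern, Path) ->
    {ok, Bin} = file:read_file(Path),
    length(binary:matches(Bin, Pattern)).
```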
