How to decode BERT when BERT is a binary string

My BERT is passed to Erlang via the query string. I read it through gen_tcp with the http_bin option, so it arrives like this: <<"131,104,1,100,0,2,104,105">>. This is almost right, because I want to decode it with binary_to_term/2. But binary_to_term/2 wants a binary of raw bytes, not a binary string (it wants <<131,104,1,100,0,2,104,105>>, not <<"131,104,1,100,0,2,104,105">>).
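To make the difference concrete, here is a minimal sketch. The module and function names are made up for illustration, and it calls binary_to_term/1 for brevity rather than the two-argument form mentioned above.

```erlang
-module(bert_demo).
-export([raw/0, text/0]).

%% The raw bytes are a valid external term: version byte 131,
%% followed by a 1-tuple holding the atom hi.
raw() ->
    binary_to_term(<<131,104,1,100,0,2,104,105>>).

%% The textual form starts with the character $1 (49), not the
%% version byte 131, so decoding fails with badarg.
text() ->
    try binary_to_term(<<"131,104,1,100,0,2,104,105">>)
    catch error:badarg -> badarg
    end.
```

Calling bert_demo:raw() returns {hi}, while bert_demo:text() returns the atom badarg.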

I can parse it into the correct form:

parse(Source) ->
    Bins = binary:split(Source, <<",">>, [global]),
    parse(Bins, []).
parse([H | T], Acc) ->
    parse(T, [list_to_integer(binary_to_list(H)) | Acc]);
parse([], Acc) ->
    list_to_binary(lists:reverse(Acc)).

But this seems clunky and slower than I had hoped (about 5k strings per second, each around 200 bytes). I also came up with something based on io_lib:fread/2, but it wasn't much better and still looks awkward.

  • Is there a BIF or NIF that can do this?

  • If not, is there a better way to do this to speed it up?

2 answers

For what it's worth, an alternative solution (presumably slower, but perhaps less ad hoc, depending on taste) is to treat it as a problem of parsing a subset of Erlang, for which there are tools:

%% Source must be a string (character list) holding a binary
%% literal, e.g. "<<131,104,1>>".
parse(Source) ->
  case erl_scan:string(Source++" .") of
    {ok, Tokens, _} ->
      case erl_parse:parse_term(Tokens) of
        {ok, Bin} when is_binary(Bin) -> % Only accept binary literals.
          Bin;
        _ -> error(badarg)
      end;
    _ -> error(badarg)
  end.

It may be overkill in this context, but it is no more code than the original solution.
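The function above takes a string that already contains the << >> delimiters, whereas the question's input is a binary of comma-separated numbers. As a rough sketch of how the two could be bridged (the module name bert_scan and the function to_bytes/1 are hypothetical, not part of the answer), the input can be wrapped before scanning:

```erlang
-module(bert_scan).
-export([to_bytes/1]).

%% Hypothetical helper: Source is a binary such as <<"131,104,1">>.
%% Wrap it in << >> so the scanner sees a binary literal, then
%% scan and parse it as an Erlang term.
to_bytes(Source) when is_binary(Source) ->
    Literal = "<<" ++ binary_to_list(Source) ++ ">> .",
    case erl_scan:string(Literal) of
        {ok, Tokens, _} ->
            case erl_parse:parse_term(Tokens) of
                %% Only accept binary literals.
                {ok, Bin} when is_binary(Bin) -> Bin;
                _ -> error(badarg)
            end;
        _ -> error(badarg)
    end.
```

The result can then be handed to binary_to_term as in the question.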


With this code you can parse up to 75 MB/s when compiled to native code (HiPE) and up to 17 MB/s in bytecode:

-module(str_to_bin).

-export([str_to_bin/1]).

str_to_bin(Bin) when is_binary(Bin) ->
  str_to_bin(Bin, <<>>).

%% Guard test: X is an ASCII digit.
-define(D(X), X >= $0, X =< $9 ).

%% Numeric value of an ASCII digit (its low four bits).
-define(C(X), (X band 2#1111)).

str_to_bin(<<X,Y,Z,Rest/binary>>, Acc)
    when ?D(X), ?D(Y), ?D(Z) ->
  str_to_bin_(Rest, <<Acc/binary, (?C(X)*100 + ?C(Y)*10 + ?C(Z))>>);
str_to_bin(<<Y,Z,Rest/binary>>, Acc)
    when ?D(Y), ?D(Z) ->
  str_to_bin_(Rest, <<Acc/binary, (?C(Y)*10 + ?C(Z))>>);
str_to_bin(<<Z,Rest/binary>>, Acc)
    when ?D(Z) ->
  str_to_bin_(Rest, <<Acc/binary, ?C(Z)>>).

-compile({inline, [str_to_bin_/2]}).

str_to_bin_(<<>>, Acc) -> Acc;
str_to_bin_(<<$,, Rest/binary>>, Acc) -> str_to_bin(Rest, Acc).
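Throughput figures like the ones quoted above depend on the machine and the OTP release, so it may be worth measuring locally. The following is a minimal timing sketch, not the method used to obtain those numbers; the module name parse_bench and its function are assumptions made for illustration.

```erlang
-module(parse_bench).
-export([mb_per_sec/3]).

%% Parser is any fun taking one binary; Runs is the repeat count.
%% Returns throughput in MB/s, counting 1 MB as 1,000,000 bytes.
mb_per_sec(Parser, Input, Runs)
  when is_function(Parser, 1), is_binary(Input),
       is_integer(Runs), Runs > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Parser, Input, Runs) end),
    %% Bytes per microsecond is the same as MB per second.
    (byte_size(Input) * Runs) / max(Micros, 1).

%% Call the parser N times, discarding the results.
repeat(_Parser, _Input, 0) -> ok;
repeat(Parser, Input, N) ->
    _ = Parser(Input),
    repeat(Parser, Input, N - 1).
```

For example, parse_bench:mb_per_sec(fun str_to_bin:str_to_bin/1, Input, 100000) would time the module above on a given Input binary.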