Reading and saving arbitrary byte length integers from a file

I am trying to speed up the file parser that I wrote last year by doing the parsing and data accumulation in numpy. numpy's ability to define custom data structures and slurp data from a binary file into them looks like exactly what I need, except that some of the fields in these files are unsigned integers of non-standard length (for example, 6 bytes). Since I'm using Python 2.7, I wrote my own emulation of int.from_bytes to handle these fields, but if there is any way to read these fields as integers natively in numpy, that would obviously be much faster and preferable.

1 answer

NumPy does not support integers of arbitrary byte length, and using ctypes bit fields would be more trouble than it is worth.

I would suggest using vectorized slice assignment to copy your data into the next larger standard integer type (8 bytes for 6-byte fields):

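import numpy as np

buf = b"000000111111222222"  # three 6-byte big-endian integers
a = np.ndarray(len(buf), np.dtype('>i1'), buf)
e = np.zeros(len(buf) // 6, np.dtype('>i8'))
# Copy each value's three 16-bit words into the low six bytes of an 8-byte slot.
for i in range(3):
    e.view(dtype='>i2')[i + 1::4] = a.view(dtype='>i2')[i::3]
[hex(x) for x in e]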
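
The same zero-padding idea can probably be generalized to any field width up to 8 bytes. The following is only a minimal sketch, not part of the original answer: the helper name `read_uint_be` and the sample bytes are hypothetical, and it assumes the values are packed back to back in big-endian order and should be decoded as unsigned.

```python
import numpy as np

def read_uint_be(buf, nbytes):
    """Decode back-to-back big-endian unsigned integers of `nbytes` bytes each.

    Hypothetical helper for illustration; assumes 1 <= nbytes <= 8 and that
    len(buf) is a multiple of nbytes.
    """
    # One row per value, one column per byte.
    raw = np.frombuffer(buf, dtype=np.uint8).reshape(-1, nbytes)
    # Left-pad every row with zero bytes up to 8 bytes (big-endian layout).
    padded = np.zeros((raw.shape[0], 8), dtype=np.uint8)
    padded[:, 8 - nbytes:] = raw
    # Reinterpret each 8-byte row as a single big-endian uint64.
    return padded.view('>u8').ravel()

sample = (b"\x00\x00\x00\x00\x00\x01"
          b"\x00\x00\x00\x00\x01\x00"
          b"\xff\xff\xff\xff\xff\xff")
values = read_uint_be(sample, 6)  # holds 1, 256 and 2**48 - 1
```

This trades one extra copy into the padded array for not having to loop over 16-bit words, so it should also handle odd widths such as 3 or 5 bytes.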