Sums of subarrays

I have a 2d array of integers and I want to sum up 2d subarrays of it. Both arrays can have arbitrary sizes, although we can assume that the subarray will be an order of magnitude smaller than the full array.

The reference implementation in Python is trivial:

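import numpy as np

def sub_sums(arr, l, m):
    # One output cell per complete l x m block; incomplete edge blocks are dropped.
    result = np.zeros((len(arr) // l, len(arr[0]) // m))
    rows = len(arr) // l * l
    cols = len(arr[0]) // m * m
    for i in range(rows):
        for j in range(cols):
            # Add each element to the cell of the block that contains it.
            result[i // l, j // m] += arr[i, j]
    return result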

The question is how best to do this using numpy, ideally without any loops in Python. For 1d arrays, cumsum with r_ would work, and I could use that with a small loop to implement a 2d solution, but I'm still learning numpy and I'm almost sure there is a smarter way.

Example output:
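For reference, the 1d idea mentioned above can be sketched as follows. This is only an illustration under assumptions: the function name block_sums_1d is hypothetical, and trailing elements that do not fill a complete block are dropped, as in the reference implementation.

```python
import numpy as np

def block_sums_1d(a, k):
    """Sum consecutive length-k blocks of a 1d array, dropping any remainder."""
    a = np.asarray(a)
    n = len(a) // k * k                 # largest multiple of k that fits
    c = np.r_[0, np.cumsum(a[:n])]      # prefix sums: c[i] == a[:i].sum()
    return c[k::k] - c[:-k:k]           # each block sum is a difference of prefix sums

print(block_sums_1d(np.arange(10), 3))  # [ 3 12 21]
```

Applying this along one axis and then the other would give the 2d result, which is the "little loop" variant the question hopes to avoid.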

arr = np.asarray([range(0, 5),
                  range(4, 9),
                  range(8, 13),
                  range(12, 17)])
result = sub_sums(arr, 2, 2)

gives:

[[ 0  1  2  3  4]
 [ 4  5  6  7  8]
 [ 8  9 10 11 12]
 [12 13 14 15 16]]

[[ 10.  18.]
 [ 42.  50.]]
3 answers

There is a blockshaped function that does something pretty close to what you want:

In [81]: arr
Out[81]: 
array([[ 0,  1,  2,  3,  4],
       [ 4,  5,  6,  7,  8],
       [ 8,  9, 10, 11, 12],
       [12, 13, 14, 15, 16]])

In [82]: blockshaped(arr[:,:4], 2,2)
Out[82]: 
array([[[ 0,  1],
        [ 4,  5]],

       [[ 2,  3],
        [ 6,  7]],

       [[ 8,  9],
        [12, 13]],

       [[10, 11],
        [14, 15]]])

In [83]: blockshaped(arr[:,:4], 2,2).shape
Out[83]: (4, 2, 2)

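Once you have the blockshaped array, you can get the desired result by reshaping it (so that the numbers of each block lie along the last axis) and then calling sum on that axis.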
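Note that blockshaped is not defined in this answer and is not part of NumPy. A minimal sketch of such a helper, assuming it simply reshapes and swaps axes in the same way as the code further down, might look like this (it also assumes the array dimensions are exact multiples of the block size):

```python
import numpy as np

def blockshaped(arr, nrows, ncols):
    """Split a 2d array into blocks of shape (nrows, ncols).

    Assumed helper, not a NumPy function. Requires arr.shape to be
    divisible by (nrows, ncols). Returns shape (n_blocks, nrows, ncols).
    """
    h, w = arr.shape
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(-1, nrows, ncols))

arr = np.asarray([range(0, 5),
                  range(4, 9),
                  range(8, 13),
                  range(12, 17)])

blocks = blockshaped(arr[:, :4], 2, 2)
# Flatten each block along the last axis, then sum over that axis.
block_sums = blocks.reshape(len(blocks), -1).sum(axis=-1)
print(block_sums)  # [10 18 42 50]
```

The sums come out as a flat array, one entry per block in row-major order; reshaping them to (2, 2) gives the layout from the question.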

So, with a slight modification of blockshaped, sub_sums can be defined like this:

import numpy as np

def sub_sums(arr, nrows, ncols):
    h, w = arr.shape
    h = (h // nrows)*nrows
    w = (w // ncols)*ncols
    arr = arr[:h,:w]
    return (arr.reshape(h // nrows, nrows, -1, ncols)
               .swapaxes(1, 2)
               .reshape(h // nrows, w // ncols, -1).sum(axis=-1))

arr = np.asarray([range(0, 5),
                  range(4, 9),
                  range(8, 13),
                  range(12, 17)])

print(sub_sums(arr, 2, 2))

[[10 18]
 [42 50]]

Edit: Ophion shows that this can be improved with np.einsum:

def sub_sums_ophion(arr, nrows, ncols):
    h, w = arr.shape
    h = (h // nrows)*nrows
    w = (w // ncols)*ncols
    arr = arr[:h,:w]
    return np.einsum('ijkl->ik', arr.reshape(h // nrows, nrows, -1, ncols))

In [105]: %timeit sub_sums(arr, 2, 2)
10000 loops, best of 3: 112 µs per loop

In [106]: %timeit sub_sums_ophion(arr, 2, 2)
10000 loops, best of 3: 76.2 µs per loop
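A related variant, not taken from the answer above, skips the swapaxes step and sums directly over the two within-block axes of the 4d view. The name sub_sums_axes is hypothetical, and no timing is claimed for it:

```python
import numpy as np

def sub_sums_axes(arr, nrows, ncols):
    """Block sums via a 4d reshape and a sum over the within-block axes."""
    h, w = arr.shape
    h = (h // nrows) * nrows   # trim rows that do not fill a block
    w = (w // ncols) * ncols   # trim columns that do not fill a block
    # Axes: (block row, row inside block, block column, column inside block)
    blocks = arr[:h, :w].reshape(h // nrows, nrows, w // ncols, ncols)
    return blocks.sum(axis=(1, 3))

arr = np.asarray([range(0, 5),
                  range(4, 9),
                  range(8, 13),
                  range(12, 17)])

print(sub_sums_axes(arr, 2, 2))
# [[10 18]
#  [42 50]]
```

Summing over axes 1 and 3 is the same contraction that 'ijkl->ik' expresses in the einsum version.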

Perhaps the simplest approach is to loop over the blocks and call sum() on each slice:

def sub_sums(arr, l, m):
    result = np.zeros((len(arr) // l, len(arr[0]) // m))
    for i in range(len(arr) // l):
        for j in range(len(arr[0]) // m):
            # Block (i, j) spans l rows and m columns.
            result[i, j] = arr[i*l:(i+1)*l, j*m:(j+1)*m].sum()
    return result

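In my tests it is slower for 2x2 blocks, about the same for 3x3, and faster for larger blocks (sub_sums2 is the reference version from the question):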

In [19]: arr = np.asarray([range(100)] * 100)

In [20]: %timeit sub_sums(arr, 2, 2)
10 loops, best of 3: 21.8 ms per loop

In [21]: %timeit sub_sums2(arr, 2, 2)
100 loops, best of 3: 9.56 ms per loop

In [22]: %timeit sub_sums(arr, 3, 3)
100 loops, best of 3: 9.58 ms per loop

In [23]: %timeit sub_sums2(arr, 3, 3)
100 loops, best of 3: 9.36 ms per loop

In [24]: %timeit sub_sums(arr, 4, 4)
100 loops, best of 3: 5.58 ms per loop

In [25]: %timeit sub_sums2(arr, 4, 4)
100 loops, best of 3: 9.56 ms per loop

In [26]: %timeit sub_sums(arr, 10, 10)
1000 loops, best of 3: 939 us per loop

In [27]: %timeit sub_sums2(arr, 10, 10)
100 loops, best of 3: 9.48 ms per loop
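To reproduce this kind of comparison outside IPython, a rough sketch with the standard timeit module could look like the following. The function names sub_sums_elementwise and sub_sums_blocks are hypothetical stand-ins for the two loop versions, and the absolute timings will depend on the machine:

```python
import timeit
import numpy as np

def sub_sums_elementwise(arr, l, m):
    """Reference version: visit every element once."""
    result = np.zeros((len(arr) // l, len(arr[0]) // m))
    for i in range(len(arr) // l * l):
        for j in range(len(arr[0]) // m * m):
            result[i // l, j // m] += arr[i, j]
    return result

def sub_sums_blocks(arr, l, m):
    """Block version: one slice and one sum() call per block."""
    result = np.zeros((len(arr) // l, len(arr[0]) // m))
    for i in range(len(arr) // l):
        for j in range(len(arr[0]) // m):
            result[i, j] = arr[i*l:(i+1)*l, j*m:(j+1)*m].sum()
    return result

arr = np.asarray([range(100)] * 100)

for size in (2, 3, 4, 10):
    # Both versions must agree before their timings are worth comparing.
    assert np.array_equal(sub_sums_elementwise(arr, size, size),
                          sub_sums_blocks(arr, size, size))
    t_elem = timeit.timeit(lambda: sub_sums_elementwise(arr, size, size), number=3)
    t_block = timeit.timeit(lambda: sub_sums_blocks(arr, size, size), number=3)
    print(f"{size}x{size}: elementwise {t_elem / 3 * 1e3:.2f} ms, "
          f"blocks {t_block / 3 * 1e3:.2f} ms")
```

The element-wise version does the same amount of work for every block size, while the block version does one iteration per block, which matches the pattern in the timings above.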

As you can see, with 10x10 sub-arrays it is about 10 times faster, while with 2x2 sub-arrays it is more than twice as slow.

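It still uses Python for loops, so a pure numpy solution would probably be better, but for sub-arrays larger than 3x3 it is already faster than the reference version.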


Here is an easier way:

In [160]: import numpy as np

In [161]: arr = np.asarray([range(0, 5),
                            range(4, 9),
                            range(8, 13),
                            range(12, 17)])

In [162]: np.add.reduceat(arr, [0], axis=1)
Out[162]: 
array([[10],
       [30],
       [50],
       [70]])

In [163]: arr
Out[163]: 
array([[ 0,  1,  2,  3,  4],
       [ 4,  5,  6,  7,  8],
       [ 8,  9, 10, 11, 12],
       [12, 13, 14, 15, 16]])
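The call above passes [0] as the only index, so it sums each complete row. To get the block sums from the question, one possible sketch (the name sub_sums_reduceat is hypothetical) passes the start index of every block and reduces along each axis in turn, after trimming the incomplete edge blocks:

```python
import numpy as np

def sub_sums_reduceat(arr, l, m):
    """Block sums using np.add.reduceat along each axis in turn."""
    h = arr.shape[0] // l * l        # rows covered by complete blocks
    w = arr.shape[1] // m * m        # columns covered by complete blocks
    trimmed = arr[:h, :w]
    # Sum groups of l rows, starting at rows 0, l, 2*l, ...
    row_sums = np.add.reduceat(trimmed, np.arange(0, h, l), axis=0)
    # Then sum groups of m columns, starting at columns 0, m, 2*m, ...
    return np.add.reduceat(row_sums, np.arange(0, w, m), axis=1)

arr = np.asarray([range(0, 5),
                  range(4, 9),
                  range(8, 13),
                  range(12, 17)])

print(sub_sums_reduceat(arr, 2, 2))
# [[10 18]
#  [42 50]]
```

Trimming first matters here: without it, the last reduceat segment would run to the end of the axis and absorb the leftover column.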
