Working around memory problems in a network with many weights

I have a neural network with the architecture 1024, 512, 256, 1 (the input layer has 1024 units, the output layer has 1 unit, etc.). I would like to train this network using one of the optimization algorithms in scipy.optimize.

The problem is that these algorithms expect the function parameters to be given as a single vector; this means that in my case I have to unroll all the weights into a vector of length

1024*512 + 512*256 + 256*1 = 655616
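As an illustration of what unrolling means here, the following is a minimal sketch that packs the per-layer weight matrices into one flat vector and recovers them again. The names `layer_sizes`, `shapes`, `pack` and `unpack` are hypothetical helpers, not part of scipy.optimize, and biases are left out to match the count above.

```python
import numpy as np

# Layer sizes from the question: 1024 inputs, two hidden layers, 1 output.
layer_sizes = [1024, 512, 256, 1]
# One weight matrix per pair of adjacent layers (no biases, as in the question).
shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))
n_params = sum(rows * cols for rows, cols in shapes)


def unpack(theta):
    """Return the per-layer weight matrices as views into the flat vector."""
    matrices, start = [], 0
    for rows, cols in shapes:
        stop = start + rows * cols
        matrices.append(theta[start:stop].reshape(rows, cols))
        start = stop
    return matrices


def pack(matrices):
    """Concatenate per-layer weight matrices into one flat vector (a copy)."""
    return np.concatenate([w.ravel() for w in matrices])


theta = np.zeros(n_params)   # the single vector an optimizer would see
weights = unpack(theta)      # the matrices the network code would use
```

Because `unpack` slices and reshapes a contiguous vector, the matrices share memory with `theta`, so the flat vector itself costs no more than the weights do.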

Some algorithms (for example, fmin_bfgs) need to use identity matrices, so they make a call like

I = numpy.eye(655616)

which, not surprisingly, raises a MemoryError. Is there any way for me to avoid having to unroll all the weights into one vector, short of adapting the algorithms in scipy.optimize to my own needs?
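For a sense of scale, here is a rough back-of-the-envelope estimate, assuming float64 entries (the default dtype of numpy.eye). It compares the dense n x n matrix with the flat weight vector; the variable names are only illustrative.

```python
import numpy as np

n = 1024 * 512 + 512 * 256 + 256 * 1          # 655616 weights
itemsize = np.dtype(np.float64).itemsize      # 8 bytes per entry

dense_bytes = n * n * itemsize                # what numpy.eye(n) would allocate
vector_bytes = n * itemsize                   # the flat weight vector itself

print(f"dense n x n matrix: {dense_bytes / 1e12:.2f} TB")
print(f"flat weight vector: {vector_bytes / 1e6:.2f} MB")
```

The vector is only a few megabytes, while the dense matrix is on the order of terabytes, so the vector is not the cause of the MemoryError; the n x n matrix is.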

+5
2 answers

Don't try to fit the weights of an NN using L-BFGS. It doesn't work especially well (see the early Yann LeCun papers), and because it is a second-order method you would be approximating the Hessian, which for this many weights is roughly a 655,000 x 655,000 matrix: that introduces a performance overhead that simply will not be justified.

Is there a reason you are not using standard back-prop?

EDIT:

With backpropagation, the update for weight w_i at step t is:

w_i (t) = w_i (t-1) -\alpha (dError/dw_i)
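The update rule above can be sketched in NumPy as follows. This is only an illustrative sketch under stated assumptions, not the answerer's code: it assumes tanh hidden units, a linear output, a mean squared error, no biases, random toy data, and a learning rate `alpha = 0.01`; the helper names `forward` and `gradients` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [1024, 512, 256, 1]
# Small random initial weights, one matrix per layer (assumed initialisation).
weights = [rng.normal(0.0, 0.01, size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]


def forward(x, weights):
    """Return the activations of every layer (tanh hidden, linear output)."""
    activations = [x]
    for i, w in enumerate(weights):
        z = activations[-1] @ w
        is_output = i == len(weights) - 1
        activations.append(z if is_output else np.tanh(z))
    return activations


def gradients(x, y, weights):
    """Backpropagate the error; return (error, [dError/dW for each layer])."""
    acts = forward(x, weights)
    diff = acts[-1] - y
    error = 0.5 * np.mean(diff ** 2)
    delta = diff / len(x)                 # dError/d(output), output is linear
    grads = [None] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i] = acts[i].T @ delta
        if i > 0:
            # Push the error back through W_i and the tanh derivative.
            delta = (delta @ weights[i].T) * (1.0 - acts[i] ** 2)
    return error, grads


alpha = 0.01                              # assumed learning rate
x = rng.normal(size=(32, 1024))           # toy inputs
y = rng.normal(size=(32, 1))              # toy targets
errors = []
for step in range(20):
    error, grads = gradients(x, y, weights)
    errors.append(error)
    # w_i(t) = w_i(t-1) - alpha * dError/dw_i, applied in place per layer.
    for w, g in zip(weights, grads):
        w -= alpha * g
```

Each step only needs the gradients, which have the same shapes as the weight matrices, so no n x n matrix is ever formed.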

, , NNs: .

+1

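I think memory mapping could help here. Take a look at numpy.memmap, which is designed for numpy arrays. Memory mapping can be fast, but keep an eye out for thrashing.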
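A minimal sketch of how numpy.memmap might be used for a disk-backed array, here applied to the flat weight vector. The file name `weights.dat` and the temporary directory are assumptions made for the example.

```python
import os
import tempfile

import numpy as np

n_params = 655616
# Hypothetical file location for the example.
path = os.path.join(tempfile.mkdtemp(), "weights.dat")

# mode="w+" creates (or overwrites) the file and maps it for read/write.
theta = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_params,))
theta[:] = 0.0
theta[:10] = np.arange(10)
theta.flush()        # write pending changes to disk
del theta            # drop the mapping

# Reopen read-only: the data lives in the file and is paged in on access.
theta_ro = np.memmap(path, dtype=np.float64, mode="r", shape=(n_params,))
first = np.array(theta_ro[:10])   # copy a small slice into ordinary memory
```

Note that the mapped file is as large as the array itself, so a dense 655616 x 655616 float64 matrix would still need about 3.4 TB of disk, and access patterns that jump around such a file are where thrashing tends to show up.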

mmap , .

0
