Dealing with memory problems in a network with many weights

Question

I have a neural network with the architecture 1024, 512, 256, 1 (the input layer has 1024 units, the output layer has 1 unit, etc.). I would like to train this network using one of the optimization algorithms in scipy.optimize.

The problem is that these algorithms expect the function parameters to be given in one vector; this means that, in my case, I have to unroll all the weights into a vector of length

1024*512 + 512*256 + 256*1 = 655616
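For illustration, the unrolling described above might look like the following minimal sketch. The helper names `pack` and `unpack` are hypothetical (they are not part of scipy or numpy), and biases are left out to match the count in the question.

```python
import numpy as np

# Layer sizes from the question: 1024 inputs, two hidden layers, 1 output.
layer_sizes = [1024, 512, 256, 1]

# One weight matrix per pair of adjacent layers (no biases, as in the question's count).
shapes = [(n_in, n_out) for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
n_params = sum(rows * cols for rows, cols in shapes)


def pack(weights):
    """Flatten a list of weight matrices into one 1-D parameter vector."""
    return np.concatenate([w.ravel() for w in weights])


def unpack(theta):
    """Split a 1-D parameter vector back into per-layer weight matrices."""
    weights, start = [], 0
    for rows, cols in shapes:
        stop = start + rows * cols
        weights.append(theta[start:stop].reshape(rows, cols))
        start = stop
    return weights


rng = np.random.default_rng(0)
weights = [rng.standard_normal(shape) * 0.01 for shape in shapes]
theta = pack(weights)
print(n_params, theta.shape)
```

The vector itself is small (a few megabytes as float64); the trouble only starts when an algorithm builds a square matrix with that many rows and columns.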

Some algorithms (for example, fmin_bfgs) need to use identity matrices, so they make a call like

I = numpy.eye(655616)

which, unsurprisingly, produces a MemoryError. Is there any way for me to avoid having to unroll all the weights into one vector, short of adapting the algorithms in scipy.optimize to my own needs?
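To see why that call fails, here is a rough back-of-the-envelope estimate. It assumes the default float64 dtype of numpy.eye and counts only the raw array storage.

```python
import numpy as np

n = 1024 * 512 + 512 * 256 + 256 * 1    # number of weights in the network

# numpy.eye defaults to float64, i.e. 8 bytes per entry.
bytes_per_entry = np.dtype(np.float64).itemsize
dense_bytes = n * n * bytes_per_entry    # storage for a full n x n matrix
vector_bytes = n * bytes_per_entry       # storage for a single length-n vector

print(f"dense {n} x {n} matrix: {dense_bytes / 1e12:.2f} TB")
print(f"one length-{n} vector: {vector_bytes / 1e6:.2f} MB")
```

The dense matrix comes to several terabytes, while a single parameter vector needs only a few megabytes.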

+5

python numpy scipy machine-learning

Paul manta Mar 6 '13 at 10:49

2 answers

Ben Allison · Answer 1 · 2013-03-18T13:01:54+0000

Don't try to fit the weights of an NN using L-BFGS. It doesn't work especially well (see the early Yann LeCun papers), and because it is a second-order method you are going to be approximating the Hessian, which for this many weights is roughly a 655,000 x 655,000 matrix: this introduces a performance overhead that simply won't be justified.

Is there a reason you are avoiding standard back-prop?

EDIT:

Backpropagation gives the following update for weight w_i at step t:

w_i(t) = w_i(t-1) - \alpha \frac{\partial\,\mathrm{Error}}{\partial w_i}

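Here \alpha is the learning rate and the derivative term is the gradient of the error with respect to w_i. This first-order update is the usual way to train NNs.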
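As a rough illustration of the update rule above, here is a minimal numpy sketch of full-batch gradient descent with back-propagation. Everything specific in it is an assumption made for the example: the toy data, the small 8-16-1 network standing in for the 1024-512-256-1 one, the tanh activation, the squared-error loss, and the learning rate. The point is only that the weights stay in their per-layer matrices and no Hessian-sized array is ever built.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (hypothetical): 200 samples, 8 features, one regression target.
X = rng.standard_normal((200, 8))
w_true = rng.standard_normal((8, 1))
y = np.tanh(X @ w_true)

# Small stand-in network: 8 -> 16 -> 1, no biases.
W1 = rng.standard_normal((8, 16)) * 0.1
W2 = rng.standard_normal((16, 1)) * 0.1
alpha = 0.05  # learning rate (assumed value)


def loss_and_grads(W1, W2):
    """Half mean squared error and its gradients via back-propagation."""
    h = np.tanh(X @ W1)                     # hidden activations
    err = h @ W2 - y                        # prediction error
    loss = 0.5 * np.mean(err ** 2)
    d_out = err / len(X)                    # dLoss/dPrediction
    gW2 = h.T @ d_out                       # gradient for the output weights
    d_h = (d_out @ W2.T) * (1.0 - h ** 2)   # propagate back through tanh
    gW1 = X.T @ d_h                         # gradient for the hidden weights
    return loss, gW1, gW2


initial_loss, _, _ = loss_and_grads(W1, W2)
for step in range(500):
    loss, gW1, gW2 = loss_and_grads(W1, W2)
    # w_i(t) = w_i(t-1) - alpha * dError/dw_i, applied to every weight at once
    W1 -= alpha * gW1
    W2 -= alpha * gW2

final_loss, _, _ = loss_and_grads(W1, W2)
print(initial_loss, final_loss)
```

Memory use here scales with the number of weights (one gradient array per weight matrix), not with its square.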

blz · Answer 2 · 2013-03-18T09:32:13+0000

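Memory mapping may solve your problem. Have a look at numpy.memmap, which is designed for numpy arrays. Memory mapping can be fast, but keep an eye out for thrashing.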

The mmap module in the standard library also provides memory mapping.
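A minimal sketch of numpy.memmap, assuming a scratch file in a temporary directory (the path and file name are made up for the example). It stores a parameter vector of the size from the question on disk and reads it back.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "weights.dat")

n_params = 1024 * 512 + 512 * 256 + 256 * 1

# mode="w+" creates (or overwrites) the file and maps it for reading and writing.
theta = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_params,))
theta[:] = 0.0
theta[:10] = np.arange(10)   # indexing works like an ordinary ndarray
theta.flush()                # write pending changes to disk
del theta                    # drop the mapping

# Reopen read-only; data is paged in from disk on access.
restored = np.memmap(path, dtype=np.float64, mode="r", shape=(n_params,))
print(restored[:10], restored.nbytes)
```

Note that memory mapping only moves the storage to disk. A dense 655616 x 655616 float64 array would still need several terabytes of disk space, and any algorithm that touches all of it repeatedly is where the thrashing mentioned above would show up.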
