I have a neural network with architecture 1024, 512, 256, 1 (the input layer has 1024 units, the output layer has 1 unit, etc.). I would like to train this network using one of the optimization algorithms in scipy.optimize.
The problem is that these algorithms expect the function parameters to be given as a single vector; in my case this means I have to unroll all the weights into one vector of length
1024*512 + 512*256 + 256*1 = 655616
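For concreteness, here is a minimal sketch of the unrolling I mean. The helper names pack and unpack are my own, not part of scipy, and biases are left out to match the count above.

```python
import numpy as np

# Layer sizes from the question: 1024 -> 512 -> 256 -> 1 (no biases).
layer_sizes = [1024, 512, 256, 1]
shapes = list(zip(layer_sizes[:-1], layer_sizes[1:]))

def pack(weights):
    """Concatenate per-layer weight matrices into one flat parameter vector."""
    return np.concatenate([w.ravel() for w in weights])

def unpack(theta):
    """Split a flat parameter vector back into per-layer weight matrices."""
    weights, start = [], 0
    for n_in, n_out in shapes:
        size = n_in * n_out
        weights.append(theta[start:start + size].reshape(n_in, n_out))
        start += size
    return weights

rng = np.random.default_rng(0)
weights = [0.01 * rng.standard_normal(shape) for shape in shapes]
theta = pack(weights)
print(theta.size)  # 655616
```

The objective passed to the optimizer would then call unpack on its argument before running the forward pass.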
Some algorithms (for example, fmin_bfgs) need to use identity matrices, so they make a call like
I = numpy.eye(655616)
which, not surprisingly, raises a MemoryError. Is there any way to avoid having to unroll all the weights into one vector, short of adapting the algorithms in scipy.optimize to my own needs?
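To put a number on that MemoryError, here is a rough estimate of the size of that dense identity matrix, assuming numpy.eye's default float64 dtype (8 bytes per entry):

```python
import numpy as np

# Total number of weights, as computed in the question.
n_params = 1024 * 512 + 512 * 256 + 256 * 1

# numpy.eye returns float64 by default, i.e. 8 bytes per entry.
bytes_per_entry = np.dtype(np.float64).itemsize

# A dense n x n matrix stores n**2 entries.
dense_bytes = n_params ** 2 * bytes_per_entry
print(f"{dense_bytes / 1e12:.2f} TB")  # 3.44 TB
```

So the matrix alone would take several terabytes, no matter how the weights are laid out in the vector.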