Neural networks: should the training set and the validation set be standardized separately?

I have a 5-5-2 backpropagation neural network that I am training, and after reading this wonderful LeCun article, I started putting some of its ideas into practice.

I am currently evaluating it with a 10-fold cross-validation routine that I wrote myself, which basically looks like this:

for each epoch
  for each possible split (training, validation)
    train on the training part, validate on the validation part
  end
  compute the mean MSE across all k splits
end
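The "for each possible split" step can be sketched in C as follows. This is only an illustration, not my actual code: the names `fold_range`, `kfold_validation_range` and `is_training_sample` are hypothetical, and it assumes the samples were already shuffled and that each fold is a contiguous block of indices.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [begin, end) of the samples held out for validation. */
typedef struct {
    size_t begin;
    size_t end;
} fold_range;

/* Validation range of fold `fold` (0-based) when `n` samples are divided
 * into `k` contiguous folds. The first n % k folds get one extra sample,
 * so fold sizes differ by at most one. */
fold_range kfold_validation_range(size_t n, size_t k, size_t fold)
{
    assert(k > 0 && k <= n && fold < k);
    size_t base = n / k;
    size_t extra = n % k;
    fold_range r;
    r.begin = fold * base + (fold < extra ? fold : extra);
    r.end = r.begin + base + (fold < extra ? 1 : 0);
    return r;
}

/* A sample belongs to the training part of a split if its index lies
 * outside the validation range. */
int is_training_sample(fold_range validation, size_t i)
{
    return i < validation.begin || i >= validation.end;
}
```

With k = 10, each split then validates on roughly 10% of the samples and trains on the remaining 90%.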

My inputs and outputs are standardized (zero mean, unit variance), and I use the tanh activation function. The network implementation itself seems to work correctly: I used the same code to approximate the sin function, and it does so quite well.

Now the question is: do I have to standardize each training/validation set separately, or should I just standardize the entire data set once?

Note that if I do the latter, the network does not produce meaningful predictions, but I would prefer a more “theoretical” answer rather than just looking at the outputs.

By the way, I implemented it in C, but I am also comfortable with C++.

+3
2 answers

, . - , , . , . , , , .

, , (, , ), . , , (.. 10- 90% ).

+7

, , , , N (0,1) ?

, @bogatron, , , "", . ; , . / , - , .

, , . , . -.

+3
