Today I want to write a short summary of how to reduce overfitting in neural networks. Here it goes:
- Weight decay
- Weight sharing
- Early stopping of training
- Model averaging
- Bayesian fitting of NN
- Dropout
- Generative pre-training
Short explanations of some of these points:
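- Weight decay: keep the weights small by penalizing large values.
- Weight sharing: insist that groups of weights stay equal or similar to each other.
- Early stopping: stop training before the NN fully memorizes the training set, typically when the error on a held-out validation set starts to rise.
- Model averaging: train several different models and average their predictions.
- Bayesian fitting: a slightly different use of model averaging, where the models are weighted according to some rules (their posterior probability given the data).
- Dropout: randomly omit hidden units during training, so that units cannot rely too much on each other.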
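
To make a few of these ideas concrete, here is a minimal sketch in plain Python covering weight decay, early stopping, dropout and model averaging. All function names and parameters (`sgd_step`, `early_stopping_epoch`, `dropout`, `average_predictions`, `patience`, `p_drop`) are made up for illustration and are not taken from any library. The dropout variant shown is the common "inverted" form, which is an assumption on my part about how it would usually be implemented.

```python
import random


def sgd_step(weights, grads, lr=0.1, weight_decay=0.01):
    """One gradient step with L2 weight decay.

    The extra `weight_decay * w` term pulls every weight toward zero,
    which is what keeps the weights small.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch whose weights we would keep.

    Training stops once the validation loss has not improved
    for `patience` epochs in a row.
    """
    best_epoch, best_loss = 0, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


def dropout(activations, p_drop=0.5, rng=random):
    """Inverted dropout for one layer of hidden activations.

    Each unit is zeroed with probability `p_drop`; the survivors are
    scaled by 1 / (1 - p_drop) so the expected activation is unchanged.
    Requires p_drop < 1.
    """
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def average_predictions(predictions_per_model):
    """Model averaging: mean prediction over models, per example."""
    n_models = len(predictions_per_model)
    return [sum(column) / n_models for column in zip(*predictions_per_model)]
```

In this sketch, dropout would only be applied during training; at prediction time the activations are passed through unchanged. Early stopping is shown on a precomputed list of validation losses to keep the example short, whereas a real training loop would check the loss after every epoch.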