|
decay: parameter for weight decay. Default 0.
Further information is available in the authors' book, Modern Applied Statistics with S (fourth edition), page 245:
One way to ensure that f is smooth is to restrict the class of estimates, for example, by using a limited number of spline knots. Another way is regularization in which the fit criterion is altered to
E + λC(f)
with a penalty C on the ‘roughness’ of f. Weight decay, specific to neural networks, uses as penalty the sum of squares of the weights w_ij. ... The use of weight decay seems both to help the optimization process and to avoid over-fitting. (emphasis added)
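
To make the quoted criterion concrete, here is a minimal Python sketch of E + λC(f) with C taken as the sum of squared weights. It is a toy one-weight least-squares model, not the fitting routine of any particular package; the names `penalized_loss` and `fit_single_weight`, the learning rate, and the data are all assumptions chosen for illustration.

```python
def penalized_loss(weights, error, decay=0.0):
    """Fit criterion E + decay * C, with C the sum of squared weights."""
    return error + decay * sum(w * w for w in weights)


def fit_single_weight(xs, ys, decay=0.0, lr=0.01, steps=5000):
    """Gradient descent on sum((y - w*x)^2) + decay * w^2 (hypothetical toy model)."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the squared error plus gradient of the decay penalty.
        grad = sum(-2.0 * x * (y - w * x) for x, y in zip(xs, ys)) + 2.0 * decay * w
        w -= lr * grad
    return w


# Made-up data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

w_plain = fit_single_weight(xs, ys, decay=0.0)  # no penalty
w_decay = fit_single_weight(xs, ys, decay=1.0)  # penalty shrinks the weight toward 0
```

With the default `decay=0.0` the penalty term vanishes and the fit recovers the unpenalized slope; a positive decay pulls the weight toward zero, which is the shrinkage the quoted passage describes.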
|