Why Does Regularization Reduce Overfitting in Deep Neural Networks?
Why does regularization help with overfitting? Why does it help with reducing variance problems? Let's go through a couple of examples to gain some intuition about how it works.

So, recall our high bias, high variance, and "just right" pictures from the earlier video, which looked something like this. Now, let's say we're fitting a large and deep neural network. I know I haven't drawn this one too large or too deep, but let's say this is some neural network, and it's currently overfitting.

So you have some cost function, J(W, b), which equals the sum of the losses. What we did for regularization was add this extra term that penalizes the weight matrices for being too large. That was the Frobenius norm. So why is it that shrinking the L2 norm, or the Frobenius norm, of the parameters might cause less overfitting?

One piece of intuition is that if you crank the regularization parameter lambda to be really, really big, you'll be really incentivized to set the weight matrices W to be reasonably close to zero. So maybe it sets the weights so close to zero for a lot of hidden units that it's basically zeroing out a lot of the impact of those hidden units. And if that's the case, then this much simplified neural network becomes a much smaller neural network...
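To make this intuition a bit more concrete, here is a minimal NumPy sketch, not taken from the lecture. It assumes a tiny one-hidden-layer tanh network, a squared-error loss averaged over m examples, and a penalty of the form lambda/(2m) times the sum of squared Frobenius norms. The data, the names `frobenius_penalty` and `train`, and all hyperparameters are hypothetical choices made only for illustration. It trains the same network with different values of lambda and reports the overall size of the weight matrices.

```python
import numpy as np

def frobenius_penalty(weights, lam, m):
    """Assumed penalty form: (lam / (2m)) * sum of squared Frobenius norms."""
    return lam / (2 * m) * sum(np.sum(W ** 2) for W in weights)

def train(lam, steps=2000, lr=0.05, hidden=4, seed=0):
    """Gradient descent on a tiny tanh network with an L2 (Frobenius) penalty."""
    rng = np.random.default_rng(seed)
    m = 40
    X = rng.uniform(-1, 1, size=(1, m))                 # toy inputs, shape (1, m)
    Y = np.sin(3 * X) + 0.1 * rng.normal(size=(1, m))   # noisy toy targets
    W1 = 0.1 * rng.normal(size=(hidden, 1))
    b1 = np.zeros((hidden, 1))
    W2 = 0.1 * rng.normal(size=(1, hidden))
    b2 = np.zeros((1, 1))
    for _ in range(steps):
        # Forward pass
        A1 = np.tanh(W1 @ X + b1)
        Y_hat = W2 @ A1 + b2
        # Backward pass for J = (1/2m) * sum((Y_hat - Y)^2) + penalty
        dZ2 = (Y_hat - Y) / m
        dW2 = dZ2 @ A1.T + (lam / m) * W2    # extra term comes from the penalty
        db2 = dZ2.sum(axis=1, keepdims=True)
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
        dW1 = dZ1 @ X.T + (lam / m) * W1     # biases are not penalized
        db1 = dZ1.sum(axis=1, keepdims=True)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, W2

# Overall weight size (square root of the summed squared Frobenius norms)
norms = {}
for lam in (0.0, 1.0, 100.0):
    W1, W2 = train(lam)
    norms[lam] = float(np.sqrt(np.sum(W1 ** 2) + np.sum(W2 ** 2)))
    print(f"lambda = {lam:6.1f}   total weight norm = {norms[lam]:.4f}")
```

With a very large lambda the penalty dominates the gradient, so the weights are pulled toward zero and the hidden units contribute almost nothing to the output, which is the "much smaller network" picture described above.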