[UDL Study Notes] Chapter 20 - Why does deep learning work?

Jzahnny
September 28, 2025

Tags
Deep Learning
UDL
Overparameterization
Shallow Neural Networks
Deep Neural Networks
Generalization

Overview

This series of posts is a set of study notes documenting my progress through the book "Understanding Deep Learning" [1].
This post covers Chapter 20, "Why does deep learning work?"

Shouldn't deep learning work in theory?

In theory, even a shallow neural network with enough hidden units can approximate any continuous function on a bounded region as closely as we like. In principle, such networks can also perform well with far fewer parameters than there are training examples.
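
To make the first claim concrete for myself, here is a minimal sketch that builds a one-hidden-layer ReLU network by hand so that it linearly interpolates a target function at evenly spaced points. The function names (`build_shallow_relu`, `shallow_net`, `max_error`) and the choice of `sin` on [0, 2π] are my own assumptions for illustration, not something taken from the book. The weights are constructed directly rather than trained, so this only illustrates expressiveness, not learning.

```python
import math

def relu(z):
    return max(0.0, z)

def build_shallow_relu(f, a, b, n_hidden):
    """Hand-build a shallow ReLU net that interpolates f at n_hidden + 1
    evenly spaced points on [a, b]. Returns (bias, weights, knots)."""
    knots = [a + (b - a) * i / n_hidden for i in range(n_hidden + 1)]
    values = [f(x) for x in knots]
    # Slope of each linear piece between neighbouring knots.
    slopes = [(values[i + 1] - values[i]) / (knots[i + 1] - knots[i])
              for i in range(n_hidden)]
    # Hidden unit i switches on at knot i; its output weight is the
    # change in slope needed at that knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1]
                             for i in range(1, n_hidden)]
    return values[0], weights, knots[:-1]

def shallow_net(x, bias, weights, knots):
    # y = bias + sum_i w_i * relu(x - knot_i)
    return bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

def max_error(f, a, b, n_hidden, n_test=2001):
    """Largest absolute error of the hand-built net on a dense test grid."""
    params = build_shallow_relu(f, a, b, n_hidden)
    xs = [a + (b - a) * i / (n_test - 1) for i in range(n_test)]
    return max(abs(f(x) - shallow_net(x, *params)) for x in xs)

if __name__ == "__main__":
    for d in (4, 16, 64):
        print(d, "hidden units -> max error", max_error(math.sin, 0.0, 2 * math.pi, d))
```

The error shrinks as hidden units are added, which is the intuition behind "shallow networks are expressive enough in theory". It says nothing about whether gradient descent would find these weights.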
 
In practice, however, deeper networks usually work better, and overparameterization appears to make both training and generalization much easier. The book still doesn't give a clear explanation of why this is the case.
 
My best guess is that, for the same number of parameters, deep networks can express a much more diverse set of functions than shallow ones. I wonder whether depth acts like an extra dimension: a shallow network is like drawing in two dimensions, while a deep network is like drawing in three. That raises another question: is there an axis of network design beyond width and depth? If such an axis exists, could it increase representational capacity even further than what we have now?
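
One way to put rough numbers on this guess is to compare how many linear regions a ReLU network can create per parameter. The sketch below uses the counting formulas as I remember them from Chapter 4 of the book [1], for networks with one input and one output: a shallow network with D hidden units has 3D + 1 parameters and at most D + 1 linear regions, while a deep network with K layers of D hidden units has 3D + 1 + (K - 1)D(D + 1) parameters and at most (D + 1)^K regions. Treat these as my reading of that chapter; the function names are my own. The region counts are upper bounds, so they hint at expressiveness rather than prove it.

```python
def shallow_params(d):
    """Parameter count: 1 input, d hidden units, 1 output."""
    return 3 * d + 1

def shallow_regions(d):
    """Upper bound on linear regions of the shallow network."""
    return d + 1

def deep_params(d, k):
    """Parameter count: 1 input, k hidden layers of d units, 1 output."""
    return 3 * d + 1 + (k - 1) * d * (d + 1)

def deep_regions(d, k):
    """Upper bound on linear regions of the deep network."""
    return (d + 1) ** k

def widest_shallow_within(budget):
    """Largest d such that a shallow network fits in the parameter budget."""
    return (budget - 1) // 3

if __name__ == "__main__":
    d, k = 10, 5  # assumed example sizes
    budget = deep_params(d, k)
    d_shallow = widest_shallow_within(budget)
    print("parameter budget:", budget)
    print("deep    (D=%d, K=%d): up to %d regions" % (d, k, deep_regions(d, k)))
    print("shallow (D=%d): up to %d regions" % (d_shallow, shallow_regions(d_shallow)))
```

With a similar parameter budget, the bound for the deep network grows exponentially in the number of layers, while the bound for the shallow network grows only linearly in width. That fits the "extra dimension" picture, although more regions does not automatically mean better generalization.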
 

References

[1] Prince, S. J. D. (2023). Understanding Deep Learning. The MIT Press. Retrieved from http://udlbook.com