[UDL Study Notes] Ch 6 - Fitting models

Overview

This series of posts is a set of study notes recording what I learn from the book "Understanding Deep Learning".

This post covers Chapter 6, Fitting models.

Understanding Deep Learning

https://udlbook.github.io/udlbook/

1. Visualizing the loss function

In the Interactive Figures linked above (section 6.4), I was impressed to see a model's loss function drawn as a picture on a plane.

Normally the loss has to be computed from the model parameters and the training data using a measure such as MSE, which makes it hard to picture. With the parameters laid out on a 2D plane and the loss value shown as brightness, I could grasp the shape of the loss function at a glance.

The visualization also helped with the main topic of Chapter 6, model fitting: I could see training as a process that moves the parameters toward lower loss.
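To make this concrete for myself, here is a minimal sketch of how such a picture could be produced. It assumes a two-parameter line y = phi0 + phi1 * x and a small made-up dataset; the data, the grid ranges, and the function names are my own illustration, not taken from the book. It evaluates the MSE at every point of a parameter grid and prints a crude text "heat map".

```python
# Hypothetical toy data, roughly following y = 1 + 2x (not from the book).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1]

def mse_loss(phi0, phi1):
    """Mean squared error of the line y = phi0 + phi1 * x on the toy data."""
    return sum((phi0 + phi1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def loss_grid(phi0_values, phi1_values):
    """Evaluate the loss at every (phi0, phi1) pair; rows index phi1, columns index phi0."""
    return [[mse_loss(p0, p1) for p0 in phi0_values] for p1 in phi1_values]

phi0_values = [i * 0.5 for i in range(7)]  # intercepts 0.0 .. 3.0
phi1_values = [i * 0.5 for i in range(9)]  # slopes 0.0 .. 4.0
grid = loss_grid(phi0_values, phi1_values)

# Crude text heat map: denser characters mean lower loss.
shades = "#*+-. "
max_loss = max(max(row) for row in grid)
for row in grid:
    print("".join(
        shades[min(int(len(shades) * value / max_loss), len(shades) - 1)]
        for value in row
    ))
```

The densest characters cluster around phi0 = 1.0 and phi1 = 2.0, which is where the loss on this toy data is smallest. A real plot would use a finer grid and a plotting library, but the idea is the same: one loss value per parameter pair.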

2. Batch & Epoch

Until now, I had used the terms Batch and Epoch when writing deep learning models without fully understanding them. I only knew how to use them.

Reading the "Batches and epochs" section of this chapter finally cleared them up.

A Batch is a subset of the training dataset that is used to compute the gradient for one step of Stochastic Gradient Descent. An Epoch is one full pass through the training dataset, which is complete once every Batch has been used.

So I now know the actual meaning of terms that I had only known from experience.
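The relationship between the two terms can be sketched in a few lines. This is only an illustration under my own assumptions: the dataset is a list of ten integers standing in for training examples, `iterate_minibatches` is a hypothetical helper name, and the actual gradient step is left as a comment.

```python
import random

def iterate_minibatches(dataset, batch_size, rng):
    """Yield the dataset as shuffled, non-overlapping batches (one full pass)."""
    indices = list(range(len(dataset)))
    rng.shuffle(indices)  # batches are drawn without replacement
    for start in range(0, len(indices), batch_size):
        yield [dataset[i] for i in indices[start:start + batch_size]]

dataset = list(range(10))  # hypothetical training set of 10 examples
rng = random.Random(0)
num_epochs, batch_size = 3, 4

iterations = 0
for epoch in range(num_epochs):
    seen = []
    for batch in iterate_minibatches(dataset, batch_size, rng):
        # A real training loop would compute the gradient on `batch`
        # and update the parameters here.
        seen.extend(batch)
        iterations += 1
    # One epoch means every example has been used exactly once.
    assert sorted(seen) == dataset

print(iterations)  # 9: three batches per epoch (sizes 4, 4, 2) times three epochs
```

With 10 examples and a batch size of 4, each epoch consists of three parameter updates, and the last batch is smaller than the others.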

3. Evolution of optimizers

Chapter 6 introduces several types of optimizers.

When I first read the Gradient Descent part, it seemed like the natural way to fit a model. After reading about optimizers such as SGD and Adam, though, I could follow the history of which problems each method ran into and how later methods addressed them.

Seeing this progression, I thought that researchers who find what can be improved in current methods and then improve it are truly impressive, and that I should learn from them.
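To compare the update rules side by side, here is a minimal sketch of one step of gradient descent, momentum, and Adam. The loss L(phi) = phi[0]**2 + 10 * phi[1]**2 is a hypothetical example of my own, chosen because it is much steeper in one direction than the other. The update formulas are one common way of writing these methods, and the default hyperparameters are typical values rather than anything prescribed here. The gradient is exact in this sketch; in SGD it would be computed on a batch instead.

```python
import math

def grad(phi):
    """Gradient of the hypothetical loss L(phi) = phi[0]**2 + 10 * phi[1]**2."""
    return [2.0 * phi[0], 20.0 * phi[1]]

def gd_step(phi, lr):
    """Plain gradient descent: step against the gradient."""
    return [p - lr * g for p, g in zip(phi, grad(phi))]

def momentum_step(phi, m, lr, beta=0.9):
    """Momentum: step along a running average of past gradients."""
    m = [beta * mi + (1 - beta) * gi for mi, gi in zip(m, grad(phi))]
    return [p - lr * mi for p, mi in zip(phi, m)], m

def adam_step(phi, m, v, t, lr, beta=0.9, gamma=0.999, eps=1e-8):
    """Adam: momentum plus per-parameter normalization by the gradient magnitude."""
    g = grad(phi)
    m = [beta * mi + (1 - beta) * gi for mi, gi in zip(m, g)]
    v = [gamma * vi + (1 - gamma) * gi * gi for vi, gi in zip(v, g)]
    # Bias correction: m and v start at zero, so early averages are too small.
    m_hat = [mi / (1 - beta ** t) for mi in m]
    v_hat = [vi / (1 - gamma ** t) for vi in v]
    phi = [p - lr * mh / (math.sqrt(vh) + eps) for p, mh, vh in zip(phi, m_hat, v_hat)]
    return phi, m, v

start = [1.0, 1.0]
zeros = [0.0, 0.0]
phi_gd = gd_step(start, lr=0.05)
phi_mom, _ = momentum_step(start, zeros, lr=0.05)
phi_adam, _, _ = adam_step(start, zeros, zeros, t=1, lr=0.05)
print(phi_gd, phi_mom, phi_adam)
```

On this first step, gradient descent moves ten times further in the steep direction than in the shallow one, while Adam moves both coordinates by roughly the learning rate. That difference is the kind of problem and fix that the chapter walks through.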

Reference

[1] Prince, S. J. D. (2023). Understanding Deep Learning. The MIT Press. Retrieved from http://udlbook.com