[UDL Study Notes] Ch 2 - Supervised learning

Overview

This series of posts is a set of study notes recording my progress through the book “Understanding Deep Learning”.

This time, it covers Chapter 2, Supervised learning.

Understanding Deep Learning

https://udlbook.github.io/udlbook/

1. The Essence of “Learning”

As I read through the flow of supervised learning covered in Chapter 2, I was able to think about what “learning” means here.

“Training” here means finding the parameters, $\phi$, that minimize the loss function $L[\phi]$ for the model $y = f[x, \phi]$.

In the end, “learning” a model comes down to finding the right parameters for a pre-designed black box.

Previously, learning felt vague, but organizing it with equations like this seemed to solidify the concept of “learning.”
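To make this concrete for myself, here is a minimal sketch (my own, not code from the book) that fits the 1D linear model f[x, phi] = phi0 + phi1 * x by plain gradient descent on a sum-of-squares loss. The toy data, learning rate, and step count are made-up values chosen only for illustration.

```python
import numpy as np

def model(x, phi):
    """1D linear model f[x, phi] = phi[0] + phi[1] * x."""
    return phi[0] + phi[1] * x

def loss(phi, x, y):
    """Sum of squared differences between predictions and targets."""
    return np.sum((model(x, phi) - y) ** 2)

def train(x, y, lr=0.01, steps=5000):
    """Plain gradient descent on the least-squares loss (illustrative settings)."""
    phi = np.zeros(2)
    for _ in range(steps):
        residual = model(x, phi) - y
        # Gradient of the sum of squares with respect to phi[0] and phi[1].
        grad = np.array([2.0 * np.sum(residual), 2.0 * np.sum(residual * x)])
        phi = phi - lr * grad
    return phi

# Toy data generated from y = 1 + 2x (made-up values, no noise).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
phi_hat = train(x, y)
```

The model design (a straight line) is fixed in advance; the only thing training changes is phi, which is exactly the “finding the right parameters” view described above.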

2. Interactive Figure Experience

The UDL book’s PDF links to interactive figures: web pages where you can manipulate the values behind the graphs in the book, which makes them easier to understand.

UDL Interactive Figures

https://udlbook.github.io/udlfigures/

The link above leads to the Interactive Figure webpage.

It was very interesting to move and manipulate the graphs from the book, and doing so gave me a more intuitive understanding of the graphs and the concepts behind them.

3. Loss function and Cost function

The notes at the end of Chapter 2 discuss the difference between a loss function and a cost function, which I did not quite understand at first.

The book explains that a loss function refers to the mismatch for a single data point in the dataset, while a cost function refers to the overall mismatch across all data points in the dataset.

Therefore, I understood that if least squares is used, the loss function for each data point $(x_i, y_i)$ can be defined as $(f[x_i, \phi] - y_i)^2$, and the cost function can be defined as $L[\phi] = \sum_{i=1}^{I} (f[x_i, \phi] - y_i)^2$.
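The distinction is easy to see in code. In this small sketch (my own illustration, with made-up data and arbitrary parameters for a linear model), the per-point loss returns one squared error per data point, and the cost is simply their sum.

```python
import numpy as np

def per_point_loss(x, y, phi):
    """Squared error for each data point: (f[x_i, phi] - y_i)^2."""
    prediction = phi[0] + phi[1] * x
    return (prediction - y) ** 2

def cost(x, y, phi):
    """Total mismatch over the dataset: the sum of the per-point losses."""
    return np.sum(per_point_loss(x, y, phi))

# Made-up data and arbitrary parameters, for illustration only.
x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 4.0])
phi = np.array([1.0, 1.0])

losses = per_point_loss(x, y, phi)  # one value per data point: [0, 1, 1]
total = cost(x, y, phi)             # a single scalar: 2.0
```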

I also learned for the first time that any function to be minimized or maximized during learning, which includes both loss and cost functions, is generally called an objective function.

4. Generative model

The model represented by the formula $y = f[x, \phi]$ that we have discussed so far is called a discriminative model.

Conversely, a model that can be represented by the formula $x = g[y, \phi]$ is called a generative model, a term that was new to me.

I understood this as follows: a discriminative model predicts the output $y$ from the actual measurement $x$, whereas a generative model computes the measurement $x$ from the output $y$.
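A toy sketch of the generative direction, assuming a linear form x = phi0 + phi1 * y chosen here purely for illustration (the parameter values are hypothetical, and real generative models are far more elaborate). Because this g is a straight line with nonzero slope, it can also be inverted to recover y from x.

```python
def g(y, phi):
    """Generative direction: compute the measurement x from the output y."""
    return phi[0] + phi[1] * y

def g_inverse(x, phi):
    """Recover y from x by inverting g. Assumes phi[1] != 0."""
    return (x - phi[0]) / phi[1]

phi = (1.0, 2.0)            # hypothetical parameters
x = g(3.0, phi)             # measurement produced for the output y = 3.0
y_back = g_inverse(x, phi)  # inverting g recovers the original output
```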

Taking an image generation model as a real-world example of a generative model, I think of it as a model that, given a descriptive label $y$ for an image, “generates” a synthetic image $x$ that the discriminative model $y = f[x, \phi]$ would likely map back to that label.

Reference

[1] Prince, S. J. D. (2023). Understanding Deep Learning. The MIT Press. Retrieved from http://udlbook.com