6️⃣UML Chapter 6
This post contains my reading notes on Chapter 6 of the well-known learning theory textbook "Understanding Machine Learning: From Theory to Algorithms".

📝 The VC-Dimension

🖼️ Big Picture

 
The goal of this chapter is to characterize which hypothesis classes are PAC learnable. Earlier chapters showed that every finite hypothesis class is PAC learnable, while the class of all functions (over an infinite-size domain) is not; in fact, some hypothesis classes of infinite size are learnable as well.
This is what motivates introducing the VC-dimension.

An example

The example is the class of threshold functions on the real line, $H = \{h_a : a \in \mathbb{R}\}$ with $h_a(x) = \mathbb{1}_{[x < a]}$. Note that the book proves plain PAC learnability here, so the realizability assumption holds: the positive and negative examples can always be perfectly separated by some threshold.
Proof sketch
  • Let $a^*$ be a threshold with $L_{\mathcal{D}}(h_{a^*}) = 0$ (it exists by realizability).
  • Choose $a_0 < a^* < a_1$ so that each of the intervals $(a_0, a^*)$ and $(a^*, a_1)$ has probability mass exactly $\epsilon$ under the marginal distribution of the data.
  • Let $b_0$ be the largest x among the positive examples in S, and $b_1$ the smallest x among the negative examples.
  • If $b_0 \ge a_0$ and $b_1 \le a_1$, then the threshold $b_S$ returned by ERM lies in $(b_0, b_1]$ and hence in $[a_0, a_1]$, so $L_{\mathcal{D}}(h_{b_S}) \le \epsilon$.
  • Taking the complementary events and applying a union bound completes the proof, as spelled out below.
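To spell out the last step (a sketch following the book's argument, with $a_0, a_1, b_0, b_1$ as above): the event $b_0 < a_0$ means that no sample point fell in $(a_0, a^*)$, an interval of mass $\epsilon$, and similarly for $b_1 > a_1$, so

$$
\Pr_{S \sim \mathcal{D}^m}\big[ L_{\mathcal{D}}(h_S) > \epsilon \big]
\;\le\; \Pr[b_0 < a_0] + \Pr[b_1 > a_1]
\;\le\; 2(1-\epsilon)^m \;\le\; 2e^{-\epsilon m}.
$$

Requiring $2e^{-\epsilon m} \le \delta$ then gives the sample complexity $m \ge \frac{\log(2/\delta)}{\epsilon}$.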

VC-dimension

Restriction of H to C

That is, we restrict the domain of each hypothesis h ∈ H to the set C.
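Written out, for $C = \{c_1, \dots, c_m\} \subset X$, the restriction is the set of label vectors that hypotheses in H can produce on C:

$$
H_C = \big\{ (h(c_1), \dots, h(c_m)) : h \in H \big\}.
$$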

Shattering

A hypothesis class H shatters a finite set C ⊂ X if the restriction of H to C is the set of all functions from C to {0, 1}, that is, $|H_C| = 2^{|C|}$.
In other words, every possible labeling of the points of C with {0, 1} is realized by some hypothesis in H. For example, the threshold class shatters any single point, but no set of two points, since the labeling that assigns 0 to the smaller point and 1 to the larger one is never realized.
Corollary 6.4: If there exists a set C ⊂ X of size 2m that is shattered by H, then for any learning algorithm A there exist a distribution $\mathcal{D}$ and a predictor h ∈ H with $L_{\mathcal{D}}(h) = 0$, yet with probability at least 1/7 over $S \sim \mathcal{D}^m$ we have $L_{\mathcal{D}}(A(S)) \ge 1/8$. In other words, H cannot be PAC learned from m examples.

VC-dimension

The VC-dimension of a hypothesis class H, denoted VCdim(H), is the maximal size of a set C ⊂ X that can be shattered by H. If H shatters sets of arbitrarily large size, we say it has infinite VC-dimension.
Note that a single witnessing set suffices: to have VCdim(H) ≥ d we only need some set C of size d that is shattered, not that every set of size d is shattered.
 
A hypothesis class with infinite VC-dimension is not PAC learnable; conversely, a finite VC-dimension guarantees learnability.
 
Steps for determining that VCdim(H) = d (a brute-force check for small cases is sketched after this list):
  • There exists a set C of size d that is shattered by H.
  • No set C of size d + 1 is shattered by H.
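As a quick sanity check for small classes, shattering can be tested by brute force: enumerate the labelings a candidate set receives from the hypotheses and see whether all $2^{|C|}$ of them appear. A minimal Python sketch (the `is_shattered` helper and the discretized threshold class below are my own illustration, not code from the book):

```python
def is_shattered(points, hypotheses):
    """Return True if every {0,1}-labeling of `points` is realized
    by at least one hypothesis (a callable mapping x to 0 or 1)."""
    realized = {tuple(h(x) for x in points) for h in hypotheses}
    return len(realized) == 2 ** len(points)

# Threshold class h_a(x) = 1[x < a], with thresholds on a coarse grid.
thresholds = [i / 10 for i in range(-10, 21)]
H = [lambda x, a=a: int(x < a) for a in thresholds]

print(is_shattered([0.5], H))       # True: a single point is shattered
print(is_shattered([0.3, 0.7], H))  # False: the labeling (0, 1) is never realized
```

No pair of points can be labeled 0 on the smaller point and 1 on the larger one by a threshold, so the second check fails for every pair, which is exactly why this class has VC-dimension 1.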

Examples

| Hypothesis Class | VC Dimension |
| --- | --- |
| threshold functions | 1 |
| intervals | 2 |
| axis-aligned rectangles | 4 |

Finite class

For a finite hypothesis class H we always have VCdim(H) ≤ $\log_2(|H|)$, since shattering a set of size d requires at least $2^d$ distinct hypotheses.

In the examples above, the VC-dimension appears to track the number of parameters of the class, but this is not true in general: a class parameterized by a single real number can still have infinite VC-dimension.

🌟 The Fundamental Theorem of PAC learning

Let H be a hypothesis class of functions from a domain X to {0, 1} and let the loss function be the 0−1 loss. Then, the following are equivalent:
  1. H has the uniform convergence property.
  2. Any ERM rule is a successful agnostic PAC learner for H.
  3. H is agnostic PAC learnable.
  4. H is PAC learnable.
  5. Any ERM rule is a successful PAC learner for H.
  6. H has a finite VC-dimension.
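The book also gives a quantitative version of this theorem (Theorem 6.8), which ties the sample complexity to d = VCdim(H); up to absolute constants $C_1, C_2$ it reads roughly as follows:

$$
C_1\,\frac{d + \log(1/\delta)}{\epsilon^2} \;\le\; m_H(\epsilon,\delta) \;\le\; C_2\,\frac{d + \log(1/\delta)}{\epsilon^2}
\quad \text{(agnostic PAC)},
$$

$$
C_1\,\frac{d + \log(1/\delta)}{\epsilon} \;\le\; m_H(\epsilon,\delta) \;\le\; C_2\,\frac{d\,\log(1/\epsilon) + \log(1/\delta)}{\epsilon}
\quad \text{(realizable PAC)}.
$$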

Growth function

The growth function measures the maximal "effective" size of H on a set of m examples.
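Formally,

$$
\tau_H(m) \;=\; \max_{C \subset X,\; |C| = m} \big| H_C \big|.
$$

If VCdim(H) = d, then for every m ≤ d we have $\tau_H(m) = 2^m$.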

Sauer’s Lemma

Sauer's lemma says that once m exceeds the VC-dimension, the growth function increases only polynomially, rather than exponentially, with m.
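Concretely, the lemma states that if VCdim(H) ≤ d < ∞, then for all m,

$$
\tau_H(m) \;\le\; \sum_{i=0}^{d} \binom{m}{i},
$$

and in particular $\tau_H(m) \le (em/d)^d$ once $m > d + 1$.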
The proof is omitted here for now.

🤗 Summary

🪄
The fundamental theorem of PAC learning states that a class of binary classifiers is PAC learnable if and only if its VC-dimension is finite, and the VC-dimension also determines the sample complexity required for PAC learning. The theorem further shows that if a problem is at all learnable, then uniform convergence holds, and therefore the problem is learnable using the ERM rule.

📎 References

  • Machine Learning (《机器学习》), Zhou Zhihua
 