This page lists errors or typos appearing in the first edition and printing 2 of the book Foundations of Machine Learning as well as their corresponding corrections. We are grateful to all readers who kindly bring those to our attention.
• Page 19, paragraph following equation (2.10): "the bound guarantees 90% accuracy" (not 99% accuracy).

• Page 34, Definition 3.1: it should be noted that "Z" is an arbitrary input space.

• Page 36, first inline equation: all instances of "$g \in H$" (which the supremum is taken over) should be "$g \in G$".

• Page 43, proof of Theorem 3.4: The definition of $I_2$ should have the condition $\beta_i \leq 0$ (instead of $\beta_i < 0$).

• Page 44, Example 3.3: "figure 3.2(a)" should read "figure 3.3(a)" and "figure 3.2(b)" should read "figure 3.3(b)".

• Page 46, proof of Theorem 3.5, second paragraph: "$H$ includes by restriction to $S'$" should read "$H$ induces by restriction to $S'$".

• Page 49, proof of Theorem 3.6: The line above eq. (3.34) should read "with very high probability $(1 - 8 \epsilon)$" (instead of $(1 - \epsilon)$).

• Page 50, first second to last paragraph: "...PAC-learning in the non-realizable case is not possible..." should read "..PAC-learning in the realizable case is not possible...".

• Page 57, Exercise 3.12: The hint should read "show that $\{2^{-i} : i < m\}$ can be fully shattered for any $m \in \mathbb{N}$"

• Page 66, first paragraph: "$\nabla_w (F)" should instead be written "$\nabla F(w)$" and "its Hessian the" should read "its Hessian is the" • Page 78, Lemma 4.2: The word "function" should appear at the end of the first sentence. • Page 79, proof of Lemma 4.2, third inline equation: in the first equality, parentheses are missing around the two terms that are preceded by the factor 1/2. • Page 80, proof of Theorem 4.4: there is no need to resort to$\Phi_\rho - 1$, the proof holds directly with$\Phi_\rho$. • Page 94, second inline equation: the "$\sigma^n" in the denominator of the expression should instead be "$\sigma^{2n}$".

• Page 95, first line of the proof: $\Phi(x) \colon {\cal X} \to \Rset$ should read $\Phi(x) \colon \cX \to \Rset^{\cal X}$.

• Page 95, proof of Theorem 5.2, following third inline equation: $g$ should be defined in terms of $x_j'$ (instead of $x_j$).

• Page 99, second line to last, "$(x_1, x_1', x_2, x_2') \mapsto K'(y_1, y_2)" should read "$(x_1, x_1', x_2, x_2') \mapsto K'(x_1', x_2')".

• Page 100, last two sentences in proof of Theorem 5.3: "$K_n$" should instead be "$K^n$" (in two places).

• Page 102, last line of Theorem 5.5: "$w \cdot \Phi(x)$" should be "$\langle w, \Phi(x) \rangle$" (for consistency of notation).

• Page 121, definition 6.1: "$\epsilon > 0$ and" should be removed.

• Page 121, definition 6.1: "$poly(1/\epsilon, 1/\delta, n, size(c)$" should instead read "$poly(1/\delta, n, size(c))$".

• Page 121, definition 6.1: it should be added that "$O(n)$ is a bound on the cost of representing $x \in \cal X$".

• Page 125, last sentence of proof of Theorem 6.1: "follows from the identity" should read "follows from the inequality".

• Page 135, Theorem 6.3: Instead of "$a_t > 0$" it should read "$\alpha_t > 0$".

• Page 167, pseudocode of Dual Perceptron algorithm: $\alpha_{t + 1}$ should be replaced by $\alpha_t$.

• Page 168, pseudocode of Kernel Perceptron algorithm: $\alpha_{t + 1}$ should be replaced by $\alpha_t$.

• Page 170, in the inequality for $\Phi_{t + 1} - \Phi_{t}$, the following two intermediate lines should be inserted just before the last inequality for more explanation:
& = \log \E_{i \sim \w_t}\big[ \exp(\eta y_t x_{t, i} - \eta y_t \w_t \cdot x_{t} + \eta y_t \w_t \cdot x_{t}) \big] - \eta \rho_\infty\\
& \leq \log \big[ \exp(\eta^2 (2 r_\infty)^2/8) \big] + \underbrace{\eta y_t (\w_t \cdot x_{t})}_{\leq 0} - \eta \rho_\infty\\[-.35cm]

• Page 181, exercise 7.10, second paragraph: the definition of $m_i$ in the first sentence of that paragraph is given in the special case of the zero-one loss. For the general case, the sentence should be replaced by: "Let $m_i$ be the cumulative loss of hypothesis $h_i$ on the points $(x_i, \ldots, x_T)$, that is $m_i = \sum_{t = i}^T L(h_i(x_t), y_t)$".

• Page 181, exercise 7.10, in the text following the inline equation: $i^* = argmin_i m_i / (T - i)$ should be replaced by $i^* = argmin_i m_i / (T - i + 1)$ .

• page 189, third paragraph: $W=(w_1^\top, \ldots, w_k^\top)^\top$ should read $W=(w_1, \ldots, w_k)^\top$.

• Page 190, line 5: the empirical Rademacher complexity symbol should be replaced by that of Rademacher complexity.

• Page 191, equation 8.12: the factor 4k^2 should be 2k^2 instead.

• Page 191, optimization problem: the constraints $\xi_i \geq 0$ should be added.

• Page 192, section 8.3.2 first paragraph: "exercise 9.5" should read "exercise 8.4".

• Page 207, exercise 8.4: "family of base hypothesis" should read "family of base hypotheses".

• Page 217, line 3 from bottom: the expression should be replaced by $\sqrt{1 - \frac{(\e^+_t - \e^-_t)^2}{(1 - \e^0_t)}}$, which holds by the concavity of the square-root function.

• Page 283, inline after (12.2): $U^\top X X^\top U$ should read $Tr[U^\top X X^\top U]$.

• Page 354, paragraph preceding Definition B.7: all "$x$" should be bolded.

• Page 357, equation (B.13): "$g(\bar x_i)$" should instead be "$g_i(\bar x)$".

• Page 357, proof of Theorem B.8: "$\bar x$" should be bolded throughout.

• Page 357, proof of Theorem B.8, inline equation: the second and fourth inequalities should be equalities. Also, the final inequality holds due to the "(third condition and g(x) \leq 0)".

• Page 369, first line of section D.1: extra space before the comma should be removed.

• Page 370, line 5: $\phi'(t) \leq \frac{(b-a)^2}{4}$ should read $\phi''(t) \leq\frac{(b-a)^2}{4}$.

• Page 371, last line of lemma D.2: $\E[e^{sV} | Z ]$ should read $\E[e^{tV} | Z ]$.

• Page 364, last line: it should read $(1 - 2t)^{-1/2}$ and not $(1 - 2t)^{1/2}$.

• Page 365, first displayed equation: it should read $(1 - 2t)^{-k/2}$ and not $(1 - 2t)^{k/2}$.

• Page 370, lines 3 and 4 of proof of Theorem D.1: the factor $exp(-t \epsilon)$ should be outside the product sign.