I am trying to read The Elements of Statistical Learning by Hastie, Tibshirani and Friedman, but I am having trouble understanding the expected (squared) prediction error ($$EPE$$) formula that they give on page $$26$$:

At the start they assume that the relationship between $$X$$ and $$Y$$ is linear, so:

$$Y = X^TB+\epsilon$$, where $$\epsilon \sim N(0,\sigma^2)$$, and the task is to fit the model to the training data. Now

$$EPE(x_0) = E_{y_0|x_0}[E_T(y_0-\hat y_0)^2]$$

What is $$E_T$$? What is the reason to compute the $$EPE$$ of $$x_0$$ instead of $$\hat y_0$$?

On page $$23$$ it is written that $$T$$ is the training set, so my understanding is that it consists of some $$X$$'s. Is that right?
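To make the question concrete, here is a minimal simulation sketch of one possible reading of the formula. I am not sure this reading is correct. It assumes that $$E_T$$ means averaging over many independently drawn training sets, each made of inputs together with their responses, and that the other expectation averages over a new response $$y_0$$ at the fixed point $$x_0$$. The function name `simulate_epe`, the standard normal inputs, and all parameter values are my own assumptions and are not taken from the book.

```python
import numpy as np

def simulate_epe(x0, beta, sigma=1.0, n_train=50, n_reps=5000, seed=0):
    """Monte Carlo estimate of EPE(x0) for a least-squares fit.

    Assumption: each repetition draws a new training set T (inputs and
    responses) and a new response y0 at the fixed point x0.
    """
    rng = np.random.default_rng(seed)
    p = len(beta)
    sq_errors = np.empty(n_reps)
    for r in range(n_reps):
        # Draw a fresh training set T from the linear model.
        X = rng.normal(size=(n_train, p))  # assumed input distribution
        y = X @ beta + rng.normal(scale=sigma, size=n_train)
        # Fit the linear model to T by least squares.
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        # Prediction at x0 from this training set.
        y0_hat = x0 @ beta_hat
        # Draw a new response at x0 from the same model.
        y0 = x0 @ beta + rng.normal(scale=sigma)
        sq_errors[r] = (y0 - y0_hat) ** 2
    # Averaging over repetitions approximates both expectations at once.
    return sq_errors.mean()

x0 = np.array([1.0, 0.5])      # hypothetical query point
beta = np.array([2.0, -1.0])   # hypothetical true coefficients
print(simulate_epe(x0, beta))
```

If this reading is right, the result should be a little above $$\sigma^2$$, since the noise in $$y_0$$ cannot be removed and the fitted coefficients vary from one training set to the next.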

