How to prove this regularized matrix is invertible?

by Amin Sammara   Last Updated August 14, 2019 03:19 AM - source

So I'm taking Andrew Ng's course on machine learning (great course, only comment is that its lacking a lot of math) and we came across the analytical solution to a model using Normal equations with the regularization penalty.

Andrew claims that it is possible to prove that the matrix below, inside the parentheses, is always invertible but I'm stuck wondering how to do it.

$$\theta = (X^TX + \lambda [M])^{-1} X^Ty$$

where $M$ is an $(n+1)(n+1)$ matrix and $\lambda$ is the regularization parameter

$$M = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}$$

Tags :

$X^TX$ is either PD or PSD.* If it's PD, it's already invertible. If it's PSD, its smallest eigenvalue is $0$. For any $\lambda>0$, you're making the smallest eigenvalue inside the parentheses positive.

*Consider $(Xa)^2$; we can write $||Xa||^2_2\ge0\implies a^TX^TXa\ge 0$. The LHS is nonnegative, so the RHS must also be. And the RHS is immediately the definition of PSD; $a\neq 0$.

As whuber points out, $(X^TX)_{1,1}$ must be positive for this to work. This will be true whenever the first column of $X$ is not a 0 vector. (We can verify that easily: the 1,1 entry is the inner product of the first column of X with itself, which must be nonnegative for real values because it is formed as the sum of squares and squares of reals are nonnegative.)

Sycorax
October 31, 2016 03:41 AM

We are seeing regularization is adding a diagonal elements in $X^TX$, then, $X^TX$ is the full rank matrix. Full rank is invertible matrix.

hxd1011
October 31, 2016 03:58 AM

Sorry cannot comment, but why adding diagonal elements make $$X^\top X$$ full rank?

Ruizhen Mai
August 14, 2019 02:07 AM