Here’s one that must already exist:
Linear regression is given by the closed-form matrix solution:
θ = (XᵀX)⁻¹XᵀY.
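As a quick sanity check, here is a minimal numpy sketch of that closed form; the data and the true coefficients (intercept 1, slope 2) are made up for illustration:

```python
import numpy as np

# Hypothetical data: y = 1 + 2x, fit with a bias column.
rng = np.random.default_rng(0)
x = rng.standard_normal(50)
X = np.column_stack([np.ones_like(x), x])  # 50x2 design matrix (tall, not square)
y = 1.0 + 2.0 * x

# Normal equation: theta = (X^T X)^{-1} X^T y,
# solved as a linear system rather than forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # close to [1.0, 2.0]
```

Solving the system with `np.linalg.solve` instead of computing `(XᵀX)⁻¹` directly is the standard numerically safer route.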
We have a rule we can apply here: (AB)⁻¹ = B⁻¹A⁻¹.
Which gives us: θ = X⁻¹(Xᵀ)⁻¹XᵀY.
But (Xᵀ)⁻¹Xᵀ is the identity matrix, so those two factors cancel.
This leaves us with:
θ = X⁻¹Y.
Now, surely there must be some reason it isn’t taught this way. Is it because the rule (AB)⁻¹ = B⁻¹A⁻¹ requires X and Xᵀ to be square and invertible, which they generally aren’t? Can we use a pseudoinverse instead to solve this?
Update: Indeed we can. Now I understand why it’s presented that way: when X has full column rank, the pseudoinverse is exactly (XᵀX)⁻¹Xᵀ! Therefore, we can also just say the optimal parameters are given by X⁺Y, where X⁺ is the Moore–Penrose pseudoinverse.
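A small numpy sketch confirming the two routes agree for a tall, full-column-rank X (the data here is random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))  # tall matrix, full column rank with probability 1
y = rng.standard_normal(50)

# Pseudoinverse route: theta = X^+ y
theta_pinv = np.linalg.pinv(X) @ y

# Normal-equation route: theta = (X^T X)^{-1} X^T y
theta_ne = np.linalg.solve(X.T @ X, X.T @ y)

print(np.allclose(theta_pinv, theta_ne))  # True when X has full column rank
```

When X does not have full column rank, XᵀX is singular and the normal-equation route breaks down, but X⁺ (computed via the SVD) still exists and picks out the minimum-norm least-squares solution.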
“Reinvention is talent crying out for background”, I suppose.