投影矩阵 | Projection Matrix

定义 | Definition

投影矩阵是一个方阵，能够将一个向量投影到一个子空间上。

The projection matrix is a square matrix that projects a vector onto a subspace.

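给定一个列满秩的 $m \times n$ 矩阵 $A$ ，到其列空间 $Col (A)$ 上的正交投影矩阵 $P$ 定义为：

For an $m \times n$ matrix $A$ with full column rank, the orthogonal projection matrix $P$ onto its column space $Col (A)$ is defined as:

$$P = A (A^{T} A)^{-1} A^{T}$$

其中列满秩保证 $A^{T} A$ 可逆，且 $P$ 是 $m \times m$ 矩阵。

where full column rank guarantees that $A^{T} A$ is invertible, and $P$ is an $m \times m$ matrix.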
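As a minimal sketch of the definition, the NumPy snippet below builds $P$ for a small matrix whose column space is the $xy$-plane in $\mathbb{R}^{3}$. The values of `A` and `b` are illustrative choices, not taken from these notes.

```python
import numpy as np

# Illustrative 3x2 matrix with full column rank; Col(A) is the xy-plane.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# P = A (A^T A)^{-1} A^T, an m x m matrix (here 3x3).
P = A @ np.linalg.inv(A.T @ A) @ A.T

b = np.array([1.0, 2.0, 3.0])
b_proj = P @ b  # the component of b outside the xy-plane is removed

print(P)
print(b_proj)
```

Here $P$ keeps the first two coordinates of a vector and zeroes the third, which is what projecting onto the $xy$-plane should do.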

投影矩阵的推导 | Derivation of the Projection Matrix

为了推导投影矩阵，我们考虑一个向量 $b$ 在子空间 $Col (A)$ 上的投影。首先，我们需要找到一个向量 $b_{proj}$ ，它是 $b$ 在子空间 $Col (A)$ 上的正交投影。

To derive the projection matrix, we consider the projection of a vector $b$ onto the subspace $Col (A)$ . We need to find a vector $b_{proj}$ that is the orthogonal projection of $b$ onto the subspace $Col (A)$ .

1. 定义正交投影 | Define Orthogonal Projection

设 $b_{proj}$ 是 $b$ 在 $Col (A)$ 上的投影，则 $b_{proj}$ 可以表示为：

Let $b_{proj}$ be the projection of $b$ onto $Col (A)$ , then $b_{proj}$ can be expressed as:

$$b_{proj} = Ax$$

其中 $x$ 是一个系数向量。

where $x$ is a coefficient vector.

2. 最小化投影误差 | Minimize the Projection Error

我们希望 $b_{proj}$ 最小化 $\lVert b - b_{proj} \rVert$ 。这等价于最小化 $\lVert b - Ax \rVert$ 。

We want $b_{proj}$ to minimize $\lVert b - b_{proj} \rVert$ . This is equivalent to minimizing $\lVert b - Ax \rVert$ .

由于范数非负，最小化范数等价于最小化其平方，即误差的平方和：

Because the norm is non-negative, minimizing it is equivalent to minimizing its square, the sum of squared errors:

$$\lVert b - Ax \rVert^{2} = (b - Ax)^{T} (b - Ax)$$

展开得到：

Expanding this, we get:

$$(b - Ax)^{T} (b - Ax) = b^{T} b - 2 b^{T} Ax + x^{T} A^{T} Ax$$

3. 对 $x$ 求导 | Differentiate with Respect to $x$

为了找到最小值，我们对 $x$ 求导并设置为零：

To find the minimum, we take the derivative with respect to $x$ and set it to zero:

$$\frac{\partial}{\partial x} (b^{T} b - 2 b^{T} Ax + x^{T} A^{T} Ax) = 0$$

计算导数：

Calculating the derivatives, we get:

$$-2 A^{T} b + 2 A^{T} Ax = 0$$
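The gradient formula can be sanity-checked numerically. The sketch below compares the analytic gradient $-2 A^{T} b + 2 A^{T} A x$ with central finite differences of $\lVert b - Ax \rVert^{2}$ at an arbitrary point. The random data, the step size `h`, and the helper `f` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))   # illustrative data
b = rng.normal(size=5)
x = rng.normal(size=3)        # arbitrary point at which to compare gradients

def f(x):
    """Squared error ||b - A x||^2."""
    r = b - A @ x
    return r @ r

# Analytic gradient from the derivation: -2 A^T b + 2 A^T A x
grad_analytic = -2 * A.T @ b + 2 * A.T @ A @ x

# Central finite differences, one coordinate direction at a time
h = 1e-6
grad_numeric = np.array([
    (f(x + h * e) - f(x - h * e)) / (2 * h)
    for e in np.eye(3)
])

print(np.allclose(grad_analytic, grad_numeric, atol=1e-5))
```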

4. 解正规方程 | Solve the Normal Equations

两边除以 2 并移项，得到正规方程：

Dividing by 2 and rearranging, we get the normal equations:

$$A^{T} Ax = A^{T} b$$

5. 求解 $x$ | Solve for $x$

假设 $A^{T} A$ 是可逆的（等价于 $A$ 的各列线性无关），我们可以得到 $x$ 的解：

Assuming $A^{T} A$ is invertible (equivalently, the columns of $A$ are linearly independent), we get the solution for $x$ :

$$x = (A^{T} A)^{-1} A^{T} b$$
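In code, the normal equations are usually solved as a linear system rather than by forming $(A^{T} A)^{-1}$ explicitly. The following sketch, using random illustrative data, solves $A^{T} A x = A^{T} b$ with `np.linalg.solve` and compares the result with NumPy's least squares solver `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 3))   # illustrative tall matrix
b = rng.normal(size=6)

# Solve A^T A x = A^T b as a linear system instead of forming the inverse.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference: NumPy's least squares solver.
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))
```

For ill-conditioned $A$, forming $A^{T} A$ squares the condition number, so a solver such as `np.linalg.lstsq` is generally the more robust choice.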

6. 投影矩阵的计算 | Calculation of the Projection Matrix

将 $x$ 的解代入 $b_{proj} = Ax$ ，得到：

Substituting the solution for $x$ into $b_{proj} = Ax$ , we get:

$$b_{proj} = A (A^{T} A)^{-1} A^{T} b$$

因此，投影矩阵 $P$ 定义为：

Thus, the projection matrix $P$ is defined as:

$$P = A (A^{T} A)^{-1} A^{T}$$

投影矩阵的性质 | Properties of the Projection Matrix

对称性 | Symmetry: 投影矩阵 $P$ 是对称矩阵，即 $P = P^{T}$ 。 The projection matrix $P$ is symmetric, i.e., $P = P^{T}$ .
幂等性 | Idempotency: 投影矩阵 $P$ 是幂等矩阵，即 $P^{2} = P$ 。 The projection matrix $P$ is idempotent, i.e., $P^{2} = P$ .
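
Both properties follow directly from the formula $P = A (A^{T} A)^{-1} A^{T}$ . A short verification, assuming $A^{T} A$ is invertible and using the fact that $A^{T} A$ is symmetric:

$$
\begin{aligned}
P^{T} &= \left( A (A^{T} A)^{-1} A^{T} \right)^{T} = A \left( (A^{T} A)^{T} \right)^{-1} A^{T} = A (A^{T} A)^{-1} A^{T} = P \\
P^{2} &= A (A^{T} A)^{-1} \underbrace{A^{T} A (A^{T} A)^{-1}}_{I} A^{T} = A (A^{T} A)^{-1} A^{T} = P
\end{aligned}
$$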

投影的计算 | Calculation of Projection

给定一个向量 $b$ ，其在子空间 $Col (A)$ 上的投影 $b_{proj}$ 计算如下：

Given a vector $b$ , its projection $b_{proj}$ onto the subspace $Col (A)$ is calculated as:

$$b_{proj} = Pb$$
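A small worked example, with illustrative values for `A` and `b`, shows the projection together with the defining property that the residual $b - b_{proj}$ is orthogonal to $Col (A)$ . It also checks the symmetry and idempotency of $P$ numerically.

```python
import numpy as np

# Illustrative values (not from the notes).
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

P = A @ np.linalg.inv(A.T @ A) @ A.T
b_proj = P @ b          # projection of b onto Col(A)
residual = b - b_proj   # error vector, should be orthogonal to Col(A)

print(b_proj)                                       # approximately [ 5.  2. -1.]
print(np.allclose(A.T @ residual, 0))               # residual is orthogonal to every column of A
print(np.allclose(P, P.T), np.allclose(P @ P, P))   # symmetry and idempotency
```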

伪逆矩阵 | Pseudo-Inverse Matrix

伪逆矩阵是一种广义逆矩阵，用于求解某些矩阵方程（如线性回归中的正规方程）。

The pseudo-inverse matrix is a type of generalized inverse matrix used to solve certain matrix equations, such as the normal equations in linear regression.

对于一个列满秩的矩阵 $A$ ，其伪逆矩阵 $A^{+}$ 为：

For a matrix $A$ with full column rank, its pseudo-inverse $A^{+}$ is:

$$A^{+} = (A^{T} A)^{-1} A^{T}$$

推导过程 | Derivation

为了推导伪逆矩阵，我们首先考虑一个线性方程组 $Ax = b$ 的最小二乘解 $x$ ：

To derive the pseudo-inverse matrix, we first consider the least squares solution $x$ of the linear system $Ax = b$ :

$$x = A^{+} b$$

我们要求 $x$ 使得 $\lVert Ax - b \rVert$ 最小化，这意味着我们需要解以下正规方程：

We want $x$ to minimize $\lVert Ax - b \rVert$ , which means we need to solve the following normal equations:

$$A^{T} Ax = A^{T} b$$

假设 $A^{T} A$ 可逆，我们可以得到 $x$ 的解：

Assuming $A^{T} A$ is invertible, we get the solution for $x$ :

$$x = (A^{T} A)^{-1} A^{T} b$$

因此，伪逆矩阵 $A^{+}$ 为：

Thus, the pseudo-inverse matrix $A^{+}$ is:

$$A^{+} = (A^{T} A)^{-1} A^{T}$$
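As a quick check under the assumption that $A$ has full column rank, the sketch below compares the formula $(A^{T} A)^{-1} A^{T}$ with `np.linalg.pinv`, which computes the Moore-Penrose pseudo-inverse via the singular value decomposition. The random matrix is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 2))   # illustrative tall matrix

A_plus_formula = np.linalg.inv(A.T @ A) @ A.T   # (A^T A)^{-1} A^T
A_plus_numpy = np.linalg.pinv(A)                # SVD-based pseudo-inverse

print(np.allclose(A_plus_formula, A_plus_numpy))
# A^+ is a left inverse of A: A^+ A = I (n x n)
print(np.allclose(A_plus_formula @ A, np.eye(2)))
```

Note that $A^{+} A = I$ but $A A^{+}$ is not the identity in general: it equals the projection matrix $P = A (A^{T} A)^{-1} A^{T}$ .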

与线性方程组的关系 | Relationship with Linear Systems

方程组 $Ax = b$ 的解 | Solutions to $Ax = b$

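对于线性方程组 $Ax = b$ ，若 $A$ 列满秩（即 $A^{T} A$ 可逆），则方程组有唯一的最小二乘解：

For the linear system $Ax = b$ , if $A$ has full column rank (i.e., $A^{T} A$ is invertible), then the system has a unique least squares solution:

$$x = A^{+} b$$

仅当 $b$ 属于 $Col (A)$ 时，该解才是 $Ax = b$ 的精确解。

This is an exact solution of $Ax = b$ only when $b$ lies in $Col (A)$ .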

无解或多解的情况 | No Solutions or Multiple Solutions

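若 $A$ 不是列满秩矩阵，方程组可能无解或有无穷多解，且 $A^{T} A$ 不可逆，公式 $(A^{T} A)^{-1} A^{T}$ 不再适用。在这种情况下，可以使用 Moore-Penrose 伪逆 $A^{+}$ （例如通过奇异值分解计算）求得最小二乘解 $x$ ：

If $A$ does not have full column rank, the system may have no solutions or infinitely many solutions, and $A^{T} A$ is not invertible, so the formula $(A^{T} A)^{-1} A^{T}$ no longer applies. In this case, the Moore-Penrose pseudo-inverse $A^{+}$ (computed, for example, via the singular value decomposition) gives a least squares solution $x$ :

$$x = A^{+} b$$

此时，使 $\lVert Ax - b \rVert$ 最小的向量有无穷多个， $x$ 是其中范数最小的一个。

In this case, infinitely many vectors minimize $\lVert Ax - b \rVert$ , and $x$ is the one with the smallest norm.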
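The formula $(A^{T} A)^{-1} A^{T}$ requires $A^{T} A$ to be invertible, so the rank-deficient case needs a different computation. The sketch below swaps in `np.linalg.pinv` (the SVD-based Moore-Penrose pseudo-inverse) on an illustrative matrix with two identical columns, and compares the result with another least squares solution, `x_other`, chosen by hand.

```python
import numpy as np

# Illustrative rank-deficient matrix: both columns are identical.
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])

print(np.linalg.matrix_rank(A))      # 1, so A^T A is singular

x_min = np.linalg.pinv(A) @ b        # minimum-norm least squares solution
x_other = np.array([2.0, 0.0])       # another least squares solution (x1 + x2 = 2)

# Both attain the same residual, but x_min has the smaller norm.
print(np.linalg.norm(A @ x_min - b), np.linalg.norm(A @ x_other - b))
print(np.linalg.norm(x_min), np.linalg.norm(x_other))
```

Any $x$ with $x_{1} + x_{2} = 2$ minimizes the residual here. The pseudo-inverse picks $x = (1, 1)$ , the one closest to the origin.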

现实中的应用 | Real-World Applications

数据量远大于变量数量的情况 | When the Amount of Data Far Exceeds the Number of Variables

在统计学和机器学习中，线性方程组 $Ax = b$ 中的数据量（观测数 $m$ ）通常远大于变量数量（特征数 $n$ ），即 $m \gg n$ 。这种情况下，只要各特征之间不存在精确的线性相关，矩阵 $A$ 就是列满秩的，因此 $A^{T} A$ 是可逆的。

In statistics and machine learning, the linear system $Ax = b$ often involves a dataset where the number of observations $m$ is much larger than the number of variables $n$ , i.e., $m \gg n$ . In such cases, the matrix $A$ has full column rank as long as no feature is an exact linear combination of the others, making $A^{T} A$ invertible.
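To illustrate the $m \gg n$ setting, here is a minimal sketch of a linear regression on synthetic data. The coefficients `x_true`, the noise level, and the sizes `m` and `n` are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 200, 3                                # many observations, few features
A = rng.normal(size=(m, n))                  # synthetic feature matrix
x_true = np.array([2.0, -1.0, 0.5])          # hypothetical coefficients
b = A @ x_true + 0.01 * rng.normal(size=m)   # observations with small noise

print(np.linalg.matrix_rank(A) == n)         # full column rank, so A^T A is invertible

x_hat = np.linalg.solve(A.T @ A, A.T @ b)    # least squares estimate from the normal equations
print(x_hat)                                 # close to x_true
```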

Zephyr's Notes on ISCS & CBMS, UTokyo
