IS CS-2020S2-06

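Source: Problem 6
Date: 2024-08-11
Topic: Math/CS - Probability and Statistics - Normal Distribution and the EM Algorithm

Approach

This problem covers basic properties of the normal distribution, conditional distributions, and parameter estimation with the EM algorithm. We first compute the expectation and variance of a linear combination of random variables, then derive a conditional distribution, then write down the joint probability density function, and finally apply the EM algorithm to estimate the parameters when part of the data is missing.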

Solution

Question 1

The random variable $Y$ is defined as $Y = \theta X + Z$, where $X \sim N(\mu, 1)$ and $Z \sim N(0, 1)$. Since $X$ and $Z$ are independent, we can calculate the expectation and variance of $Y$ as follows:

Expectation of $Y$:
$E[Y] = E[\theta X + Z] = \theta E[X] + E[Z] = \theta \mu + 0 = \theta \mu$
Variance of $Y$:
$V[Y] = V[\theta X + Z] = \theta^{2} V[X] + V[Z] = \theta^{2} \cdot 1 + 1 = \theta^{2} + 1$
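As a sanity check on these two formulas, the following is a minimal Monte Carlo sketch. The parameter values `mu = 1.5` and `theta = 2.0`, the sample size, and the helper name `simulate_y` are illustrative assumptions, not part of the problem.

```python
import random
import statistics

def simulate_y(mu, theta, n_samples, seed=0):
    """Draw samples of Y = theta * X + Z with independent X ~ N(mu, 1), Z ~ N(0, 1)."""
    rng = random.Random(seed)
    return [theta * rng.gauss(mu, 1.0) + rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Illustrative parameter values (assumed for this check only).
mu, theta = 1.5, 2.0
ys = simulate_y(mu, theta, 200_000)

sample_mean = statistics.fmean(ys)
sample_var = statistics.pvariance(ys)

# The sample moments should be close to theta * mu and theta^2 + 1.
print(sample_mean, theta * mu)
print(sample_var, theta ** 2 + 1)
```

The sample mean and variance should agree with $\theta \mu$ and $\theta^{2} + 1$ up to Monte Carlo error.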

Question 2

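To find the conditional distribution of $X$ given $Y$, note that $Y = \theta X + Z$, where $X \sim N(\mu, 1)$ and $Z \sim N(0, 1)$ are independent. Because $(X, Y)$ is a linear transformation of the independent normal pair $(X, Z)$, the joint distribution of $(X, Y)$ is bivariate normal, which implies that the conditional distribution $X \mid Y$ is also normal.

The covariance is $\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, \theta X + Z) = \theta V[X] = \theta$, and from Question 1, $E[Y] = \theta \mu$ and $V[Y] = \theta^{2} + 1$.

Expectation of $X \mid Y$:
$E[X \mid Y] = E[X] + \frac{\mathrm{Cov}(X, Y)}{V[Y]} (Y - E[Y]) = \mu + \frac{\theta}{\theta^{2} + 1} (Y - \theta \mu)$
Variance of $X \mid Y$:
$V[X \mid Y] = V[X] - \frac{\mathrm{Cov}(X, Y)^{2}}{V[Y]} = 1 - \frac{\theta^{2}}{\theta^{2} + 1} = \frac{1}{\theta^{2} + 1}$

These are the standard conditional mean and variance formulas for a bivariate normal distribution.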
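The conditional formulas can be checked numerically. For a bivariate normal pair, the least-squares regression of $X$ on $Y$ has slope $\theta / (\theta^{2} + 1)$ and residual variance $1 / (\theta^{2} + 1)$. The sketch below assumes illustrative values `mu = 1.5`, `theta = 2.0`; the helper `conditional_params` is a hypothetical name introduced here.

```python
import random

def conditional_params(mu, theta, y):
    """Mean and variance of X | Y = y when X ~ N(mu, 1) and Y = theta * X + Z, Z ~ N(0, 1)."""
    var = 1.0 / (theta ** 2 + 1)
    mean = mu + theta * var * (y - theta * mu)
    return mean, var

# Illustrative parameter values (assumed for this check only).
mu, theta = 1.5, 2.0
rng = random.Random(1)
xs = [rng.gauss(mu, 1.0) for _ in range(200_000)]
ys = [theta * x + rng.gauss(0.0, 1.0) for x in xs]

# Least-squares regression of X on Y, computed by hand.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
var_y = sum((y - mean_y) ** 2 for y in ys) / n
slope = cov_xy / var_y

# Variance of X left unexplained by the regression on Y.
resid_var = sum((x - mean_x - slope * (y - mean_y)) ** 2 for x, y in zip(xs, ys)) / n

print(slope, theta / (theta ** 2 + 1))
print(resid_var, 1.0 / (theta ** 2 + 1))
print(conditional_params(mu, theta, y=3.0))
```

The empirical slope and residual variance should match the coefficient of $(Y - \theta \mu)$ and the conditional variance up to sampling noise.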

Question 3

Assuming the pairs $(X_{i}, Y_{i})$ are independent and identically distributed, the joint probability density function $p_{\mu, \theta}(x^{(n)}, y^{(n)})$ of $X^{(n)} = (X_{1}, X_{2}, \dots, X_{n})$ and $Y^{(n)} = (Y_{1}, Y_{2}, \dots, Y_{n})$ is the product over $i$ of the marginal density of $X_{i} \sim N(\mu, 1)$ and the conditional density of $Y_{i}$ given $X_{i} = x_{i}$, which is $N(\theta x_{i}, 1)$:

$$p_{\mu, \theta}(x^{(n)}, y^{(n)}) = \prod_{i=1}^{n} \left( \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(x_{i} - \mu)^{2}}{2} \right) \cdot \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(y_{i} - \theta x_{i})^{2}}{2} \right) \right)$$

Expanding this, we get:

$$p_{\mu, \theta}(x^{(n)}, y^{(n)}) = \frac{1}{(2\pi)^{n}} \exp\left( -\sum_{i=1}^{n} \left[ \frac{(x_{i} - \mu)^{2}}{2} + \frac{(y_{i} - \theta x_{i})^{2}}{2} \right] \right)$$
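To make the factorization concrete, the sketch below evaluates the log of the joint density in two ways: from the expanded closed form with the $(2\pi)^{-n}$ constant, and as a sum of log densities of $N(\mu, 1)$ and $N(\theta x_{i}, 1)$, where each factor carries the normalizing constant $1/\sqrt{2\pi}$. The data values and function names are illustrative assumptions.

```python
import math
from statistics import NormalDist

def log_joint(xs, ys, mu, theta):
    """Closed form: -n log(2 pi) - (1/2) sum[(x_i - mu)^2 + (y_i - theta x_i)^2]."""
    n = len(xs)
    quad = sum((x - mu) ** 2 + (y - theta * x) ** 2 for x, y in zip(xs, ys))
    return -n * math.log(2 * math.pi) - 0.5 * quad

def log_joint_factored(xs, ys, mu, theta):
    """Sum of log p(x_i) + log p(y_i | x_i) using the normal densities directly."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += math.log(NormalDist(mu, 1.0).pdf(x))         # X_i ~ N(mu, 1)
        total += math.log(NormalDist(theta * x, 1.0).pdf(y))  # Y_i | X_i = x ~ N(theta x, 1)
    return total

# Small made-up data set, for illustration only.
xs = [0.2, 1.1, -0.5]
ys = [0.3, 2.5, -1.0]
mu, theta = 0.5, 2.0

closed_form = log_joint(xs, ys, mu, theta)
factored = log_joint_factored(xs, ys, mu, theta)
print(closed_form, factored)
```

Both evaluations should agree to floating-point precision.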

Question 4

Part (i)

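The expectation $E_{X_{n} \sim N(\bar{\mu}, \bar{\sigma}^{2})}[\log p_{\mu, \theta}(X^{(n)}, Y^{(n)})]$ is taken over $X_{n}$ only, with $x_{1}, \dots, x_{n-1}$ and $y_{1}, \dots, y_{n}$ held fixed at their observed values. Taking the logarithm of the joint density from Question 3 and separating the $n$-th term:

$$E_{X_{n} \sim N(\bar{\mu}, \bar{\sigma}^{2})}[\log p_{\mu, \theta}(X^{(n)}, Y^{(n)})] = E_{X_{n} \sim N(\bar{\mu}, \bar{\sigma}^{2})}\left[ -n \log(2\pi) - \sum_{i=1}^{n-1} \left( \frac{(x_{i} - \mu)^{2}}{2} + \frac{(y_{i} - \theta x_{i})^{2}}{2} \right) - \left( \frac{(X_{n} - \mu)^{2}}{2} + \frac{(y_{n} - \theta X_{n})^{2}}{2} \right) \right]$$

For $X_{n} \sim N(\bar{\mu}, \bar{\sigma}^{2})$ we have $E[(X_{n} - \mu)^{2}] = (\bar{\mu} - \mu)^{2} + \bar{\sigma}^{2}$ and $E[(y_{n} - \theta X_{n})^{2}] = (y_{n} - \theta \bar{\mu})^{2} + \theta^{2} \bar{\sigma}^{2}$, so:

$$E_{X_{n} \sim N(\bar{\mu}, \bar{\sigma}^{2})}[\log p_{\mu, \theta}(X^{(n)}, Y^{(n)})] = -n \log(2\pi) - \sum_{i=1}^{n-1} \left( \frac{(x_{i} - \mu)^{2}}{2} + \frac{(y_{i} - \theta x_{i})^{2}}{2} \right) - \frac{1}{2} \left( (\bar{\mu} - \mu)^{2} + \bar{\sigma}^{2} + (y_{n} - \theta \bar{\mu})^{2} + \theta^{2} \bar{\sigma}^{2} \right)$$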
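The closed-form expectation can be compared against a Monte Carlo average over draws of $X_{n}$. The sketch below uses the identities $E[(X_{n} - \mu)^{2}] = (\bar{\mu} - \mu)^{2} + \bar{\sigma}^{2}$ and $E[(y_{n} - \theta X_{n})^{2}] = (y_{n} - \theta \bar{\mu})^{2} + \theta^{2} \bar{\sigma}^{2}$, and includes the constant $-n \log(2\pi)$ from the log density. All data values, `mu_bar`, `var_bar`, and function names are illustrative assumptions.

```python
import math
import random

def log_joint(xs, ys, mu, theta):
    """Log joint density of n pairs: -n log(2 pi) - (1/2) sum of squared residuals."""
    n = len(xs)
    quad = sum((x - mu) ** 2 + (y - theta * x) ** 2 for x, y in zip(xs, ys))
    return -n * math.log(2 * math.pi) - 0.5 * quad

def expected_log_joint(xs_obs, ys, mu, theta, mu_bar, var_bar):
    """Closed-form expectation over X_n ~ N(mu_bar, var_bar).

    xs_obs holds x_1..x_{n-1}; ys holds y_1..y_n.
    """
    n = len(ys)
    y_n = ys[-1]
    observed = sum((x - mu) ** 2 + (y - theta * x) ** 2
                   for x, y in zip(xs_obs, ys[:-1]))
    # E[(X_n - mu)^2] + E[(y_n - theta X_n)^2] under N(mu_bar, var_bar).
    missing = ((mu_bar - mu) ** 2 + var_bar
               + (y_n - theta * mu_bar) ** 2 + theta ** 2 * var_bar)
    return -n * math.log(2 * math.pi) - 0.5 * (observed + missing)

# Made-up data: three observed x values, four observed y values.
xs_obs = [0.2, 1.1, -0.5]
ys = [0.3, 2.5, -1.0, 2.0]
mu, theta = 0.5, 2.0
mu_bar, var_bar = 1.2, 0.2

closed = expected_log_joint(xs_obs, ys, mu, theta, mu_bar, var_bar)

# Monte Carlo estimate: average the log joint over sampled values of X_n.
rng = random.Random(2)
n_draws = 100_000
mc = sum(
    log_joint(xs_obs + [rng.gauss(mu_bar, math.sqrt(var_bar))], ys, mu, theta)
    for _ in range(n_draws)
) / n_draws

print(closed, mc)
```

The two numbers should agree up to Monte Carlo error. Setting `var_bar` to zero reduces the closed form to the ordinary log joint density evaluated at $X_{n} = \bar{\mu}$.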

Part (ii)

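The update rule for $(\mu_{t+1}, \theta_{t+1})$ in the EM algorithm is obtained by maximizing the expected log-likelihood from part (i) over $(\mu, \theta)$:

$$(\mu_{t+1}, \theta_{t+1}) = \operatorname*{arg\,max}_{(\mu, \theta) \in \mathbb{R}^{2}} \left[ -\sum_{i=1}^{n-1} \left( \frac{(x_{i} - \mu)^{2}}{2} + \frac{(y_{i} - \theta x_{i})^{2}}{2} \right) - \frac{1}{2} \left( (\bar{\mu} - \mu)^{2} + \bar{\sigma}^{2} + (y_{n} - \theta \bar{\mu})^{2} + \theta^{2} \bar{\sigma}^{2} \right) \right]$$

The objective is a concave quadratic in $\mu$ and in $\theta$ separately, so the maximizer is found by setting the partial derivatives to zero:

$$\sum_{i=1}^{n-1} (x_{i} - \mu) + (\bar{\mu} - \mu) = 0, \qquad \sum_{i=1}^{n-1} x_{i} (y_{i} - \theta x_{i}) + \bar{\mu} (y_{n} - \theta \bar{\mu}) - \theta \bar{\sigma}^{2} = 0$$

Solving these for $\mu$ and $\theta$, we find:

$$\mu_{t+1} = \frac{1}{n} \left( \sum_{i=1}^{n-1} x_{i} + \bar{\mu} \right)$$

$$\theta_{t+1} = \frac{\sum_{i=1}^{n-1} x_{i} y_{i} + y_{n} \bar{\mu}}{\sum_{i=1}^{n-1} x_{i}^{2} + \bar{\mu}^{2} + \bar{\sigma}^{2}}$$

In the E-step, $\bar{\mu}$ and $\bar{\sigma}^{2}$ are the conditional mean and variance of $X_{n}$ given $Y_{n} = y_{n}$ under the current parameters $(\mu_{t}, \theta_{t})$, as derived in Question 2: $\bar{\mu} = \mu_{t} + \frac{\theta_{t}}{\theta_{t}^{2} + 1} (y_{n} - \theta_{t} \mu_{t})$ and $\bar{\sigma}^{2} = \frac{1}{\theta_{t}^{2} + 1}$. The update rule therefore depends only on the observed data $x^{(n-1)}, y^{(n)}$ and the current estimates $(\mu_{t}, \theta_{t})$.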
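The iteration can be sketched as follows, assuming $X_{n}$ is the only missing value and that $\bar{\mu} = \mu_{t} + \frac{\theta_{t}}{\theta_{t}^{2} + 1}(y_{n} - \theta_{t} \mu_{t})$ and $\bar{\sigma}^{2} = \frac{1}{\theta_{t}^{2} + 1}$ are the conditional mean and variance of $X_{n}$ given $y_{n}$ under the current parameters. The M-step uses $\mu_{t+1} = \frac{1}{n}(\sum_{i<n} x_{i} + \bar{\mu})$ and $\theta_{t+1} = (\sum_{i<n} x_{i} y_{i} + y_{n} \bar{\mu}) / (\sum_{i<n} x_{i}^{2} + \bar{\mu}^{2} + \bar{\sigma}^{2})$. The synthetic data, starting point, and function names are assumptions made for illustration.

```python
import math
import random

def em_step(xs_obs, ys, mu_t, theta_t):
    """One EM iteration; xs_obs holds x_1..x_{n-1}, ys holds y_1..y_n."""
    n = len(ys)
    y_n = ys[-1]
    # E-step: conditional mean and variance of the missing X_n given y_n.
    var_bar = 1.0 / (theta_t ** 2 + 1)
    mu_bar = mu_t + theta_t * var_bar * (y_n - theta_t * mu_t)
    # M-step: closed-form maximizers of the expected log-likelihood.
    mu_next = (sum(xs_obs) + mu_bar) / n
    num = sum(x * y for x, y in zip(xs_obs, ys[:-1])) + y_n * mu_bar
    den = sum(x * x for x in xs_obs) + mu_bar ** 2 + var_bar
    return mu_next, num / den

def observed_loglik(xs_obs, ys, mu, theta):
    """Log-likelihood of the observed data: (x_i, y_i) for i < n, plus y_n alone."""
    n = len(ys)
    quad = sum((x - mu) ** 2 + (y - theta * x) ** 2
               for x, y in zip(xs_obs, ys[:-1]))
    ll = -(n - 1) * math.log(2 * math.pi) - 0.5 * quad
    var_y = theta ** 2 + 1  # marginal variance of Y_n
    ll += -0.5 * math.log(2 * math.pi * var_y) - (ys[-1] - theta * mu) ** 2 / (2 * var_y)
    return ll

# Synthetic data with assumed true parameters mu = 1.0, theta = 2.0.
rng = random.Random(3)
n = 50
xs_full = [rng.gauss(1.0, 1.0) for _ in range(n)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs_full]
xs_obs = xs_full[:-1]  # the last x is treated as missing

mu_hat, theta_hat = 0.0, 1.0  # arbitrary starting point
for _ in range(200):
    mu_hat, theta_hat = em_step(xs_obs, ys, mu_hat, theta_hat)

print(mu_hat, theta_hat, observed_loglik(xs_obs, ys, mu_hat, theta_hat))
```

Tracking `observed_loglik` across iterations is a useful diagnostic, since each EM step should leave it unchanged or larger.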

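Key Concepts

Normal distribution, conditional distribution, expectation, EM algorithm, maximum likelihood estimation

Key Difficulties

Deriving the conditional distribution relies on properties of the bivariate normal distribution; computing the conditional mean and variance in particular requires a solid understanding of the covariance matrix. The difficulty in the EM algorithm lies in constructing the expectation of the log-likelihood and then optimizing it to obtain the parameter update rule.

Techniques and Notes

Conditional distribution: for a bivariate normal distribution, each conditional distribution is again normal, and its parameters follow from the marginal means and variances together with the covariance.
EM algorithm: each iteration maximizes the expected complete-data log-likelihood, where the expectation is taken over the missing data given the observed data and the current parameters; it is particularly useful for missing-data problems.
Maximum likelihood estimation: an EM iteration never decreases the observed-data likelihood, and under regularity conditions the iterates converge to a stationary point of the likelihood (typically a local maximum). Convergence to the global maximum or to the true parameter value is not guaranteed.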

Key Vocabulary

Expectation-Maximization (EM) Algorithm: 期望最大化算法
Conditional distribution: 条件分布
Maximum likelihood estimation: 最大似然估计
Normal distribution: 正态分布

Zephyr's Notes on ISCS & CBMS, UTokyo
