IS CS-2021W-01

Source: Problem 1 Date: 2024-07-24 Topic: CS - Information Mathematics - Ordered Binary Trees

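Approach

This problem concerns ordered binary trees (each node has at most two ordered children) and the weights of their leaves. We need to find the binary tree of a particular shape that makes a given function optimal. In particular, the idea of a Huffman tree can be used to solve the first question.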

Solution

Q1: Give the tree $T \in T_{4}$ that has the smallest value of $W_{P} (T)$ in case $P = (4, 2, 1, 1)$

To minimize $W_{P} (T)$ , we should construct a tree that resembles a Huffman tree, where the most frequent items (with the highest weights) are located at shallower depths. Here, the weights are $(4, 2, 1, 1)$ .

Steps to construct the tree:

Start by pairing the two smallest weights, which are both $1$ .
Combine these to form a subtree with a combined weight of $2$ .
Now, we have weights $(4, 2, 2)$ .
Next, combine the two smallest remaining weights, which are both $2$ .
Combine these to form a subtree with a combined weight of $4$ .
Finally, combine the leaf of weight $4$ with the subtree of combined weight $4$ to form the complete tree.
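The merging steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `huffman_merges` is an assumption made for this example, and it tracks weights only, not the tree shape. The sum of all merged weights equals the weighted path length, because each merge pushes every leaf below it one level deeper.

```python
import heapq

def huffman_merges(weights):
    """Repeatedly merge the two smallest weights.

    Returns (merges, total_cost), where total_cost is the sum of all
    combined weights, i.e. the weighted path length of the resulting tree.
    """
    heap = list(weights)
    heapq.heapify(heap)
    merges = []
    total_cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)  # smallest remaining weight
        b = heapq.heappop(heap)  # second smallest remaining weight
        merges.append((a, b))
        total_cost += a + b
        heapq.heappush(heap, a + b)
    return merges, total_cost

merges, cost = huffman_merges([4, 2, 1, 1])
print(merges)  # [(1, 1), (2, 2), (4, 4)]
print(cost)    # 14
```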

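The resulting tree has the following structure: the left child of the root is the leaf $v_{1}$ (weight $4$) and its right child is an internal node; the left child of that node is the leaf $v_{2}$ (weight $2$) and its right child is a second internal node, whose two children are the leaves $v_{3}$ and $v_{4}$ (weight $1$ each).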

Thus, the depth of each leaf in the tree is:

$c_{1} = 4$ , depth $d_{T} (v_{1}) = 1$
$c_{2} = 2$ , depth $d_{T} (v_{2}) = 2$
$c_{3} = 1$ , depth $d_{T} (v_{3}) = 3$
$c_{4} = 1$ , depth $d_{T} (v_{4}) = 3$

Now, we calculate $W_{P} (T)$ :

$$W_{P} (T) = 4 \cdot 1 + 2 \cdot 2 + 1 \cdot 3 + 1 \cdot 3 = 4 + 4 + 3 + 3 = 14$$

Thus, the tree $T$ that minimizes $W_{P} (T)$ has the above structure.
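As a sanity check, the minimum can also be found by exhaustive search. The sketch below assumes that the leaves $v_{1}, \dots, v_{n}$ appear from left to right in the order of $P$ and that every internal node has two children (a node with a single child would only make leaves deeper). The name `min_ordered_wpl` is hypothetical. It uses an interval recurrence: the best tree over leaves $i..j$ splits at some $k$, and every leaf under the new root gains one level, which adds the total weight of the interval.

```python
from functools import lru_cache

def min_ordered_wpl(weights):
    """Minimal weighted path length over all ordered binary trees
    whose leaves, read left to right, carry the given weights."""
    weights = tuple(weights)
    prefix = [0]
    for w in weights:
        prefix.append(prefix[-1] + w)

    @lru_cache(maxsize=None)
    def best(i, j):
        # Best cost of a subtree covering leaves i..j (inclusive).
        if i == j:
            return 0
        interval_weight = prefix[j + 1] - prefix[i]
        return interval_weight + min(
            best(i, k) + best(k + 1, j) for k in range(i, j)
        )

    return best(0, len(weights) - 1)

print(min_ordered_wpl([4, 2, 1, 1]))  # 14
```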

Q2: Show that $\sum_{i = 1}^{n} 2^{- d_{T} (v_{i})} \leq 1$ holds for any ordered binary tree $T \in T_{n}$ with leaves $v_{1}, v_{2}, \dots, v_{n}$ using mathematical induction

To prove the inequality $\sum_{i = 1}^{n} 2^{- d_{T} (v_{i})} \leq 1$ using mathematical induction, we need to follow these steps:

Base Case
Inductive Step

Base Case: For $n = 1$ (a tree with only one leaf), the depth of the only leaf $v_{1}$ is 0.

$$\sum_{i = 1}^{1} 2^{- d_{T} (v_{i})} = 2^{- d_{T} (v_{1})} = 2^{0} = 1$$

Thus, the base case holds.

Inductive Step: Assume that for any ordered binary tree with $k$ leaves, the inequality holds:

$$\sum_{i = 1}^{k} 2^{- d_{T} (v_{i})} \leq 1$$

Now, we need to prove that the inequality holds for an ordered binary tree with $k + 1$ leaves.

Consider an ordered binary tree with $k + 1$ leaves.
Let’s denote the depth of the leaves in this tree by $d_{T} (v_{1}), d_{T} (v_{2}), \dots, d_{T} (v_{k + 1})$ .

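Assume first that every internal node of $T$ has exactly two children. Take a leaf of $T$ at maximum depth; its sibling must then also be a leaf. Merging these two leaves into their parent gives a tree $T'$ with $k$ leaves. Conversely, $T$ is obtained from $T'$ by splitting one leaf into two children, which removes that leaf and adds two new leaves one level deeper.

Suppose the split leaf is $v_{j}$ with $d_{T'} (v_{j}) = d$ , and its two children in $T$ are the leaves $v_{j}^{'}$ and $v_{k + 1}$ , both at depth $d + 1$ . All other leaves keep their depths, and we relabel $v_{j}^{'}$ as $v_{j}$ in $T$ .

By the inductive hypothesis, for the tree $T'$ with $k$ leaves:

$$\sum_{i = 1}^{k} 2^{- d_{T'} (v_{i})} \leq 1$$

In the new tree $T$ , the term $2^{- d}$ of the split leaf is replaced by the two terms of its children:

$$\sum_{i = 1}^{k + 1} 2^{- d_{T} (v_{i})} = \sum_{i = 1}^{k} 2^{- d_{T'} (v_{i})} - 2^{- d} + 2^{- (d + 1)} + 2^{- (d + 1)}$$

Since $2^{- (d + 1)} + 2^{- (d + 1)} = 2^{- d}$ :

$$\sum_{i = 1}^{k + 1} 2^{- d_{T} (v_{i})} = \sum_{i = 1}^{k} 2^{- d_{T'} (v_{i})} \leq 1$$

Thus, the inequality holds for $T$ . If the definition also allows an internal node with a single child, contract every such node into its child: this gives a tree with the same leaves in which no leaf is deeper than before, so each term $2^{- d_{T} (v_{i})}$ can only grow, and the bound for the contracted tree implies the bound for $T$ .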

By induction, the inequality $\sum_{i = 1}^{n} 2^{- d_{T} (v_{i})} \leq 1$ holds for all $n$ .
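The inequality can be checked on concrete trees with a short script. This is a minimal sketch under an assumed representation that is not part of the problem statement: a leaf is any non-tuple value, and an internal node is a tuple holding one or two children. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

def leaf_depths(tree, depth=0):
    """Yield the depth of every leaf, left to right."""
    if not isinstance(tree, tuple):
        yield depth  # a leaf
        return
    for child in tree:
        yield from leaf_depths(child, depth + 1)

def kraft_sum(tree):
    """Sum of 2^(-depth) over all leaves of the tree."""
    return sum(Fraction(1, 2 ** d) for d in leaf_depths(tree))

# The tree from Q1: every internal node has two children.
q1_tree = ("v1", ("v2", ("v3", "v4")))
print(kraft_sum(q1_tree))   # 1

# A tree in which one internal node has a single child.
lopsided = ("v1", (("v2", "v3"),))
print(kraft_sum(lopsided))  # 3/4
```

In this representation the sum equals 1 when every internal node has two children and drops below 1 once a node has only one child, which is why the statement uses $\leq$ .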

Q3: Assume that $x_{1}, x_{2}, \dots, x_{n}$ range over the set of positive real numbers so that $\sum_{i = 1}^{n} x_{i} = 1$ . Show that $\sum_{i = 1}^{n} (c_{i} \cdot \log_{2} x_{i})$ is maximized when $x_{i} = c_{i} / S_{P}$ for any sequence $P = (c_{1}, c_{2}, \dots, c_{n})$ of $n$ positive real numbers

To maximize $\sum_{i = 1}^{n} (c_{i} \cdot \log_{2} x_{i})$ under the constraint $\sum_{i = 1}^{n} x_{i} = 1$ , we use the method of Lagrange multipliers.

Define the Lagrangian:

$$L (x_{1}, \dots, x_{n}, \lambda) = \sum_{i = 1}^{n} (c_{i} \cdot \log_{2} x_{i}) + \lambda \left( 1 - \sum_{i = 1}^{n} x_{i} \right)$$

Take the partial derivatives and set them to zero:

$$\frac{\partial L}{\partial x_{i}} = \frac{c_{i}}{x_{i} \ln 2} - \lambda = 0 \Rightarrow x_{i} = \frac{c_{i}}{\lambda \ln 2}$$

Using the constraint $\sum_{i = 1}^{n} x_{i} = 1$ :

$$\sum_{i = 1}^{n} \frac{c_{i}}{\lambda \ln 2} = 1 \Rightarrow \lambda \ln 2 = S_{P} \Rightarrow \lambda = \frac{S_{P}}{\ln 2}$$

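Thus, the only stationary point is:

$$x_{i} = \frac{c_{i}}{S_{P}}$$

Each term $c_{i} \cdot \log_{2} x_{i}$ is concave in $x_{i}$ because $c_{i} > 0$ , so the objective is concave on the constraint set and this stationary point is the global maximum.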
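A quick numerical experiment can make the result more tangible. The sketch below uses the example weights $(4, 2, 1, 1)$ and compares the objective at $x_{i} = c_{i} / S_{P}$ with its value at randomly drawn points of the simplex. The helper names and the sampling scheme are assumptions for illustration, and random sampling only supports the claim; it does not prove it.

```python
import math
import random

def objective(c, x):
    """Value of sum_i c_i * log2(x_i)."""
    return sum(ci * math.log2(xi) for ci, xi in zip(c, x))

def random_simplex_point(n, rng):
    """A random positive vector of length n, normalised to sum to 1."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

rng = random.Random(0)  # fixed seed so the run is repeatable
c = [4.0, 2.0, 1.0, 1.0]
s_p = sum(c)
x_star = [ci / s_p for ci in c]  # the claimed maximiser
best = objective(c, x_star)

# Smallest gap between the claimed maximum and any sampled value.
worst_gap = min(
    best - objective(c, random_simplex_point(len(c), rng))
    for _ in range(10_000)
)
print(best)  # -14.0
print(worst_gap >= 0)
```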

Q4: Show that any ordered binary tree $T \in T_{n}$ satisfies $W_{P} (T) \geq H_{P}$ for any sequence $P = (c_{1}, c_{2}, \dots, c_{n})$ of $n$ positive real numbers

To show that $W_{P} (T) \geq H_{P}$ , we combine the definitions of $W_{P} (T)$ and $H_{P}$ with the inequality from Q2 and the maximization result from Q3.

Recall:

$$W_{P} (T) = \sum_{i = 1}^{n} (c_{i} \cdot d_{T} (v_{i}))$$

$$H_{P} = - \sum_{i = 1}^{n} \left( c_{i} \cdot \log_{2} \left( \frac{c_{i}}{S_{P}} \right) \right)$$

We start by rewriting $H_{P}$ in a more convenient form:

$$H_{P} = \sum_{i = 1}^{n} c_{i} \cdot (\log_{2} (S_{P}) - \log_{2} (c_{i})) = \log_{2} (S_{P}) \cdot \sum_{i = 1}^{n} c_{i} - \sum_{i = 1}^{n} (c_{i} \cdot \log_{2} (c_{i}))$$

Since $\sum_{i = 1}^{n} c_{i} = S_{P}$ , we get:

$$H_{P} = S_{P} \cdot \log_{2} (S_{P}) - \sum_{i = 1}^{n} (c_{i} \cdot \log_{2} (c_{i}))$$
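As a worked instance of this form, take the weights from Q1, $P = (4, 2, 1, 1)$ with $S_{P} = 8$ :

$$H_{P} = 8 \log_{2} 8 - (4 \log_{2} 4 + 2 \log_{2} 2 + 1 \log_{2} 1 + 1 \log_{2} 1) = 24 - (8 + 2 + 0 + 0) = 14$$

This equals the value $W_{P} (T) = 14$ computed in Q1, so for these weights the bound $W_{P} (T) \geq H_{P}$ is attained with equality.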

Next, we turn the leaf depths into a set of weights that sum to 1. Kraft's inequality, which is the statement proved in Q2, relates the depths of the leaves of a binary tree to such weights. Let

$$K = \sum_{i = 1}^{n} 2^{- d_{T} (v_{i})} \leq 1$$

and define $x_{i} = 2^{- d_{T} (v_{i})} / K$ , so that the $x_{i}$ are positive and $\sum_{i = 1}^{n} x_{i} = 1$ .

Let $p_{i} = \frac{c_{i}}{S_{P}}$ be the probability associated with the $i$ -th leaf. By the definition of entropy:

$$H_{P} = - S_{P} \sum_{i = 1}^{n} p_{i} \log_{2} (p_{i}) = - \sum_{i = 1}^{n} c_{i} \log_{2} (p_{i})$$

By Q3, the sum $\sum_{i = 1}^{n} c_{i} \log_{2} x_{i}$ is largest when $x_{i} = p_{i}$ . This is Gibbs' inequality, which can also be derived from Jensen's inequality for the concave function $\log_{2}$ :

$$\sum_{i = 1}^{n} c_{i} \log_{2} (x_{i}) \leq \sum_{i = 1}^{n} c_{i} \log_{2} (p_{i}) = - H_{P}$$

Substituting $x_{i} = 2^{- d_{T} (v_{i})} / K$ into the left-hand side:

$$\sum_{i = 1}^{n} c_{i} \log_{2} (x_{i}) = - \sum_{i = 1}^{n} c_{i} \cdot d_{T} (v_{i}) - S_{P} \log_{2} (K) = - W_{P} (T) - S_{P} \log_{2} (K)$$

Combining the two and using $\log_{2} (K) \leq 0$ , which follows from $K \leq 1$ :

$$W_{P} (T) \geq H_{P} - S_{P} \log_{2} (K) \geq H_{P}$$

Thus, we have shown that $W_{P} (T) \geq H_{P}$ for any ordered binary tree $T \in T_{n}$ and any sequence $P = (c_{1}, c_{2}, \dots, c_{n})$ of $n$ positive real numbers.
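The bound can be illustrated numerically for the weights of Q1. The sketch below is only an illustration with hypothetical helper names; it computes $H_{P}$ from its definition and compares it with $W_{P} (T)$ for two depth assignments, the tree from Q1 and a balanced tree with all four leaves at depth 2.

```python
import math

def entropy_bound(c):
    """H_P = -sum_i c_i * log2(c_i / S_P)."""
    s_p = sum(c)
    return -sum(ci * math.log2(ci / s_p) for ci in c)

def weighted_path_length(c, depths):
    """W_P(T) = sum_i c_i * d_T(v_i) for the given leaf depths."""
    return sum(ci * di for ci, di in zip(c, depths))

c = [4, 2, 1, 1]
h_p = entropy_bound(c)
print(h_p)                                    # 14.0
print(weighted_path_length(c, [1, 2, 3, 3]))  # 14 (tree from Q1)
print(weighted_path_length(c, [2, 2, 2, 2]))  # 16 (balanced tree)
```

The Q1 tree meets the bound exactly because every $c_{i} / S_{P}$ is a power of $\frac{1}{2}$ matching $2^{- d_{T} (v_{i})}$ ; the balanced tree lies strictly above it.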

Key Concepts

Ordered binary trees, Huffman trees, information theory, Lagrange multipliers, mathematical induction

Key Vocabulary

Ordered binary tree 有序二叉树
Huffman tree 哈夫曼树
Depth 深度
Lagrange multiplier 拉格朗日乘子 (also 拉格朗日乘数)
Entropy 熵
Jensen’s inequality 詹森不等式
Gibbs’ inequality 吉布斯不等式
Kraft’s inequality 克拉夫特不等式


Zephyr's Notes on ISCS & CBMS, UTokyo
