2016

Problem 7

Let $T(n)$ denote the worst-case running time of an algorithm that processes input data of size $n\ (\geq 1)$. Let $⌊ x ⌋$ be the largest integer less than or equal to the real number $x$. Let $c\ (\geq 1)$ be a constant. Consider each of the following recurrences for $n > 1$, with $T(1) = c$ in every case:

$T (n) = T (⌊ 3 n /4 ⌋) + c n$
$T (n) = 2 T (n - 1) + c n$
$T (n) = T (n - 1) + c (n^{2} + n)$
$T (n) = T (⌊ n /2 ⌋) + c$
$T (n) = 2 T (⌊ n /2 ⌋) + c n$

For each of the above recurrences, select the smallest of the following complexity classes that contains $T(n)$, and prove your answer: $O(1), O(\log n), O(n), O(n \log n), O(n^{2}), O(n^{3}), O(3^{n})$.
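As an informal sanity check before attempting the proofs, the five recurrences can be evaluated exactly for small $n$. The sketch below is only a numerical aid under the stated base case $T(1) = c$. The function names are hypothetical and follow the order in which the recurrences are listed.

```c
#include <stdint.h>

/* Each function evaluates one recurrence exactly, with T(1) = c.
   Values overflow uint64_t quickly for the second recurrence,
   so keep n small (well below 60) when calling t_double_prev. */

/* First recurrence: T(n) = T(floor(3n/4)) + c*n */
uint64_t t_three_quarters(uint64_t n, uint64_t c) {
    return n <= 1 ? c : t_three_quarters(3 * n / 4, c) + c * n;
}

/* Second recurrence: T(n) = 2*T(n-1) + c*n */
uint64_t t_double_prev(uint64_t n, uint64_t c) {
    uint64_t t = c;
    for (uint64_t k = 2; k <= n; k++) t = 2 * t + c * k;
    return t;
}

/* Third recurrence: T(n) = T(n-1) + c*(n^2 + n) */
uint64_t t_quadratic_step(uint64_t n, uint64_t c) {
    uint64_t t = c;
    for (uint64_t k = 2; k <= n; k++) t += c * (k * k + k);
    return t;
}

/* Fourth recurrence: T(n) = T(floor(n/2)) + c */
uint64_t t_halving(uint64_t n, uint64_t c) {
    return n <= 1 ? c : t_halving(n / 2, c) + c;
}

/* Fifth recurrence: T(n) = 2*T(floor(n/2)) + c*n */
uint64_t t_divide_conquer(uint64_t n, uint64_t c) {
    return n <= 1 ? c : 2 * t_divide_conquer(n / 2, c) + c * n;
}
```

Doubling $n$ and watching how each value changes gives an informal hint about the growth rate. It does not replace the requested proof.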

Problem 8

Answer the following questions about linear algebra.

Compute the inverse of the following matrix:
$\begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}.$
Consider data points $(x_{i}, y_{i}),\ i = 1, \dots, n$, in a two-dimensional space. The variance with respect to the x-axis, the variance with respect to the y-axis, and the covariance are respectively defined as
$σ_{x} = \frac{1}{n} \sum_{i = 1}^{n} (x_{i} - \bar{x})^{2}, \quad σ_{y} = \frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}, \quad σ_{x y} = \frac{1}{n} \sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})$
where $\bar{x}, \bar{y}$ denote the averages with respect to the x and y axes, respectively.

A: Compute the variance-covariance matrix
$\begin{pmatrix} σ_{x} & σ_{x y} \\ σ_{x y} & σ_{y} \end{pmatrix}$
for the following data points, $(- 2, - 2), (2, 2), (1, - 1), (- 1, 1)$ .

B: Compute all eigenvalues and eigenvectors of the variance-covariance matrix.
Prove that, if the eigenvalues of a regular (invertible) matrix $A$ are $λ_{1}, \dots, λ_{n}$, then those of the inverse matrix $A^{-1}$ are $1/λ_{1}, \dots, 1/λ_{n}$.
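The definitions above translate directly into a few lines of C. The following is a minimal sketch and not a model answer. The helper names `covariance_2d` and `invert_2x2` are hypothetical, and the $1/n$ normalisation is taken from the formulas in this problem.

```c
#include <stddef.h>

/* Variances and covariance with the 1/n normalisation used in this problem. */
void covariance_2d(const double x[], const double y[], size_t n,
                   double *var_x, double *var_y, double *cov_xy) {
    double mean_x = 0.0, mean_y = 0.0;
    for (size_t i = 0; i < n; i++) { mean_x += x[i]; mean_y += y[i]; }
    mean_x /= (double)n;
    mean_y /= (double)n;

    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dx = x[i] - mean_x, dy = y[i] - mean_y;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    *var_x = sxx / (double)n;
    *var_y = syy / (double)n;
    *cov_xy = sxy / (double)n;
}

/* Inverse of a 2x2 matrix via the adjugate formula.
   Returns 0 on success, -1 if the determinant is zero. */
int invert_2x2(double m[2][2], double inv[2][2]) {
    double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
    if (det == 0.0) return -1;
    inv[0][0] =  m[1][1] / det;
    inv[0][1] = -m[0][1] / det;
    inv[1][0] = -m[1][0] / det;
    inv[1][1] =  m[0][0] / det;
    return 0;
}
```

Calling `covariance_2d` on the four data points of part A, or `invert_2x2` on the matrix of the first question, gives values that can be checked against a hand calculation.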

Problem 9

We sort an integer array, x. Answer the following questions, assuming that each integer operation, such as comparison, addition, subtraction, loading from memory, and storing to memory, takes unit time.

Fill (A) and (B) to complete the function below that constructs an array $y [s, s + 1, \dots, e - 1]$ sorted in ascending order by merging two subarrays $x [s, s + 1, \dots, m - 1]$ and $x [m, m + 1, \dots, e - 1]$ , each of which is sorted in ascending order. You may write multiple lines if needed.

void merge_two_arrays(int x[], int s, int m, int e, int y[]) {
    int i = s, j = m, k = s;
    while(i < m && j < e) {
        if(x[i] < x[j]) {
            // (A)
        } else {
            // (B)
        }
    }
    while(i < m) { y[k] = x[i]; k++; i++; }
    while(j < e) { y[k] = x[j]; k++; j++; }
}

Fill (C) to complete the function below that takes an integer array $x [s, s + 1, \dots, e - 1]$ as an input and outputs the sorted array $x [s, s + 1, \dots, e - 1]$ . Array $y$ is a temporary array of the same size as $x$ .

void merge_sort(int x[], int s, int e, int y[]) {
    if(e - s <= 1) return;
    int m = (s + e) / 2;
    // (C)
    merge_two_arrays(x, s, m, e, y);
    for(int i = s; i < e; i++) { x[i] = y[i]; }
}
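One possible way to fill the blanks (A), (B), and (C) is sketched below. It is only one reading of the exercise: each branch copies the smaller head element into `y` and advances the corresponding indices, and (C) sorts the two halves recursively before merging.

```c
/* Merge sorted runs x[s..m-1] and x[m..e-1] into y[s..e-1]. */
void merge_two_arrays(int x[], int s, int m, int e, int y[]) {
    int i = s, j = m, k = s;
    while (i < m && j < e) {
        if (x[i] < x[j]) {
            y[k] = x[i]; k++; i++;   /* candidate for (A) */
        } else {
            y[k] = x[j]; k++; j++;   /* candidate for (B) */
        }
    }
    while (i < m) { y[k] = x[i]; k++; i++; }
    while (j < e) { y[k] = x[j]; k++; j++; }
}

/* Sort x[s..e-1] in place, using y as scratch space of the same size. */
void merge_sort(int x[], int s, int e, int y[]) {
    if (e - s <= 1) return;
    int m = (s + e) / 2;
    merge_sort(x, s, m, y);          /* candidate for (C): sort left half */
    merge_sort(x, m, e, y);          /*                    sort right half */
    merge_two_arrays(x, s, m, e, y);
    for (int i = s; i < e; i++) { x[i] = y[i]; }
}
```

A call such as `merge_sort(x, 0, n, y)` sorts the first `n` elements of `x`, provided `y` has room for at least `n` integers.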

Find the worst-case time complexity of sorting an integer array of size $n$ by merge_sort().
To accelerate sorting by merge_sort(), we implement hardware that computes a function cmp(x1, x2, x3), which returns 1 when x1 is the smallest element among x1, x2, and x3, 2 when x2 is the smallest, and 3 when x3 is the smallest. Show how to accelerate merge_sort() using the function cmp(). We assume that a call to cmp() and a branch on its return value each take unit time.
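One way to use cmp() is a three-way merge sort: split the range into three parts, sort each recursively, and pick the smallest of the three head elements with a single cmp() call. The sketch below assumes this reading. The software `cmp` is a stand-in for the hardware function with an assumed tie-breaking rule, and `merge_three` and `merge_sort3` are hypothetical names.

```c
/* Software stand-in for the hardware cmp(); ties go to the earlier argument
   (an assumption, since the problem does not specify tie-breaking). */
static int cmp(int x1, int x2, int x3) {
    if (x1 <= x2 && x1 <= x3) return 1;
    if (x2 <= x3) return 2;
    return 3;
}

/* Merge sorted runs x[s..m1-1], x[m1..m2-1], x[m2..e-1] into y[s..e-1]. */
static void merge_three(int x[], int s, int m1, int m2, int e, int y[]) {
    int i = s, j = m1, l = m2, k = s;
    while (i < m1 && j < m2 && l < e) {          /* one cmp() per output element */
        switch (cmp(x[i], x[j], x[l])) {
            case 1:  y[k++] = x[i++]; break;
            case 2:  y[k++] = x[j++]; break;
            default: y[k++] = x[l++]; break;
        }
    }
    /* At least one run is now exhausted; finish with ordinary two-way merging. */
    while (i < m1 && j < m2) y[k++] = (x[i] <= x[j]) ? x[i++] : x[j++];
    while (i < m1 && l < e)  y[k++] = (x[i] <= x[l]) ? x[i++] : x[l++];
    while (j < m2 && l < e)  y[k++] = (x[j] <= x[l]) ? x[j++] : x[l++];
    while (i < m1) y[k++] = x[i++];
    while (j < m2) y[k++] = x[j++];
    while (l < e)  y[k++] = x[l++];
}

/* Sort x[s..e-1] in place, using y as scratch space of the same size. */
void merge_sort3(int x[], int s, int e, int y[]) {
    int n = e - s;
    if (n <= 1) return;
    int m1 = s + n / 3, m2 = s + 2 * n / 3;      /* each part is smaller than n */
    merge_sort3(x, s, m1, y);
    merge_sort3(x, m1, m2, y);
    merge_sort3(x, m2, e, y);
    merge_three(x, s, m1, m2, e, y);
    for (int i = s; i < e; i++) x[i] = y[i];
}
```

With this split the recursion depth is about $\log_{3} n$ instead of $\log_{2} n$, so the running time stays in $O(n \log n)$ and only the constant factor changes.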

Problem 10

In a directed graph, a path is a series of one or more arcs that connect a series of vertices. A cycle is a path that starts and ends on the same vertex. An Eulerian path (cycle, respectively) is a path (cycle) that visits every arc exactly once.

Next, we create a directed graph from a string $L = c_{1} c_{2} \dots c_{n}\ (n \geq 2)$. Let $s_{i, k}$ denote $c_{i} \dots c_{i + k - 1}$, the substring of length $k\ (\geq 1)$ that starts at position $i$ in $L$. Let $G_{L, k}$ denote the directed graph whose vertex set is $\{s_{i, k} \mid i = 1, \dots, n - k + 1\}$ and whose set of labeled arcs is $\{(s_{i, k}, s_{i + 1, k}, i) \mid i = 1, \dots, n - k\}$, where the third component $i$ is the label. Answer the following questions.

When $L = \mathrm{ACACA}$, the vertex set and labeled arc set of $G_{L, 2}$ are $\{\mathrm{AC}, \mathrm{CA}\}$ and $\{(\mathrm{AC}, \mathrm{CA}, 1), (\mathrm{CA}, \mathrm{AC}, 2), (\mathrm{AC}, \mathrm{CA}, 3)\}$, respectively. List all Eulerian paths and cycles in $G_{L, 2}$.
When $L = \mathrm{GCGCGCAGCG}$, list all Eulerian paths and cycles in each of $G_{L, 3}$ and $G_{L, 4}$.
A vertex is balanced if the number of arcs entering the vertex is equal to the number of arcs leaving the vertex. A directed graph is balanced if every vertex is balanced. A directed graph is connected if it has a path from any vertex to any vertex. Prove that a directed graph with an Eulerian cycle is connected and balanced.
Conversely, if a graph is connected and balanced, prove that the graph has an Eulerian cycle.
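The construction of $G_{L, k}$ and the balance condition can be sketched in C as follows. This is an illustrative helper and not part of the problem: the names `build_graph`, `is_balanced`, and `MAX_KMERS` are hypothetical, vertices are numbered from 0 in order of first appearance, and arc number `a` (0-based) corresponds to label `a + 1`.

```c
#include <string.h>

#define MAX_KMERS 256   /* hypothetical capacity limit for this sketch */

/* Builds the arcs of G_{L,k}: arc a goes from vertex from[a] to vertex to[a].
   from and to must have room for strlen(L) - k entries.
   Returns the number of arcs, or -1 if the input does not fit. */
int build_graph(const char *L, int k, int from[], int to[], int *num_vertices) {
    int n = (int)strlen(L);
    int count = n - k + 1;              /* number of length-k substrings */
    if (k < 1 || count < 1 || count > MAX_KMERS) return -1;

    int first_pos[MAX_KMERS];           /* start of first occurrence of each vertex */
    int id[MAX_KMERS];                  /* vertex id of the substring starting at i */
    int nv = 0;
    for (int i = 0; i < count; i++) {
        int v = 0;
        while (v < nv && strncmp(L + first_pos[v], L + i, (size_t)k) != 0) v++;
        if (v == nv) first_pos[nv++] = i;   /* substring not seen before */
        id[i] = v;
    }
    for (int a = 0; a + 1 < count; a++) { from[a] = id[a]; to[a] = id[a + 1]; }
    *num_vertices = nv;
    return count - 1;
}

/* Returns 1 if every vertex has equal in-degree and out-degree, else 0. */
int is_balanced(const int from[], const int to[], int num_arcs, int num_vertices) {
    int diff[MAX_KMERS] = {0};          /* out-degree minus in-degree */
    for (int a = 0; a < num_arcs; a++) { diff[from[a]]++; diff[to[a]]--; }
    for (int v = 0; v < num_vertices; v++) {
        if (diff[v] != 0) return 0;
    }
    return 1;
}
```

For the string ACACA with $k = 2$, `build_graph` yields three arcs over two vertices, matching the arc set given in the problem, and `is_balanced` reports that the graph is not balanced.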

Problem 11

Suppose that there is an urn that contains $m$ black balls and $(l - m)$ white balls $(0 < m < l)$ . You randomly draw $n$ balls with replacement $(n > 0)$ . Answer the following questions with explanation.

Find the probability that you draw a black ball for the first time at the $k$ -th draw $(1 < k < n)$ .
Suppose that you have drawn a black ball for the first time at the $k$ -th draw. Find the probability that you draw one or more black balls in the remaining $(n - k)$ draws.

Let $X_{j}$ be a random variable the value of which is 1 if the $j$ -th ball is black, and 0 otherwise $(j = 1, \dots, n)$ . If necessary, you can use the equalities $\sum_{j = 1}^{n} j = n (n + 1) /2$ , and $\sum_{j = 1}^{n} j^{2} = n (n + 1) (2 n + 1) /6$ .
Find the expected value $E [X_{j}]$ of $X_{j}$ .
Let $R = \sum_{j = 1}^{n} j X_{j}$ . Find the expected value $E [R]$ of $R$ .
Find the variance $Var [R] = E [R^{2}] - (E [R])^{2}$ of $R$ .
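A derived closed form for $E[R]$ and $Var[R]$ can be checked numerically by brute force. The sketch below enumerates all $2^{n}$ colour sequences, assuming independent draws with black probability $p = m/l$ (which follows from drawing with replacement). The function name `moments_of_R` is hypothetical, and the approach is only practical for small $n$.

```c
/* Computes E[R] and Var[R] for R = sum_j j * X_j by enumerating every
   black/white sequence of length n. Bit (j-1) of mask is 1 when the
   j-th ball is black. Intended for small n (say n <= 20). */
void moments_of_R(int n, double p, double *mean, double *variance) {
    double e1 = 0.0, e2 = 0.0;           /* running E[R] and E[R^2] */
    for (unsigned long mask = 0; mask < (1UL << n); mask++) {
        double prob = 1.0;               /* probability of this sequence */
        double r = 0.0;                  /* value of R for this sequence */
        for (int j = 1; j <= n; j++) {
            if ((mask >> (j - 1)) & 1UL) { prob *= p; r += j; }
            else                         { prob *= 1.0 - p; }
        }
        e1 += prob * r;
        e2 += prob * r * r;
    }
    *mean = e1;
    *variance = e2 - e1 * e1;
}
```

For $p = 1/2$ and $n = 3$ the enumeration gives a mean of 3 and a variance of 3.5, which agrees with $p\, n (n + 1) / 2$ and $p (1 - p)\, n (n + 1) (2 n + 1) / 6$.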

Problem 12

Assume that a global alignment of two sequences, $x_{1}, \dots, x_{m}$ and $y_{1}, \dots, y_{n}$, is calculated by dynamic programming using the iterative formula (A).

$$F(i, j) = \max \begin{cases} F(i - 1, j - 1) + s(x_{i}, y_{j}) \\ F(i - 1, j) - d \\ F(i, j - 1) - d \end{cases} \tag{A}$$

where $s (x_{i}, y_{j})$ is the score of aligning $x_{i}$ and $y_{j}$ , $F (i, j)$ is the maximum score of the alignments of $x_{1}, \dots, x_{i}$ and $y_{1}, \dots, y_{j}$ . Assume that $d > 0$ and that score $g (k)$ is given to a gap of length $k$ . Answer the following questions (1) – (5).

Show the general form of $g (k)$ .
Show the initial values $F (i, 0)$ for $i = 1, \dots, m$ and $F (0, j)$ for $j = 1, \dots, n$ , so that the maximum score of the alignments of the two sequences $x_{1}, \dots, x_{m}$ and $y_{1}, \dots, y_{n}$ is obtained as $F (m, n)$ after updating the iterative formula (A) for $i = 1, \dots, m$ and $j = 1, \dots, n$ . Notice that $F (0, 0) = 0$ .
Evaluate the computational time of calculating the maximum score of the alignments of the two sequences $x_{1}, \dots, x_{m}$ and $y_{1}, \dots, y_{n}$ , and describe it by using $m$ and $n$ .
When updating formula (A) for $i = 1, \dots, m$ and $j = 1, \dots, n$ , $π (i, j)$ is defined as follows:

Among the values of $F (i - 1, j - 1) + s (x_{i}, y_{j})$ , $F (i - 1, j) - d$ and $F (i, j - 1) - d$ ,

when $F (i - 1, j - 1) + s (x_{i}, y_{j})$ is the maximum, $π (i, j) = (i - 1, j - 1)$ ,

otherwise, when $F (i - 1, j) - d$ is the maximum, $π (i, j) = (i - 1, j)$ ,

otherwise, $π (i, j) = (i, j - 1)$ .

Briefly explain, within five lines, the role of $π (i, j)$ in the alignment algorithm.
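The update rule (A) together with the pointer $π (i, j)$ can be sketched as a small C routine. Several choices here are assumptions and not part of the problem: a substitution score of +1 for a match and -1 for a mismatch, a linear gap score $g(k) = -k d$ with matching boundary values $F(i, 0) = -i d$ and $F(0, j) = -j d$, boundary pointers for the traceback, and the names `global_align` and `MAXLEN`.

```c
#include <limits.h>
#include <string.h>

#define MAXLEN 64                  /* hypothetical length limit for this sketch */
enum { DIAG, UP, LEFT };           /* pi(i,j) = (i-1,j-1), (i-1,j), (i,j-1) */

/* Assumed substitution score s(a, b): +1 for a match, -1 for a mismatch. */
static int s(char a, char b) { return a == b ? 1 : -1; }

/* Fills F by formula (A), records pi, then traces back from (m, n).
   ax and ay must hold strlen(x) + strlen(y) + 1 characters.
   Returns F(m, n), or INT_MIN if an input is longer than MAXLEN. */
int global_align(const char *x, const char *y, int d, char *ax, char *ay) {
    static int F[MAXLEN + 1][MAXLEN + 1];
    static unsigned char pi[MAXLEN + 1][MAXLEN + 1];
    size_t lx = strlen(x), ly = strlen(y);
    if (lx > MAXLEN || ly > MAXLEN) return INT_MIN;
    int m = (int)lx, n = (int)ly;

    F[0][0] = 0;
    /* Boundary pointers are an extension for the traceback; the problem
       defines pi only for i, j >= 1. */
    for (int i = 1; i <= m; i++) { F[i][0] = -i * d; pi[i][0] = UP; }
    for (int j = 1; j <= n; j++) { F[0][j] = -j * d; pi[0][j] = LEFT; }

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            int diag = F[i - 1][j - 1] + s(x[i - 1], y[j - 1]);
            int up   = F[i - 1][j] - d;
            int left = F[i][j - 1] - d;
            /* Same priority as in the definition of pi: diagonal, then up, then left. */
            if (diag >= up && diag >= left) { F[i][j] = diag; pi[i][j] = DIAG; }
            else if (up >= left)            { F[i][j] = up;   pi[i][j] = UP; }
            else                            { F[i][j] = left; pi[i][j] = LEFT; }
        }
    }

    /* Traceback: follow pi from (m, n) to (0, 0), emitting columns in reverse. */
    int i = m, j = n, len = 0;
    while (i > 0 || j > 0) {
        if (pi[i][j] == DIAG)    { ax[len] = x[--i]; ay[len] = y[--j]; }
        else if (pi[i][j] == UP) { ax[len] = x[--i]; ay[len] = '-'; }
        else                     { ax[len] = '-';    ay[len] = y[--j]; }
        len++;
    }
    ax[len] = ay[len] = '\0';
    for (int a = 0, b = len - 1; a < b; a++, b--) {   /* reverse both rows */
        char t = ax[a]; ax[a] = ax[b]; ax[b] = t;
        t = ay[a]; ay[a] = ay[b]; ay[b] = t;
    }
    return F[m][n];
}
```

Under these assumptions, aligning `"ACGT"` with `"AGT"` and $d = 1$ returns a score of 2 with the rows `ACGT` and `A-GT`. The two nested loops visit each of the $m \times n$ cells once.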

Zephyr's Notes on ISCS & CBMS, UTokyo
