2020S2

Problem 1

Let be a propositional variable, and be a literal (i.e., a propositional variable or negation of a propositional variable). In this problem, a propositional formula of the following form is called a clause.

If , it is of the form , and if , it is of the form . Hereinafter, is a set of clauses, and is a set of propositional variables. If all clauses in are true under the interpretation in which all propositional variables in are true and the other variables are false, then is called a model of . The inclusion relation between sets is naturally defined between models.

Answer the following questions.

  1. Let . Enumerate all the subsets of that are models of .

We write for the set of clauses obtained from by (i) removing all the clauses that contain negation of a propositional variable in in the left hand side of , and then (ii) deleting all the negated literals (negation of propositional variables) from the remaining clauses.

  1. For in question (1), if , what is ?

  2. Show that if a model of satisfies , then is a model of .

  3. Show that if a model of satisfies , then is a model of .

  4. For and in question (2), obtain the minimum model of . Here, a model of is called a minimum model of if holds for every model of .

  5. Show that if the minimum model of coincides with , then is a minimal model of . Here, a model of is called a minimal model of if there does not exist a model of such that .

  6. Is a minimal model of always a minimum model of ? If so, prove the fact. Otherwise, give a counterexample.


是一个命题变量, 是一个文字(即,命题变量或命题变量的否定)。在这个问题中,以下形式的命题公式称为子句。

如果 ,则形式为 ,如果 ,则形式为 。以下, 是一组子句, 是一组命题变量。如果在解释中 中的所有子句在 中所有命题变量为真且其他变量为假的情况下为真,则称 的模型。模型之间的包含关系自然地定义在集合之间。

回答以下问题。

  1. 。枚举 的所有子集,它们是 的模型。

我们写 来表示通过 (i) 删除所有在 的左边包含 中命题变量的否定的子句,并 (ii) 从剩余的子句中删除所有否定的文字(命题变量的否定)后从 得到的子句集合。

  1. 对于问题 (1) 中的 ,如果 ,那么 是什么?

  2. 证明如果 的一个模型 满足 ,那么 的一个模型。

  3. 证明如果 的一个模型 满足 ,那么 的一个模型。

  4. 对于问题 (2) 中的 ,求出 的最小模型。这里,如果 的一个模型 满足对于 的每一个模型 ,有 ,则称 的最小模型。

  5. 证明如果 的最小模型与 重合,则 的最小模型。这里,如果 的一个模型 满足不存在 的一个模型 使得 ,则称 的最小模型。

  6. 一个 的最小模型 是否总是 的最小模型?如果是这样,证明这一事实。否则,给出一个反例。


Problem 2

Consider the problem of obtaining a sequence of part-of-speech (POS) tags for a given natural language sentence (word sequence) . For example, for the following four-word sentence

our goal is to output the following POS tag sequence.

In this example, NOUN, VERB, and DET are POS tags, denoting noun, verb, and determiner, respectively. is a finite set of all words, and is a finite set of all POS tags. Suppose that a POS-tagged corpus ( is a -th sentence in , is its POS tag sequence, and is the number of elements in ) is given as training data. In the following, for a word sequence , its length is represented as .

Answer the following questions.

  1. Consider the probability for assigning a POS tag to a word , and define the probability function for assigning a POS tag sequence to a word sequence as follows.

Supposing each element of training data is independently distributed following , answer a method for computing the maximum likelihood estimate of .

  1. Assume that is given for each . Describe an algorithm to obtain a POS tag sequence that maximizes for an input sentence .

  2. Consider the probability for assigning a POS tag to a word when a word immediately precedes in a sentence. Note that is considered as a special word <s> when is the first word of the sentence. The probability function is defined as follows.

Supposing each element of training data is independently distributed following , answer a method for computing the maximum likelihood estimate of .

Also, assuming that is given for each , describe an algorithm to obtain a POS tag sequence that maximizes for an input sentence .

  1. Explain why POS tagging using hidden Markov models is expected to attain higher accuracy than the methods described in questions (1) to (3). You must describe the definition of hidden Markov models and the POS tagging algorithm using hidden Markov models, and provide an explanation including an example where the methods described in questions (1) to (3) output a wrong POS tag but the POS tagging using hidden Markov models outputs a correct POS tag.

考虑为一个给定的自然语言句子(单词序列) 获得词性标签(POS 标签)序列 的问题。例如,对于以下四个单词的句子

我们的目标是输出以下词性标签序列。

在这个例子中,NOUN, VERB 和 DET 是词性标签,分别表示名词、动词和限定词。 是所有单词的有限集合, 是所有词性标签的有限集合。假设给定了一个词性标注的语料库 中的第 个句子, 是其词性标签序列, 中的元素数目)作为训练数据。在下文中,对于一个单词序列 ,其长度 表示为

回答以下问题。

  1. 考虑为单词 分配词性标签 的概率 ,并定义为单词序列 分配词性标签序列 的概率函数 如下。

假设训练数据 的每个元素是独立分布的,服从 ,回答计算 的最大似然估计的方法。

  1. 假设每个 给定了 。描述一个算法以获得对于输入句子 最大化 的词性标签序列

  2. 考虑当单词 在句子中立即出现在 之前时,为单词 分配词性标签 的概率 。注意,当 是句子的第一个单词时, 被认为是一个特殊的单词 <s>。概率函数 定义如下。

假设训练数据 的每个元素是独立分布的,服从 ,回答计算 的最大似然估计的方法。

同样,假设每个 给定了 ,描述一个算法以获得对于输入句子 最大化 的词性标签序列

  1. 解释为什么使用隐马尔可夫模型的词性标注方法预期会比问题 (1) 到 (3) 中描述的方法获得更高的准确率。你必须描述隐马尔可夫模型的定义和使用隐马尔可夫模型的词性标注算法,并提供一个包括在问题 (1) 到 (3) 中描述的方法输出错误词性标签但使用隐马尔可夫模型的词性标注方法输出正确词性标签的例子的解释。

Problem 3

Let us consider representing an orientation of an object in a three-dimensional space as a rotation from a canonical orientation. In the following, we consider the orientation of the object consisting of four cubes as shown in Figures 1 and 2 (note that each pair of cubes placed side by side shares a surface). We also define the canonical orientation as the one illustrated in Figure 1. Note that the center of the cube at the lower left in Figure 1 is placed at the origin, and the center of each cube lies on the axis. Figure 2 shows another orientation of this object, in which the center of the cube at the lower right is placed at the origin, and the center of each cube lies on the axis. When you draw another orientation of this object, you need to draw it with the axes from the same point of view as Figures 1 and 2. Angles must be represented in radian.

Answer the following questions.

  1. Draw the object with the orientation represented by Euler angles .

  2. Answer Euler angles given as the elementwise arithmetic mean of the two triplets of Euler angles, (corresponding to the canonical orientation) and . Also, draw the object with the orientation represented by the mean Euler angles.

Let us represent an orientation of the object using a transformation matrix that corresponds to a rotation from the canonical orientation given in Figure 1. Note that a point on the object at a three-dimensional coordinate moves to another three-dimensional coordinate through the rotation by a transformation matrix .

  1. Answer the transformation matrix that represents the orientation shown in Figure 2.

  2. Answer the transformation matrix given as the elementwise arithmetic mean of the two transformation matrices corresponding to the orientations shown in Figures 1 and 2. Also, draw and describe the shape of the object obtained by applying the mean transformation matrix to the object shown in Figure 1.

Let us represent an orientation of the object using a quaternion that expresses a rotation from the canonical orientation given in Figure 1. A quaternion is a four-dimensional unit vector, and a quaternion that expresses the three-dimensional rotation centered at the origin around a three-dimensional unit vector with an angle is represented as .

Note that the angle of a rotation around the three-dimensional unit vector is positive if the rotation is in the clockwise direction when the unit vector is viewed from its start point.

  1. Answer the quaternion that corresponds to the orientation shown in Figure 2. There are two answers for this question; give both of the two answers.

  2. Answer the quaternion given as the spherical linear average of the two quaternions corresponding to the orientations shown in Figures 1 and 2. There are two answers for this question; give both of the two answers. Also, draw the object with the orientation represented by each of the average quaternions. Note that the spherical linear average of two quaternions and is given as where ; the spherical linear average is undefined when .


让我们考虑在三维空间中表示物体的方向为相对于标准方向的旋转。在下文中,我们考虑由图 1 和图 2 所示的四个立方体组成的物体的方向(注意每对并排放置的立方体共享一个表面)。我们还将标准方向定义为图 1 所示的方向。注意,图 1 左下方立方体的中心放置在原点,每个立方体的中心都位于 轴上。图 2 显示了该物体的另一种方向,其中右下角立方体的中心放置在原点,并且每个立方体的中心都位于 轴上。当你绘制该物体的另一个方向时,你需要从与图 1 和图 2 相同的视角绘制 轴。角度必须用弧度表示。

回答以下问题。

  1. 画出由欧拉角 表示的方向的物体。

  2. 回答由欧拉角 (对应于标准方向)和 的两组三元组的元素平均值给出的欧拉角。还要画出由平均欧拉角表示的方向的物体。

我们使用 转换矩阵表示物体的方向,该转换矩阵对应于图 1 给出的标准方向的旋转。注意,物体上在三维坐标 上的点通过转换矩阵 旋转到另一个三维坐标

  1. 回答表示图 2 所示方向的转换矩阵。

  2. 回答由图 1 和图 2 所示方向的两个转换矩阵的元素平均值给出的转换矩阵。还要描述通过对图 1 中显示的物体应用平均转换矩阵得到的物体的形状。

我们使用四元数表示物体的方向,四元数表示从图 1 给出的标准方向的旋转。四元数是一个四维单位向量,表示绕三维单位向量 在原点的三维旋转的四元数表示为

注意,如果旋转方向是从单位向量的起点看时是顺时针方向,则围绕三维单位向量的旋转角度为正。

  1. 回答表示图 2 所示方向的四元数。这个问题有两个答案,请给出两个答案。

  2. 回答由图 1 和图 2 所示方向的两个四元数的球面线性平均给出的四元数。这个问题有两个答案,请给出两个答案。还要绘制由每个平均四元数表示的方向的物体。注意,两个四元数 的球面线性平均表示为 ,其中 ;当 时,球面线性平均未定义。


Problem 4

Consider a connected undirected graph with positive edge weights. A subgraph of obtained by removing some of the edges in is called a spanning tree of , if is a tree. The summation of weights of all the edges in a spanning tree is called the weight of the spanning tree. A minimum spanning tree of is a spanning tree of whose weight is minimum. You can assume appropriate data representation for graphs and trees in the questions below.

Answer the following questions.

  1. Let be the edge (or arbitrary one of the edges if there are multiple such edges) with the maximum weight in some arbitrary cycle in . Prove that there is a minimum spanning tree of that does not contain .

  2. Consider an arbitrary vertex subset of () for . Let be the edge (or arbitrary one of the edges if there are multiple such edges) with the minimum weight among the edges such that and . Prove that there is a minimum spanning tree that contains . Note that denotes an empty set.

  3. Describe an -time algorithm that finds an arbitrary path between two nodes on graph .

  4. Assume that we are given a graph and its minimum spanning tree . Let be the graph obtained by adding to a new edge with weight . Describe an -time algorithm that finds a minimum spanning tree of .

  5. Prove the correctness of the algorithm described in question (4).


考虑一个具有正边权的连通无向图 。如果 的一个子图 是一个树,则称其为 的生成树。生成树中所有边的权重之和称为生成树的权重。 的最小生成树是 的一个生成树,其权重最小。你可以在以下问题中假设图和树的适当数据表示。

回答以下问题。

  1. 是在 的某个任意循环 中具有最大权重的边(如果有多条这样的边,则任意选取一条)。证明存在一个不包含 的最小生成树。

  2. 对于 的任意顶点子集 (),设 是权重最小的边(如果有多条这样的边,则任意选取一条),该边位于 的顶点之间,即 使得 。证明存在一个包含 的最小生成树。注意, 表示空集。

  3. 描述一个 时间的算法,用于找到图 上的两个节点 之间的任意路径。

  4. 假设我们给定一个图 及其最小生成树 。设 是通过向 中添加一条新边 ,且权重 得到的图。描述一个 时间的算法,以找到 的一个最小生成树。

  5. 证明问题 (4) 中描述的算法的正确性。


Problem 5

Suppose that is a real function defined on a closed interval from to . Suppose that is an integer that is no less than 2, and define . Then, for each integer , define and , respectively. Namely, are the points that divide the interval from to into equal parts, and is the value of the function at .

Next, define , and define as the approximate value calculated by the composite trapezoid rule applied on using the points which divide the interval from to into equal parts.

Answer the following questions.

  1. Assume that is a four times continuously differentiable function. Let be an integer such that and define as the second order differential of at . Express an approximate value of whose error is , as a linear combination of , and .

  2. The approximation obtained by question (1) seems to become accurate when approaches zero. Answer, with a reason, whether this is correct or not in the calculation with the IEEE 754 double precision floating point operations.

  3. Express using , and .

  4. Assume that can be expressed by a quadratic function in each interval formed by the division into equal parts. Then, define similarly using the division into equal parts composed of the division of each original part into two halves. Express using and .


假设 是一个定义在从 的闭区间上的实函数。假设 是不小于 2 的整数,并定义 。然后,对于每个整数 ,分别定义 。即, 是将从 的区间分成 等分的点, 是函数 处的值。

接下来,定义 ,并定义 为使用复合梯形规则计算的近似值,该规则适用于使用将区间 分成 等分的点上的

回答以下问题。

  1. 假设 是四次连续可微函数。设 是整数,使得 ,并定义 处的二阶导数。表示 的一个近似值,其误差为 ,作为 的线性组合。

  2. 问题 (1) 中获得的近似值在 趋近于零时似乎变得准确。回答,在使用 IEEE 754 双精度浮点运算进行计算时,这是否正确,并给出理由。

  3. 使用 表示

  4. 假设 可以在每个由分成 等分形成的区间中用二次函数表示。然后,类似地定义 ,使用将每个原始部分分成两半的 等分的划分。使用 表示


Problem 6

The probability density function of the normal distribution with mean and variance is given by

Let and be random variables that independently follow and , respectively, and define for some constant . For an integer , let be two-dimensional random variables that independently follow the same distribution as , for which we write and .

Answer the following questions.

  1. Express the expectation and variance of using and .

  2. Show that the conditional distribution of given is a normal distribution, and express its expectation and variance using , , and .

  3. Let denote a realization of . Express the joint probability density function of using and .

  4. Consider maximum-likelihood estimation of by the EM algorithm for the case where the observation of is missing from , that is, the case where is observed. Then the update rule of estimators of by the EM algorithm for some initial value is given by

where and are the values obtained by the substitution in the expressions of and obtained in question (2), respectively, and denotes the expectation when follows and is fixed.

(i) Express using and .

(ii) Express using and .


正态分布 的概率密度函数,其均值 和方差 ,表示为

是独立服从 的随机变量,定义 ,其中 为常数。对于整数 ,设 为服从 相同分布的二维随机变量,我们记

回答以下问题。

  1. 使用 表示 的期望 和方差

  2. 证明 在给定 条件下的条件分布是正态分布,并使用 表示其期望 和方差

  3. 表示 的一个实现。使用 表示 的联合概率密度函数

  4. 考虑使用 EM 算法对 的极大似然估计的情况,其中 的观测值缺失于 ,即 被观测到的情况。然后,使用 EM 算法对某些初始值 的估计器的更新规则为

其中 是在问题 (2) 中获得的 的表达式中通过代入 得到的值, 表示当 服从 固定时的期望。

(i) 使用 表示

(ii) 使用 表示