CBMS-2020-11

题目来源:[[做题/文字版题库/CBMS/2020#Question 10|2020#Question 11]] 日期:2024-07-23 题目主题:Math-概率论-马尔可夫模型

Solution

Problem 1: Show the four probabilities, , , ,

For Machine A:

For sequence actgt:

For sequence actac:

For Machine B:

For sequence actgt:

For sequence actac:

Problem 2: Probability that actgt was produced by Machine A

Given three sequences from Machine A and two sequences from Machine B, the total number of sequences is five. Let the probability that a sequence is selected randomly from Machine A be denoted by , and the probability that a sequence is selected from Machine B be .

Using Bayes’ Theorem, the probability that the sequence was produced by Machine A given that the sequence was actgt is:

Where:

So:

Problem 3: Probability that the output sequence of Machine C is actgt when the input of Machine C was an output of Machine A

Let’s correctly calculate the probability by considering both scenarios where the input sequence is actgt or actac.

Case 1: Input Sequence is actgt

If the input sequence is actgt, there is no ac to convert. Hence, the probability that the sequence remains unchanged (no conversion needed) is .

Case 2: Input Sequence is actac

If the input sequence is actac, Machine C converts each occurrence of ac independently to gt with probability .

  • Probability of converting actac to actgt:

Now, we need to calculate the total probability that the output sequence of Machine C is actgt when the input is an output of Machine A. This involves considering both cases and their probabilities:

  1. Probability of Machine A producing actgt:

Then the first ac should not be converted.

  1. Probability of Machine A producing actac:

Then the first ac should not be converted, but the next ac should be converted.

The total probability is given by:

Therefore, the probability that the output sequence of Machine C is actgt when the input of Machine C was an output of Machine A is:

Problem 4: Probability that two sequences were produced by the same machine

Given:

  • Five sequences were produced: three from Machine A and two from Machine B.
  • Two sequences were randomly selected from these five.
  • One selected sequence was input to Machine C, resulting in actgt.
  • The other selected sequence is also actgt.

We need to find the probability that both sequences were originally produced by the same machine (either both from A or both from B).

Let’s approach this step-by-step:

  1. First, let’s define our events:
    • Let be the event that both sequences are actgt after one goes through Machine C.
    • Let be the event that both sequences were from the same machine.
  2. We need to calculate using Bayes’ theorem:
  1. Let’s calculate each component: a) : The probability that both selected sequences are from the same machine.
    • Probability of both from A:
    • Probability of both from B:
    • b) : The probability of getting two actgt sequences given they’re from the same machine. If both from A:

If both from B:

So,

c) : The overall probability of the observed event.

where is the probability of getting two actgt sequences given they’re from different machines:

And,

  1. Putting it all together:

Where:

This completes the solution for each problem.

知识点

概率论马尔可夫链 贝叶斯定理转移概率

  • 概率论 (Probability Theory): 研究随机现象规律的数学学科。
  • 马尔可夫模型 (Markov Model): 具有马尔可夫性质(即未来状态仅与当前状态有关,与过去状态无关)的随机过程模型。
  • 贝叶斯定理 (Bayes’ Theorem): 用于计算后验概率的公式,通过已知的先验概率和似然函数来更新概率。
  • 转移概率 (Transition Probability): 从一个状态转移到另一个状态的概率。

难点思路

在解答这类题目时,主要难点在于理解和应用马尔可夫模型的概率计算,以及如何运用贝叶斯定理进行概率的更新。对于每一个具体问题,我们可以按照以下步骤进行解答:

  1. 计算基础概率:根据给定模型参数计算序列在不同机器下的生成概率。
  2. 应用贝叶斯定理:通过已知信息更新概率,计算后验概率。
  3. 考虑转换:对于涉及序列转换的情况,计算转换前后的概率,并结合条件概率公式进行综合计算。

解题技巧和信息

  • 贝叶斯定理应用技巧:

    1. 条件概率的分解:将复杂的概率计算分解为条件概率的乘积。
    2. 归一化:确保计算的概率值归一化,使得总和为 1。
  • 马尔可夫模型技巧:

    1. 初始概率与转移概率:明确初始状态的概率分布和状态之间的转移概率矩阵。
    2. 序列概率计算:利用转移概率逐步计算整个序列的生成概率。
  • 综合概率计算:

    1. 将多种概率结果进行加权平均,以得到综合概率。

重点词汇

  • Probability Theory 概率论
  • Markov Model 马尔可夫模型
  • Bayes’ Theorem 贝叶斯定理
  • Transition Probability 转移概率
  • Posterior Probability 后验概率

参考资料

  1. Probability and Statistics for Engineering and the Sciences by Jay L. Devore, Chap. 5 (Discrete Random Variables and Probability Distributions)
  2. Introduction to Probability Models by Sheldon M. Ross, Chap. 4 (Markov Chains)
  3. Pattern Recognition and Machine Learning by Christopher M. Bishop, Chap. 13 (Hidden Markov Models)