CBMS-2018-12

题目来源:[[做题/文字版题库/CBMS/2018#Question 12|2018#Question 12]] 日期:2024-07-27 题目主题: CS-算法-最长公共子序列

解题思路

我们首先定义 ,并描述它们的递归关系。然后我们计算给定字符串的矩阵 。接着,我们编写伪代码以从矩阵 中获取一个最长的公共子序列。最后,我们编写伪代码以在给定矩阵 的情况下计算包含 的第 个位置的公共子序列的最大长度。

Solution

Part 1: Recurrence Relations

Recurrence for

The value of can be computed as follows:

  • If , then .
  • If , then .

This can be summarized as:

Recurrence for

The value of can be computed as follows:

  • If , then .
  • If , then .

This can be summarized as:

Part 2: Compute Matrices and for and

Matrix

Let’s fill the matrix for and :

x\yACACG
A11111
C12222
T12222
G12223
G12223

Matrix

Let’s fill the matrix for and :

x\yGCACA
G11111
G11111
T11111
C12222
A12333

Part 3: Pseudocode for Obtaining One of the Longest Common Subsequences

To extract one of the longest common subsequences from the matrix , we use the following algorithm. This algorithm traces back from the bottom-right corner of the matrix to the top-left corner, reconstructing the longest common subsequence by following the path of optimal choices recorded in .

Explanation

  • Start from the bottom-right corner of the matrix , i.e., .
  • Compare characters of and :
    • If , include in the result and move diagonally to .
    • If , move in the direction that gives the larger value (either up or left).
  • Continue this process until reaching the top-left corner of the matrix.
  • The result will be one of the longest common subsequences.

Pseudocode

function getLCS(x, y, alpha)
    i = length(x)
    j = length(y)
    lcs = ""
    while i > 0 and j > 0
        if x[i] == y[j]
            lcs = x[i] + lcs
            i = i - 1
            j = j - 1
        else if alpha[i-1, j] >= alpha[i, j-1]
            i = i - 1
        else
            j = j - 1
    return lcs

Part 4: Pseudocode for Computing Maximal Length of Common Subsequences Containing the -th Position of

Given matrices and , we can compute the maximal length of common subsequences that include the -th position of . This is done by evaluating the length of subsequences that start from the beginning and end at the -th position, combined with subsequences that start at the -th position and extend to the end.

Explanation

  • For each position in :
    • Combine the length of the prefix up to () with the length of the suffix from onwards () as the length of the common subsequence.
    • If , since is included in both the prefix and suffix, subtract 1 from the total length.
  • The maximum value obtained through this process gives the desired length.

Pseudocode

function maxLengthWithPosition(x, y, alpha, beta, i)
    maxLength = 0
    for j = 1 to length(y)
        currentLength = alpha[i, j] + beta[n-i+1, m-j+1]
        if x[i] == y[j]
            currentLength = currentLength - 1
        maxLength = max(maxLength, currentLength)
    return maxLength

知识点

动态规划最长公共子序列递归

难点思路

对于递归关系的理解和矩阵填充的具体实现可能会比较复杂,需要仔细考虑每一步的递推关系。

解题技巧和信息

  • 动态规划表格填充方法:先初始化,然后按照递推关系逐步填充。
  • 递归关系需要对字符串字符的匹配情况进行详细考虑,以确保递推关系的正确性。

重点词汇

  • common subsequence 公共子序列
  • recurrence relation 递推关系
  • prefix 前缀
  • suffix 后缀
  • dynamic programming 动态规划

参考资料

  1. Introduction to Algorithms, 3rd Edition, Cormen et al., Chapter 15.