2022S

Problem 1

Answer the following questions on operating systems.

For the scheduling of five processes $P_{0}$ , $P_{1}$ , $P_{2}$ , $P_{3}$ , and $P_{4}$ , the arrival time (ms) and the computation time (ms) of each process $P_{i}$ are denoted by $A_{i}$ and $C_{i}$ , respectively. Also, assume that only one process is allowed to execute at any instant, and the overhead of context switches can be ignored. Obtain the average turnaround time and the average response time when the five processes are scheduled by the Preemptive Shortest Job First algorithm, where $A_{0} = 35$ , $A_{1} = A_{2} = A_{3} = 25$ , $A_{4} = 0$ , $C_{0} = 10$ , $C_{1} = 15$ , $C_{2} = 20$ , $C_{3} = 30$ , and $C_{4} = 50$ . Here, the turnaround time refers to the time interval from the arrival of the process to the completion of its execution, and the response time refers to the time interval from the arrival of the process to the beginning of its execution.
Obtain the average turnaround time and the average response time when the five processes with the same arrival and computation times as those given in question (1) are scheduled by the Non-Preemptive Shortest Job First algorithm.
Obtain the average turnaround time and the average response time when the five processes with the same arrival and computation times as those given in question (1) are scheduled by the Round Robin algorithm with the time slice 10 ms. The next time slice starts immediately when the current process does not exhaust its time slice. Also, a new process is added to the end of the Round Robin queue upon its arrival, and ties are broken in favor of the processes with shorter remaining computation times if multiple processes arrive at the end of the queue simultaneously.
In real-world operating systems, the overhead of context switches cannot be ignored. Explain the pros and cons of the Round Robin algorithm from the viewpoint of CPU scheduling and memory management, when this overhead is considered.
The Aging scheme is often used to determine process priorities in real-world operating systems. Explain the basic concept of the Aging scheme and its advantage over the classical static-priority scheme.

回答以下关于操作系统的问题。

对于五个进程 $P_{0}$ 、 $P_{1}$ 、 $P_{2}$ 、 $P_{3}$ 和 $P_{4}$ 的调度，每个进程 $P_{i}$ 的到达时间（毫秒）和计算时间（毫秒）分别用 $A_{i}$ 和 $C_{i}$ 表示。同样，假设任何时刻只允许一个进程执行，并且可以忽略上下文切换的开销。当五个进程由抢占式最短作业优先算法调度时，求平均周转时间和平均响应时间，其中 $A_{0} = 35$ ， $A_{1} = A_{2} = A_{3} = 25$ ， $A_{4} = 0$ ， $C_{0} = 10$ ， $C_{1} = 15$ ， $C_{2} = 20$ ， $C_{3} = 30$ ， $C_{4} = 50$ 。这里，周转时间指的是从进程到达到执行完成的时间间隔，响应时间指的是从进程到达到开始执行的时间间隔。
当五个进程具有与问题（1）中给定的相同的到达时间和计算时间时，求它们在非抢占式最短作业优先算法调度下的平均周转时间和平均响应时间。
当五个进程具有与问题（1）中给定的相同的到达时间和计算时间时，求它们在时间片为 10 毫秒的轮转法调度下的平均周转时间和平均响应时间。下一个时间片在当前进程未用尽其时间片时立即开始。同样，新进程在到达时添加到轮转队列的末尾，并且如果多个进程同时到达队列末尾，则优先处理剩余计算时间较短的进程。
在实际操作系统中，无法忽略上下文切换的开销。解释在考虑这种开销时，轮转算法在 CPU 调度和内存管理方面的优缺点。
老化方案经常用于确定实际操作系统中的进程优先级。解释老化方案的基本概念及其相对于传统静态优先级方案的优势。

Problem 2

The following program written in C language defines a function mysort(a, i, j), which sorts an integer array a from a[i] to a[j-1] in ascending order (where $i < j$ ). The function multfrac(k, l, m) used in the program returns the least integer that is greater than or equal to $k \times \frac{l}{m}$ , when $k$ , $l$ , and $m$ are positive integers. Assume that $w$ , $x$ , $y$ , and $z$ are positive integer constants. Assume also that integer operations never overflow.

int multfrac(int k, int l, int m) {
    return (k * l + (m-1)) / m;
}
 
void compare_swap(int *p, int *q) {
    if (*p > *q) {
        int tmp = *p;
        *p = *q;
        *q = tmp;
    }
}
 
void mysort(int a[], int i, int j) {
    int k = j - i;
    if (k < 4) {
        // X
    }
    else {
        mysort(a, i, i + multfrac(k, x, w));
        mysort(a, j - multfrac(k, y, w), j);
        mysort(a, i, i + multfrac(k, z, w));
    }
}

Answer the following questions.

Answer appropriate code to fill the blank X when $(w, x, y, z)$ is $(4, 3, 3, 3)$ . You are not allowed to use a function call except for compare_swap. The code may consist of multiple lines.
Let $T (n)$ denote the number of times that the code fragment X is executed while mysort(a, 0, n) is called. Give the order of $T (n)$ on $n$ when $(w, x, y, z)$ is $(4, 3, 3, 3)$ .
Answer whether or not mysort always works correctly for each case where $(w, x, y, z)$ is $(4, 2, 3, 3)$ , $(4, 3, 2, 3)$ , $(4, 3, 3, 2)$ , and $(4, 2, 3, 2)$ .
Give a necessary and sufficient condition on $w$ , $x$ , $y$ , and $z$ that guarantees mysort to always work correctly.

以下用 C 语言编写的程序定义了一个函数 mysort(a, i, j)，该函数按升序对整数数组 a 从 a[i] 到 a[j-1] 进行排序（其中 $i < j$ ）。程序中使用的函数 multfrac(k, l, m) 返回大于或等于 $k \times \frac{l}{m}$ 的最小整数，其中 $k$ 、 $l$ 和 $m$ 是正整数。假设 $w$ 、 $x$ 、 $y$ 和 $z$ 是正整数常量。还假设整数操作永不溢出。

int multfrac(int k, int l, int m) {
    return (k * l + (m-1)) / m;
}
 
void compare_swap(int *p, int *q) {
    if (*p > *q) {
        int tmp = *p;
        *p = *q;
        *q = tmp;
    }
}
 
void mysort(int a[], int i, int j) {
    int k = j - i;
    if (k < 4) {
        // X
    }
    else {
        mysort(a, i, i + multfrac(k, x, w));
        mysort(a, j - multfrac(k, y, w), j);
        mysort(a, i, i + multfrac(k, z, w));
    }
}

回答以下问题。

当 $(w, x, y, z)$ 为 $(4, 3, 3, 3)$ 时，回答填充空白 X 的适当代码。除了 compare_swap 之外，不允许使用函数调用。代码可以包含多行。
设 $T (n)$ 表示在调用 mysort(a, 0, n) 时代码片段 X 执行的次数。当 $(w, x, y, z)$ 为 $(4, 3, 3, 3)$ 时，给出 $T (n)$ 关于 $n$ 的阶数。
回答对于 $(w, x, y, z)$ 为 $(4, 2, 3, 3)$ 、 $(4, 3, 2, 3)$ 、 $(4, 3, 3, 2)$ 和 $(4, 2, 3, 2)$ 的每种情况，mysort 是否总是正确工作。
给出 $w$ 、 $x$ 、 $y$ 和 $z$ 的必要且充分的条件，以保证 mysort 始终能正确工作。

Problem 3

Let $Σ_{1} = {a, b}$ and $Σ_{2} = {t, f}$ . For a word $w \in Σ_{1}^{*}$ , we write $∣ w ∣$ for the length of $w$ . We also write $ϵ$ for the empty word (i.e., the word of length 0). For a word $w \in Σ_{1}^{*}$ , we define the function $f_{w} \in Σ_{1}^{*} \to Σ_{2}^{∣ w ∣}$ by:

f_{w} (w^{'}) = {x_{1} \dots x_{∣ w^{'} ∣} \in Σ_{2}^{∣ w^{'} ∣} ∣ x_{i} = {t f if w^{'} = u w v for some u, v \in Σ_{1}^{*} such that ∣ u ∣ = i - 1 otherwise for each i \in {1, \dots, ∣ w^{'} ∣}} .

In other words, $f_{w} (w^{'})$ is the word obtained from $w^{'}$ by replacing the start position of each subword that matches $w$ with $t$ and any other position with $f$ . For example, $f_{aa} (baaab) = f tt ff$ and $f_{ab} (abbab) = ttttt$ . Furthermore, we extend the function $f_{w}$ to the function $f_{w}^{*}$ that maps a language over $Σ_{1}$ to a language over $Σ_{2}$ by the following definition:

f_{w}^{*} (L) = {f_{w} (w^{'}) ∣ w^{'} \in L} .

For example, $f_{ab}^{*} ({(abb)^{n} ∣ n \geq 0}) = {(t fff)^{n} ∣ n \geq 0}$ .

Answer the following questions.

Compute $f_{aba} (babababa)$ .
Express $f_{aba} (Σ_{1}^{*})$ by using a regular expression.
Suppose that a word $w \in Σ_{1}^{*}$ (where $∣ w ∣ > 0$ ) and a deterministic finite automaton $A = (Q, Σ_{1}, δ, q_{0}, F)$ are given, and that $L$ is the language accepted by $A$ . Here, $Q, Σ_{1}, δ, q_{0}, F$ are respectively the set of states, the transition function, the initial state, and the set of accepting states of $A$ . Assume that the transition function $δ \in Q \times Σ_{1} \to Q$ is total. Give a non-deterministic finite automaton that accepts $f_{w}^{*} (L)$ , with a brief explanation. You may use $ϵ$ -transitions.
If the following proposition is true, then give a proof sketch (it suffices to show a pushdown automaton that accepts $f_{w}^{*} (L)$ or a context-free grammar that generates $f_{w}^{*} (L)$ , with a brief explanation). Otherwise, give a counterexample.

Proposition: “For every word $w \in Σ_{1}^{*}$ , if $L \subseteq Σ_{1}^{*}$ is a context-free language, then $f_{w}^{*} (L)$ is also a context-free language.”

设 $Σ_{1} = {a, b}$ 和 $Σ_{2} = {t, f}$ 。对于一个单词 $w \in Σ_{1}^{*}$ ，我们用 $∣ w ∣$ 表示 $w$ 的长度。我们也用 $ϵ$ 表示空字（即长度为 0 的字）。对于一个单词 $w \in Σ_{1}^{*}$ ，我们定义函数 $f_{w} \in Σ_{1}^{*} \to Σ_{2}^{∣ w ∣}$ 如下：

f_{w} (w^{'}) = {x_{1} \dots x_{∣ w^{'} ∣} \in Σ_{2}^{∣ w^{'} ∣} ∣ x_{i} = {t f if w^{'} = u w v for some u, v \in Σ_{1}^{*} such that ∣ u ∣ = i - 1 otherwise for each i \in {1, \dots, ∣ w^{'} ∣}} .

换句话说， $f_{w} (w^{'})$ 是从 $w^{'}$ 获得的单词，通过用 $t$ 替换每个匹配 $w$ 的子单词的起始位置，并用 $f$ 替换其他任何位置。例如， $f_{aa} (baaab) = f tt ff$ 和 $f_{ab} (abbab) = ttttt$ 。此外，我们将函数 $f_{w}$ 扩展为函数 $f_{w}^{*}$ ，该函数将 $Σ_{1}$ 上的语言映射到 $Σ_{2}$ 上的语言，定义如下：

f_{w}^{*} (L) = {f_{w} (w^{'}) ∣ w^{'} \in L} .

例如， $f_{ab}^{*} ({(abb)^{n} ∣ n \geq 0}) = {(t fff)^{n} ∣ n \geq 0}$ 。

回答以下问题。

计算 $f_{aba} (babababa)$ 。
使用正则表达式表示 $f_{aba} (Σ_{1}^{*})$ 。
假设一个单词 $w \in Σ_{1}^{*}$ （其中 $∣ w ∣ > 0$ ）和一个确定性有限自动机 $A = (Q, Σ_{1}, δ, q_{0}, F)$ 已给出，且 $L$ 是 $A$ 接受的语言。这里， $Q, Σ_{1}, δ, q_{0}, F$ 分别是状态集、转移函数、初始状态和接受状态集。假设转移函数 $δ \in Q \times Σ_{1} \to Q$ 是完全的。给出一个接受 $f_{w}^{*} (L)$ 的非确定性有限自动机，并简要解释。您可以使用 $ϵ$ -转换。
如果以下命题为真，则给出一个证明草图（证明接受 $f_{w}^{*} (L)$ 的下推自动机或生成 $f_{w}^{*} (L)$ 的上下文无关文法即可，简要说明）。否则，给出一个反例。

命题：“对于每个单词 $w \in Σ_{1}^{*}$ ，如果 $L \subseteq Σ_{1}^{*}$ 是一个上下文无关语言，则 $f_{w}^{*} (L)$ 也是一个上下文无关语言。”

Problem 4

Answer the following questions on computer architecture.

When the instruction $j$ may use the result generated by the instruction $i$ , we say there is a data dependency from the instruction $j$ to the instruction $i$ , and write $j \to i$ . Give all data dependencies in the program below. The behavior of each instruction is described as a comment (the description after #) in the program. The processor which executes the program has 32 registers from $x 0$ to $x 31$ , and $x 0$ is the zero register that always keeps the value 0. In the comment, we represent a memory access to the address addr on the memory as memory[addr].

instruction 0) addi x3, x0, 64  # x3 <- x0 + 64
instruction 1) addi x4, x0, 0   # x4 <- x0 + 0
instruction 2) addi x5, x0, 0   # x5 <- x0 + 0
instruction 3) Loop: lw x6, 0(x4)  # x6 <- memory[x4 + 0]
instruction 4) add x5, x5, x6   # x5 <- x5 + x6
instruction 5) addi x4, x4, 4   # x4 <- x4 + 4
instruction 6) blt x4, x3, Loop  # if x4 < x3, goto Loop
instruction 7) sw x5, 4096(x0)  # memory[x0 + 4096] <- x5

Consider a 5-stage pipeline processor that issues up to one instruction per clock cycle. The processor consists of 5 stages: instruction fetch (IF) stage, instruction decode and register fetch (ID) stage, execution (EX) stage, memory access (MA) stage, and register write back (WB) stage. The bit width of each register is 32. The processor has the instruction and data memories that can be accessed in one clock cycle, and load-word lw and store-word sw instructions do not stall on the MA stage. If there is a load-use data hazard for lw instruction, the IF, ID, and EX stages are stalled for one clock cycle. The branch instruction blt (branch if less than) stalls the IF and ID stages until the branch result is determined in the EX stage. Thus, the processor does not fetch subsequent instructions for two clock cycles after a branch instruction is fetched. Execution results in the EX stage and load results in the MA stage are properly forwarded to the EX stage.

Explain what the load-use data hazard is. Explain also how load-use data hazards occur when the program in question (1) is executed on the processor.
Calculate the number of clock cycles required for the execution of the program in question (1) on the processor in question (2). Calculate also the average IPC (instructions per cycle) up to two places of decimals.
Using the program in question (1) and the processor in question (2) as an example, explain the mechanism and role of dynamic branch prediction.

回答以下关于计算机体系结构的问题。

当指令 $j$ 可能使用由指令 $i$ 生成的结果时，我们说指令 $j$ 对指令 $i$ 存在数据依赖关系，并记作 $j \to i$ 。列出下面程序中的所有数据依赖关系。每条指令的行为在程序中的注释（# 后的描述）中描述。执行该程序的处理器有 32 个寄存器，从 $x 0$ 到 $x 31$ ，其中 $x 0$ 是始终保持值为 0 的零寄存器。在注释中，我们将对内存地址 addr 的访问表示为 memory[addr]。

instruction 0) addi x3, x0, 64  # x3 <- x0 + 64
instruction 1) addi x4, x0, 0   # x4 <- x0 + 0
instruction 2) addi x5, x0, 0   # x5 <- x0 + 0
instruction 3) Loop: lw x6, 0(x4)  # x6 <- memory[x4 + 0]
instruction 4) add x5, x5, x6   # x5 <- x5 + x6
instruction 5) addi x4, x4, 4   # x4 <- x4 + 4
instruction 6) blt x4, x3, Loop  # if x4 < x3, goto Loop
instruction 7) sw x5, 4096(x0)  # memory[x0 + 4096] <- x5

考虑一个 5 级流水线处理器，该处理器每个时钟周期最多发出一条指令。处理器由 5 个阶段组成：指令获取（IF）阶段、指令解码和寄存器获取（ID）阶段、执行（EX）阶段、存储器访问（MA）阶段和寄存器写回（WB）阶段。每个寄存器的位宽为 32 位。处理器具有可以在一个时钟周期内访问的指令和数据存储器，并且加载字（lw）和存储字（sw）指令不会在 MA 阶段停止。如果存在 lw 指令的加载-使用数据冒险，则 IF、ID 和 EX 阶段将停止一个时钟周期。分支指令 blt（如果小于则分支）会使 IF 和 ID 阶段停止，直到在 EX 阶段确定分支结果。因此，处理器在提取分支指令后的两个时钟周期内不会获取后续指令。执行结果在 EX 阶段产生，加载结果在 MA 阶段正确转发到 EX 阶段。

解释什么是加载-使用数据冒险。还解释在处理器上执行问题（1）中的程序时，加载-使用数据冒险是如何发生的。
计算在问题（2）中的处理器上执行问题（1）中的程序所需的时钟周期数。同时计算平均 IPC（每周期指令数），保留到小数点后两位。
以问题（1）中的程序和问题（2）中的处理器为例，解释动态分支预测的机制和作用。

Zephyr's Notes on ISCS & CBMS, UTokyo

Explorer

Explorer

2022S

2022S

Problem 1

Problem 2

Problem 3

Problem 4

Graph View

Table of Contents

Backlinks