2024S
Problem 1
Given an integer , we define a language over an alphabet by:
Here, is the set of integers and . That is, is the language that consists of words whose -th symbol from the last is .
Answer the following questions.
-
Give a non-deterministic finite automaton that accepts .
-
Describe using a regular expression. You may write the -time concatenation of a regular expression as .
-
Is a regular language? If so, give a finite automaton that accepts . If not, prove that is not regular. You may use the pumping lemma for regular languages.
-
Prove that any deterministic finite automaton that accepts has at least states.
给定一个整数 ,我们通过以下方式定义了一个字母表 上的语言 :
这里, 是整数集,并且 。也就是说, 是由那些倒数第 个符号是 的单词组成的语言。
回答以下问题。
-
给出一个接受 的非确定性有限自动机。
-
使用正则表达式描述 。你可以将正则表达式 的 次连接写为 。
-
是正则语言吗?如果是,给出一个接受 的有限自动机。如果不是,证明 不是正则的。你可以使用正则语言的抽水引理。
-
证明任何接受 的确定性有限自动机至少有 个状态。
Problem 2
Consider a processor with a direct-mapped data cache that stores 256 bytes of data in total. The cache line size (block size) of the data cache is 16 bytes. Through the data cache, the processor reads data from the memory by the load-word instruction lw
and writes data to the memory by the store-word instruction sw
. The address and data bit-widths of the load-word/store-word instructions are 32. When the bit representation of a memory address is , the index and the offset of the data cache are and , respectively. The processor has 32 integer registers from x0
to x31
, and x0
is the zero register that always keeps the value 0.
The following program applies the average filter with size 3 to the one-dimensional array with head address 0x1000
and size 402 and stores the result on the one-dimensional array with head address 0x2000
and size 400. Each element of the arrays and is a 32-bit signed integer. The behavior of each instruction is described as a comment (the description after #) in the program, where memory[addr]
represents a memory access to the address addr
. The initial values of the registers x5
, x6
, and x7
are 0x1640
, 0x1000
, and 0x2000
, respectively.
Instruction 0) addi x2, x0, 3 # x2 <- x0 + 3
Instruction 1) Loop: lw x3, 0(x6) # x3 <- memory[x6 + 0]
Instruction 2) lw x9, 4(x6) # x9 <- memory[x6 + 4]
Instruction 3) add x8, x8, x9 # x8 <- x8 + x9
Instruction 4) lw x9, 8(x6) # x9 <- memory[x6 + 8]
Instruction 5) add x8, x8, x9 # x8 <- x8 + x9
Instruction 6) div x8, x8, x2 # x8 <- x8 / x2
Instruction 7) sw x8, 0(x7) # memory[x7 + 0] <- x8
Instruction 8) addi x6, x6, 4 # x6 <- x6 + 4
Instruction 9) addi x7, x7, 4 # x7 <- x7 + 4
Instruction 10) blt x6, x5, Loop # if x6 < x5, goto Loop
Answer the following questions:
-
Calculate the cache hit rate up to three places of decimals for the execution of the program on the processor . Suppose that every cache line of the data cache is invalid when the execution of the program starts, and there is no prefetcher.
-
Calculate the IPC (instructions per cycle) up to three places of decimals for the execution of the program on the processor . Suppose that the processor has an instruction memory with no access delay, and there is no delay on the instruction fetch, instruction decode, data forwarding, and write-back to the register file. The processor starts at most one instruction execution for every clock cycle. Until the instruction completes, the processor does not start the subsequent instructions. The processor executes each of
add
,addi
, andblt
instructions in one clock cycle, and executesdiv
instruction in four clock cycles. The processor executes each oflw
andsw
instructions in one clock cycle in case of cache hits and four clock cycles in case of cache misses. -
Consider a modification to the data cache to improve the cache hit rate for the execution of the program . Explain the modification with the reason why the cache hit rate is improved. Note that the data cache capacity should not be modified.
-
Explain a software-level optimization of the program , using a concrete example, for improving the cache hit rate without any modification to the processor or the data cache.
考虑一个处理器 ,它有一个直接映射的数据缓存,总共存储 256 字节的数据。数据缓存的缓存行大小(块大小)为 16 字节。通过数据缓存,处理器 通过加载字指令 lw
从内存读取数据,并通过存储字指令 sw
将数据写入内存。加载字/存储字指令的地址和数据位宽为 32。当一个内存地址的位表示为 时,数据缓存的索引和偏移量分别为 和 。处理器 有 32 个整数寄存器,从 到 ,其中 是始终保持值 0 的零寄存器。
下面的程序 将大小为 3 的均值滤波应用于一维数组 (头地址为 0x1000,大小为 402),并将结果存储在一维数组 (头地址为 0x2000,大小为 400)上。数组 和 的每个元素都是 32 位有符号整数。每条指令的行为在程序中的注释(# 之后的描述)中描述,其中 表示对地址 的内存访问。寄存器 、 和 的初始值分别为 0x1640、0x1000 和 0x2000。
指令 0) addi x2, x0, 3 # x2 <- x0 + 3
指令 1) Loop: lw x3, 0(x6) # x3 <- memory[x6 + 0]
指令 2) lw x9, 4(x6) # x9 <- memory[x6 + 4]
指令 3) add x8, x8, x9 # x8 <- x8 + x9
指令 4) lw x9, 8(x6) # x9 <- memory[x6 + 8]
指令 5) add x8, x8, x9 # x8 <- x8 + x9
指令 6) div x8, x8, x2 # x8 <- x8 / x2
指令 7) sw x8, 0(x7) # memory[x7 + 0] <- x8
指令 8) addi x6, x6, 4 # x6 <- x6 + 4
指令 9) addi x7, x7, 4 # x7 <- x7 + 4
指令 10) blt x6, x5, Loop # if x6 < x5, goto Loop
回答以下问题:
-
计算程序 在处理器 上执行的缓存命中率,保留到小数点后三位。假设当程序执行开始时,数据缓存的每一行都是无效的,并且没有预取器。
-
计算程序 在处理器 上执行的 IPC(每周期指令数),保留到小数点后三位。假设处理器 具有无访问延迟的指令存储器,并且指令提取、指令解码、数据转发和寄存器文件的写回没有延迟。处理器最多在每个时钟周期启动一条指令的执行。在指令完成之前,处理器不会启动后续指令。处理器在一个时钟周期内执行每条
add
、addi
和blt
指令,并在四个时钟周期内执行div
指令。处理器在缓存命中情况下在一个时钟周期内执行每条lw
和sw
指令,而在缓存未命中情况下在四个时钟周期内执行。 -
考虑对数据缓存进行修改以提高程序 执行的缓存命中率。解释修改的内容以及缓存命中率提高的原因。注意数据缓存容量不应修改。
-
解释程序 的一种软件级优化,使用一个具体的例子,在不对处理器 或数据缓存进行任何修改的情况下提高缓存命中率。
Problem 3
Let be an undirected graph with no self-loops (edges joining the same vertex) nor multi-edges (two or more edges joining the same two vertices), with , . If there is a vertex in a connected graph such that after deleting , the resulting graph is not connected, we call a cut vertex of .
Answer the following questions:
-
Describe an algorithm to check whether or not is connected. Estimate the time complexity of the algorithm.
-
Describe a time algorithm to find a spanning tree , given a connected graph .
-
Let be a spanning tree of a connected graph , and assume that is a non-leaf node of (i.e., the degree of in is at least two) and that is not a cut vertex of . Let be an edge of that is incident with . Prove that one can obtain another spanning tree of from by replacing with another edge of .
-
Describe a time algorithm, given a connected graph , to find all cut vertices of .
设 为一个无向图,该图没有自环(连接到同一顶点的边)也没有重边(连接同一对顶点的两个或更多边),且 ,。如果在一个连通图 中有一个顶点 ,使得删除 后所得的图不再连通,我们称 为 的割点。
回答以下问题:
-
描述一个算法来检查 是否连通。估计该算法的时间复杂度。
-
描述一个 时间的算法,给定一个连通图 ,找到一棵生成树 。
-
设 为连通图 的一棵生成树,并假设 是 的一个非叶节点(即 在 中的度数至少为二),并且 不是 的割点。令 为 中与 相邻的一条边。证明通过用 的另一条边 替换 ,可以得到 的另一棵生成树。
-
描述一个 时间的算法,给定一个连通图 ,找到 的所有割点。
Problem 4
Let us consider the following function described in a C-like programming language with the call-by-value evaluation strategy. We assume that, unlike in the C language, there is no bound on integer data, and no overflow occurs.
For example, is evaluated as follows, where the return value is 1 and the number of calls of the function during the evaluation is 3.
Answer the following questions:
-
Give the number of calls of the function during the evaluation of .
-
Show that the return value of is 1 for every non-negative integer .
-
Let be a non-negative integer. Express the number of calls of the function during the evaluation of in terms of .
-
Let be a non-negative integer. Express the number of calls of the function when is evaluated by using the call-by-name strategy instead of the call-by-value strategy, in terms of .
-
Give an example of a program that does not terminate with the call-by-value strategy but terminates with the call-by-name strategy.
让我们考虑以下在类似 C 的编程语言中描述的函数 ,该语言采用值传递(call-by-value)求值策略。我们假设,与 C 语言不同,整数数据没有限制,也不会发生溢出。
例如, 的计算过程如下,返回值为 1,并且在计算过程中函数 被调用了 3 次。
回答以下问题:
-
计算 在求值过程中函数 被调用的次数。
-
证明对于每一个非负整数 , 的返回值都是 1。
-
设 是一个非负整数。表达 在求值过程中函数 被调用次数的公式,使用 表示。
-
设 是一个非负整数。表达 在使用按名调用策略(call-by-name)而非按值调用策略(call-by-value)时,函数 被调用次数的公式,使用 表示。
-
给出一个程序的例子,该程序在按值调用策略下不终止,但在按名调用策略下终止。