2023S

Problem 4

Answer the following questions on computer architecture.

When the following program in C language was compiled and executed on a processor A, the output (a) was obtained. When the same program was compiled and executed on a processor B, the output (b) was obtained. Explain the reason why the difference occurred from the viewpoint of processor architecture.
```
#include <stdio.h>
 
union my_uni {
    int v;
    char arr[4];
};
 
int main(){
    union my_uni val = {0x12345678};
    int i;
    for(i=0; i<4; i++){
        printf("0x%x\n", val.arr[i]);
    }
    return 0;
}
```
Output (a):
```
0x12
0x34
0x56
0x78
```
Output (b):
```
0x78
0x56
0x34
0x12
```
Explain data-hazard and control-hazard on a pipeline processor with no forwarding mechanism, using a concrete example.
Consider a 4-way set-associative cache memory that stores totally 32 kibibytes ( $32 \times 2^{10}$ bytes) of data. The address width of the cache memory is 32 bits, and the cache line size (block size) is 64 bytes. Calculate the bit width of the cache index and that of a tag of the cache memory, respectively. Calculate also the total RAM capacity (the number of bits) for storing the tags of the cache memory.
Consider a processor with an instruction cache and a data cache. Suppose that the CPI (cycles per instruction) of the processor is $C$ when there is no cache miss on both the instruction and data caches. When there is a cache miss on any of the caches, a cache miss penalty of $P$ clock cycles is additionally imposed. Suppose that when a program was executed on the processor, the ratio of the number of load/store instructions to the total number of executed instructions was $R_{l s}$ . Suppose also that, for that program execution, the cache miss rate of the instruction cache was $R_{i}$ , the cache miss rate of the data cache was $R_{d}$ , and the IPC (instructions per cycle) of the processor was $I$ . Express I in terms of $C$ , $R_{i}$ , $R_{d}$ , $R_{l s}$ , and $P$ .

回答以下有关计算机体系结构的问题。

当在处理器 A 上编译并执行以下 C 语言程序时，输出 (a) 被获得。当在处理器 B 上编译并执行相同的程序时，输出 (b) 被获得。请从处理器架构的角度解释为什么会出现这种差异。

#include <stdio.h>
 
union my_uni {
    int v;
    char arr[4];
};
 
int main(){
    union my_uni val = {0x12345678};
    int i;
    for(i=0; i<4; i++){
        printf("0x%x\n", val.arr[i]);
    }
    return 0;
}

输出 (a):

0x12
0x34
0x56
0x78

输出 (b):

0x78
0x56
0x34
0x12

使用具体的例子解释没有转发机制的流水线处理器上的数据冒险和控制冒险。
考虑一个 4 路组相连缓存，其总共存储了 32 kibibytes（ $32 \times 2^{10}$ 字节）的数据。该缓存的地址宽度为 32 位，缓存行大小（块大小）为 64 字节。分别计算缓存索引的位宽和缓存的标签位宽。还要计算存储缓存标签的总 RAM 容量（位数）。
考虑一个具有指令缓存和数据缓存的处理器。假设处理器的 CPI（每指令周期数）在指令缓存和数据缓存都没有缓存未命中的情况下为 $C$ 。当任一缓存出现未命中时，会另外增加 $P$ 个时钟周期的缓存未命中惩罚。假设当程序在处理器上执行时，加载/存储指令的数量与执行的总指令数量的比率为 $R_{l s}$ 。还假设，在该程序执行中，指令缓存的缓存未命中率为 $R_{i}$ ，数据缓存的缓存未命中率为 $R_{d}$ ，处理器的 IPC（每周期指令数）为 $I$ 。用 $C$ ， $R_{i}$ ， $R_{d}$ ， $R_{l s}$ 和 $P$ 来表达 $I$ 。

Zephyr's Notes on ISCS & CBMS, UTokyo

Explorer

Explorer

2023S

2023S

Problem 4

Graph View

Table of Contents

Backlinks