CSAPP Lab4 解题分析

Lab4 的Cache Lab 的说明资料太少了,csim.c 里面空空的,惊了。身为弱渣的我只好先拿别人的代码来参考参考。这里选取的代码csim.c

Part A: Writing a Cache Simulator

需要些一个Cache模拟器,将 valgrind 内存访问记录作为参数,模拟是否命中缓存。输出命中,不命中,移除。

lab里提供了一个写好的程序 csim-ref ,使用LRU(least-recently used)替换原则,模拟缓存命中情况。

实验指导上给出了命令行参数

Usage: ./csim-ref [-hv] -s <s> -E <E> -b <b> -t <tracefile>

  • -h: Optional help flag that prints usage info

  • -v: Optional verbose flag that displays trace info

  • -s <s>: Number of set index bits ($ S = 2^{s} $ is the number of sets)

  • -E <E>: Associativity (number of lines per set)

  • -b <b>: Number of block bits ($ B = 2^{b} $ is the block size)

  • -t <tracefile>: Name of the valgrind trace to replay

所以在 csim.c 里也可以写个打印帮助的函数

1
2
3
4
5
6
7
void printUsage(){
printf("Usage: ./csim [-h] [-v] -s <s> -E <E> -b <b> -t <tracefile>\n");
printf("-s: number of set index(2^s sets)\n");
printf("-E: number of lines per set\n");
printf("-b: number of block offset bits\n");
printf("-t: trace file name\n");
}

参考书中对cache的描述,需要些用结构体定义出cache的一些属性

1
2
3
4
5
6
7
8
9
10
11
12
typedef struct{
int s;//2^s sets
int S;//S=2^s
int b;//2^b per block
int B;//B=2^b
int E;//number of lines per set
//number to track the solution
int hitNum;
int missNum;
int evictionNum;
int verbosity;
} cacheProperty;

对应的说明可以看书中 Figure 6.28.

根据Figure 6.27,cache是一个set的数组,每个set包含一个或者多个行,每个行又包括一个有效位,tag标识位, 数据块。所以可以定义出cache的组成

1
2
3
4
5
6
7
8
9
10
11
12
13
14
typedef struct {
int valid;
unsigned long long tag;
char *block;
int usedCounter;
} setLine;

typedef struct {
setLine *lines;
} cacheSet;

typedef struct {
cacheSet *sets;
} cache;

接着就需要考虑cache的初始化。其实就是使用malloc分配内存给定义的cache。一开始所有的有效位和tag标识位都为0。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
cache initiate(long long S, int E) 
{
cache currentCache;
cacheSet set;
setLine line;
currentCache.sets = (cacheSet *) malloc(sizeof(cacheSet) * S);
//loop to construct the set of the cache
int i;
for (i = 0; i < S; i++)
{
set.lines = (setLine *) malloc(sizeof(setLine) * E);
currentCache.sets[i] = set;
//loop to construct the lines
int j;
for (j = 0; j < E; j++)
{
line.valid = 0;
line.tag = 0;
set.lines[j] = line;
line.usedCounter = 0;
}
}
return currentCache;
}

对于判断是否命中,要判断两个部分,有效位是否设置,cache的line中tag标识位是否匹配。

1
2
3
4
5
6
7
8
int checkHit(setLine line, unsigned long long tag){
if(line.valid){
if(line.tag == tag){
return 1;
}
}
return 0;
}

判断行里是否已经满了。

1
2
3
4
5
6
7
8
9
int checkFull(cacheSet set, cacheProperty property){
int i = 0;
for(i = 0; i<property.E; i++){
if(set.lines[i].valid == 0){
return 1;
}
}
return 0;
}

当需要缓存时,需要找到一个有效的缓存空间。

1
2
3
4
5
6
7
8
9
10
int findIndex(cacheSet set, cacheProperty property){
int i = 0;
for(i = 0; i<property.E; i++){
if(set.lines[i].valid == 0){
return i;
}
}
//shouldn't get a -1 anyway, method only called when there is granted space
return -1;
}

当缓存不用了,需要移出这块缓存里的内容。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
int findEvict(cacheSet set, cacheProperty property){
int min = set.lines[0].usedCounter;
int i = 0;
int index = 0;
for(i = 0; i < property.E ; i++){
if(min>set.lines[i].usedCounter){
index = i;
min = set.lines[i].usedCounter;
}
}
return index;
}

int findMax(cacheSet set, cacheProperty property){
int max = set.lines[0].usedCounter;
int i = 0;
int index = 0;
for(i = 0; i < property.E ; i++){
if(set.lines[i].usedCounter>max){
index = i;
max = set.lines[i].usedCounter;
}
}
return index;
}

接下来最关键的是写一个模拟器,模拟cache操作。首先,按照书中的介绍,tag的数量在64位系统下等于64减去块偏移位数量和组索引位数量。根据tag数量,可以求出组索引。接着将这块cache记录到Cache区中。通过循环遍历一块cache块里的行,查看命中情况。

如果命中,更新cache的属性。如果不命中,并且缓存并没满,那么需要添加新的缓存内容。如果因为缓存满了,那么需要将缓存中最先使用的缓存块移除缓存。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
cacheProperty simulate(cache currentCache,cacheProperty property, unsigned long long address){
//compute the size of the tag, 64 bit system
int tagSize = 64-(property.b + property.s);
unsigned long long tag = address >> (property.s + property.b);
//use the tagSize to compute for the set index
unsigned long long temp = address << (tagSize);
unsigned long long setIndex = temp >> (tagSize + property.b);
cacheSet set = currentCache.sets[setIndex];
//loop through lines in the set
int i = 0;
int hit = 0;
for (i = 0; i<property.E; i++){
setLine currentLine = set.lines[i];
//check if there is a hit
if(checkHit(currentLine, tag) == 1){
//if hit, update the staffs in the property
property.hitNum+=1;
int max = 0;
hit = 1;
max = findMax(set, property);
currentCache.sets[setIndex].lines[i].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
}
}
if(hit == 0 && checkFull(set, property) == 1){
//if not full then it is a miss, update the staffs in the property&set
property.missNum+=1;
int index = 0;
index = findIndex(set, property);
set.lines[index].tag = tag;
set.lines[index].valid = 1;
int max = 0;
max = findMax(set, property);
currentCache.sets[setIndex].lines[index].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
}else if(hit == 0){
//evict, update the staffs in the property&set
property.missNum+=1;
property.evictionNum+=1;
int evictIndex = 0;
//find the evict line
evictIndex = findEvict(set, property);
set.lines[evictIndex].tag = tag;
int max = 0;
max = findMax(set, property);
currentCache.sets[setIndex].lines[evictIndex].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
}
return property;
}

当然最后需要些一个主函数来调用测试接口。这里关于trace文件。每一条对内存访问的记录格式是 [空格]操作符 地址,大小,以 I 开头的是载入指令的记录,不算在内存访问中。

  • M 表示数据修改,需要一次载入 + 一次存储,也就是相当于两次访问
  • S 表示数据存储
  • L 表示数据载入
  • 地址指的是一个 64 位的 16 进制内存地址
  • 大小表示该操作内存访问的字节数
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
int main(int argc, char** argv)
{
cache currentCache;
cacheProperty property;
//initiate the char to store the trace
char* trace;
char input;
property.verbosity = 0;
//parse input
while( (input=getopt(argc,argv,"s:E:b:t:vh")) != -1)
{
switch(input){
case 's':
property.s = atoi(optarg);
break;
case 'E':
property.E = atoi(optarg);
break;
case 'b':
property.b = atoi(optarg);
break;
case 't':
trace = optarg;
break;
case 'v':
property.verbosity = 1;
break;
//help
case 'h':
printUsage();
exit(0);
default:
printUsage();
exit(-1);
}
}
//compute S and B
property.S = pow(2.0,property.s);
property.B = pow(2.0,property.b);
//initialize the cache
currentCache = initiate(property.S, property.E);
//initialize the counters
property.missNum = 0;
property.hitNum = 0;
property.evictionNum = 0;
//input file
FILE *tmp;
//wrap the trace into tmp
char command;
unsigned long long address;
int size;
tmp = fopen(trace, "r");
while(fscanf(tmp, " %c %llx,%d", &command, &address, &size) == 3){
switch(command){
//just ignore I
case 'I':
break;
case 'L':
property = simulate(currentCache, property, address);
break;
case 'S':
property = simulate(currentCache, property, address);
break;
//twice for M
case 'M':
property = simulate(currentCache, property, address);
property = simulate(currentCache, property, address);
break;
default:
break;
}
}
//print result
printSummary(property.hitNum, property.missNum, property.evictionNum);
fclose(tmp);
return 0;
}

总结

先总结Part A部分。虽然并没有完全动手写这个代码,仅仅是参考分析他人的代码(主要是没看懂trace文件和不知道怎么调用接口),但对于cache的操作,通过上面分析源码的含义并参考书中的知识,大致有了比较深刻的理解。