CSAPP Lab4 解题分析

Xin Qiu Jun 07, 2016

Lab4 的Cache Lab 的说明资料太少了，csim.c 里面空空的，惊了。身为弱渣的我只好先拿别人的代码来参考参考。这里选取的代码csim.c。

Part A: Writing a Cache Simulator

需要些一个Cache模拟器，将 valgrind 内存访问记录作为参数，模拟是否命中缓存。输出命中，不命中，移除。

lab里提供了一个写好的程序 csim-ref ，使用LRU(least-recently used)替换原则，模拟缓存命中情况。

实验指导上给出了命令行参数

Usage: ./csim-ref [-hv] -s <s> -E <E> -b <b> -t <tracefile>

-h: Optional help flag that prints usage info
-v: Optional verbose flag that displays trace info
-s <s>: Number of set index bits ($ S = 2^{s} $ is the number of sets)
-E <E>: Associativity (number of lines per set)
-b <b>: Number of block bits ($ B = 2^{b} $ is the block size)
-t <tracefile>: Name of the valgrind trace to replay

所以在 csim.c 里也可以写个打印帮助的函数

void printUsage(){
    printf("Usage: ./csim [-h] [-v] -s <s> -E <E> -b <b> -t <tracefile>\n");
    printf("-s: number of set index(2^s sets)\n");
    printf("-E: number of lines per set\n");
    printf("-b: number of block offset bits\n");
    printf("-t: trace file name\n");
}

参考书中对cache的描述，需要些用结构体定义出cache的一些属性

typedef struct{
    int s;//2^s sets
    int S;//S=2^s
    int b;//2^b per block	
    int B;//B=2^b
    int E;//number of lines per set
    //number to track the solution	
    int hitNum;
    int missNum;
    int evictionNum;
    int verbosity;
} cacheProperty;

对应的说明可以看书中 Figure 6.28.

根据Figure 6.27，cache是一个set的数组，每个set包含一个或者多个行，每个行又包括一个有效位，tag标识位，数据块。所以可以定义出cache的组成

typedef struct {
    int valid;
    unsigned long long tag;
    char *block;
    int usedCounter;
} setLine;

typedef struct {
    setLine *lines;
} cacheSet;

typedef struct {
    cacheSet *sets;
} cache;

接着就需要考虑cache的初始化。其实就是使用malloc分配内存给定义的cache。一开始所有的有效位和tag标识位都为0。

cache initiate(long long S, int E) 
{
    cache currentCache;	
    cacheSet set;
    setLine line;
    currentCache.sets = (cacheSet *) malloc(sizeof(cacheSet) * S);
    //loop to construct the set of the cache
    int i;
    for (i = 0; i < S; i++) 
    {
        set.lines =  (setLine *) malloc(sizeof(setLine) * E);
        currentCache.sets[i] = set;
        //loop to construct the lines
        int j;
        for (j = 0; j < E; j++) 
        {
            line.valid = 0; 
            line.tag = 0; 
            set.lines[j] = line;
            line.usedCounter = 0;
        }
    } 
    return currentCache;
}

对于判断是否命中，要判断两个部分，有效位是否设置，cache的line中tag标识位是否匹配。

int checkHit(setLine line, unsigned long long tag){
    if(line.valid){
        if(line.tag == tag){
            return 1;
        }
    }
    return 0;
}

判断行里是否已经满了。

int checkFull(cacheSet set, cacheProperty property){
    int i = 0;
    for(i = 0; i<property.E; i++){
        if(set.lines[i].valid == 0){
            return 1;
        }
    }
    return 0;
}

当需要缓存时，需要找到一个有效的缓存空间。

int findIndex(cacheSet set, cacheProperty property){
    int i = 0;
    for(i = 0; i<property.E; i++){
        if(set.lines[i].valid == 0){
            return i;
        }
    }
    //shouldn't get a -1 anyway, method only called when there is granted space
    return -1;
}

当缓存不用了，需要移出这块缓存里的内容。

int findEvict(cacheSet set, cacheProperty property){
    int min = set.lines[0].usedCounter;
    int i = 0;
    int index = 0;
    for(i = 0; i < property.E ; i++){
        if(min>set.lines[i].usedCounter){
            index = i;
            min = set.lines[i].usedCounter;
        }
    }
    return index;
}

int findMax(cacheSet set, cacheProperty property){
    int max = set.lines[0].usedCounter;
    int i = 0;
    int index = 0;
    for(i = 0; i < property.E ; i++){
        if(set.lines[i].usedCounter>max){
            index = i;
            max = set.lines[i].usedCounter;
        }
    }
    return index;
}

接下来最关键的是写一个模拟器，模拟cache操作。首先，按照书中的介绍，tag的数量在64位系统下等于64减去块偏移位数量和组索引位数量。根据tag数量，可以求出组索引。接着将这块cache记录到Cache区中。通过循环遍历一块cache块里的行，查看命中情况。

如果命中，更新cache的属性。如果不命中，并且缓存并没满，那么需要添加新的缓存内容。如果因为缓存满了，那么需要将缓存中最先使用的缓存块移除缓存。

cacheProperty simulate(cache currentCache,cacheProperty property, unsigned long long address){
    //compute the size of the tag, 64 bit system
    int tagSize = 64-(property.b + property.s);
    unsigned long long tag = address >> (property.s + property.b);
    //use the tagSize to compute for the set index
    unsigned long long temp = address << (tagSize);
    unsigned long long setIndex = temp >> (tagSize + property.b);
    cacheSet set = currentCache.sets[setIndex];
    //loop through lines in the set
    int i = 0;
    int hit = 0;
    for (i = 0; i<property.E; i++){
        setLine currentLine = set.lines[i];
        //check if there is a hit
        if(checkHit(currentLine, tag) == 1){
            //if hit, update the staffs in the property
            property.hitNum+=1;
            int max = 0;
            hit = 1;
            max = findMax(set, property);
            currentCache.sets[setIndex].lines[i].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
        }
    }
    if(hit == 0 && checkFull(set, property) == 1){
    //if not full then it is a miss, update the staffs in the property&set
        property.missNum+=1;
        int index = 0;
        index = findIndex(set, property);
        set.lines[index].tag = tag;
        set.lines[index].valid = 1;
        int max = 0;
        max = findMax(set, property);
        currentCache.sets[setIndex].lines[index].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
    }else if(hit == 0){
        //evict, update the staffs in the property&set
        property.missNum+=1;
        property.evictionNum+=1;
        int evictIndex = 0;
        //find the evict line
        evictIndex = findEvict(set, property);
        set.lines[evictIndex].tag = tag;
        int max = 0;
        max = findMax(set, property);
        currentCache.sets[setIndex].lines[evictIndex].usedCounter = currentCache.sets[setIndex].lines[max].usedCounter+1;
    }
    return property;
}

当然最后需要些一个主函数来调用测试接口。这里关于trace文件。每一条对内存访问的记录格式是 [空格]操作符地址,大小，以 I 开头的是载入指令的记录，不算在内存访问中。

M 表示数据修改，需要一次载入 + 一次存储，也就是相当于两次访问
S 表示数据存储
L 表示数据载入
地址指的是一个 64 位的 16 进制内存地址
大小表示该操作内存访问的字节数

int main(int argc, char** argv)
{
    cache currentCache;
    cacheProperty property;
    //initiate the char to store the trace
    char* trace;
    char input;
    property.verbosity = 0;
    //parse input
    while( (input=getopt(argc,argv,"s:E:b:t:vh")) != -1)
    {
        switch(input){
        case 's':
            property.s = atoi(optarg);
            break;
        case 'E':
            property.E = atoi(optarg);
            break;
        case 'b':
            property.b = atoi(optarg);
            break;
        case 't':
            trace = optarg;
            break;
        case 'v':
            property.verbosity = 1;
            break;
        //help
        case 'h':
            printUsage();
            exit(0);
        default:
            printUsage();
            exit(-1);
        }
    }
    //compute S and B
    property.S = pow(2.0,property.s);
    property.B = pow(2.0,property.b);
    //initialize the cache
    currentCache = initiate(property.S, property.E);
    //initialize the counters
    property.missNum = 0;
    property.hitNum = 0;
    property.evictionNum = 0;
    //input file
    FILE *tmp;
    //wrap the trace into tmp
    char command;
    unsigned long long address;
    int size;
    tmp = fopen(trace, "r");
    while(fscanf(tmp, " %c %llx,%d", &command, &address, &size) == 3){
        switch(command){
            //just ignore I
            case 'I':
                break;
            case 'L':
                property = simulate(currentCache, property, address);
                break;
            case 'S':
                property = simulate(currentCache, property, address);
                break;
            //twice for M
            case 'M':
                property = simulate(currentCache, property, address);
                property = simulate(currentCache, property, address);	
                break;
            default:
                break;
        }
    }
    //print result
    printSummary(property.hitNum, property.missNum, property.evictionNum);
    fclose(tmp);
    return 0;
}

总结

先总结Part A部分。虽然并没有完全动手写这个代码，仅仅是参考分析他人的代码(主要是没看懂trace文件和不知道怎么调用接口)，但对于cache的操作，通过上面分析源码的含义并参考书中的知识，大致有了比较深刻的理解。