清华操作系统课程学习

因为学习操作系统,因为还有一些其他事情要做,所以没有直接上MIT 6.828,选择了清华的操作系统课和配套的ucore。

从机器启动到操作系统运行的过程

在机器接通电源后, 会向主板发送一个信号。一旦主板收到电源妥协信号后,主板会尝试启动CPU,CPU复位寄存器里的所有数据, 设置预定值给每个寄存器。同时执行系统初始化软件完成基本IO初始化和引导加载功能。为最终调用操作系统内核准备好正确的环境。最终引导加载程序把操作系统内核映像加载到RAM中,并将系统控制权传递给它。

以Intel 80386为例,计算机加电后,CPU从物理地址0xFFFFFFF0(由初始化的CS:EIP确定,此时CS和IP的值分别是0xF000和0xFFF0)开始执行。8086 处理器有一个20位寻址总线,这意味着它可以对0到 2^20 位地址空间进行操作( 1Mb )。不过它只有16位的寄存器,通过这个16位寄存器最大寻址是 2^16 即 0xffff (64 Kb)。实模式使用分段机制来管理整个内存空间。所有内存被分成固定的 64KB 大小的小块。由于我们不能用16位寄存器寻址大于 64KB 的内存,一种替代的方法被设计出来了。一个地址包括两个部分:数据段起始地址和从该数据段起的偏移量。为了得到内存中的物理地址,我们要让数据段乘16并加上偏移量:

1
PhysicalAddress = Segment * 16 + Offset

通过阅读 kernel boot protocol,在内核被引导如内存后,内存使用情况将入下表所示:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
         | Protected-mode kernel  |
100000 +------------------------+
| I/O memory hole |
0A0000 +------------------------+
| Reserved for BIOS | Leave as much as possible unused
~ ~
| Command line | (Can also be below the X+10000 mark)
X+10000 +------------------------+
| Stack/heap | For use by the kernel real-mode code.
X+08000 +------------------------+
| Kernel setup | The kernel real-mode code.
| Kernel boot sector | The kernel legacy boot sector.
X +------------------------+
| Boot loader | <- Boot sector entry point 0x7C00
001000 +------------------------+
| Reserved for MBR/BIOS |
000800 +------------------------+
| Typically used by MBR |
000600 +------------------------+
| BIOS use only |
000000 +------------------------+

所以当 bootloader 完成任务,将执行权移交给 kernel,kernel 的代码从以下地址开始执行:

1
0x1000 + X + sizeof(KernelBootSector) + 1

上面的公式中, X 是 kernel bootsector 被引导如内存的位置。

Lab 1

练习1

1.操作系统镜像文件ucore.img是如何一步一步生成的?

在进行 make V=后,通过显示的信息,可以知道先对C文件进行了编译。分别编译了一下文件:

  • cc kern/init/init.c 初始化系统
  • cc kern/libs/readline.c 从输入中读取一行
  • cc kern/libs/stdio.c 标准化IO
  • cc kern/debug/kdebug.c debug支撑文件
  • cc kern/debug/kmonitor.c 命令行显示器
  • cc kern/debug/panic.c 显示错误警告
  • cc kern/driver/clock.c 时钟相关
  • cc kern/driver/console.c 控制台显示
  • cc kern/driver/intr.c 中断请求
  • cc kern/driver/picirq.c 中断控制器
  • cc kern/trap/trap.c 中断处理
  • cc kern/trap/trapentry.S 中断预处理
  • cc kern/trap/vectors.S 中断向量表
  • cc kern/mm/pmm.c 全局描述符表
  • cc libs/printfmt.c 输出格式
  • cc libs/string.c 字符串操作
  • cc boot/bootasm.S Boot Loader程序
  • cc boot/bootmain.c Boot Loader程序
  • cc tools/sign.c 硬盘主引导扇区检测

接着 ld 命令对编译好的 .o 文件进行链接,生成了引导扇区。

之后用 dd 创建了一块空间,并将 bootblock 放入这块空间做成 img 硬盘。

2.一个被系统认为是符合规范的硬盘主引导扇区的特征是什么?

tools/sign.c 源文件中可以得知,

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18

.....
if (st.st_size > 510) {
fprintf(stderr, "%lld >> 510!!\n", (long long)st.st_size);
return -1;
}
char buf[512];
......
buf[510] = 0x55;
buf[511] = 0xAA;
......
if (size != 512) {
fprintf(stderr, "write '%s' error, size is %d.\n", argv[2], size);
return -1;
}
fclose(ofp);
printf("build 512 bytes boot sector: '%s' success!\n", argv[2]);
......

规范的硬盘主引导扇区大小为512, 并且最后两位必须是 0x55AA.

练习2

使用qemu执行并调试lab1中的软件,要做一下几个操作:

  1. 从CPU加电后执行的第一条指令开始,单步跟踪BIOS的执行。

  2. 在初始化位置0x7c00设置实地址断点,测试断点正常。

  3. 从0x7c00开始跟踪代码运行,将单步跟踪反汇编得到的代码与bootasm.S和 bootblock.asm进行比较。

  4. 自己找一个bootloader或内核中的代码位置,设置断点并进行测试。

根据参考的提示,在 .PHONY: qemu qemu-nox debug debug-nox 下添加

1
2
3
4
lab1-mon: $(UCOREIMG)
$(V)$(TERMINAL) -e "$(QEMU) -S -s -d in_asm -D $(BINDIR)/q.log -monitor stdio -hda $< -serial null"
$(V)sleep 2
$(V)$(TERMINAL) -e "gdb -q -x tools/lab1init"

在调用qemu时添加-d in_asm -D q.log参数可以将运行的汇编指令保存在q.log中。

另外需要先在lab1/tools/文件夹下新建一个lab1init文件。添加以下内容

1
2
3
4
5
6
file bin/kernel
target remote :1234
set architecture i8086
break *0x7c00
continue
x /2i $pc

分别代表加载kernel, 与qemu链接, BIOS是i8086 16位实模式, 在0x7c00处设置断点, 继续运行, 显示指令寄存器上的两条指令。

make lab1-mon 产生的汇编代码还是稍微有点区别,主要在0x7c1e开始有点区别。

#####gdb的单步命令

在gdb中,有next, nexti, step, stepi等指令来单步调试程序,他们功能各不相同,区别在于单步的“跨度”上。

  • next 单步到程序源代码的下一行,不进入函数。
  • nexti 单步一条机器指令,不进入函数。
  • step 单步到下一个不同的源代码行(包括进入函数)。
  • stepi 单步一条机器指令。

练习3

参考MIT6.828 boot.S文件分析这篇文章。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
#include <asm.h>

# Start the CPU: switch to 32-bit protected mode, jump into C.
# The BIOS loads this code from the first sector of the hard disk into
# memory at physical address 0x7c00 and starts executing in real mode
# with %cs=0 %ip=7c00.

#启动CPU,切换到32位保护模式,跳转到C代码
#BIOS从硬盘的第一个扇区加载这个代码到
#物理内存地址为0x7c00的地方,cs=0,ip=7c00

#下面的3条.set指令类似于宏定义
#内核代码段选择子
.set PROT_MODE_CSEG, 0x8 # kernel code segment selector
#内核数据段选择子
.set PROT_MODE_DSEG, 0x10 # kernel data segment selector
#保护模式使能标志
.set CR0_PE_ON, 0x1 # protected mode enable flag

# start address should be 0:7c00, in real mode, the beginning address of the running bootloader
#开始地址在实模式下是0:7c00,这个开始地址就是运行bootloader的地址

#定义一个全局名字start
.globl start
start:
#CPU启动为16位模式
.code16 # Assemble for 16-bit mode
#关中断
cli # Disable interrupts
#清除方向标志
cld # String operations increment

# Set up the important data segment registers (DS, ES, SS).
#设置重要的数据段寄存器
#ax,ds, es, ss清零
xorw %ax, %ax # Segment number zero
movw %ax, %ds # -> Data Segment
movw %ax, %es # -> Extra Segment
movw %ax, %ss # -> Stack Segment

# Enable A20:
# For backwards compatibility with the earliest PCs, physical
# address line 20 is tied low, so that addresses higher than
# 1MB wrap around to zero by default. This code undoes this.
#打开A20地址线
#为了兼容早期的PC机,第20根地址线在实模式下不能使用
#所以超过1MB的地址,默认就会返回到地址0,重新从0循环计数,
#下面的代码打开A20地址线
seta20.1:
#从0x64端口读入一个字节的数据到al中
inb $0x64, %al # Wait for not busy(8042 input buffer empty).
testb $0x2, %al
#如果上面的测试中发现al的第2位为0,就不执行该指令
#否则就循环检查
jnz seta20.1

#将0xd1写入到al中
movb $0xd1, %al # 0xd1 -> port 0x64
#将al中的数据写入到端口0x64中,也就是8042的P2端口
outb %al, $0x64 # 0xd1 means: write data to 8042's P2 port

seta20.2:
inb $0x64, %al # Wait for not busy(8042 input buffer empty).
#测试al的第2位是否为0
testb $0x2, %al
jnz seta20.2

#将0xdf写入到al中
movb $0xdf, %al # 0xdf -> port 0x60
#将al中的数据写入到0x60端口中,也就是让P2端口的A20位设置为1
outb %al, $0x60 # 0xdf = 11011111, means set P2's A20 bit(the 1 bit) to 1

# Switch from real to protected mode, using a bootstrap GDT
# and segment translation that makes virtual addresses
# identical to physical addresses, so that the
# effective memory map does not change during the switch.
#将全局描述符表描述符加载到全局描述符表寄存器
lgdt gdtdesc
#cr0中的第0位为1表示处于保护模式
#cr0中的第0位为0,表示处于实模式
#把控制寄存器cr0加载到eax中
movl %cr0, %eax
#将eax中的第0位设置为1
orl $CR0_PE_ON, %eax
#将eax中的值装入cr0中
movl %eax, %cr0

# Jump to next instruction, but in 32-bit code segment.
# Switches processor into 32-bit mode.
ljmp $PROT_MODE_CSEG, $protcseg
#跳转到32位模式中的下一条指令
#将处理器切换为32位工作模式
#下面这条指令执行的结果会将$PROT_MODE_CSEG加载到cs中,cs对应的高速缓冲存储器会加载代码段描述符
#同样将$protcseg加载到ip中

.code32 # Assemble for 32-bit mode
protcseg:
# Set up the protected-mode data segment registers
#设置保护模式下的数据寄存器
#将数据段选择子装入到ax中
movw $PROT_MODE_DSEG, %ax # Our data segment selector
#将ax装入到其他数据段寄存器中,在装入的同时,
#数据段描述符会自动的加入到这些段寄存器对应的高速缓冲寄存器中
movw %ax, %ds # -> DS: Data Segment
movw %ax, %es # -> ES: Extra Segment
movw %ax, %fs # -> FS
movw %ax, %gs # -> GS
movw %ax, %ss # -> SS: Stack Segment

# Set up the stack pointer and call into C. The stack region is from 0--start(0x7c00)
#设置栈指针,并且调用c函数
movl $0x0, %ebp
#调用main.c中的bootmain函数
movl $start, %esp
call bootmain

# If bootmain returns (it shouldn't), loop.
#如果bootmain返回的话,就一直循环
spin:
jmp spin

# Bootstrap GDT
#强制4字节对齐
.p2align 2 # force 4 byte alignment
#全局描述符表
gdt:
SEG_NULLASM # null seg
#代码段描述符
SEG_ASM(STA_X|STA_R, 0x0, 0xffffffff) # code seg for bootloader and kernel
#数据段描述符
SEG_ASM(STA_W, 0x0, 0xffffffff) # data seg for bootloader and kernel

#全局描述符表对应的描述符
gdtdesc:
.word 0x17 # sizeof(gdt) - 1
.long gdt # address gdt

练习4

通过分析源代码和通过qemu来运行并调试bootloader&OS,

  • bootloader如何读取硬盘扇区的?
  • bootloader是如何加载ELF格式的OS?

在bootmain.c中有介绍。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
/* readsect - read a single sector at @secno into @dst */
static void
readsect(void *dst, uint32_t secno) {
// wait for disk to be ready
waitdisk();

outb(0x1F2, 1); // count = 1
outb(0x1F3, secno & 0xFF);
outb(0x1F4, (secno >> 8) & 0xFF);
outb(0x1F5, (secno >> 16) & 0xFF);
outb(0x1F6, ((secno >> 24) & 0xF) | 0xE0);
outb(0x1F7, 0x20); // cmd 0x20 - read sectors

// wait for disk to be ready
waitdisk();

// read a sector
insl(0x1F0, dst, SECTSIZE / 4);
}

读一个扇区的流程大致如下:

  1. 等待磁盘准备好
  2. 发出读取扇区的命令
  3. 等待磁盘准备好
  4. 把磁盘扇区数据读到指定内存

通过readseg函数将OS读取到ELFHDR所指示的地址(0x10000)。
判断头部是否合法,如不合法直接跳出。
把kernel的每一个程序段都载入内存。
通过程序入口地址加载作为ELF可执行文件的kernel。

练习5

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
/* *
* print_stackframe - print a list of the saved eip values from the nested 'call'
* instructions that led to the current point of execution
*
* The x86 stack pointer, namely esp, points to the lowest location on the stack
* that is currently in use. Everything below that location in stack is free. Pushing
* a value onto the stack will invole decreasing the stack pointer and then writing
* the value to the place that stack pointer pointes to. And popping a value do the
* opposite.
*
* The ebp (base pointer) register, in contrast, is associated with the stack
* primarily by software convention. On entry to a C function, the function's
* prologue code normally saves the previous function's base pointer by pushing
* it onto the stack, and then copies the current esp value into ebp for the duration
* of the function. If all the functions in a program obey this convention,
* then at any given point during the program's execution, it is possible to trace
* back through the stack by following the chain of saved ebp pointers and determining
* exactly what nested sequence of function calls caused this particular point in the
* program to be reached. This capability can be particularly useful, for example,
* when a particular function causes an assert failure or panic because bad arguments
* were passed to it, but you aren't sure who passed the bad arguments. A stack
* backtrace lets you find the offending function.
*
* The inline function read_ebp() can tell us the value of current ebp. And the
* non-inline function read_eip() is useful, it can read the value of current eip,
* since while calling this function, read_eip() can read the caller's eip from
* stack easily.
*
* In print_debuginfo(), the function debuginfo_eip() can get enough information about
* calling-chain. Finally print_stackframe() will trace and print them for debugging.
*
* Note that, the length of ebp-chain is limited. In boot/bootasm.S, before jumping
* to the kernel entry, the value of ebp has been set to zero, that's the boundary.
* */
void
print_stackframe(void) {
/* LAB1 YOUR CODE : STEP 1 */
/* (1) call read_ebp() to get the value of ebp. the type is (uint32_t);
* (2) call read_eip() to get the value of eip. the type is (uint32_t);
* (3) from 0 .. STACKFRAME_DEPTH
* (3.1) printf value of ebp, eip
* (3.2) (uint32_t)calling arguments [0..4] = the contents in address (unit32_t)ebp +2 [0..4]
* (3.3) cprintf("\n");
* (3.4) call print_debuginfo(eip-1) to print the C calling function name and line number, etc.
* (3.5) popup a calling stackframe
* NOTICE: the calling funciton's return addr eip = ss:[ebp+4]
* the calling funciton's ebp = ss:[ebp]
*/
uint32_t ebp = read_ebp();
uint32_t eip = read_eip();

int i, j;
for(i = 0; ebp != 0 && i < STACKFRAME_DEPTH; i++)
{
cprintf("epb:0x%08x eip:0x%08x args:", ebp, eip);
uint32_t *args = (uint32_t *)ebp + 2;
for (j = 0; j < 4; j++)
{
cprintf("0x%08x ", args[j]);
}
cprintf("\n");
print_debuginfo(eip - 1);
eip = ((uint32_t *)ebp)[1];
ebp = ((uint32_t *)ebp)[0];
}

}

提示还是挺详细的,按照提示一步步写就能写出。首先读取ebp和eip的值,然后将堆栈里的信息打印出来,之后移动eip和ebp。

练习6

1.中断描述符表(也可简称为保护模式下的中断向量表)中一个表项占多少字节?其中哪几位代表中断处理代码的入口?

通过查看/kern/mm/mmh.h,将struct gatedesc中元素的位相加,共64bit也就是8个字节。另外低16bit和高16bit为段偏移量,查看这个资料,可以知道段选择子是16bit。

Name Bit Full Name
Offset 48..63 Offset 16..31
P 47 Present
DPL 45,46 Descriptor Privilege Level
S 44 Storage Segment
Type 40..43 Gate Type 0..3
0 32..39 Unused 0..7
Selector 16..31 Selector 0..15
Offset 0..15 Offset 0..15

2.请编程完善kern/trap/trap.c中对中断向量表进行初始化的函数idt_init。在idt_init函数中,依次对所有中断入口进行初始化。使用mmu.h中的SETGATE宏,填充idt数组内容。每个中断的入口由tools/vectors.c生成,使用trap.c中声明的vectors数组即可。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
/* idt_init - initialize IDT to each of the entry points in kern/trap/vectors.S */
void
idt_init(void) {
/* LAB1 YOUR CODE : STEP 2 */
/* (1) Where are the entry addrs of each Interrupt Service Routine (ISR)?
* All ISR's entry addrs are stored in __vectors. where is uintptr_t __vectors[] ?
* __vectors[] is in kern/trap/vector.S which is produced by tools/vector.c
* (try "make" command in lab1, then you will find vector.S in kern/trap DIR)
* You can use "extern uintptr_t __vectors[];" to define this extern variable which will be used later.
* (2) Now you should setup the entries of ISR in Interrupt Description Table (IDT).
* Can you see idt[256] in this file? Yes, it's IDT! you can use SETGATE macro to setup each item of IDT
* (3) After setup the contents of IDT, you will let CPU know where is the IDT by using 'lidt' instruction.
* You don't know the meaning of this instruction? just google it! and check the libs/x86.h to know more.
* Notice: the argument of lidt is idt_pd. try to find it!
*/
extern uintptr_t __vectors[];
int i;
for (i = 0; i < sizeof(idt) / sizeof(struct gatedesc); i ++) {
SETGATE(idt[i], 0, GD_KTEXT, __vectors[i], DPL_KERNEL);
}
// set for switch from user to kernel
SETGATE(idt[T_SWITCH_TOK], 0, GD_KTEXT, __vectors[T_SWITCH_TOK], DPL_USER);
// load the IDT
lidt(&idt_pd);
}

3.请编程完善trap.c中的中断处理函数trap,在对时钟中断进行处理的部分填写trap函数中处理时钟中断的部分,使操作系统每遇到100次时钟中断后,调用print_ticks子程序,向屏幕上打印一行文字”100 ticks”。

1
2
3
4
5
6
7
8
9
10
11
12
case IRQ_OFFSET + IRQ_TIMER:
/* LAB1 YOUR CODE : STEP 3 */
/* handle the timer interrupt */
/* (1) After a timer interrupt, you should record this event using a global variable (increase it), such as ticks in kern/driver/clock.c
* (2) Every TICK_NUM cycle, you can print some info using a funciton, such as print_ticks().
* (3) Too Simple? Yes, I think so!
*/
ticks ++;
if (ticks % TICK_NUM == 0) {
print_ticks();
}
break;

To be continued.