XV6 Lab1

PC Physical Address Space

Firstly, we will review the 32-bit PC RAM addressing, its addressing ability is 2^32 = 4GB. Its layout can be shown as

However, the earliest PC’s CPU was 16-bit, and with the 20-bit address line, ideally it should be able to address up to 1M(2^20) RAM space. The fact is that 16-bit CPU can only address 64K(2^16) space. In order to access 20-bit address space, an external memory reference was made up of a 16-bit Offset address added to a 16-bit segment number, shifted 4 bits so as to produce a 20-bit physical address. The final address can be expressed as ‘Segment * 16 + Offset’. But the space would become 0x10FFEF(0xFFFF0+0xFFFF), which exceeds 1M, which caused some unexpected results when programmers accessed the address between 1M and 0x10FFEF, there was no error hint when this happened. It just rounded up using mod 1M, this phenomenon is called ‘wrap-around’.

Let’s focus on this layout, there are several points should be covered:

  • Earliest PC mentioned above could actually access 640KB space, which smaller than the whole space.
  • Address spaces between 640K and 1M were leaved for some special uses. For example, the BIOS, which occupies 64KB from 0x000F0000 through 0x000FFFFF.
  • When Intel broke the barrier of 1M, the PC preserved the original layout of 1M, which ensures the backward compatibility.

Download xv6 source code from here, then start QEMU emulator.

We can tell that this OS start at address 0x000FFFF0, which is located in very top of BIOS space, this address can be calculated by 0xF000 * 16 + 0xFFF0. And the first instruction is ljmp $0xf000, $0xe05b, the purpose of this instruction is to give adequate space for further instructions, after all, we cannot expect 16 bits space is enough for all following jobs.

You may ask why start at 0xFFFF0, this is crucial since it ensures that BIOS always takes control at the very first time when there is NO other software in the RAM can be executed by the processor.

Let’s run several instructions just for guessing, in fact after quite a few steps(with si command in GNU Debugger aka GDB), I noticed that there is a comment: ‘The architecture is assumed to be i386’, which means the OS is now 32-bit.

After BIOS initialized hardware, it searches the bootable device such as floppy, hard drive or CD-ROM. When BIOS successfully finds the bootable device, BIOS reads the boot loader from the device and transfers control to it. I am guessing that the boot loader is designed to initialize the OS itself, just by its name :).

Boot Loader

Floppy and hard drive can be divided into regions are all 512 bytes, called sectors. If the drive is bootable, the first sector is called boot sector, because this is where the boot loader code resides. When the BIOS find this sector, it loads the 512-byte sector into memory at 0x7C00 through 0x7DFF, then use jmp instruction to jump back to the address 0x7C00, passing control to the boot loader. What we should notice is that those addresses above are all fixed.

In this lab, the boot loader is 512 bytes, which contains the code in the path ‘boot/‘. There are 2 files in the directory, one C source code file and one x86 assembly source code file.

In boot.S, after some initializations, we focus on the code below:

1
2
3
4
lgdt    gdtdesc
movl %cr0, %eax
orl $CR0_PE_ON, %eax
movl %eax, %cr0

lgdt instruction use global descriptor table(GDT) to make virtual addresses identical to their physical ones. Not gonna discuss this here. All we need to know it is needed for the 16-bit to 32-bit transition. During the transition from real mode to protected mode, lgdt makes sure the addresses have no need to change at all. CRO_PE_ON constant means enable protected mode.

After setting up protected mode, this code

1
2
3
# Set up the stack pointer and call into C.
movl $start, %esp
call bootmain

as the comment says, move stack pointer to the beginning and call bootmain function in C file.

Now, the main.c file should be analyzed.

1
2
#define SECTSIZE	512
#define ELFHDR ((struct Elf *) 0x10000) // scratch space

The first macro is sector’s space size, which has been discussed above is 512 bytes. The second one is actually a pointer or if you excuse the expression, put an EFL struct on that address(0x10000). The ELF struct is defined like

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
struct Elf {
uint32_t e_magic; // must equal ELF_MAGIC
uint8_t e_elf[12];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
uint32_t e_entry;
uint32_t e_phoff;
uint32_t e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
};

Then let’s see function readseg

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
// Might copy more than asked
void
readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
uint32_t end_pa;

end_pa = pa + count;

// round down to sector boundary
pa &= ~(SECTSIZE - 1);

// translate from bytes to sectors, and kernel starts at sector 1
offset = (offset / SECTSIZE) + 1;

// If this is too slow, we could read lots of sectors at a time.
// We'd write more to memory than asked, but it doesn't matter --
// we load in increasing order.
while (pa < end_pa) {
// Since we haven't enabled paging yet and we're using
// an identity segment mapping (see boot.S), we can
// use physical addresses directly. This won't be the
// case once JOS enables the MMU.
readsect((uint8_t*) pa, offset);
pa += SECTSIZE;
offset++;
}
}

When this function is firstly called, its purpose is to read the 1st page off the disk. pa &= ~(SECTSIZE - 1) means sticking to the boundary of each sector. ‘offset’ in this function determines which sector will be read. The while loop ensures all sectors which contain this special ELF format file are read. Therefore, the readseg function is about to fill out the ELF struct, collecting kernel information, but yet, not loading kernel.

Before loading the kernel, ELF_MAGIC should be checked. To load the whole kernel, the information collected such as ELFHDR->e_phoff(the offset of program segment), ELFHDR->e_phnum(the number of program segments) is very helpful. Also, the readseg function this time fills out the other needed information for ELF struct. Once the kernel is completely loaded into the memory, it is time to go into the kernel. With this function.

1
((void (*)(void)) (ELFHDR->e_entry))();

Not finished yet, let’s have a look at function readsect, as we can see, this function directly manipulate I/O ports to read kernel file saved on the disk. I will not dive into the detail about ports from 0x1F0 to 0x1F7. I can find their uses discussed on the internet.

Q&A

  • At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode ?

    1
    2
    ljmp    $PROT_MODE_CSEG, $protcseg
    orl $CR0_PE_ON, %eax
  • What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded ?

    1
    ((void (*)(void)) (ELFHDR->e_entry))();

The first instruction could be seen by analyze the binary file of kernel. Run command

1
objdump -x kernel


With start address, we can easily find the first instruction. Put a breakpoint at this address and here it is.

  • Where is the first instruction of the kernel ?
    As mentioned above, 0x10000C it is.

  • How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information ?
    ELFHDR->e_phnum determines how many sectors needed. By firstly reading the first page of disk, the boot loader gets this information.

Load Kernel

Before going deeply, it is necessary to understand what an ELF binary is. When a C program is compiled and linked, the compiler(such as GCC) transforms each C(‘.c’) source file into an object file(‘.o’), for example, considering this program segment in main.c

1
2
3
4
5
6
7
#include <stdio.h>

int main(void)
{
printf("Hello, world");
return 0;
}

then run command

1
cc -Og -c main.c

you get an output file named main.o. We will talk about this process in the next posts. All we need to know is that this object file contains assembly instructions encoded in the binary format which can be recognized by hardware(There is actually the machine code in it.). After this transformation, the linker combines all object files into one single binary image such as obj/kern/kernel.

The ELF binary itself can be very complicated, so to simplify what need to be done, ELF executable file can be considered to be a header with loading information, followed by several program sections, each of which is a contiguous chunk of code or data intended to be loaded into memory at a specified address.

The program sections we are interested in are listed below

  • .text: This is where executable instructions reside.
  • .rodata: The read-only data, such as constant C strings.
  • .data: This place holds the initialized data, such as global variables defined like int x = 5.

About uninitialized data, for instance like int x, there is no value assigned to this x, in C we know its value is implicitly zero. When the linker computes the memory layout of a program, it reserves space for uninitialized variables like this, in the section .bss that immediately follows .data in memory. Since uninitialized variables are implicitly assigned to zero, there is no need to store actual contents for this section; instead, the linker merely records address and size of the .bss section. We can run command

1
objdump -x obj/kern/kernel

to understand these sections.

The screenshot above reveals much more information we now actually need, for now, let’s take a look at 2 important parts: VMA & LMA. The VMA is called link address(also called virtual address). The LMA is called load address. The load address of a section is the memory address at which that section should be loaded into memory. The link address of a section is the memory address from which the section expects to execute. The linker layouts the VMA to physical address(mapping).

The boot loader uses ELF header(defined in inc/elf.h) to decide how to load the sections. The program headers shown below

1
2
3
4
5
6
7
8
9
10
struct Proghdr {
uint32_t p_type;
uint32_t p_offset;
uint32_t p_va;
uint32_t p_pa;
uint32_t p_filesz;
uint32_t p_memsz;
uint32_t p_flags;
uint32_t p_align;
};

specify which parts of the ELF object(p_offset) to load into memory and the destination address(p_pa) each should occupy. Again, run command

1
objdump -x obj/kern/kernel

We notice ‘LOAD‘ area shown above, which need to be loaded into memory. Let’s go back to the main.c file, the ph->p_pa in function bootmain contains each section’s destination physical address.

In kernel, we know that the link address and load address are all different. Link address is always 0xf0000000 greater than load address. However, typically in boot loader, they are the same(0x7C00). By passing -Ttext 0x7C00 to the linker, link address is set. Thus the linker will produce the correct memory addresses in the generated code.

About exercise 6, we check the state of memory address 0x00100000. Two breakpoints should be set up on 0x7c00 where BIOS enters boot loader, and 0x10000c where boot loader enters the kernel.

At first breakpoint 0x7c00, the address 0x00100000 hasn’t been linked yet. Therefore, there is nothing here. Then, at the second breakpoint 0x10000c, 0x00100000 has already been linked. So there is some part of kernel revealed.

Kernel

Like the boot loader, the kernel also uses assembly code to set things up so that C code can execute properly.

First of all, about the virtual address mapping, in short, we use memory management unit(MMU) to map virtual address to physical one. Up until kern/entry.S sets the CR0_PG flag, memory references are treated as physical addresses (strictly speaking, they’re linear addresses, but boot/boot.S set up an identity mapping from linear addresses to physical addresses and we’re never going to change that). Once CR0_PG is set, memory references are virtual addresses that get translated by the virtual memory hardware to physical addresses.

Exercise 7: Let’s examine memory at 0x100000 and 0xf0100000 before and after movl %cr0, %eax executed.

Before movl %cr0, %eax, 0xf010000 has not been mapped, so there is nothing in this area. After this instruction, 0xf010000 is mapped to 0x00100000, so when we examine this virtual address, we actually examine 0x00100000 physical address. And if we delete this instruction, we will get an error: ‘Trying to execute code outside RAM or ROM at 0xf010002c’, because this address 0xf010002c was taken as a physical address! The fact is there is no area in this address, we do not actually have that large memory(Available space is in 0x00000000~0x0fffffff).

Formatted Printing to the Console

Many people take printf() for granted, whereas it should be implemented by ourselves.

Firstly, read code located in kern/printf.c, lib/printfmt.c and kern/console.c, we need to figure out what relationship among them. It will become clear in later labs why printfmt.c is located in the separate lib directory.

Exercise 8: All we need to do is add octal numbers implementation in specific fragment of code. It is very simple since we can follow the hex one.

1
2
3
4
case 'o':
num = getuint(&ap, lflag);
base = 8;
goto number;

Then let’s look at these questions that need answering.

  • Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
    The interface is obviously cputchar(), provided by console.c and called by printf.c. The function exported by console.c is that it constantly output characters to I/O devices.

  • Explain the following from console.c:

    1
    2
    3
    4
    5
    6
    7
    8
       if (crt_pos >= CRT_SIZE) {
    int i;
    memmove(crt_buf, crt_buf + CRT_COLS,
    (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
    for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
    crt_buf[i] = 0x0700 | ' ';
    crt_pos -= CRT_COLS;
    }

The location of this piece of code is in cga_put(), so it’s about VGA device. crt_pos represents the current location of our cursor. crt_buf is continually input chars(a string) before cursor, therefore we can use crt_buf[i] to find where our cursor stops. CRT_SIZE is actually the whole console window’s size. When the console window is full(crt_pos >= CRT_SIZE), this code uses memmove to move one line up, meanwhile, clear last column to prepare for incoming inputs.

  • For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC’s calling convention on the x86. Trace the execution of the following code step-by-step:
    1
    2
    int x = 1, y = 3, z = 4;
    cprintf("x %d, y %x, z %d\n", x, y, z);

fmt points to string "x %d, y %x, z %d\n", ap points to the memory address of second parameter.

  • Run the following code. What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise.
    1
    2
    unsigned int i = 0x00646c72;
    cprintf("H%x Wo%s", 57616, &i);

57616 is 0xell0, so the first part of this output would be Hell0. Let’s figure out what &i will look like. Before doing that, we need to know about little-endian and big-endian. 0x00646c72‘s layout in memory would be 72 6c 64 00, decimal ones are 114 108 100 0. The corresponding ASCII chars are r l d, so the second part of output is World. The output is string Hell0 World. If the x86 were instead big-endian, we only need to set i to 0x726c6400.

  • In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?
    1
    cprintf("x=%d y=%d", 3);

It is an unknown value.

  • Let’s say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?
    Possible va_start:
    1
    ap = (var_list)&(last) - sizeof(last)

and possible va_arg:

1
#define va_arg(ap,type) (*(type *)((ap -= sizeof(type)) + sizeof(type)))

Challenge: Colorful console output:

1
cprintf("\033[30mWelcome \033[31mto \033[34mthe \033[35mJOS \033[36mkernel \033[37mmonitor!\n");

The Stack

In the final part of this exercise, we will dive into more details about the way C programming language uses stack in x86 architecture. In this process, what we need to do is write some kernel code to show enough function calling information during the running of the kernel. It seems like so-called call stack in our daily developing work. What’s more, I found that StackExchange has opened discuss panel related to reverse engineering. Plus, there is an answer I found helpful explaining the uses of registers: %rbp, %rsp.

Exercise 9: Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which “end” of this reserved area is the stack pointer initialized to point to?

The stack should be initialized before calling the first C function. Therefore, the stack should begin in entry.S file. The specific location should be here.

1
2
movl 0x0,%ebp
movl (bootstacktop),%esp

The memory address the stack pointer pointed to can be analyzed by looking into .asm file generated by the compiler, which is at path obj/kern/kernel.asm. Let’s check the address of bootstacktop, which is 0xf0110000. The reserve space is about two parts. One is constant KSTKSIZE defined in file, another is bootstacktop. The bootstacktop is where the stack pointer points to. If the stack grows downward, the “end” is the bootstacktop.

Now, it’s time to write an entire backtrace function used in debugging. When a function calls another one, the %eip saves the address right after the call instruction, which ensures when callee function returns, the caller function continues as wanted, safe and sound. Another things we should notice is before doing mov %esp, %ebp, we make sure read_ebp() runs first, saving caller’s %ebp value. The mon_backtrace should be implemented like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
uint32_t *ebp = (uint32_t *)read_ebp(); // already 'movl %esp, %ebp'
uint32_t *eip = (uint32_t *)ebp[1]; // next instruction -4(%ebp)

//list arguments using pointer references
uint32_t arg0, arg1, arg2, arg3, arg4;
arg0 = ebp[2];
arg1 = ebp[3];
arg2 = ebp[4];
arg3 = ebp[5];
arg4 = ebp[6];

//print more specific information, such as call function name, number of arguments, source file name
struct Eipdebuginfo info;

cprintf("Stack backtrace:\n");
while (0x0 != ebp) {
debuginfo_eip((uintptr_t)eip, &info);
cprintf("ebp %x eip %x args %x %x %x %x %x\n", ebp, eip, arg0, arg1, arg2, arg3, arg4);

// Ignore stuff after the colon. Cut eip_fn_name by info.eip_fn_namelen
char real_name[info.eip_fn_namelen+1];
real_name[info.eip_fn_namelen] = '\0';

for (int i = 0; i < info.eip_fn_namelen; i++)
real_name[i] = info.eip_fn_name[i];

cprintf("%s:%d: %s+%d\n", info.eip_file, info.eip_line, real_name, (uint32_t)eip-info.eip_fn_addr);
ebp = (uint32_t *)ebp[0]; // *(uint32_t *)((uint32_t *)$ebp+0)
eip = (uint32_t *)ebp[1];
arg0 = ebp[2];
arg1 = ebp[3];
arg2 = ebp[4];
arg3 = ebp[5];
arg4 = ebp[6];
}

return 0;
}

This simplified backtrace should be like this.

However, we definitely won’t be satisfied with this simple information it gives. We usually need to find where the program stops or where the program crashes, we should be able to locate detailed code position. For instance, when we are curious about which function crashes, more specific details should be given. To help out, the kernel provides us with debuginfo_eip(), which, defined in kern/kdebug.c, looks up eip in the symbol table and returns the debugging information for that address.

In kdebug.c file, we can see two extern constants,

1
2
extern const struct Stab __STAB_BEGIN__[];	// Beginning of stabs table
extern const struct Stab __STAB_END__[]; // End of stabs table

and Stab is a C structure defined in inc/stab.h.

1
2
3
4
5
6
7
8
// Entries in the STABS table are formatted as follows.
struct Stab {
uint32_t n_strx; // index into string table of name
uint8_t n_type; // type of symbol
uint8_t n_other; // misc info (usually empty)
uint16_t n_desc; // description field
uintptr_t n_value; // value of symbol
};

To understand what exactly this Stab is, we must pay some attention to file kern/kernel.ld.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
/* Include debugging information in kernel memory */
.stab : {
PROVIDE(__STAB_BEGIN__ = .);
*(.stab);
PROVIDE(__STAB_END__ = .);
BYTE(0) /* Force the linker to allocate space
for this section */
}

.stabstr : {
PROVIDE(__STABSTR_BEGIN__ = .);
*(.stabstr);
PROVIDE(__STABSTR_END__ = .);
BYTE(0) /* Force the linker to allocate space
for this section */
}

__STAB_BEGIN__&__STAB_END__ can easily be found here. At least, we now know where __STAB_* come from. As course mentions, running command

1
objdump -G obj/kern/kernel

also helps us understand that.

Contents of .stab section correspond to Stab structure elements defined in inc/stab.h. Also,

1
#define	N_SLINE		0x44	// text segment line number

makes me focus on the SLINE in .stab section. The text segment, remember, is where our code resides. Therefore, I guess this is gonna be where we extract more information from.

Now it’s time to write some code to implement this. We concentrate on this function debuginfo_eip(), which uses stab_binsearch() to narrow down the scope we search and finally we are able to find the line number for an address. It is very simple to mimic other search phase. With SLINE we discussed above, so the code should be like this.

1
2
3
4
5
6
stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
if (lline <= rline) {
info->eip_line = stabs[lline].n_desc;
} else {
return -1;
}

The final more sophisticated backtrace can act like this! Hooray~