RISC-V Machine Boot Code
Last updated on February 24, 2023 pm
When a RISC-V machine powers on, the boot code built into the hardware or emulator jumps to memory address 0x8000_0000 to load an OS.
At this point the machine is running in machine mode, and our boot.S needs to do some work that can only be done at this privilege level. What follows is a line-by-line explanation.
Some Definitions
If you want, you can glance at the code first and come back here when you run into any problems.
Directives
All keywords that begin with a period are called directives. They are not defined by the RISC-V specification but by the assembler, and they give hints to the assembler. Their syntax may differ from assembler to assembler, and they do not correspond to any specific instruction.
Pseudoinstructions
Technically, pseudoinstructions are not instructions. They resemble ordinary instructions and exist to make programming more convenient. One pseudoinstruction may expand to more than one real instruction when assembled.
For example, li (load immediate) is a common pseudoinstruction.
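To make the "more than one instruction" point concrete, here is a minimal sketch in C of how a 32-bit immediate could be split into a lui/addi pair. The names split_li and run_li are hypothetical helpers made up for this illustration, and the sketch takes an RV32 view; a real assembler may pick a shorter or different sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper types and names, for illustration only.
 * One possible expansion of `li rd, imm` for a 32-bit imm:
 *     lui  rd, hi20       # rd = hi20 << 12
 *     addi rd, rd, lo12   # rd = rd + sign-extended lo12
 */
struct li_parts {
    uint32_t hi20; /* 20-bit immediate for lui */
    int32_t lo12;  /* signed 12-bit immediate for addi, -2048..2047 */
};

struct li_parts split_li(uint32_t imm)
{
    struct li_parts p;
    /* addi sign-extends its immediate, so the upper part is rounded
     * up whenever bit 11 of imm is set. */
    p.hi20 = ((imm + 0x800u) >> 12) & 0xFFFFFu;
    p.lo12 = (int32_t)(imm & 0xFFFu);
    if (p.lo12 >= 0x800)
        p.lo12 -= 0x1000;
    return p;
}

/* Recombine the two parts the way a 32-bit hart would execute them. */
uint32_t run_li(struct li_parts p)
{
    assert(p.lo12 >= -2048 && p.lo12 <= 2047);
    return (p.hi20 << 12) + (uint32_t)p.lo12;
}
```

For small values the upper part comes out as zero, which is why an assembler can emit a single addi in that case.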
ABI
ABI stands for application binary interface. Instead of referring to registers by their numeric names, say x0, x1 and so on, it is highly recommended to refer to them by their ABI names, such as zero, ra and so on.
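As a quick reference, the mapping from register number to ABI name for the integer registers can be written as a lookup table. This is only a sketch in C; abi_names and abi_name are names made up for this example.

```c
#include <assert.h>

/* ABI names of the 32 integer registers, indexed by register number
 * (x0 .. x31). Register x8 is also known as fp. */
static const char *const abi_names[32] = {
    "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0", "a1", "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2", "s3", "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6",
};

/* Hypothetical helper: ABI name for register x<n>. */
const char *abi_name(int n)
{
    assert(n >= 0 && n < 32);
    return abi_names[n];
}
```

For instance, abi_name(5) gives t0, the register this boot code uses to hold the hart ID.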
CSR and Zicsr
Control and status registers (CSRs) are used to control and monitor the operation of various hardware components. They are part of the privileged architecture and are therefore accessed with their own set of instructions. These CSR instructions are grouped into a separate extension named Zicsr.
Code
1 |
|
Branching out Other Harts
Hart is the conventional name for a hardware thread.
In line 9, csrr is a pseudoinstruction meaning 'read CSR'. You can read the line simply as 'read mhartid and save it into t0'. Because mhartid is a CSR, reading it requires csrr, which belongs to Zicsr.
mhartid can be read as 'machine-mode hart ID'. When the hardware boots, all of its harts become active and run this assembly code, whereas this snippet is intended to use only one hart.
t0 is the first temporary register and maps to register x5.
In line 10, bnez is also a pseudoinstruction, standing for 'branch if not equal to zero'.
So these two lines work out which hart is running and send it to the pend label unless it is the first hart (hart 0).
Letting a hart jump endlessly is a common way to park it. In line 22 we also added an instruction called wfi, meaning 'wait for interrupt', which can be seen as putting this hart into sleep mode.
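The same decision can be sketched in C. This is not the post's boot.S, only an illustration under assumptions: CPU_NUM is given an assumed value of 8, and choose_path, read_mhartid and park are hypothetical names. The inline assembly is compiled only for RISC-V targets and is meaningful only in machine mode.

```c
#include <assert.h>

#define CPU_NUM 8ul /* assumed hart count; the text does not give its value */

enum boot_path { BOOT_CONTINUE, BOOT_PARK };

/* The `csrr t0, mhartid` + `bnez t0, pend` pair as a pure decision:
 * only hart 0 continues booting, every other hart is parked. */
enum boot_path choose_path(unsigned long hartid)
{
    assert(hartid < CPU_NUM);
    return hartid != 0 ? BOOT_PARK : BOOT_CONTINUE;
}

#if defined(__riscv)
/* Read the mhartid CSR (machine mode only). */
static inline unsigned long read_mhartid(void)
{
    unsigned long id;
    __asm__ volatile("csrr %0, mhartid" : "=r"(id));
    return id;
}

/* Park the hart: sleep until an interrupt, then go back to sleep. */
static inline void park(void)
{
    for (;;)
        __asm__ volatile("wfi");
}
#endif
```

The loop around wfi matters because the instruction may return at any time, so the hart must jump back and wait again.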
Setting up Stacks
The layout of the stacks is quite baffling and took me a lot of time to figure out.
In general, we reserve a block of memory, point sp at the end of the first hart's stack, and then move sp to the stack of the hart that is running. These three steps correspond to the stacks label (lines 18-19), line 13 and line 14 respectively.
The stacks have a total size of STACK_SIZE * CPU_NUM bytes. The .space directive reserves that many bytes and fills them with zeros automatically, the same as .skip. When line 13 refers to stacks, it refers to the beginning, in other words the lowest address, of this reserved memory. Adding exactly one STACK_SIZE to it gives the end address of the first hart's stack.
Always remember these rules when trying to understand this part:
- The low address is the start of the stack, whereas the high address is its end.
- The stack pointer sp always points to the end of the stack.
- Hart IDs start at 0.
Since we designed each stack to be 1024 bytes, which is 2^10, we can shift the hart ID left by 10 bits and add the result to the end of the first stack to find the hart's own stack. Line 12 is the instruction that does exactly this: slli tells the hart to shift t0 left logically, which is the same as multiplying it by 1024.
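The address arithmetic described above can be checked with a small C sketch. STACK_SIZE is the 1024 bytes from the text. CPU_NUM is given an assumed value of 8, and hart_stack_top is a hypothetical name for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_SIZE 1024u /* 2^10 bytes per hart, as in the text */
#define CPU_NUM 8u       /* assumed hart count */

/* Initial sp for a hart, given the address of the `stacks` label.
 * Mirrors the three steps: shift the hart ID, take the end of the
 * first stack, then add the two together. */
uintptr_t hart_stack_top(uintptr_t stacks, unsigned hartid)
{
    assert(hartid < CPU_NUM);
    uintptr_t offset = (uintptr_t)hartid << 10; /* hartid * 1024 */
    uintptr_t first_top = stacks + STACK_SIZE;  /* end of hart 0's stack */
    return first_top + offset;
}
```

If stacks happened to sit at 0x80001000, hart 0 would start with sp = 0x80001400 and hart 1 with sp = 0x80001800, each exactly one STACK_SIZE above the previous one.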
Going to C
In line 16, the program jumps to enter, which is defined in C. Calling a C function is just like calling an assembly routine, since both are compiled to machine code and placed in the text section. When booting, switching to a high-level language quickly can be helpful.
The stacks can also be reserved in C. In xv6-riscv/kernel/start.c, it is written as:
    __attribute__ ((aligned (16))) char stack0[4096 * NCPU];
Here __attribute__ is a GCC extension used to give the compiler additional information. In this case, it ensures that stack0 is aligned on a 16-byte boundary. The array gives each hart 4096 bytes, and NCPU stands for the number of harts.
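A minimal sketch of how such an array could be carved into per-hart stacks is shown below. It assumes NCPU is 8 and uses a hypothetical helper hart_sp. It follows the same 'sp points to the high end' rule described earlier, not necessarily the exact code in xv6.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 8 /* assumed here; xv6 defines it in its own header */

/* One 4096-byte stack per hart, aligned on a 16-byte boundary. */
__attribute__((aligned(16))) char stack0[4096 * NCPU];

/* Hypothetical helper: initial sp for a hart, i.e. the address one
 * past the end of that hart's 4096-byte slice. */
char *hart_sp(unsigned hartid)
{
    assert(hartid < NCPU);
    return stack0 + 4096 * (hartid + 1);
}
```

Because 4096 is a multiple of 16, every value returned by hart_sp stays 16-byte aligned as long as stack0 itself is.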
Loading Address
Note that the address 0x8000_0000 is different from the so-called magic number 0xAA55 on x86. The latter marks the end of a boot sector: the BIOS goes through the storage devices and tries to boot from a sector that ends with it.
Thus, when linking the objects together, we need to add the flag -Ttext=0x80000000 to make sure the .text section is located where we want it. According to the ld documentation, -Ttext here is shorthand for --section-start=.text.
The address 0x8000_0000 is not standardized, but it is the conventional location, as you can see in the QEMU source code. For example, in qemu/hw/riscv/virt.c, the last entry of the memory map array virt_memmap is VIRT_DRAM, which starts at 0x8000_0000.
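To give a feel for what such a memory map looks like, here is a heavily abridged sketch in C. It is modeled on the shape of QEMU's table but is not a copy of it: only two regions are listed, the values are as I recall them and should be checked against your QEMU version, and region_base is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t hwaddr;

typedef struct {
    hwaddr base; /* first address of the region */
    hwaddr size; /* 0 here means "sized at run time" */
} MemMapEntry;

/* Abridged: the real table has many more regions. */
enum { VIRT_UART0, VIRT_DRAM, VIRT_REGION_COUNT };

static const MemMapEntry virt_memmap[] = {
    [VIRT_UART0] = { 0x10000000, 0x100 },
    [VIRT_DRAM]  = { 0x80000000, 0x0 },
};

/* Hypothetical helper: base address of a region in the map. */
hwaddr region_base(int region)
{
    assert(region >= 0 && region < VIRT_REGION_COUNT);
    return virt_memmap[region].base;
}
```

With this table, region_base(VIRT_DRAM) is the same 0x80000000 that is passed to the linker with -Ttext.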
References
- xv6: a simple, Unix-like teaching operating system, Russ Cox, Frans Kaashoek, Robert Morris, MIT, 5 September 2022, 2.6 Code: starting xv6, the first process and system call (P27 - 28)
- xv6-riscv/entry.S at riscv · mit-pdos/xv6-riscv
- Writing a Simple Operating System – from Scratch, Nick Blundell, University of Birmingham, 2 December 2010, Chapter 2 Computer Architecture and the Boot Process (P3 - 7)
- 55 and AA. What’s special about 55 and AA? Or more…, Larry K., Medium
- riscv-asm-manual/riscv-asm.md at master · riscv-non-isa/riscv-asm-manual
- Pseudo Ops (Using as)
- Options (LD)
- Documentation for binutils 2.40
- [Complete] Step by Step: Learning to Develop an Operating System on RISC-V, Wang Chen, Chapter 7 (Part 1): Hello RVOS, bilibili
- riscv-operating-system-mooc/start.S at main · plctlab/riscv-operating-system-mooc
- RISC-V System emulator — QEMU documentation
- qemu/virt.c at master · qemu/qemu
- Specifications - RISC-V International
- RISC-V Instruction Set Manual, Volume I: RISC-V User-Level ISA | Five EmbedDev
- Taking control of RISC-V: RISCV OS in Rust
- osblog/boot.S at master · sgmarz/osblog