
1. Boot Structure

The boot process for a machine is configured from the makefile located at: linux/arch/arm/boot/Makefile

1.1 boot/Makefile

This file normally defines the name of the kernel built, and some linker specific options.

This is the output of the build; it is not in any specific executable binary / object format. It is the raw kernel, which can be used for debugging or asm dumps.

SYSTEM = $(TOPDIR)/vmlinux

These variables control how the kernel is uncompressed:

ZTEXTADDR - Address where the zImage is located by the bootloader (the decompressor starts here)
ZRELADDR - Address where the decompressed kernel is written and then executed
PARAMS_PHYS - Physical address where tagged parameters are to be found
INITRD_PHYS - Physical address of the ramdisk
ZBSSADDR - Start address of the zero-initialised work area (BSS) used by the decompressor

To define these for a specific architecture, you would have:

(these values are for the xScale PXA250)

ZRELADDR = 0xA0008000
PARAMS_PHYS = 0xA0000100

INITRD_PHYS and ZBSSADDR depend on the actual architecture and requirements, but ZRELADDR and PARAMS_PHYS are vital to start any compressed zImage. If you aren't using a compressed zImage, you might still have to set PARAMS_PHYS to set up tagged parameters.

You will probably still be able to pass parameters through tags if you want the kernel to locate a ramdisk.

PARAMS_PHYS will point to the tagged parameters which have been placed by a boot loader.
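As a rough illustration of what "tagged parameters" means, the sketch below walks a tag list the way a reader of PARAMS_PHYS might. It assumes the usual ARM layout where each tag starts with a size (in 32-bit words) followed by a tag id, the list begins with ATAG_CORE and ends with ATAG_NONE. The constant values are given as I recall them from the kernel's ARM setup header, and count_tags() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>
#include <stddef.h>

/* Tag ids as recalled from the kernel's ARM setup header (assumption). */
#define ATAG_NONE 0x00000000u
#define ATAG_CORE 0x54410001u

/*
 * Hypothetical helper: walk a tag list and return the number of tags found
 * before ATAG_NONE. Returns -1 if the list does not start with ATAG_CORE,
 * contains a malformed size, or runs past max_words.
 * Each tag is: word 0 = size in 32-bit words (header included), word 1 = id.
 */
static int count_tags(const uint32_t *params, size_t max_words)
{
    size_t pos = 0;
    int count = 0;

    if (max_words < 2 || params[1] != ATAG_CORE)
        return -1;

    while (pos + 2 <= max_words) {
        uint32_t size = params[pos];
        uint32_t tag = params[pos + 1];

        if (tag == ATAG_NONE)
            return count;              /* terminator reached */
        if (size < 2 || size > max_words - pos)
            return -1;                 /* malformed or truncated tag */
        pos += size;
        count++;
    }
    return -1;                         /* no terminator found */
}
```

On real hardware the pointer would be the address given by PARAMS_PHYS; here it is just an array so the logic can be tried on a host machine.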

1.2 boot/compressed/head-machine.S

For each machine type, there is a specific head-machine.S which sets machine specific options. This could be head-xscale.S or head-netwinder.S or head-sa1100.S or head-clps7500.S depending on the actual machine and is selected through linux/arch/arm/boot/compressed/Makefile

This file is linked into head.S rather than placed before it: its code executes inside head.S after some initial code has run.

An Example: head-xscale.S

.start is defined here and is the actual execution start of the architecture specific header.

The next thing done is to flush the cache by traversing memory larger than twice the largest possible size of a D-Cache (This technique seems to be used in more than one place during the kernel start).

Most of the code here is to switch on machine specific options, which cannot/should not be included in head.S directly.


Optionally, the architecture id (MACH_TYPE_MACHNAME) is hardcoded inside the decompressor. If the bootloader (like Blob) can decompress the kernel by itself, this is not required, as the entry code of the kernel can read the architecture number from the registers passed to it. I assume that otherwise the initial registers passed to the decompressor have to be preserved (the boot-loader document specifies that we have to set r0 = 0, r1 = MACH_TYPE_MACHNAME, r2 = PARAMS_PHYS). Execution then continues through the common section of head.S (just after start:).

1.3 linux/arch/arm/boot/compressed/head.S

This is the entry header for the kernel decompressing routine and is actually located at: linux/arch/arm/boot/compressed/head.S


The file actually starts with definitions for code required for debugging (serial debugging) before the decompression actually starts.

#ifdef DEBUG

The following macros are used by kphex and kputc internally and need to be defined should you need debugging support on the platform at this stage


This macro can be defined for any new architecture if you need debug messages before the kernel is uncompressed.

#endif /* DEBUG */

Bootstrapping and Relocating

The actual code execution starts here:


.word 0x016f2818 @ Magic numbers to help the loader
.word start @ absolute load/run zImage address 
.word _edata @ zImage end address

"start" and "_edata" contain the offsets to start: and _edata: for the zImage. They are intended to assist in relocating the kernel.

0x016f2818 is a signature for boot loaders to identify the image as a Linux kernel zImage. This is at a physical offset 0x24 from the start of the actual zImage and can be used to verify whether a given image is actually a compressed Linux kernel. In the case of a relocation, this would probably help a bootloader detect a bootable zImage.

We now preserve r0,r1 by storing them in r8,r7 so that the scratch registers are available for manipulation through the rest of the code.

r8 = #0

r7 = r1
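Returning briefly to the signature word: a boot loader could test for it roughly as sketched below. The magic value and the 0x24 offset come from the text above; the function name looks_like_zimage() and the assumption that the image is stored little-endian are mine.

```c
#include <stdint.h>
#include <stddef.h>

#define ZIMAGE_MAGIC        0x016f2818u  /* signature word from head.S */
#define ZIMAGE_MAGIC_OFFSET 0x24u        /* byte offset from start of zImage */

/* Read a 32-bit little-endian word without assuming host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical check: returns 1 if the buffer carries the zImage magic. */
static int looks_like_zimage(const uint8_t *image, size_t len)
{
    if (len < ZIMAGE_MAGIC_OFFSET + 4)
        return 0;                        /* too short to hold the magic */
    return read_le32(image + ZIMAGE_MAGIC_OFFSET) == ZIMAGE_MAGIC;
}
```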

This part of the kernel is written to provide relocation capabilities. A GOT (Global Offset Table) is defined and used for the purpose of relocation. At this point, the code checks the GOT and makes sure that any offsets are redefined if we are executing from a different location from what was expected.

The code works as follows:

LC0 contains the GOT which can be used for relocating the code.

LC0: .word LC0 @ r1
.word __bss_start @ r2
.word _end @ r3
.word _load_addr @ r4
.word _start @ r5
.word _got_start @ r6
.word _got_end @ ip
.word user_stack+4096 @ sp
LC1: .word reloc_end - reloc_start


r7,r8 are preserved during this operation (actually they are preserved almost through the rest of the code)

r2 = __bss_start, r3 = _end from the GOT

not_relocated: mov r0, #0
1: str r0, [r2], #4 @ clear bss
str r0, [r2], #4
str r0, [r2], #4
str r0, [r2], #4
cmp r2, r3
blo 1b
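The loop above is equivalent to something like the following C, given here only as a sketch (clear_bss() is a hypothetical name). Note that, like the assembly, it clears four words per iteration and only then compares, so it may write up to three words past the end address if the region is not a multiple of 16 bytes.

```c
#include <stdint.h>

/*
 * Sketch of the unrolled BSS clear: four word stores per pass, then one
 * comparison against the end pointer (mirrors str x4 / cmp / blo).
 */
static void clear_bss(uint32_t *start, const uint32_t *end)
{
    uint32_t *p = start;

    while (p < end) {
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
        p += 4;   /* advance 16 bytes, as the four post-indexed stores do */
    }
}
```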

Enabling the Cache (cache_on)

enable the cache now, do NOT touch r4 while doing so

set r3 := #8 

this is the offset to find the cache_on routine from the processor cache entry tables. (refer to the processor_info table below - ptr_cache_on is at offset #8)

call call_cache_fn 

this routine actually chooses which cache functions to run based on the processor id and the offset supplied (r3).

The proc_types table is laid out as follows:

For each processor there is an identical record that lists out:

| proc_types |
- processor_id -
- processor_mask -
- ptr_cache_on -
- { return mov pc,lr } -

The last record is:

- 0 -
- 0 -
- { mov pc, lr } -
- { mov pc, lr } -
- { return mov pc,lr } -

As observed, the last record would forcibly return to the caller (and initiate an error indicating that the processor was not identified).
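The table walk can be modelled in C as below. This is a simplified sketch, not the kernel's code: the record holds three function slots (an assumption based on the shape of the last record), the processor id and mask in the first entry are made-up values, and the match test assumes the usual "(id XOR entry id) AND mask == 0" comparison. The all-zero mask in the last record matches any processor, which is why the walk always terminates there.

```c
#include <stdint.h>
#include <stddef.h>

typedef int (*cache_fn)(void);

struct proc_type {
    uint32_t id;      /* processor_id */
    uint32_t mask;    /* processor_mask */
    cache_fn fn[3];   /* slots at byte offsets 8, 12, 16 (assumed layout) */
};

static int cache_nop(void)     { return 0; }  /* stands in for "mov pc, lr" */
static int demo_cache_on(void) { return 1; }  /* hypothetical cache_on */

static const struct proc_type proc_types[] = {
    { 0x12340000u, 0xffff0000u, { demo_cache_on, cache_nop, cache_nop } }, /* made-up CPU */
    { 0, 0, { cache_nop, cache_nop, cache_nop } },  /* catch-all last record */
};

/* offset is the byte offset of the wanted slot within a record (8 = cache_on). */
static int call_cache_fn(uint32_t cpu_id, size_t offset)
{
    const struct proc_type *p = proc_types;

    while (((cpu_id ^ p->id) & p->mask) != 0)
        p++;                              /* no match, try the next record */
    return p->fn[(offset - 8) / 4]();
}
```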

the appropriate cache function [ptr_cache_on] is called by a direct branch (without modifying the link register lr)

set r12 := lr


every arch specific cache_on preserves lr by copying it to r12. It may be possible that lr is affected when the cache is initialized**

( the cache function finally returns with a mov pc,r12 )

Decompress the kernel

r1 := sp
r2 := sp + 64K
if r4 > r2 (_load_addr > sp+64K)? do wont_overwrite()
r0 := r4 + 4M
if r0 < r5 (_load_addr + 4M < _start)? do wont_overwrite()

In case we would run into the kernel (within our stack space), we need to work around overwriting the kernel during decompression:
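The two checks above amount to a small overlap test, sketched here in C with the comparisons exactly as written in the notes (strict greater-than and less-than; the assembly itself may use inclusive comparisons). The function name mirrors the label, and the example addresses in the usage note are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define MALLOC_SPACE 0x10000u    /* 64K above the stack */
#define MAX_KERNEL   0x400000u   /* 4M assumed largest decompressed kernel */

/*
 * Returns true when the kernel can be decompressed directly to load_addr
 * without touching the running decompressor or its stack/heap.
 */
static bool wont_overwrite(uint32_t load_addr, uint32_t start, uint32_t sp)
{
    if (load_addr > sp + MALLOC_SPACE)
        return true;             /* output lies entirely above us */
    if (load_addr + MAX_KERNEL < start)
        return true;             /* output ends before our code begins */
    return false;                /* overlap: decompress elsewhere, then relocate */
}
```

For instance, with a load address of 0xA0008000 and the decompressor starting at 0xA0800000, the second test passes and no relocation is needed; with the decompressor at 0xA0100000 it fails and the relocation path is taken.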

r1 := r5 + r0 (_start + _length) - _end of kernel
r2 := _start
r3 := *LC1 (reloc_end - reloc_start)
r3 := r2 + r3 (_start + reloc_length)
copy relocation code from _start to _end

parameters to decompress_kernel() are now assigned before calling

r5 := r2 (sp +64K)
r0 := r5 (sp +64K)
r3 := r7 (r7 = architecture_id)

wont_overwrite() is called when it is safe to proceed with the initial addresses and stack space assignments. This is the ideal case and the simplest one with no complications.

r0 := r4 ( _load_addr of kernel )
r3 := r7 ( architecture_id )

decompress_kernel() is actually a C routine (there is a comment where the GOT is setup, stating that the GOT is required to be properly setup for accessing C calls).

Call Kernel

do cache_clean_flush() -- flush the caches softly!
do cache_off() -- turn off all caches
r0 := #0
r1 := r7 (architecture_id)
mov pc, r4 (r4 = _load_addr)

The kernel should start executing at this point (you'll have to refer to code in kernel/head.S to trace further execution).


Flushing the D-Cache


r1 := pc & ~(31) /** bic r1, pc, #31 */
r2 := r1 + 64K (64K is twice the size of 32K - max d-cache size)
while( r1 < r2) {
r12 = *r1
r1 += 32 (one 32-byte cache line per load)
}

1.4 kernel/head.S

This is the actual start of the kernel (sans any decompression code).

(was head-arm[vo].S in 2.4.xx kernels and has now been merged into a single file.)

(PS: This is why all TEXTADDR end with 0x8000)

ENTRY(stext) in this file actually executes from TEXTADDR

At this point, the following state is assumed

MMU: off

I-Cache: - don't bother -

D-Cache: off

r0: 0

r1: architecture_id (sourced from mach-types)

This code can also be called by a bootloader (which can uncompress the kernel by itself or run an uncompressed kernel image) and hence should also be able to execute without any special settings or helpers.


We now force ourselves to enter SVC mode irrespective of the mode we are in during entry.

(set CPSR_C using r0 - saving original r0 to r12)


enter SVC mode with FIQs and IRQs turned off
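The value written to the control field of the CPSR can be worked out as below. The bit values are the standard ARM ones (SVC mode = 0x13, I bit = 0x80, F bit = 0x40); the helper force_svc() is only a model of the effect of writing the low byte of the CPSR, not kernel code.

```c
#include <stdint.h>

#define MODE_SVC 0x13u   /* supervisor mode */
#define I_BIT    0x80u   /* IRQs disabled when set */
#define F_BIT    0x40u   /* FIQs disabled when set */

/* Model of "msr cpsr_c, #F_BIT | I_BIT | MODE_SVC": only bits 7..0 change. */
static uint32_t force_svc(uint32_t cpsr)
{
    return (cpsr & ~0xffu) | F_BIT | I_BIT | MODE_SVC;
}
```

Whatever mode the processor was in on entry, the low byte ends up as 0xd3, while the condition flags in the upper bits are left alone.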



r5 := &__proc_info_end (actually the start of the table).

using r5 it loads the following

r7 := __proc_info_end

r9 := __proc_info_begin

r10:= [table]

It traverses the table; the first two entries in each record are the processor id and the processor mask.

It uses

mrc p15, 0, r9, c0, c0 @ get processor id
r6 := processor_mask & r9
if ( r6 == processor_id ) { return; }

at this point r10 := _proc_info or '0' depending on whether a match was made. We can easily check whether the processor was identified by checking the value of r10.

if( r10 == 0 ) { exit( err = 'p' ) }
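In C terms the processor lookup behaves roughly like the sketch below. The struct is a cut-down, hypothetical stand-in for the real proc_info record (only the id and mask that the loop compares, plus a name for illustration), and a NULL return plays the role of r10 == 0.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, reduced record: the real table entries carry more fields. */
struct proc_info {
    uint32_t cpu_val;    /* processor_id */
    uint32_t cpu_mask;   /* processor_mask */
    const char *name;    /* illustrative only */
};

/*
 * Walk [begin, end) and return the first record whose id equals the masked
 * CPU id, or NULL when the processor is unknown.
 */
static const struct proc_info *
lookup_processor_type(uint32_t cpu_id,
                      const struct proc_info *begin,
                      const struct proc_info *end)
{
    const struct proc_info *p;

    for (p = begin; p < end; p++) {
        if ((cpu_id & p->cpu_mask) == p->cpu_val)
            return p;    /* match: the caller reads the rest of the record */
    }
    return NULL;         /* corresponds to r10 := 0 */
}
```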


Next, the value of r1 that was passed in at entry is used to identify the actual machine type. The lookup walks through a table, just like __lookup_processor_type(), until it hits a match.

This returns:

r5 := Physical Start Address of RAM

r6 := Physical Start Address of I/O

r7 := byte offset into page tables for I/O

Again, we can check whether we had a positive identification by checking r7.

if( r7 == 0 ) { exit( err = 'a' ) }


The page tables are located at the address of 'stext' - 0x4000, which is exactly 16K below the actual start of the kernel.

This is done using the pgtbl macro, which takes an extra rambase parameter that is never used**. The entire 0x4000 (16K) 1-level table is cleared with zeros. 4 MB of space is mapped as pages of 1 MB each, and the rest is aligned on a 32 MB boundary. If serial debugging is required before paging_init() is called, the serial device (if it is external to the processor) must also be mapped through this table. If the serial device is part of the processor, then you would have to know the virtual address of the serial device after the initial page mapping is done. CONFIG_DEBUG_LL selects whether the serial device is mapped here. Architecture-specific initialization code is required here if serial debugging on an external UART is needed.
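The arithmetic behind the table placement can be checked with a few lines of C. This is only a sketch: pgtbl_base() and section_entry_offset() are hypothetical helpers, the kernel start address 0xA0008000 is the PXA250 ZRELADDR value quoted earlier, and 0xC0000000 is an assumed virtual address used purely as an example. A 16K table holds 4096 four-byte entries, one per 1 MB of the 4 GB address space.

```c
#include <stdint.h>

#define PGTBL_SIZE    0x4000u   /* 16K first-level table */
#define SECTION_SHIFT 20        /* each entry covers 1 MB */

/* The table sits 16K below the start of the kernel text. */
static uint32_t pgtbl_base(uint32_t stext)
{
    return stext - PGTBL_SIZE;
}

/* Byte offset within the table of the entry that maps vaddr. */
static uint32_t section_entry_offset(uint32_t vaddr)
{
    return (vaddr >> SECTION_SHIFT) * 4;
}
```

With the kernel at 0xA0008000 the table starts at 0xA0004000, which also shows why a start address ending in 0x8000 leaves room for it above the 16K-aligned boundary.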

lr = &__turn_mmu_on()


This was __ret() in 2.4.xx

The adr pseudo-op is used here to ensure that the code remains completely relocatable. r10 contains the base address of the 'xxx_proc_info' table, where xxx is the processor identified by __lookup_processor_type().

pc = r10 + #12

this actually jumps to __xxx_setup()

The MMU is set up at this point and activated (lr points to __turn_mmu_on(), and hence mov pc,lr jumps there). Once the MMU is set up, control is passed to __mmap_switched.


This is the last function called inside head.S, after which control is passed to 'start_kernel' (which is located in init/main.c).
