Static, global and the magic between compile and run

Until now, if you tried to initialize a (non-constant) global or static variable with the Everykey framework, the resulting firmware file turned out to be huge, far too big to fit on the device. Even worse, if you initialized a global or static to zero, you might be able to flash, but the program would fail in unpredictable ways. The workaround was to initialize only const variables and then to set up everything else by hand in your `main` function. Not so beautiful. If you’re not interested in the why and how, here’s the good news in brief: It works now. Just update your repository, run `make clean` and then `make`. If you want a better understanding of linking and bootstrapping internals, read on. It’s often considered dark magic, but it’s probably far easier than you believe.

The cause of the problem was that, for the sake of simplicity, we had skipped an essential step that is supposed to happen between compilation and execution: data initialization. Let’s see what actually happens:

The makefile first compiles your C code into machine language and data. But any locations that are supposed to refer to an actual, absolute address in memory are intentionally left blank. Why? Because memory locations are not known yet at this point. Instead, gcc classifies the different types of its results (executable code, data, global variables, etc) and adds a special export symbol for each location that might be interesting for others. The symbol does not specify an absolute address, just the location within the compilation unit. Other special import symbols are added for each location that should later on refer to an actual memory address – just the location within the compilation unit and a name where it should point to.
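To make the export/import idea a bit more concrete, here is a small sketch of how it looks at the C source level. All names are made up for illustration and do not come from the Everykey sources.

```c
/* Illustrative names only; none of these come from the Everykey sources. */

/* Defined here with external linkage: the compiler emits an EXPORT symbol
   "blink_count" that records its position inside this compilation unit. */
int blink_count = 3;

/* 'static' at file scope means internal linkage: other compilation units
   cannot refer to this variable by name. */
static int private_counter = 0;

/* Declared but not defined here: every use becomes an IMPORT symbol that
   the linker has to match against an export from another object file
   (in this case the C library). */
extern int puts(const char *text);

/* Also exported. The call to puts() and the accesses to the two variables
   are the "blank" spots that get real addresses only at link time. */
int next_blink(int delta)
{
    private_counter += delta;
    puts("blink");
    return blink_count + private_counter;
}
```

Assuming a typical GNU toolchain, running `nm` on the resulting object file should list `blink_count` and `next_blink` as defined global symbols, `private_counter` only as a local one, and `puts` as undefined.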

The linker is supposed to glue together all the compilation units generated by the compiler. In principle, this process is rather simple: It scans through all .o object files (the compiler’s results) and collects all the different types of data. After that, it orders the contents of the object files and calculates the location of all the symbols. Once the actual locations are known, the linker can replace the symbol references with the actual memory addresses. If an imported symbol cannot be matched to an exported one, you get an error. That’s basically it for static linking.
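The two linker steps described above (lay out the units, then swap names for addresses) can be modelled in a few lines of C. This is a toy model with invented types and function names, not how a real linker is implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: real object files and linkers are far more detailed. */
typedef struct {
    const char *name;    /* symbol name */
    uint32_t    offset;  /* position inside its compilation unit */
} ToySymbol;

typedef struct {
    uint32_t         size;          /* bytes this unit occupies */
    const ToySymbol *exports;       /* locations offered to other units */
    size_t           export_count;
    uint32_t         base;          /* filled in by toy_layout() */
} ToyObject;

/* Step 1: place the units one after another, starting at 'start'. */
static void toy_layout(ToyObject *objects, size_t count, uint32_t start)
{
    uint32_t next = start;
    for (size_t i = 0; i < count; i++) {
        objects[i].base = next;
        next += objects[i].size;
    }
}

/* Step 2: swap a symbol name for an absolute address.
   Returns 1 on success, 0 if no unit exports the name
   (the "undefined reference" case). */
static int toy_resolve(const ToyObject *objects, size_t count,
                       const char *name, uint32_t *address)
{
    for (size_t i = 0; i < count; i++) {
        for (size_t j = 0; j < objects[i].export_count; j++) {
            if (strcmp(objects[i].exports[j].name, name) == 0) {
                *address = objects[i].base + objects[i].exports[j].offset;
                return 1;
            }
        }
    }
    return 0;
}
```

Calling `toy_layout()` first and then `toy_resolve()` for every imported name mirrors the order in which the linker works: addresses can only be handed out once every unit has its final position.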

But wait, there’s more: The linker is not only supposed to lay out the runtime addresses, it also creates our firmware file! Do they differ? Yes, exactly in situations that initialize a global or static variable. On one hand, the variable must be located in RAM (because your code may modify its contents at runtime). On the other hand, its initial value must be stored in FLASH, because that’s the only memory with known contents at startup (hey, that’s why it’s called flashing). In order to solve this problem, we have to do two things:

Tell the linker that these variables should later on have a location in RAM, but initial values should go to a separate region in the firmware file (which goes to the FLASH memory)
Add a step in the startup code that copies the stored initial values from FLASH to their respective location in RAM

Fortunately, it’s quite easy to tell the linker to do the first thing using a special markup in the linker script (which is, by the way, more of a list of rules than a script). The second step we have to do ourselves: it adds four lines of code to the bootstrap() function in startup.c. But how does the bootstrap code know what to copy from where to where? Nice trick: We intentionally import symbols that are not exported anywhere else in the code. Usually, this would cause a linker error, unless we tell the linker to export these symbols itself. That’s exactly what we do.
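The copy step can be sketched roughly as follows. This is not the actual code from startup.c; the function names and the linker symbol names in the comment are hypothetical, and the loops are written as plain functions so they can also be tried on a PC.

```c
#include <stdint.h>

/* Copy the stored initial values of .data from their FLASH mirror
   into RAM, one 32-bit word at a time. */
static void copy_data(const uint32_t *flash_src,
                      uint32_t *ram_start, uint32_t *ram_end)
{
    for (uint32_t *dst = ram_start; dst < ram_end; dst++) {
        *dst = *flash_src++;
    }
}

/* Clear the .bss region so that zero-initialized variables read as 0. */
static void clear_bss(uint32_t *bss_start, uint32_t *bss_end)
{
    for (uint32_t *dst = bss_start; dst < bss_end; dst++) {
        *dst = 0;
    }
}

/* On the target, the arguments would come from symbols that only the
   linker script defines. The names below are hypothetical:

       extern uint32_t _data_flash_start, _data_start, _data_end;
       extern uint32_t _bss_start, _bss_end;

       copy_data(&_data_flash_start, &_data_start, &_data_end);
       clear_bss(&_bss_start, &_bss_end);

   Only the ADDRESS of each such symbol is meaningful; there is no
   variable stored behind it. */
```

Both loops have to run before any code touches a global or static variable, which is why they belong at the very beginning of the bootstrap code.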

Linker scripts can be slightly confusing to read. For readability, all input segment names (stuff classified by the compiler) start with a dot. Output segments (stuff ordered by the linker) don’t start with a dot (at least that’s how we did it). In addition, there are now some comments in the linker script (lpc1343.ld).

By default, gcc classifies its output using the following names (don’t ask why, it’s just convention):

.text : compiled code. This can go into FLASH (if you want to build self-modifying code, you have to copy it to RAM first)
.rodata: globals and static variables declared as const. Since their contents cannot be modified, they can also go into FLASH
.data: non-const globals and static variables that are initialized to a value other than zero: These need to be linked to RAM, but we need to store their initial values in a mirror region in FLASH.
.bss: non-const globals and static variables that are either initialized to zero or not initialized (which defaults to zero): These need to be linked to RAM, but we don’t need to store their initial values (they are all zero, so there’s no need to waste FLASH memory). The startup code just needs to clear the corresponding RAM region to zeroes.
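As a quick illustration of the four categories, here are a few declarations with the segment gcc would typically choose for each. The variable names are invented for this example.

```c
/* Illustrative declarations; the segment named in each comment is where
   gcc would typically place the object. */

const int blink_period_ms = 250;  /* .rodata: const, can stay in FLASH      */
int       blink_count     = 3;    /* .data: RAM, initial value kept in FLASH */
int       error_count     = 0;    /* .bss: RAM, cleared at startup           */
int       last_button;            /* .bss: not initialized, so zero          */

/* .text: the machine code of the function itself */
int next_count(void)
{
    static int calls = 0;         /* .bss: statics follow the same rules */
    static int step  = 2;         /* .data */

    calls += 1;
    blink_count += step;
    return blink_count + calls + error_count + last_button;
}
```

Without the startup steps for .data and .bss, `blink_count` and `step` would start with whatever happens to be in RAM instead of 3 and 2, which matches the random failures described at the beginning.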

In addition, we declare a special segment “.vectors” in startup.c, which only contains the vector table (initial stack pointer, start address of the bootstrap code and default addresses of interrupt and fault handlers). The Cortex-M3 expects this table to start at memory location 0x0 to properly launch into our code. We use this dedicated segment to ensure that the linker puts it in the right place.
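A vector table in its own segment might look roughly like the sketch below. It is not the table from startup.c: the handler names are placeholders, only the first eight entries are shown, and the stack top assumes 8 KB of RAM starting at 0x10000000, which you should check against the datasheet of your part.

```c
#include <stdint.h>

/* Assumption: 8 KB of RAM starting at 0x10000000, so the stack starts at
   the very top and grows downwards. */
#define INITIAL_STACK_TOP 0x10002000u

typedef void (*Handler)(void);

static int boot_count = 0;

/* Hypothetical handlers standing in for the real ones in startup.c. */
void bootstrap(void)       { boot_count++; }
void default_handler(void) { }

/* On ARM builds, put the table into its own segment so the linker script
   can pin it to address 0x0. On a PC build it is just a plain array. */
#if defined(__arm__)
#define VECTOR_SECTION __attribute__((section(".vectors"), used))
#else
#define VECTOR_SECTION
#endif

const Handler vector_table[] VECTOR_SECTION = {
    (Handler)(uintptr_t)INITIAL_STACK_TOP, /* 0x00: initial stack pointer  */
    bootstrap,                             /* 0x04: reset, our entry point */
    default_handler,                       /* 0x08: NMI                    */
    default_handler,                       /* 0x0C: hard fault             */
    default_handler,                       /* 0x10: memory management      */
    default_handler,                       /* 0x14: bus fault              */
    default_handler,                       /* 0x18: usage fault            */
    0,                                     /* 0x1C: reserved, left empty   */
    /* further exception and interrupt entries omitted in this sketch */
};
```

The first entry is not a function at all, it is only stored in a function pointer slot so that the whole table can be one array.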

Just for completeness, here’s the rest of the bootstrap story:

The linker will write its results into an .obj file, containing all necessary information. After the linker is done, objcopy will extract a “memory dump” of the linker results: the contents of the .bin file are exactly what should be in the microcontroller’s FLASH memory, starting at address 0x0. One last step is missing: Our specific microcontroller expects a valid checksum to accept the firmware file. There’s an entry in the vector table left empty for this value. The `checksum` tool will calculate the correct value and modify the value in the .bin file.
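For the curious, the checksum itself is simple. The sketch below is not the source of the `checksum` tool. It assumes the rule from NXP's LPC13xx documentation as I understand it: vector entry 7 has to hold the two's complement of the sum of entries 0 to 6, so that the first eight 32-bit words add up to zero. The function names are made up.

```c
#include <stdint.h>

/* Assumption: entry 7 of the vector table is the slot reserved for the
   checksum, and the first eight words must sum to zero (mod 2^32). */
#define CHECKSUM_ENTRY 7

/* Compute the value that belongs into the checksum slot. */
static uint32_t vector_checksum(const uint32_t *vectors)
{
    uint32_t sum = 0;
    for (int i = 0; i < CHECKSUM_ENTRY; i++) {
        sum += vectors[i];
    }
    return 0u - sum;  /* two's complement; unsigned wrap-around is defined */
}

/* Write the checksum into the table, as the tool does for the file. */
static void patch_checksum(uint32_t *vectors)
{
    vectors[CHECKSUM_ENTRY] = vector_checksum(vectors);
}
```

A real tool works on the bytes of the firmware file rather than on an array in memory, so it also has to read and write the words in the controller's little-endian byte order.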

When flashing, this binary data is written into the microcontroller’s flash memory. Later at runtime, after the controller hardware has finished its internal setup, it initializes the stack pointer (with the value at address 0, part of the vector table) and jumps into our bootstrap code (whose address is stored at memory location 4, also part of the vector table). Once in software land, we can do the copy trick described above, do some more common initialization (for example, enabling the external crystal oscillator and some basic peripherals) and call main(), which is the starting point of the actual application.

As you can see, there are quite a few components involved, but no dark magic at all.

3 thoughts on “Static, global and the magic between compile and run”

Elli said on February 12, 2013 at 20:16:

Thanks for the thorough explanation! Great to see Anykey is thriving again. You’ve got one happy groupie here. 🙂
matthias said on February 14, 2013 at 03:39:

Thanks! Btw: I saved your hat 🙂
Elli said on February 14, 2013 at 14:51:

You did?! At least someone was being a responsible adult that day! 😀 I’m looking forward to seeing you guys soon! And I hope there’ll be another Anykey Workshop soon as well. Btw: I really like your writing style. But enough with the flattery for now. See you soon. Also, drop by any time. It was fun arguing with you.

Everykey Blog

yet another hardware blog
