Concepts you may want to Google beforehand: C, object code, linker, disassemble
Goal: Learn to write the same low-level code as we did with assembler, but in C
Let's see how the C compiler compiles our code and compare it to the machine code generated with the assembler.
We will start by writing a simple program which contains a function, function.c.
Open the file and examine it.
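The file itself is not reproduced here, so the following is only a minimal sketch of what function.c might contain. The name my_function and the constant 0xbaba are assumptions chosen for illustration; any small function that returns a recognizable value works just as well.

```c
/* function.c (hypothetical contents): a single function and no main(). */
int my_function(void) {
    /* A distinctive constant that is easy to spot in the disassembly. */
    return 0xbaba;
}
```

A distinctive return value helps because the same bytes should show up as an immediate operand when you disassemble the object file.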
To compile system-independent code, we need the flag -ffreestanding, so compile
function.c in this fashion:
i386-elf-gcc -ffreestanding -c function.c -o function.o
Let's examine the machine code generated by the compiler:
i386-elf-objdump -d function.o
Now that is something we recognize, isn't it?
Finally, to produce a binary file, we will use the linker. An important part of this
step is to learn how high-level languages call function labels. At which offset
will our function be placed in memory? We don't actually know. For this
example, we'll place the offset at 0x0 and use the binary format, which
generates machine code without any labels and/or metadata:
i386-elf-ld -o function.bin -Ttext 0x0 --oformat binary function.o
Note: a warning may appear when linking (typically that the entry symbol _start cannot be found); disregard it.
Now examine both "binary" files, function.o and function.bin, using xxd. You
will see that the .bin file is machine code, while the .o file has a lot
of debugging information, labels, etc.
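One quick way to tell the two files apart in the xxd output is the first four bytes: an ELF object file begins with the magic number 0x7f 'E' 'L' 'F', while a flat binary begins directly with machine code. The sketch below expresses that check in C; looks_like_elf is a hypothetical helper name, not part of any toolchain.

```c
#include <stddef.h>

/* Returns 1 if the buffer starts with the ELF magic number
   (0x7f 'E' 'L' 'F'), and 0 otherwise. */
int looks_like_elf(const unsigned char *buf, size_t len) {
    if (len < 4) {
        return 0; /* too short to hold the magic number */
    }
    return buf[0] == 0x7f && buf[1] == 'E' && buf[2] == 'L' && buf[3] == 'F';
}
```

Feeding it the first bytes of function.o should return 1, assuming the cross-compiler emits ELF objects as its name suggests; the first bytes of function.bin should return 0.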
As a curiosity, we will disassemble the raw machine code:
ndisasm -b 32 function.bin
I encourage you to write more small programs, which feature:
- Local variables: localvars.c
- Function calls: functioncalls.c
- Pointers: pointers.c
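As a starting point, the three programs could look roughly like the sketch below. All function names and constants here are assumptions made for illustration. The snippets are shown together for brevity, but each one is meant to go in its own file so that it can be compiled and disassembled separately.

```c
/* localvars.c (hypothetical): without optimization, the compiler
   typically keeps a local variable like this on the stack. */
int local_vars(void) {
    int my_var = 0xbaba;
    return my_var;
}

/* functioncalls.c (hypothetical): one function calls another
   and passes it an argument. */
int callee_function(int my_arg) {
    return my_arg;
}

int caller_function(void) {
    return callee_function(0xdede);
}

/* pointers.c (hypothetical): a pointer to a string literal. */
const char *pointer_function(void) {
    const char *my_string = "Hello";
    return my_string;
}
```

Compile each file with the same -ffreestanding -c flags as before, and look for the constants 0xbaba and 0xdede in the disassembly to orient yourself.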
Then compile and disassemble them, and examine the resulting machine code. Follow
the os-guide.pdf for explanations. Try to answer this question: why does the
disassembly of pointers.c not resemble what you would expect? Where is the ASCII
0x48656c6c6f for "Hello"?
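As a hint, it helps to know exactly which byte sequence to look for in the xxd output. The sketch below copies the characters of "Hello" into a byte buffer so they can be inspected one at a time; hello_bytes is a hypothetical helper, and the values in the comment assume an ASCII execution character set.

```c
#include <stddef.h>

/* Copies the bytes of "Hello" (without the trailing NUL) into out,
   writing at most cap bytes, and returns the number of bytes written.
   With ASCII, the bytes are 0x48 0x65 0x6c 0x6c 0x6f, in that order. */
size_t hello_bytes(unsigned char *out, size_t cap) {
    const char *s = "Hello";
    size_t n = 0;
    while (s[n] != '\0' && n < cap) {
        out[n] = (unsigned char)s[n];
        n++;
    }
    return n;
}
```

Once you find those five bytes in the hex dump, compare their position with the instructions the disassembler printed for the same offsets.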