You are using Windows (all tests were done on Win2000).
You are using gcc 3.0.1 (devkitadvance).
devkitadvance is installed in c:\devkitadv\.... and c:\devkitadv\bin is on your PATH, i.e. set PATH=%PATH%;c:\devkitadv\bin
| A. | I hate inacurate docs | B. | I can not spell | C. | I hate inacurate docs |
static unsigned int getValue( void )
{
return 0xC0DEF00D;
}
which returns an unsigned 32-bit quantity. If you run the following command
gcc -mthumb -S test1.c
you will have created test1.s, an assembler version of the test1.c file
.align 2
.thumb_func
.type getValue,function
getValue:
push {r7, lr}
mov r7, sp
ldr r0, .L8
mov sp, r7
pop {r7, pc}
.L9:
.align 2
.L8:
.word -1059131379
.Lfe2:
.size getValue,.Lfe2-getValue
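If the .word -1059131379 at the end of the listing looks nothing like the 0xC0DEF00D in the C source, it is the same 32 bits printed as a signed number. Here is a minimal sketch in portable C that shows the two views. The helper names as_signed32 and as_unsigned32 are made up for this illustration and are not part of gcc or devkitadvance:

```c
#include <stdint.h>
#include <string.h>

/* View a 32-bit pattern as a signed number, the way the compiler printed
   the constant in the .word directive. memcpy copies the bits unchanged,
   so no implementation-defined integer conversion is involved. */
int32_t as_signed32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* The reverse view: the bit pattern stored for a signed 32-bit operand.
   Converting to an unsigned type is well defined and wraps modulo 2^32. */
uint32_t as_unsigned32(int32_t value)
{
    return (uint32_t)value;
}
```

Calling as_signed32(0xC0DEF00Du) gives -1059131379, and as_unsigned32(-1059131379) gives 0xC0DEF00D back again.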
| ARM code | Meaning |
.align 2
| Tell the assembler to align what follows. N.B. on ARM targets the operand of .align is a power of two, so .align 2 gives 4-byte (word) alignment; all Thumb code MUST be at least half-word (2-byte) aligned. |
.thumb_func
| notify the linker that this is a thumb function |
.type getValue,function |
notify the linker what type of item follows.
In this case a function
(local to this file) called getValue
|
getValue: |
This is the label that the .type directive above declared to be a function. |
push {r7, lr} |
This is the first of two prolog instructions.
This saves any registers that might be required, along with the old stack frame pointer. r7 in Thumb mode is similar to fp in ARM mode: it points to the current stack frame, both for accessing parameters that have been passed in and so that the saved values can be reloaded in the epilog. N.B. this is a RISC CPU, so the return address is not automatically pushed on the stack as it would be on a CISC CPU (e.g. Z80, 6502, 68K or x86, to name but a few). Instead the return address is stored in a register, namely lr, the link register (other RISC CPUs, e.g. MIPS and PowerPC, do exactly the same). Note that here we are pushing the frame pointer and the link register on to the stack. |
mov r7, sp |
This is the second and final instruction in the prolog. We set the frame pointer (having saved the old value in the previous instruction) so that the stack pointer (sp) can be used freely within the function, for example when calling other functions, without us having to know how much stack is in use. |
ldr r0, .L8 |
Finally the actual code; in this example it is just this one instruction. We load a register, in this case r0, with a value stored at a label local to this item, here .L8. N.B. the label must be word (4-byte) aligned and lie after the instruction, roughly between 4 and 1024 bytes from it. r0 is now loaded with the value to return. |
mov sp, r7 |
Having set r0 we can leave. First we must restore the stack pointer so that we can recover the registers we saved in the prolog; simply setting the stack pointer to the current frame pointer will do. |
pop {r7, pc} |
Now that the stack pointer is restored we can restore the other registers. But note that we pushed the frame pointer and link register, and now we are restoring the frame pointer and program counter. Of course we are! In the prolog the link register held the return address, and now we want to go back there. To save a pointless copy from the link register to the program counter, we just pop the required value straight into the program counter, and that's the end of the function. In one hit the frame pointer now points to the parent stack frame and the program counter is at the instruction directly after the function call (branch and link) that sent it here. |
.L9: |
A local label that identifies the end of the item (see below) |
.align 2 |
Tell the assembler that the next instruction/directive must be aligned. On ARM targets .align 2 means a 2^2 = 4-byte (word) boundary, which the .word that follows needs |
.L8: |
a local label that you have seen earlier that marks the constant that we are going to load into r0 |
.word -1059131379 |
Tell the assembler to emit the 32-bit word literal here; this is 0xC0DEF00D written as a signed 32-bit integer |
.Lfe2: |
label to identify the end of the item |
.size getValue,.Lfe2-getValue |
Tell the linker how big this item is; in this case it is a function and its associated constants
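The range restriction on ldr r0, .L8 can be checked with a little arithmetic. The sketch below assumes the usual Thumb encoding of a pc-relative load: the pc reads as the instruction address plus 4, is rounded down to a word boundary, and an 8-bit word offset is added. thumb_ldr_literal_reachable is a hypothetical helper written for this page, not something the toolchain provides:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can a Thumb "ldr rd, label" located at instr_addr
   reach a literal stored at label_addr? */
bool thumb_ldr_literal_reachable(uint32_t instr_addr, uint32_t label_addr)
{
    /* In Thumb state the pc reads as the instruction address + 4;
       the load then rounds it down to a word boundary. */
    uint32_t base = (instr_addr + 4u) & ~3u;

    if (label_addr % 4u != 0u)      /* the literal must be word aligned */
        return false;
    if (label_addr < base)          /* the offset is unsigned: forwards only */
        return false;
    return (label_addr - base) <= 1020u;   /* 8-bit immediate times 4 */
}
```

For the listing above, assuming getValue starts on a word boundary, the ldr sits at offset 4 and the literal at offset 12, so thumb_ldr_literal_reachable(4, 12) is true.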
static unsigned int getValue( void )
{
asm volatile
("
ldr r0, .myValue
b .exit_this_function
.align 2
.myValue:
.word 0xFEEDFACE
.align 2
.exit_this_function:
");
}
but, if you run the following command
gcc -mthumb -S test2.c
you will have created test2.s, an assembler version of the test2.c file
.align 2
.thumb_func
.type getValue,function
getValue:
push {r7, lr}
mov r7, sp
ldr r0, .myValue
b .exit_this_function
.align 2
.myValue:
.word 0xFEEDFACE
.align 2
.exit_this_function:
.code 16
mov sp, r7
pop {r7, pc}
.Lfe2:
.size getValue,.Lfe2-getValue
Now I'm not going to annotate this, as I'm sure you can see that the compiler has taken the inline
code and wrapped it with the same prolog and epilog code
that it would output around a normal "C" function.
static unsigned int getValue( void )
{
asm volatile
("
ldr r0, .myValue
mov sp, r7
pop {r7, pc}
.align 2
.myValue:
.word 0xFEEDFACE
");
}
You can write this because you know what prolog code would be generated, so you can write your own
epilog.
I would not advise this approach, for the following reason. Running
$(GCC) -O3 -S -mthumb -o test2_opt.s test2.c
should generate test2_opt.s, an assembler version of the test2.c file, BUT this time we have told gcc to fully optimise the code (-O3)
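.align 2
.thumb_func
.type getValue,function
getValue:
ldr r0, .myValue
b .exit_this_function
.align 2
.myValue:
.word 0xFEEDFACE
.align 2
.exit_this_function:
.code 16
bx lr
.Lfe5:
.size getValue,.Lfe5-getValue
At first glance you would be forgiven for thinking that the compiler has gone against your wishes and plonked the CPU into ARM mode somewhere in main before calling this. It has not: the optimiser has simply dropped the push/mov prolog, so the epilog is now a bare bx lr. A hand-written epilog of mov sp, r7 and pop {r7, pc} would pop values that were never pushed.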
gcc -S -mthumb -o param1.s param1.c
gcc -O3 -S -mthumb -o param1_opt.s param1.c
then look into param1.c, param1.s and param1_opt.s and you will see examples of leaf functions before and after optimisation, with differing prologs and epilogs due to the optimisations that the compiler could perform. I have no idea yet why gcc uses mov pc, lr in ARM mode and bx in Thumb mode, as I did not have either in interwork mode and Thumb has a mov pc, lr instruction
ldr r0, thumb_func_addr+1
mov lr, pc
bx r0
... rest of arm code ...
This works because pc is 8 bytes ahead of the current instruction, and as long as the Thumb sub-routine returns with a bx instruction all is o.k. If it uses a pop {r4, pc} instruction then things will go pear-shaped.
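The arithmetic behind "pc is 8 bytes ahead" can be sketched in plain C. This is only a model of the addresses involved, assuming 4-byte ARM instructions and a mov lr, pc placed directly before the bx; the function names are invented for the illustration:

```c
#include <stdint.h>

/* In ARM state, an instruction that reads pc sees its own address + 8. */
uint32_t arm_pc_as_read(uint32_t instr_addr)
{
    return instr_addr + 8u;
}

/* Value left in lr by a "mov lr, pc" located at mov_addr. */
uint32_t arm_lr_after_mov_lr_pc(uint32_t mov_addr)
{
    return arm_pc_as_read(mov_addr);
}

/* Address of the instruction after the "bx r0" that follows the mov:
   mov at mov_addr, bx at mov_addr + 4, next instruction at mov_addr + 8. */
uint32_t arm_addr_after_bx(uint32_t mov_addr)
{
    return mov_addr + 2u * 4u;
}
```

The two results agree for any mov_addr, so the lr captured by the mov already points at the first instruction of "rest of arm code".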
ldr r0, arm_func_addr
mov lr, pc
bx r0
... rest of thumb code ...
As you see, the bottom bit of the link register is set to the correct mode for the return.
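Both call sequences hinge on how bx treats the register it is given: bit 0 picks the instruction set and the remaining bits give the address, which is why the first sequence loads thumb_func_addr+1. Here is a minimal model in C; decode_bx and struct bx_target are names made up for this sketch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of what "bx rN" does with the value held in rN. */
struct bx_target {
    bool     thumb;    /* true: continue in Thumb state, false: ARM state */
    uint32_t address;  /* where execution continues */
};

struct bx_target decode_bx(uint32_t reg_value)
{
    struct bx_target t;
    t.thumb   = (reg_value & 1u) != 0u;  /* bit 0 selects the state */
    t.address = reg_value & ~1u;         /* bit 0 is not part of the address */
    return t;
}
```

The same rule applies to the bx lr used to return, so whatever ends up in lr must carry the right bottom bit for the caller.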