ARM Assembly #14 - Using the BL Instruction That Automatically Saves the Return Address
The BL instruction is an extended version of the B instruction described in the previous post, adding function-call capability.
While the B instruction simply jumps to a target label, the BL instruction automatically stores the return address in the LR (Register R14) so execution can return after the branch.
For this reason, BL is a core mechanism for implementing function calls in ARM and forms the foundation for concepts like the stack, calling conventions, and frame pointers.
Why We Need to Store the Return Address
When a function finishes executing, it must return to the point where it was called. If the return address is not stored, the CPU won’t know which instruction to execute next, so the address must be saved somewhere.
The BL Instruction
bl label
The syntax looks identical to b label, but the internal behavior is different.
When executing bl label, the CPU jumps to the label and simultaneously stores the address of the next instruction (the return address) into LR.
After the subroutine finishes, the value stored in LR is loaded back into PC to restore the original control flow.
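To see why the saved address matters, here is a minimal sketch in which the same subroutine is called from two different places. The helper name add_ten and the use of R4 and R5 are assumptions made for illustration and are not part of the original example. With a plain b there would be no way for the subroutine to know which call site to go back to; with bl, each call leaves its own return address in LR.

```asm
    .text
    .global _start
_start:
    mov r0, #1
    bl  add_ten         @ LR = address of the next instruction (first call site)
    mov r4, r0          @ execution resumes here: R4 = 11
    mov r0, #2
    bl  add_ten         @ LR = address of the next instruction (second call site)
    mov r5, r0          @ execution resumes here: R5 = 12
    b   .               @ stop here

add_ten:                @ hypothetical helper, not from the original post
    add r0, r0, #10
    mov pc, lr          @ return to whichever call site set LR
```

The subroutine body is identical for both calls; only the value in LR differs.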
The LR (Link Register)
ARM uses the LR as a dedicated register to temporarily store the return address when making function calls.
In ARMv4, register R14 is used as the LR.
bl foo
mov r0, #2 @ LR holds the address of this instruction.
When the subroutine finishes, execution must return using the value stored in LR by writing it into PC.
mov pc, lr
Since plain ARMv4 does not support the bx instruction, returning is done using mov pc, lr. The bx instruction first appeared in ARMv4T, and from ARMv5 onward bx lr is the standard return instruction.
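For comparison, this is roughly how the same kind of subroutine looks on a core that does support bx. It is a sketch under the assumption that you assemble with GNU as for ARMv4T or later; the .arch line is that assumption, not something the original post uses.

```asm
    .arch armv4t            @ assumption: target core has bx (ARMv4T or later)
    .text
    .global _start
_start:
    mov r0, #2
    bl  foo
    mov r1, r0              @ R1 = 5
    b   .
foo:
    add r0, r0, #3
    bx  lr                  @ copy LR into PC, same effect here as mov pc, lr
```

The call side is unchanged: bl still writes the return address into LR, and only the return instruction differs.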
Example Code
.text
.global _start
_start:
mov r0, #2
bl foo
mov r1, r0
b .
foo:
add r0, r0, #3
mov pc, lr
Execution Flow
mov r0, #2 → R0 = 2
bl foo → PC = foo, LR = address of mov r1, r0
add r0, r0, #3 → R0 = 5
mov pc, lr → PC jumps to mov r1, r0
mov r1, r0 → runs next
This example shows a simple flow where the subroutine foo adds 3 to R0 and returns the result to be stored in R1.
Without mov pc, lr, execution would run past the end of foo instead of returning, so mov r1, r0 would never execute and the result would never be written to R1.
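The example works because foo does not call anything else. Since BL always writes to the same register, a second bl inside a subroutine overwrites the return address of the first. The sketch below shows one common way to handle that: save LR on the stack before the nested call. The subroutine name outer, the stack_top label, and the 64-byte stack area are assumptions for illustration, and the code assumes .bss ends up in writable memory on your target.

```asm
    .text
    .global _start
_start:
    ldr sp, =stack_top  @ assumption: set up a small stack ourselves
    mov r0, #2
    bl  outer           @ LR = address of "mov r1, r0"
    mov r1, r0          @ R1 = 6
    b   .

outer:                  @ hypothetical subroutine that calls foo
    stmfd sp!, {lr}     @ save our return address; the bl below overwrites LR
    bl  foo             @ LR = address of the add instruction below
    add r0, r0, #1      @ R0 = 6
    ldmfd sp!, {pc}     @ load the saved return address straight into PC

foo:
    add r0, r0, #3      @ R0 = 5
    mov pc, lr

    .bss
    .align 3
    .space 64           @ hypothetical 64-byte stack area
stack_top:
```

If outer skipped the stmfd/ldmfd pair and returned with mov pc, lr, LR would still point at its own add instruction, and the subroutine would loop instead of returning to _start.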
Debugging
(gdb) target remote :1234
(gdb) display/i $pc
(gdb) display/i $lr
If you single-step over bl foo (for example with stepi) and inspect PC and LR right afterwards, you can see that BL has stored the next instruction's address in LR.
In particular, display/i $lr disassembles the instruction at the address held in LR; if it shows mov r1, r0, BL has saved the return address as described.
Conclusion
In the next post, we'll examine condition codes for conditional execution. So far, every instruction has executed unconditionally, but ARM lets you append a two-letter condition code so that an instruction runs only under specific conditions.