ARM Assembly — Pete's Tech Blog

ARM Loops and Stack Intro

The Stack

The stack is a region of RAM that the operating system reserves for a program. The program uses it mainly to organize and store local variables and function arguments that do not fit in the available CPU registers.

The maximum stack size for a program is determined by the operating system. In Linux, the default maximum stack size, in kilobytes (units of 1024 bytes), can be printed with:

ulimit -s

On the system used here the stack limit is 8 MB, which the command reports as 8192.

We will examine the stack in greater detail in future sections, but for now understand these characteristics:

The stack is a linear data structure that follows a Last-In, First-Out (LIFO) principle
The last element added is always the first to be removed
New data can be "pushed" onto the stack or "popped" off the stack
The stack "grows down" in memory: each push moves the top to a lower address, which can be confusing because the "top" of the stack always has the lowest memory address in use
The sp register stores the memory address for the top of the stack
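
The push and pop behaviour listed above can be tried in isolation. The following is a minimal sketch, not part of the program discussed in this post; it assumes 32-bit ARM Linux and the GNU assembler, and the register choices and values are arbitrary:

```asm
.section .text
.global _start

_start:
		mov r0, #0x1
		mov r1, #0x2

		push {r0}          /* sp = sp - 4, then store 1 at [sp] */
		push {r1}          /* sp = sp - 4, then store 2 at [sp]; 2 is now on top */

		pop {r2}           /* load [sp] into r2 (2, the last value pushed), then sp = sp + 4 */
		pop {r3}           /* load [sp] into r3 (1, the first value pushed), then sp = sp + 4 */

		/* Two pushes and two pops: sp is back at its starting address */

		mov r0, r2         /* use the first value popped as the exit status */
		mov r7, #0x1       /* exit syscall */
		svc #0
```

Once assembled, linked, and run, the program should exit with status 2 (check with echo $?), because the last value pushed is the first one popped.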

Loops

A loop is a simple logical construct that repeatedly executes a block of instructions until a condition is met. To demonstrate this, we will write a program that executes a block of code 10 times. On each pass the code prints the loop counter, showing which iteration it is on, and it uses the stack to do so:

	.section .rodata 
/* Linux Syscall constants */
	b_STDOUT = 0x01 
	b_WRITE  = 0x04 

/* Offset added to a value from 0 to 9 to get the ASCII code of that digit */
	b_ASCII_OFFSET = 0x30

.section .data
		begin_msg: 
		.ascii "Starting while loop:\n" 
		len_begin_msg = ( . - begin_msg)

		end_msg:
		.ascii "Loop ended.\n"
		len_end_msg = ( . - end_msg)


.section .text               
.global _start

_start:


print_begin_msg:
		ldr r7, =b_WRITE 
		ldr r0, =b_STDOUT 
		ldr r1, =begin_msg 
		ldr r2, =len_begin_msg
		svc #0

mov r3, #0x0 
/*
		This sets r3 to 0 so that it can serve as the counter for the loop.
		The choice of r3 is arbitrary; any general-purpose register would do, but r3 is the first
		register not used by the write syscall, which is called inside the loop.
*/
begin_while:

		print_counter:
				ldr r7, =b_WRITE
				ldr r0, =b_STDOUT
				ldr r1, =b_ASCII_OFFSET /* Start with a value of 0x30 */
				add r1, r1, r3 
				/* 
						Add our counter value to 0x30 to get the ASCII code of the digit for the counter.
						0x30 is the ASCII code for the character 0, 0x31 is the code for 1, and so on.
				*/
				orr r1, r1, #0x0a00
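				/*
						The orr instruction performs a bitwise OR between a register and a second operand
						(a register or an immediate value). Here it places 0x0a, the ASCII newline character,
						in the second-lowest byte of r1, so r1 now holds 0x00000a30 plus the counter.
						On a little-endian system (the default for ARM Linux), the lowest byte is stored at the
						lowest address, so in memory the digit comes first and the newline directly after it.
				*/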

				push {r1}
				/* 
						The push instruction stores the values of a list of registers on the stack.
						The values are placed on the stack in order of register number, so the
						lowest-numbered register ends up at the top of the stack (the lowest address)
						and the highest-numbered register at the bottom.

						For a single register, the same operation can be written in other forms:
				
						sub sp, sp, #4
						str r1, [sp]
				
						This subtracts 4 bytes from the stack pointer address, then it stores (str) the value of 
						r1 at the new stack pointer address.

						str r1, [sp, #-4]!
				
						This does the same thing in a single instruction: it stores r1 at the stack pointer
						address minus 4 bytes, and the ! character writes that new address back to sp.
				*/

				mov r1, sp
				/*
						The stack pointer holds the memory address of the last data placed on the stack.
						The last data placed on the stack was the value in r1, which contains our
						two ASCII character codes. This instruction copies the address of that stack
						location into r1. This is necessary because the write syscall takes the memory
						address of the string to write as its argument, not the actual value.
				*/
				mov r2, #0x2
				/*
						We will set arg3 for the syscall to 2 bytes, because we will print both the number character and the newline character.
				*/
				svc #0
				add sp, sp, #0x4
				/*
						This will move the stack pointer back up to its original position.
						This will allow us to overwrite the previous characters every time the loop runs.
						If we did not include this instruction, the stack would continue to grow every time the
						loop ran.
				*/

				cmp r3, #0x9
				/*
						This instruction subtracts the immediate value 9 from the value in register r3.
						The compare instruction (cmp) discards the result of the subtraction, but it updates the
						condition flags in the cpsr:
						
						If the values are equal, the zero (Z) flag is set to 1. 
						If the result of the subtraction is negative, the negative (N) flag is set to 1. 
						The carry (C) and overflow (V) flags are also set based on the result.

						cmp sets the flags in the same way as:
						subs r0, r3, #0x09
						
						The subtract and set flags (subs) instruction performs the same subtraction as cmp, but it
						also stores the result in its destination register, so the line above would overwrite r0.
						Written that way in this program it would do no harm, because r0 is reloaded before each syscall.
						For a simple comparison the stored result is not needed, but if we wanted to compare the
						values and keep the difference in r1, we could write:

						subs r1, r3, #0x09
						*/
				bge end_while
				/*
						The branch if greater than or equal (bge) instruction checks the cpsr flags set by cmp.
						It treats the compared values as signed and branches when the negative (N) and overflow (V)
						flags are equal, which is the case when r3 is greater than or equal to 9.
						When r3 equals 9 the subtraction gives 0, so N and V are both clear (and Z is set) and the
						branch is taken by setting the pc to the end_while label's memory address.
						While r3 is less than 9 the result is negative with no overflow, so N is set and V is clear,
						and execution falls through to the next instruction.
				*/
		add r3, r3, #0x01 /* Increment our counter by 1 */
b begin_while 
/* 
		The branch (b) instruction is unconditional: it always sets the pc to the address of
		the begin_while label, and execution continues from there.
*/

end_while:

print_end_msg:
		ldr r7, =b_WRITE 
		ldr r0, =b_STDOUT 
		ldr r1, =end_msg 
		ldr r2, =len_end_msg
		svc #00000000

exit_normally:

		mov r7, #0x00000001    /* exit syscall number */
		mov r0, #0x00000000    /* exit status 0 */
		svc #0x00000000
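
The orr, push, and mov r1, sp steps in the program rely on how a 32-bit word is laid out in memory. The following is a minimal sketch that checks this layout on its own, assuming a little-endian 32-bit ARM Linux target and the GNU assembler; the digit 5 and the use of r0 and r2 are arbitrary choices for illustration:

```asm
.section .text
.global _start

_start:
		mov r1, #0x35            /* ASCII code for the character 5 */
		orr r1, r1, #0x0a00      /* r1 = 0x00000a35: newline in the second-lowest byte */
		push {r1}                /* the word is now stored at [sp] */

		/* Assumes little-endian memory: the lowest byte sits at the lowest address */
		ldrb r0, [sp]            /* byte at sp:     0x35 (the digit) */
		ldrb r2, [sp, #1]        /* byte at sp + 1: 0x0a (the newline) */

		add sp, sp, #0x4         /* restore the stack pointer */

		mov r7, #0x1             /* exit syscall; r0 (0x35) becomes the exit status */
		svc #0
```

On a little-endian system this should exit with status 53 (0x35), confirming that the digit is the first of the two bytes that the write syscall reads starting at sp.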

After reading through the source code and studying the comments, assemble and link it. Run it in qemu so that we can debug it with gdb.
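
The comments on cmp in the program mention that subs can keep the result of the subtraction. One place where that is useful is a loop that counts down instead of up. The following is a minimal sketch of that variation, not the form used in this post; the count_down label is a hypothetical name, and the loop body is left empty:

```asm
.section .text
.global _start

_start:
		mov r3, #0xa             /* start the counter at 10 */

count_down:
		/* the loop body would go here */
		subs r3, r3, #0x1        /* r3 = r3 - 1, and set the flags from the result */
		bne count_down           /* Z is clear while r3 is not 0, so keep looping */

		mov r0, r3               /* r3 is 0 here; use it as the exit status */
		mov r7, #0x1             /* exit syscall */
		svc #0
```

Because subs both updates the counter and sets the zero flag, no separate cmp is needed; the loop runs 10 times and falls through when r3 reaches 0.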

ARM Debugging Loops Part 1

Once we are attached to our remote program in gdb, we will open our assembly and register layouts as before. We are familiar with the write syscall already, so we will advance forward to the start of our loop. Enter:

advance print_counter

|-Register group: general------------------------------------------------------------------------------------------------------------------------------------------------|
|r0             0x15                21                                               r1             0x200ec             131308                                           |
|r2             0x15                21                                               r3             0x0                 0                                                |
|r4             0x0                 0                                                r5             0x0                 0                                                |
|r6             0x0                 0                                                r7             0x4                 4                                                |
|r8             0x0                 0                                                r9             0x0                 0                                                |
|r10            0x200ec             131308                                           r11            0x0                 0                                                |
|r12            0x0                 0                                                sp             0x408009c0          0x408009c0                                       |
|lr             0x0                 0                                                pc             0x1008c             0x1008c <print_counter>                          |
|cpsr           0x10                16                                               fpscr          0x0                 0                                                |
|fpsid          0x410430f0          1090793712                                       fpexc          0x40000000          1073741824                                       |
|AFSR0_EL1      0x0                 0                                                AFSR1_EL1      0x0                 0                                                |
|DBGDIDR        0x3515f021          890630177                                        DBGDSAR        0x0                 0                                                |
|DBGBVR         0x0                 0                                                DBGBCR         0x0                 0                                                |
|DBGWVR         0x0                 0                                                DBGWCR         0x0                 0                                                |
|PAR            0x0                 0                                                DBGBVR         0x0                 0                                                |
|DBGBCR         0x0                 0                                                DBGWVR         0x0                 0                                                |
|DBGWCR         0x0                 0                                                TEECR          0x0                 0                                                |
|MIDR_EL1       0x412fc0f1          1093648625                                       CTR            0x8444c004          -2075869180                                      |
|TCMTR          0x0                 0                                                TTBR0_EL1      0x0                 0                                                |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|    0x10074 <_start>                mov     r7, #4                                                                                                                      |
|    0x10078 <_start+4>              mov     r0, #1                                                                                                                      |
|    0x1007c <_start+8>              ldr     r1, [pc, #96]   ; 0x100e4 <exit_normally+12>                                                                                |
|    0x10080 <_start+12>             mov     r2, #21                                                                                                                     |
|    0x10084 <_start+16>             svc     0x00000000                                                                                                                  |
|    0x10088 <_start+20>             mov     r3, #0                                                                                                                      |
|  > 0x1008c <print_counter>         mov     r7, #4                                                                                                                      |
|    0x10090 <print_counter+4>       mov     r0, #1                                                                                                                      |
|    0x10094 <print_counter+8>       mov     r1, #48 ; 0x30                                                                                                              |
|    0x10098 <print_counter+12>      add     r1, r1, r3                                                                                                                  |
|    0x1009c <print_counter+16>      orr     r1, r1, #2560   ; 0xa00                                                                                                     |
|    0x100a0 <print_counter+20>      push    {r1}            ; (str r1, [sp, #-4]!)                                                                                      |
|    0x100a4 <print_counter+24>      mov     r1, sp                                                                                                                      |
|    0x100a8 <print_counter+28>      mov     r2, #2                                                                                                                      |
|    0x100ac <print_counter+32>      svc     0x00000000                                                                                                                  |
|    0x100b0 <print_counter+36>      add     sp, sp, #4                                                                                                                  |
|    0x100b4 <print_counter+40>      subs    r0, r3, #9                                                                                                                  |
|    0x100b8 <print_counter+44>      bge     0x100c4 <print_end_msg>                                                                                                     |
|    0x100bc <print_counter+48>      add     r3, r3, #1                                                                                                                  |
|    0x100c0 <print_counter+52>      b       0x1008c <print_counter>                                                                                                     |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
remote Thread 1.30460 In: print_counter                                                                                                                 L??   PC: 0x1008c
(gdb) lay reg
(gdb) advance print_counter
0x0001008c in print_counter ()
(gdb)