ARM Assembly

ARM Assembly Introduction
Now that you have learned some assembly fundamentals, it is time to examine a different architecture.
ARM processors are an extremely popular choice for devices such as smartphones, tablets, TVs, routers, IoT systems, and other embedded devices. In this section we will examine the basic ARM 32-bit architecture, write a Hello World program, cross-assemble it, link it, run it by emulating an ARM processor on our x86_64 machine, and debug it with GDB.
ARM Tool Installation
For this section you will need to install GCC for ARM, GDB for multiple architectures, and the QEMU user-mode tools for emulation.
To install the necessary packages enter:
sudo apt install gcc-arm-linux-gnueabihf gdb-multiarch qemu-user
ARM Registers
While we haven't examined the x86_64 architecture yet, you will discover that the 32-bit implementation of ARM is much more similar to its 64-bit counterpart than the x86/i386 architecture is to its 64-bit x86_64/AMD64 counterpart. This is because ARM has kept its 32-bit and 64-bit implementations much closer to each other as the architecture developed. We will first be examining the common 32-bit registers used by ARMv7 or ARMv8 (when operating in 32-bit mode). As with Intel for the x86 processor, extensive documentation for the ARM architecture is available here.
It should also be noted that ARM divides its processors into 3 profiles:
  1. a - Application profile, used for general purpose computing
  2. m - Microcontroller profile, used for small low-power applications such as sensors
  3. r - Real-time profile, used in applications that require predictable and consistent timing

ARM uses a version number to refer to the major revision of the architecture and instruction sets, such as v7, v8, v9, etc. Different versions may support either 32-bit or 64-bit operations, or both.

32-bit ARMv7 registers can be broken down as follows:

General purpose registers:
  • r0 to r12
Special function registers:
  • r13 sp (stack pointer) (equivalent to x86 esp)
  • r14 lr (link register) (holds the return address of a function call, which x86 stores on the stack instead)
  • r15 pc (program counter) (equivalent to x86 eip)
Program status registers (similar to the x86 eflags register):
  • cpsr (current program status register)
  • spsr (saved program status register)
Floating point registers:
  • 32-bit (float): s0 to s31
  • 64-bit (double): d0 to d15

The flags for the cpsr are shown below:
bit(s)         flag
0x1F           N
0x1E           Z
0x1D           C
0x1C           V
0x1B           Q
0x1A - 0x18    00
0x17           SSBS
0x16           PAN
0x15           DIT
0x14           00
0x13 - 0x10    GE
0x0F - 0x0A    00
0x09           E
0x08           A
0x07           I
0x06           F
0x05           T
0x04           00
0x03 - 0x00    M

Below are the cpsr flag functions:
  • N (Negative) flag: indicates whether the result of the last flag-setting operation was negative (1) or not (0)
  • Z (Zero) flag: indicates whether the result of the last flag-setting operation was zero (1) or not zero (0)
  • C (Carry) flag: indicates whether there was a carry (1) or not (0) during the last arithmetic operation. For a subtraction or comparison, it is 1 when no borrow occurred
  • V (Overflow) flag: indicates whether a signed overflow occurred (1) or not (0) during the last arithmetic operation
  • Q (Saturation) flag: indicates whether saturation occurred (1) or not (0) during a saturating operation
  • SSBS (Speculative Store Bypass Safe) flag: indicates whether speculative store bypassing is permitted (1) or not (0)
  • PAN (Privileged Access Never) flag: indicates whether privileged code is blocked from accessing memory that is accessible in User mode (1) or not (0)
  • DIT (Data Independent Timing) flag: indicates whether the processor must execute instructions with timing that is independent of the data being processed (1) or not (0)
  • GE (Greater than or Equal) flags: set by the parallel (SIMD) add and subtract instructions to record the results for individual bytes or halfwords
  • IT (If-Then) flags: indicate the execution state of the Thumb If-Then instruction
  • J (Jazelle) flag: indicates whether the processor is executing in Jazelle (Java support) mode (1) or not (0)
  • E (Endianness) flag: indicates the endianness used for data accesses, either little-endian (0) or big-endian (1)
  • A (Asynchronous abort mask) flag: indicates whether asynchronous aborts are masked (1) or not (0). Despite the letter, this is not an auxiliary carry flag like the x86 AF
  • I (IRQ mask) flag: indicates whether normal hardware interrupts (IRQs) are masked (1) or will be processed (0)
  • F (FIQ mask) flag: indicates whether fast interrupts (FIQs) are masked (1) or will be processed (0)
  • T (Thumb) flag: indicates the execution state of the processor, either Thumb (1) or ARM (0)
  • M (Processor mode) flags: indicate the current processor mode, such as User, System, FIQ, IRQ, Supervisor, Abort, Undefined, or Monitor
While there are unique flags stored by ARM32 in the cpsr register, four of them correspond closely to flags in the x86 eflags register, and the interrupt flag has a counterpart with the opposite polarity:
Flag in ARM    Flag in x86    Flag
N              SF             Negative (sign)
Z              ZF             Zero
C              CF             Carry (after a subtraction, ARM clears C on a borrow while x86 sets CF)
V              OF             Overflow
I              IF             Interrupt (ARM: 1 masks interrupts, x86: 1 enables interrupts)

The spsr is used to save the state of the cpsr register when the processor takes an exception and changes privilege modes. This frees the cpsr to hold flags for the current state and allows the previous state to be restored later.
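As a minimal sketch of how the single-bit cpsr flags can be read out of a 32-bit register value, the Python snippet below shifts and masks each bit position. The bit positions follow the cpsr layout given in this section; the names CPSR_FLAG_BITS, decode_cpsr, and set_flags are made up for this illustration, and 0x80000010 is only a sample value.

```python
# Bit positions of the single-bit flags, per the cpsr layout in this section.
# The multi-bit GE and M fields are left out to keep the sketch short.
CPSR_FLAG_BITS = {
    "N": 31, "Z": 30, "C": 29, "V": 28, "Q": 27,
    "E": 9, "A": 8, "I": 7, "F": 6, "T": 5,
}


def decode_cpsr(value):
    """Return a dict mapping each flag name to 0 or 1."""
    return {name: (value >> bit) & 1 for name, bit in CPSR_FLAG_BITS.items()}


def set_flags(value):
    """Return the names of the flags that are set, highest bit first."""
    return [name for name, bit in CPSR_FLAG_BITS.items() if (value >> bit) & 1]


# Sample value: bit 31 is set, so N is reported.
print(set_flags(0x80000010))
```

Shifting the value right and masking with 1 is the same idea as testing the value against (1 << bit); either form works for a single flag.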

ARM Hello World
We are now ready to write a hello world program for ARM. We will build upon what we have already learned from our x86 hello world, and note the differences for GNU ARM assembly.

We will be using the Linux syscall table references again, this time for 32-bit ARM.

.section .rodata
/* The .rodata section will be stored as read-only in memory.
   This section is included by GAS in the overall .data section,
   but it is flagged as read-only */

b_STDOUT = 0x01   /* This defines b_STDOUT as a byte sized constant with a value of 0x01 */
b_WRITE  = 0x04   /* This defines b_WRITE as a byte sized constant with a value of 0x04 */

.section .data

hello_msg:
    .ascii "Hello World!\n"
end_hello_msg:

len_hello_msg = (end_hello_msg - hello_msg)
/* This declares a symbol len_hello_msg and assigns it the difference
   between the end_hello_msg label address and the hello_msg label address.
   Parentheses are not necessary in this instance,
   len_hello_msg = end_hello_msg - hello_msg would evaluate the same */

unused_label:
    .hword 0xbeef
/* This label is here to illustrate how GAS stores label addresses for
   ARM assembly that are not assigned during the program execution, vs.
   those that are.
   Note: The size of a word depends on the processor architecture. For
   ARM32 a word is 32 bits (4 bytes), so to store 2 bytes of data, we use
   the half-word (.hword) directive */

.section .text
.global _start

_start:
/* Write "At start" and "Hello World!" to stdout
   Write syscall reference:
   r7      r0 (arg0)          r1 (arg1)          r2 (arg2)
   0x04    unsigned int fd    const char *buf    size_t count */

print_start_msg:
    ldr r7, =b_WRITE
    /* The load register (ldr) instruction is similar to the lea
       instruction for the x86 processor in that it loads a calculated
       memory address or immediate value into a register.
       Like the eax register for x86, r7 is used to determine the syscall
       function for ARM.
       Using the = character with ldr is an ARM specific pseudo-instruction
       that specifies a symbol name which represents a constant value or an
       address. The assembler will determine the type of value and modify
       the instruction to either load a relative memory address or an
       immediate value.
       For this instruction, ldr will load an immediate value into r7
       because b_WRITE is a constant and not the label for a memory address.
       For more information on this instruction, refer to this reference:
       https://developer.arm.com/documentation/dui0041/c/Babbfdih */

    ldr r0, =b_STDOUT   /* Another constant value loaded for the FD */

    adr r1, start_msg
    /* Address (adr) loads the address of a label into a register.
       The major functional difference between ldr and adr is that adr can
       only reference nearby labels in the same section as the instruction
       (here, the .text section), while ldr can resolve addresses and
       values from any section.
       While both ldr and adr could load addresses from labels in the .text
       section, adr is more efficient for this specific task and should be
       used for that purpose */

    ldr r2, =len_start_msg
    /* This will resolve to the value of len_start_msg, and load it into r2 */

    svc #00000000
    /* When writing ARM assembly for GAS, the # character is used to prefix
       an immediate value.
       This SuperVisor Call (svc) is similar to the int 0x80 call for the
       x86. It will initiate the execution of the syscall by calling a
       system interrupt. svc creates an exception and passes the immediate
       value to the exception handler.
       In earlier versions of ARM svc was called swi (SoftWare Interrupt),
       but they are effectively the same */

write_hello_msg:
    ldr r7, =b_WRITE
    ldr r0, =b_STDOUT
    ldr r1, =hello_msg
    /* For this instruction, hello_msg is a label for a memory address
       located in the .data section. Using the ldr pseudo-instruction, the
       assembler will create a literal value at the end of the .text
       section to store the label address in. It will then reference the
       memory location for that literal value and load it into r1.
       It uses the Program Counter (PC) register as a base address and
       offsets from PC to the address.
       This is essentially what we did with the adr instruction, except the
       assembler is copying the address for the label in .data and placing
       it in the .text section to load. */
    ldr r2, =len_hello_msg
    svc #0x00000000

exit_normally:
/* exit syscall reference:
   r7      r0 (arg0)
   0x01    int error_code */
    mov r7, #0x00000001
    /* Like with x86 assembly, mov can be used to load an immediate value
       into a register */
    mov r0, #0x00000000
    svc #0x00000000

/* The following section of code was added to show how you can also place
   variables and labels for data in the .text section after your code.
   They must be placed after your code, because they are not executable
   instructions. They should never be reached by your program's normal
   execution or it will crash. */
start_msg:
    .ascii "At start\n"
len_start_msg = . - start_msg
/* In GAS, the . character is used to reference the current position in
   memory, so instead of creating the label end_start_msg and writing
   "len_start_msg = (end_start_msg - start_msg)" we can just write this
   as shorthand. */
ARM Assemble Link and Run
Once you have copied the code and have thoroughly read through the comments, it is time to make it executable. GNU provides a cross-assembler for the ARM instruction set which is included in the gcc-arm-linux-gnueabihf package. To assemble the source, navigate to the directory where the source is saved and enter the command:
arm-linux-gnueabihf-as -o hello_arm32.o hello_arm32.asm
This command assumes that you saved the source file as hello_arm32.asm. It directs GAS to create a 32-bit object file named hello_arm32.o and to use the hello_arm32.asm file as an input source.
To link the object file into an executable named hello_arm32, enter the command:
arm-linux-gnueabihf-ld -o hello_arm32 hello_arm32.o
Now that the executable is created, we will use Quick Emulator (QEMU) to run it on our x86_64 system by emulating an ARM processor. Enter the command:
qemu-arm hello_arm32

It should produce the output:
At start
Hello World!

ARM Debugging in GDB
QEMU has an option that allows GDB to connect to it over a network socket. To run our program in QEMU as a GDB server enter:
qemu-arm -g 2345 hello_arm32 &
This will launch our program in the background with QEMU and bind to port 2345. The port number 2345 is arbitrary and can be changed to any free port you want to bind to.

Once QEMU is running, we will launch the multi-architecture build of GDB, open our binary so that GDB can read its symbols, and then connect to the running process in QEMU. To do so enter:
$ gdb-multiarch
(gdb) file hello_arm32
(gdb) target remote localhost:2345

You should see an output similar to the following:
pete@framework16:~/Documents/ASM/hello_world/ARM32$ gdb-multiarch
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) file hello_arm32
Reading symbols from hello_arm32...
(No debugging symbols found in hello_arm32)
(gdb) target remote localhost:2345
Remote debugging using localhost:2345
0x00010074 in _start ()
(gdb)
Notice that we do not need to set a breakpoint and run the program, because QEMU has already loaded it and is holding it at the _start label, waiting for GDB.

We can now open our layouts with:
lay asm
lay reg
You should now have the familiar layout of registers, assembly, and commands.
Note, there will be no register values loaded as we haven't stepped into an instruction yet.
Let's examine our first instruction:
| > 0x10074 <_start> mov r7, #4
The assembly source was ldr r7, =b_WRITE, but because b_WRITE was a constant value, the assembler translated this to just moving its immediate value into the register.
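One condition for this translation is that the constant can be encoded inside a single ARM instruction. In ARM state, a data-processing immediate is an 8-bit value rotated right by an even number of bits. The Python sketch below checks that rule; the function names are made up for this illustration, and it ignores other choices the assembler may have (such as using a different instruction), so treat it as a necessary condition rather than a full model of GAS.

```python
MASK32 = 0xFFFFFFFF


def rotate_left(value, amount):
    """Rotate a 32-bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32


def fits_arm_immediate(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= MASK32
    # Rotating left by the same even amount undoes the rotate-right,
    # so a match leaves a value that fits in 8 bits.
    return any(rotate_left(value, rot) < 0x100 for rot in range(0, 32, 2))


print(fits_arm_immediate(0x04))     # the b_WRITE constant
print(fits_arm_immediate(0x200bc))  # a sample address-sized value
```

A value that fails this check cannot be placed in a single mov and has to come from somewhere else, such as a word stored in memory near the code.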
The next instruction is the same as the first, so let's step into our instructions until we reach the third line:
|-Register group: general----------------------------------------------------------|
|r0 0x1 1    r1 0x40800b39 1082133305    r2 0x0 0 |
|r3 0x0 0    r4 0x0 0    r5 0x0 0 |
|r6 0x0 0    r7 0x4 4    r8 0x0 0 |
|r9 0x0 0    r10 0x200bc 131260    r11 0x0 0 |
|r12 0x0 0    sp 0x408009d0 0x408009d0    lr 0x0 0 |
|pc 0x1007c 0x1007c <_start+8>    cpsr 0x10 16    fpscr 0x0 0 |
|fpsid 0x410430f0 1090793712    fpexc 0x40000000 1073741824    AFSR0_EL1 0x0 0 |
|AFSR1_EL1 0x0 0    DBGDIDR 0x3515f021 890630177    DBGDSAR 0x0 0 |
|DBGBVR 0x0 0    DBGBCR 0x0 0    DBGWVR 0x0 0 |
|DBGWCR 0x0 0    PAR 0x0 0    DBGBVR 0x0 0 |
|DBGBCR 0x0 0    DBGWVR 0x0 0    DBGWCR 0x0 0 |
|TEECR 0x0 0    MIDR_EL1 0x412fc0f1 1093648625    CTR 0x8444c004 -2075869180 |
|TCMTR 0x0 0    TTBR0_EL1 0x0 0    PMCCNTR 0x0 0 |
|TLBTR 0x0 0    TTBR1_EL1 0x0 0    MIDR 0x412fc0f1 1093648625 |
|TTBCR 0x0 0    MPIDR_EL1 0x80000000 -2147483648    TTBCR2 0x0 0 |
|REVIDR_EL1 0x0 0    MIDR 0x412fc0f1 1093648625    JIDR 0x0 0 |
|CLIDR 0xa200023 169869347    DFAR 0x0 0    WFAR 0x0 0 |
|IFAR 0x0 0    JMCR 0x0 0    AIDR 0x0 0 |
|CSSELR 0x0 0    ID_PFR2 0x10 16    VBAR 0x0 0 |
|----------------------------------------------------------------------------------|
|   0x10074 <_start>               mov r7, #4 |
|   0x10078 <_start+4>             mov r0, #1 |
| > 0x1007c <_start+8>             add r1, pc, #36 ; 0x24 |
|   0x10080 <_start+12>            ldr r2, [pc, #44] ; 0x100b4 <start_msg+12> |
|   0x10084 <_start+16>            svc 0x00000000 |
|   0x10088 <write_hello_msg>      mov r7, #4 |
|   0x1008c <write_hello_msg+4>    ldr r1, [pc, #36] ; 0x100b8 <start_msg+16> |
|   0x10090 <write_hello_msg+8>    mov r0, #1 |
|   0x10094 <write_hello_msg+12>   mov r2, #13 |
|   0x10098 <write_hello_msg+16>   svc 0x00000000 |
|   0x1009c <exit_normally>        mov r7, #1 |
|   0x100a0 <exit_normally+4>      mov r0, #0 |
|   0x100a4 <exit_normally+8>      svc 0x00000000 |
|   0x100a8 <start_msg>            ; <UNDEFINED> instruction: 0x73207441 |
|   0x100ac <start_msg+4>          ldrbtvc r6, [r2], #-372 ; 0xfffffe8c |
|   0x100b0 <start_msg+8>          andeq r0, r0, r10 |
|   0x100b4 <start_msg+12>         andeq r0, r0, r9 |
|   0x100b8 <start_msg+16>         strheq r0, [r2], -r12 |
|   0x100bc                        cfstr64vs mvdx6, [r12], #-288 ; 0xfffffee0 |
|   0x100c0                        svcvs 0x0057206f |
|----------------------------------------------------------------------------------|
remote Thread 1.383292 In: _start                              L??   PC: 0x1007c
(gdb) lay reg
(gdb) si
0x00010078 in _start ()
(gdb) si
0x0001007c in _start ()
(gdb)

Our instruction:
adr r1, start_msg
has been translated to:
add r1, pc, #36

The add instruction takes the destination register to store the result, and the two arguments to add. In this instance, the immediate value #36 is being added to the pc (program counter) register's value.
This is where things can be confusing. While GDB lists the pc register currently as 0x1007c, ARM's pc register actually stays two instructions ahead of the program, and since these are 32-bit instructions, the value of the pc register will actually be 8 bytes more than our current line (32 bits * 2 = 64 bits = 8 bytes).

While you would expect add r1, pc, #36 to store 0x100a0 (0x1007c + 36) in the register, if we step forward one instruction:
r1 0x100a8 65704
We see that 0x100a8 is in fact stored in r1.
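The arithmetic behind this can be sketched in a few lines of Python. The helper name pc_relative_target and the constant PC_READ_AHEAD are made up for this illustration; the sketch assumes 32-bit ARM instructions, where reading pc yields the address of the current instruction plus 8.

```python
# In ARM state, an instruction that reads pc sees its own address + 8
# (two 4-byte instructions ahead).
PC_READ_AHEAD = 8


def pc_relative_target(instruction_address, offset):
    """Return the address computed by 'pc + offset' at the given instruction."""
    return instruction_address + PC_READ_AHEAD + offset


# add r1, pc, #36 located at 0x1007c
print(hex(pc_relative_target(0x1007c, 36)))  # 0x100a8
```

The same helper applies to any pc-relative operand in the listing, for example the ldr r2, [pc, #44] at 0x10080, which the disassembler annotates with 0x100b4.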

If we look further down our assembly layout, we can see that it is our start_msg label:
0x100a8 <start_msg> ; <UNDEFINED> instruction: 0x73207441
Notice that the disassembler is attempting to interpret the data as instructions. This is because the data resides in the .text section with our code, but it does not contain valid assembly instructions.
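To see where the value 0x73207441 comes from, the Python sketch below packs the bytes of the "At start\n" string into little-endian 32-bit words, which is how the disassembler reads them. The function name as_words is made up for this illustration, and the zero padding of the final word is an assumption of the sketch.

```python
def as_words(data):
    """Split bytes into little-endian 32-bit words, zero-padding the last one."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [int.from_bytes(padded[i:i + 4], "little")
            for i in range(0, len(padded), 4)]


# 'A' = 0x41, 't' = 0x74, ' ' = 0x20, 's' = 0x73
for word in as_words(b"At start\n"):
    print(hex(word))
```

The first word is the "instruction" shown at the start_msg label: the four characters "At s" with the first character in the lowest byte.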

Now let's examine our ldr instruction:
> 0x10080 <_start+12> ldr r2, [pc, #44]
Our original instruction was:
ldr r2, =len_start_msg
This was resolved by the assembler to:
ldr r2, [pc, #44]

len_start_msg is a variable symbol, so =len_start_msg will evaluate to loading the value for that symbol.
pc, #44 takes the current pc register value and adds 44 to it.
[pc, #44] evaluates the data at that address and loads it to the destination register r2.
We know that the pc value will be two instructions ahead, so if we add 0x10080 + 8 + 44:
(gdb) print/x (0x10080 + 8 + 44)
$2 = 0x100b4
And we know that the length of "At start\n" should be 9 bytes, so 0x09 should be stored at 0x100b4:
(gdb) x /1xb $2
0x100b4 <start_msg+12>: 0x09
And we can see that 0x09 is indeed stored at that location.
Notice when we printed our address calculation, GDB automatically stored it in the variable $2 to allow for easy referencing.

Let's step forward in our program to the next ldr instruction:
0x1008c <write_hello_msg+4> ldr r1, [pc, #36] ; 0x100b8 <start_msg+16>

For this instruction, the square brackets [] mean that the value is loaded from memory, at the address given by the pc register plus an offset of 36 bytes.
This should evaluate to the value stored at:
(gdb) print /x (0x1008c + 8 + 36)
$3 = 0x100b8

What is stored at 0x100b8 ?
(gdb) x /4xb $3
0x100b8 <start_msg+16>: 0xbc 0x00 0x02 0x00

This is the memory address 0x000200bc in little endian.
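The byte-order conversion can be checked with a short Python sketch. The function name read_word is made up for this illustration; int.from_bytes is part of the Python standard library.

```python
def read_word(raw, byteorder="little"):
    """Interpret 4 raw bytes as one 32-bit unsigned value."""
    return int.from_bytes(raw, byteorder)


# The four bytes in the order GDB printed them (lowest address first).
raw = bytes([0xbc, 0x00, 0x02, 0x00])

print(hex(read_word(raw, "little")))  # least significant byte first
print(hex(read_word(raw, "big")))     # the same bytes read the other way
```

Reading the bytes as little-endian gives the address, while reading them as big-endian gives an unrelated number, which is why the byte order matters when examining memory one byte at a time.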
Where is this address?
(gdb) info file
Symbols from "/home/pete/Documents/ASM/hello_world/ARM32/hello_arm32".
Remote target using gdb-specific protocol:
        `/home/pete/Documents/ASM/hello_world/ARM32/hello_arm32', file type elf32-littlearm.
        Entry point: 0x10074
        0x00010074 - 0x000100bc is .text
        0x000200bc - 0x000200cb is .data
        While running this, GDB does not access memory from...
Local exec file:
        `/home/pete/Documents/ASM/hello_world/ARM32/hello_arm32', file type elf32-littlearm.
        Entry point: 0x10074
        0x00010074 - 0x000100bc is .text
        0x000200bc - 0x000200cb is .data
(gdb)
We can see it is in our data section:
(gdb) x /13cb 0x000200bc
0x200bc: 72 'H' 101 'e' 108 'l' 108 'l' 111 'o' 32 ' ' 87 'W' 111 'o'
0x200c4: 114 'r' 108 'l' 100 'd' 33 '!' 10 '\n'
And there is our Hello World! message.

The assembler retrieved the address of our hello_msg label from the .data section,
then it appended that address value to the end of our .text section of code,
then it loaded that address into the r1 register by offsetting from the pc register
to the memory address in the .text section that contained the memory address for the actual data.
ARM Loops and Stack Intro

The Stack

The stack is a special area of RAM that is reserved for a program by the Operating System. It is used primarily as memory that the program can use to organize and store local variables and function arguments which require more space than can be stored in available CPU registers.

The maximum stack size for a program is determined by the operating system. In Linux, the default maximum stack size in kilobytes can be output with:
ulimit -s
8192
The stack limit above is 8 MB.

We will examine the stack in greater detail in future sections, but for now understand these characteristics:
  • The stack is a linear data structure that follows a Last-In, First-Out (LIFO) principle
  • The last element added is always the first to be removed
  • New data can be "pushed" onto the stack or "popped" off the stack
  • The stack "grows down" in memory, which can be confusing because the "top" of the stack will always have the lowest memory address
  • The sp register stores the memory address for the top of the stack
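The characteristics above can be modeled with a small Python sketch. The class name DescendingStack and the starting address are made up for this illustration; the sketch only models 4-byte pushes and pops on a stack that grows toward lower addresses.

```python
class DescendingStack:
    """A toy model of a stack that grows down in memory, 4 bytes at a time."""

    def __init__(self, sp):
        self.sp = sp        # address of the current top of the stack
        self.memory = {}    # address -> 32-bit value stored there

    def push(self, value):
        self.sp -= 4                            # the stack grows down first
        self.memory[self.sp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory.pop(self.sp)        # read the top of the stack
        self.sp += 4                            # then shrink the stack
        return value


stack = DescendingStack(0x408009c0)   # hypothetical starting sp
stack.push(0x11)
stack.push(0x22)
print(hex(stack.sp))      # two pushes lower sp by 8 bytes
print(hex(stack.pop()))   # the last value pushed comes off first
```

Each push lowers sp, so the most recently pushed value always sits at the lowest address in use, which is the "top" of the stack.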

Loops

A loop is a simple logical construct which repeatedly executes instructions until a condition is met.
To demonstrate this functionality, we will write a program which will execute a block of code 10 times.
The code will print the counter for the loop, showing what iteration it is on, and will utilize the stack to facilitate this:

.section .rodata
/* Linux Syscall constants */
b_STDOUT = 0x01
b_WRITE  = 0x04
/* Offset to convert a value to a single digit ASCII character decimal */
b_ASCII_OFFSET = 0x30

.section .data
begin_msg:
    .ascii "Starting while loop:\n"
len_begin_msg = ( . - begin_msg)
end_msg:
    .ascii "Loop ended.\n"
len_end_msg = ( . - end_msg)

.section .text
.global _start

_start:
print_begin_msg:
    ldr r7, =b_WRITE
    ldr r0, =b_STDOUT
    ldr r1, =begin_msg
    ldr r2, =len_begin_msg
    svc #0

    mov r3, #0x0
    /* This sets r3 to 0 to prepare it to use as our counter for the loop.
       Use of r3 is arbitrary, any GP register will do, but r3 is the next
       register not used by the write syscall, which will be used in the
       loop */

begin_while:
print_counter:
    ldr r7, =b_WRITE
    ldr r0, =b_STDOUT
    ldr r1, =b_ASCII_OFFSET   /* Start with a value of 0x30 */
    add r1, r1, r3
    /* Add our counter value to 0x30 to get the ASCII decimal number for
       the counter. 0x30 is the hex value for the decimal ASCII 0,
       0x31 is 1 etc. */
    orr r1, r1, #0x0a00
    /* The orr instruction performs a logical or between two registers or
       immediate values. This effectively combines the 0x0a value for an
       ASCII newline character with our original value for the ASCII value
       of the loop counter and stores both in r1 */
    push {r1}
    /* The push instruction will store the values in a list of registers
       in the memory stack. The values will be placed on the stack in order
       of the register numbers, so the lowest number register will be at
       the top of the stack and the highest number at the bottom.
       This command can be written in several different forms:
           sub sp, sp, #4
           str r1, [sp]
       This subtracts 4 bytes from the stack pointer address, then it
       stores (str) the value of r1 at the stack pointer address
           str r1, [sp, #-4]!
       This executes the same thing in a single instruction: it stores r1
       at the stack pointer address minus 4 bytes, and the ! character
       writes the decremented address back to the stack pointer */
    mov r1, sp
    /* The stack pointer stores the memory address of the last data placed
       on the stack. The last data placed on the stack was the value stored
       in r1, which contains our two ASCII character codes.
       This instruction will store the memory address of that location in
       the stack in r1. This is necessary, because the write syscall takes
       a memory address as an argument for a string to write, not the
       actual value. */
    mov r2, #0x2
    /* We will set arg2 (count) for the syscall to 2 bytes, because we will
       print both the number character and the newline character. */
    svc #0
    add sp, sp, #0x4
    /* This will move the stack pointer back up to its original position.
       This will allow us to overwrite the previous characters every time
       the loop runs. If we did not include this instruction, the stack
       would continue to grow every time the loop ran. */
    cmp r3, #0x9
    /* This instruction performs a subtraction operation between the value
       in register r3 and the immediate value 9.
       The compare instruction (cmp) discards the result of the
       subtraction operation, but it updates the zero (Z) and negative (N)
       flags in the cpsr appropriately:
       If the values are equal, the zero flag will be set to 1.
       If the result of the subtraction is negative, then the negative
       flag is set to 1.
       The carry (C) and overflow (V) flags are also set based on the
       result.
       This instruction sets the same flags as writing:
           subs r0, r3, #0x09
       The subtract and set flags (subs) instruction performs the same
       operation as cmp, except it also stores the result in a register.
       Note that the example above overwrites r0 with the result, so it is
       only a substitute for cmp when the old value of r0 is no longer
       needed.
       For a simple comparison, this isn't useful, but if we wanted to
       compare values and store the result in r1, we could write:
           subs r1, r3, #0x09 */
    bge end_while
    /* The branch greater or equal to (bge) instruction checks the values
       of the cpsr flags. It branches when the negative (N) flag equals the
       overflow (V) flag, which after the cmp above means that r3 was
       greater than or equal to 9 as a signed value. When r3 equals 9, the
       zero flag is 1 and N and V are both 0, so the branch is taken by
       setting the pc to the end_while label's memory address. When r3 is
       less than 9, N is 1 and V is 0, so execution continues with the next
       instruction. */
    add r3, r3, #0x01   /* Increment our counter by 1 */
    b begin_while
    /* branch (b) is an unconditional branch instruction, this will always
       change the pc to the address of the begin_while label and continue
       execution. */

end_while:
print_end_msg:
    ldr r7, =b_WRITE
    ldr r0, =b_STDOUT
    ldr r1, =end_msg
    ldr r2, =len_end_msg
    svc #00000000

exit_normally:
    mov r7, #0x00000001
    mov r0, #0x00000000
    svc #0x00000000

After reading through the source code and studying the comments, assemble and link it. Run it in qemu so that we can debug it with gdb.
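Before stepping through the program in the debugger, it can help to see the same control flow in a high-level form. The Python sketch below mirrors the loop's structure (print, compare against 9, branch out or increment); the function name run_loop is made up for this illustration, and the register name r3 is reused only as a variable name.

```python
def run_loop():
    """Model the counter loop and return everything it would print."""
    output = []
    r3 = 0                                    # mov r3, #0x0
    while True:                               # begin_while:
        output.append(chr(0x30 + r3) + "\n")  # digit character plus newline
        if r3 >= 9:                           # cmp r3, #0x9 / bge end_while
            break
        r3 += 1                               # add r3, r3, #0x01
    return "".join(output)                    # end_while:


print(run_loop(), end="")
```

Because the comparison happens after the print and before the increment, the body runs for counter values 0 through 9, which is 10 iterations.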
ARM Debugging Loops
Once we are attached to our remote program in gdb, we will open our assembly and register layouts as before.
We are familiar with the write syscall already, so we will advance forward to the start of our loop.
Enter:
advance print_counter
Your output should look similar to this:
|-Register group: general----------------------------------------------------------|
|r0 0x15 21    r1 0x200ec 131308 |
|r2 0x15 21    r3 0x0 0 |
|r4 0x0 0    r5 0x0 0 |
|r6 0x0 0    r7 0x4 4 |
|r8 0x0 0    r9 0x0 0 |
|r10 0x200ec 131308    r11 0x0 0 |
|r12 0x0 0    sp 0x408009c0 0x408009c0 |
|lr 0x0 0    pc 0x1008c 0x1008c <print_counter> |
|cpsr 0x10 16    fpscr 0x0 0 |
|fpsid 0x410430f0 1090793712    fpexc 0x40000000 1073741824 |
|AFSR0_EL1 0x0 0    AFSR1_EL1 0x0 0 |
|DBGDIDR 0x3515f021 890630177    DBGDSAR 0x0 0 |
|DBGBVR 0x0 0    DBGBCR 0x0 0 |
|DBGWVR 0x0 0    DBGWCR 0x0 0 |
|PAR 0x0 0    DBGBVR 0x0 0 |
|DBGBCR 0x0 0    DBGWVR 0x0 0 |
|DBGWCR 0x0 0    TEECR 0x0 0 |
|MIDR_EL1 0x412fc0f1 1093648625    CTR 0x8444c004 -2075869180 |
|TCMTR 0x0 0    TTBR0_EL1 0x0 0 |
|----------------------------------------------------------------------------------|
|   0x10074 <_start>             mov r7, #4 |
|   0x10078 <_start+4>           mov r0, #1 |
|   0x1007c <_start+8>           ldr r1, [pc, #96] ; 0x100e4 <exit_normally+12> |
|   0x10080 <_start+12>          mov r2, #21 |
|   0x10084 <_start+16>          svc 0x00000000 |
|   0x10088 <_start+20>          mov r3, #0 |
| > 0x1008c <print_counter>      mov r7, #4 |
|   0x10090 <print_counter+4>    mov r0, #1 |
|   0x10094 <print_counter+8>    mov r1, #48 ; 0x30 |
|   0x10098 <print_counter+12>   add r1, r1, r3 |
|   0x1009c <print_counter+16>   orr r1, r1, #2560 ; 0xa00 |
|   0x100a0 <print_counter+20>   push {r1} ; (str r1, [sp, #-4]!) |
|   0x100a4 <print_counter+24>   mov r1, sp |
|   0x100a8 <print_counter+28>   mov r2, #2 |
|   0x100ac <print_counter+32>   svc 0x00000000 |
|   0x100b0 <print_counter+36>   add sp, sp, #4 |
|   0x100b4 <print_counter+40>   subs r0, r3, #9 |
|   0x100b8 <print_counter+44>   bge 0x100c4 <print_end_msg> |
|   0x100bc <print_counter+48>   add r3, r3, #1 |
|   0x100c0 <print_counter+52>   b 0x1008c <print_counter> |
|----------------------------------------------------------------------------------|
remote Thread 1.30460 In: print_counter                        L??   PC: 0x1008c
(gdb) lay reg
(gdb) advance print_counter
0x0001008c in print_counter ()
(gdb)

Let's step into our instructions from here and examine the orr instruction at 0x1009c:
|-Register group: general----------------------------------------------------------|
|r0 0x1 1    r1 0x30 48 |
|r2 0x15 21    r3 0x0 0 |
|r4 0x0 0    r5 0x0 0 |
|r6 0x0 0    r7 0x4 4 |
|r8 0x0 0    r9 0x0 0 |
|r10 0x200ec 131308    r11 0x0 0 |
|r12 0x0 0    sp 0x408009c0 0x408009c0 |
|lr 0x0 0    pc 0x1009c 0x1009c <print_counter+16> |
|cpsr 0x10 16    fpscr 0x0 0 |
|fpsid 0x410430f0 1090793712    fpexc 0x40000000 1073741824 |
|AFSR0_EL1 0x0 0    AFSR1_EL1 0x0 0 |
|DBGDIDR 0x3515f021 890630177    DBGDSAR 0x0 0 |
|DBGBVR 0x0 0    DBGBCR 0x0 0 |
|DBGWVR 0x0 0    DBGWCR 0x0 0 |
|PAR 0x0 0    DBGBVR 0x0 0 |
|DBGBCR 0x0 0    DBGWVR 0x0 0 |
|DBGWCR 0x0 0    TEECR 0x0 0 |
|MIDR_EL1 0x412fc0f1 1093648625    CTR 0x8444c004 -2075869180 |
|TCMTR 0x0 0    TTBR0_EL1 0x0 0 |
|----------------------------------------------------------------------------------|
|   0x10074 <_start>             mov r7, #4 |
|   0x10078 <_start+4>           mov r0, #1 |
|   0x1007c <_start+8>           ldr r1, [pc, #96] ; 0x100e4 <exit_normally+12> |
|   0x10080 <_start+12>          mov r2, #21 |
|   0x10084 <_start+16>          svc 0x00000000 |
|   0x10088 <_start+20>          mov r3, #0 |
|   0x1008c <print_counter>      mov r7, #4 |
|   0x10090 <print_counter+4>    mov r0, #1 |
|   0x10094 <print_counter+8>    mov r1, #48 ; 0x30 |
|   0x10098 <print_counter+12>   add r1, r1, r3 |
| > 0x1009c <print_counter+16>   orr r1, r1, #2560 ; 0xa00 |

As we step through the instruction we can see that the value of r1 changes from 0x30 to 0x0a30.
It now contains the value for ASCII "0\n"
r1 0xa30 2608

Now let's step to the push instruction:
|-Register group: general----------------------------------------------------------|
|r0 0x1 1    r1 0xa30 2608    r2 0x15 21 |
|r3 0x0 0    r4 0x0 0    r5 0x0 0 |
|r6 0x0 0    r7 0x4 4    r8 0x0 0 |
|r9 0x0 0    r10 0x200ec 131308    r11 0x0 0 |
|r12 0x0 0    sp 0x408009c0 0x408009c0    lr 0x0 0 |
|pc 0x100a0 0x100a0 <print_count    cpsr 0x10 16    fpscr 0x0 0 |
|fpsid 0x410430f0 1090793712    fpexc 0x40000000 1073741824    AFSR0_EL1 0x0 0 |
|AFSR1_EL1 0x0 0    DBGDIDR 0x3515f021 890630177    DBGDSAR 0x0 0 |
|DBGBVR 0x0 0    DBGBCR 0x0 0    DBGWVR 0x0 0 |
|DBGWCR 0x0 0    PAR 0x0 0    DBGBVR 0x0 0 |
|DBGBCR 0x0 0    DBGWVR 0x0 0    DBGWCR 0x0 0 |
|TEECR 0x0 0    MIDR_EL1 0x412fc0f1 1093648625    CTR 0x8444c004 -2075869180 |
|TCMTR 0x0 0    TTBR0_EL1 0x0 0    PMCCNTR 0x0 0 |
|TLBTR 0x0 0    TTBR1_EL1 0x0 0    MIDR 0x412fc0f1 1093648625 |
|TTBCR 0x0 0    MPIDR_EL1 0x80000000 -2147483648    TTBCR2 0x0 0 |
|REVIDR_EL1 0x0 0    MIDR 0x412fc0f1 1093648625    JIDR 0x0 0 |
|CLIDR 0xa200023 169869347    DFAR 0x0 0    WFAR 0x0 0 |
|IFAR 0x0 0    JMCR 0x0 0    AIDR 0x0 0 |
|CSSELR 0x0 0    ID_PFR2 0x10 16    VBAR 0x0 0 |
|----------------------------------------------------------------------------------|
|   0x10074 <_start>             mov r7, #4 |
|   0x10078 <_start+4>           mov r0, #1 |
|   0x1007c <_start+8>           ldr r1, [pc, #96] ; 0x100e4 <exit_normally+12> |
|   0x10080 <_start+12>          mov r2, #21 |
|   0x10084 <_start+16>          svc 0x00000000 |
|   0x10088 <_start+20>          mov r3, #0 |
|   0x1008c <print_counter>      mov r7, #4 |
|   0x10090 <print_counter+4>    mov r0, #1 |
|   0x10094 <print_counter+8>    mov r1, #48 ; 0x30 |
|   0x10098 <print_counter+12>   add r1, r1, r3 |
|   0x1009c <print_counter+16>   orr r1, r1, #2560 ; 0xa00 |
| > 0x100a0 <print_counter+20>   push {r1} ; (str r1, [sp, #-4]!) |
Notice the value of the sp register before the push:
sp 0x408009c0

Now after the push:
sp 0x408009bc

This is 4 bytes lower than the previous address. Let's examine the data at that address:
(gdb) x /4xb 0x408009bc
0x408009bc: 0x30 0x0a 0x00 0x00

We can see the value of r1 is now at that address.
The instruction mov r1, sp will store that address in r1 to pass to the write syscall.
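The orr and push steps above can be reproduced with a short Python sketch that builds the register value and then lays it out in memory as a little-endian word. The function name counter_bytes is made up for this illustration.

```python
def counter_bytes(counter):
    """Return (register value, 4 stored bytes, the 2 bytes write will print)."""
    r1 = 0x30 + counter              # add r1, r1, r3: ASCII digit for the counter
    r1 |= 0x0A00                     # orr r1, r1, #0x0a00: newline in the next byte
    stored = r1.to_bytes(4, "little")  # push {r1}: lowest byte at the lowest address
    return r1, stored, stored[:2]


value, stored, printed = counter_bytes(0)
print(hex(value))   # 0xa30
print(stored)       # the digit byte comes first in memory
print(printed)      # the two bytes passed to the write syscall
```

Because the word is stored little-endian, the digit (the low byte of r1) lands at the address in sp and the newline follows it, so writing 2 bytes starting at sp prints the digit and then the newline.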

Let's now examine the cmp instruction at 0x100b4:
> 0x100b4 <print_counter+40> cmp r3, #9
(gdb) info registers cpsr
cpsr 0x10 16
(gdb) si
0x000100b8 in print_counter ()
(gdb) info registers cpsr
cpsr 0x80000010 -2147483632
As we step through the instruction, we can see the value of the flags in cpsr change.

What flags are now set? We could print the value in binary with:
(gdb) print/t 0x80000010
$1 = 10000000000000000000000000010000
This is still difficult to read and determine what flags are set.

We know that the Z, N, C, and V flags are set by the cmp instruction, so let's format the output to show those flags.
Enter the following script to show a formatted output for the flags:
printf "N=%d Z=%d C=%d V=%d\n", (($cpsr & (1 << 31)) != 0), (($cpsr & (1 << 30)) != 0), (($cpsr & (1 << 29)) != 0), (($cpsr & (1 << 28)) != 0)
This C-style expression takes the value of the cpsr register and performs a bitwise AND with a 1 bit shifted left to the position of the corresponding flag. If the flag is set, the result is non-zero, the != 0 comparison evaluates to true, and a 1 is printed.

We can see from this script that the negative bit was set by the comparison, because 0 - 9 = -9 which is negative.
N=1 Z=0 C=0 V=0
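
The same bit tests can be tried without a debugger. This is a minimal Python sketch of the masking logic used in the gdb printf line; the function names nzcv and format_flags are hypothetical and exist only for this illustration.

```python
def nzcv(cpsr):
    """Return the N, Z, C and V flags (bits 31, 30, 29 and 28) of a CPSR value."""
    return tuple((cpsr >> bit) & 1 for bit in (31, 30, 29, 28))


def format_flags(cpsr):
    """Format the flags the same way as the gdb printf script."""
    n, z, c, v = nzcv(cpsr)
    return f"N={n} Z={z} C={c} V={v}"


print(format_flags(0x80000010))  # N=1 Z=0 C=0 V=0
```

Shifting the register right and masking with 1 is equivalent to masking with (1 << bit) and comparing against zero, as the gdb version does.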

This would be lengthy to type out every time we want to check those flags, so let's open a text editor and save the script as cpsr_cmp.gdb.
We can now run the script inside gdb by entering:
source cpsr_cmp.gdb
This assumes the script is saved in the directory gdb was started from.
Otherwise you must give the path to the script.

We want to iterate through our loop, but we don't want to manually step through every instruction over and over again.
We can automate this process by using another script. First, we will set a break point at the end of our loop with:
(gdb) break *0x100c0
Breakpoint 1 at 0x100c0
The * character lets gdb know that the value is a memory address and not a label name.

Enter:
continue
To skip down to our break point.

Now we can write our script.
Enter:
(gdb) set $count = 0
(gdb) while $count < 8
>source cpsr_cmp.gdb
>continue
>set $count = $count +1
>end
We just wrote a while loop to debug our while loop.

Your output should be similar to this (note you may have to enter ctrl + l to re-draw your screen):
|-Register group: general------------------------------------------------------|
|r0 0x2 2 r1 0x408009bc 1082132924 |
|r2 0x2 2 r3 0x9 9 |
|r4 0x0 0 r5 0x0 0 |
|r6 0x0 0 r7 0x4 4 |
|r8 0x0 0 r9 0x0 0 |
|r10 0x200ec 131308 r11 0x0 0 |
|r12 0x0 0 sp 0x408009c0 0x408009c0 |
|lr 0x0 0 pc 0x100c0 0x100c0 <print_counter+52> |
|cpsr 0x80000010 -2147483632 fpscr 0x0 0 |
|fpsid 0x410430f0 1090793712 fpexc 0x40000000 1073741824 |
|AFSR0_EL1 0x0 0 AFSR1_EL1 0x0 0 |
|DBGDIDR 0x3515f021 890630177 DBGDSAR 0x0 0 |
|DBGBVR 0x0 0 DBGBCR 0x0 0 |
|DBGWVR 0x0 0 DBGWCR 0x0 0 |
|PAR 0x0 0 DBGBVR 0x0 0 |
|DBGBCR 0x0 0 DBGWVR 0x0 0 |
|DBGWCR 0x0 0 TEECR 0x0 0 |
|MIDR_EL1 0x412fc0f1 1093648625 CTR 0x8444c004 -2075869180 |
|TCMTR 0x0 0 TTBR0_EL1 0x0 0 |
|------------------------------------------------------------------------------|
| 0x10098 <print_counter+12> add r1, r1, r3 |
| 0x1009c <print_counter+16> orr r1, r1, #2560 ; 0xa00 |
| 0x100a0 <print_counter+20> push {r1} ; (str r1, [sp, #-4]!) |
| 0x100a4 <print_counter+24> mov r1, sp |
| 0x100a8 <print_counter+28> mov r2, #2 |
| 0x100ac <print_counter+32> svc 0x00000000 |
| 0x100b0 <print_counter+36> add sp, sp, #4 |
| 0x100b4 <print_counter+40> cmp r3, #9 |
| 0x100b8 <print_counter+44> bge 0x100c4 <print_end_msg> |
| 0x100bc <print_counter+48> add r3, r3, #1 |
|B+> 0x100c0 <print_counter+52> b 0x1008c <print_counter> |
| 0x100c4 <print_end_msg> mov r7, #4 |
| 0x100c8 <print_end_msg+4> mov r0, #1 |
| 0x100cc <print_end_msg+8> ldr r1, [pc, #20] ; 0x100e8 <exit_normally+16> |
| 0x100d0 <print_end_msg+12> mov r2, #12 |
| 0x100d4 <print_end_msg+16> svc 0x00000000 |
| 0x100d8 <exit_normally> mov r7, #1 |
| 0x100dc <exit_normally+4> mov r0, #0 |
| 0x100e0 <exit_normally+8> svc 0x00000000 |
| 0x100e4 <exit_normally+12> andeq r0, r2, r12, ror #1 |
|------------------------------------------------------------------------------|
remote Thread 1.39901 In: print_counter L?? PC: 0x100c0
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
(gdb)

Notice we are now on what should be the last iteration of the loop.
Let's advance to the bge instruction with:
advance *0x100b8

Let's look at what flags were set with our cmp instruction:
(gdb) source cpsr_cmp.gdb
N=0 Z=1 C=1 V=0
Notice that the zero bit is now set and the negative bit is no longer set.

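The zero bit is set because r3 was equal to 9.
The negative bit is not set because the result of 9 - 9 isn't negative.
The carry bit is set because the subtraction did not need to borrow.
The bge (branch if greater than or equal) condition is met when the N flag equals the V flag. Both are 0 here, so our branch should be taken.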
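
To see how the flags and the branch decision follow from the operands, here is a minimal Python sketch of cmp as a 32-bit subtraction. It assumes the architectural rules that carry means "no borrow" for subtraction and that bge is taken when N equals V; the function names cmp_flags and ge_taken are hypothetical.

```python
MASK = 0xFFFFFFFF


def cmp_flags(rn, op2):
    """Return (N, Z, C, V) as set by `cmp rn, op2`, i.e. the subtraction rn - op2."""
    rn &= MASK
    op2 &= MASK
    result = (rn - op2) & MASK
    n = result >> 31                       # sign bit of the result
    z = int(result == 0)                   # result was zero
    c = int(rn >= op2)                     # carry set means no borrow occurred
    # Signed overflow: operands differ in sign and the result's sign differs from rn
    v = ((rn ^ op2) & (rn ^ result)) >> 31
    return n, z, c, v


def ge_taken(flags):
    """bge is taken when N equals V (signed greater than or equal)."""
    n, _, _, v = flags
    return n == v


for r3 in (0, 8, 9):
    flags = cmp_flags(r3, 9)
    print(r3, flags, ge_taken(flags))
```

For r3 values 0 through 8 the sketch gives N=1 and V=0, so the branch is not taken; for r3 equal to 9 it gives N=0 Z=1 C=1 V=0 and the branch is taken, matching the debugger output.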

Let's test this branch by setting a break point where we should jump to:
(gdb) break *0x100c4
Breakpoint 2 at 0x100c4

Enter continue to advance the program to the next break:
|------------------------------------------------------------------------------|
| 0x1008c <print_counter> mov r7, #4 |
| 0x10090 <print_counter+4> mov r0, #1 |
| 0x10094 <print_counter+8> mov r1, #48 ; 0x30 |
| 0x10098 <print_counter+12> add r1, r1, r3 |
| 0x1009c <print_counter+16> orr r1, r1, #2560 ; 0xa00 |
| 0x100a0 <print_counter+20> push {r1} ; (str r1, [sp, #-4]!) |
| 0x100a4 <print_counter+24> mov r1, sp |
| 0x100a8 <print_counter+28> mov r2, #2 |
| 0x100ac <print_counter+32> svc 0x00000000 |
| 0x100b0 <print_counter+36> add sp, sp, #4 |
| 0x100b4 <print_counter+40> cmp r3, #9 |
| 0x100b8 <print_counter+44> bge 0x100c4 <print_end_msg> |
| 0x100bc <print_counter+48> add r3, r3, #1 |
|B+ 0x100c0 <print_counter+52> b 0x1008c <print_counter> |
|B+> 0x100c4 <print_end_msg> mov r7, #4 |
| 0x100c8 <print_end_msg+4> mov r0, #1 |
| 0x100cc <print_end_msg+8> ldr r1, [pc, #20] ; 0x100e8 <exit_normally+16> |
| 0x100d0 <print_end_msg+12> mov r2, #12 |
| 0x100d4 <print_end_msg+16> svc 0x00000000 |
| 0x100d8 <exit_normally> mov r7, #1 |
|------------------------------------------------------------------------------|
remote Thread 1.40929 In: print_end_msg L?? PC: 0x100c4
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
N=1 Z=0 C=0 V=0
Breakpoint 1, 0x000100c0 in print_counter ()
(gdb) si
0x0001008c in print_counter ()
(gdb) advance *0x100b8
0x000100b8 in print_counter ()
(gdb) source cpsr_cmp.gdb
N=0 Z=1 C=1 V=0
(gdb) break *0x100c4
Breakpoint 2 at 0x100c4
(gdb) continue
Continuing.
Breakpoint 2, 0x000100c4 in print_end_msg ()
(gdb)
Notice that our program had two breakpoints set: one at 0x100c0, on the instruction that branches back to the start of our loop,
and another at 0x100c4, where the rest of the program continues.
The bge condition is met when the N flag equals the V flag. Here both were 0 (the Z flag being set simply reflects that r3 equals 9), so execution moved to 0x100c4.

Enter continue one last time to finish executing the remainder of the program:
(gdb) continue
Continuing.
[Inferior 1 (process 1) exited normally]
(gdb)
ARM Loop Exercises

Exercise 1.

Find the other flag bit that is set in the cpsr. Why is it set?

Exercise 2.

Write a gdb script that prints all of the cpsr flags in the format of the cpsr_cmp script.

Exercise 3.

Re-write the loop to allow for more than 10 iterations while printing the correct iteration number.
ARM ABI and Calling Convention

What is an ABI?

An Application Binary Interface (ABI) is a low-level interface between compiled software components, such as an executable and the libraries or operating system it calls.
ABIs are similar to APIs: an API defines an interface at the source code level, while an ABI defines one at the binary level.
APIs are high-level and hardware independent; ABIs are low-level and hardware dependent.

ABIs determine:
  • How to pass arguments to a function
  • How a function passes back its return value
  • Which registers must be preserved and which registers can be clobbered (overwritten)
  • How data is organized in memory
  • How system calls are performed

We will reference the Procedure Call Standard for ARM32, part of the ARM32 ABI, to write our next program.

This document defines the following calling convention:

  • The registers r4-r8, r10, and r11 are used to hold the values of local variables
  • Registers r12-r15 have special roles: IP, SP, LR, and PC
  • A subroutine must preserve the contents of the registers r4-r8, r10, r11 and SP
  • The first four registers r0-r3 (a1-a4) are used to pass argument values into a subroutine and to return values
  • r0-r3 may also be used to hold intermediate values within a routine (but, in general, only between subroutine calls)
    Register   Synonym   Special   Role in the procedure call standard
    r15                  PC        The Program Counter.
    r14                  LR        The Link Register.
    r13                  SP        The Stack Pointer.
    r12                  IP        The Intra-Procedure-call scratch register.
    r11        v8        FP        Frame Pointer or Variable-register 8.
    r10        v7                  Variable-register 7.
    r9         v6        SB, TR    Platform register or Variable-register 6. The meaning of this register is defined by the platform standard.
    r8         v5                  Variable-register 5.
    r7         v4                  Variable-register 4.
    r6         v3                  Variable-register 3.
    r5         v2                  Variable-register 2.
    r4         v1                  Variable-register 1.
    r3         a4                  Argument / scratch register 4.
    r2         a3                  Argument / scratch register 3.
    r1         a2                  Argument / result / scratch register 2.
    r0         a1                  Argument / result / scratch register 1.

ARM Functions and User Input
It's time to learn more about stack management in ARM assembly by creating a program with function calls and user input.
Examine the following source code:
.section .rodata
// Linux Syscall constants
STDIN = 0x00
STDOUT = 0x01
EXIT = 0x01
READ = 0x03
WRITE = 0x04
ERR_INVALID_INPUT = 0x01
ERR_BUFF_OVERFLOW = 0x02
// Valid ASCII values for decimal numbers
b_MIN_ASCII = 0x30
b_MAX_ASCII = 0x39
// Termination character
b_NEWLINE = 0x0a
// Input buffer size
b_BUFFER_SIZE = 0x08

.section .data
// String variables used to prompt user and show output
first_num_msg: .ascii "Enter the first number to add: "
len_first_num_msg = ( . - first_num_msg)
second_num_msg: .ascii "Enter the second number to add: "
len_second_num_msg = ( . - second_num_msg)
sum_msg: .ascii "The sum is: "
len_sum_msg = ( . - sum_msg)
invalid_msg: .ascii "ERROR: Invalid input detected\n"
len_invalid_msg = ( . - invalid_msg)
overflow_msg: .ascii "ERROR: Buffer overflow detected\n"
len_overflow_msg = ( . - overflow_msg)

.section .bss
/* The block started by symbol (bss) section stores uninitialized variables.
   They will be zero initialized in memory, so we will start with clean buffers */
first_number_buffer: .skip b_BUFFER_SIZE
second_number_buffer: .skip b_BUFFER_SIZE

.section .text
.global _start
_start:
    /* Prompt user to enter integer numbers, add them, and print the result */
    movw r4, #0xbeef // Load the lower-half (16-bits) of r4
    movt r4, #0xdead // Load the upper-half of r4
    movw r5, #0xbabe
    movt r5, #0xdeed
    movw r6, #0xface
    movt r6, #0xcafe
    movw r7, #0xdeaf
    movt r7, #0xfade
    movw r8, #0xbabe
    movt r8, #0xbead
    movw r9, #0xface
    movt r9, #0xdeaf
    movw r10, #0xbade
    movt r10, #0xcade
    /* The above instructions set all the variable registers. This is done only for
       demonstration purposes to provide easy data to view on the stack when debugging.
       The movw and movt instructions are used because ARM32 cannot load some 32-bit
       constants into registers with a single instruction, so you must load the lower
       and upper halves separately. */

prompt_for_first_number:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    ldr r1, =first_num_msg
    ldr r2, =len_first_num_msg
    svc #0

get_first_number:
    ldr r0, =first_number_buffer
    ldr r1, =b_BUFFER_SIZE
    bl get_number // r0: buffer address r1: buffer length --> r0: unsigned integer
    push {r0} // Save the first number to the stack

prompt_for_second_number:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    ldr r1, =second_num_msg
    ldr r2, =len_second_num_msg
    svc #0

get_second_number:
    ldr r0, =second_number_buffer
    ldr r1, =b_BUFFER_SIZE
    bl get_number
    push {r0} // Save the second number to the stack

print_sum_msg:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    ldr r1, =sum_msg
    ldr r2, =len_sum_msg
    svc #0
    pop {r0, r1} // Pop both numbers off the stack
    bl print_sum // r0: unsigned integer r1: unsigned integer --> void

exit_normally:
    ldr r7, =EXIT
    mov r0, #0
    svc #0

exit_with_invalid_error:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    ldr r1, =invalid_msg
    ldr r2, =len_invalid_msg
    svc #0
    ldr r7, =EXIT
    ldr r0, =ERR_INVALID_INPUT
    svc #0

exit_with_overflow_error:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    ldr r1, =overflow_msg
    ldr r2, =len_overflow_msg
    svc #0
    ldr r7, =EXIT
    ldr r0, =ERR_BUFF_OVERFLOW
    svc #0

get_number:
    /* purpose: read a natural number from user input
       usage: arg0 (r0) the memory address to store ASCII input from STDIN
              arg1 (r1) the size of the memory buffer to store the input
       returns: r0: 32-bit positive integer
       error handling: Invalid input will result in no return and a program exit with error code 0x1
                       Only characters 0123456789 \n (0x0a) and (0x00) are valid */
    push {fp, lr} // Preserve the caller's frame pointer and link register (previous pc)
    mov fp, sp // Set the frame pointer to the current stack pointer
    push {r4-r10} // Preserve the caller's variable registers
    // r1 and r0 are scratch registers that will get clobbered by the read syscall's arguments so we need to preserve them
    push {r1} // Store r1 on the stack, which is the length of our buffer
    push {r0} // Store r0 on the stack, which is the memory address we will save our input to
    ldr r7, =READ
    ldr r0, =STDIN
    pop {r1} // r1=r0 from stack, this sets the address for the read syscall to our function's arg0 input
    pop {r2} // r2=r1 from stack, this sets the size of the input buffer for the read syscall to our function's arg1 input
    svc #0
    cmp r0, r2 // Compare the number of bytes read to our buffer (r0) to the size of our buffer (r2)
    bge if_newline_check // A full buffer should always end with a newline character
    b endif_newline_check
if_newline_check:
    sub r6, r2, #0x1 // The byte offset is 1 less than the length
    ldrb r4, [r1, r6] // Load the last byte from the buffer
    ldr r5, =b_NEWLINE
    cmp r4, r5 // If the last character isn't a newline, then there was a buffer overflow
    bne if_buffer_overflow_found
    b endif_buffer_overflow_found
if_buffer_overflow_found:
    b exit_with_overflow_error // Unconditional branch to exit the program with an overflow error
endif_buffer_overflow_found:
endif_newline_check:
    // Buffer length was valid
    mov r0, r1 // Move the buffer address into r0 to pass to validate_input
    mov r1, r2 // Move the buffer length into r1 to pass to validate_input
    bl validate_input // r0: buffer address, r1: buffer length --> r0: 0x0 is valid 0x1 is invalid
    cmp r0, #0x1 // Test for invalid flag
    beq invalid_number // branch to invalid number error handling
valid_number:
    mov r0, r1 // validate_input passes any valid number back in r1, get_number passes that back in r0
    pop {r4-r10} // This restores the original values of r4-r10 from the stack
    pop {fp} // Restore the previous fp to the current fp
    pop {pc} // This sets the pc to the lr value, so that execution resumes where this function was called from in the caller's function
invalid_number:
    pop {r4-r10, fp}
    b exit_with_invalid_error // Unconditional branch to exit the program with an invalid input error

validate_input:
    /* parameters: arg0 (r0) the memory address to validate ASCII decimal input from
                   arg1 (r1) the size of the input buffer
       returns: r0: 0x0 for valid decimal number, 0x1 for invalid decimal number
                r1: Unchanged if number was invalid, the value of the number if the number was valid */
    push {fp, lr} // Preserve the caller's fp and lr (return address)
    mov fp, sp // Set the frame pointer to the current stack pointer
    push {r4-r10} // Preserve the caller's variable registers
    /* Register use:
       r0: buffer address passed to function */
    mov r3, #0x0 // Loop counter for each byte stored in the input buffer
    mov r6, #0x0 // This will hold a flag which indicates a terminating character was found
    // We need to check that all characters are valid decimal characters, and count them
validate_loop:
    ldrb r4, [r0] // Load one byte from the buffer memory location
    // Newline is a valid termination
    ldr r5, =b_NEWLINE
    cmp r4, r5
    moveq r6, #0x1 // Flag the termination character if the comparison was equal
    beq valid
    ldr r5, =b_MAX_ASCII // If character is greater than b_MAX_ASCII then it is invalid
    cmp r4, r5
    bgt invalid
    ldr r5, =b_MIN_ASCII // If character is less than b_MIN_ASCII then it is invalid
    cmp r4, r5
    blt invalid
    add r3, r3, #0x01 // Increment counter by 1
    cmp r3, r1 // Check if we have looped through all the characters
    bge end_validate_loop // End loop
    add r0, r0, #0x01 // Increment the memory buffer address by 1
    b validate_loop // Continue loop
end_validate_loop:
valid:
convert_to_decimal:
    /* If all characters were valid, we can convert them to a decimal value.
       Register use:
       r0: buffer address passed to function
       r1: length of buffer passed to function
       r3: counter (starting with actual length)
       r4: current character from buffer
       r5: min ASCII value
       r6: running total
       r7: exponent
       r8: base
       r9: product of base and exponent
       r10: temp var */
    cmp r6, #0x1
    beq if_terminating_char
    b endif_terminating_char
if_terminating_char:
    // Check if a number wasn't entered and only enter was pressed
    cmp r3, #0x0
    beq empty
    sub r0, r0, #0x1 // Point to the previous character before the newline
endif_terminating_char:
    mov r1, r3 // Clobber r1 with the actual length of our number string
    ldr r5, =b_MIN_ASCII // Reset r5 to the minimum ASCII value
    mov r6, #0 // Reset r6 to 0 for the running total
    mov r8, #10 // Set r8 to base 10
    mov r9, #1 // Set r9 to 1 for the first exponent multiplication
    // The first digit doesn't need to be multiplied by the base and exponent
    ldrb r4, [r0]
    sub r4, r4, r5 // subtract b_MIN_ASCII value from the current character to get the decimal digit
    add r6, r6, r4 // Add the decimal digit to the running total
    sub r3, r3, #1 // Decrement our counter by one
    sub r0, r0, #1 // Decrement our buffer address by one
digit_loop:
    cmp r3, #0
    ble end_digit_loop
    ldrb r4, [r0] // Load one character from the current memory position from the buffer
    sub r4, r4, r5 // subtract b_MIN_ASCII value from the current character to get the decimal digit
    sub r10, r1, r3 // Get the exponent value
    mov r7, r10
    // Exponent loop
exponent_loop:
    mul r10, r9, r8 // Find the product of the exponent and base
    mov r9, r10
    sub r7, r7, #1 // Decrement the exponent counter
    cmp r7, #0 // Our exponent will increase for each digit
    ble end_exponent_loop
    b exponent_loop
end_exponent_loop:
    mul r10, r4, r9 // The new digit value is the product of the exponent product and the digit
    add r6, r6, r10 // Add the digit to the running total
    mov r9, #1 // Reset r9 to 1
    sub r3, r3, #1 // Decrement our counter by one
    sub r0, r0, #1 // Decrement our buffer address by one
    b digit_loop
end_digit_loop:
    mov r0, #0x0 // Return code of 0 indicates a valid number
    mov r1, r6 // The final running total is passed back as the number
    pop {r4-r10} // Restore variable registers
    pop {fp, pc} // Restore fp and resume execution from lr address
invalid:
    mov r0, #0x1 // Return code of 1 indicates an invalid number
    pop {r4-r10} // Restore variable registers
    pop {fp, pc} // Restore fp and resume execution from lr address
empty:
    // The user entered an empty number
    mov r0, #0x0 // It is valid, but equivalent to zero
    mov r1, #0x0
    pop {r4-r10} // Restore variable registers
    pop {fp, pc} // Restore fp and resume execution from lr address

print_sum:
    /* parameters: arg0 (r0) the first number to add
                   arg1 (r1) the second number to add
       returns: void */
    push {fp, lr} // Preserve the caller's fp and lr (return address)
    mov fp, sp // Set the frame pointer to the current stack pointer
    push {r4-r10} // Preserve the caller's variable registers
    add r0, r0, r1 // Adds both numbers and clobbers r0 with the sum
    /* Variable registers:
       r4: counter
       r5: b_MIN_ASCII / b_NEWLINE
       r6: divisor / newline flag
       r7: quotient
       r8: divisor * quotient product
       r9: remainder/decimal digit/null pad
       r10: the base address of our string on the stack */
    sub sp, sp, #0x0c // Make room on the stack, 12 bytes can hold 10 characters for a 32-bit integer
    mov r10, sp // Store the base address of our string
    mov r4, #0x0b // We are storing little endian, so we need to start at the end of the stack for our loop
    ldr r5, =b_NEWLINE
    mov r6, #10 // Set the divisor to 10
    // Store a newline which will be read last for little endian
    strb r5, [sp, r4] // Store the newline character on the stack
    sub r4, r4, #0x1 // Decrement our counter to reflect writing the newline character
    ldr r5, =b_MIN_ASCII
digit_to_ASCII_loop:
    cmp r0, #0x0
    bne if_more_digits
else_no_more_digits:
    // Null pad the rest of the string
    mov r5, #0x00
    strb r5, [sp, r4]
    b endif_more_digits
if_more_digits:
    sdiv r7, r0, r6 // Divide the hex number by 10
    mul r8, r7, r6 // Multiply the quotient by 10
    sub r9, r0, r8 // Find the remainder
    add r9, r9, r5 // Add b_MIN_ASCII to the remainder to convert it to the ASCII decimal
    strb r9, [sp, r4] // Store the ASCII character on the stack
    mov r0, r7 // Overwrite the original number with the quotient
endif_more_digits:
    sub r4, r4, #0x1
    cmp r4, #0x0
    bge digit_to_ASCII_loop
print_sum_syscall:
    ldr r7, =WRITE
    ldr r0, =STDOUT
    mov r1, r10 // Set the write address to the base of our string
    mov r2, #0x0c
    svc #00000000
    add sp, sp, #0x0c // Move the stack pointer back
    pop {r4-r10, fp, pc} // Restore the stack and return to the caller
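
The two conversions in this listing can be easier to follow in a high-level language first. The following is a minimal Python sketch of the ideas behind validate_input and print_sum, not a translation of every instruction: the function names parse_decimal and to_ascii are hypothetical, and returning None for an invalid number is an assumption made here (the assembly leaves r1 unchanged in that case).

```python
MIN_ASCII = 0x30   # '0'
MAX_ASCII = 0x39   # '9'
NEWLINE = 0x0A


def parse_decimal(buffer):
    """Return (status, value): status 0 means valid, 1 means invalid."""
    digits = []
    for byte in buffer:
        if byte == NEWLINE:            # newline terminates the number
            break
        if byte < MIN_ASCII or byte > MAX_ASCII:
            return 1, None
        digits.append(byte - MIN_ASCII)
    value = 0
    # Start at the least significant digit and scale each digit by 10**exponent
    for exponent, digit in enumerate(reversed(digits)):
        value += digit * 10 ** exponent
    return 0, value                    # an empty line yields 0


def to_ascii(number, width=12):
    """Fill a buffer from the end: newline last, digits before it, NUL padding in front."""
    out = bytearray(width)             # zero filled, like the null padding
    out[width - 1] = NEWLINE
    index = width - 2
    while index >= 0:
        if number != 0:
            number, remainder = divmod(number, 10)   # sdiv, then mul and sub
            out[index] = MIN_ASCII + remainder
        index -= 1
    return bytes(out)


_, first = parse_decimal(b"123\n")
_, second = parse_decimal(b"456\n")
print(to_ascii(first + second))
```

As in the listing, digits are produced least significant first by repeated division by 10, which is why the buffer is filled from its last byte toward its first. The sketch also mirrors one behaviour of the loop worth noticing when debugging: a sum of 0 produces only padding and a newline.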