We are now ready to write a hello world program for ARM. We will build upon what we have already learned
from our x86 hello world, and note the differences for GNU ARM assembly.
We will be using Linux syscall table references again, this time for 32-bit ARM.
.section .rodata
/* The .rodata section is stored as read-only in memory. GAS treats it as
   a data section, but it is flagged as read-only */

b_STDOUT = 0x01 /* This defines b_STDOUT as a byte sized constant with a value of 0x01 */
b_WRITE = 0x04  /* This defines b_WRITE as a byte sized constant with a value of 0x04 */

.section .data
hello_msg:
    .ascii "Hello World!\n"
end_hello_msg:
len_hello_msg = (end_hello_msg - hello_msg)
/* This defines the symbol len_hello_msg and assigns it the difference
   between the end_hello_msg label address and the hello_msg label address.
   Parentheses are not necessary in this instance;
   len_hello_msg = end_hello_msg - hello_msg would evaluate the same */

unused_label:
    .hword 0xbeef
/* This label is here to illustrate how GAS stores the addresses of labels
   that are never loaded into a register by the program, versus those that are.
   Note: The size of a word depends on the processor architecture. For ARM32
   a word is 32 bits (4 bytes), so to store 2 bytes of data we use the
   half-word (.hword) directive */

.section .text
.global _start
_start:
/* Write "At start" and "Hello World!" to stdout
   Write syscall reference:
   r7      r0 (arg0)          r1 (arg1)          r2 (arg2)
   0x04    unsigned int fd    const char *buf    size_t count */

print_start_msg:
    ldr r7, =b_WRITE
    /* The load register (ldr) instruction is similar to the lea instruction
       for the x86 processor in that it loads a calculated memory address or
       immediate value into a register.
       Like the eax register for x86, r7 is used to determine the syscall
       function for ARM.
       Using the = character with ldr is an ARM specific pseudo-instruction
       that takes a symbol name which represents a constant value or an
       address. The assembler will determine the type of value and modify the
       instruction to either load from a PC-relative memory address or load
       an immediate value. For this instruction, ldr will load an immediate
       value into r7 because b_WRITE is a constant and not the label for a
       memory address.
       For more information on this instruction, refer to this reference:
       https://developer.arm.com/documentation/dui0041/c/Babbfdih */
    ldr r0, =b_STDOUT /* Another constant value, loaded for the FD */
    adr r1, start_msg
    /* Address (adr) loads the address of a label into a register. The major
       functional difference between ldr and adr is that adr can only
       reference memory locations inside the .text section of code, while ldr
       can resolve addresses and values from any section. While both ldr and
       adr could load addresses from labels in the .text section, adr is more
       efficient for this specific task and should be used for that purpose */
    ldr r2, =len_start_msg /* This will resolve to the value of len_start_msg and load it into r2 */
    svc #0x00000000
    /* When writing ARM assembly for GAS, the # character is used to prefix
       an immediate value.
       This SuperVisor Call (svc) is similar to the int 0x80 call for the
       x86. It will initiate the execution of the syscall by calling a system
       interrupt. svc creates an exception and passes the immediate value to
       the exception handler. In earlier versions of ARM svc was called swi
       (SoftWare Interrupt), but they are effectively the same */

write_hello_msg:
    ldr r7, =b_WRITE
    ldr r0, =b_STDOUT
    ldr r1, =hello_msg
    /* For this instruction, hello_msg is a label for a memory address
       located in the .data section. Using the ldr pseudo-instruction, the
       assembler will create a literal value at the end of the .text section
       to store the label address in. It will then reference the memory
       location of that literal value and load it into r1. It uses the
       Program Counter (PC) register as a base address and offsets from PC
       to that location. This is essentially what we did with the adr
       instruction, except the assembler is copying the address of the label
       in .data and placing it in the .text section to load. */
    ldr r2, =len_hello_msg
    svc #0x00000000

exit_normally:
/* exit syscall reference:
   r7      r0 (arg0)
   0x01    int error_code */
    mov r7, #0x00000001 /* As with x86 assembly, mov can be used to load an immediate value into a register */
    mov r0, #0x00000000
    svc #0x00000000

/* The following section of code was added to show how you can also place
   variables and labels for data in the .text section after your code. They
   must be placed after your code, because they are not executable
   instructions. They should never be reached by your program's normal
   execution or it will crash. */
start_msg:
    .ascii "At start\n"
len_start_msg = . - start_msg
/* In GAS, the . character is used to reference the current position in
   memory, so instead of creating the label end_start_msg and writing
   "len_start_msg = (end_start_msg - start_msg)" we can just write this as
   shorthand. */
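The comment on ldr r1, =hello_msg describes how the assembler stores the label address at the end of the .text section and loads it relative to PC. The sketch below writes that step out by hand so the mechanism is visible. It is a minimal illustration, not what GAS necessarily emits byte for byte. The label hello_msg_addr is a hypothetical name chosen for this example, and the syscall numbers are written as plain immediates instead of the b_WRITE and b_STDOUT constants.

```asm
.section .data
hello_msg:
    .ascii "Hello World!\n"
len_hello_msg = . - hello_msg   /* 13 bytes */

.section .text
.global _start
_start:
    mov r7, #0x04               /* write syscall number */
    mov r0, #0x01               /* fd 1 is stdout */
    ldr r1, hello_msg_addr      /* PC-relative load of the word stored at
                                   hello_msg_addr (hypothetical label) */
    mov r2, #len_hello_msg      /* length is a known constant, so an
                                   immediate is enough */
    svc #0x00000000

    mov r7, #0x01               /* exit syscall number */
    mov r0, #0x00               /* error_code 0 */
    svc #0x00000000

/* Hand-written literal: one word in .text holding the address of
   hello_msg, which lives in .data. It sits after the exit syscall so
   execution never reaches it. */
hello_msg_addr:
    .word hello_msg
```

Writing ldr r1, =hello_msg lets the assembler create and place this word for you, which is why the pseudo-instruction is normally preferred. The manual form is mainly useful for reading disassembly, where the load shows up as a PC-relative ldr followed by a data word.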