QEMU has an option that allows GDB to connect to it over a network socket.
To run our program in QEMU as a GDB server, enter:
qemu-arm -g 2345 hello_arm32 &
This launches our program in the background under QEMU, which listens on port 2345 and waits for GDB to connect.
Port 2345 is arbitrary and can be changed to whatever port you want to bind to.
Once QEMU is running, we will launch gdb-multiarch (GDB built with support for multiple architectures), load our binary so that GDB can read its symbols, and then connect to the process waiting in QEMU. To do so, enter:
$ gdb-multiarch
(gdb) file hello_arm32
(gdb) target remote localhost:2345
You should see output similar to the following:
pete@framework16:~/Documents/ASM/hello_world/ARM32$ gdb-multiarch
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word".
(gdb) file hello_arm32
Reading symbols from hello_arm32...
(No debugging symbols found in hello_arm32)
(gdb) target remote localhost:2345
Remote debugging using localhost:2345
0x00010074 in _start ()
(gdb)
Notice that we do not need to set a breakpoint and run the program,
because QEMU has already loaded it and halted at the _start label, waiting for us to connect.
We can now open our layouts with:
lay asm
lay reg
You should now have the familiar layout of registers, assembly, and commands.
Note that no register values will be shown yet, as we haven't stepped into an instruction.
Let's examine our first instruction:
> 0x10074 <_start> mov r7, #4
The assembly source was ldr r7, =b_WRITE, but because b_WRITE was a constant value,
the assembler translated this to just moving its immediate value into the register.
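Whether the assembler can make this substitution depends on whether the constant fits the ARM immediate encoding, which is an 8-bit value rotated right by an even number of bits. The Python sketch below only illustrates that rule; it is not the assembler's actual code, the function name fits_arm_immediate is made up for this example, and it ignores other tricks an assembler may use (such as emitting mvn when the bitwise complement fits). The value 0x200bc is just an example of a constant that does not fit.

```python
def fits_arm_immediate(value: int) -> bool:
    """Return True if value is an 8-bit constant rotated right by an even amount.

    This mirrors the immediate operand rule for ARM (A32) data-processing
    instructions such as mov. Hypothetical helper, for illustration only.
    """
    value &= 0xFFFFFFFF
    for rotation in range(0, 32, 2):
        # Rotating left by `rotation` undoes a rotate-right by the same amount.
        undone = ((value << rotation) | (value >> (32 - rotation))) & 0xFFFFFFFF
        if undone <= 0xFF:
            return True
    return False


print(fits_arm_immediate(4))        # True: ldr r7, =4 can become mov r7, #4
print(fits_arm_immediate(0x200bc))  # False: such a value has to be loaded from memory
```

When the constant does not fit, the assembler typically keeps a real ldr that reads the value from a nearby literal pool instead.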
The next instruction is the same as the first, so let's step into our instructions until we reach the third line:
-Register group: general----------------------------------------------------------------------------------
r0         0x1        1                  r1         0x40800b39 1082133305         r2         0x0        0
r3         0x0        0                  r4         0x0        0                  r5         0x0        0
r6         0x0        0                  r7         0x4        4                  r8         0x0        0
r9         0x0        0                  r10        0x200bc    131260             r11        0x0        0
r12        0x0        0                  sp         0x408009d0 0x408009d0         lr         0x0        0
pc         0x1007c    0x1007c <_start+8> cpsr       0x10       16                 fpscr      0x0        0
fpsid      0x410430f0 1090793712         fpexc      0x40000000 1073741824         AFSR0_EL1  0x0        0
AFSR1_EL1  0x0        0                  DBGDIDR    0x3515f021 890630177          DBGDSAR    0x0        0
DBGBVR     0x0        0                  DBGBCR     0x0        0                  DBGWVR     0x0        0
DBGWCR     0x0        0                  PAR        0x0        0                  DBGBVR     0x0        0
DBGBCR     0x0        0                  DBGWVR     0x0        0                  DBGWCR     0x0        0
TEECR      0x0        0                  MIDR_EL1   0x412fc0f1 1093648625         CTR        0x8444c004 -2075869180
TCMTR      0x0        0                  TTBR0_EL1  0x0        0                  PMCCNTR    0x0        0
TLBTR      0x0        0                  TTBR1_EL1  0x0        0                  MIDR       0x412fc0f1 1093648625
TTBCR      0x0        0                  MPIDR_EL1  0x80000000 -2147483648        TTBCR2     0x0        0
REVIDR_EL1 0x0        0                  MIDR       0x412fc0f1 1093648625         JIDR       0x0        0
CLIDR      0xa200023  169869347          DFAR       0x0        0                  WFAR       0x0        0
IFAR       0x0        0                  JMCR       0x0        0                  AIDR       0x0        0
CSSELR     0x0        0                  ID_PFR2    0x10       16                 VBAR       0x0        0
----------------------------------------------------------------------------------------------------------
    0x10074 <_start>              mov     r7, #4
    0x10078 <_start+4>            mov     r0, #1
  > 0x1007c <_start+8>            add     r1, pc, #36     ; 0x24
    0x10080 <_start+12>           ldr     r2, [pc, #44]   ; 0x100b4 <start_msg+12>
    0x10084 <_start+16>           svc     0x00000000
    0x10088 <write_hello_msg>     mov     r7, #4
    0x1008c <write_hello_msg+4>   ldr     r1, [pc, #36]   ; 0x100b8 <start_msg+16>
    0x10090 <write_hello_msg+8>   mov     r0, #1
    0x10094 <write_hello_msg+12>  mov     r2, #13
    0x10098 <write_hello_msg+16>  svc     0x00000000
    0x1009c <exit_normally>       mov     r7, #1
    0x100a0 <exit_normally+4>     mov     r0, #0
    0x100a4 <exit_normally+8>     svc     0x00000000
    0x100a8 <start_msg>           ; <UNDEFINED> instruction: 0x73207441
    0x100ac <start_msg+4>         ldrbtvc r6, [r2], #-372 ; 0xfffffe8c
    0x100b0 <start_msg+8>         andeq   r0, r0, r10
    0x100b4 <start_msg+12>        andeq   r0, r0, r9
    0x100b8 <start_msg+16>        strheq  r0, [r2], -r12
    0x100bc                       cfstr64vs mvdx6, [r12], #-288 ; 0xfffffee0
    0x100c0                       svcvs   0x0057206f
----------------------------------------------------------------------------------------------------------
remote Thread 1.383292 In: _start                                          L??   PC: 0x1007c
(gdb) lay reg
(gdb) si
0x00010078 in _start ()
(gdb) si
0x0001007c in _start ()
(gdb)
Our instruction adr r1, start_msg has been translated to add r1, pc, #36.
The add instruction takes a destination register for the result, followed by the two operands to add. In this instance, the immediate value #36 is added to the value of the pc (program counter) register.
This is where things can get confusing. While GDB currently lists the pc register as 0x1007c, when an ARM instruction reads pc it sees a value two instructions ahead of the one being executed. Since these are 32-bit instructions, the value read from pc is 8 bytes more than the address of our current line (32 bits * 2 = 64 bits = 8 bytes).
You might expect add r1, pc, #36 to store 0x100a0 (0x1007c + 0x24) in the register, but if we step forward one instruction:
r1 0x100a8 65704
We see that 0x100a8 (0x1007c + 8 + 0x24) is in fact stored in r1.
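The arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation described above, assuming 32-bit ARM (not Thumb) instructions; the helper name pc_relative_target is made up for this example. The addresses and offsets are taken from the assembly layout.

```python
# In ARM (A32) state, an instruction that reads pc sees its own address + 8.
ARM_PC_READ_AHEAD = 8


def pc_relative_target(instruction_address: int, offset: int) -> int:
    """Address produced by a pc-relative operand at instruction_address."""
    return instruction_address + ARM_PC_READ_AHEAD + offset


print(hex(0x1007c + 36))                     # 0x100a0, the naive expectation
print(hex(pc_relative_target(0x1007c, 36)))  # 0x100a8, what r1 actually holds
print(hex(pc_relative_target(0x10080, 44)))  # 0x100b4, matches ldr r2, [pc, #44]
```

The last line shows that the same rule explains the address GDB prints in the comment beside ldr r2, [pc, #44].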
If we look further down our assembly layout, we can see that it is our start_msg label:
0x100a8 <start_msg> ; <UNDEFINED> instruction: 0x73207441
Notice that the disassembler is attempting to interpret the data as instructions. This is
because the data resides in the .text section alongside our code, even though it does not
contain valid assembly instructions.
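To see that the word is really text, we can decode it ourselves. The Python sketch below assumes a little-endian target (the default for qemu-arm) and an ASCII string; the helper name word_to_ascii is made up for this example.

```python
def word_to_ascii(word: int) -> str:
    """Decode one 32-bit little-endian word as four ASCII characters."""
    raw = word.to_bytes(4, byteorder="little")
    return raw.decode("ascii")


# The "undefined instruction" GDB shows at start_msg:
print(repr(word_to_ascii(0x73207441)))  # 'At s'
```

The four bytes come out as readable characters, which is what we would expect for the start of a message string rather than an instruction.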