x86 Assembly

x86 Assembly Source Code
An assembler converts assembly source code into machine code that the target CPU architecture understands and writes it to an object file, which is then linked into a file format that the OS can execute.

The assembler:
  • Translates mnemonic opcodes into machine code
  • Resolves symbolic addresses and labels to actual memory addresses
  • Calculates relative offsets for branching instructions (calls, jumps, etc.)
  • Processes assembler directives that define data, set memory alignment, and specify output file sections
  • Generates output files that can be linked together to create an executable program

Conceptually, the assembler works through an assembly source file in roughly this order:
  1. Process assembler directives
  2. Assign memory addresses to labels
  3. Calculate the size and layout of the program in memory
  4. Translate opcodes to machine code
  5. Resolve symbolic addresses
  6. Generate an output file
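
The steps above can be illustrated with a minimal sketch. It assumes x86-64 Linux and AT&T syntax, neither of which is stated in the text, and the label name skip is made up for illustration. The jump refers to a label that is defined later in the file, so the assembler has to lay out the code and assign an address to the label before it can compute the jump's relative offset.

```asm
# Sketch only: assumes x86-64 Linux and AT&T syntax.
        .globl  _start
        .text
_start:
        jmp     skip            # forward reference: 'skip' is defined further down
        nop                     # never executed; it only gives the jump a distance to cover
skip:
        movl    $60, %eax       # exit system call number on x86-64 Linux
        xorl    %edi, %edi      # exit status 0
        syscall
```

Assuming the file is saved as demo.s, it could be assembled with `as -o demo.o demo.s` and linked with `ld -o demo demo.o`. Disassembling the object file with `objdump -d demo.o` should show the jump encoded with a small relative displacement rather than an absolute address.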

Throughout this course we will be using the GNU Assembler (GAS). GAS has many directives, but some of the more common ones include:
  • .align n: Align the next data item on an n-byte boundary (on x86 ELF targets; some other targets treat n as a power of two).
  • .ascii "string": Store the string in memory without a null terminator.
  • .asciz "string": Store the string in memory with a null terminator.
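  • .balign n: Align the next item on an n-byte boundary. Unlike .align, whose argument is a power of two on some targets, n is always a byte count. Padding is zeros by default, or NOP instructions in code sections.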
  • .byte n1, n2, ...: Store a sequence of 8-bit bytes in memory.
  • .comm symbol, length: Declare a common symbol of the specified length in bytes; the linker may merge it with a symbol of the same name in another object file.
  • .data: Switch to the data section for subsequent data items.
  • .equ symbol, expression: Set the value of a symbol to a constant expression.
  • .fill repeat, size, value: Emit repeat copies of a size-byte item, each initialized to the given value.
  • .globl symbol: Mark a symbol as global, making it accessible by other object files during the linking process.
  • .local symbol: Mark a symbol as local, meaning it will not be accessible by other object files.
  • .long n1, n2, ...: Store a sequence of 32-bit integers in memory.
  • .org new_location: Advance the location counter of the current section to new_location; it cannot move backwards.
  • .section name, flags: Switch to a named section with the specified flags.
  • .short n1, n2, ...: Store a sequence of 16-bit integers in memory.
  • .size symbol, expression: Set the size of a symbol to the given expression.
  • .space n: Insert n bytes of zero-initialized space into the output.
  • .string "string": Same as .asciz: store the string in memory with a null terminator.
  • .text: Switch to the text section for subsequent instructions.
  • .word n1, n2, ...: Store a sequence of 16-bit or 32-bit integers in memory, depending on the target architecture (16-bit on x86).
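
A few of the directives above can be seen working together in a short sketch. It assumes x86-64 Linux and AT&T syntax, and the symbol names msg, msg_len, counter, and buffer are made up for illustration. The counter and buffer items are not used by the code; they are there only to show the data directives.

```asm
# Sketch only: assumes x86-64 Linux and AT&T syntax; all symbol names are hypothetical.
        .section .rodata
msg:
        .ascii  "hello\n"           # 6 bytes, no null terminator
        .equ    msg_len, . - msg    # current location minus the start of msg

        .data
        .balign 4                   # place counter on a 4-byte boundary
counter:
        .long   0                   # one 32-bit integer
buffer:
        .space  16                  # 16 zero-initialized bytes

        .text
        .globl  _start              # make the entry point visible to the linker
_start:
        movl    $1, %eax            # write system call number on x86-64 Linux
        movl    $1, %edi            # file descriptor 1 (standard output)
        leaq    msg(%rip), %rsi     # address of the string
        movl    $msg_len, %edx      # number of bytes to write
        syscall
        movl    $60, %eax           # exit system call number on x86-64 Linux
        xorl    %edi, %edi          # exit status 0
        syscall
```

Assembled with `as` and linked with `ld` on such a system, the program is expected to write the string to standard output and exit. Using .ascii with an explicit length from .equ avoids relying on a null terminator, which the write system call does not need.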