Architecture Fundamentals for Reverse Engineering on AArch64

Introduction

AArch64 is the 64-bit execution state of the Arm architecture, introduced with Armv8-A. It uses a new instruction set, A64, whose instructions are a fixed 32 bits wide, and provides 31 general-purpose 64-bit registers, Advanced SIMD (Neon) with 32 × 128-bit registers, and a new exception model with fewer banked registers and modes.

Instruction Set Architecture (ISA)

An instruction set architecture (ISA) is an abstract model of a computer. It defines the supported instructions, data types, and registers, the hardware support for managing main memory, and fundamental features such as memory consistency, addressing modes, virtual memory, and the input/output model shared by every implementation of that ISA.


Instructions in AArch64

Instruction: ADD Description: Adds two registers together and stores the result in a third register. Use Case: Used to perform addition operations on data.

ADD x2, x0, x1
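When reverse engineering, it helps to see how a line such as the one above maps onto the fixed 32-bit A64 instruction word. The following is a minimal sketch that decodes only the ADD (shifted register) form; the function name `decode_add_shifted_register` is made up for this illustration, and a real disassembler covers many more encodings.

```python
def decode_add_shifted_register(word):
    """Return assembly text for an ADD (shifted register) word, else None."""
    # Fixed bits for this form: op=0, S=0, bits 28-24 = 01011, bit 21 = 0.
    if (word >> 24) & 0x7F != 0b0001011 or (word >> 21) & 1:
        return None
    sf = (word >> 31) & 1           # 1 = 64-bit (x registers), 0 = 32-bit (w)
    shift = (word >> 22) & 0b11     # 0 = LSL, 1 = LSR, 2 = ASR, 3 = reserved
    rm = (word >> 16) & 0x1F        # second source register
    imm6 = (word >> 10) & 0x3F      # shift amount applied to Rm
    rn = (word >> 5) & 0x1F         # first source register
    rd = word & 0x1F                # destination register
    if shift == 0b11 or (sf == 0 and imm6 >= 32):
        return None                 # reserved encodings

    prefix = "x" if sf else "w"

    def reg(number):
        # In this instruction form, register number 31 is the zero register.
        return f"{prefix}zr" if number == 31 else f"{prefix}{number}"

    text = f"add {reg(rd)}, {reg(rn)}, {reg(rm)}"
    if imm6:
        text += f", {('lsl', 'lsr', 'asr')[shift]} #{imm6}"
    return text


print(decode_add_shifted_register(0x8B010002))  # add x2, x0, x1
```

The same five-bit register fields (Rd in bits 4-0, Rn in bits 9-5, Rm in bits 20-16) recur across most A64 data-processing instructions, which is why they are worth recognizing in a hex dump.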

Instruction: FADD Description: Adds two floating-point numbers together and stores the result in a third register. Use Case: Used to perform addition operations on floating-point data.

Instruction: SUB Description: Subtracts one register from another and stores the result in a third register. Use Case: Used to perform subtraction operations on data.

SUB x2, x0, x1

Instruction: FSUB Description: Subtracts one floating-point number from another and stores the result in a third register. Use Case: Used to perform subtraction operations on floating-point data.

Instruction: MUL Description: Multiplies two registers and stores the low bits of the product in a third register. Use Case: Used to perform multiplication operations on data.

MUL x2, x0, x1

Instruction: FMUL Description: Multiplies two floating-point numbers together and stores the result in a third register. Use Case: Used to perform multiplication operations on floating-point data.

Instruction: UDIV Description: Divides one register by another, treating both values as unsigned, and stores the quotient in a third register. Use Case: Used to perform unsigned division operations on data.

UDIV x2, x0, x1

Instruction: SDIV Description: Divides one register by another, treating both values as signed, and stores the quotient in a third register. Use Case: Used to perform signed division operations on data.

SDIV x2, x0, x1
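The two divide instructions can produce very different results from the same register contents, which matters when recovering the signedness of a variable from disassembly. The sketch below models the 64-bit behavior in Python; the helper names are illustrative, and it assumes the architectural rules that division by zero yields 0 rather than trapping and that signed division rounds toward zero.

```python
MASK64 = (1 << 64) - 1


def to_signed(value):
    """Interpret a 64-bit register value as a two's-complement integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value


def udiv(n, m):
    """Model of UDIV Xd, Xn, Xm: unsigned quotient, 0 when dividing by zero."""
    n &= MASK64
    m &= MASK64
    return 0 if m == 0 else n // m


def sdiv(n, m):
    """Model of SDIV Xd, Xn, Xm: signed quotient rounded toward zero."""
    n, m = to_signed(n), to_signed(m)
    if m == 0:
        return 0
    quotient = abs(n) // abs(m)
    if (n < 0) != (m < 0):
        quotient = -quotient
    return quotient & MASK64      # wrap back into a 64-bit register value


x0 = -8 & MASK64                  # register holding 0xFFFFFFFFFFFFFFF8
print(hex(udiv(x0, 2)))           # 0x7ffffffffffffffc
print(hex(sdiv(x0, 2)))           # 0xfffffffffffffffc, which is -4
```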

Instruction: NEG Description: Negates the content of a register (two's complement) and stores the result in a second register. Use Case: Used to change the sign of a value.

NEG x0, x1

Instruction: ASR Description: Shifts the bits in a register to the right by a specified number of bits and stores the result in a second register, while preserving the sign bit. Use Case: Used to shift bits in a register to the right while preserving the sign bit.

ASR x1, x0, #4

Instruction: AND Description: Performs a bitwise AND operation on two registers and stores the result in a third register. Use Case: Used to perform bitwise AND operations on data.

AND x2, x0, x1

Instruction: LSL Description: Shifts the bits in a register to the left by a specified number of bits and stores the result in a second register. Use Case: Used to shift bits in a register to the left.

LSL x1, x0, #4  

Instruction: LSR Description: Shifts the bits in a register to the right by a specified number of bits and stores the result in a second register. Use Case: Used to shift bits in a register to the right.

LSR x1, x0, #4  
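The difference between ASR and LSR only shows up when the top bit of the source is set. As a rough illustration, the following Python sketch models the three immediate shifts on 64-bit values (shift amounts 0 to 63); the function names simply mirror the mnemonics.

```python
MASK64 = (1 << 64) - 1


def lsl(value, amount):
    """Logical shift left: bits shifted out of the top are lost."""
    return (value << amount) & MASK64


def lsr(value, amount):
    """Logical shift right: zeros are shifted in from the top."""
    return (value & MASK64) >> amount


def asr(value, amount):
    """Arithmetic shift right: copies of the sign bit are shifted in."""
    value &= MASK64
    if value >> 63:
        value -= 1 << 64          # reinterpret as a negative integer
    return (value >> amount) & MASK64


x0 = 0xFFFFFFFFFFFFFF00           # -256 as a signed 64-bit value
print(hex(asr(x0, 4)))            # 0xfffffffffffffff0 (-16)
print(hex(lsr(x0, 4)))            # 0xffffffffffffff0 (a large positive value)
print(hex(lsl(x0, 4)))            # 0xfffffffffffff000
```

In compiler output, ASR usually indicates a signed division by a power of two or a signed variable, while LSR suggests unsigned data.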

Instruction: LDR Description: Loads a value from memory into a register. Use Case: Used to load data from memory into a register.

LDR x1, [x0]

Instruction: LDP Description: Loads two registers from consecutive memory locations. Use Case: Used to load data from memory into two registers.

LDP x2, x3, [x0]

Instruction: STR Description: Stores a value from a register into memory. Use Case: Used to store data from a register into memory.

STR x0, [x1]

Instruction: STP Description: Stores two registers to consecutive memory locations. Use Case: Used to store data from two registers into memory.

STP x0, x1, [x2]

Instruction: MOV Description: Moves the value of one register into another register. Use Case: Used to move data between registers.

MOV x1, x0

Instruction: MVN Description: Performs a bitwise NOT operation on a register and stores the result in another register. Use Case: Used to invert every bit of a value.

MVN x1, x0

Instruction: ORR Description: Performs a bitwise OR operation on two registers and stores the result in a third register. Use Case: Used to perform bitwise OR operations on data.

ORR x2, x0, x1

Instruction: EOR Description: Performs a bitwise exclusive OR operation on two registers and stores the result in a third register. Use Case: Used to perform bitwise exclusive OR operations on data.

EOR x2, x0, x1

Instruction: TST Description: Performs a bitwise AND operation on two operands and discards the result, updating only the condition flags. Use Case: Used to test whether particular bits are set before a conditional branch.

TST x0, x1

Instruction: CMP Description: Compares two registers and sets the condition flags based on the result. Use Case: Used to compare two registers and set the condition flags.

CMP x0, x1
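CMP behaves like a subtraction whose result is thrown away, leaving only the N, Z, C, and V condition flags. The sketch below is one way to model that in Python; `cmp_flags` is an illustrative name, and the `bits` parameter is an assumption added here to cover both x (64-bit) and w (32-bit) operands.

```python
def cmp_flags(a, b, bits=64):
    """Model the NZCV flags produced by CMP a, b (a subtraction a - b)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "N": bool(result & sign),                   # result is negative
        "Z": result == 0,                           # operands are equal
        "C": a >= b,                                # no borrow (unsigned a >= b)
        "V": bool((a ^ b) & (a ^ result) & sign),   # signed overflow
    }


print(cmp_flags(5, 5))   # {'N': False, 'Z': True, 'C': True, 'V': False}
print(cmp_flags(3, 5))   # {'N': True, 'Z': False, 'C': False, 'V': False}
```

Note that the carry flag is set when no borrow occurs, so C after CMP reads as "unsigned greater than or equal".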

Instruction: CBZ Description: Compares a register to zero and branches to a new location in the code if the register is zero. Use Case: Used to check if a register is zero and branch to a new location in the code if it is.

CBZ x0, L1

Instruction: CBNZ Description: Compares a register to zero and branches to a new location in the code if the register is not zero. Use Case: Used to check if a register is not zero and branch to a new location in the code if it is not.

CBNZ x0, L1

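Instruction: HINT Description: Encodes an operation that a processor may execute as a no-op when it does not implement it; NOP, YIELD, WFE, WFI, SEV, and SEVL are aliases of HINT with different immediates. Use Case: HINT #0 is NOP; other immediates signal spin-wait loops and wait-for-event or wait-for-interrupt states.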

HINT #0
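Disassemblers normally print the alias rather than the raw HINT form, so it can be useful to know how the immediate selects the alias. The following sketch decodes the hint encoding space; the alias table is deliberately limited to the six basic hints, and any other immediate is printed numerically rather than guessed.

```python
HINT_BASE = 0xD503201F            # encoding of HINT #0, which is NOP
HINT_ALIASES = {0: "nop", 1: "yield", 2: "wfe", 3: "wfi", 4: "sev", 5: "sevl"}


def decode_hint(word):
    """Return the mnemonic for a HINT-space instruction word, else None."""
    # The 7-bit immediate occupies bits 11-5; all other bits are fixed.
    if word & ~(0x7F << 5) != HINT_BASE:
        return None
    imm7 = (word >> 5) & 0x7F
    return HINT_ALIASES.get(imm7, f"hint #{imm7}")


print(decode_hint(0xD503201F))    # nop
print(decode_hint(0xD503203F))    # yield
```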

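Instruction: RET Description: Returns control to the calling function by branching to the address held in the link register (x30). Use Case: Used to return from a function; by convention the return value is left in x0 / w0.

    MOV w0, #1
    RET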

Instruction: ADR Description: Adds a 21-bit signed immediate to the current instruction’s address. Use Case: Load the address of a nearby label into a register.

ADR x1, message

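Instruction: ADRP Description: Shifts a 21-bit signed immediate left by 12 bits and adds it to the current instruction’s address with its low 12 bits cleared, producing the address of a 4 KB page within ±4 GB. Use Case: Load the page address of a label; a following ADD or LDR supplies the low 12 bits.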
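Resolving ADR and ADRP targets by hand is a common step when following data references in a disassembly. The sketch below shows the arithmetic in Python; the function names and the example addresses are invented for illustration, and `imm21` is the raw 21-bit immediate field from the instruction.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Sign-extend a `bits`-wide field to a Python integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def adr_target(pc, imm21):
    """ADR: PC plus a signed byte offset (range of about +/-1 MB)."""
    return (pc + sign_extend(imm21, 21)) & MASK64


def adrp_target(pc, imm21):
    """ADRP: 4 KB page of PC plus a signed page offset (about +/-4 GB)."""
    return ((pc & ~0xFFF) + (sign_extend(imm21, 21) << 12)) & MASK64


pc = 0x400120
print(hex(adr_target(pc, 0x10)))          # 0x400130
print(hex(adrp_target(pc, 1)))            # 0x401000
# ADRP followed by ADD x0, x0, #0x548 resolves to page + low 12 bits:
print(hex(adrp_target(pc, 1) + 0x548))    # 0x401548
```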

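Instruction: BL Description: Branches to a new location in the code and stores the return address in the link register (x30). Use Case: Used to call a subroutine and store the return address.

caller:
    STP x29, x30, [sp, #-16]!   // BL overwrites x30, so save the caller's return address
    MOV w0, #1
    BL my_function              // x30 = address of the next instruction
    MOV w1, w0
    LDP x29, x30, [sp], #16     // Restore the saved return address
    RET

my_function:
    MOV w0, #2
    RET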

Instruction: B Description: Branches to a new location in the code. Use Case: Used to change the flow of code execution.

B label1
label1:
    RET

Instruction: B.EQ Description: Branches to the target if the Z flag is set, which after CMP means the two operands were equal. Use Case: Used to take a branch when a comparison finds equal values.

    CMP w0, #0
    B.EQ label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.NE Description: Branches to the target if the Z flag is clear, which after CMP means the two operands were not equal. Use Case: Used to take a branch when a comparison finds different values.

    CMP w0, #0
    B.NE label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.GT Description: Branches to the target if the first operand of the preceding CMP was greater than the second, as signed values (Z clear and N equal to V). Use Case: Used to take a branch on a signed greater-than comparison.

    CMP w0, #0
    B.GT label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.GE Description: Branches to the target if the first operand of the preceding CMP was greater than or equal to the second, as signed values (N equal to V). Use Case: Used to take a branch on a signed greater-than-or-equal comparison.

    CMP w0, #0
    B.GE label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.LT Description: Branches to the target if the first operand of the preceding CMP was less than the second, as signed values (N not equal to V). Use Case: Used to take a branch on a signed less-than comparison.

    CMP w0, #0
    B.LT label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.LE Description: Branches to the target if the first operand of the preceding CMP was less than or equal to the second, as signed values (Z set, or N not equal to V). Use Case: Used to take a branch on a signed less-than-or-equal comparison.

    CMP w0, #0
    B.LE label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.PL Description: Branches to the target if the N flag is clear, meaning the result of the flag-setting instruction was positive or zero. Use Case: Used to take a branch when a result is not negative.

    CMP w0, #0
    B.PL label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET

Instruction: B.MI Description: Branches to the target if the N flag is set, meaning the result of the flag-setting instruction was negative. Use Case: Used to take a branch when a result is negative.

    CMP w0, #0
    B.MI label1
    B label2
label1:
    MOV w0, #1
    RET
label2:
    MOV w0, #2
    RET
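Each of the conditional branches above tests a fixed combination of the N, Z, C, and V flags. The following sketch tabulates those combinations for the eight conditions covered here; the flag values passed in are those a `CMP w0, #0` would produce, worked out by hand, and the helper names are illustrative.

```python
# Each entry maps a condition suffix to a test over the (N, Z, C, V) flags.
CONDITIONS = {
    "EQ": lambda n, z, c, v: z,
    "NE": lambda n, z, c, v: not z,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: not z and n == v,
    "LE": lambda n, z, c, v: z or n != v,
    "PL": lambda n, z, c, v: not n,
    "MI": lambda n, z, c, v: n,
}


def branch_taken(condition, n, z, c, v):
    """Return True if B.<condition> would branch with the given flags."""
    return bool(CONDITIONS[condition.upper()](n, z, c, v))


def taken_conditions(n, z, c, v):
    """List every condition in the table that holds for these flags."""
    return sorted(name for name in CONDITIONS if branch_taken(name, n, z, c, v))


# Flags after CMP w0, #0 with w0 == 0: N=0, Z=1, C=1, V=0
print(taken_conditions(False, True, True, False))    # ['EQ', 'GE', 'LE', 'PL']
# Flags after CMP w0, #0 with w0 == -1: N=1, Z=0, C=1, V=0
print(taken_conditions(True, False, True, False))    # ['LE', 'LT', 'MI', 'NE']
```

With a comparison against zero, B.MI and B.LT agree, but after a subtraction that overflows they can differ because B.LT also consults V.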

Registers in AArch64

Register: x0 - x30
Description: General-purpose registers.
Use Case: Used for general-purpose data storage; each is 64 bits wide, and w0 - w30 name the lower 32 bits.
Register: v0 - v31
Description: SIMD and floating-point registers.
Use Case: Used for floating-point and vector data; each is 128 bits wide and can also be accessed through b, h, s, d, and q views (for example s0 or d0).
Register: sp
Description: Stack pointer register.
Use Case: Used to point to the top of the stack.
Register: pc
Description: Program counter register.
Use Case: Holds the address of the instruction being executed; it is not a general-purpose register and changes only through branches, calls, returns, and exceptions.
Register: fp (x29)
Description: Frame pointer register.
Use Case: Used to store the address of the current stack frame.
Register: lr (x30)
Description: Link register.
Use Case: Used to store the return address from a subroutine call.
Register: PSTATE
Description: Process state, which replaces the AArch32 CPSR.
Use Case: Holds the current processor status, including the N, Z, C, and V condition flags.

Accessing registers

Register Description Use Case
x0 / w0 Argument 1 and return value Function argument, function result
x1 / w1 Argument 2 Function argument
x2 / w2 Argument 3 Function argument
x3 / w3 Argument 4 Function argument
x4 / w4 Argument 5 Function argument
x5 / w5 Argument 6 Function argument
x6 / w6 Argument 7 Function argument
x7 / w7 Argument 8 Function argument
x8 / w8 Indirect result location register Address of a large returned structure
x9 / w9 Temporary (caller-saved) Scratch
x10 / w10 Temporary (caller-saved) Scratch
x11 / w11 Temporary (caller-saved) Scratch
x12 / w12 Temporary (caller-saved) Scratch
x13 / w13 Temporary (caller-saved) Scratch
x14 / w14 Temporary (caller-saved) Scratch
x15 / w15 Temporary (caller-saved) Scratch
x16 / w16 Intra-procedure-call scratch register (IP0) Function call
x17 / w17 Intra-procedure-call scratch register (IP1) Function call
x18 / w18 Platform register (Thread Environment Block (TEB) on Windows) Platform-specific, for example thread-local storage
x19 / w19 Callee-saved Preserved across function calls
x20 / w20 Callee-saved Preserved across function calls
x21 / w21 Callee-saved Preserved across function calls
x22 / w22 Callee-saved Preserved across function calls
x23 / w23 Callee-saved Preserved across function calls
x24 / w24 Callee-saved Preserved across function calls
x25 / w25 Callee-saved Preserved across function calls
x26 / w26 Callee-saved Preserved across function calls
x27 / w27 Callee-saved Preserved across function calls
x28 / w28 Callee-saved Preserved across function calls
x29 / w29 Frame pointer Stack frame
x30 / w30 Link register Function call
Register number 31 Stack pointer (sp) or zero register (xzr / wzr), depending on the instruction Stack frame, constant zero
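The x and w names in the table refer to the same physical register: w is the low 32 bits of x, and writing a w register clears the upper 32 bits of the corresponding x register. A small Python model, using a hypothetical `Register` class, makes that aliasing concrete.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class Register:
    """Model of one general-purpose register with its x and w views."""

    def __init__(self):
        self.x = 0

    @property
    def w(self):
        # Reading wN returns the low 32 bits of xN.
        return self.x & MASK32

    def write_x(self, value):
        self.x = value & MASK64

    def write_w(self, value):
        # Writing wN zero-extends: the upper 32 bits of xN become 0.
        self.x = value & MASK32


r0 = Register()
r0.write_x(0xFFFFFFFFFFFFFFFF)
print(hex(r0.w))        # 0xffffffff
r0.write_w(1)
print(hex(r0.x))        # 0x1, the upper half was cleared
```

This is why a `mov w0, #1` in a function epilogue is enough to return a clean 64-bit 1 in x0.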

Data Processing

.data
values: .quad 40, 2

.text
.global _start

_start:
    // Load the address of the data into x0
    ldr x0, =values

    // Load a pair of values from memory
    ldp x1, x2, [x0]

    // Add two numbers
    add x3, x1, x2

    // Subtract two numbers
    sub x4, x3, x2

    // Negate a number
    neg x5, x4

    // Multiply two numbers
    mov x6, #4
    mul x5, x5, x6

    // Unsigned division
    mov x6, #2
    udiv x5, x5, x6

    // Signed division
    mov x6, #-3
    sdiv x5, x5, x6

    // Logical shift left
    mov x6, #2
    lsl x5, x5, x6

    // Logical shift right
    mov x6, #3
    lsr x5, x5, x6

    // Bitwise OR
    mov x6, #0x0F
    orr x5, x5, x6

    // Bitwise XOR
    mov x6, #0x55
    eor x5, x5, x6

    // Bitwise NOT
    mvn x5, x5

    // Compare two numbers (sets the condition flags)
    mov x6, #10
    cmp x5, x6

    // Store a value to memory
    str x5, [x0]

    // Conditional branch if zero (CBZ tests the register, not the flags)
    cbz x5, zero

    // Conditional branch if not zero
    cbnz x5, not_zero

zero:
    // Do something if x5 is zero
    b end

not_zero:
    // Do something if x5 is not zero
    b end

end:
    // Exit the program with status 0
    mov x0, #0
    mov x8, #93
    svc #0

Function Calls

.section .data
    a: .word 10
    b: .word 20
    c: .word 5
    d: .word 0

.section .text
    .global _start

_start:
    // Load the address of 'a' into 'x0'
    ldr x0, =a

    // Load the address of 'b' into 'x1'
    ldr x1, =b

    // Load the address of 'c' into 'x2'
    ldr x2, =c

    // Load the value at address 'a' into 'w3'
    ldr w3, [x0]

    // Load the value at address 'b' into 'w4'
    ldr w4, [x1]

    // Load the value at address 'c' into 'w5'
    ldr w5, [x2]

    // Add 'w3' and 'w4' and store the result in 'w6'
    add w6, w3, w4

    // Subtract 'w5' from 'w6' and store the result in 'w7'
    sub w7, w6, w5

    // Store the result in 'd' (the address of 'd' goes in 'x9')
    ldr x9, =d
    str w7, [x9]

    // Exit the program with status 0
    mov x0, #0
    mov x8, #93
    svc #0

Procedure Call Standard

.section .text
.global _start

_start:
    mov w0, #5                  // Input number
    bl factorial                // Call factorial function; result in w0

    // _start has no caller to return to, so exit through the Linux
    // exit system call, using the result (120) as the exit status
    mov x8, #93
    svc #0

factorial:
    cmp w0, #1                  // Check if input is 1 or less
    b.le .base_case             // If so, return 1

    stp x29, x30, [sp, #-16]!   // Save frame pointer and link register
    mov x29, sp                 // Set up this call's frame
    stp x19, x20, [sp, #-16]!   // Save the callee-saved register used below
                                // (x20 keeps sp 16-byte aligned)

    mov w19, w0                 // Keep the input across the recursive call
    sub w0, w0, #1              // Decrement input
    bl factorial                // Call factorial function recursively
    mul w0, w19, w0             // Multiply input by result of recursive call

    ldp x19, x20, [sp], #16     // Restore in reverse order
    ldp x29, x30, [sp], #16
    ret

.base_case:
    mov w0, #1                  // Return 1
    ret
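The prologue and epilogue pattern in the listing relies on two addressing modes: the pre-indexed store `stp ..., [sp, #-16]!` adjusts sp first and then stores, while the post-indexed load `ldp ..., [sp], #16` loads first and then adjusts sp. The following Python sketch models just that behavior; the `StackModel` class is hypothetical, and it assumes little-endian memory and 64-bit registers.

```python
import struct


class StackModel:
    """Toy model of a descending stack addressed through sp."""

    def __init__(self, size=256):
        self.memory = bytearray(size)
        self.sp = size                    # empty stack: sp at the top

    def stp_pre_index(self, first, second, offset):
        """stp first, second, [sp, #offset]!  (update sp, then store)."""
        self.sp += offset
        struct.pack_into("<QQ", self.memory, self.sp, first, second)

    def ldp_post_index(self, offset):
        """ldp a, b, [sp], #offset  (load, then update sp)."""
        pair = struct.unpack_from("<QQ", self.memory, self.sp)
        self.sp += offset
        return pair


stack = StackModel()
stack.stp_pre_index(0x1000, 0x2000, -16)      # save x29, x30
stack.stp_pre_index(0x19, 0x20, -16)          # save x19, x20
print(stack.sp)                               # 224
print(stack.ldp_post_index(16))               # (25, 32)
print(stack.ldp_post_index(16))               # (4096, 8192)
print(stack.sp)                               # 256
```

Registers come back in the reverse order of saving, and sp ends where it started, which is the invariant to check when a recovered function's frame looks unbalanced.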

System Calls

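.global main
main:
    // Open file: openat(AT_FDCWD, filename, O_WRONLY | O_CREAT, 0644)
    mov x0, #-100               // AT_FDCWD: path is relative to the current directory
    adr x1, filename
    mov x2, #0x41               // O_WRONLY (0x1) | O_CREAT (0x40)
    mov x3, #0x1a4              // File mode 0644 (octal)
    mov x8, #56                 // Linux AArch64 system call number: openat
    svc #0
    mov x9, x0                  // Keep the returned file descriptor

    // Write to file: write(fd, message, 14)
    mov x0, x9
    adr x1, message
    mov x2, #14                 // Length of "Hello, world!\n"
    mov x8, #64                 // write
    svc #0

    // Close file: close(fd)
    mov x0, x9
    mov x8, #57                 // close
    svc #0

    // Exit
    mov x0, #0x0
    ret

filename: .ascii "example.txt\0"
message: .ascii "Hello, world!\n\0"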
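On Linux for AArch64, the system call number is passed in x8 and the call is made with `svc #0`, as in the `mov x8, #93` exit sequences earlier in this article. When reading a disassembly, a quick way to label such calls is to track the last immediate moved into x8. The sketch below is a deliberately naive heuristic: the function name is invented, the table is a small subset of the Linux AArch64 system call numbers, and it scans linearly without following control flow.

```python
import re

# Small subset of the Linux AArch64 system call table.
SYSCALL_NAMES = {56: "openat", 57: "close", 63: "read", 64: "write", 93: "exit"}

# Matches "mov x8, #<imm>" or "mov w8, #<imm>" with a decimal or hex immediate.
MOV_X8 = re.compile(r"mov\s+[xw]8,\s*#(0x[0-9a-f]+|\d+)", re.IGNORECASE)


def annotate_syscalls(lines):
    """Append a comment naming the system call to each svc instruction."""
    last_x8 = None
    annotated = []
    for line in lines:
        text = line.strip()
        match = MOV_X8.match(text)
        if match:
            literal = match.group(1).lower()
            base = 16 if literal.startswith("0x") else 10
            last_x8 = int(literal, base)
        elif text.lower().startswith("svc") and last_x8 is not None:
            name = SYSCALL_NAMES.get(last_x8, f"syscall {last_x8}")
            line = f"{line}    // {name}"
        annotated.append(line)
    return annotated


listing = ["mov x0, #0", "mov x8, #93", "svc #0"]
print("\n".join(annotate_syscalls(listing)))
```

Other operating systems use different conventions (a different number register or `svc` immediate), so the table and the register tracked here would need to change for them.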

Disclaimer

DISCLAIMER: The information and code provided here are intended for educational and informational purposes only. The user assumes full responsibility for the use of this information and code. The provider of this information and code makes no representations or warranties, express or implied, about the completeness, accuracy, reliability, suitability, or availability of the information and code provided. Any reliance you place on such information and any use of this code is therefore strictly at your own risk. In no event will the provider be liable for any loss or damage including without limitation, indirect or consequential loss or damage, arising out of, or in connection with, the use of this information and code.