Hardware Knowledge You Have to Know Before Assembly

Summary

After leaving the abstract world of high-level programming languages, you plan to start with assembly.

In assembly, you need to manipulate registers and memory directly. Not only that, but you also have to deal with stack alignment, the System V ABI, the call stack, and so on.

Prerequisite knowledge is crucial; without it, you will miss out on truly valuable ideas.

The least important thing is the language itself.

If you plan to start reverse engineering or pwn challenges, here is a guide for beginners like you.

Registers

The most intuitive concept is registers.

Registers are the fastest and smallest storage units, located directly inside the CPU. They hold memory addresses, data, and instructions.

Depending on the architecture, registers can be grouped by function into several classes.

General-Purpose Registers

Developers can use these registers to store return values, temporary data, function arguments, or operands for arithmetic operations.

You can operate on general-purpose registers directly.

For example, with instructions like these:

push rbp
mov rbp, rsp
pop rdi

You can easily modify the value these registers store.

Control Registers

Control Registers in the x86 architecture determine the global operating mode of the processor and control system-level features such as virtual memory management (paging), processor protection rings, caching strategies, and task switching.

PC

The most basic control register is the PC (Program Counter). It is not one of the x86 CRn registers described later, but it controls what the CPU executes next. Its name has historical reasons. Learn more: Wiki

On x86_64, the program counter is called RIP (the instruction pointer). It stores the address of the next instruction to execute.

The PC register also has different names on different architectures.

ARM (32-bit): R15

RISC-V: PC

PowerPC: IAR/NIA

CR0-CR8 (x86)

CR0 (System Status & Mode Switch): Controls the foundational operating modes and memory policies of the CPU. It manages the transition from Real Mode to Protected Mode (via the PE bit), enables Paging (via the PG bit), and enforces Write Protection (WP) to prevent Ring 0 from modifying read-only pages.

CR3 (Page Directory Base Register - PDBR): Holds the physical address of the root page table for the current process. It is the anchor for the Memory Management Unit (MMU) during address translation; loading a new value into CR3 effectively switches the virtual memory context during a process context switch.

CR2 (Page Fault Linear Address): Serves as a diagnostic register that automatically stores the exact linear (virtual) address that triggered a Page Fault (#PF). The kernel’s exception handler reads this register to determine which memory address needs to be resolved or swapped in.

CR4 (Architectural Extensions & Security): Toggles advanced hardware capabilities and critical kernel-level security mitigations. This includes enabling Physical Address Extension (PAE) and activating hardware defenses like SMEP (Supervisor Mode Execution Prevention) and SMAP (Supervisor Mode Access Prevention) to block local privilege escalation exploits.

CR8 (Task Priority Register): Available exclusively in 64-bit long mode to manage interrupt prioritization via the Local APIC, allowing the kernel to mask external interrupts below a specific priority threshold.

Register File

In computer architecture, the register file is a small, ultra-fast array of storage cells built directly into the CPU core. It holds the operands and results that the execution units (like the ALU) read and write every clock cycle.

In fact, the register file is what implements the registers: when you operate on rbp or rax in assembly, the register file is the underlying circuit that holds their values.

You don’t need to understand how the register file is implemented at first.

ELF

ELF (Executable and Linkable Format) is the default executable format on GNU/Linux.

Different operating systems use different executable formats.

GNU/Linux: ELF

Windows: PE

Darwin (macOS): Mach-O

ELF is the format you will meet most often on Linux, so I will only discuss ELF here.

Section (ELF)

const char *str = "Hello World";

This line of C code declares a pointer that points to a string literal. The string itself is typically stored in the .rodata section.

Don’t obsess over memorizing the names and functions of sections.

The core idea is: “everything is in memory, and sections are a way to organize the data.”

objdump is a tool that makes it easy to show each section:

objdump -s -j .rodata a.out