1 / 120

Introduction to Assembly Language IA32

Introduction to Assembly Language IA32. Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University. Course Objectives. The better knowledge of computer systems, the better programing. Course Contents. Introduction to computer systems: B&O 1

ashanti
Télécharger la présentation

Introduction to Assembly Language IA32

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Assembly Language IA32 Winter 2013 COMP 2130 Intro Computer Systems Computing Science Thompson Rivers University

  2. Course Objectives • The better knowledge of computer systems, the better programing. IA32

  3. Course Contents • Introduction to computer systems: B&O 1 • Introduction to C programming: K&R 1 – 4 • Data representations: B&O 2.1 – 2.4 • C: advanced topics: K&R 5.1 – 5.10, 6 – 7 • Introduction to IA32 (Intel Architecture 32): B&O 3.1 – 3.8, 3.13 • Compiling, linking, loading, and executing: B&O 7 (except 7.12) • Dynamic memory management – Heap: B&O 9.9.1 – 9.9.2, 9.9.4 – 9.9.5, 9.11 • Code optimization: B&O 5.1 – 5.6, 5.13 • Memory hierarchy, locality, caching: B&O 5.12, 6.1 – 6.3, 6.4.1 – 6.4.2, 6.5, 6.6.2 – 6.6.3, 6.7 • Virtual memory (if time permits): B&O 9.4 – 9.5 IA32

  4. Unit Learning Objectives • Translate a simple C code into an assembly code. • Understand how a computer system runs an application. • Prerequisite for compilers and operating systems • Analyze assembly program code. • List registers used in IA32. • Compute memory address. • In function call, understand the use of user stack for variables, with %esp and %ebp • List two data types. • List three operations types. • List three operand types. • Use different memory addressing modes. IA32

  5. Use movel, leal, addl, subl, ... • Convert simple machine code to C code. • Convert if-else to if and goto version. • Convert if and goto version to if-else, do-while, while, or for version. • Understand the use of stack for function call. IA32

  6. Unit Contents • IA32 (Intel Architecture 32) • Program Encodings • Data Movement • Data Layout and Access • Arithmetic and Logical Operations • Control Structures • Function Calls, and Runtime Stack • Array Allocation and Access IA32

  7. Why should we spend our time learning machine code? • Even though compilers do most of the work in generating assembly code, being able to read and understand it is an important skill for serious programmers. • By reading assembly code, • We can understand the optimization capabilities of the compiler and analyze the underlying inefficiencies in the code. • Code optimization in Chapter 5. • We can understand the function invocation mechanism. • We can help ourselves understand how computer systems (HW) and operating systems (SW) run programs. • IA32 • Traditional x86 • 32bit • Linux uses what is referred to as flat addressing, where the entire memory space is viewed by the programmer as a large array of bytes. IA32

  8. 3.2 Program Encodings • Machine Programming: We will just study basic topics. • Many slides from CMU • 15-213/18-243: Introduction to Computer Systems • Sep. 2, 2010 • Randy Bryant and Dave O’Hallaron IA32

  9. Programmer-Visible State PC: Program counter Address of next instruction in memory Called “EIP” (IA32) or “RIP” (x86-64) Registerfile Heavily used program data Condition codes Store status information about most recent arithmetic operation Used for conditional branching, e.g., if Memory Byte addressable array Code, user data, (some) OS data Includes stack used to support procedures Carnegie Mellon Assembly Programmer’s View Memory CPU Addresses Registers Object Code Program Data OS Data PC Data Condition Codes Instructions Stack

  10. Carnegie Mellon Turning C into Object Code • Code in files p1.c p2.c • Compile with command: $ gcc –O1 p1.c p2.c -o p • Use basic optimizations (-O1) • Put resulting binary in file p text C program (p1.c p2.c) Compiler (gcc -S) text Asm program (p1.s p2.s) Assembler (gcc or as) binary Object program (p1.o p2.o) Static libraries (.a) Linker (gcc orld) binary Executable program (p)

  11. Carnegie Mellon Compiling Into Assembly Generated IA32 Assembly C Code sum: pushl %ebp movl %esp,%ebp movl 12(%ebp),%eax addl 8(%ebp),%eax popl %ebp ret int sum(int x, int y) { int t = x+y; return t; } pushl, movl, addl, popl, ret are mnemonics for machine instructions. Some compilers use instruction “leave” • Obtain with command • $ gcc –O1 -S code.c • Produces file code.s

  12. In order to support IA32 on 64bit Ubuntu, e.g., cs.tru.ca, we need to install libc6-dev:i386 and gcc-multilib. • How to create assembly code on cs.tru.ca? • $ gcc –O1 –S code.c –m32 • In code.s on cs.tru.ca, you will see • .cfi_startproc for // CFI (call frame information) directive // for exception handling on Linux pushl %ebp movel %esp, $ebp • .cfi_endproc for // CFI directive popl %ebp IA32

  13. code.s on cs.tru.ca, by using $ gcc –O1 –S code.c –m32 .file "code.c" .text .globl sum .type sum, @function sum: .LFB0: .cfi_startproc movl 8(%esp), %eax addl 4(%esp), %eax ret .cfi_endproc .LFE0: .size sum, .-sum .ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3" .section .note.GNU-stack,"",@progbits int sum(int x, int y) { int t = x+y; return t; } IA32

  14. Carnegie Mellon Assembly Characteristics: Data Types • “Integer” data of 1, 2, or 4 bytes • Data values • Addresses • Floating point data of 4, 8, or 10 bytes • No aggregate types such as arrays or structures • Just contiguously allocated bytes in memory

  15. Carnegie Mellon Assembly Characteristics: Operation Types • Perform arithmetic function on register or memory data • Transfer data between memory and register • Load data from memory into register • Store register data into memory • Transfer control • Unconditional jumps to/from procedures • Conditional branches

  16. Carnegie Mellon Object Code Code for sum • Assembler • Translates .s into .o • Binary encoding of each instruction • Nearly-complete image of executable code • Missing linkages between code in different files • Linker • Resolves references between files • Combines with static run-time libraries • E.g., code for malloc, printf • Some libraries are dynamically linked • Linking occurs when program begins execution 0x401040 <sum>: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3 • Total of 11 bytes • Each instruction 1, 2, or 3 bytes • Starts at address 0x401040

  17. Carnegie Mellon Machine Instruction Example int t = x+y; • C Code • Add two signed integers • Assembly • Add 2 4-byte integers • “Long” words in GCC parlance • Same instruction whether signed or unsigned • Operands: x: Register %eax y: Memory M[%ebp+8] t: Register %eax • Return function value in %eax • Object Code • 3-byte instruction • Stored at address 0x80483ca addl 8(%ebp),%eax Similar to expression: x += y; More precisely: int eax; int *ebp; eax += ebp[2]; // *(ebp+2) 0x80483ca: 03 45 08

  18. Carnegie Mellon Disassembling Object Code Disassembled • Disassembler $ objdump -d prog • Useful tool for examining object code • Analyzes bit pattern of series of instructions • Produces approximate rendition of assembly code • Can be run on either a.out (complete executable) or .o file 0x80483c4 <sum>: 80483c4: 55 push %ebp 80483c5: 89 e5 mov %esp,%ebp 80483c7: 8b 45 0c mov 0xc(%ebp),%eax 80483ca: 03 45 08 add 0x8(%ebp),%eax 80483cd: 5d pop %ebp 80483ce: c3 ret

  19. Carnegie Mellon Alternate Disassembly Object Disassembled • Within gdb Debugger $ gdb prog disassemble sum • Disassemble procedure x/11xb sum • Examine the 11 bytes starting at sum 0x401040: 0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3 Dump of assembler code for function sum: 0x080483c4 <sum+0>: push %ebp 0x080483c5 <sum+1>: mov %esp,%ebp 0x080483c7 <sum+3>: mov 0xc(%ebp),%eax 0x080483ca <sum+6>: add 0x8(%ebp),%eax 0x080483cd <sum+9>: pop %ebp 0x080483ce <sum+10>: ret

  20. Carnegie Mellon What Can be Disassembled? • Anything that can be interpreted as executable code • Disassembler examines bytes and reconstructs assembly source % objdump -d WINWORD.EXE WINWORD.EXE: file format pei-i386 No symbols in "WINWORD.EXE". Disassembly of section .text: 30001000 <.text>: 30001000: 55 push %ebp 30001001: 8b ec mov %esp,%ebp 30001003: 6a ff push $0xffffffff 30001005: 68 90 10 00 30 push $0x30001090 3000100a: 68 91 dc 4c 30 push $0x304cdc91

  21. 3.3 Data Format • Integer data of 1, 2, or 4 bytes • Data values • Addresses • Floating point data of 4, 8, or 10 bytes • No aggregate types such as arrays or structures • Just contiguously allocated bytes in memory • movl 12(%ebp), %eax • addl 8(%ebp), %eax IA32

  22. 3.4 Accessing Information • Assembly operation types • Perform arithmetic function on register or memory data • Transfer data between memory and register • Load data from memory into register • Store register data into memory • Store register data into register • But no memory to memory • Transfer control • Unconditional jumps to/from procedures • Conditional branches • Examples movl 12(%ebp), %eax addl 8(%ebp), %eax • What kind of registers are in IA32? IA32

  23. %eax %ecx %edx %ebx %esi %edi %esp %ebp Carnegie Mellon Integer Registers (IA32) Origin (mostly obsolete) %ax %ah %al accumulate %cx %ch %cl counter %dx %dh %dl data general purpose %bx %bh %bl base source index %si destination index %di stack pointer %sp base pointer %bp 16-bit virtual registers (backwards compatibility)

  24. 3.4.1, 3.4.2, and 3.4.3 IA32

  25. %eax %ecx %edx %ebx %esi %edi %esp %ebp Carnegie Mellon Moving Data: IA32 • Moving Data (data transfer operations) movl Source, Dest: • Operand Types • Immediate: Constant integer data • Example: $0x400, $-533 • Like C constant, but prefixed with ‘$’ • Encoded with 1, 2, or 4 bytes • Register: One of 8 integer registers • Example: %eax, %edx • But %esp and %ebp reserved for special use • Others have special uses for particular instructions • Memory:4 consecutive bytes of memory at address given by register • Simplest example: (%eax) • Various other “address modes”

  26. Carnegie Mellon movl Operand Combinations Assumption: %eax <- temp1; %edx <- temp2; %ecx <- p; Cannot do memory-memory transfer with a single instruction How to do memory-memory transfer? Source Dest Src, Dest C Analog Reg movl $0x4,%eax temp1 = 0x4; Imm Mem movl $-147,(%ecx) *p = -147; Reg movl %eax,%edx temp2 = temp1; movl Reg Mem movl %eax,(%ecx) *p = temp1; Mem Reg movl (%ecx),%eax temp1 = *p;

  27. Carnegie Mellon Simple Memory Addressing Modes • Normal (R) Mem[Reg[R]] • Register R specifies memory address.movl (%ecx),%eax • Displacement D(R) Mem[Reg[R]+D] • Register R specifies start of memory region. • Constant displacement D specifies offset.movl 8(%ebp),%edx ??? %edx = M[%ebp + d];

  28. Practice Problem 3.1 • Address Value Register Value 0x100 0xFF %eax 0x100 0x104 0xAB %ecx 0x1 0x108 0x13 %edx 0x3 • Operand Value %eax ??? $0x108 (%eax) 4(%eax) IA32

  29. Carnegie Mellon Using Simple Addressing Modes swap: pushl %ebp movl %esp,%ebp pushl %ebx movl 8(%ebp), %edx movl 12(%ebp), %ecx movl (%edx), %ebx movl (%ecx), %eax movl %eax, (%edx) movl %ebx, (%ecx) popl %ebx popl %ebp ret Set Up void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Body Finish

  30. Increasing address %eax %eax %eax 0x123 0x123 0x123 %edx %edx %edx 0 0 0x123 %esp %esp %esp 0x108 0x104 0x108 • Figure 3.5 Illustration of stack operation Initially pushl %eax popl %edx Stack “bottom” Stack “bottom” Stack “bottom” • • • • • • • • • 0x108 0x108 0x108 0x123 0x123 Stack “top” 0x104 Stack “top” Stack “top”

  31. pushl %eax is equivalent to subl $4, %esp movl %eax, (%esp) • popl %edxis equivalent to movl (%esp), %edx addl $4, %esp

  32. Intermediate Review • Registers • %eax, %ebx, %ecx, %edx, %esp, %ebp • Data transfer instruction • movl ... • Operand types • Constant values • Memory • Register • Addressing mode • movl (%ecx),%eax • movl 8(%ebp),%edx • Stack operations • pushl ... • popl ... IA32

  33. Carnegie Mellon Using Simple Addressing Modes • • • swap: pushl %ebp movl %esp,%ebp pushl %ebx movl 8(%ebp), %edx movl 12(%ebp), %ecx movl (%edx), %ebx movl (%ecx), %eax movl %eax, (%edx) movl %ebx, (%ecx) popl %ebx popl %ebp ret Set Up void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Offset from %ebp yp: &y 12 xp: &x 8 Rtn adr 4 Body Stack (in memory) Old %ebp 0 %ebp Old %ebx -4 %esp Finish Prepared by the caller function %esp initially

  34. Carnegie Mellon Understanding Swap • • • void swap(int *xp, int *yp) { int t0 = *xp; int t1 = *yp; *xp = t1; *yp = t0; } Stack (in memory) Offset from %ebp yp: &y 12 Prepared by the caller function xp: &x 8 Rtn adr 4 Old %ebp 0 %ebp How to use the variables? Old %ebx -4 %esp Register Value %edx xp %ecx yp %ebx t0 %eax t1 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0

  35. %eax %edx %ecx %ebx %esi %edi 0x100 %esp %ebp 0x104 Carnegie Mellon Understanding Swap Address 123 0x124 456 0x120 0x11c Offset from %ebp 0x118 0x114 yp 12 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0

  36. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 123 0x124 456 0x120 0x11c 0x118 Offset 0x124 0x114 yp 12 0x120 0x120 0x110 xp 8 0x124 0x10c 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  37. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 123 0x124 456 0x120 0x11c 0x118 Offset 0x124 0x114 yp 12 0x120 0x110 0x120 xp 8 0x124 0x124 0x10c 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  38. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 123 0x124 456 456 0x120 0x11c 0x118 Offset 0x124 0x114 yp 12 0x120 0x110 0x120 xp 8 0x124 0x10c 123 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  39. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 123 123 0x124 456 0x120 0x11c 456 0x118 Offset 0x124 0x114 yp 12 0x120 0x110 0x120 xp 8 0x124 0x10c 123 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  40. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 456 0x124 456 0x120 0x11c 456 456 0x118 Offset 0x124 0x114 yp 12 0x120 0x110 0x120 xp 8 0x124 0x10c 123 123 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  41. %eax %edx %ecx %ebx %esi %edi %esp %ebp Carnegie Mellon Understanding Swap Address 456 0x124 123 0x120 0x11c 456 0x118 Offset 0x124 0x114 yp 12 0x120 0x110 0x120 xp 8 0x124 0x10c 123 123 4 Rtn adr 0x108 0 %ebp 0x104 -4 0x100 0x100 movl 8(%ebp), %edx # edx = xp movl 12(%ebp), %ecx # ecx = yp movl (%edx), %ebx # ebx = *xp (t0) movl (%ecx), %eax # eax = *yp (t1) movl %eax, (%edx) # *xp = t1 movl %ebx, (%ecx) # *yp = t0 0x104

  42. Carnegie Mellon Complete Memory Addressing Modes • Most General Form D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+D] • D: Constant “displacement” 1, 2, or 4 bytes • Rb: Base register: Any of 8 integer registers • Ri: Index register: Any, except for %esp • Unlikely you’d use %ebp, either • S: Scale: 1, 2, 4, or 8 • E.g., 8(%eax, %edx, 4) -> M[%eax + 4 * %edx + 8] • Special Cases (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]] D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D] (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]

  43. Practice Problem 3.1 • Address Value Register Value 0x100 0xFF %eax 0x100 0x104 0xAB %ecx 0x1 0x108 0x13 %edx 0x3 • Operand Value %eax ??? $0x108 (%eax) 4(%eax) 9(%eax, %edx) 0xFC(, %ecx, 4) (%eax, %edx, 4) IA32

  44. Carnegie Mellon Address Computation Examples

  45. 3.5 Arithmetic & Logical Operations IA32

  46. 3.5.1 Load Effective Address • The load effective address instruction leal is a variant of the movl instruction. • It has the form of an instruction that reads from memory to a register, but it does not reference memory at all. Just address computation. • Example, If %edx contains value x and %ecx contains y, then leal 7(%edx, %edx, 4) %eax x + 4x + 7 leal 6(%eax), %edx ??? leal (%eax, %ecx, 4), %edx ??? leal 0xA(, %ecx, 4), %edx ??? IA32

  47. Carnegie Mellon Address Computation Instruction • lealSrc,Dest • Src is address mode expression • Set Dest to address denoted by expression • Uses • Computing addresses without a memory reference • E.g., translation of p = &x[i]; • Computing arithmetic expressions of the form x + k*y • k = 1, 2, 4, or 8 • Example int mul12(int x) { return x*12; } Converted to ASM by compiler, assuming %eax has x: leal (%eax,%eax,2), %eax ;t <- x+2*x sall $2, %eax ;return t<<2

  48. 3.5.2,3 Unary, Binary, Shift Operations IA32

  49. Carnegie Mellon Some Arithmetic Operations • Two Operand Instructions: FormatComputation addlSrc,Dest Dest = Dest + Src sublSrc,Dest Dest = Dest  Src imullSrc,Dest Dest = Dest * Src sallSrc,Dest Dest = Dest << Src Also called shll sarlSrc,Dest Dest = Dest >> Src Arithmetic shrlSrc,Dest Dest = Dest >> Src Logical(filled with 0) xorlSrc,Dest Dest = Dest ^ Src andlSrc,Dest Dest = Dest & Src orlSrc,Dest Dest = Dest | Src • Watch out for argument order!

  50. Carnegie Mellon Some Arithmetic Operations • One Operand Instructions inclDestDest = Dest + 1 declDestDest = Dest  1 neglDestDest = Dest notlDestDest = ~Dest • See book for more instructions

More Related