
Embedded Processor Architecture and Programming

Embedded Processor Architecture and Programming. 王建民, Institute of Information Science, Academia Sinica, July 2008. Contents: Introduction, Computer Architecture, ARM Architecture, Development Tools, GNU Development Tools, ARM Instruction Set, ARM Assembly Language, ARM Assembly Programming, GNU ARM ToolChain, Interrupts and Monitor.



Presentation Transcript


  1. Embedded Processor Architecture and Programming. 王建民, Institute of Information Science, Academia Sinica, July 2008

  2. Contents • Introduction • Computer Architecture • ARM Architecture • Development Tools • GNU Development Tools • ARM Instruction Set • ARM Assembly Language • ARM Assembly Programming • GNU ARM ToolChain • Interrupts and Monitor

  3. Lecture 7: ARM Assembly Language

  4. Outline • Coprocessor and Thumb Instructions • Assembly Language • Runtime Environment

  5. Coprocessors (1) • The ARM architecture supports 16 coprocessors • System coprocessor • Floating-point coprocessor • Application-specific coprocessor • A coprocessor may be implemented • in hardware • in software (via the undefined instruction exception) • in both (common cases in hardware, the rest in software) • Each coprocessor instruction set occupies part of the ARM instruction set.

  6. Coprocessors (2) • There are three types of coprocessor instruction • Coprocessor data processing • Coprocessor (to/from ARM) register transfers • Coprocessor memory transfers (load and store to/from memory) • Assembler macros can be used to transform custom coprocessor mnemonics into the generic mnemonics understood by the processor.

  7. Coprocessor Data Processing • This instruction initiates a coprocessor operation • The operation is performed only on internal coprocessor state • For example, a floating-point multiply, which multiplies the contents of two registers and stores the result in a third register • Syntax: • CDP{<cond>} <cp_num>,<opc_1>,CRd,CRn,CRm{,<opc_2>}

  8. Coprocessor Register Transfers
     • Instructions
       • MRC: Move to ARM Register from Coprocessor
       • MCR: Move to Coprocessor from ARM Register
     • An operation may also be performed on the data as it is transferred
       • Ex. a Floating Point Convert to Integer instruction can be implemented as a register transfer to ARM.
     • Syntax
         <MRC|MCR>{<cond>} <cp_num>,<opc_1>,Rd,CRn,CRm,<opc_2>
     • Instruction encoding
         Bits    Field    Meaning
         31-28   Cond     Condition code specifier
         27-24   1110     (fixed)
         23-21   opc_1    Opcode
         20      L        Transfer to/from coprocessor
         19-16   CRn      Coprocessor source/dest register
         15-12   Rd       ARM source/dest register
         11-8    cp_num   Coprocessor number
         7-5     opc_2    Opcode
         4       1        (fixed)
         3-0     CRm      Coprocessor source/dest register
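The field layout on this slide can be exercised with a small C sketch that packs the fields into a 32-bit instruction word. The function name `encode_coproc_transfer` and its parameter names are made up for this illustration; only the bit positions come from the slide.

```c
#include <stdint.h>

/* Illustrative helper (not part of any toolchain): pack the MRC/MCR
 * fields into one 32-bit instruction word. load = 1 selects MRC,
 * load = 0 selects MCR. */
uint32_t encode_coproc_transfer(unsigned cond, unsigned opc_1, unsigned load,
                                unsigned crn, unsigned rd, unsigned cp_num,
                                unsigned opc_2, unsigned crm)
{
    return ((uint32_t)(cond   & 0xFu) << 28) |  /* bits 31-28: condition  */
           ((uint32_t)0xEu            << 24) |  /* bits 27-24: fixed 1110 */
           ((uint32_t)(opc_1  & 0x7u) << 21) |  /* bits 23-21: opc_1      */
           ((uint32_t)(load   & 0x1u) << 20) |  /* bit  20:    L          */
           ((uint32_t)(crn    & 0xFu) << 16) |  /* bits 19-16: CRn        */
           ((uint32_t)(rd     & 0xFu) << 12) |  /* bits 15-12: Rd         */
           ((uint32_t)(cp_num & 0xFu) <<  8) |  /* bits 11-8:  cp_num     */
           ((uint32_t)(opc_2  & 0x7u) <<  5) |  /* bits 7-5:   opc_2      */
           ((uint32_t)1u              <<  4) |  /* bit  4:     fixed 1    */
            (uint32_t)(crm    & 0xFu);          /* bits 3-0:   CRm        */
}
```

For instance, `encode_coproc_transfer(0xE, 0, 1, 1, 0, 15, 0, 0)` corresponds to `mrc p15, 0, r0, c1, c0, 0` with the "always" condition and yields `0xEE110F10`.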

  9. Coprocessor Memory Transfers (1)
     • Load from memory to coprocessor registers
     • Store to memory from coprocessor registers
     • Instruction encoding
         Bits    Field    Meaning
         31-28   Cond     Condition code specifier
         27-25   110      (fixed)
         24      P        Pre/post increment
         23      U        Add/subtract offset
         22      N        Transfer length
         21      W        Base register writeback
         20      L        Load/store
         19-16   Rn       Base register
         15-12   CRd      Source/dest register
         11-8    cp_num   Coprocessor number
         7-0     Offset   Address offset

  10. Coprocessor Memory Transfers (2)
      • Syntax
          <LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<address>
        • PC-relative offset generated if possible, else causes an error.
          <LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<[Rn,offset]{!}>
        • Pre-indexed form, with optional writeback of the base register
          <LDC|STC>{<cond>}{<L>} <cp_num>,CRd,<[Rn],offset>
        • Post-indexed form
      • <L>, when present, causes a “long” transfer to be performed (N=1); otherwise a “short” transfer is performed (N=0).
        • The effect of this is coprocessor dependent.

  11. Thumb (1) • Thumb is a 16-bit instruction set • Optimized for code density from C code (~65% of ARM code size) • Improved performance from narrow memory (~160% of an equivalent ARM connected to 16-bit memory system) • Subset of the functionality of the ARM instruction set • Core has additional execution state - Thumb • It can switch back and forth between 16-bit and 32-bit instructions • Switch between ARM and Thumb using BX instruction

  12. Thumb (2) • For most instructions generated by compiler: • Conditional execution is not used • Source and destination registers identical • Only Low registers used • Constants are of limited size • Inline barrel shifter not used • Example: the 32-bit ARM instruction ADDS r2,r2,#1 corresponds to the 16-bit Thumb instruction ADD r2,#1

  13. Outline • Coprocessor and Thumb Instructions • Assembly Language • Runtime Environment

  14. The Programmer’s Model (1) • We will not be using the Thumb instruction set. • Memory Formats • We will be using the Little Endian format • the lowest numbered byte of a word is considered the word’s least significant byte, and the highest numbered byte is considered the most significant byte. • Instruction Length • All instructions are 32 bits long. • Data Types • 8-bit bytes and 32-bit words.
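The little-endian rule on this slide can be written out as a short C sketch. The helper name `store_le32` is invented for the illustration; it uses shifts so that it behaves the same on any host, whatever the host's own byte order.

```c
#include <stdint.h>

/* Illustrative helper: lay out a 32-bit word in little-endian order,
 * i.e. out[0] (the lowest-numbered byte) receives the least
 * significant byte and out[3] the most significant byte. */
void store_le32(uint32_t word, uint8_t out[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(word >> (8 * i));
    }
}
```

Storing `0x11223344` this way gives the byte sequence `44 33 22 11` from the lowest address upward.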

  15. The Programmer’s Model (2) • Processor Modes (of interest) • User: the “normal” program execution mode. • IRQ: used for general-purpose interrupt handling. • Supervisor: a protected mode for the operating system. • The Register Set • Registers R0-R15 + CPSR • R13: Stack Pointer • R14: Link Register • R15: Program Counter where bits 0:1 are ignored (why?)

  16. The Programmer’s Model (3) • Program Status Registers • CPSR (Current Program Status Register) • holds info about the most recently performed ALU operation • contains N (negative), Z (zero), C (Carry) and V (oVerflow) bits • controls the enabling and disabling of interrupts • sets the processor operating mode • SPSR (Saved Program Status Registers) • used by exception handlers • Exceptions • reset, undefined instruction, SWI, IRQ.

  17. Assembly Language Basics (1) • “Load/store” architecture • 32-bit instructions • 32-bit and 8-bit data types • 32-bit addresses • 37 registers (30 general-purpose registers, 6 status registers and a PC) • only a subset is accessible at any point in time • No instruction to move a 32-bit constant to a register (why?)
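One way to approach the closing "why?": every instruction is itself only 32 bits wide, so it cannot hold an opcode, a destination register and a full 32-bit constant at once. In the classic ARM data-processing encoding the immediate operand is an 8-bit value rotated right by an even amount (0 to 30). The C sketch below tests whether a constant fits that form; the function name `arm_imm_encodable` is hypothetical and chosen only for this illustration.

```c
#include <stdint.h>

/* Illustrative check: can `value` be written as an 8-bit constant
 * rotated right by an even number of bits (0, 2, ..., 30)?
 * Returns 1 if so, 0 otherwise. */
int arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Rotating left by rot undoes a rotate-right by rot.
         * rot == 0 is handled separately to avoid a 32-bit shift. */
        uint32_t v = rot ? (value << rot) | (value >> (32 - rot)) : value;
        if (v <= 0xFFu) {
            return 1;
        }
    }
    return 0;
}
```

Under this rule `0xFF000000` and `0x3FC` are encodable, while `0x101`, `0x102` and `0x12345678` are not; constants of the latter kind have to be built in several steps or loaded from memory.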

  18. Assembly Language Basics (2) • Conditional execution • Barrel shifter • scaled addressing, multiplication by a small constant, and ‘constant’ generation • Loading constants into registers • Loading addresses into registers • Load and Store Multiple instructions • Jump tables • Co-processor instructions (we will not use these)

  19. GNU ARM Assembler
      • You can assemble the contents of any ARM assembly language source file by executing the arm-elf-as program.
          arm-elf-as -mno-fpu -o filename.o filename.s
      • Though you can use the GNU Linker to create the final executable, it is preferred to use the GNU Compiler Collection to create an executable file.
          arm-elf-gcc -o filename.elf filename.s
      • To execute an ARM executable file
          arm-elf-run filename.elf

  20. Assembly Language Syntax
      • Each assembly line has the following format
          [<label>:] [<instruction or directive>] @ comment
      • A label can be any valid symbol followed by a :
        • Only use the alphabetic characters A-Z and a-z, the digits 0-9, as well as “_”, “.”, and “$”
      • An instruction is assembled into machine language code.
        • Begins with a letter
      • A directive guides the work of the assembler.
        • Begins with a .
      • A comment is anything that follows a @
        • C-style comments (using “/*” and “*/”) are also allowed

  21. Assembler Directives
      • Starting a new section
          .section name
      • Defining the code section of a program
          .text
      • Defining the initialized data section of a program
          .data
      • Defining the uninitialized data section of a program
          .bss
      • End of the assembly file (optional)
          .end

  22. Assembler Directives
      • Making a symbol available to other partial programs that are linked with it
          .global symbol
      • Declaring a symbol as externally defined (optional)
          .extern symbol
      • Aligning the address to a particular storage boundary which is a power of 2
          .align expression
      • Declaring a common symbol that may be merged
          .comm symbol,length,alignment

  23. Assembler Directives
      • Defining / initializing storage locations
          .word expression     @ 32 bits
          .hword expression    @ 16 bits
          .byte expression     @ 8 bits
      • Defining / initializing a string
          .ascii "string"
          .asciz "string"
      • Defining memory space
          .skip size
          .space size

  24. Assembler Directives
      • Directives similar to the statements that begin with “#” in the C programming language
          .include "file"
          .equ symbol, expression
          .set symbol, expression
          .if expression
          .ifdef symbol
          .ifndef symbol
          .else
          .endif

  25. The Structure of an Assembly Code
              .file   "sum2.s"
              .section .text      @ the code section
              .align  2           @ aligns the address to 4 bytes
              .global sum2        @ give the symbol an external linkage
          sum2:
              add     r0, r0, r1  @ add input arguments
              mov     pc, lr      @ return from subroutine
              .end                @ end of program
      Callouts on the slide:
      • Chunks of code or data manipulated by the linker (the section)
      • Minimum required block (why?)
      • First instruction to be executed (the instruction at sum2)

  26. Example #1: Finding the Larger One
      C caller:
          #include <stdio.h>
          extern int max2(int a, int b);
          int main()
          {
              int a = 12345;
              int b = 6789;
              printf("The maximum of %d and %d is %d\n", a, b, max2(a, b));
          }
      Assembly:
              .text
              .align  2
              .global max2
          max2:
              cmp     r0, r1      @ compare two numbers
              bge     done        @ if R0 contains the maximum
              mov     r0, r1      @ otherwise overwrite R0
          done:
              mov     pc, lr      @ return from subroutine

  27. Example #2: Finding the Largest
      C caller:
          #include <stdio.h>
          extern int maxn(int *a, int n);
          int a[6] = { 123, 34, 45, 56, 678, 9 };
          int main()
          {
              printf("The maximum of all numbers is %d\n", maxn(a, 6));
          }
      Assembly:
              .text
              .align  2
              .global maxn
          maxn:
              mov     r2, r0          @ R2 = pointer to the array
              mov     r3, r1          @ R3 = number of elements
              ldr     r0, [r2], #4    @ R0 = first number, advance pointer
          loop:
              subs    r3, r3, #1      @ reduce the count by 1
              beq     done            @ test if finished
              ldr     r1, [r2], #4    @ put next number in R1
              cmp     r0, r1          @ if R0 contains the larger
              movlt   r0, r1          @ otherwise overwrite R0
              b       loop            @ continue
          done:
              mov     pc, lr          @ return from subroutine
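As a cross-check for the assembly in Example #2, here is a C sketch that follows the same steps one for one. The name `maxn_model` is invented here so that it does not clash with the assembly routine. Like the assembly version, it assumes the array holds at least one element.

```c
/* Illustrative C model of the maxn loop; assumes n >= 1. */
int maxn_model(const int *a, int n)
{
    int best = *a++;            /* ldr r0, [r2], #4 : load, then advance  */
    while (--n != 0) {          /* subs r3, r3, #1 ; beq done             */
        int next = *a++;        /* ldr r1, [r2], #4                       */
        if (best < next) {      /* cmp r0, r1                             */
            best = next;        /* movlt r0, r1                           */
        }
    }
    return best;                /* result is returned in R0               */
}
```

With the array from the slide, `maxn_model(a, 6)` returns 678.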

  28. Does this work?
      • Instead of computing the larger number by itself, it may call max2 in Example #1 to find the larger number
              .text
              .align  2
              .global maxn
          maxn:
              mov     r2, r0
              mov     r3, r1
              ldr     r0, [r2], #4
          loop:
              subs    r3, r3, #1      @ reduce the count by 1
              beq     done            @ test if finished
              ldr     r1, [r2], #4    @ put next number in R1
              bl      max2            @ call max2 to find the larger
              b       loop            @ continue
          done:
              mov     pc, lr          @ return from subroutine

  29. Calling Another Function
      • Be careful with the registers used in a function, especially the link register!
              .text
              .align  2
              .global maxn
          maxn:
              mov     r2, r0
              mov     r3, r1
              mov     r5, lr          @ save the link register
              ldr     r0, [r2], #4
          loop:
              subs    r3, r3, #1      @ reduce the count by 1
              beq     done            @ test if finished
              ldr     r1, [r2], #4    @ put next number in R1
              bl      max2            @ call max2 to find the larger
              b       loop            @ continue
          done:
              mov     lr, r5          @ restore the link register
              mov     pc, lr          @ return from subroutine

  30. Example #3: Computing Factorial
      C caller:
          #include <stdio.h>
          extern int factor(int n);
          int main()
          {
              int n = 7;
              printf("The factorial of %d is %d\n", n, factor(n));
          }
      Assembly:
              .text
              .align  2
              .global factor
          factor:
              stmfd   sp!, {r0, lr}   @ push registers on stack
              subs    r0, r0, #1
              moveq   r0, #1          @ (n-1)! = 1 if n-1 == 0
              blne    factor          @ compute (n-1)! if n-1 != 0
              ldmfd   sp!, {r1, lr}   @ pop registers from stack
              mul     r0, r1, r0      @ compute n! = n * (n-1)! (Rd and Rm must differ before ARMv6)
          done:
              mov     pc, lr          @ return from subroutine
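The recursion in Example #3 can be mirrored in C to make clear what each stack frame preserves. This is only a sketch: the name `factor_model` and the local variable names are invented, and, like the assembly, it assumes n >= 1.

```c
/* Illustrative C model of the recursive factor routine; assumes n >= 1. */
int factor_model(int n)
{
    int saved_n = n;                 /* stmfd sp!, {r0, lr}: n is kept in this frame */
    int result;

    n = n - 1;                       /* subs r0, r0, #1                */
    if (n == 0) {
        result = 1;                  /* moveq r0, #1 : (n-1)! = 1      */
    } else {
        result = factor_model(n);    /* blne factor  : compute (n-1)!  */
    }
    return saved_n * result;         /* ldmfd ... ; mul : n * (n-1)!   */
}
```

Each active call holds its own `saved_n`, which is the role the pushed copy of R0 plays on the ARM stack; `factor_model(7)` returns 5040.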

  31. Assembly codes for if-statements
          Source                   Generated code
          if cond then             t1 = cond
                                   if not t1 goto else_label
            then_statements        codes for then_statements
                                   goto endif_label
          else                     else_label:
            else_statements        codes for else_statements
          end if;                  endif_label:
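The if/else scheme on this slide can be tried out directly in C by writing the generated code with `goto` and labels. The function `abs_diff_lowered` is a made-up example; the label names follow the slide.

```c
/* Illustrative lowering of:
 *     if (a > b) result = a - b; else result = b - a;
 * into the test-and-branch form shown on the slide. */
int abs_diff_lowered(int a, int b)
{
    int result;
    int t1 = (a > b);                /* t1 = cond                    */

    if (!t1) goto else_label;        /* if not t1 goto else_label    */
    result = a - b;                  /* codes for then_statements    */
    goto endif_label;
else_label:
    result = b - a;                  /* codes for else_statements    */
endif_label:
    return result;
}
```

Each C `goto` corresponds to one branch instruction in the assembly a compiler would emit.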

  32. Assembly codes for else-if parts
      • For each alternative, place in code the current else_label, and generate a new one.
          Source                     Generated code
          if cond1 then s1           t1 = cond1
                                     if not t1 goto else_label1
                                     codes for s1
                                     goto endif_label
          else if cond2 then s2      else_label1:
                                     t2 = cond2
                                     if not t2 goto else_label2
                                     codes for s2
                                     goto endif_label
          else s4                    else_label2:
                                     codes for s4
          end if;                    endif_label:

  33. Assembly codes for while loops
      • Create two labels: start_loop, end_loop
          Source                     Generated code
          while (cond) {             start_loop:
                                     if (!cond) goto end_loop
            s1;                      codes for s1
            if (cond2) break;        if (cond2) goto end_loop
            s2;                      codes for s2
            if (cond3) continue;     if (cond3) goto start_loop
            s3;                      codes for s3
          };                         goto start_loop
                                     end_loop:
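A small C sketch can show how `break` and `continue` turn into jumps to the two labels. The function `sum_odds_lowered` and its loop body are invented for the illustration: it adds up odd numbers from 1 to n and stops early once the running sum exceeds `limit`.

```c
/* Illustrative lowering of:
 *     while (i < n) {
 *         i++;
 *         if (sum > limit) break;
 *         if (i % 2 == 0) continue;
 *         sum += i;
 *     }
 */
int sum_odds_lowered(int n, int limit)
{
    int i = 0;
    int sum = 0;

start_loop:
    if (!(i < n)) goto end_loop;         /* loop condition          */
    i++;                                 /* codes for s1            */
    if (sum > limit) goto end_loop;      /* break                   */
    if (i % 2 == 0) goto start_loop;     /* continue                */
    sum += i;                            /* codes for s3            */
    goto start_loop;
end_loop:
    return sum;
}
```

`break` always targets `end_loop` and `continue` always targets `start_loop`, which is why the slide needs exactly two labels per loop.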

  34. Assembly codes for numeric loops
      • Semantics: loop not executed if range is null, so must test before first pass.
          Source                           Generated code
          for J in expr1..expr2 loop       J = expr1
                                           start_label:
                                           if J > expr2 goto end_label
            S1                             codes for S1
          end loop;                        J = J + 1
                                           goto start_label
                                           end_label:
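The test-before-first-pass rule can be checked with a C sketch of the same label structure. The function `sum_range_lowered` is a hypothetical example whose loop body simply accumulates J.

```c
/* Illustrative lowering of: for J in expr1..expr2 loop sum += J; end loop; */
int sum_range_lowered(int expr1, int expr2)
{
    int sum = 0;
    int j = expr1;                       /* J = expr1                     */

start_label:
    if (j > expr2) goto end_label;       /* tested before the first pass  */
    sum += j;                            /* codes for S1                  */
    j = j + 1;                           /* J = J + 1                     */
    goto start_label;
end_label:
    return sum;
}
```

Because the bound is tested at `start_label`, a null range such as 5..1 runs the body zero times and the function returns 0.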

  35. Codes for short-circuit expressions
      • Short-circuit expressions are treated as control structures
          Source                              Generated code
          if B1 or else B2 then S1 …          if B1 goto then_label
          -- if (B1 || B2) { S1 …             if not B2 goto else_label
                                              then_label:
                                              codes for S1
                                              goto endif_label
                                              else_label:
      • Inherit target labels from enclosing control structure
      • Create additional labels for composite short-circuits
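The following C sketch lowers `if (B1 || B2)` in the way the slide describes and counts how often B2 is evaluated, to show that it is skipped when B1 is already true. The function name, the two conditions (`x < 0` and `x > 100`) and the counter parameter are all assumptions made for the illustration.

```c
/* Illustrative lowering of:
 *     if (x < 0 || x > 100) result = 1; else result = 0;
 * *b2_evaluations is incremented each time B2 is actually evaluated. */
int either_lowered(int x, int *b2_evaluations)
{
    int result;

    if (x < 0) goto then_label;          /* if B1 goto then_label       */
    (*b2_evaluations)++;                 /* B2 is reached only here     */
    if (!(x > 100)) goto else_label;     /* if not B2 goto else_label   */
then_label:
    result = 1;                          /* codes for S1                */
    goto endif_label;
else_label:
    result = 0;
endif_label:
    return result;
}
```

No Boolean value for `B1 || B2` is ever materialized; the expression exists only as a pattern of branches.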

  36. Assembly codes for case statements
      • If range is small and most cases are defined, create jump table as array of code addresses, and generate indirect jump.
          Source                     Generated code
                                     table label1, label2 …
          case x is                  jumpi x table
          when up: y := 0;           label1: y = 0
                                     goto end_case
          when down: y := 1;         label2: y = 1
                                     goto end_case
          end case;                  end_case:
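Standard C cannot store label addresses in an array, so the sketch below approximates the jump table with an array of function pointers; the indirect call plays the role of the indirect jump. All names (`when_up`, `when_down`, `case_table`, `dispatch_case`) are invented, and the bounds check returning -1 is an addition that the slide does not show.

```c
/* Illustrative jump table: one entry per case value. */
static int when_up(void)   { return 0; }    /* label1: y = 0 */
static int when_down(void) { return 1; }    /* label2: y = 1 */

static int (*const case_table[])(void) = { when_up, when_down };

/* Returns the value assigned to y, or -1 if x has no table entry. */
int dispatch_case(unsigned x)
{
    if (x >= sizeof case_table / sizeof case_table[0]) {
        return -1;                           /* outside the table */
    }
    return case_table[x]();                  /* indirect jump through the table */
}
```

The cost of selecting a case is one bounds check plus one indexed load, regardless of how many cases there are, which is why the technique pays off when the range is small and dense.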

  37. Outline • Coprocessor and Thumb Instructions • Assembly Language • Runtime Environment

  38. Runtime Environment • To understand the environment in which your final output will be running. • How a program is laid out in memory: • Code • Data • Stack • Heap • How function callers and callees pass info

  39. Executable Layout in Memory (not to scale)
      High memory
        Runtime stack
        Dynamic data (heap)
        Global data
        Static data
        Code
      Low memory

  40. Overall Program Layout • From low memory up: • Code (text segment, instructions) • Static (constant) data • Global data • Dynamic data (heap) • Runtime stack (procedure calls) • Review of what’s in each section:

  41. Text Segment (Executable Code) (1) • Actual machine instructions • Arithmetic / logical • Comparison • Branch (short distances) • Jump (long distances) • Load / store • Data movement • Constant manipulation (immediate)

  42. Text Segment (Executable Code) (2) • Code segment write-protected, so running code can’t overwrite itself. • (Debugger can overwrite it.) • You’ll create the precursor for the code in this segment by emitting assembly code. • Assembler will build final text.

  43. Data Segment (1) • Data Objects • Whose size is known at compile time • Whose lifetime is the full run of the program (not just during a function invocation) • Static data includes things that won’t change (can be write-protected): • Virtual-function dispatching tables • String literals used in instructions • Arithmetic literals could be, but more likely incorporated into instructions.

  44. Data Segment (2) • Global data (other than static) • Variables declared global • Local variables declared static (in C) • Declared local to a function. • Retain values even between invocations of that function (lifetime is whole run). • Semantic analysis ensures that static locals are not referenced outside their function scope.
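A short C sketch shows the behaviour of a static local described on this slide. The function `next_id` is a made-up example.

```c
/* Illustrative static local: `counter` is stored with the program's
 * global data rather than on the stack, so it is initialized once and
 * keeps its value between calls. */
int next_id(void)
{
    static int counter = 0;
    counter++;
    return counter;
}
```

Three successive calls return 1, 2 and 3, yet `counter` cannot be named outside `next_id`, which is the scope restriction enforced by semantic analysis.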

  45. Dynamic Data (Heap) (1) • Data created by malloc or new. • Heap data lives until deallocated or until program ends. (Sometimes longer than you want, if you lose track of it.) • Garbage collection / reference counting are ways of automatically de-allocating dead storage in the heap.

  46. Dynamic Data (Heap) (2) • Heap allocation starts at bottom of heap (lower addresses) and allocates upward. • Requirements of alignment, specifics of allocation algorithm may cause storage to be allocated out of (address) order. p1 = new Big(); p2 = new Medium(); p3 = new Big(); p4 = new Tiny(); • So (int)p2 > (int)p1 • But (int)p4 < (int)p3 • Compare pointers for equality, not < or >. [Figure: heap blocks *p1 to *p4; address label 0x1000000]

  47. Runtime Stack (1) • Data used for function invocation: • Variables declared local to functions (including main) aka “automatic” data. • Except for statics (in data segment) • Variables declared in anonymous blocks inside function. • Arguments to function (passed by caller). • Temporaries used by generated code (not representing names in source). • Possibly value returned by callee to caller.

  48. Runtime Stack (2) • Types of data that can be allocated on runtime stack: • In C, all kinds of data: simple types, structs, arrays. • C++: stack can hold objects declared as class type, as well as pointer type. • Some languages don’t allow arrays on stack.

  49. Stack Terminology (1) • A stack is an abstract data type. • Push new value onto Top; pop value off Top. • Higher elements are more recent, lower elements are older. [Figure: stack with Top and Base marked]

  50. Stack Terminology (2) • Stack implementation can grow any direction. • MIPS stack grows downward (from higher memory addresses to lower); so does the ARM full-descending stack used with STMFD/LDMFD in Example #3. • Possible difficulty with terminology. • Some people (and documents) talk about going “up” and “down” the stack. • Some use the abstraction, where “up” means “more recent”, towards Top. • Some (including gdb) say “up” meaning “towards older entries”, toward Base.
