ENGR 330: Today’s Class


  1. ENGR 330: Today’s Class • Notes • Networking/Telecom Course (QMCS 370) • CIGs • Pentium Instruction Set • Format overview • Evolution • Details, Address Modes • Pentium Architecture/Pipelining • Pentium, the first • Pentium Pro • Pentium 3 • Pentium 4 • Memory Management (if time) R. Smith - University of St Thomas - Minnesota

  2. Pentium Instruction Format • Supports 8-, 16-, and 32-bit operands • Officially 17 addressing modes, arguably more • Keyed off the opcode and prefixes • “Assembly language” descends from the old 8080 CPU

  3. Chronology • 8080 (1974) • 8-bit registers, 16-bit RAM addresses (MITS Altair) • 8086 (1978), 8088 (1979) “IA-16” • 16-bit registers and RAM addresses (IBM-PC) • 8088 hardware was “backwards compatible” with 8085 • “Assembler compatible” with 8080 - just reassemble • Segmentation allowed 1MB of RAM addressing • 80386 (1985) “IA-32” • 32-bit registers w/smaller “subsets” for compatibility • 32-bit addresses made segments largely irrelevant • Pentium - the first (1993) • P6 Family introduced in 1995 • Pentium Pro, Pentium II, Pentium III, etc. • Pentium 4 introduced in 2000

  4. Instruction Set Extensions • Each new processor brought new instructions • Specialized sets, too • 80x87 Math Co-Processor • Introduced floating point instructions and stack • Integrated into later processors • MMX (1997) • SIMD instructions, 8 integer registers @ 64 bits (reused FP) • 3DNow! (AMD in 1998) • MMX extended to support floating point operations • SSE (1999; SSE2 in 2000 added integer operations) • 8 giant 128-bit registers for SIMD operation

  5. Pentium General Registers • Cut into halves/quarters for compatibility

  6. Pentium Registers • Address Space • Segments with 32-bit addresses • Usually only 1 segment is used by a program • Standard general purpose registers • EAX, EBX, ECX, and EDX – 32 bits each, with lower half accessible separately and as separate high/low bytes • Each has special jobs in certain arithmetic instructions • Address Registers • ESI, EDI – point to strings in memory • EBP – points to base of the current stack frame (local memory) • ESP – the stack pointer
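The way a 32-bit register is cut into a lower half and high/low bytes can be sketched in Python. This is a toy model only; the class name `Reg32` and the attribute names `e`, `x`, `h`, and `l` are assumptions chosen to mirror EAX, AX, AH, and AL, and nothing here comes from the slides beyond the layout itself.

```python
class Reg32:
    """Toy model of one 32-bit general register (say EAX) and its aliases."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF      # full 32 bits (EAX)

    @property
    def x(self):                         # low 16 bits (AX)
        return self.e & 0xFFFF

    @x.setter
    def x(self, value):
        self.e = (self.e & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def h(self):                         # bits 8..15 (AH)
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, value):
        self.e = (self.e & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def l(self):                         # bits 0..7 (AL)
        return self.e & 0xFF

    @l.setter
    def l(self, value):
        self.e = (self.e & 0xFFFFFF00) | (value & 0xFF)


eax = Reg32(0x12345678)
eax.h = 0xAB                             # writing AH leaves the other bits alone
print(hex(eax.e))                        # 0x1234ab78
```

Writing one of the smaller views changes only its own bits, which is what lets old 8- and 16-bit code keep working on the wider registers.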

  7. Intel Assembly Language • Opcode destination, source • Format similar to LC-3 and MIPS • BUT, allows memory operands directly in most instructions • Operands may be reg/mem, reg/reg, mem/reg; mem/mem only in special cases such as the string instructions • But it all depends on the opcode - many weird restrictions • Segment Registers – mostly obsolete • Provide the “upper” part of the address in 16-bit, 1MB days • RAM addresses traditionally included a segment register • MOV AL,DS:[7777h] • Move contents of 7777 hex (mapped by DS) to AL • DS segment is for “data” - the default segment
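How a segment register supplied the “upper” part of an address in the 16-bit, 1MB days can be sketched as follows. On the 8086 the 16-bit segment value is shifted left four bits and added to the 16-bit offset, giving a 20-bit address. The function name `real_mode_address` and the DS value 1000h in the usage line are assumptions for illustration only.

```python
def real_mode_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    The segment is multiplied by 16 (shifted left 4 bits) and added to the
    offset; the result wraps at 1 MB, as on the original 8086.
    """
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


# MOV AL,DS:[7777h] with a hypothetical DS = 1000h
print(hex(real_mode_address(0x1000, 0x7777)))   # 0x17777
```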

  8. Addressing Modes • “Displacement only” - direct address • Traditionally uses a segment register • DS is the default • Register Indirect • MOV AL,[BX] - moves to AL from RAM addressed by BX • MOV AL,ES:[DI] - addressed by DI with ES segment • The ‘BP’ register uses SS segment by default • Various Indexed modes • Combine 1 or 2 index registers • May include offset

  9. Memory Addressing Modes Summary • Pick zero or one from each column • Prefix an “E” for the Pentium (32-bit) registers • In place of BX, use any of EAX, EBX, ECX, EDX, ESP, or EBP • Also add a “scale factor” of 1, 2, 4, or 8 (for 8/16/32/64-bit elements)
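The general 32-bit form behind these combinations is base + index × scale + displacement. A minimal sketch, assuming the hardware scale factors 1, 2, 4, and 8 and ignoring segments; the function name `effective_address` and its parameter names are hypothetical:

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute a 32-bit effective address: base + index*scale + displacement.

    Any component may be omitted (left at its default), which corresponds
    to picking "zero or one from each column".
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF


# Something like MOV EAX,[EBX + ESI*4 + 8] with EBX = 1000h and ESI = 3
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))  # 0x1014
```

A scale of 4 steps through an array of 32-bit elements, so the index register can hold an element number rather than a byte offset.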

  10. Pentium Instructions • Traditional Instructions • Add, sub, add w/carry, sub w/carry, mul, div • BCD arithmetic, booleans, shifts/rotates, string ops • Loops, conditionals, condition code setting, subroutines • MMX / SSE / XMM Extensions • Intended to better support image manipulation for multimedia • MMX: Eight 64-bit registers, plus special instructions • SSE/XMM: Eight 128-bit registers, each able to hold two 64-bit values (16 in all) • Parallel adds, shifts, multiplies of multiple values packed into MMX registers • Example applications • Subtracting one image from another for overlaying • Unpacking a compressed image (JPEG, MPEG, etc)
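A “parallel add of multiple values packed into a register” can be illustrated in plain Python. The sketch below treats a 64-bit integer as eight independent 8-bit lanes and adds lane by lane with wraparound, roughly what a packed byte add does in a single instruction. The function name `packed_add_bytes` is an assumption, and real hardware does all lanes at once rather than in a loop.

```python
def packed_add_bytes(a, b):
    """Add eight 8-bit lanes of two 64-bit values independently.

    Each lane wraps around at 256, and no carry crosses into the
    neighbouring lane.
    """
    result = 0
    for lane in range(8):
        shift = lane * 8
        x = (a >> shift) & 0xFF
        y = (b >> shift) & 0xFF
        result |= ((x + y) & 0xFF) << shift
    return result


# The lowest lane overflows (FFh + 01h) without disturbing the lane above it.
print(hex(packed_add_bytes(0x01020304050607FF, 0x0101010101010101)))
# 0x203040506070800
```

An ordinary 64-bit add of the same two values would carry out of the low byte into the next one; keeping the lanes separate is what makes the operation useful for pixel data.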

  11. Architecture of The First Pentium • Before SIMD and super-graphics, but still a major machine • Superscalar – faster than “linear” instruction execution • Separate caches for instructions and data

  12. Pentium Details • Fixed Point: 5 stages • U-pipe and V-pipe are interchangeable except if the instruction needs the barrel shifter • U and V can run 2 instructions in parallel – cover vast majority of instructions used • Floating Point: 8 stages • Uses part of the fixed point pipeline

  13. Pentium Pro Processor Overview • First P6 Pentium • 1995 • Pentium II in 1997 • Pentium III in 1999 • Reservation Station • Decoupled instruction fetching from execution • Leap in performance

  14. P6 Architecture – Caches & Execution

  15. Pentium Pro Details

  16. Pentium III System Structure

  17. Pentium III Processor

  18. P III Instruction Execution Units

  19. Pentium 4 System Structure

  20. Pentium 4 microinstructions • Embeds a RISC architecture and pipelining within a CISC instruction set • Instructions fetched to CPU • Translated into internal RISC-style “microinstructions” • Microinstructions are stored in the level 0 instruction cache • CPU execution logic executes microinstructions in a pipelined fashion • Retains compatibility with old Pentium and x86 code while achieving RISC-like performance

  21. Pentium 4 Processor Architecture

  22. Memory Hierarchies • Temporal Locality • If I touched location X just now, I’ll likely touch it again soon. • Spatial Locality • If I touch location X, I’ll probably also touch X+1, X-1, etc. • Lesson: keep stuff you’re using nearby in the fastest RAM you can build • Lesson: if you’re not using it right now, it’s OK to stick it in slower storage till you need it • Lesson: the system can hide the hierarchy from your programs, most of the time

  23. Storage Technologies (costs in 2004) • At the top: Hard Drives • Size: Terabytes. Cost/GB: $.50-$2 • Access time: 5 million to 20 million nsec • Flash • Size: Gigabytes. Cost/GB: $15 • Access time: 200 nsec • Dynamic RAM (typical computer RAM) • Size: Gigabytes, Cost/GB: $100-200 • Speed 50-70 nsec • Static RAM (cache, on-chip, registers) • Size: Megabytes, Cost/GB: $4K to $10K • Speed: .05 - 5 nsec

  24. The driving force in computer design • Programs are hard to write • How do we get the most out of the programs we have already written? • Implications for memory • CPU mustn’t see cache operation in general • CPU mustn’t see oddities in RAM layout or availability • How do we hide these details?

  25. Hiding the details • Cache implementation • We give the CPU an MAR/MDR interface • We make most RAM references as fast as possible • We NEVER make a mistake • Process swapping • Multiprocessor problems • RAM Management • We make RAM look identical to all programs • Programs can’t tell where they really reside in RAM • Give programs exactly as much RAM as they need at a given time, and give the rest away to other programs that are taking turns with the CPU

  26. Direct Mapped Cache • The preferred design these days • A collection of high speed RAM locations • Broken into individually addressed “cache entries” • Part of RAM address chooses cache entry (“Direct mapping”) • A cache entry • “Index” is its address in the cache • Valid bit - true if the entry contains valid RAM data • “Tag” holds the remaining RAM address bits, those not used to select the cache entry or the byte within it • Data area - where the stored data resides • Store multiple words (spatial locality)
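The pieces of a cache entry (index, valid bit, tag, data area) fit together as in the following sketch. It is a read-only toy model under stated assumptions: the class name `DirectMappedCache`, its method names, and the default sizes are hypothetical, and RAM is modelled as a Python `bytearray`.

```python
class DirectMappedCache:
    """Toy direct-mapped cache: one entry per index, each holding one block."""

    def __init__(self, num_entries=64, block_size=16):
        self.num_entries = num_entries
        self.block_size = block_size
        self.valid = [False] * num_entries   # valid bit per entry
        self.tags = [0] * num_entries        # tag per entry
        self.data = [None] * num_entries     # data area per entry

    def split(self, address):
        """Break a RAM address into (tag, index, byte offset)."""
        block_number, offset = divmod(address, self.block_size)
        tag, index = divmod(block_number, self.num_entries)
        return tag, index, offset

    def read(self, address, ram):
        """Return (byte, hit). On a miss, load the whole block from RAM."""
        tag, index, offset = self.split(address)
        hit = self.valid[index] and self.tags[index] == tag
        if not hit:
            start = address - offset
            self.data[index] = bytes(ram[start:start + self.block_size])
            self.tags[index] = tag
            self.valid[index] = True
        return self.data[index][offset], hit


ram = bytearray(range(256)) * 8              # 2 KB of recognisable contents
cache = DirectMappedCache()
print(cache.read(5, ram))                    # (5, False): first touch misses
print(cache.read(6, ram))                    # (6, True): same block, spatial locality
print(cache.read(5 + 64 * 16, ram))          # (5, False): same index, different tag
```

The last read maps to the same entry as address 5 and evicts it, which is the characteristic cost of direct mapping: two addresses that share an index cannot both be cached at once.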

  27. Example • 32-bit RAM addresses • 64 cache entries, each contains 16 bytes • How do we resolve cache addresses? • How big is the tag field? • How much RAM does it need, in bits, per entry? • How much for the whole cache?
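One way to work through these questions is sketched below, assuming a direct-mapped cache with power-of-two sizes and counting only the valid bit, tag, and data in each entry. The function name `cache_geometry` and the dictionary keys are assumptions for illustration.

```python
def cache_geometry(address_bits=32, num_entries=64, block_bytes=16):
    """Field widths and storage cost for a direct-mapped cache.

    Assumes num_entries and block_bytes are powers of two.
    """
    offset_bits = (block_bytes - 1).bit_length()   # selects a byte in the block
    index_bits = (num_entries - 1).bit_length()    # selects the cache entry
    tag_bits = address_bits - index_bits - offset_bits
    bits_per_entry = 1 + tag_bits + block_bytes * 8  # valid + tag + data
    return {
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
        "bits_per_entry": bits_per_entry,
        "total_bits": bits_per_entry * num_entries,
    }


print(cache_geometry())
# {'offset_bits': 4, 'index_bits': 6, 'tag_bits': 22,
#  'bits_per_entry': 151, 'total_bits': 9664}
```

Under those assumptions the low 4 address bits pick the byte, the next 6 pick the entry, and the remaining 22 form the tag, so each entry needs 1 + 22 + 128 = 151 bits and the whole cache 64 × 151 = 9664 bits.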

  28. CPU and Cache Handling • What happens with a cache hit? • What happens with a cache miss? • A stall, like a pipeline stall, but simpler • We stall the whole CPU - inefficient but it’s the best approach • What happens when we write data? • “Write through” runs the write while CPU proceeds • Other CPU accesses get the cached, updated value • “Write miss” - obvious approach isn’t efficient • Use a “write buffer” to catch missed writes
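The write-through and write-buffer ideas can be sketched together. This is a deliberately simplified model, not a description of any particular processor: the class name `WriteThroughCache` and its methods are hypothetical, each cache entry holds a single value keyed by address, and a write miss is assumed to go only to the buffer without allocating a cache entry.

```python
from collections import deque


class WriteThroughCache:
    """Toy write-through cache with a write buffer in front of RAM."""

    def __init__(self, ram):
        self.ram = ram
        self.lines = {}                  # address -> cached value
        self.write_buffer = deque()      # pending (address, value) writes

    def read(self, address):
        if address in self.lines:        # cache hit
            return self.lines[address]
        # Miss: the newest pending write to this address wins over RAM.
        for pending_address, value in reversed(self.write_buffer):
            if pending_address == address:
                break
        else:
            value = self.ram[address]
        self.lines[address] = value
        return value

    def write(self, address, value):
        if address in self.lines:        # keep the cached copy up to date
            self.lines[address] = value
        self.write_buffer.append((address, value))  # CPU proceeds at once

    def drain_one(self):
        """Retire the oldest pending write to RAM, as the memory system would."""
        if self.write_buffer:
            address, value = self.write_buffer.popleft()
            self.ram[address] = value


ram = [0] * 16
cache = WriteThroughCache(ram)
cache.read(3)                            # bring address 3 into the cache
cache.write(3, 7)                        # write hit: cache updated, RAM pending
print(cache.read(3), ram[3])             # 7 0
cache.drain_one()
print(ram[3])                            # 7
```

The CPU-visible value changes immediately on `write`, while RAM catches up only when the buffer drains; that gap is what lets the CPU proceed without waiting for the slower memory.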

  29. All done. • Questions? • Diagrams cribbed from random Internet sites
