
MMX Multi Media eXtensions



  1. MMX Multi Media eXtensions. Starting with the Pentium MMX and Pentium II. TUC-N, dr. Emil CEBUC

  2. Outline
     • Overview
     • MMX programming environment
     • Data types
     • SIMD execution model
     • New arithmetic
     • MMX instructions
     • Cooperation with FPU
     • Further enhancements

  3. Overview
     • Eight new 64-bit data registers, called MMX registers
     • Three new packed data types:
       • 64-bit packed byte integers (signed and unsigned)
       • 64-bit packed word integers (signed and unsigned)
       • 64-bit packed doubleword integers (signed and unsigned)
     • Instructions that support the new data types and handle MMX state management
     • Extensions to the CPUID instruction

  4. MMX programming environment

  5. MMX Registers

  6. Data Types
     • 64-bit packed byte integers: eight packed bytes
     • 64-bit packed word integers: four packed words
     • 64-bit packed doubleword integers: two packed doublewords
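
To make the three packed layouts concrete, here is a minimal sketch in Python that reads the same 64 bits as bytes, words, and doublewords. It only models the data layout; the function name `views` and the sample value are illustrative and not part of the slides.

```python
import struct

def views(qword):
    """Return the packed interpretations of one 64-bit value.

    The value is laid out little-endian, as in IA-32 memory, so element 0
    of each tuple is the least significant lane.
    """
    raw = struct.pack("<Q", qword)               # the 8 raw bytes
    return {
        "bytes": struct.unpack("<8B", raw),      # eight unsigned bytes
        "words": struct.unpack("<4H", raw),      # four unsigned words
        "dwords": struct.unpack("<2I", raw),     # two unsigned doublewords
        "signed_bytes": struct.unpack("<8b", raw),  # same bits, read as signed
    }

v = views(0x0807060504030201)
print(v["bytes"])
print([hex(w) for w in v["words"]])
print([hex(d) for d in v["dwords"]])
```

The bits never change; only the lane width chosen by the instruction decides how they are interpreted.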

  7. SIMD Execution Model
     • MMX instructions move 64-bit packed data types (packed bytes, packed words, or packed doublewords) and the quadword data type between MMX registers and memory, or between MMX registers, in 64-bit blocks.
     • However, when performing arithmetic or logical operations on the packed data types, MMX instructions operate in parallel on the individual bytes, words, or doublewords contained in MMX registers.
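
The difference between moving 64 bits as one block and operating on lanes in parallel can be sketched as follows. This is a plain Python model, not MMX code; `split`, `join`, and `packed_add` are hypothetical helper names, and the add shown uses wraparound within each lane.

```python
MASK64 = (1 << 64) - 1

def split(q, width):
    """Split a 64-bit value into lanes of `width` bits, lowest lane first."""
    mask = (1 << width) - 1
    return [(q >> shift) & mask for shift in range(0, 64, width)]

def join(lanes, width):
    """Reassemble lanes into one 64-bit value, truncating each to `width` bits."""
    mask = (1 << width) - 1
    q = 0
    for i, lane in enumerate(lanes):
        q |= (lane & mask) << (i * width)
    return q

def packed_add(a, b, width):
    """Lane-wise add: a carry out of one lane never reaches its neighbour."""
    sums = [x + y for x, y in zip(split(a, width), split(b, width))]
    return join(sums, width)

a = 0x00FF00FF00FF00FF
b = 0x0001000100010001
print(hex(packed_add(a, b, 8)))    # eight independent byte adds
print(hex(packed_add(a, b, 16)))   # four independent word adds
print(hex((a + b) & MASK64))       # one ordinary 64-bit add
```

With byte lanes, every `FF + 01` wraps to `00` on its own, so the result differs from the single 64-bit addition, where the carry propagates into the next byte.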

  8. SIMD Execution Model

  9. New arithmetic: Wraparound
     • With wraparound arithmetic, a true out-of-range result is truncated (that is, the carry or overflow bit is ignored and only the least significant bits of the result are returned to the destination).
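
A short Python sketch of the truncation described above. The helper names `wrap` and `to_signed` are illustrative; they model one lane rather than a whole register.

```python
def wrap(value, bits):
    """Keep only the low `bits` bits of a result; the carry is dropped."""
    return value & ((1 << bits) - 1)

def to_signed(pattern, bits):
    """Reinterpret a `bits`-wide bit pattern as a two's complement integer."""
    return pattern - (1 << bits) if pattern & (1 << (bits - 1)) else pattern

print(hex(wrap(0xFF + 0x01, 8)))          # unsigned byte: FFH + 01H
print(to_signed(wrap(127 + 1, 8), 8))     # signed byte: 127 + 1
print(hex(wrap(0x7FFF + 0x0001, 16)))     # word: 7FFFH + 0001H
```

The signed case shows why wraparound can be surprising: adding 1 to the largest positive byte yields the most negative one.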

  10. New arithmetic: Signed saturation
      • With signed saturation arithmetic, out-of-range results are limited to the representable range of signed integers for the integer size being operated on.
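
Signed saturation amounts to clamping each lane's true result to the signed range of that lane. A minimal sketch, with `saturate_signed` and `adds_words` as hypothetical names modelling a saturating add on four word lanes:

```python
def saturate_signed(value, bits):
    """Clamp `value` to the range of a `bits`-wide signed integer."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))

def adds_words(a, b):
    """Lane-wise signed saturating add on lists of 16-bit values."""
    return [saturate_signed(x + y, 16) for x, y in zip(a, b)]

print(adds_words([32767, -32768, 100, -5], [1, -1, 23, 5]))
```

The first two lanes would overflow and therefore stay pinned at 32767 and -32768; the other two lanes are unaffected.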

  11. New arithmetic: Unsigned saturation
      • With unsigned saturation arithmetic, out-of-range results are limited to the representable range of unsigned integers for the integer size. So, positive overflow when operating on unsigned byte integers results in FFH being returned, and negative overflow results in 00H being returned.
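
The FFH and 00H limits can be checked with a small model of byte lanes. The names `saturate_unsigned`, `addus_bytes`, and `subus_bytes` are illustrative only.

```python
def saturate_unsigned(value, bits):
    """Clamp `value` to the range 0 .. 2**bits - 1."""
    return max(0, min((1 << bits) - 1, value))

def addus_bytes(a, b):
    """Lane-wise unsigned saturating add on lists of byte values."""
    return [saturate_unsigned(x + y, 8) for x, y in zip(a, b)]

def subus_bytes(a, b):
    """Lane-wise unsigned saturating subtract on lists of byte values."""
    return [saturate_unsigned(x - y, 8) for x, y in zip(a, b)]

print([hex(v) for v in addus_bytes([0xF0, 0x10], [0x20, 0x20])])
print([hex(v) for v in subus_bytes([0x10, 0x80], [0x20, 0x20])])
```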

  12. New arithmetic: Saturation ranges
      • Saturation arithmetic provides an answer for many overflow situations. For example, in color calculations, saturation causes a color to remain pure black or pure white without allowing inversion.
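
The color example can be illustrated by brightening a nearly white pixel both ways. This is a sketch under the assumption of 8-bit unsigned channels in an (R, G, B) tuple; `brighten` is a made-up helper.

```python
def brighten(pixel, amount, saturating):
    """Add `amount` to every 8-bit channel of an (R, G, B) pixel."""
    if saturating:
        # Unsigned saturation: channels stop at 255 (pure white).
        return tuple(min(255, c + amount) for c in pixel)
    # Wraparound: only the low 8 bits of each sum survive.
    return tuple((c + amount) & 0xFF for c in pixel)

near_white = (250, 250, 250)
print(brighten(near_white, 10, saturating=True))
print(brighten(near_white, 10, saturating=False))
```

With saturation the pixel stays white; with wraparound it flips to almost black, which is the inversion the slide refers to.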

  13. MMX Instructions
      • The MMX instruction set consists of 47 instructions, grouped into the following categories:
        • Data transfer
        • Arithmetic
        • Comparison
        • Conversion
        • Unpacking
        • Logical
        • Shift
        • Empty MMX state instruction (EMMS)
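
As an illustration of how the comparison and logical categories combine, the sketch below models a packed signed greater-than compare, which yields an all-ones or all-zeros mask per lane, followed by an AND / AND-NOT / OR selection. The function names are hypothetical, and building a per-lane maximum this way is an example chosen here, not something taken from the slides.

```python
MASK16 = 0xFFFF

def to_signed16(pattern):
    """Reinterpret a 16-bit pattern as a two's complement integer."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern

def pcmpgt_words(a, b):
    """Per lane: FFFFH where a > b (signed compare), else 0000H."""
    return [MASK16 if x > y else 0 for x, y in zip(a, b)]

def packed_max(a, b):
    """Branch-free maximum: (mask AND a) OR (NOT mask AND b), lane by lane."""
    mask = pcmpgt_words(a, b)
    return [to_signed16((m & x & MASK16) | (~m & y & MASK16))
            for m, x, y in zip(mask, a, b)]

a = [5, -3, 100, -7]
b = [2, 4, 100, -9]
print([hex(m) for m in pcmpgt_words(a, b)])
print(packed_max(a, b))
```

Because the mask is all ones or all zeros in each lane, no conditional branch is needed to choose between the two inputs.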

  14. MMX Instruction set summary

  15. MMX Instruction set summary

  16. MMX Instruction set summary

  17. PMADDWD
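
The slide's figure did not survive in this transcript. As a stand-in, here is a minimal Python model of what PMADDWD computes on 64-bit operands: the four signed word pairs are multiplied, and adjacent products are added to give two signed doublewords. The function and helper names are illustrative.

```python
def to_signed32(pattern):
    """Reinterpret a 32-bit pattern as a two's complement integer."""
    return pattern - (1 << 32) if pattern & (1 << 31) else pattern

def pmaddwd(a, b):
    """a, b: four signed 16-bit values each. Returns two 32-bit sums."""
    products = [x * y for x, y in zip(a, b)]
    sums = [products[0] + products[1], products[2] + products[3]]
    # Each sum is truncated to 32 bits; it only wraps when all four words
    # of a pair are -32768.
    return [to_signed32(s & 0xFFFFFFFF) for s in sums]

lo, hi = pmaddwd([1, 2, 3, 4], [5, 6, 7, 8])
print(lo, hi, lo + hi)
```

Adding the two doublewords gives the dot product of the two four-element vectors, which is why the instruction is common in filters and transforms.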

  18. Cooperation with FPU
      • Applications can contain both x87 FPU floating-point and MMX instructions. However, because the MMX registers are aliased to the x87 FPU register stack, care must be taken when making transitions between x87 FPU instructions and MMX instructions.
      • When an MMX instruction (other than the EMMS instruction) is executed, the processor changes the x87 FPU state as follows:
        • The TOS (top of stack) value of the x87 FPU status word is set to 0.
        • The entire x87 FPU tag word is set to the valid state (00B in all tag fields).
        • When an MMX instruction writes to an MMX register, it writes ones (11B) to the exponent part of the corresponding floating-point register (bits 64 through 79).
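
The state changes listed above can be followed in a toy model of the shared register file. This is a simplified sketch: the class `X87MmxState`, its fields, and the starting values are invented for illustration. The `emms` method reflects the documented behaviour that EMMS marks every tag as empty (11B), which the slide itself does not spell out.

```python
MASK64 = (1 << 64) - 1

class X87MmxState:
    """Toy model of the eight 80-bit x87 registers that MM0..MM7 alias."""

    def __init__(self):
        self.tos = 7             # arbitrary starting top-of-stack value
        self.tag_word = 0xFFFF   # 11B in every tag field: all registers empty
        self.regs = [0] * 8      # 80-bit physical registers R0..R7

    def mmx_write(self, n, value64):
        """Model an MMX instruction that writes register MMn."""
        self.tos = 0                  # TOS is reset to 0
        self.tag_word = 0x0000        # every tag becomes valid (00B)
        # Low 64 bits hold the data; bits 64..79 are set to all ones.
        self.regs[n] = (0xFFFF << 64) | (value64 & MASK64)

    def mmx_read(self, n):
        """MMn is the low 64 bits of physical register Rn."""
        return self.regs[n] & MASK64

    def emms(self):
        """Mark all registers empty so x87 code can use the stack again."""
        self.tag_word = 0xFFFF

state = X87MmxState()
state.mmx_write(3, 0x1122334455667788)
print(state.tos, hex(state.tag_word), hex(state.regs[3]))
state.emms()
print(hex(state.tag_word))
```

The model shows why x87 code must not run between an MMX sequence and its EMMS: until then, the tag word claims that all eight registers hold valid floating-point values.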

  19. Further Enhancements
      • Streaming SIMD Extensions (SSE) were introduced into the IA-32 architecture in the Pentium III processor family. They add:
        • Eight 128-bit data registers (called XMM registers) in non-64-bit modes; sixteen XMM registers are available in 64-bit mode.
        • The 32-bit MXCSR register, which provides control and status bits for operations performed on XMM registers.

  20. SSE
      • The 128-bit packed single-precision floating-point data type (four IEEE single-precision floating-point values packed into a double quadword).
      • Instructions that perform SIMD operations on single-precision floating-point values and that extend the SIMD operations that can be performed on integers:
        • 128-bit packed and scalar single-precision floating-point instructions that operate on data located in XMM registers
        • 64-bit SIMD integer instructions that support additional operations on packed integer operands located in MMX registers
      • Instructions that save and restore the state of the MXCSR register

  21. SSE2: Pentium 4 and Intel Xeon processors
      • Support for packed double-precision floating-point values and for 128-bit packed integers.
      • Five data types:
        • 128-bit packed double-precision floating-point (two IEEE Standard 754 double-precision floating-point values packed into a double quadword)
        • 128-bit packed byte integers
        • 128-bit packed word integers
        • 128-bit packed doubleword integers
        • 128-bit packed quadword integers

  22. SSE2
      • Flexibility is provided by instructions that operate on single (scalar) double-precision floating-point values located in the low quadword of an XMM register.
      • Greater throughput when performing SIMD operations on packed integers. This capability is particularly useful for applications such as RSA authentication and RC5 encryption.

  23. SSE2: Data types

  24. SSE2: Instructions
      • Packed and scalar double-precision floating-point instructions
      • 64-bit and 128-bit SIMD integer instructions
      • 128-bit extensions of SIMD integer instructions introduced with the MMX technology and the SSE extensions
      • Cacheability-control and instruction-ordering instructions

  25. SSE: Scalar Instructions
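
The figure for this slide is missing from the transcript. The contrast between packed and scalar SSE operations can be sketched in Python as follows, with lists of four values standing in for an XMM register (element 0 is the lowest lane). `addps` and `addss` model the instructions of the same names; `f32` is a helper introduced here to round results to single precision.

```python
import struct

def f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def addps(a, b):
    """Packed add: all four single-precision lanes are added."""
    return [f32(x + y) for x, y in zip(a, b)]

def addss(a, b):
    """Scalar add: only lane 0 is added; lanes 1..3 of `a` pass through."""
    return [f32(a[0] + b[0])] + list(a[1:])

a = [1.0, 2.0, 3.0, 4.0]
b = [0.5, 0.5, 0.5, 0.5]
print(addps(a, b))
print(addss(a, b))
```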

  26. SSE3, SSSE3
      • The Pentium 4 processor supporting Hyper-Threading Technology introduced Streaming SIMD Extensions 3 (SSE3).
      • The Intel Xeon processor 5100 series and the Intel Core 2 processor families introduced Supplemental Streaming SIMD Extensions 3 (SSSE3).

  27. Asymmetric Processing
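
Asymmetric processing means that different lanes of one instruction perform different operations. A minimal sketch modelled on the SSE3 ADDSUBPS instruction, which subtracts in the even-numbered lanes and adds in the odd-numbered ones. Lists stand in for XMM registers (element 0 is the lowest lane), and rounding to single precision is not modelled.

```python
def addsubps(a, b):
    """Lanes 0 and 2: a - b. Lanes 1 and 3: a + b."""
    return [x - y if i % 2 == 0 else x + y
            for i, (x, y) in enumerate(zip(a, b))]

print(addsubps([10.0, 20.0, 30.0, 40.0], [1.0, 2.0, 3.0, 4.0]))
```

This subtract/add pattern matches the real and imaginary parts of a complex product, which is the usual motivation for such instructions.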

  28. Horizontal Processing
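
Horizontal processing combines neighbouring lanes within the same register instead of matching lanes across two registers. The sketch below models the SSE3 HADDPS and HSUBPS instructions on lists of four values (element 0 is the lowest lane); single-precision rounding is not modelled.

```python
def haddps(a, b):
    """Adjacent pairs of `a` fill the low half, pairs of `b` the high half."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

def hsubps(a, b):
    """Same layout as haddps, but each pair is (even lane - odd lane)."""
    return [a[0] - a[1], a[2] - a[3], b[0] - b[1], b[2] - b[3]]

v = [1.0, 2.0, 3.0, 4.0]
w = [10.0, 20.0, 30.0, 40.0]
print(haddps(v, w))
print(hsubps(v, w))

# Two horizontal adds reduce a register to the sum of its four lanes.
once = haddps(v, v)
print(haddps(once, once)[0])
```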

  29. SSE3 Instructions
      • x87 FPU instruction
        • One instruction that improves x87 FPU floating-point to integer conversion
      • SIMD integer instruction
        • One instruction that provides a specialized 128-bit unaligned data load
      • SIMD floating-point instructions
        • Three instructions that enhance LOAD/MOVE/DUPLICATE performance
        • Two instructions that provide packed addition/subtraction
        • Four instructions that provide horizontal addition/subtraction
      • Thread synchronization instructions
        • Two instructions that improve synchronization between multi-threaded agents

  30. SSSE3: Instructions
      • Twelve instructions that perform horizontal addition or subtraction operations.
      • Six instructions that evaluate absolute values.
      • Two instructions that perform multiply-and-add operations and speed up the evaluation of dot products.
      • Two instructions that accelerate packed-integer multiply operations and produce integer values with scaling.
      • Two instructions that perform a byte-wise, in-place shuffle according to the second (shuffle control) operand.
      • Six instructions that negate packed integers in the destination operand if the corresponding element in the source operand is less than zero.
      • Two instructions that align data from the composite of two operands.
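
Two of these groups are easy to model in Python: the byte shuffle (PSHUFB, 128-bit form) and the sign operation (PSIGNW shown for word lanes). The function names mirror the instruction mnemonics, but the code is only a sketch of their lane semantics. Note that the sign instructions also zero a destination lane when the source lane is zero, a detail the slide's summary leaves out.

```python
def to_signed16(pattern):
    """Reinterpret a 16-bit pattern as a two's complement integer."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern

def pshufb(src, control):
    """16 byte lanes. A control byte with bit 7 set zeroes the lane;
    otherwise its low 4 bits select which byte of `src` to copy."""
    return [0 if c & 0x80 else src[c & 0x0F] for c in control]

def psignw(dst, src):
    """Per word lane: negate dst if src < 0, zero it if src == 0, else keep."""
    out = []
    for d, s in zip(dst, src):
        r = -d if s < 0 else (0 if s == 0 else d)
        out.append(to_signed16(r & 0xFFFF))   # negating -32768 wraps to -32768
    return out

data = list(range(16))
print(pshufb(data, list(range(15, -1, -1))))   # reverse the byte order
print(pshufb(data, [0x80] * 8 + [0] * 8))      # zero low half, broadcast byte 0
print(psignw([1, 2, 3, -4], [-1, 0, 5, -2]))
```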
