1. Programming the Velocity Engine Bing-Chang Lai
Phillip John McKerrow
University of Wollongong
2. Introduction What is a Vector Processor?
The Velocity Engine
Programming the Velocity Engine
Discuss Examples 1 to 3 only
Q&A
3. What is a Vector Processor? Supports Single Instruction Multiple Data (SIMD) instructions
Originally used in Supercomputers for crunching scientific programs
Now popular on the desktop as well, for crunching multimedia-related applications
4. What is a Vector Processor? On desktop, it is usually part of a larger processor
Examples of Vector Processor Technologies
MMX, SSE, 3DNow, AltiVec
5. The Velocity Engine Apple's name for AltiVec Technology
What is AltiVec Technology then?
Refers to the technique Motorola used to add vector processing capabilities to the G4 (74xx) family of processors
6. The Velocity Engine G4 Processor
Load/Store Unit
Integer Unit
Floating Point Unit
Vector Unit (AltiVec)
7. Programming the Velocity Engine Specifications
AltiVec Technology Programming Interface Manual
Available from
http://e-www.motorola.com/brdata/PDFDB/MICROPROCESSORS/32_BIT/POWERPC/ALTIVEC/ALTIVECPIM.pdf
http://www.altivec.org/tech_specifications/altivec_pim.pdf
8. Programming the Velocity Engine Compilers
Apple AltiVec-related patches to GCC 2.95.2
Metrowerks CodeWarrior
Vector types
All vectors are 128 bits long
Declarations start with the keyword vector or __vector
Followed by the element type, e.g. unsigned char, unsigned int, signed int and so on
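Because every vector is 128 bits wide, the element type alone determines how many elements a vector holds. A minimal sketch of that arithmetic in portable C follows; the helper name lanes_per_vector is hypothetical and is not part of the AltiVec interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Every AltiVec vector is 16 bytes (128 bits) wide. */
#define VECTOR_BYTES 16

/* Hypothetical helper: number of elements of the given size
   that fit in one 128-bit vector. */
size_t lanes_per_vector(size_t element_size)
{
    return VECTOR_BYTES / element_size;
}
```

For example, lanes_per_vector(sizeof(uint8_t)) gives the element count of a vector unsigned char, and lanes_per_vector(sizeof(uint32_t)) that of a vector unsigned int.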
9. Programming the Velocity Engine Vector types
10. Programming the Velocity Engine Vector types
11. Programming the Velocity Engine Vector types
12. Programming the Velocity Engine Vector operations
Arithmetic Operations
vec_abs (absolute value), vec_add (addition), vec_sub (subtraction) ...
Boolean Operations
vec_and (Logical AND), vec_or (Logical OR) ...
vec_cmpeq (Equality), vec_cmple (Less Than or Equal To)
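To make the element-wise behaviour concrete, here is a scalar model in portable C of what vec_add and vec_cmpeq do on a vector of 16 unsigned chars. It is only an illustration that runs on any machine: the function names model_vec_add_u8 and model_vec_cmpeq_u8 are hypothetical, and real code would call the AltiVec operations directly.

```c
#include <stdint.h>

#define LANES 16  /* 16 unsigned chars fill one 128-bit vector */

/* Scalar model of vec_add on unsigned char elements:
   each element is added independently and wraps modulo 256. */
void model_vec_add_u8(const uint8_t a[LANES], const uint8_t b[LANES],
                      uint8_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = (uint8_t)(a[i] + b[i]);
}

/* Scalar model of vec_cmpeq: an element of the result is all ones (0xFF)
   where the inputs are equal and 0 where they differ. */
void model_vec_cmpeq_u8(const uint8_t a[LANES], const uint8_t b[LANES],
                        uint8_t out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = (a[i] == b[i]) ? 0xFF : 0x00;
}
```

On the vector unit each of these loops is a single instruction, which is where the speed-up comes from.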
13. Programming the Velocity Engine Vector operations
Miscellaneous Operations
vec_perm (Permutation), vec_mergeh / vec_mergel (merge the high or low halves of two vectors into one) ...
Memory Operations
vec_st (Store), vec_ld (Load) ...
Data Stream Operations
vec_dst (Vector Data Stream Touch), vec_dss (Vector Data Stream Stop) ...
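Of the operations above, vec_perm is the least obvious, so a scalar model may help. The sketch below, in portable C, follows the documented behaviour of selecting each output byte from the 32-byte concatenation of two input vectors; the function name model_vec_perm is hypothetical.

```c
#include <stdint.h>

/* Scalar model of vec_perm(a, b, control): output byte i is taken from the
   32-byte concatenation of a and b, indexed by the low 5 bits of control[i].
   The output array must not overlap the inputs. */
void model_vec_perm(const uint8_t a[16], const uint8_t b[16],
                    const uint8_t control[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++) {
        unsigned idx = control[i] & 0x1F;            /* 0..31 */
        out[i] = (idx < 16) ? a[idx] : b[idx - 16];  /* 0..15 -> a, 16..31 -> b */
    }
}
```

A control vector of 3, 4, ..., 18 therefore yields the last 13 bytes of a followed by the first 3 bytes of b.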
14. Programming the Velocity Engine Constraints
Vector operations all work on exactly 128 bits at a time, no more and no less.
vec_ld (load) and vec_st (store) operate only on 16-byte (128-bit) boundaries; the 4 least significant bits of the address are ignored.
This leads to data alignment issues
Loading data from memory into the processor is one of the main bottlenecks.
Use the data stream (cache prefetch) operations, such as vec_dst, to request data before the operation that needs it takes place
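The boundary constraint on vec_ld and vec_st can be sketched as plain address arithmetic. This is a portable C illustration only; the helper names effective_vector_address and is_vector_aligned are hypothetical.

```c
#include <stdint.h>

/* Address that a vector load or store actually accesses:
   the low 4 bits of the requested address are dropped. */
uintptr_t effective_vector_address(uintptr_t addr)
{
    return addr & ~(uintptr_t)0xF;
}

/* Nonzero if addr already lies on a 16-byte boundary. */
int is_vector_aligned(uintptr_t addr)
{
    return (addr & 0xF) == 0;
}
```

A load requested at 0x1007 therefore reads the 16 bytes starting at 0x1000, which is why unaligned data needs the extra handling discussed in Example 3.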
15. Programming the Velocity Engine The following examples from the paper will be discussed
Example 1: Element-by-Element access
Example 2: Alignment
Example 3: Unaligned Loads and Stores
The Image Addition program in the Appendix will not be discussed
16. Programming the Velocity Engine Example 1: Element-by-Element Access
17. Programming the Velocity Engine Example 1: Element-by-Element Access
Outputs
0123456789ABCDEF
Instead of using the union, you can also access elements by address and casting
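Both access styles can be sketched in portable C. This is not the code from the slides: the struct block128 is a stand-in assumed here so that the sketch compiles without AltiVec support (with AltiVec the member would be a vector unsigned char), and all names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a 128-bit vector (assumption: with AltiVec enabled this
   would be declared as `vector unsigned char`). */
typedef struct { uint32_t words[4]; } block128;

/* The union overlays the vector with an array of its 16 elements. */
typedef union {
    block128 vec;           /* the whole 16-byte value */
    uint8_t  elements[16];  /* the same bytes, one element at a time */
} vec_access;

/* Write one hex digit per element into out (16 digits plus terminator). */
void elements_to_hex(const vec_access *v, char out[17])
{
    static const char digits[] = "0123456789ABCDEF";
    for (int i = 0; i < 16; i++)
        out[i] = digits[v->elements[i] & 0xF];
    out[16] = '\0';
}

/* Alternative without a union: view the vector's address as a byte pointer. */
uint8_t element_by_cast(const block128 *vec, int index)
{
    return ((const uint8_t *)vec)[index];
}
```

The union keeps the intent visible in the type, while the cast avoids declaring an extra type when only an occasional element is needed.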
18. Programming the Velocity Engine Example 2: Alignment
16-byte aligned locations have addresses whose 4 least significant bits are 0, e.g. 0xf0, 0x10 and so on
The AltiVec specification defines vec_malloc and vec_free for allocating and freeing 16-byte aligned blocks for vectors.
The code finds the aligned address by setting the 4 least significant bits to 0 and then adding 16.
Please note that Apple GCC aligns everything to 16-byte boundaries
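The "clear the 4 least significant bits, then add 16" step can be sketched on top of plain malloc as follows. This is a minimal illustration and not the code shown on the following slides: the aligned_block type and both function names are hypothetical, and the sketch assumes 16 spare bytes are requested so the shifted block still fits.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: keeps the pointer malloc returned (needed for
   free) next to the 16-byte aligned pointer that vector code should use. */
typedef struct {
    void *raw;
    void *aligned;
} aligned_block;

/* Allocate size usable bytes starting on a 16-byte boundary. */
aligned_block aligned_block_alloc(size_t size)
{
    aligned_block blk = { NULL, NULL };
    blk.raw = malloc(size + 16);               /* 16 spare bytes for the shift */
    if (blk.raw != NULL) {
        uintptr_t addr = (uintptr_t)blk.raw;
        addr = (addr & ~(uintptr_t)0xF) + 16;  /* clear 4 l.s.b., then add 16 */
        blk.aligned = (void *)addr;
    }
    return blk;
}

/* Release the block through the original pointer, never the aligned one. */
void aligned_block_free(aligned_block *blk)
{
    free(blk->raw);
    blk->raw = NULL;
    blk->aligned = NULL;
}
```

Vector loads and stores use blk.aligned, while blk.raw is kept only so the memory can be returned to free later.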
19. Programming the Velocity Engine Example 2: Alignment - Allocate
20. Programming the Velocity Engine Example 2: Alignment - Deallocate
21. Programming the Velocity Engine Example 2: Alignment - Using
22. Programming the Velocity Engine Example 3: Unaligned Loads and Stores
23. Programming the Velocity Engine Example 3: Unaligned Loads and Stores
24. Programming the Velocity Engine Example 3: Unaligned Loads and Stores
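A common AltiVec idiom for an unaligned load is to load the two aligned vectors that straddle the address and then select the wanted 16 bytes with vec_perm, using a control vector from vec_lvsl. Whether the slides use exactly this approach is an assumption. The portable C sketch below models the idiom with byte copies; the function name model_unaligned_load is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of an unaligned 16-byte load built from two aligned loads.
   Assumes both aligned blocks that are touched are readable memory. */
void model_unaligned_load(const uint8_t *p, uint8_t out[16])
{
    uintptr_t addr  = (uintptr_t)p;
    unsigned  shift = (unsigned)(addr & 0xF);  /* offset inside the first block */

    /* Aligned block containing p, as vec_ld(0, p) would fetch. */
    const uint8_t *lo = (const uint8_t *)(addr & ~(uintptr_t)0xF);
    /* Aligned block containing p + 15, as vec_ld(15, p) would fetch. */
    const uint8_t *hi = (const uint8_t *)((addr + 15) & ~(uintptr_t)0xF);

    uint8_t both[32];
    memcpy(both, lo, 16);           /* first aligned block */
    memcpy(both + 16, hi, 16);      /* second block (same as first if shift is 0) */
    memcpy(out, both + shift, 16);  /* pick 16 bytes starting at the offset */
}
```

An unaligned store follows the same pattern in reverse: load both aligned blocks, merge the new bytes into them, and store both blocks back.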
25. Resources The code for this paper will be available
At http://www.bclai.net (Probably by the end of the week)
Email me on bl12@uow.edu.au
Other Important Resources
AltiVec Information Source
At http://www.altivec.org
Email group list
Apple's AltiVec Homepage
At http://developer.apple.com/hardware/ve/
Tutorials
Vector Libraries
AlienOrb AltiVec Page
At http://www.alienorb.com/AltiVec/
AltiVec Tutorial
AltiVec Code Examples on lookup table, streaming data fetch instructions ...
26. References Bing-Chang Lai, Phillip John McKerrow. Programming the Velocity Engine, AUC, 2001.
Motorola, Inc. AltiVec Technology Programming Interface Manual, 1999. See http://e-www.motorola.com/brdata/PDFDB/MICROPROCESSORS/32_BIT/POWERPC/ALTIVEC/ALTIVECPIM.pdf
Ian Ollmann, Ph.D. AltiVec, 2001. See http://www.alienorb.com/AltiVec/Altivec.pdf
27. Q&A