Ch. 11 Digital Signal Processing Using General-Purpose Processors

Ch. 11 Digital Signal Processing Using General-Purpose Processors Kathy Grimes

Signals • Electrical • Mechanical • Acoustic • Most real-world signals are analog: they vary continuously over time • Analog processing has many limitations • Repeatability • Tolerances • Difficulty storing information or implementing certain operations • These limitations lead us to DSP

Digital Signal Processing (DSP) • Represent signals by sequences of numbers • Pros • Repeatable • Accuracy can be controlled • Time-varying operations are easier to implement • Cons • Sampling causes loss of information • Round-off errors • Requires A/D and D/A mixed-signal hardware

Digital Signal Processing (DSP) • Analog-to-digital converter • Converts a continuous-time signal to a discrete-time signal • Figure 11.1 shows the sampling of a signal • Common signals • Step discontinuity (Figure 11.2) • Impulse (Figure 11.3) FIGURE 11.1 Discrete Time Signals. FIGURE 11.2 Step Function. FIGURE 11.3 Impulse Function.
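
The two common signals named on this slide can be written as functions of the sample index n. This is a minimal sketch; the function names `unit_step` and `unit_impulse` are illustrative and do not come from the slides.

```c
/* Unit step u[n]: 0 for n < 0, 1 for n >= 0. */
int unit_step(int n)
{
    return n >= 0 ? 1 : 0;
}

/* Unit impulse d[n]: 1 at n == 0, 0 everywhere else. */
int unit_impulse(int n)
{
    return n == 0 ? 1 : 0;
}
```

The two are related by a first difference: d[n] = u[n] - u[n-1].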

DSP Building Blocks • Based on three basic functions: • Delay • Add • Multiply • Raw performance of a DSP algorithm is usually measured by the number of operations needed to execute it FIGURE 11.4 Add Function. FIGURE 11.5 Multiply Function. FIGURE 11.6 Delay Function.

DSP Building Blocks • Feedforward and feedback systems can be used in combination to develop any discrete difference equation FIGURE 11.7 Feedforward System. FIGURE 11.8 Feedback System.
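
As a sketch of how delay, add, and multiply combine, the function below implements one first-order difference equation with a feedforward path (b0, b1) and a feedback path (a1). The equation, the coefficient names, and the function name `first_order_filter` are assumptions chosen for illustration; they are not taken from Figures 11.7 or 11.8.

```c
#include <stddef.h>

/* y[n] = b0*x[n] + b1*x[n-1] + a1*y[n-1], with x[-1] = y[-1] = 0. */
void first_order_filter(const float *x, float *y, size_t n,
                        float b0, float b1, float a1)
{
    float x_prev = 0.0f;  /* delay element on the feedforward path */
    float y_prev = 0.0f;  /* delay element on the feedback path */

    for (size_t i = 0; i < n; ++i) {
        float out = b0 * x[i] + b1 * x_prev + a1 * y_prev;
        x_prev = x[i];    /* shift the delays for the next sample */
        y_prev = out;
        y[i] = out;
    }
}
```

Feeding an impulse through the filter with b0 = 1, b1 = 0, a1 = 0.5 gives a response that halves every sample, which shows the feedback path at work.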

Fixed-Point and Floating-Point Implementations • Fixed-point DSPs perform integer operations • Fixed range: 16 bits = 65,536 distinct values • Floating-point DSPs perform integer and floating-point operations • Dynamic operating range • Real-world analog signals have effectively infinite precision • Floating-point mimics the “infinite” range better • Easier to implement; largely avoids the scaling and overflow problems of fixed-point • Why not always use floating-point? • Cost, availability, price, and performance • Precision: with the same number of bits, floating-point resolution is fine for small values but coarser for large values
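
To make the fixed-range point concrete, here is a minimal sketch of 16-bit fixed-point arithmetic in the Q15 format (value = raw / 32768). The Q15 choice and the helper names `sat16`, `q15_mul`, and `q15_add` are assumptions for illustration; the slides do not name a specific format.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the 16-bit signed range. */
int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Multiply two Q15 values with rounding.
 * Assumes >> on a negative int32_t is an arithmetic shift,
 * which holds on mainstream compilers. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;  /* Q30 intermediate */
    product += 1 << 14;                         /* round to nearest */
    return sat16(product >> 15);                /* back to Q15 */
}

/* Add two Q15 values, clamping instead of wrapping on overflow. */
int16_t q15_add(int16_t a, int16_t b)
{
    return sat16((int32_t)a + (int32_t)b);
}
```

The rounding and saturation steps are exactly the bookkeeping that a floating-point implementation leaves to the hardware.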

Single Instruction Multiple Data • SIMD microarchitecture and instructions • One instruction operates on four data values in one clock cycle • Increases performance for low-level DSP functions such as multiply-accumulate (MAC) FIGURE 11.10 SIMD Instruction.

Microarchitecture Considerations • Processor clock speed • Cache size • DSP architectures usually partition the memory space manually to reduce the number of accesses to external memory • Latency is costly in terms of time and resources • Intel architectures have large caches that can hide the gap between fast and slow memory; however, all memory starts in “far” caches • Output data should be generated sequentially • Accessing memory in a scattered pattern (while using threads) should be avoided
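
The difference between sequential and scattered access can be sketched with two traversals of the same row-major buffer. Both functions return the same sum; only the order in which memory is touched differs. The function names are illustrative, and no timing figures are claimed here.

```c
#include <stddef.h>

/* Row-by-row traversal: consecutive iterations touch adjacent addresses,
 * so each cache line that is fetched gets fully used. */
float sum_sequential(const float *m, size_t rows, size_t cols)
{
    float sum = 0.0f;
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            sum += m[r * cols + c];
    return sum;
}

/* Column-by-column traversal of the same buffer: each access jumps
 * `cols` elements ahead, which is the scattered pattern to avoid. */
float sum_strided(const float *m, size_t rows, size_t cols)
{
    float sum = 0.0f;
    for (size_t c = 0; c < cols; ++c)
        for (size_t r = 0; r < rows; ++r)
            sum += m[r * cols + c];
    return sum;
}
```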

Implementation Options for Intel • Intrinsics • Vectorization • Intel Performance Primitives

Intrinsics and Data Types • C code that calls special built-in compiler capabilities that map closely to the underlying SSE instruction set • Added data types • __m64, __m128, __m128d, __m128i • Intrinsic operation types • Arithmetic (fixed- and floating-point) • Shift • Logical • Compare • Set • Shuffle • Concatenation • Example: an add intrinsic takes four FP values packed into each of a and b and performs four additions in one instruction
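
The packed addition described on this slide can be sketched with the SSE intrinsic `_mm_add_ps`. The wrapper name `add4` is illustrative, and the sketch assumes an x86 target with SSE enabled; unaligned loads and stores are used so that the caller does not need aligned buffers.

```c
#include <xmmintrin.h>  /* SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* Add four packed single-precision values: out[k] = a[k] + b[k], k = 0..3. */
void add4(const float *a, const float *b, float *out)
{
    __m128 va = _mm_loadu_ps(a);        /* load 4 floats from a */
    __m128 vb = _mm_loadu_ps(b);        /* load 4 floats from b */
    __m128 vsum = _mm_add_ps(va, vb);   /* four additions in one instruction */
    _mm_storeu_ps(out, vsum);           /* store the 4 sums */
}
```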

Vectorization • Use the compiler to apply vectorization techniques to loops within the data processing • The compiler looks for opportunities to convert loops from a scalar (single-data) implementation to a vector-based one, so that multiple operands can be operated on at the same time • Compilers such as GCC can target the SIMD instruction set in this way • Use #pragma directives to guide the compiler and avoid overheads such as assumed data dependences Listing 11.4 Explicitly Don’t Vectorize Loop. Listing 11.7 Memory Alignment Property and Discarding Assumed Data Dependences.
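
Listings 11.4 and 11.7 are not reproduced in this transcript, so the following is only a sketch of the kind of hints they describe. It assumes the Intel compiler spellings `#pragma ivdep`, `#pragma vector aligned`, and `#pragma novector`, guarded so that other compilers ignore them, plus the standard C99 `restrict` qualifier as a portable way to rule out aliasing. The function names are illustrative.

```c
#include <stddef.h>

/* restrict tells the compiler that dst and src never overlap.
 * Under the Intel compiler, the caller must also guarantee that both
 * buffers are 16-byte aligned because of the "vector aligned" hint. */
void scale_aligned(float *restrict dst, const float *restrict src,
                   float gain, size_t n)
{
#if defined(__INTEL_COMPILER)
#pragma ivdep           /* discard assumed data dependences */
#pragma vector aligned  /* assert aligned loads and stores */
#endif
    for (size_t i = 0; i < n; ++i)
        dst[i] = gain * src[i];
}

/* Each iteration reads the previous result, a true loop-carried
 * dependence, so this loop is explicitly left scalar. */
void running_sum(float *data, size_t n)
{
#if defined(__INTEL_COMPILER)
#pragma novector        /* explicitly do not vectorize this loop */
#endif
    for (size_t i = 1; i < n; ++i)
        data[i] += data[i - 1];
}
```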

Vectorization • Performance comparison • This performance would be vastly different if the memory were not already aligned

Performance Primitives • Intel libraries: highly optimized implementations for many different applications (including audio codecs, image processing, and data compression) • Libraries take full advantage of the CPU and SIMD (most are written for performance) • Libraries are threaded and can obtain performance gains by parallelizing the algorithm • Domains covered include: • Signal processing: convolution and correlation, finite impulse response (FIR) filters, FIR coefficient generation functions, infinite impulse response (IIR) filters, transforms • Image processing • Small matrices and realistic rendering • Cryptography

Finite Impulse Response Filter • FIR filter equation • y[n] = a·x[n] + b·x[n-1] + c·x[n-2] Listing 11.8 FIR Filter C Code Example. Listing 11.9 FIR Using Intel Performance Primitives.
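
The three-tap FIR equation on this slide translates directly into plain C. This is a sketch and not a reproduction of Listing 11.8; the function name `fir3` and the choice to treat samples before x[0] as zero are assumptions.

```c
#include <stddef.h>

/* y[n] = a*x[n] + b*x[n-1] + c*x[n-2]; samples before x[0] are taken as zero. */
void fir3(const float *x, float *y, size_t n, float a, float b, float c)
{
    for (size_t i = 0; i < n; ++i) {
        float x1 = (i >= 1) ? x[i - 1] : 0.0f;  /* one-sample delay */
        float x2 = (i >= 2) ? x[i - 2] : 0.0f;  /* two-sample delay */
        y[i] = a * x[i] + b * x1 + c * x2;
    }
}
```

Driving the filter with an impulse returns the coefficients a, b, c in order and then zeros, which is why the response is called finite.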

FIR Example: Intel SSE • Loop unrolling to get rid of data dependences • By rearranging the data elements, we can reduce the number of times we need to read data
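
One possible arrangement of the same three-tap filter with SSE intrinsics is sketched below: the loop is unrolled by four, so each iteration produces four independent outputs from three overlapping loads. This is an assumption about the general approach, not the slide's actual code; the name `fir3_sse` is illustrative and an x86 target with SSE is assumed.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* y[n] = a*x[n] + b*x[n-1] + c*x[n-2], samples before x[0] taken as zero. */
void fir3_sse(const float *x, float *y, size_t n, float a, float b, float c)
{
    const __m128 va = _mm_set1_ps(a);  /* broadcast each coefficient */
    const __m128 vb = _mm_set1_ps(b);
    const __m128 vc = _mm_set1_ps(c);
    size_t i = 0;

    /* The first two outputs need history before x[0]; do them in scalar. */
    for (; i < 2 && i < n; ++i) {
        float x1 = (i >= 1) ? x[i - 1] : 0.0f;
        y[i] = a * x[i] + b * x1;
    }

    /* Main loop: four outputs per iteration, no dependence between them. */
    for (; i + 4 <= n; i += 4) {
        __m128 x0 = _mm_loadu_ps(&x[i]);      /* x[n]   for 4 outputs */
        __m128 x1 = _mm_loadu_ps(&x[i - 1]);  /* x[n-1] for 4 outputs */
        __m128 x2 = _mm_loadu_ps(&x[i - 2]);  /* x[n-2] for 4 outputs */
        __m128 acc = _mm_mul_ps(va, x0);
        acc = _mm_add_ps(acc, _mm_mul_ps(vb, x1));
        acc = _mm_add_ps(acc, _mm_mul_ps(vc, x2));
        _mm_storeu_ps(&y[i], acc);
    }

    /* Scalar tail for the remaining 0 to 3 outputs (i >= 2 here). */
    for (; i < n; ++i)
        y[i] = a * x[i] + b * x[i - 1] + c * x[i - 2];
}
```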

Medical Ultrasound Imaging • Computationally intensive • Needs a significant amount of embedded computational performance • Same basic algorithmic pattern even though physical configurations, parameters, and functionality differ: • Beam forming • Envelope extraction • Polar-to-Cartesian coordinate translation

FIGURE 11.12 Block Diagram of a Typical Ultrasound Imaging Application.

Envelope Detector FIGURE 11.15 Block Diagram of the Envelope Detector.

Envelope Detector FIGURE 11.16 Polar-to-Cartesian Conversion of a Hypothetically Scanned Rectangular Object. Listing 11.11 Code Sample for Envelope Detector.
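
Listing 11.11 is not reproduced in this transcript. A common formulation of envelope detection, assumed here, takes the magnitude sqrt(I² + Q²) of in-phase and quadrature components; how those components are produced is outside this sketch. The function name `envelope_sse` and its parameters are illustrative, and an x86 target with SSE is assumed.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* env[k] = sqrt(i_comp[k]^2 + q_comp[k]^2) for k = 0..n-1. */
void envelope_sse(const float *i_comp, const float *q_comp,
                  float *env, size_t n)
{
    size_t k = 0;

    /* Four magnitudes per iteration. */
    for (; k + 4 <= n; k += 4) {
        __m128 vi = _mm_loadu_ps(&i_comp[k]);
        __m128 vq = _mm_loadu_ps(&q_comp[k]);
        __m128 power = _mm_add_ps(_mm_mul_ps(vi, vi), _mm_mul_ps(vq, vq));
        _mm_storeu_ps(&env[k], _mm_sqrt_ps(power));
    }

    /* Scalar tail, using the single-value SSE square root. */
    for (; k < n; ++k) {
        float power = i_comp[k] * i_comp[k] + q_comp[k] * q_comp[k];
        env[k] = _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(power)));
    }
}
```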

Performance Results • Why such a large difference?

Summary • Digital signal processing on general-purpose processors • Extends processing capabilities • Simplifies the overall application when platforms require control, communications, and general-purpose processing alongside DSP • An Intel system can be improved in several ways: special C code (intrinsics), vectorization, and optimized libraries • Performance is greatly enhanced when DSP is implemented properly
