1 / 16

Pattern-Forming Instabilities in the Swift-Hohenberg Model

Pattern-Forming Instabilities in the Swift-Hohenberg Model. Micah Brodsky 6.338 Spring ‘08. Swift-Hohenberg Model. Scalar partial differential equation Can derive as approximation to Navier-Stokes at onset of convection But used generally as case study in pattern formation

cortez
Télécharger la présentation

Pattern-Forming Instabilities in the Swift-Hohenberg Model

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Pattern-Forming Instabilities in the Swift-Hohenberg Model Micah Brodsky 6.338 Spring ‘08

  2. Swift-Hohenberg Model • Scalar partial differential equation • Can derive as approximation to Navier-Stokes at onset of convection • But used generally as case study in pattern formation • Has a Lyapunov functional – relaxes towards a steady state, no turbulence • But does interesting things along the way

  3. The Model • ∂u/∂t = εu - u3 – (∇2 – 1)2u + g2u2 • Expands to:(ε - 1)·u - ∇2u - 2∇4u + g2u2 - u3 Saturationterm! Linear instability High-frequencystabilization Low-ordernonlinearities Preferred length scale • One of the simplest possible pattern-forming equations

  4. Let’s try it out… ε = 1.5, g2 = 0 Initialized with low-amplitude random perturbations about zero

  5. What Do We Understand? • We can easily predict simple features • Instability • Characteristic wavelength • Perturbational analysis gives a bit more • Wavelength instabilities (e.g. Zigzag and “Eckman”) • Higher order wave effects (e.g. hexagons!) • But what about high-amplitude behavior?

  6. Scales of order ε = 4 ε = 0.5 ε = 0.1

  7. Competition Among Patterns ε = 0.1, g2 = 0.5 Initialized with low-amplitude random perturbations about zero,along with a strip of -1.

  8. High-Amplitude Instabilities • Initially saturates to uniform steady state solutions • 0 is linearly unstable, but ±√(ε – 1) are linearly stable • But then… ε = 3, g2 = 0 Initialized with low-amplitude random perturbations plus a slight positive bias, along with a strip of -1.

  9. Implementation • Explicit finite difference solver • I.e. ui+1[x] = ui[x] + dt * ( … ui[x-1] - 2ui[x] + ui[x+1] ... ) • Instability is the bottleneck (and ∇4makes it hit hard) • C++ / MPI on SiCortex • Domain split into vertical slabs • Processors exchange 2-cell-wide boundaries after every time step

  10. Output • SDL for real-time visualization • Forwarding X to the rank-0 node is a pain but can be done • Raw data dump to file, imported into Matlab with fread()

  11. Where’s The Bottleneck? • Worst case communication / computation:~ 4 doubles / ~ 66 flops • But SiCortex has lots of bandwidth • Most of the time is spent computing • Compiler effects • Memory, memory, memory…

  12. Compiler • -O3 goes without saying • But wait, why is it generating such crummy machine code (sometimes you have to look!) • Pointer analysis! • Compiler can use registers and software pipelining efficiently only if it knows that it’s not tripping over its own calculations • Use __restrict__ pointers wherever possible • Means no other pointer points to the same data • 50% FLOPS improvement on simpler kernel (diffusion equation) • But… only about 10% here.

  13. Memory Hierarchy • Profile with papiex -a • L1 cache easily overrun • Solution: block striping • Adds some overhead, but about 10% net savings on “tall” problems (why not more?) • L2 usually okay • (though not if we go 3D) Memory access stencil:

  14. Communication • SiCortex has a lot of bandwidth • 1 double for every 2 clock cycles (without congestion) • This is plenty to keep us busy • But, communication is still up to 25% of runtime (use mpipex, subtract startup overhead) • We can still gain by overlapped communication & computation • Or can we?

  15. Overlapped Communication • Solution: • Compute edges first, then start non-blocking transfers (Isend, Irecv) • Wait for results only whenwe need to compute nextedges • Problem! This kills our L1 cache performance gains • Communication is heaviest when middle block is narrowest • Net result: 5-7% gain on large problems • 13% on large, “thin” problems

  16. The real problems • Kernel has too many instructions! • Processor has poor instruction-level parallelism (ILP) • Not to mention the time step is *really* slow • Implicit / Crank-Nicholson methods? • Spectral methods? • Scalability • Slab slicing limits to width / 2 processors • Switch to block decomposition? • 3D • I really want to see 3D results. In 3D. =)

More Related