
Research on Graph-Cut for Stereo Vision





  1. Research on Graph-Cut for Stereo Vision Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  2. Outline • Research Overview • Brief Review of Stereo Vision • Hierarchical Exhaustive Search • Partitioned Graph-Cut for Stereo Vision • Hierarchical Parallel Graph-Cut

  3. Our Research • A fast vision system for robotics • Stereo vision: local block-based + diffusion (M), graph-cut (PhD), belief propagation (PhD) • Segmentation: watershed (M), mean-shift • Approaches: embedded solutions (DSP (U), ASIC) and PC-based solutions (dual-webcam stereo (U)) [Figure: HRP-2 Tri-Camera Head]

  4. My Research • A fast graph-cut VLSI engine for stereo vision • ASIC approach • Goal: 256x256 pixels, 30 depth labels, 30 fps • Stereo vision system prototypes: PC-based, DSP-based, FPGA/ASIC-based

  5. Review of Stereo Vision Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  6. Concept of Stereo Vision • Computational stereo: determining the 3-D structure of a scene from two or more images taken from distinct viewpoints • Triangulation of non-verged geometry relates depth and disparity as Z = fT/d, where d is the disparity, Z the depth, T the baseline, and f the focal length [Figure: triangulation of non-verged geometry] • M. Z. Brown et al., "Advances in Computational Stereo," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, August 2003.

  7. Disparity Image • Disparity map/image: the disparities of all the pixels in the image • Example: a left/right camera pair where an object appears 110 pixels apart in the two views has disparity 110 • Convention: d = 0 is farthest, d = 255 is nearest [Figure: left and right disparity maps of a 4x4 block]

  8. How to find the disparity of a pixel? (1/2) • Simple local method: block matching • SAD (Sum of Absolute Differences): ∑|IL − IR| • Find the candidate disparity with the minimal SAD • Assumption: disparities within a block are the same • Limitations: works poorly in textureless regions and on repeating patterns • Example (figure): candidate disparities d = k−1, k, k+1 give SAD = 400, 0, 600, so d = k is chosen
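
As an illustration of the block-matching idea above, here is a minimal Python/NumPy sketch. It assumes rectified grayscale inputs; the function name, block size, and disparity range are illustrative, not from the slides.

```python
import numpy as np

def sad_block_matching(left, right, block=8, max_disp=16):
    """For each block of the left image, pick the disparity d that
    minimizes SAD = sum(|I_L(x, y) - I_R(x - d, y)|) over the block."""
    h, w = left.shape
    disp = np.zeros((h // block, w // block), dtype=np.int32)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            ref = left[y0:y0 + block, x0:x0 + block].astype(np.int32)
            best_d, best_sad = 0, np.inf
            for d in range(min(max_disp, x0) + 1):   # stay inside the image
                cand = right[y0:y0 + block, x0 - d:x0 - d + block].astype(np.int32)
                sad = np.abs(ref - cand).sum()
                if sad < best_sad:                   # keep the minimal-SAD candidate
                    best_d, best_sad = d, sad
            disp[by, bx] = best_d
    return disp
```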

  9. How to find the disparity of a pixel? (2/2) • Complex global methods: graph-cut, belief propagation • Disparity estimation → optimal labeling problem: assign the label (disparity) of each pixel such that a given global energy is minimal • The energy is a function of the label set (disparity map/image) and considers: • Intensity similarity of the corresponding pixels, e.g. absolute difference (AD), D = |IL − IR| • Disparity smoothness of neighboring pixels, e.g. the Potts model: V = K if dL ≠ dR, else V = 0 • Example (figure): a pixel whose 4-neighbors are labeled 0, 0, 16, 32 pays V = 2K for d = 0, V = 3K for d = 16 or d = 32, and V = 4K for d = 2
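
The following sketch evaluates such a global energy for a candidate disparity map, with an AD data term and a Potts smoothness term on the 4-connected grid. It is illustrative only; it assumes disparities shift pixels horizontally from the left image into the right, with out-of-range matches clamped.

```python
import numpy as np

def potts_energy(left, right, disp, K=20):
    """E(d) = sum_p |I_L(p) - I_R(p - d_p)|          (AD data term)
            + sum_{(p,q) 4-neighbors} K * [d_p != d_q]  (Potts term)."""
    h, w = left.shape
    ys, xs = np.mgrid[0:h, 0:w]
    xr = np.clip(xs - disp, 0, w - 1)    # matched column in the right image (clamped)
    data = np.abs(left.astype(np.int32) - right[ys, xr].astype(np.int32)).sum()
    smooth = K * ((disp[:, 1:] != disp[:, :-1]).sum()    # horizontal neighbor pairs
                  + (disp[1:, :] != disp[:-1, :]).sum()) # vertical neighbor pairs
    return data + smooth
```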

  10. Swap and Expansion Moves • Weak (standard) move: modifies 1 label at a time • Strong move: modifies multiple labels at a time • The α-β swap and α-expansion are strong moves: starting from an initial labeling, they give more chances of escaping poor local minima of the energy E [Figure: initial labeling, standard move, α-β swap, α-expansion]

  11. 4-Connected Structure • The most common graph / MRF (BP) structure in stereo • Two-variable graph-cut: source (α) and sink (α′) terminals, with D (data) t-links to each pixel node and V (smoothness) n-links between neighboring nodes • MRF in belief propagation: observable nodes and hidden nodes in the same 4-connected layout; D and V are vectors [Figure: graph-cut construction and BP MRF]

  12. Hierarchical Exhaustive Search Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  13. Outline • Combinatorial Optimization • Graph-Cut • Exhaustive Search • Iterated Conditional Modes • Hierarchical Exhaustive Search • Result • Summary & Next Step

  14. Combinatorial Optimization • Determine a combination (pattern, set of labels) such that the energy of this combination is minimum • Example: 4-bit binary label problem • Find the label set which yields the minimal energy • Each individual bit can be set to 0 or 1 • Each label has a data cost: D(0) = 99, 92, 100, 101 and D(1) = 100, 79, 114, 98 for bits 0-3 • Each neighboring bit pair prefers the same label (smoothness cost 10 per differing pair) • Energy(0000) = 99+92+100+101 = 392 • Energy(0001) = 99+92+100+98+10 = 399

  15. Graph-Cut • Formulate the previous problem as a graph-cut problem: t-links carry the data costs D(0)/D(1), n-links carry the smoothness cost 10 • Find the cut with minimum total capacity (cost, energy) • Solving the graph-cut: the Ford-Fulkerson method • Total flow pushed = 99+79+100+98 +1 +10 +3 = 390 = max flow = energy of the cut 1100
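
Below is a runnable sketch of this construction on the 4-bit example, using Edmonds-Karp (BFS augmenting paths, a Ford-Fulkerson variant) as a stand-in for the slide's solver. The t-link orientation (source links carry D(0), sink links carry D(1), source side = label 1) is one common convention and is assumed here.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: push flow along BFS-shortest augmenting paths
    until no s->t path remains in the residual graph."""
    n, flow = len(cap), 0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:          # no augmenting path left: max flow reached
            return flow, parent      # parent marks the source side of the min cut
        bott, v = float('inf'), t    # bottleneck capacity along the path
        while v != s:
            bott = min(bott, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:                # push flow, update residual capacities
            cap[parent[v]][v] -= bott
            cap[v][parent[v]] += bott
            v = parent[v]
        flow += bott

# Nodes 0..3 = bits, 4 = source s, 5 = sink t (slide-14 example costs).
D0 = [99, 92, 100, 101]   # label-0 costs -> capacities of s->p
D1 = [100, 79, 114, 98]   # label-1 costs -> capacities of p->t
s, t, n = 4, 5, 6
cap = [[0] * n for _ in range(n)]
for p in range(4):
    cap[s][p], cap[p][t] = D0[p], D1[p]
for p in range(3):
    cap[p][p + 1] = cap[p + 1][p] = 10          # n-links, smoothness cost
flow, parent = max_flow(cap, s, t)
labels = [1 if parent[p] != -1 else 0 for p in range(4)]  # source side -> label 1
print(flow, labels)   # -> 390 [1, 1, 0, 0], matching the cut 1100
```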

  16. Exhaustive Search • List all 2^4 = 16 combinations and their corresponding energies • Example: 1100 has the minimal energy of 390
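
The exhaustive listing can be reproduced in a few lines of Python over the slide-14 example costs (an illustrative sketch):

```python
from itertools import product

D = [[99, 92, 100, 101],    # D[0][i]: cost of assigning label 0 to bit i
     [100, 79, 114, 98]]    # D[1][i]: cost of assigning label 1 to bit i
V = 10                      # smoothness cost per differing neighbor pair

def energy(bits):
    data = sum(D[b][i] for i, b in enumerate(bits))
    smooth = V * sum(bits[i] != bits[i + 1] for i in range(len(bits) - 1))
    return data + smooth

# Exhaustive search: evaluate all 2^4 = 16 label sets.
best = min(product((0, 1), repeat=4), key=energy)
print(''.join(map(str, best)), energy(best))   # -> 1100 390
```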

  17. Iterated Conditional Modes • Iteratively finds the best label under the currently given condition • Greedy: different starting decisions (initial conditions) yield different results; can get stuck in local minima • Example: start with bit 1 because it is the most reliable (79 vs. 92); iteration order bit1 → bit0 → bit2 → bit3 • bit1: 79 (1) < 92 (0) → 1 • bit0: 100 (1) < 99+10 (0) → 1 • bit2: 100+10 (0) < 114 (1) → 0 • bit3: 101 (0) < 98+10 (1) → 0 • Final solution: 1100
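
A sketch of the slide's ICM pass, where each bit only sees neighbors that have already been decided (an assumption, but the one consistent with the slide's arithmetic):

```python
D = [[99, 92, 100, 101], [100, 79, 114, 98]]   # D[label][bit], from slide 14
V = 10

bits = [None] * 4                    # all bits start unassigned
for i in (1, 0, 2, 3):               # assignment order from the slide
    def cost(b):
        c = D[b][i]
        for j in (i - 1, i + 1):     # only count neighbors already assigned
            if 0 <= j < 4 and bits[j] is not None:
                c += V * (b != bits[j])
        return c
    bits[i] = min((0, 1), key=cost)
print(bits)                          # -> [1, 1, 0, 0], i.e. 1100
```

In this example the greedy pass happens to reach the global optimum 1100; with other orders or initial conditions ICM can stop at a worse local minimum, as the slide notes.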

  18. Exhaustive Search Engine • Exhaustive search can be implemented in hardware • Little sequential dependency • Not suitable for graphs larger than 4x4 • (The reported result is for a fully connected graph, NOT a 4-connected graph)

  19. Hierarchical Graph-Cut? • Solve a large-n graph with multiple small-n graph-cut engines (GCEs) hierarchically • Example: solve n = 16 with 4+1 graph-cuts of n = 4 • For each sub-graph (0-3), find the best 2 label sets • Each sub-graph then becomes a vertex whose label 0 is its 1st label set and label 1 its 2nd label set • Assumption: the optimal solution must be within the combinations of the sub-graph label sets! (A code sketch follows slide 22.)

  20. HGC Speed-up Evaluation • For an 8-point GCE with 8 sets of ECUs • Cost: 300 equivalent adders • Latency: 41 cycles per graph • If only 1 GCE is used to compute a 64-point two-variable graph-cut: latency = 41 cycles × 8 + 41 cycles + TV = 369 cycles + TV • If V is computed for each pixel, TV = (8×8) × (8×7/2) × 2 = 3584, so the total latency is about 3953 cycles • Question: is this solution the optimal label set for n = 64?

  21. Hierarchical Exhaustive Search • 64x64 nodes, 4x4-based pyramid structure, 3 levels • pat0 is the best candidate pattern, pat1 the 2nd-best • Level 2: D@lv2 comes from E0/E1@lv1; label 0/1 @lv2 correspond to pat0/pat1 @lv1 • Level 1: D@lv1 comes from E0/E1@lv0; label 0/1 @lv1 correspond to pat0/pat1 @lv0 • Level 0: D@lv0 comes from D0/D1@lv0; label 0/1 @lv0 are the labels themselves

  22. Computing the V Term at Level 1 • For 1st-order neighboring sub-graphs Gi and Gj, the possible neighboring pair combinations are (pat0i, pat0j), (pat0i, pat1j), (pat1i, pat0j), (pat1i, pat1j) • Compute V(patXi, patXj) with the original neighboring costs along the shared boundary • Example: V(pat0i, pat0j) = K; V(pat0i, pat1j) = K+K+K = 3K [Figure: boundary labels of neighboring sub-graph patterns]
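
As promised after slide 19, here is a simplified 1-D sketch of slides 19-22: a 16-bit chain is split into four 4-bit sub-graphs, the best two patterns per sub-graph are kept, and the level-1 problem is solved exhaustively over those candidates, with boundary V terms evaluated on the original costs. The random data costs and chain topology are illustrative stand-ins for the slides' 2-D grids.

```python
from itertools import product
import random

random.seed(0)
N, SUB, V = 16, 4, 20
# D[label][bit]: random data costs standing in for the slides' graphs
D = [[random.randint(0, 200) for _ in range(N)] for _ in (0, 1)]

def energy(bits, d):
    e = sum(d[b][i] for i, b in enumerate(bits))
    return e + V * sum(bits[i] != bits[i + 1] for i in range(len(bits) - 1))

# Level 0: for each sub-graph, keep the best two patterns (pat0, pat1),
# found by exhaustive search on the sub-graph in isolation.
cands = []
for g in range(N // SUB):
    sub = [row[g * SUB:(g + 1) * SUB] for row in D]
    cands.append(sorted(product((0, 1), repeat=SUB),
                        key=lambda p: energy(p, sub))[:2])

# Level 1: one binary choice per sub-graph (0 = pat0, 1 = pat1); the V terms
# across sub-graph boundaries use the original costs, as on slide 22.
def assemble(choice):
    bits = []
    for g, c in enumerate(choice):
        bits.extend(cands[g][c])
    return bits

approx = assemble(min(product((0, 1), repeat=N // SUB),
                      key=lambda ch: energy(assemble(ch), D)))
exact = min(product((0, 1), repeat=N), key=lambda b: energy(b, D))
print(energy(approx, D), energy(exact, D))  # hierarchical vs. true optimum
```

As the following slides show, the underlying assumption can fail: the true optimum need not be composed of per-sub-graph best-2 patterns, so the two printed energies can differ.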

  23. Result of 16x16 (256-node) 2-level HES • 100 randomly generated graphs • D/V ~ 10, symmetric V = 20 • Error rate • Max: 17/256 ~ 6.6% • Average: 7/256 ~ 2.8% • Min: 2/256 ~ 0.8%

  24. Result of 64x64 (4096-node) 3-level HES • 100 randomly generated graphs • D/V ~ 10, symmetric V = 20 • Error rate • Max: 185/4096 ~ 4.5% • Average: 146/4096 ~ 3.6% • Min: 115/4096 ~ 2.8%

  25. Death Sentence to HES Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  26. Error Rate vs. Graph Size • (D, V) = (~163, 20) • Average error rate 3.63% vs. 3.65%: the error rate did not increase significantly with graph size • The error-rate range became smaller

  27. Impact of Different V Costs • 64x64 (3-level) HES; one 256x256 single-pattern result • 100 patterns per V-cost value • D cost (averaged over the s-link capacities of 10 patterns, 2 for each V): average 162.8, std. dev. 94.4 • V cost: 10, 20, 40, 60, 80

  28. Stereo Matching Case • Stereo pair: Tsukuba • Expansion moves with random label order • 15 labels → 15 graph-cut computations • Graph size: 256 x 256 • D term: truncated sum of squared errors (tSSE), truncated at AD = 20 • V term: Potts model with K = 20
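
A sketch of building the data-cost volume for this setup. The exact truncation convention is an assumption (here the absolute difference is clipped at 20 before squaring), as is the border handling.

```python
import numpy as np

def data_cost_volume(left, right, n_labels=15, trunc=20):
    """Return cost[d, y, x] = min(|I_L(x, y) - I_R(x - d, y)|, trunc)^2."""
    h, w = left.shape
    cost = np.empty((n_labels, h, w), dtype=np.int32)
    L = left.astype(np.int32)
    R = right.astype(np.int32)
    for d in range(n_labels):
        shifted = np.empty_like(R)
        shifted[:, d:] = R[:, :w - d]      # right image shifted by disparity d
        shifted[:, :d] = R[:, :1]          # crude border replication (assumption)
        ad = np.abs(L - shifted)
        cost[d] = np.minimum(ad, trunc) ** 2
    return cost
```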

  29. 1st Iteration Result • HES first-iteration results vs. BnK's expansion results • Error rate might exceed 20% for important expansion moves [Figure: disparity maps for several important expansions]

  30. Reason for Failure • The best 2 local candidates do NOT always include the final optimal solution: the best 2 patterns of one sub-graph do not consider the patterns of its neighboring sub-graphs • Errors often happen near level-2 and level-3 block boundaries • Most nodes have zero capacity on both source and sink links, so they depend more on their neighbors' labels • D:V ratio ~ 56:20 = 2.8:1, similar to the D:V = 163:60 case, where the error rate for random patterns is ~15%

  31. Partitioned (Block) Graph-Cut Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  32. Motivation • Global: considers the whole picture (more information) • Local: considers a limited region of the picture (less information) • Is it necessary to use that much information in global methods?

  33. Concept • Original full GC • 1 big graph • Partitioned GC • N smaller graphs What’s the smallest possible partition to achieve the same performance?

  34. Experiment Settings • Energy • D term: luma only; Birchfield-Tomasi cost (best result at half-pel positions); square error • V term: Potts model, V = K × T(di ≠ dj); the K constant is the same for all partitions • Partition sizes: 4x4, 16x16, 32x32, 64x64, 128x128 • Stereo pairs: Tsukuba, Teddy, Cones, Venus
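
A sketch of the partitioned graph-cut driver: the image is tiled into blocks, and each block is solved as an independent labeling problem. `solve_block_gc` is a hypothetical stand-in for a full expansion-move graph-cut solver run on one block; real code would also pad the right-image crop by the disparity search range, which is omitted here for brevity.

```python
import numpy as np

def partitioned_gc(left, right, solve_block_gc, block=64):
    """Tile the image into block x block regions and solve each
    independently with the supplied (hypothetical) per-block solver."""
    h, w = left.shape
    disparity = np.zeros((h, w), dtype=np.int32)
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            ys = slice(y0, min(y0 + block, h))
            xs = slice(x0, min(x0 + block, w))
            # Each block is an independent, smaller labeling problem.
            disparity[ys, xs] = solve_block_gc(left[ys, xs], right[ys, xs])
    return disparity
```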

  35. Tsukuba: 4x4, 16x16, 32x32, 64x64 [Figure: disparity maps for each partition size]

  36. Tsukuba: 96x96, 128x128, full GC [Figure: disparity maps]

  37. Venus: 32x32, 64x64 [Figure: disparity maps]

  38. Venus: 96x96, 128x128, full GC [Figure: disparity maps]

  39. Teddy: 32x32, 64x64 [Figure: disparity maps]

  40. Teddy: 96x96, 128x128, full GC [Figure: disparity maps]

  41. Cones: 32x32, 64x64 [Figure: disparity maps]

  42. Cones: 96x96, 128x128, full GC [Figure: disparity maps]

  43. Middlebury Result • Evaluation web page: http://cat.middlebury.edu/stereo/ • Best: full GC with the best parameters • Full: full GC with K = 20 (Tsukuba) and K = 60 (others)

  44. Summary • Smallest partition size with at most a 2% accuracy drop • Tsukuba → 64x64 • Teddy & Cones → 96x96 • Venus → larger than 128x128 • Benefits: possible complexity or storage reduction; increased parallelism • Drawbacks: performance (disparity accuracy) drop; longer computation on a PC

  45. Hierarchical Parallel Graph-Cut Presenter: Nelson Chang Institute of Electronics, National Chiao Tung University

  46. Concept of Hierarchical Parallel GC • Bottom-up: solve graph-cuts for smaller subgraphs, then for larger subgraphs • A larger subgraph is a set of neighboring smaller subgraphs, e.g. the level-1 subgraph = sg0 + sg1 + sg2 + sg3 at level 0 • Each subgraph is temporarily independent!

  47. HPGC for solving a 256x256 graph • Step 1: 64 32x32 level-0 subgraphs • Step 2: 16 64x64 level-1 subgraphs • Step 3: 4 128x128 level-2 subgraphs • Step 4: 1 256x256 level-3 subgraph • Total graph-cut computations = 64 + 16 + 4 + 1 = 85 • HPGC must use Ford-Fulkerson-based methods!
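
A small sketch of the bottom-up schedule. The cross-level reuse that plausibly ties HPGC to Ford-Fulkerson-style solvers (flow already pushed inside a subgraph remains a valid flow after merging, so each level only augments across the newly exposed boundary edges) is indicated only in the comment.

```python
def hpgc_schedule(n=256, base=32):
    """Count the subgraph solves at each HPGC level for an n x n graph.
    At each level, four neighboring residual subgraphs are merged and the
    solver continues augmenting from the flow already pushed below."""
    size, total = base, 0
    while size <= n:
        count = (n // size) ** 2
        print(f"{count:3d} subgraphs of {size}x{size}")
        total += count
        size *= 2
    print("total graph-cut computations =", total)

hpgc_schedule()   # -> 64 + 16 + 4 + 1 = 85, matching the slide
```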

  48. Boykov and Kolmogorov's Motivation • Dinic's method: search for the shortest augmenting path using breadth-first search (BFS) • Example: search for the shortest paths (length k) by expanding a BFS search tree and find all paths of length k; then search for paths of length k+1 by RE-expanding the search tree from scratch; then length k+2, RE-RE-expanding again, and so on • Why don't we REUSE the expanded tree?

  49. BnK’s Method • Concept: • Reuse the already expanded trees • Avoid re-expanding the tress from scratch (nothing) • 3 stages • Growth • Grow the search tree • Augmentation • Ford-Fulkerson style augmentation • Adoption • Reconnect the unconnected sub-trees • Connect the orphans to a new parent Augmenting Path Saturate Critical Edge Adopt Orphans

  50. Features of the BnK Method • Based on Ford-Fulkerson • Bidirectional search-tree construction (a source tree and a sink tree) • Reuse of the searched trees • Labels (source or sink) are determined by tree connectivity
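
For reference, the BnK max-flow algorithm is available through the third-party PyMaxflow package; the sketch below applies it to the 4-bit example from slide 14, using the same t-link convention as the Edmonds-Karp sketch earlier (source side = label 1). Package availability and exact API details are assumptions.

```python
import maxflow   # PyMaxflow: Python bindings of Boykov-Kolmogorov max-flow

D0 = [99, 92, 100, 101]        # label-0 costs -> source-side t-link capacities
D1 = [100, 79, 114, 98]        # label-1 costs -> sink-side t-link capacities

g = maxflow.Graph[int]()
nodes = g.add_nodes(4)
for p in range(4):
    g.add_tedge(nodes[p], D0[p], D1[p])          # t-links: (source cap, sink cap)
for p in range(3):
    g.add_edge(nodes[p], nodes[p + 1], 10, 10)   # n-links, Potts V = 10

print(g.maxflow())                               # -> 390
# get_segment() is 0 on the source side; source side = label 1 here.
print([1 - g.get_segment(p) for p in nodes])     # -> [1, 1, 0, 0], i.e. 1100
```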
