The Video and Image Processing Lab at UC Berkeley is developing a framework for fast, automatic, and accurate three-dimensional model construction from various inputs. The framework aims to generate models suitable for high-quality visualization and for virtual and augmented reality applications. Building on previous work in scene and building modeling, the project focuses on efficient techniques for urban areas and complex scenes, using multivalued representations and dense depth estimation to improve visualization quality.
Three Dimensional Model Construction for Visualization Avideh Zakhor Video and Image Processing Lab University of California at Berkeley avz@eecs.berkeley.edu
Outline • Goals and objectives • Previous work by PI • Directions for future work
Goals and Objectives • Develop a framework for fast, automatic and accurate 3D model construction for objects, scenes, rooms, buildings (interior and exterior), urban areas, and cities. • Models must be easy to compute, compact to represent and suitable for high quality view synthesis and visualization • Applications: Virtual or augmented reality fly-throughs.
Previous Work on Scene Modeling • Full/Assisted 3-D Modeling: Kanade et al.; Koch et al.; Becker & Bove; Debevec et al.; Faugeras et al.; Malik & Yu. • Mosaics and Panoramas: Szeliski & Kang; McMillan & Bishop; Shum & Szeliski • Layered/LDI Representations: Wang & Adelson; Sawhney & Ayer; Weiss; Baker et al. • View Interpolation/IBR/Light Fields: Chen & Williams; Chang & Zakhor; Laveau & Faugeras; Seitz & Dyer; Levoy & Hanrahan
Previous Work on Building Models • Nevatia (USC): multi-sensor integration • Teller (MIT): spherical mosaics on a wheelchair sized rover, known 6DOF • Van Gool (Belgium): roof detection from aerial photographs • Peter Allen (Columbia): images and laser range finders; view/sensor planning. • Faugeras (INRIA)
Previous Work on City Modeling • Planet 9: • Manually combines ground photographs with existing city maps. • UCLA Urban Simulation Team: • Uses MultiGen to create models from aerial photographs, together with ground video for texture mapping. • Bath and London models by Univ. of Bath: • Combines aerial photographs with existing maps. • All approaches are slow and labor intensive.
Work at VIP lab at UCB Scene modeling and reconstruction.
Multi-Valued Representation: MVR • Level k has k occluding surfaces • Form multivalued array of depth and intensity
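One way to picture the multivalued array is as a per-pixel list of (depth, intensity) samples ordered front to back, so that level k at a pixel is the surface with k surfaces in front of it. The sketch below is illustrative only: the class name `MVR`, its methods, and the merge tolerance `tol` are hypothetical and are not taken from the slides.

```python
class MVR:
    """Hypothetical multivalued representation: every pixel keeps a
    front-to-back list of (depth, intensity) samples."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.samples = [[[] for _ in range(width)] for _ in range(height)]

    def add(self, x, y, depth, intensity, tol=1e-3):
        """Insert a surface sample. A sample whose depth is within tol of
        an existing one is assumed to be the same surface and is skipped."""
        cell = self.samples[y][x]
        for d, _ in cell:
            if abs(d - depth) <= tol:
                return
        cell.append((depth, intensity))
        cell.sort(key=lambda s: s[0])  # nearest surface first

    def level(self, k):
        """Return (depth, intensity) images for level k. Pixels with fewer
        than k + 1 surfaces are marked None."""
        depth = [[None] * self.width for _ in range(self.height)]
        inten = [[None] * self.width for _ in range(self.height)]
        for y in range(self.height):
            for x in range(self.width):
                cell = self.samples[y][x]
                if k < len(cell):
                    depth[y][x], inten[y][x] = cell[k]
        return depth, inten
```

Level 0 then holds the visible surfaces, and level 1 holds whatever lies directly behind them wherever another view has revealed it.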
Imaging geometry (1) • Planar translation
Imaging Geometry (2) • Circular/orbital motion
Dense Depth Estimation • Estimate camera motion • Compute depth maps to build MVRs • Low-contrast regions problematic for dense depth estimation. • Enforce spatial coherence to achieve realistic, high quality visualization.
Block Diagram for Dense Depth Estimation • Planar approximation of depth for low contrast regions.
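The slide names only a "planar approximation" for low-contrast regions. A minimal sketch of one way to do it, assuming reliable depth samples are available around the region (for example at its textured border), is to fit z = a*x + b*y + c by least squares and fill the region from the plane. The function names and the choice of samples are assumptions, not the lab's method.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples,
    solved from the 3x3 normal equations with Cramer's rule."""
    sxx = sxy = syy = sx = sy = sxz = syz = sz = 0.0
    for x, y, z in samples:
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
    n = float(len(samples))
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("samples are degenerate (collinear or too few)")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]  # replace one column with the right-hand side
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def fill_low_contrast(depth, mask, plane):
    """Return a copy of depth with masked pixels replaced by the plane."""
    a, b, c = plane
    return [[a * x + b * y + c if mask[y][x] else depth[y][x]
             for x in range(len(depth[0]))]
            for y in range(len(depth))]
```

A plane is the simplest model that still lets depth vary across the region; it would be a poor fit for curved low-contrast surfaces.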
Original Sequences “Mug” sequence (13 frames) “Teabox” sequence (102 frames)
Low-Contrast Regions • Complete tracking Mug sequence Tea-box sequence
Multiframe Depth Estimation Apply iterative estimation algorithm to enforce piecewise smoothness, without smoothing over depth discontinuities.
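The slides do not spell out the iterative algorithm. As a stand-in, the sketch below shows a simple edge-preserving scheme: each pixel is repeatedly averaged with only those 4-neighbours whose depth differs from it by less than a threshold, so smoothing stops at depth discontinuities. The function name, the threshold, and the iteration count are all assumptions.

```python
def smooth_piecewise(depth, threshold, iterations=10):
    """Iteratively average each pixel with the 4-neighbours whose depth is
    within threshold of it; neighbours across a larger jump are ignored."""
    h, w = len(depth), len(depth[0])
    cur = [row[:] for row in depth]
    for _ in range(iterations):
        nxt = [row[:] for row in cur]
        for y in range(h):
            for x in range(w):
                total, count = cur[y][x], 1
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and abs(cur[ny][nx] - cur[y][x]) < threshold):
                        total += cur[ny][nx]
                        count += 1
                nxt[y][x] = total / count
        cur = nxt
    return cur
```

For example, `smooth_piecewise([[1.0, 1.2, 0.8, 10.0, 10.2, 9.8]], threshold=2.0)` flattens the noise within each of the two surfaces while keeping the jump between them.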
Multiframe Depth Estimation • Mug and Tea-box results: Multiframe Stereo + Low-Contrast Processing + Piecewise Smoothing
Multivalued Representation • Project depths to reference coordinates
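To make the projection step concrete, the sketch below handles the planar-translation geometry on a single scanline. It assumes a camera displaced by `baseline` along +x relative to the reference, so a point at depth Z seen at column x maps to reference column x + focal * baseline / Z. The sign convention, the function name, and the relative merge tolerance `tol` are assumptions for illustration.

```python
def project_to_reference(frames, focal, tol=0.05):
    """frames: list of (baseline, depth_row, intensity_row) for one scanline.
    Returns, per reference pixel, a front-to-back list of (depth, intensity).
    A projected sample within a relative depth tol of a stored one is treated
    as the same surface; otherwise it becomes a new level."""
    width = len(frames[0][1])
    cells = [[] for _ in range(width)]
    for baseline, depth_row, inten_row in frames:
        for x, (z, value) in enumerate(zip(depth_row, inten_row)):
            if z is None:
                continue  # no depth estimate at this pixel
            xr = round(x + focal * baseline / z)  # shift by the disparity
            if not 0 <= xr < width:
                continue  # falls outside the reference image
            if all(abs(z - d) > tol * d for d, _ in cells[xr]):
                cells[xr].append((z, value))
    for cell in cells:
        cell.sort(key=lambda s: s[0])  # nearest surface first
    return cells
```

In the usage below, a foreground pixel at depth 2 hides the background in the reference frame, and a second view reveals that background, which lands at level 1:

```python
ref = (0.0, [10, 10, 2, 10], [1, 2, 9, 4])
side = (0.2, [10, 2, 10, 10], [1, 9, 3, 4])
cells = project_to_reference([ref, side], focal=10)
```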
Results (1) • Mug sequence Multivalued representation for frame 4 (Level 0)
Results • Mug sequence Multivalued representation for frame 4 (Level 1)
Results • Mug sequence Multivalued representation for frame 4 (Combining Levels 0 and 1)
Results • Mug sequence Reconstructed sequence Arbitrary flythrough
Results (2) • Teabox sequence Multivalued representation for frame 22 (Intensity, Level 0)
Results • Teabox sequence Multivalued representation for frame 22 (Depth, Level 0)
Results • Teabox sequence Multivalued representation for frame 22 (Intensity, Level 1)
Results • Teabox sequence Multivalued representation for frame 22 (Depth, Level 1)
Results • Teabox sequence Multivalued representation for frame 22 (Intensity, combining Levels 0 and 1)
Results • Teabox sequence Multivalued representation for frame 22 (Depth, combining Levels 0 and 1)
Results • Teabox sequence Multivalued representation for frame 86 (Intensity, Level 0)
Results • Teabox sequence Multivalued representation for frame 86 (Depth, Level 0)
Results • Teabox sequence Multivalued representation for frame 86 (Intensity, Level 1)
Results • Teabox sequence Multivalued representation for frame 86 (Depth, Level 1)
Results • Teabox sequence Multivalued representation for frame 86 (Intensity, combining Levels 0 and 1)
Results • Teabox sequence Multivalued representation for frame 86 (Depth, combining Levels 0 and 1)
Multiple MVRs • Perform view interpolation with many MVRs
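The slides do not say how views rendered from different MVRs are combined. One plausible scheme, shown purely as an assumption, weights each MVR by the inverse of its distance from the virtual viewpoint and renormalises over the MVRs that actually cover a pixel. Positions are one-dimensional here (frame indices such as 22 and 86 could serve), and both function names are hypothetical.

```python
def blend_weights(virtual_pos, reference_positions, eps=1e-9):
    """Inverse-distance weights for each reference MVR, summing to 1.
    A reference that coincides with the virtual view gets all the weight."""
    dists = [abs(virtual_pos - p) for p in reference_positions]
    for i, d in enumerate(dists):
        if d < eps:
            return [1.0 if j == i else 0.0 for j in range(len(dists))]
    inv = [1.0 / d for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def blend_pixel(values, weights):
    """Weighted average of per-MVR renderings of one pixel. None marks a
    hole in a rendering; weights are renormalised over the rest."""
    pairs = [(v, w) for v, w in zip(values, weights)
             if v is not None and w > 0]
    if not pairs:
        return None  # no MVR covers this pixel
    total = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total
```

Renormalising over available samples lets one MVR fill holes left by another, which is the main reason to use more than one.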
Results: Multiple MVRs • Teabox sequence Reconstructed sequence from MVR86 Reconstructed sequence from MVR22
Results: Multiple MVRs Reconstructed sequence Arbitrary flyaround
Extensions • Complex scenes with many “levels” are difficult to model with MVR, e.g. trees, leaves, etc. • Difficult to ensure realistic visualization from all angles; need to plan the capture process carefully. • Tradeoff between CG polygon modeling and IBR: • Use both in real visualization databases. • Build polygon models from MVR.
Issues for model construction • Choice of geometry for obtaining data • Choice of imaging technology. • Choice of representation. • Choice of models. • Dealing with time varying scenes.
Extensions: • So far, addressed the “outside in” problem: • Camera looked inward to “scan” the object. • Future work will focus on the “inside out” problem: • Modeling a room or office. • Modeling the exterior or interior of a building. • Modeling an urban environment, e.g. a city.
Strategy • Use: • Range sensors, position sensors (GPS), gyros (orientation), omni camera, video. • Existing datasets as a priori information: 3D CAD models, digital elevation maps (DEM), DTED, city maps, architectural drawings.
Modeling interior of buildings • Leverage existing work in the computer graphics group at UCB: • 3D model of Soda Hall available from the “soda walkthrough” project. • 3D model built out of architectural drawings. • Use additional video and laser range finder input to: • Enhance the details of the 3D model: furniture, etc. • Add texture maps for photo-realistic walk-throughs.
City Modeling • Develop a framework for modeling parts of the city of San Francisco: • Use aerial photographs provided by Space Imaging Corp; resolution 1 ft. • Use digitized city maps. • Use a ground data collection vehicle to collect range and intensity video from a panoramic camera, annotated with 6 DOF parameters. • Derive data fusion algorithms to process the above in a fast, automated, and accurate fashion.
Requirements • Automation (little or no interaction needed from human operators) • Speed: must scale with large areas and large data sets. • Accuracy • Robustness to location of data collection. • Ease of data collection. • Representation suitable to hierarchical visualization databases.
Relationship to others • USC: accurate tracking and registration algorithms needed for model construction. • Syracuse: uncertainty processing, and data fusion for model construction. • G. Tech: How to combine CG polygonal model building with IBR models in vis. database? How can vis. databases deal with photo-realistic rendering?
Conclusions • Fast, accurate and automatic model construction is essential to mobile augmented reality systems. • Our goal is to provide photo-realistic rendering of objects, scenes, buildings, and cities, to enable visualization, navigation, and interaction.