Inferring the three-dimensional structure of an object or scene from multi-view images is a hard inverse problem. In this paper, we propose a depth map-based method to reconstruct a 3D model from multiple calibrated images. The proposed method uses global and local optimization in two stages to estimate highly accurate depth maps. The first stage employs Particle Swarm Optimization to explore the search space extensively and estimate coarse depth maps from downscaled images. A cross-view depth map completion step ensures that good depth estimates are transferred to neighboring views. This helps in removing outliers and filling in missing depth values with reliable estimates. In the second stage, a local optimizer refines the coarse depths by processing each row/column of pixels iteratively to improve photo-consistency and smoothness among neighboring depths. This procedure utilizes distance maps to handle textured and homogeneous regions adaptively. Experimental results on the Middlebury multi-view stereo benchmark demonstrate the effectiveness of our method in producing accurate and complete 3D models with better spatial consistency and level of detail.
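To make the first stage more concrete, the following is a minimal sketch of Particle Swarm Optimization searching for a single depth value that minimizes a cost. It is not the authors' implementation: the function name `pso_minimize`, the inertia and acceleration parameters, and the toy quadratic cost are all assumptions chosen for illustration, and a real pipeline would substitute a photo-consistency cost evaluated against neighboring views.

```python
import random


def pso_minimize(cost, lo, hi, n_particles=20, n_iters=60,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a 1D cost over [lo, hi] with a basic particle swarm.

    All parameter values are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)

    # Start particles at random depths with zero velocity.
    pos = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vel = [0.0] * n_particles

    # Each particle remembers its own best position and cost.
    best_pos = pos[:]
    best_cost = [cost(p) for p in pos]

    # The swarm remembers the best position found by any particle.
    g = min(range(n_particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity blends inertia, pull toward the particle's own
            # best, and pull toward the swarm's best.
            vel[i] = (w * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            # Clamp the new position to the valid depth range.
            pos[i] = min(hi, max(lo, pos[i] + vel[i]))

            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i], c
                if c < g_cost:
                    g_pos, g_cost = pos[i], c

    return g_pos, g_cost


# Hypothetical stand-in for a photo-consistency cost with its
# minimum at depth 2.5.
def toy_cost(depth):
    return (depth - 2.5) ** 2


depth, value = pso_minimize(toy_cost, lo=0.0, hi=10.0)
```

Because the swarm samples the whole depth range before converging, this kind of search is less sensitive to initialization than a purely local optimizer, which is consistent with its use for coarse estimation ahead of a local refinement stage.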
Recovering three-dimensional structure from images is a long-standing ill-posed inverse problem in computer vision. This paper presents a simple and highly scalable method to reconstruct a dense 3D point cloud from multi-view images by estimating per-pixel depth using an evolutionary computation technique, CMA-ES. The proposed method uses ZNCC-based template matching to reconstruct fine details of textured regions and DAISY-based feature matching to reconstruct smooth surfaces of homogeneous regions. We handle the problem of reconstructing large homogeneous regions using distance transform-based adaptive median filtering. The proposed method is highly scalable since pixels are processed independently at all stages of reconstruction: depth map estimation, refinement, and fusion. This enables the proposed method to be parallelized at the pixel level, unlike most existing methods that can only be parallelized at the image level. Experimental results on the Middlebury benchmark dataset demonstrate the robustness and efficacy of the proposed method in reconstructing textured as well as homogeneous regions.
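The ZNCC score mentioned above can be illustrated with a short sketch. This shows only the standard zero-mean normalized cross-correlation between two flattened patches, not the authors' matching pipeline; the function name `zncc`, the `eps` threshold, and the convention of returning 0.0 for a zero-variance patch are assumptions made for this example.

```python
import math


def zncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two flattened patches.

    Returns a value in [-1, 1]. A patch with (near) zero variance makes
    the score undefined; this sketch returns 0.0 in that case, which is
    an assumed convention.
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal length")

    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n

    # Subtract the mean so the score ignores brightness offsets.
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]

    num = sum(x * y for x, y in zip(da, db))
    # Dividing by the norms makes the score ignore contrast scaling.
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))

    if den < eps:
        return 0.0
    return num / den


reference = [1.0, 2.0, 3.0, 4.0]
brighter = [2.0 * v + 5.0 for v in reference]  # gain and offset change
score = zncc(reference, brighter)
```

The score is invariant to an affine change in intensity, so the `brighter` patch matches `reference` perfectly despite the gain and offset. A flat patch carries no signal for this measure, since its variance is zero.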