Integrating Lidar Prior into Neural Radiance Field for Large-Scale Outdoor Scenes
Mr. Hou Chao (M.Phil. candidate)
Department of Mechanical Engineering
The University of Hong Kong
Date & Time
Friday, 21 April 2023
Room 7-34, Haking Wong Building, HKU
3D reconstruction from diverse sensor data remains a critical and long-standing subject in the field of robotics. The primary challenge lies in recovering a seamless and undistorted scene representation from discontinuous and limited sensor inputs. Recently, Neural Radiance Fields (NeRF) have gained popularity as a 3D representation due to their capacity to synthesize photorealistic novel viewpoints. NeRF addresses this issue by directly optimizing the parameters of a continuous 5D implicit scene representation to minimize photometric error when rendering a set of captured images. This implicit representation is a fully-connected deep network that takes in a single continuous 5D coordinate (spatial location (x, y, z) and viewing direction (θ, φ)) and outputs the volume density and view-dependent emitted radiance at that spatial location. The rendered color image is obtained by querying the coordinates along camera rays and accumulating the results using a classic volume rendering function. The difference between rendered and observed images is backpropagated to update network parameters. As the entire process is differentiable, complex geometry and appearance can be trained end-to-end.
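The accumulation step described above can be sketched in a few lines. This is a minimal NumPy illustration of the classic volume rendering quadrature (alpha compositing of per-sample density and color along a ray), not NeRF's full training pipeline; the array shapes and function name are chosen for exposition.

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Accumulate per-sample densities and colors along one camera ray.

    sigmas: (N,) volume densities at the sampled points
    colors: (N, 3) view-dependent radiance at the sampled points
    deltas: (N,) spacing between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)        # opacity of each segment
    trans = np.cumprod(1.0 - alphas + 1e-10)       # transmittance up to sample i
    trans = np.concatenate([[1.0], trans[:-1]])    # shift so the first sample sees T = 1
    weights = alphas * trans                       # contribution of each sample
    rgb = (weights[:, None] * colors).sum(axis=0)  # rendered pixel color
    return rgb, weights

# toy ray with four samples: empty space, two surfaces, then near-empty space
sigmas = np.array([0.0, 5.0, 50.0, 0.1])
colors = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
deltas = np.full(4, 0.1)
rgb, weights = volume_render(sigmas, colors, deltas)
```

Because every operation here is differentiable, the photometric loss between `rgb` and the observed pixel can be backpropagated to whatever network produced `sigmas` and `colors`.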
Nonetheless, deploying NeRF in large-scale outdoor environments (e.g., Street View) remains a challenging task. Vanilla NeRF has only been demonstrated on objects or small bounded scenes captured under controlled settings. Moreover, existing NeRF-like models often produce blurry or low-resolution renderings due to the inherent shape-radiance ambiguity of reconstructing a large scene from a limited set of images. Since NeRF reconstructs 3D scenes from a series of 2D images, accurate geometry can only be estimated with sufficient observations, which is often impractical for in-the-wild captures that provide only a glimpse of the scene. Insufficient observations can lead to overfitting at specific viewpoints, harming the capability of novel view synthesis. To eliminate this inherent shape-radiance ambiguity, Urban Radiance Fields was the first work to incorporate lidar information alongside RGB signals. Its series of lidar-based losses enables accurate surface estimation for both solid structures, such as buildings, and volumetric formations such as trees and vegetation.
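One common way lidar can supervise a radiance field is to render an expected depth from the same ray weights used for color, and penalize its deviation from the lidar return. The sketch below shows this simplified depth-supervision idea only; the actual losses in Urban Radiance Fields are richer (e.g., line-of-sight terms), and the function names here are illustrative.

```python
import numpy as np

def expected_depth(weights, t_vals):
    """Depth rendered from ray weights (the same weights used for color)."""
    return (weights * t_vals).sum()

def lidar_depth_loss(weights, t_vals, lidar_depth):
    """Simple L2 penalty pulling rendered depth toward the lidar return.

    This is a simplified stand-in for illustration, not the exact loss
    used in Urban Radiance Fields.
    """
    return (expected_depth(weights, t_vals) - lidar_depth) ** 2
```

Because the weights are a function of the predicted densities, this loss directly shapes the geometry: density mass is pushed toward the lidar-measured surface along each ray.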
In this seminar, we show how a pre-built point cloud map provides strong guidance for discerning empty space. Since empty regions do not contribute to the final rendered output, skipping them suppresses floaters and blurry renderings while accelerating the rendering process. The radiance field is trained under a point cloud prior constraint to alleviate shape ambiguity. We will discuss how to construct accurate surfaces while maintaining high-quality photorealistic novel view synthesis.
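Empty-space skipping with a point cloud prior can be realized by voxelizing the map into a binary occupancy grid and only querying the network at samples whose voxel contains lidar points. The sketch below is one simple way to do this; the grid layout, helper names, and the per-sample lookup are assumptions for illustration, not the speaker's exact method.

```python
import numpy as np

def build_occupancy(points, origin, voxel_size, shape):
    """Mark voxels that contain at least one lidar point.

    points: (N, 3) point cloud in world coordinates
    origin: (3,) world position of the grid corner
    voxel_size: edge length of a voxel
    shape: (nx, ny, nz) grid resolution
    """
    grid = np.zeros(shape, dtype=bool)
    idx = np.floor((points - origin) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(shape)), axis=1)
    grid[tuple(idx[inside].T)] = True
    return grid

def keep_sample(grid, p, origin, voxel_size):
    """Return True only if sample point p falls in an occupied voxel."""
    i = np.floor((p - origin) / voxel_size).astype(int)
    if np.any(i < 0) or np.any(i >= np.array(grid.shape)):
        return False  # outside the mapped region: treat as empty
    return bool(grid[tuple(i)])
```

During rendering, ray samples rejected by `keep_sample` are never fed to the network, which both removes spurious "floater" density in free space and cuts the number of network queries per ray.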