VolumeDeform: Real-time Volumetric Non-rigid Reconstruction

Paper

Abstract

We present a novel approach for the reconstruction of dynamic geometric shapes using a single hand-held consumer-grade RGB-D sensor at real-time rates. Our method builds up the scene model from scratch during the scanning process, so it does not require a pre-defined shape template to start with. Geometry and motion are parameterized in a unified manner by a volumetric representation that encodes a distance field of the surface geometry as well as the non-rigid space deformation. Motion tracking is based on a set of extracted sparse color features in combination with a dense depth constraint. This enables accurate tracking and drastically reduces the drift inherent to standard model-to-depth alignment. We cast finding the optimal deformation of space as a non-linear regularized variational optimization problem by enforcing local smoothness and proximity to the input constraints. The problem is solved in real time at the camera’s capture rate using a data-parallel flip-flop optimization strategy. Our results demonstrate robust tracking even for fast motion and scenes that lack geometric features.

Video

Downloads

Supplemental

Dataset

We provide a dataset containing RGB-D data of a variety of objects for the purpose of real-time non-rigid reconstruction. The RGB-D data consists of sequences (color and depth images) captured with a PrimeSense sensor. Additionally, we provide meshes extracted for several frames (live reconstruction and canonical model).
Please cite this publication when using the dataset.

Format

For each scene, we provide a zip file containing a sequence of RGB-D camera frames (X_data.zip). Each sequence contains:

Color frames (frame-XXXXXX.color.png): RGB, 24-bit, PNG
Depth frames (frame-XXXXXX.depth.png): depth (mm), 16-bit, PNG (invalid depth is set to 0)
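
The naming scheme above makes it straightforward to pair color and depth frames by index. The following is a minimal sketch, not part of the dataset tooling: the helper name pair_frames is hypothetical, and it assumes the XXXXXX placeholder is a decimal frame index (any number of digits is accepted).

```python
import re
import zipfile

# Matches "frame-<index>.color.png" and "frame-<index>.depth.png".
# Assumption: the XXXXXX placeholder is a decimal frame index.
FRAME_RE = re.compile(r"frame-(\d+)\.(color|depth)\.png$")


def pair_frames(zip_source):
    """Return a sorted list of (index, color_name, depth_name) tuples.

    zip_source may be a path to X_data.zip or an open binary file object.
    Frames that lack either the color or the depth image are skipped.
    """
    frames = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            match = FRAME_RE.search(name)
            if match:
                index, kind = int(match.group(1)), match.group(2)
                frames.setdefault(index, {})[kind] = name
    return [
        (index, kinds["color"], kinds["depth"])
        for index, kinds in sorted(frames.items())
        if "color" in kinds and "depth" in kinds
    ]
```

When decoding the depth images themselves, use a PNG reader that preserves the 16-bit values, and treat a value of 0 as "no measurement" rather than as zero distance.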

Camera Calibration:

The color and depth camera intrinsics for each sequence are provided in colorIntrinsics.txt and depthIntrinsics.txt. Note that these are the default values; we did not perform any calibration ourselves.
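
To illustrate how the depth intrinsics are typically used, the sketch below back-projects a single depth pixel to a 3D point. It assumes an ideal pinhole model without lens distortion, with u as the pixel column and v as the pixel row. The function name backproject and the parameters fx, fy, cx, cy are illustrative; how these values are laid out in depthIntrinsics.txt is not described here, so reading them from the file is left out.

```python
def backproject(u, v, depth_mm, fx, fy, cx, cy):
    """Map a depth pixel to a 3D point (metres) in the depth camera frame.

    u, v      : pixel column and row
    depth_mm  : raw 16-bit depth value in millimetres (0 means invalid)
    fx, fy    : focal lengths in pixels (assumed pinhole model)
    cx, cy    : principal point in pixels
    Returns None for invalid depth.
    """
    if depth_mm == 0:
        return None
    z = depth_mm / 1000.0          # millimetres to metres
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)
```

Because the provided intrinsics are uncalibrated defaults, points computed this way may be slightly off in absolute terms.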

Meshes:

The extracted meshes are provided for every 100th frame and for the last frame: canonical models in X_canonical.zip (frame-XXXXXX.canonical.ply) and live reconstructions in X_reconstruction.zip (frame-XXXXXX.mesh.ply). The transformations from world space to camera space for these frames are also provided in X_reconstruction.zip (frame-XXXXXX.world-to-camera.txt).
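
As a rough sketch of how a world-to-camera transformation could be applied to mesh vertices, the code below parses a matrix from text and transforms one point. The file layout is an assumption, not something stated above: the sketch assumes 16 whitespace-separated numbers forming a 4x4 homogeneous matrix in row-major order that acts on column vectors. Check an actual file before relying on it. The helper names parse_matrix4 and transform_point are hypothetical.

```python
def parse_matrix4(text):
    """Parse 16 whitespace-separated numbers into a 4x4 row-major matrix.

    Assumption: the file stores a 4x4 homogeneous transform, row by row.
    """
    values = [float(token) for token in text.split()]
    if len(values) != 16:
        raise ValueError("expected 16 values, got %d" % len(values))
    return [values[row * 4:(row + 1) * 4] for row in range(4)]


def transform_point(matrix, point):
    """Apply a 4x4 rigid transform to a 3D point (column-vector convention)."""
    x, y, z = point
    homogeneous = (x, y, z, 1.0)
    return tuple(
        sum(matrix[row][col] * homogeneous[col] for col in range(4))
        for row in range(3)
    )
```

With the contents of a frame-XXXXXX.world-to-camera.txt file read into a string, transform_point(parse_matrix4(text), vertex) would map a world-space vertex into that frame's camera space under the stated assumptions.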

License

The data has been released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.