Explore the Full Program of SIGGRAPH Asia 2025!

Presentation

PAD3R: Pose-Aware Dynamic 3D Reconstruction from Casual Videos
Description: We present a method for dynamic 3D reconstruction of deformable objects from casually captured, unposed monocular videos. Unlike existing approaches, our method handles long video sequences with substantial object deformation, large-scale camera movement, and limited view coverage, conditions that typically challenge conventional systems. Specifically, our approach first trains a personalized, object-centric pose estimation model using a pre-trained image-to-3D diffusion model. The estimated poses then guide the optimization of a deformable 3D Gaussian representation and a neural skinning model, regularized by long-term point tracking over the entire input video. By combining diffusion priors with differentiable rendering, our method reconstructs high-fidelity, articulated 3D representations of category-agnostic objects. Extensive qualitative and quantitative results show that our approach is robust and generalizes well across challenging scenarios, highlighting its potential for dynamic scene understanding and 3D content creation.
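To make the idea of a skinning model deforming a set of 3D Gaussians more concrete, the following is a minimal sketch of generic linear blend skinning applied to Gaussian centers. It is not the method described above: the abstract does not specify the skinning formulation, so the distance-based `skinning_weights` function is a hypothetical stand-in for a learned neural skinning model, and all function names, the `temperature` parameter, and the two-bone setup are assumptions made for illustration only.

```python
import math

def rotation_z(theta):
    """Return a 3x3 rotation matrix about the z-axis (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def apply_rigid(R, t, p):
    """Apply the rigid transform x -> R x + t to a 3D point p."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def skinning_weights(p, bone_centers, temperature=0.1):
    """Hypothetical stand-in for a learned skinning model (assumption):
    softmax over negative squared distance from p to each bone center."""
    scores = [
        -sum((p[k] - c[k]) ** 2 for k in range(3)) / temperature
        for c in bone_centers
    ]
    return softmax(scores)

def blend_skinning(p, weights, transforms):
    """Linear blend skinning: weighted sum of per-bone rigid transforms of p.
    `transforms` is a list of (R, t) pairs, one per bone."""
    out = [0.0, 0.0, 0.0]
    for w, (R, t) in zip(weights, transforms):
        q = apply_rigid(R, t, p)
        for k in range(3):
            out[k] += w * q[k]
    return out

if __name__ == "__main__":
    # Two illustrative bones: one stays fixed, one rotates about z and shifts.
    bone_centers = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    transforms = [
        (identity, [0.0, 0.0, 0.0]),
        (rotation_z(math.pi / 6), [0.0, 0.1, 0.0]),
    ]
    # Canonical-space Gaussian centers (made-up values).
    centers = [[0.1, 0.0, 0.0], [0.5, 0.0, 0.0], [0.9, 0.0, 0.0]]
    for p in centers:
        w = skinning_weights(p, bone_centers)
        print(p, "->", blend_skinning(p, w, transforms))
```

In a reconstruction pipeline of the kind the abstract outlines, the per-bone transforms and the skinning weights would be optimized through differentiable rendering rather than fixed by hand; the sketch only shows how blended rigid transforms move canonical points for a single frame.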