Presentation
ASIA: Adaptive 3D Segmentation using Few Image Annotations
Description
We introduce ASIA (Adaptive 3D Segmentation using few Image Annotations), a novel framework that enables segmentation of possibly non-semantic and non-text-describable “parts” in 3D. Our segmentation is controllable through a few user-annotated images, which are easier to gather than multi-view images, less demanding to annotate than 3D models, and more precise than potentially ambiguous text descriptions. Our method leverages the rich priors of text-to-image diffusion models, such as Stable Diffusion, to transfer segmentations from image space to 3D, even when the annotated and target objects differ significantly in geometry or structure. To ensure cross-view consistency and precision, we incorporate edge-guided ControlNet conditioning, fine-tune with LoRA, and introduce a novel cross-attention consistency loss. Final segmentations are fused via UV-map projection with a voting mechanism and refined through per-view noise optimization. ASIA provides a practical and generalizable solution for both semantic and non-semantic 3D segmentation tasks, outperforming existing methods by a noticeable margin in both quantitative and qualitative evaluations, e.g., 8.7% higher average mIoU on the PartNet-Ensembled dataset.
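The fusion step described above can be sketched as a simple per-texel majority vote. This is a minimal illustrative sketch, not the paper's implementation: it assumes each view's segmentation has already been projected into a shared UV label map (with `-1` marking texels the view does not see), and the hypothetical `fuse_uv_votes` helper simply picks the most-voted label per texel.

```python
import numpy as np

def fuse_uv_votes(per_view_labels: np.ndarray, num_labels: int) -> np.ndarray:
    """Majority-vote fusion of per-view UV label maps.

    per_view_labels: (V, H, W) int array; -1 means the texel is not
    visible in that view. Returns an (H, W) fused label map, with -1
    for texels unseen by every view.
    """
    _, H, W = per_view_labels.shape
    votes = np.zeros((num_labels, H, W), dtype=np.int32)
    for lab in range(num_labels):
        # Count how many views assign label `lab` to each texel.
        votes[lab] = (per_view_labels == lab).sum(axis=0)
    fused = votes.argmax(axis=0)
    fused[votes.sum(axis=0) == 0] = -1  # no view saw this texel
    return fused
```

In the actual method the vote would be followed by the per-view noise optimization mentioned in the abstract; here the sketch stops at the voting stage.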

Event Type
Technical Papers
Time
Thursday, 18 December 2025, 10:50am - 11:01am HKT
Location
Meeting Room S421, Level 4


