Presentation
LEGO-Maker: Autoregressive Image-Conditioned LEGO Model Creation
Description
This paper presents LEGO-Maker, a new learning-based generative model that can handle over 100 unique brick types and rapidly assemble hundreds of bricks into LEGO models conditioned on images. The work makes three major technical contributions that allow it to surpass existing generative approaches. First, we design a compact LEGO tokenization scheme that serializes LEGO models and bricks into tokens for autoregressive learning. Second, we build LEGO-Maker, an autoregressive image-conditioned architecture with a multi-token prediction strategy that encourages pre-considering multiple brick attributes, and a rollback mechanism for collision-free generation. Third, we propose an effective data preparation pipeline: a procedural generator synthesizes LEGO models, and a LEGO-to-real image translator distilled from a large vision-language model converts LEGO renderings into associated photorealistic images, leveraging rich priors to address the scarcity of image-to-LEGO data. Extensive evaluations and comparisons are conducted on two object categories, facade and portrait, using metrics covering four aspects (geometry, color, semantics, and structural integrity) together with a user study. Experimental results demonstrate the versatility and compelling strengths of LEGO-Maker in reproducing the structures and details of the reference image, and the evaluation scores show that our method clearly surpasses the baselines across all evaluation metrics.
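The rollback mechanism mentioned above can be illustrated with a minimal sketch: proposed bricks are checked against the cells already occupied by placed bricks, and colliding proposals are discarded and re-sampled. The brick encoding, `cells`, and `place_with_rollback` below are illustrative stand-ins, not the paper's actual tokenization or model.

```python
# Hypothetical sketch of rollback-based collision-free placement.
# A brick is encoded as (x, y, z, w, d, h); these names and the
# re-sampling loop are assumptions for illustration only.

def cells(brick):
    """Voxel cells occupied by a brick at position (x, y, z) with size (w, d, h)."""
    x, y, z, w, d, h = brick
    return {(x + i, y + j, z + k)
            for i in range(w) for j in range(d) for k in range(h)}

def place_with_rollback(propose, n_bricks, max_retries=8):
    """Autoregressively place bricks; roll back and re-sample on collision."""
    occupied, model = set(), []
    for _ in range(n_bricks):
        for _ in range(max_retries):
            brick = propose(model)       # next-brick proposal (e.g. sampled tokens)
            cs = cells(brick)
            if occupied.isdisjoint(cs):  # collision check against placed bricks
                occupied |= cs
                model.append(brick)
                break                    # accept this brick, move on
            # otherwise: discard (roll back) the proposal and re-sample
    return model
```

In a learned generator, `propose` would sample from the autoregressive model conditioned on the bricks placed so far, so a rollback simply re-samples the next brick's tokens.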

Event Type
Technical Papers
Time
Tuesday, 16 December 2025, 10:40am - 10:50am HKT
Location
Meeting Room S421, Level 4
