BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Hong_Kong
X-LIC-LOCATION:Asia/Hong_Kong
BEGIN:STANDARD
TZOFFSETFROM:+0800
TZOFFSETTO:+0800
TZNAME:HKT
DTSTART:19911015T033000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20251218T030657Z
LOCATION:Meeting Room S221\, Level 2
DTSTART;TZID=Asia/Hong_Kong:20251216T112300
DTEND;TZID=Asia/Hong_Kong:20251216T113400
UID:siggraphasia_SIGGRAPH Asia 2025_sess119_papers_1975@linklings.com
SUMMARY:ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model
DESCRIPTION:Xuangeng Chu, Nabarun Goswami, Ziteng Cui, and Hanqin Wang (Un
 iversity of Tokyo) and Tatsuya Harada (University of Tokyo, RIKEN AIP)\n\n
 Speech-driven 3D facial animation aims to generate realistic lip movements
  and facial expressions for 3D head models from arbitrary audio clips.\nAl
 though existing diffusion-based methods are capable of producing natural m
 otions, their slow generation speed limits their application potential.\nI
 n this paper, we introduce a novel autoregressive model that achieves real
 -time generation of highly synchronized lip movements and realistic head p
 oses and eye blinks by learning a mapping from speech to a multi-scale mot
 ion code.\nFurthermore, our model can adapt to unseen speaking styles, ena
 bling the creation of 3D talking avatars with unique personal styles beyon
 d the identities seen during training.\nExtensive evaluations and user st
 udies demonstrate that our method outperforms existing approaches in lip s
 ynchronization accuracy and perceived quality.\n\nRegistration Category: F
 ull Access, Full Access Supporter\n\nSession Chair: Feng Xu (Tsinghua Univ
 ersity)\n\n
END:VEVENT
END:VCALENDAR
