Presentation
THADT: Temporal Hybrid Attention Diffusion Transformer for Human Pose Prediction
Session
Meeting Advanced AI Tools
Description
Human pose prediction is a key technology for virtual environmental choreography in dance education. However, traditional deterministic prediction methods cannot capture the diverse distribution of human poses, which limits their practical applicability. We therefore propose the Temporal Hybrid Attention Diffusion Transformer (THADT), a model for 3D-to-3D prediction that consists of a forward diffusion process and a reverse generation process. During forward diffusion, the discrete cosine transform converts human poses into frequency-domain features, noise is gradually added, and a denoising network is trained to learn the noise distribution. During reverse generation, the model progressively removes noise, using historical pose data as conditional input, and reconstructs future pose sequences through the inverse transform. Experimental results on the Human3.6M and HumanEva-I datasets show that THADT outperforms existing state-of-the-art methods on key metrics such as ADE, FDE, and MMADE.
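As a rough illustration of the forward process described above, the sketch below applies a discrete cosine transform along the time axis of a toy pose sequence and then draws one noised sample in closed form. It is not the THADT implementation: the helper names (`dct_matrix`, `forward_diffuse`), the linear noise schedule, the number of diffusion steps, and the random stand-in poses are all assumptions made for this example, and the denoising network and the conditioning on historical poses are omitted.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis; row k holds the k-th cosine basis over n frames."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)  # rescale the constant row so the matrix is orthonormal
    return basis


def forward_diffuse(x0, step, alpha_bar, rng):
    """Sample x_t = sqrt(a) * x0 + sqrt(1 - a) * noise, with a = alpha_bar[step]."""
    noise = rng.standard_normal(x0.shape)
    a = alpha_bar[step]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise, noise


rng = np.random.default_rng(0)
frames, joints = 25, 17  # hypothetical sequence length and joint count
poses = rng.standard_normal((frames, joints * 3))  # toy stand-in for 3D poses

D = dct_matrix(frames)
coeffs = D @ poses  # frequency-domain features along the time axis

# Assumed DDPM-style linear schedule; the schedule used by THADT is not stated.
betas = np.linspace(1e-4, 0.02, 100)
alpha_bar = np.cumprod(1.0 - betas)

noisy, noise = forward_diffuse(coeffs, 50, alpha_bar, rng)

# The inverse DCT is the transpose because the basis is orthonormal.
recovered = D.T @ coeffs
```

In a full model, a denoising network would be trained to predict `noise` from `noisy`, the step index, and the observed history; the reverse process would then denoise step by step before applying `D.T` to return to pose space.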

Event Type
Educator's Forum
Time
Monday, 15 December 2025, 2:30pm - 2:45pm HKT
Location
Meeting Room S228, Level 2



