BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Asia/Hong_Kong
X-LIC-LOCATION:Asia/Hong_Kong
BEGIN:STANDARD
TZOFFSETFROM:+0800
TZOFFSETTO:+0800
TZNAME:HKT
DTSTART:19911015T033000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20251218T030653Z
LOCATION:Meeting Room S423+S424\, Level 4
DTSTART;TZID=Asia/Hong_Kong:20251218T091000
DTEND;TZID=Asia/Hong_Kong:20251218T092100
UID:siggraphasia_SIGGRAPH Asia 2025_sess147_papers_1676@linklings.com
SUMMARY:Bokeh Diffusion: Defocus Blur Control in Text-to-Image Diffusion M
 odels
DESCRIPTION:Armando Fortes, Tianyi Wei, Shangchen Zhou, and Xingang Pan (N
 anyang Technological University, Singapore)\n\nRecent advances in large-sc
 ale text-to-image models have revolutionized creative fields by generating
  visually captivating outputs from textual prompts; however, while traditi
 onal photography offers precise control over camera settings to shape visu
 al aesthetics—such as depth-of-field via aperture—current diffusion models
  typically rely on prompt engineering to mimic such effects. This approach
  often results in crude approximations and inadvertently alters the scene 
 content. In this work, we propose Bokeh Diffusion, a scene-consistent boke
 h control framework that explicitly conditions a diffusion model on a phys
 ical defocus blur parameter. To overcome the scarcity of paired real-world
  images captured under different camera settings, we introduce a hybrid tr
 aining pipeline that aligns in-the-wild images with synthetic blur augment
 ations, providing diverse scenes and subjects as well as supervision to le
 arn the separation of image content from lens blur. Central to our framewo
 rk is a grounded self-attention mechanism trained on image pairs with diff
 erent bokeh levels of the same scene, enabling blur strength to be adjuste
 d in both directions while preserving the underlying scene structure. Exte
 nsive experiments demonstrate that our approach enables flexible, lens-lik
 e blur control, supports downstream applications such as real image editin
 g via inversion, and generalizes effectively across both Stable Diffusion 
 and FLUX architectures.\n\nRegistration Category: Full Access, Full Access
  Supporter\n\nSession Chair: Fan Tang (Institute of Computing Technology, 
 Chinese Academy of Sciences)\n\n
END:VEVENT
END:VCALENDAR
