Explore the Full Program of SIGGRAPH Asia 2025!

Presentation

Agentic VJ System - Real-time Visual Generation with Multi-modal Agents for Live Performances
Description

Agentic VJ System transforms live music, camera feeds, and interactive signals from gamepads, OSC, and MIDI into real-time generative visuals. Built on our open-source ComfyUI Web Viewer nodes, it unifies multimodal inputs (audio analysis, image-to-image synthesis, and inference-time hyperparameter control) within a browser-based, low-latency pipeline and custom-built modular hardware. A single performer can improvise with AI in Full Auto, Semi-Auto, or Manual modes. The system synchronizes visual generation, lighting, and sound in a cohesive feedback loop that turns the entire venue into an interactive instrument. Proven through multiple live performances and immersive installations, this open, reproducible toolkit demonstrates a practical path for making generative AI performable on stage. It offers both a new creative vocabulary for real-time audiovisual expression and a reference architecture for researchers exploring human-in-the-loop control, multimodal synchronization, and adaptive AI co-creation in live digital art.