LTX-2 is natively supported in ComfyUI on Day 0
Hi community!
Workflow Overview
Setup Notes
- Install the required models before opening the workflow template.
- Use the download button above to import the workflow JSON into ComfyUI.
Hello everyone! We're excited to announce that LTX-2, an openly available audio-video AI model, is now natively supported in ComfyUI!
The model delivers high-quality visuals while staying efficient in compute and speed. It generates motion, dialogue, background sound, and music together in a single pass, producing cohesive audio-video output. Its open design gives developers creative flexibility.
Key Features of the Model
LTX-2 brings synchronized audio-video generation to ComfyUI, animating scenes with natural motion and expression from several kinds of input. It is designed to run efficiently on standard hardware.
- Open audio-video foundation model
- Generates motion, dialogue, sound effects, and music in a single pass
- Video-to-video control with Canny, Depth, and Pose
- Keyframe-driven generation
- Built-in upscaling and prompt enhancement
Sample Results
Converting Text to Video
A close-up of a cheerful girl puppet with curly auburn yarn hair and wide button eyes, holding a small red umbrella above her head. Rain falls gently around her. She looks upward and begins to sing with joy in English: "It's raining, it's raining, I love it when its raining." Her fabric mouth opening and closing to a melodic tune. Her hands grip the umbrella handle as she sways slightly from side to side in rhythm. The camera holds steady as the rain sparkles against the soft lighting. Her eyes blink occasionally as she sings.
A man in a black tuxedo stands motionless in a small, red-tiled bathroom, facing a mirror. The camera sits just behind his right shoulder, framing both his back and his solemn reflection. Suddenly, he opens his mouth and begins to sing opera in Italian: "La donna è mobile, qual piuma al vento." Rich, resonant notes echo through the space. As his voice climbs in pitch, his brows lift, and his expression becomes more passionate, almost vulnerable. The overhead lighting casts a sharp glow on his face and tuxedo, reflecting in the glossy red tiles around him. The camera is static
Get the Text-to-Video workflow template
Transforming Images to Video

A close-up shot of a young waitress in a retro 1950s diner, her warm brown eyes meeting the camera with a gentle smile. She wears a black polka-dot dress with an elegant cream lace collar, her reddish-brown hair styled in an elaborate updo with delicate curls framing her freckled face. Soft, warm light from overhead fixtures illuminates her features as she stands behind a yellow counter. The camera begins slightly to her side, then slowly pushes in toward her face, revealing the subtle rosy blush on her cheeks. In the blurred background, the soft teal walls and a glowing red "Diner" sign create a nostalgic atmosphere. The ambient sounds of clinking dishes, distant conversations, and the gentle hum of a jukebox fill the air. She tilts her head slightly and says in a friendly, warm voice: "Welcome to Rosie's. What can I get for you today?" The mood is inviting, timeless, and full of classic American diner charm.

An expansive, moving camera perspective trails a team of mountain cyclists speeding over an untouched snowy terrain during a bright winter day. The lens maintains velocity with the foremost rider in a vivid yellow jacket and orange helmet, who launches airborne over a snow mound, their bicycle suspended against the clear azure sky. Frozen particles burst around them, capturing the golden radiance of the low sun that produces dramatic backlighting and elongated shadows across the landscape. Additional cyclists follow closely, their silhouettes stirring plumes of powdery snow while navigating rolling formations. Audible elements include tire crunching on compacted snow, aerial whooshing during jumps, and distant heavy breathing with exhilarated shouts.
Get the Image-to-Video workflow template
Video Control Methods
- Get the LTX-2 Canny to Video workflow
- Get the LTX-2 Depth to Video workflow
- Get the LTX-2 Pose to Video workflow
Getting Started
- Update ComfyUI to the latest version (Desktop and Comfy Cloud support is coming soon).
- Open the Template Library, go to Video, and select any LTX-2 workflow.
- Follow the on-screen prompt to download the required models, check all inputs, and run the workflow.
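If you would rather script the final step than click through the UI, the sketch below shows one way to check a workflow's inputs and prepare it for queueing. It assumes the workflow has been exported in ComfyUI's API format (a JSON object mapping node IDs to `class_type` and `inputs`) and that a ComfyUI server is running locally at its default address. The helper names `summarize_inputs` and `build_queue_request` and the two-node demo workflow are hypothetical, not part of the LTX-2 templates.

```python
import json
import urllib.request

# Assumption: a local ComfyUI server on its default address.
COMFYUI_URL = "http://127.0.0.1:8188"


def summarize_inputs(workflow: dict) -> list:
    """Return (node_id, class_type, input_name) for every literal string input.

    In API-format workflows, links to other nodes are stored as
    [node_id, output_index] lists, so plain strings are the user-editable
    values such as prompts and file names.
    """
    rows = []
    for node_id, node in workflow.items():
        for name, value in node.get("inputs", {}).items():
            if isinstance(value, str):
                rows.append((node_id, node.get("class_type", "?"), name))
    return rows


def build_queue_request(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request to the server's /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical two-node fragment for illustration only;
    # a real LTX-2 template contains many more nodes.
    demo = {
        "1": {
            "class_type": "CLIPTextEncode",
            "inputs": {"text": "a puppet singing in the rain", "clip": ["2", 0]},
        },
        "2": {
            "class_type": "ExampleLoader",  # placeholder node name
            "inputs": {"name": "example.safetensors"},
        },
    }
    for row in summarize_inputs(demo):
        print(row)
    request = build_queue_request(demo)
    print(request.get_method(), request.full_url)
    # To actually queue the job: urllib.request.urlopen(request)
```

Building the request without sending it lets you review the editable inputs first. Queueing only succeeds once the models the template asks for have been downloaded.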

As always, enjoy creating!