Stable Audio 3.0 Day-0 Support in ComfyUI: From Sound Effects to Longer, More Musical Tracks
We’re excited to share that Stable Audio 3.0, Stability AI’s new family of music models built for artistic experimentation, is coming to ComfyUI. Trained on fully licensed data, these models bring variable-length generation, on-device-friendly small checkpoints, and enhanced structural coherence for longer compositions.
- VRAM: Low VRAM (≤8GB)
- Reading Time: 3 min
Setup Notes
- Install the required models before opening the workflow template.
- Recommended hardware: Low VRAM (≤8GB).
- Use the download button above to import the workflow JSON into ComfyUI.
Introducing Stable Audio 3.0 Integration
We're thrilled to announce that Stable Audio 3.0, Stability AI's family of music generation models designed for creative exploration, is now available within ComfyUI. The models are trained on fully licensed data and offer variable-length generation, lightweight checkpoints that can run on-device, and better structural coherence in longer compositions. You can move between brief sound effects and extended musical pieces within your existing workflows.
Key Features
Commercial licensing - Trained on fully licensed music datasets
Adaptable duration - Generate anything from short sound effects to full tracks, up to about 2 minutes with the Small models and about 6 minutes with Medium
Efficient lightweight models - Run Small-SFX and Small-Music on a standard CPU, with no dedicated GPU required
Enhanced musicality - Medium produces richer compositions with stronger structure when a GPU is available
Model Variants
Small-SFX: Soundscapes and brief ambient segments (≤2:00)
Small-Music: Concise musical pieces and portable loops (≤2:00)
Medium: Extended compositions with improved structural integrity (~6:20)
The Small variants generate up to two minutes of audio, well beyond the previous 11-second and 47-second limits, while Medium exceeds six minutes for extended sequences.
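As a rough illustration of how these limits might guide model choice, here is a minimal Python sketch that picks a variant from a requested duration. The helper `pick_variant` and its constants are hypothetical and not part of ComfyUI or Stable Audio; the limits are taken from the list above, with the approximate 6:20 figure for Medium treated as 380 seconds.

```python
# Duration limits taken from the variant list above.
# Assumption: "~6:20" for Medium is treated as exactly 380 seconds.
SMALL_MAX_SECONDS = 2 * 60
MEDIUM_MAX_SECONDS = 6 * 60 + 20


def pick_variant(seconds: float, sfx: bool = False) -> str:
    """Return the smallest variant whose limit covers the requested duration.

    Hypothetical helper for illustration only.
    """
    if seconds <= 0:
        raise ValueError("duration must be positive")
    if seconds <= SMALL_MAX_SECONDS:
        # Both Small variants share the same two-minute limit.
        return "Small-SFX" if sfx else "Small-Music"
    if seconds <= MEDIUM_MAX_SECONDS:
        return "Medium"
    raise ValueError(f"{seconds} seconds exceeds the longest listed limit")
```

Under these assumptions, an 8-second footstep effect maps to Small-SFX, while a 150-second track needs Medium because it is longer than the two-minute Small limit.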
Sample Creations
Musical Compositions
Prompts for complete tracks that specify genre, instrumentation, mood, tempo, and length.
Lo-fi hip-hop chill track with mellow electric piano, soft vinyl crackle, subtle synth pads, low-pass filtered drums, percussion loops, and soft plucked bass for a relaxed, dreamy vibe. BPM: 75. Length: 150 seconds
Synthwave 80s retro track with arpeggiated synth leads, analog pads, electric bass, punchy electronic drums, gated reverb snares, and atmospheric FX for nostalgic and vibrant energy. BPM: 110. Length: 180 seconds
Instrumentation
Individual or small-group recordings suitable for production and scoring.
Guitar muted strum loop with tight rhythmic feel. BPM: 100. Length: 8 seconds
Pluck sequence loop with bright resonant tone. BPM: 128. Length: 10 seconds
Environmental Sounds
Textures, impacts, and motion effects for visual media and gaming.
Footsteps on gravel, steady walking pace, close perspective. Length: 8 seconds
Car speeding past at high velocity, doppler effect, realistic whoosh. Length: 3 seconds
Isolated Sounds
Brief individual effects for percussion, interfaces, and sample libraries.
Bass pluck with jazzy tone and resonant wooden body. Length: 3 seconds
Latin drums, dynamic Latin drums and percussion ensemble featuring authentic rhythmic patterns. Length: 3 seconds
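The sample prompts above share a simple pattern: a description, an optional `BPM:` field, and a `Length:` field in seconds. The sketch below assembles text in that pattern. `build_prompt` is a hypothetical helper, not a ComfyUI or Stable Audio function, and the source does not say whether the model reads the `Length:` text itself or relies on the duration set in the workflow.

```python
from typing import Optional


def build_prompt(description: str, seconds: int, bpm: Optional[int] = None) -> str:
    """Compose a prompt in the 'description. BPM: N. Length: N seconds' pattern.

    Hypothetical helper that mirrors the formatting of the sample prompts.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if bpm is not None and bpm <= 0:
        raise ValueError("bpm must be positive")

    # Drop surrounding whitespace and a trailing period so fields join cleanly.
    parts = [description.strip().rstrip(".")]
    if bpm is not None:
        parts.append(f"BPM: {bpm}")
    parts.append(f"Length: {seconds} seconds")
    return ". ".join(parts)


print(build_prompt("Guitar muted strum loop with tight rhythmic feel", 8, bpm=100))
# → Guitar muted strum loop with tight rhythmic feel. BPM: 100. Length: 8 seconds
```

Sound effects without a tempo simply omit the `bpm` argument, which matches how the environmental and isolated sound examples are written.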
Implementation Guide
Update ComfyUI to version 0.22.0+ or access via Comfy Cloud
Navigate: Sidebar → Templates → Audio → Select Stable Audio 3.0 Template
Local users: Follow the instructions in the workflow to download the models and place them in the correct folders
Enter a text prompt, set the duration in seconds, then run the workflow
Medium Base Workflow
Full Medium Workflow
Happy creating!