ACE-Step 1.5 XL: Commercial-Grade Music Generation in ComfyUI
A 4B-parameter open-source music model that generates full songs in seconds — locally on consumer hardware
- VRAM: High VRAM (24GB+)
- Reading Time: 2 min
Setup Notes
- Install the required models before opening the workflow template.
- Recommended hardware: High VRAM (24GB+).
- Use the download button above to import the workflow JSON into ComfyUI.
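To check the 24GB+ recommendation before downloading the models, here is a minimal sketch that assumes an NVIDIA GPU with `nvidia-smi` on the PATH. The helper names and the deliberately loose 24,000 MiB threshold are illustrative assumptions, not part of the workflow itself.

```python
import subprocess

# Assumption: nominal 24 GB cards can report slightly under 24 * 1024 MiB,
# so the threshold is kept a little below that value.
MIN_VRAM_MIB = 24_000


def parse_vram_mib(nvidia_smi_output: str) -> list[int]:
    """Parse nvidia-smi output with one memory.total value (MiB) per GPU line."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vram_mib(result.stdout)


def meets_recommendation(vram_mib: list[int], minimum: int = MIN_VRAM_MIB) -> bool:
    """True if at least one GPU reaches the recommended amount of VRAM."""
    return any(v >= minimum for v in vram_mib)
```

On a machine with NVIDIA drivers installed, `meets_recommendation(query_vram_mib())` gives a quick yes/no answer. Parsing is kept separate from the subprocess call so it can be checked without a GPU.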
Open-source music generation has improved markedly. ACE-Step 1.5 XL uses a 4-billion-parameter Diffusion Transformer (DiT) decoder and delivers audio quality comparable to leading commercial systems while running locally on a discrete GPU.
Three variants are available: xl-base prioritizes versatility, xl-sft prioritizes audio quality, and xl-turbo prioritizes speed. All three are released under the MIT license and were trained on legally licensed data.
Example outputs:
- Instrumental Dark Synthwave
- Melodic Dubstep featuring Female Vocals
- Ambient Electronic with Female Vocals
Key Features
Commercial-Grade Quality – Reported evaluation scores place the output between Suno v4.5 and Suno v5, and the 4B-parameter model produces richer audio than the earlier 2B models
Fast Generation – A full song is generated in under 2 seconds on an A100, or under 10 seconds on an RTX 3090. The xl-turbo variant needs only 8 steps (roughly 6× faster than the base and sft variants)
Flexible Duration – Generate anything from a 10-second clip to a 10-minute track
Wide Stylistic Range – Fine-grained control over timbre across many genres, with more than 1,000 distinct timbres
Multilingual Lyrics – Control structure and style with lyric prompts in more than 50 languages
Commercial Use Permitted – MIT-licensed, with training data drawn from royalty-free material, public-domain works, and audio synthesized from MIDI
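The xl-turbo figures in the feature list (8 steps, roughly 6× faster) allow a quick back-of-the-envelope check. The sketch below assumes generation time scales linearly with step count, which the source does not state, so the result is only an implied estimate and not a documented setting for the base or sft variants.

```python
TURBO_STEPS = 8        # from the feature list
QUOTED_SPEEDUP = 6.0   # "roughly 6x" from the feature list


def implied_steps(turbo_steps: int, speedup: float) -> float:
    """Step count of the slower variants IF time per step is constant (assumption)."""
    return turbo_steps * speedup


print(implied_steps(TURBO_STEPS, QUOTED_SPEEDUP))  # prints 48.0
```

Under that assumption the base and sft variants would run on the order of 48 steps. Check the workflow's sampler node for the actual value.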
Model Selection
All three XL variants share the same 4B-parameter architecture:
XL-Base – The most versatile option, best for creative exploration
XL-SFT – The highest audio quality, at the cost of somewhat less diversity
XL-Turbo – 8-step generation for fast iteration
Getting Started
Update ComfyUI to the latest release
Open the template library and search for "ACE Step"
Select the workflow for the variant you want
Follow the instructions embedded in the workflow to download the required models
Edit the prompts and run the workflow
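The last step can also be scripted through ComfyUI's HTTP API instead of the UI. This is a minimal sketch, assuming a local ComfyUI server on its default address and a workflow exported in API format. The node id `"3"` and the input name `"tags"` used in the usage note are hypothetical placeholders, so look up the real ones in your exported JSON.

```python
import json
import urllib.request
import uuid

COMFYUI_URL = "http://127.0.0.1:8188"  # default local address; adjust if yours differs


def set_input(workflow: dict, node_id: str, name: str, value) -> dict:
    """Return a copy of an API-format workflow with one node input replaced."""
    updated = json.loads(json.dumps(workflow))  # deep copy via a JSON round-trip
    updated[node_id]["inputs"][name] = value
    return updated


def build_request(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Build the POST /prompt request that queues a workflow for execution."""
    payload = {"prompt": workflow, "client_id": str(uuid.uuid4())}
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, base_url: str = COMFYUI_URL) -> str:
    """Send the workflow to a running ComfyUI server and return its prompt id."""
    with urllib.request.urlopen(build_request(workflow, base_url)) as resp:
        return json.load(resp)["prompt_id"]
```

A typical use is to load the exported JSON with `json.load`, call `set_input(workflow, "3", "tags", "dark synthwave, instrumental")` with your own node id and input name, and pass the result to `queue_workflow`. Building the request separately from sending it keeps the payload easy to inspect without a running server.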
Enjoy creating!