How much VRAM is recommended?

High VRAM (24GB+) is recommended for stable generation.

Can this workflow be used commercially?

This workflow is tagged for commercial-style production use. Always verify model and node license terms before client delivery.

ACE-Step 1.5 is Now Available in ComfyUI

2026-02-04 02:01:42

Commercial-grade music generation on consumer hardware

Models: LoRA
VRAM: High VRAM (24GB+)
Reading Time: 4 min

Workflow Overview

Content type: Workflow

Primary intent: Download

Required Models

LoRA

Setup Notes

Install the required models before opening the workflow template.
Recommended hardware: High VRAM (24GB+).
Use the download button above to import the workflow JSON into ComfyUI.
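
Before importing, it can help to check which node types the workflow JSON expects, so missing custom nodes or models show up early. The sketch below is only an illustration: it assumes the file follows one of the two common ComfyUI layouts (a UI export with a "nodes" list whose entries carry a "type" field, or an API export keyed by node id with a "class_type" field). The helper name node_types and the sample node names are hypothetical.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count node types in a ComfyUI workflow dict.

    Assumed layouts (verify against your own file):
    - UI export:  {"nodes": [{"type": "..."}, ...]}
    - API export: {"<node id>": {"class_type": "...", "inputs": {...}}, ...}
    """
    if isinstance(workflow.get("nodes"), list):
        return Counter(n.get("type", "<unknown>") for n in workflow["nodes"])
    return Counter(
        v["class_type"]
        for v in workflow.values()
        if isinstance(v, dict) and "class_type" in v
    )


# Tiny inline stand-in for a real file; these node names are made up.
sample = json.loads(
    '{"nodes": [{"id": 1, "type": "LoaderNode"}, {"id": 2, "type": "SamplerNode"}]}'
)
for name, count in node_types(sample).most_common():
    print(f"{count:3d}  {name}")
```

To inspect the downloaded file instead, load it with json.load on the opened file and pass the result to node_types.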

ACE-Step 1.5 Now Accessible in ComfyUI

We're pleased to announce that ACE-Step 1.5 has been released for ComfyUI! This major update to the open-source music generation model delivers professional-grade quality on your own computer, generating entire songs in under 10 seconds on consumer hardware.

Updates in ACE-Step 1.5

ACE-Step 1.5 features a new hybrid design that revolutionizes AI music creation. A Language Model acts as a versatile planner, converting basic user inputs into detailed song blueprints that range from brief loops to ten-minute compositions.

Professional Sound Quality: Achieves higher quality than most commercial music systems, scoring 4.72 in musical consistency metrics
Rapid Generation: Creates a complete 4-minute track in approximately 1 second on an RTX 5090 or under 10 seconds on an RTX 3090
Standard Hardware Compatibility
Multilingual Support: Accurately follows instructions in 50+ languages, with excellent performance in English, Chinese, Japanese, Korean, Spanish, German, French, Portuguese, Italian, and Russian
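
To put the speed figures in perspective, the short calculation below converts them into a faster-than-real-time factor. It uses only the numbers quoted above; because the RTX 3090 figure is stated as "under 10 seconds", the second result is a lower bound. The function name realtime_factor is just an illustrative helper.

```python
def realtime_factor(track_seconds: float, generation_seconds: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    return track_seconds / generation_seconds


track = 4 * 60  # a 4-minute track, in seconds

print(realtime_factor(track, 1.0))   # RTX 5090, ~1 s   -> 240.0
print(realtime_factor(track, 10.0))  # RTX 3090, <10 s  -> 24.0 (lower bound)
```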

Chain-of-Thought Approach

Using chain-of-thought reasoning, the model combines metadata, lyrics, and captions to guide the diffusion process, yielding more coherent long-form pieces.

LoRA Personalization

ACE-Step 1.5 supports style-specific fine-tuning through LoRA training. With just a handful of tracks, the model can learn your personal sound, and because training and inference run locally, your data stays private.

Functionality Overview

ACE-Step 1.5 combines several architectural advances:

Combined LM + DiT Framework: A Language Model plans the musical structure while a Diffusion Transformer handles audio generation
Distribution Matching Distillation: Uses Z-Image's DMD2 for faster generation (2 seconds on an A100) and better results
Built-in Reinforcement Learning: Alignment relies on the model's internal processes, avoiding bias from external influences
Self-Improving Tokenizer: The audio tokenizer evolves during DiT training to reduce the mismatch between generation and tokenization

Future Developments

ACE-Step 1.5 has additional capabilities that are not yet supported in ComfyUI but that the community will likely implement.

Reinterpretation

Supply any existing track with fresh lyrics and instructions for complete stylistic reinvention

Revision

Regenerate specific segments when a composition is nearly ideal, seamlessly inserting corrections while preserving surrounding content

Sample Vocal Compositions

Neo-Soul: A warm, organic neo-soul track dripping with live instrumentation and effortless groove. A live drummer plays a loose, hip-hop influenced pocket—soft kick drum with lazy swing, snare hits that sit just behind the beat, and brushed hi-hats that breathe and shuffle with human imperfection.

UK Garage: A skippy, energetic UK garage track built on a classic two-step drum pattern with shuffling hi-hats and a punchy, syncopated kick and snare. A warm, wobbling Reese bass line provides the low-end foundation and chopped, pitched-up female vocal samples create the melodic hooks.

K-Pop: A slick, maximalist K-pop track that genre-hops with precision and style. The production shifts seamlessly between sections: a hard-hitting trap-influenced verse with rapid-fire rapping, a softer R&B pre-chorus with breathy vocals and lush harmonies, then an explosive, synth-driven pop chorus with an earworm hook.

Sample Instrumental Works

Synth-wave: A nostalgic, cinematic ride through neon and chrome. Punchy gated drums with big reverb snare, arpeggiated synth lines running through chorus and delay, warm analog bass, and soaring lead melodies that feel heroic and bittersweet. Driving but emotional, like the credits rolling on a film that never existed.

Meditative Roller: A deep, meditative roller locked into a hypnotic 140 BPM groove, all smooth forward motion and late-night introspection. The bass line is the soul of it—warm, undulating, endlessly cycling through subtle variations like waves lapping at a shore, never jarring, never stopping.

Progressive House: A warm, rolling journey that builds patiently. Soft four-on-the-floor kick with airy hats, a plucky melodic synth hook that repeats and evolves, pads that swell across long phrases, and subtle acid bass bubbling underneath. Emotional but restrained, always moving forward toward a sunrise.

Initial Steps

Desktop & Local Users

Update ComfyUI to the latest version
Go to Template Library → Audio and pick the ACE-Step 1.5 template
Download the model when prompted (or manually from Hugging Face)
Enter style indicators and lyrics, then run the workflow

Workflow Suggestions

Style Indicators: Use descriptive terms covering genre, instruments, mood, tempo, and vocal characteristics
Example: rock, hard rock, alternative rock, clear male vocalist, powerful voice, energetic, electric guitar, bass, drums, anthem, 120 bpm
Lyrical Organization: Use section identifiers such as [verse], [chorus], [bridge]
Track Length: Begin with 90–120 seconds for better consistency; compositions of 180+ seconds might need multiple runs
Multiple Generation: Set batch_size between 8 and 16 and pick the best output
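
The suggestions above can be sketched as a few small helpers that assemble the style string and the sectioned lyrics, and flag settings outside the suggested ranges. This is only an illustration, not part of ComfyUI or ACE-Step: the function names (build_style, build_lyrics, check_settings) and the placeholder lyrics are hypothetical, and the thresholds are simply the ones quoted in this section.

```python
def build_style(*tags: str) -> str:
    """Join style descriptors into one comma-separated string,
    dropping blanks and case-insensitive duplicates."""
    seen, out = set(), []
    for tag in tags:
        tag = tag.strip()
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            out.append(tag)
    return ", ".join(out)


def build_lyrics(sections: list[tuple[str, str]]) -> str:
    """Render (section, text) pairs as '[section]' headers followed by the text."""
    blocks = [f"[{name.strip().lower()}]\n{text.strip()}" for name, text in sections]
    return "\n\n".join(blocks)


def check_settings(seconds: int, batch_size: int) -> list[str]:
    """Return advisory notes based on the ranges suggested in this article."""
    notes = []
    if not 90 <= seconds <= 120:
        notes.append("start with 90-120 s for better consistency")
    if seconds >= 180:
        notes.append("180+ s tracks may need multiple runs")
    if not 8 <= batch_size <= 16:
        notes.append("suggested batch_size is 8-16")
    return notes


style = build_style("rock", "hard rock", "clear male vocalist",
                    "energetic", "electric guitar", "120 bpm")
lyrics = build_lyrics([
    ("verse", "placeholder verse line one\nplaceholder verse line two"),
    ("chorus", "placeholder hook line"),
])
print(style)
print(lyrics)
print(check_settings(seconds=200, batch_size=4))
```

Paste the resulting style string and lyrics into the corresponding fields of the template; the advisory notes are only reminders of the ranges suggested above.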

Happy creating!
