Generate videos from text prompts with Aliyun's Wan2.1 model. This Text-to-Video workflow supports Chinese prompts and lets you configure frame rate and resolution. This guide covers the core models, key nodes, and workflow structure.

Use Case: Video
Best For: Video
Models: Wan2.1
VRAM: Low VRAM (≤8GB)
Reading Time: 4 min

Required Models

Wan2.1

Setup Notes

Install the required models before opening the workflow template.
Recommended hardware: Low VRAM (≤8GB).

1. Workflow Overview

This workflow utilizes Aliyun's Wan2.1 model for Text-to-Video (T2V) generation. It integrates text encoding, video diffusion, and VAE decoding to produce dynamic video content. Key features:

Supports Chinese prompts (e.g., "滑雪的男人" - "a man skiing")
Configurable frame rate (default: 16fps) and resolution (480x768)
Includes negative prompts for quality filtering

2. Core Models

Model Name	Function	Installation
Wan2.1-T2V-1.3B	Video diffusion backbone	Manual download (`.safetensors`)
umt5-xxl-enc	Text encoder (multilingual; handles Chinese prompts)	Place in `models/wan_t5`
Wan2.1_VAE	Latent space decoder	Manual download

3. Key Nodes

LoadWanVideoT5TextEncoder
Loads the umt5-xxl-enc text encoder, which handles the Chinese prompts. Use bf16 precision to save VRAM.
WanVideoTextEncode
Encodes the positive and negative prompts. The negative prompt steers generation away from low-quality output.
WanVideoModelLoader
Loads the main video model, with options for precision (fp32/fp16) and VRAM optimization.
WanVideoSampler
Core sampler parameters:
- steps: 10 (use fewer steps for faster generation)
- cfg_scale: 6 (use a lower value to follow the prompt less strictly and allow more creative freedom)
- sampler: dpm++
VHS_VideoCombine
Combines frames into an MP4 video. Configurable settings:
- Frame rate (16fps)
- Output format (H.264, CRF=19)
- Filename prefix (WanVideo2_1_T2V)
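
As a rough illustration of what the export settings above mean, the following Python sketch builds a comparable ffmpeg command line (16 fps, H.264, CRF 19). It is not the VHS_VideoCombine implementation; the `ExportSettings` class, the `ffmpeg_args` helper, and the frame pattern are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportSettings:
    """Export values listed for VHS_VideoCombine in this guide."""
    frame_rate: int = 16
    crf: int = 19
    filename_prefix: str = "WanVideo2_1_T2V"


def ffmpeg_args(settings: ExportSettings, frame_pattern: str) -> list[str]:
    """Build an ffmpeg command that encodes numbered frames as H.264 MP4.

    frame_pattern (for example "frames/%05d.png") and the output file name
    are assumptions; the node manages its own file naming.
    """
    return [
        "ffmpeg",
        "-framerate", str(settings.frame_rate),  # input frame rate
        "-i", frame_pattern,                     # numbered image sequence
        "-c:v", "libx264",                       # H.264 encoder
        "-crf", str(settings.crf),               # lower CRF = higher quality
        "-pix_fmt", "yuv420p",                   # widely compatible pixel format
        f"{settings.filename_prefix}.mp4",
    ]


if __name__ == "__main__":
    print(" ".join(ffmpeg_args(ExportSettings(), "frames/%05d.png")))
```

Lowering the CRF value increases quality and file size; raising it does the opposite.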

4. Workflow Structure

Group 1: Text Processing

Input: Chinese prompt
Output: Text embeddings
Key nodes: LoadWanVideoT5TextEncoder → WanVideoTextEncode

Group 2: Video Generation

Input: Text embeds + empty image embeds (480x768)
Output: Latent video data
Key nodes: WanVideoSampler

Group 3: Video Export

Input: Decoded image sequence
Output: MP4 file
Key nodes: WanVideoDecode → VHS_VideoCombine
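
The three groups form a dependency graph. The sketch below models it in plain Python and derives a valid execution order. The node names come from this guide, but the exact wiring is an assumption made for illustration (for instance, the VAE loader that feeds the decoder is omitted), and ComfyUI resolves execution order on its own.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of nodes whose outputs it consumes.
# Wiring is assumed from the group descriptions, not read from the workflow file.
GRAPH = {
    "LoadWanVideoT5TextEncoder": set(),
    "WanVideoModelLoader": set(),
    "WanVideoEmptyEmbeds": set(),
    "WanVideoTextEncode": {"LoadWanVideoT5TextEncoder"},
    "WanVideoSampler": {
        "WanVideoModelLoader",
        "WanVideoTextEncode",
        "WanVideoEmptyEmbeds",
    },
    "WanVideoDecode": {"WanVideoSampler"},
    "VHS_VideoCombine": {"WanVideoDecode"},
}


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return one ordering in which every node runs after its inputs."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, node in enumerate(execution_order(GRAPH), start=1):
        print(step, node)
```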

5. I/O Specifications

Input Parameters:

Resolution: 480x768 (set in WanVideoEmptyEmbeds)
Seed: Fixed/Random (example: 1057359483639287)
Prompts: Natural-language Chinese (avoid complex syntax)

Output:

MP4 video (saved to ComfyUI output folder)
Includes generation metadata
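
The Fixed/Random seed choice can be sketched as follows: a fixed seed reproduces a run, while omitting it draws a new value each time. The `resolve_seed` helper and the 64-bit upper bound are assumptions for illustration, not part of the workflow's nodes.

```python
import random
from typing import Optional

MAX_SEED = 2**64 - 1  # assumed upper bound; the sampler node may use a different range


def resolve_seed(seed: Optional[int] = None,
                 rng: Optional[random.Random] = None) -> int:
    """Return the given seed (fixed mode) or draw a new one (random mode)."""
    if seed is not None:
        if not 0 <= seed <= MAX_SEED:
            raise ValueError(f"seed must be in [0, {MAX_SEED}]")
        return seed
    rng = rng or random.Random()
    return rng.randint(0, MAX_SEED)


if __name__ == "__main__":
    print(resolve_seed(1057359483639287))  # fixed: same value every run
    print(resolve_seed())                  # random: differs between runs
```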

6. Notes

⚠️ VRAM Requirements

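Minimum 12GB (16GB recommended). This is higher than the Low VRAM (≤8GB) label in the page header, so verify against your own hardware.
Enable offload_device to reduce VRAM usage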

⚠️ Model Installation

Download Wan2.1 models manually from official sources
Text encoder path: models/wan_t5/umt5-xxl-enc-bf16.safetensors
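
Before opening the workflow, a quick check that the model files are in place can save a failed load. This is a minimal sketch: only the text encoder path comes from this guide, the `missing_files` helper is hypothetical, and the ComfyUI root is whatever directory you installed into.

```python
from pathlib import Path

# Path given in this guide, relative to the ComfyUI root directory.
TEXT_ENCODER = Path("models/wan_t5/umt5-xxl-enc-bf16.safetensors")


def missing_files(comfy_root, relative_paths):
    """Return the relative paths that do not exist as files under comfy_root."""
    root = Path(comfy_root)
    return [p for p in map(Path, relative_paths) if not (root / p).is_file()]


if __name__ == "__main__":
    # "." assumes the script is run from the ComfyUI root.
    for path in missing_files(".", [TEXT_ENCODER]):
        print(f"missing: {path}")
```

Add the diffusion model and VAE paths to the list once you know where your installation expects them.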

⚠️ Dependencies

Requires ComfyUI-WanVideoWrapper & VideoHelperSuite
Install via ComfyUI Manager
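
To confirm the node packs are installed, you can look for their folders under ComfyUI's `custom_nodes` directory. The sketch below assumes the folder names match the pack names; your installation may use different names, in which case adjust the tuple. The `missing_node_packs` helper is hypothetical.

```python
from pathlib import Path

# Assumed folder names; ComfyUI Manager may install under different names.
REQUIRED_NODE_PACKS = ("ComfyUI-WanVideoWrapper", "ComfyUI-VideoHelperSuite")


def missing_node_packs(comfy_root, required=REQUIRED_NODE_PACKS):
    """Return required pack names with no matching folder in custom_nodes."""
    custom_nodes = Path(comfy_root) / "custom_nodes"
    if not custom_nodes.is_dir():
        return list(required)
    # Compare case-insensitively, since folder casing varies between installs.
    installed = {p.name.lower() for p in custom_nodes.iterdir() if p.is_dir()}
    return [name for name in required if name.lower() not in installed]


if __name__ == "__main__":
    for name in missing_node_packs("."):
        print(f"not found in custom_nodes: {name}")
```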
