From Photos to Masterpieces: A Workflow for Generating Stylized Images with ControlNet and LoRA

ComfyUI.org
2025-04-14 10:25:47

Generate stylized images from photos with ControlNet and LoRAs: preserve structure, apply ethnic or autumn styles, and more.



Required Models

  • Flux
  • SDXL
  • ControlNet
  • LoRA

Required Nodes

  • ControlNet

Setup Notes

  • Install the required models before opening the workflow template.
  • Recommended hardware: Low VRAM (≤8GB).

1. Workflow Overview

  • Purpose: Generate stylized images from input photos (e.g., model poses) using ControlNet for structure preservation and LoRAs for style (e.g., ethnic costumes, autumn forest themes).

  • Core Models:

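    • 基础算法_F.1 ("Base Algorithm F.1"): Base text-to-image model. The "F.1" in the name and the FLUX.1-dev ControlNet below suggest a FLUX.1 variant rather than SDXL.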

    • FLUX.1-dev-ControlNet-Union-Pro-InstantX: Union ControlNet (a single model covering several control types), used here for pose/structure control.

    • Meta-Llama-3.1-8B-bnb-4bit: 4-bit quantized Llama 3.1 language model that Joy_caption uses to caption the reference image (prompt reverse engineering).

2. Key Nodes

| Node Name | Function | Installation | Dependencies |
| --- | --- | --- | --- |
| ControlNetLoader | Loads the ControlNet model. | Manually place the file in ComfyUI/models/controlnet. | FLUX.1-dev-ControlNet-Union-Pro-InstantX.safetensors |
| LoraLoader | Applies style LoRAs (ethnic/autumn). | Place files in ComfyUI/models/loras. | 少数民族服饰_V1.0.safetensors (ethnic minority costume LoRA) |
| Joy_caption | Reverse-engineers prompts via Llama-3. | Install unsloth/Meta-Llama-3.1-8B-bnb-4bit (HuggingFace). | Requires 4-bit quantization libraries. |
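
The installation locations listed above can be verified before opening the workflow. The following is a minimal sketch, not part of the workflow itself: the function name `find_missing_models` and the `"ComfyUI"` root path are illustrative assumptions, while the folder names and file names come from the entries above.

```python
from pathlib import Path

# Expected files, taken from the Dependencies entries above.
# Keys are subfolders of ComfyUI/models; extend the lists for other LoRAs.
REQUIRED_FILES = {
    "controlnet": ["FLUX.1-dev-ControlNet-Union-Pro-InstantX.safetensors"],
    "loras": ["少数民族服饰_V1.0.safetensors"],
}


def find_missing_models(comfy_root, required=REQUIRED_FILES):
    """Return 'subfolder/filename' for every required file that is absent."""
    models_dir = Path(comfy_root) / "models"
    missing = []
    for subfolder, filenames in required.items():
        for name in filenames:
            if not (models_dir / subfolder / name).is_file():
                missing.append(f"{subfolder}/{name}")
    return missing


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at your own install directory.
    for path in find_missing_models("ComfyUI"):
        print(f"missing: models/{path}")
```

Running the check first avoids discovering a missing file only when the workflow fails to load.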

3. Workflow Groups

  1. Reference Image Group

    • Input: User-uploaded photo (e.g., lQDPKGyzHiAGKAfNB9DNBQOwJksZaqj6fsIH2j_m_4e8AA_1283_2000.jpg).

    • Process: Generates a depth map with DepthAnythingV2, which serves as the ControlNet input.

  2. LoRA Group

    • Loads two LoRAs: 少数民族服饰_V1.0 (ethnic minority costume, weight=0.2) and 秋日森林_秋天女孩_V1.0 (autumn forest / autumn girl, weight=0.7).

  3. Generation Group

    • Output: 1280x2000 image after latent upscaling and VAE decoding.
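
To illustrate how the LoRA Group stacks its two LoRAs, here is a hedged sketch that builds chained `LoraLoader` nodes in ComfyUI's API (JSON) workflow format. The helper `chain_loras`, the node IDs, the `.safetensors` extension on the second file, and the choice to apply each weight to both `strength_model` and `strength_clip` are assumptions for illustration; the actual workflow may wire these differently.

```python
def chain_loras(model_ref, clip_ref, loras, first_id=100):
    """Build API-format LoraLoader nodes, each feeding the next.

    model_ref / clip_ref are [node_id, output_index] links to the upstream
    loaders. Returns (nodes, final_model_ref, final_clip_ref).
    """
    nodes = {}
    for offset, (lora_name, weight) in enumerate(loras):
        node_id = str(first_id + offset)
        nodes[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {
                "lora_name": lora_name,
                # Assumption: the single weight quoted in the article is
                # used for both the model and the text-encoder strength.
                "strength_model": weight,
                "strength_clip": weight,
                "model": model_ref,
                "clip": clip_ref,
            },
        }
        # LoraLoader outputs MODEL at index 0 and CLIP at index 1.
        model_ref, clip_ref = [node_id, 0], [node_id, 1]
    return nodes, model_ref, clip_ref


# Hypothetical upstream node IDs: "1" (model loader) and "2" (CLIP loader).
nodes, model_out, clip_out = chain_loras(
    ["1", 0],
    ["2", 0],
    [
        ("少数民族服饰_V1.0.safetensors", 0.2),
        ("秋日森林_秋天女孩_V1.0.safetensors", 0.7),
    ],
)
```

Because each loader takes the previous loader's outputs, both styles are applied together, with the autumn LoRA (0.7) weighted more heavily than the costume LoRA (0.2).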

4. Inputs & Outputs

  • Inputs:

    • Reference image (required).

    • Resolution: Default 768x1024 (adjustable via EmptyLatentImage).

    • Negative prompt: "Imperfect, non-standard, poor quality".

  • Output: Stylized model image (e.g., in ethnic costume).
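
When changing the resolution, it can help to check the value before queuing a run. The sketch below builds an API-format `EmptyLatentImage` node and reports the corresponding latent size. It assumes the VAE downsamples by a factor of 8 in each dimension; the helper name `empty_latent_node` is hypothetical.

```python
def empty_latent_node(width=768, height=1024, batch_size=1, downscale=8):
    """Build an API-format EmptyLatentImage node and report its latent size.

    Assumption: the VAE downsamples by `downscale` (8) in each dimension,
    so width and height should be multiples of that factor.
    """
    if width % downscale or height % downscale:
        raise ValueError(f"{width}x{height} is not a multiple of {downscale}")
    node = {
        "class_type": "EmptyLatentImage",
        "inputs": {"width": width, "height": height, "batch_size": batch_size},
    }
    return node, (width // downscale, height // downscale)


node, latent_wh = empty_latent_node()
print(latent_wh)  # (96, 128)
```

Under the same assumption, the 1280x2000 output corresponds to a 160x250 latent.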

5. Tips & Warnings

  • ⚠️ Errors:

    • Missing ControlNet/LoRA files trigger "Missing model" errors.

    • The Llama-3 model behind Joy_caption requires ≥8GB VRAM; on low-end devices, disable Joy_caption and enter the prompt manually.

  • Optimization:

    • Load the model weights in fp8_e4m3fn precision to save VRAM.

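    • Adjust the ControlNet weight (strength, default: 0.75) in ControlNetApplyAdvanced: higher values follow the reference structure more closely, lower values give the style more freedom.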
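
As a sketch of where the ControlNet weight lives, the snippet below builds a `ControlNetApplyAdvanced` node in ComfyUI's API format with the 0.75 default from this workflow. The helper name and the upstream node IDs are hypothetical, and the `start_percent`/`end_percent` values (0.0 and 1.0) are assumed defaults rather than values taken from the workflow. Recent ComfyUI versions also accept an optional `vae` input, which is omitted here.

```python
def controlnet_apply_node(positive, negative, control_net, image,
                          strength=0.75, start_percent=0.0, end_percent=1.0):
    """Build an API-format ControlNetApplyAdvanced node.

    Each of positive, negative, control_net and image is a
    [node_id, output_index] link to an upstream node.
    """
    if not 0.0 <= start_percent <= end_percent <= 1.0:
        raise ValueError("need 0.0 <= start_percent <= end_percent <= 1.0")
    return {
        "class_type": "ControlNetApplyAdvanced",
        "inputs": {
            "positive": positive,
            "negative": negative,
            "control_net": control_net,
            "image": image,          # here: the depth map from the reference image
            "strength": strength,    # the "ControlNet weight" in the tip above
            "start_percent": start_percent,
            "end_percent": end_percent,
        },
    }


# Hypothetical upstream node IDs for the prompts, ControlNet loader and depth map.
node = controlnet_apply_node(["10", 0], ["11", 0], ["12", 0], ["13", 0])
```

Lowering `strength` below 0.75 is the first thing to try if the output copies the reference pose too rigidly.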
